
ABSTRACT

GEORGE, BOBY, Analysis and Quantification of Test Driven Development Approach. (Under the direction of Dr. Laurie Ann Williams.)

The software industry is increasingly demanding on development schedules and resources. Often, software production deals with ever-changing requirements and with development cycles measured in weeks or months. To respond to these demands and still produce high-quality software, software practitioners have developed a number of strategies over the years. One of the more recent is Test Driven Development (TDD), an emerging object-oriented development practice that purports to aid in producing high-quality software quickly. TDD has been popularized through the Extreme Programming (XP) methodology. TDD proponents profess that, for small to mid-size software, the technique leads to quicker development of higher-quality code. Anecdotal evidence supports this claim. However, until now there has been little quantitative empirical support for it.

The work presented in this thesis is concerned with a set of structured TDD experiments on very small programs developed by programmer pairs. The programmers were both students and professionals. In each programmer category (students and professionals), one group used TDD and the other (the control group) used a waterfall-like software development approach. The experiments provide some interesting observations regarding TDD.

When TDD was used, both student and professional TDD developers appear to achieve higher code quality, as measured using functional black-box testing. The TDD student pairs passed 16% more test cases, and the TDD professional pairs passed 18% more test cases, than their corresponding control groups.

However, professional TDD developer pairs did spend about 16% more time on development. It was not established whether the increase in quality was due to the extra development time or to the TDD development process itself. The student experiments, on the other hand, were time-limited: both the TDD and the non-TDD student programmers had to complete the assignment in 75 minutes, while professional programmers took about 285 minutes on average to complete the same assignment. Consequently, the development cycle of the student-developed software was severely constrained, and the resulting code was underdeveloped and of much poorer quality than the professional code. Still, it is interesting to note that the code developed using the TDD approach under these severe restrictions appears to be less faulty than the code developed using the more classical waterfall-like approach. It is conjectured that this may be due to the granularity of the TDD process (one to two test cases per feedback loop), which may encourage more frequent and tighter verification and validation episodes. These tighter cycles may result in code that is better than that developed under a coarser-granularity waterfall-like model.

As part of the study, a survey was conducted of the participating programmers. The majority of the programmers thought that TDD was an effective approach, which improved their productivity.

Analysis and Quantification of Test Driven Development

Approach

By

Boby George

A thesis submitted to the Graduate Faculty of

North Carolina State University

in partial fulfillment of the

requirements for the Degree of

Master of Science

COMPUTER SCIENCE

Raleigh

2002

APPROVED BY:

___________________________________

Chair of Advisory Committee

PERSONAL BIOGRAPHY

Boby George is a graduate student in the Department of Computer Science, North Carolina State University. In 1999, he received his Bachelor of Technology in Computer Science and Engineering from the University of Kerala, India. After graduation, he worked as a system executive at eFunds International India Private Limited, where his responsibilities included the automation of the testing process and the preparation of test cases. In his graduate study at NC State, beginning in August 2001, Boby George focused on agile software development methodologies, in particular the Test Driven Development approach.


ACKNOWLEDGEMENTS

It is the hard work and contribution of many and not one that made this work possible. First and foremost, thanks to my research committee members, in particular the chair, Dr. Laurie Williams for proposing the topic and for all your persistent effort and guidance. A special thanks to Dr. Mladen Vouk for all your detailed review and suggestions that made this thesis work more accurate and to Dr. Aldo Dagnino, thank you for your keen interest in my research work.

To the Fall 2001 undergraduate software engineering students of North Carolina State University, and to the John Deere, RoleModel Software, and Ericsson developers, thank you for participating in the long and strenuous experiments. Also, I express my gratitude to Ken Auer for arranging the RoleModel Software experiment and giving valuable insight, to Doug Taylor for arranging the John Deere experiment, and to Lisa Woodring for arranging the Ericsson experiment. Lastly, I express my appreciation to AT&T, who provided the research funding for this work.


TABLE OF CONTENTS

LIST OF TABLES (vi)

LIST OF FIGURES (vii)

1. INTRODUCTION (1)

1.1 Research motivation (1)

1.2 Historical perspective (2)

1.3 Research approach (5)

1.4 Thesis layout (7)

2. RELATED WORK (8)

2.1 Extreme Programming (XP) (8)

2.2 Unit Testing (9)

2.3 Software Models (11)

2.4 Test Driven Development (TDD) (12)

2.5 Differences between TDD and other models (14)

2.6 Refactoring (16)

2.7 Pair programming (17)

2.8 Object Oriented Metrics (18)

3. TESTING METHODOLOGIES (22)

3.1 Testing methodology in OO Development (22)

3.2 Unit testing (24)

3.3 Various Types of Testing (27)

3.3 Testability (28)

3.4 Mean Time Between Failure (30)

3.5 Code coverage (30)

3.6 Limitations of testing process (32)

4. TEST DRIVEN DEVELOPMENT (33)

4.1 Traditional OOP approaches (33)

4.1.1 Limitations of traditional OO development approaches (36)

4.2 TDD Explained (37)

4.2.1 TDD without High/Low Level Design (40)

4.2.2 Evolutionary software process models and TDD (41)

4.2.3 Reusability, Design Patterns and TDD (43)

5. EXPERIMENT DESIGN (45)

5.1 Basics of Software Engineering Experimentation (45)

5.2 Basics of Statistics (46)

5.2.1 Measures of Central Tendency (47)

5.2.2 Measures of Variability (47)

5.2.3 Box Plots (48)

5.3 Statistical significant analysis (48)

5.3.1 Normal approximation (49)

5.3.2 Spearman’s Rho Test (50)

6. EXPERIMENTAL RESULTS (51)

6.1 Experiment Details (51)

6.2 External Validity (52)

6.3 Quantitative Analysis (55)


6.3.1 External code quality (55)

6.3.2 Internal code quality (58)

6.3.3 Productivity (61)

6.4 Qualitative Analysis (63)

6.5 Code coverage (65)

6.6 Relationship between quality and test cases developed (67)

7. CONCLUSION (68)

7.1 Future Work (69)

REFERENCES (71)

APPENDIX A: Research Approach (75)

APPENDIX B: Experiments conducted with students (76)

APPENDIX C: Experiments conducted with professionals (83)

APPENDIX D: Raw Data of all Experiments (95)

APPENDIX E: Survey on Test Driven Development (102)

APPENDIX F: Problem statement TDD version (104)

APPENDIX G: Problem statement non-TDD version (105)


LIST OF TABLES

Table 1: Summary of metrics evaluation of student code (58)

Table 2: Summary of metrics evaluation of professional developers (60)

Table 3: Test case used for student code evaluation (78)

Table 4: Number of Test cases passed (79)

Table 5: Detailed metrics evaluation of student code (79)

Table 6: Mean Time Taken by professional programmers (85)

Table 7: Test cases used for professional code evaluation (86)

Table 8: Number of new Test Cases Passed (88)

Table 9: Detailed metrics evaluation of professional developers’ code (88)

Table 10: Professional Developers’ Survey Results (91)

Table 11: Raw Data Table on Test Cases Passed by Student Developers (95)

Table 12: Raw Data Table on Test Cases Passed by Professional Developers (97)

Table 13: Raw Metrics Data Table of Student Code (98)

Table 14: Raw Metrics Data Table of professional code (99)

Table 15: Time Taken by Each Professional Programming Pair (101)


LIST OF FIGURES

Figure 1: Comparison of various Software Models (16)

Figure 2: Box plot for Test Cases Passed by Students’ Code (56)

Figure 3: Box plot for Test Cases Passed by Professional Developers’ Code (57)

Figure 4: Box plot of Time Taken by Professional Developers (61)

Figure 5: Box Plot of Code Coverage by Professional Developers (66)

Figure 6: Quality vs. Test Cases Written (67)


1. INTRODUCTION

1.1 Research motivation

Software development is the evolution of a concept from design to implementation. But unlike in other fields, the actual implementation need not, and most often does not, correspond to the initial design. Unlike building a bridge from its design on paper, software development is more of a dynamic process: changes in requirements, development technologies, and market conditions necessitate continuous modifications to the design of the software being developed. Michael Feathers, a leading consultant, asserts: “There is little correlation between that [design] formality and quality of the resulting design. At times, there even appears to be an inverse correlation” [1].

Test Driven Development (TDD) [2] is a practice that Kent Beck, an originator of Extreme Programming (XP) [3-6], has been advocating. TDD is also known by other names, such as Test First Design (TFD), Test First Programming (TFP), and Test Driven Design (TDD) [1, 7]. The TDD approach evolves the design of a system starting with the unit test cases of an object. Writing test cases and implementing the corresponding object or object methods then triggers the need for other objects/methods.

An object is the basic building block of Object Oriented Programming (OOP). Unless objects are designed judiciously, dependency problems, such as tight coupling of objects or fragile super classes (inadequate encapsulation), can creep in. These problems can result in a large and complex code base that compiles and runs slowly. Rather than building objects that we think we need (due to improper understanding of requirements or incomplete specifications), the TDD approach guides us to build objects we know we need (based on test cases). TDD proponents anecdotally contend that, for small to mid-size systems, this approach leads to the implementation of a simpler design than would result from a detailed upfront design. Moreover, the continuous regression testing done with TDD appears to improve code quality.
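To make the test-then-code rhythm concrete, the following sketch shows a single TDD micro-iteration using Python's unittest (an xUnit-family framework analogous to the JUnit setting discussed in this thesis). The `Stack` class and its test are hypothetical illustrations, not code from the experiments.

```python
import unittest

# Step 1 ("red"): the unit test is written first. When first written,
# the Stack class below did not yet exist, so this test failed.
class TestStack(unittest.TestCase):
    def test_pop_returns_last_pushed_item(self):
        s = Stack()
        s.push(42)
        self.assertEqual(s.pop(), 42)

# Step 2 ("green"): only the minimal object the test demands is built.
# The need for push() and pop() came directly from the test case.
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()
```

Once this test passes, the next test case (for example, popping from an empty stack) drives the next small increment of design, which is how the objects "we know we need" emerge.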

Although intriguing, the TDD concept needs exhaustive analysis. Software practitioners are sometimes concerned about the lack of upfront design, coupled with the ensuing need to make design decisions at every stage of development. Questions also remain about what to test and how to test. Nevertheless, many software developers anecdotally contend that TDD is more effective than traditional methodologies [7, 8]. This creates the need to empirically analyze and quantify the effectiveness of the approach, to validate or invalidate these anecdotal claims.

1.2 Historical perspective

In TDD, the design, coding, and testing phases are integrated into a single phase. Such processes are called single-phase software development processes. The earliest single-phase software development process is the ad-hoc model [9] (which was also the first software development model), where the focus was to deliver code quickly. The lack of formal documentation in the ad-hoc model led to repeatability problems, very poor quality control, and code-improvement troubles, as only the original developers were knowledgeable about the implementation details of the software developed [9].


A structured system development approach, the waterfall model [9], was introduced to reduce the problems associated with the ad-hoc model. However, the strict adherence to top-down, sequential, and independent phases (analysis, design, code, and test) followed in the waterfall model necessitated a thorough understanding of requirements. This problem (the need for a complete understanding of requirements) led to the development of evolutionary software development models, in which software evolves in iterative cycles; these required less up-front information and offered greater flexibility [9]. The incremental model is an evolutionary software development approach that comprises short increments of the waterfall model in an overlapping fashion [10].

Developing new systems whose requirements are unclear involves considerable risk. Such risks can be reduced by techniques like prototyping [10]. In prototyping, a simplified replica of the system is built to help understand the feasibility of developing the proposed system and the development approach to be used. After the development approach is decided, the prototypes are discarded. Alternatively, models in which the prototypes are grown and refined into the final product are called evolutionary prototyping models. A common evolutionary prototyping model is the spiral model [9, 10], which provides the potential for rapid development of incremental versions of software through the use of non-overlapping iterations [9].

Along with the improvements in development processes, software testing strategies also underwent considerable changes. The most popular testing model is the V-model [11], which stresses close integration of testing with the software analysis and development phases. The model specifies that unit testing verifies code implementation, integration testing verifies low-level design, system testing verifies system design, and acceptance testing validates business requirements [11]. Hence, the model (in which testing is done in parallel with development) increases the prominence of testing throughout the development process.

In order to achieve a higher success rate in meeting system objectives, software models need to include Verification and Validation (V&V). Verification focuses on building the software right, while validation focuses on building the right software [12]. The IEEE Standard for Software Verification and Validation Plans [13] specifies that V&V be performed in parallel with software development. Iterative in nature, the V&V standard supports early defect detection and correction, reduced project risk and cost, and enhanced software quality and reliability.

The development of the TDD approach appears to be inspired by existing software development and testing strategies. TDD is based on a bottom-up implementation strategy, where implementation starts with the lowest modules or lowest level of code. The other development/testing strategies on which TDD appears to be based include the incremental model (as the approach proceeds in increments) and the V-model (as the approach also specifies a tight test-code cycle). However, TDD differs from models such as the V-model in its level of granularity (the number of test cases written before implementation code per feedback loop). TDD specifies a higher granularity: one or two unit tests are written first, then the implementation code, instead of thinking about several unit tests before coding as specified in the V-model. The incremental, V-model bottom-up strategy that TDD seems to follow necessitates writing test drivers and mock objects to simulate the modules that are not implemented yet. A comparison of TDD with other software models is included in Section 2.5.
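As an illustration of the test drivers and mock objects mentioned above, the sketch below uses Python's `unittest.mock` to stand in for a module that has not been implemented yet. The `ScoreBoard`/display example is hypothetical, not taken from the thesis experiments.

```python
from unittest.mock import Mock

# ScoreBoard depends on a display module. In a bottom-up build, that
# module may not exist yet, so a mock object simulates it, letting the
# unit under test be driven and verified in isolation.
class ScoreBoard:
    def __init__(self, display):
        self.display = display
        self.total = 0

    def add(self, points):
        self.total += points
        self.display.show(self.total)  # delegated to the (mocked) module

display = Mock()            # stands in for the unimplemented display
board = ScoreBoard(display)
board.add(10)
board.add(5)

# The mock records every call, so the test driver can check both the
# state of the unit and its interaction with the missing module.
display.show.assert_called_with(15)
```

Once the real display module is implemented bottom-up, it replaces the mock without changing the unit under test.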

1.3 Research approach

This research empirically examines the TDD approach to see if the following three hypotheses hold:

- The TDD approach will yield code with superior external code quality when compared with code developed with a more traditional waterfall-like practice. External code quality will be assessed based on the number of functional (black-box) test cases passed.

- The TDD approach will yield code with superior internal code quality when compared with code developed with a more traditional waterfall-like practice. Internal code quality will be assessed based on established Object-Oriented Design (OOD) metrics. The definition of superior or desirable values is specified in Table 2.

- Programmers who practice the TDD approach will develop code quicker than developers who use a more traditional waterfall-like practice. Programmer speed will be measured by the time (in hours) to complete a specified program.


To investigate these hypotheses, research data was collected from structured experiments conducted with student and professional developers, who developed a very small application using OOP methodology. The experiments were conducted four times: first with 138 advanced undergraduate (junior and senior) students, and then three times with professional developers (eight developers each from three companies).

All developers were randomly assigned to two groups, TDD and non-TDD pairs. Since all developers were familiar with the pair programming concept (a practice in which two programmers develop software side by side at one computer) [14], and the development groups of two of the professional organizations used pair programming in their day-to-day work, all experiments were conducted with the pair programming practice. Each pair was asked to develop a bowling game application; the typical size of the code developed was less than 200 lines of code (LOC). The specifications of the game were adapted from an XP episode [15] and are attached in Appendices F and G. The non-TDD pairs developed the application using the conventional waterfall-like approach, while the TDD pairs used the test-then-code approach. The controlled experiment environment ensured that the developer pairs used their assigned approaches.

The code developed was analyzed for quality based on functional test cases and on object-oriented quality metrics. The functional test cases used for the evaluation (of both TDD and non-TDD code) were developed by the researcher and were different from the test cases the pairs developed. Additionally, the amount of time taken by each professional pair was recorded.
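The researcher's actual evaluation test cases appear in the appendices; as a hedged illustration, the sketch below shows the kind of functional black-box test cases such an evaluation might use, assuming standard tenpin bowling scoring (the problem adapted from the XP episode [15]). The `score` function is a hypothetical reference implementation, not a pair's submission.

```python
def score(rolls):
    """Score a complete game of tenpin bowling from the list of rolls."""
    total, i = 0, 0
    for _frame in range(10):
        if rolls[i] == 10:                    # strike: 10 + next two rolls
            total += 10 + rolls[i + 1] + rolls[i + 2]
            i += 1
        elif rolls[i] + rolls[i + 1] == 10:   # spare: 10 + next roll
            total += 10 + rolls[i + 2]
            i += 2
        else:                                 # open frame
            total += rolls[i] + rolls[i + 1]
            i += 2
    return total

# Black-box test cases exercise the program only through its external
# behavior (inputs and expected outputs), with no knowledge of the code.
assert score([0] * 20) == 0                # gutter game
assert score([1] * 20) == 20               # all ones
assert score([5, 5, 3] + [0] * 17) == 16   # one spare
assert score([10] * 12) == 300             # perfect game
```

A pair's external quality score is then the fraction of such cases its program passes.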


Code coverage analysis was done to understand the efficiency of the test cases written by the developers. Lastly, the developers' qualitative impressions of the effectiveness of the technique were also recorded.

1.4 Thesis layout

The remainder of this thesis starts with a presentation of related work, including a brief introduction to XP. Chapter 2 explains TDD and its related topics: unit testing, existing software models, and refactoring. A comparison of test-then-code and code-then-test models with TDD is also presented in Chapter 2. Chapter 3 focuses on testing, in particular unit testing, and related concepts such as coverage and boundary analysis. Chapter 4 delineates the traditional OOP approach and explains in detail the working principle of TDD with an example. Chapter 5 presents the details of the experiment design and discusses the various statistical tests used in this research. Chapter 6 presents the external validity issues of the experiments conducted and analyzes the experimental data to draw inferences and conclusions on the effectiveness of TDD. Chapter 7 summarizes the major conclusions of this research and suggests future related research. The appendices expound on each of the experiments conducted.


2. RELATED WORK

This chapter provides a survey of concepts relevant to TDD, starting with the XP methodology and TDD as a practice of XP. The chapter succinctly explains the TDD model and other test-then-code and code-then-test models, followed by a comparison of the models. The chapter concludes with a discussion of software metrics (and their limitations).

2.1 Extreme Programming (XP)

The current software industry faces an uncertain environment, as stated by the Uncertainty Principle: “Uncertainty is inherent and inevitable in software development processes and products” [16, 17]. At the same time, the need to produce high-quality software has become paramount due to increased competition. Although processes like the Personal Software Process (PSP) [17] and the Rational Unified Process (RUP) [18] provide guidelines on delivering quality software, the industry is concerned with delivering quality software faster while handling requirement changes efficiently. These concerns have led to the development of agile software development models [19]. Among the principles of agile models, the following are noteworthy: use light-but-sufficient rules of project behavior, and use human- and communication-oriented rules. Agile models view process as secondary to the primary goal of delivering software; that is, the development process should adapt to project requirements.

In the late 1980s, Kent Beck, Ward Cunningham, and Ron Jeffries formulated XP, the most popular agile methodology [20]. XP, which is based on four values (simplicity, communication, feedback, and courage), is a compilation of twelve practices of software development. TDD is one of those twelve practices. Major accomplishments of XP include the development of the Chrysler Comprehensive Compensation (C3) system, an application consisting of 2,000 classes and 30,000 methods, and Ford's VCAPS system [21, 22]. However, like TDD's, the success of XP is anecdotal. The popularity of XP is often attributed to the sound approach it takes in dealing with the complexity and constant change occurring in today's software projects [23].

Fowler [24] attributes the success of the XP methodology to its strong emphasis on testing. XP relies on TDD for code development, and it might be claimed that XP achieves its values of simplicity, courage, and feedback by practicing TDD. The simplicity advocated in XP's principle “Do the simplest thing that could possibly work” is achieved by TDD's approach of implementing only the minimal set of objects required. With TDD, test cases are written for each and every function; the existence of this strong testing framework helps developers be confident in making required modifications without “breaking” existing code. Finally, the high granularity of the test-code cycle gives the developer constant feedback on defects in the code.

2.2 Unit Testing

Unit testing, performed by developers [25], is the practice of testing a unit of code (such as a class in OOP) in isolation, to verify that all code in the individual unit performs as expected (that is, the unit gives proper results). The proper results are derived from the system requirements and design specification. The aim of unit testing is to find defects in the software, not to demonstrate their absence [26]. Unit testing is most effective when done frequently and independently (i.e., running one unit test does not affect other unit tests). Testing units independently reduces testing complexity and eliminates side effects that might occur when all tests run together.
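The independence property can be sketched as follows, using Python's unittest; the `Account` class is a hypothetical example. Each test rebuilds its fixture in `setUp`, so no test depends on state left behind by another.

```python
import io
import unittest

class Account:
    """Hypothetical unit under test."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

class TestAccount(unittest.TestCase):
    def setUp(self):
        # Runs before every test: each test gets a fresh Account, so
        # running one test cannot affect the outcome of another.
        self.account = Account()

    def test_new_account_starts_at_zero(self):
        self.assertEqual(self.account.balance, 0)

    def test_deposit_adds_to_balance(self):
        self.account.deposit(100)
        self.assertEqual(self.account.balance, 100)

# Run the suite programmatically; the tests pass in any order because
# the fixture is rebuilt for each one.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAccount)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

The same isolation discipline is what makes a suite safe to automate for regression testing.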

Proponents of Object Oriented (OO) development and agile methodologies advocate writing unit test code in parallel with production code [27], and many developers recommend not only thinking about but also writing unit test cases before writing production code [25]. Conversely, some developers strongly argue that, ideally, system testing (integrated testing of the application being developed) is the only required testing, and that unit testing is done merely to increase the speed of development [28].

Testing all permutations and combinations of inputs is not practical; hence, the set of tests to be performed should be chosen prudently. Testing is often not practiced by developers, and when pressure to meet deadlines increases, thorough testing might not take place [29]. As a result, unit testing demands a change in the practices of many developers.

JUnit, developed by Kent Beck and Erich Gamma, is a popular framework for unit testing applications written in Java. The tool provides a framework for creating test cases, stubs, and harnesses with which unit testing can be automated for regression testing. Such testing stabilizes the system (decreases its defect rate) [25]. Similar to JUnit, there exists a series of xUnits (HttpUnit, CppUnit, DelphiUnit, and others) for different languages. Currently, the xUnit frameworks lack capabilities for testing Graphical User Interfaces (GUIs).

2.3 Software Models

The waterfall model, also called the linear sequential model, is a systematic, sequential approach to software development originally suggested by Winston Royce in 1970 [30]. The model, which originally had feedback loops, consists of four main phases: analysis (requirements are gathered and documented), design (translation of the requirements into a representation of software), code (translation of the design into machine-readable form), and test (verification of the correctness of the implementation) [12]. Although waterfall is the oldest model in use, it has many problems, such as the difficulty of stating all requirements explicitly early in the process, and long gestation periods and blocking states due to the sequential nature of the work flow [31, 32]. The non-TDD pairs (control group) in our experiment used the waterfall-like model for code development (i.e., they had design, code, and test phases followed by a debugging phase).

As stated before, the V-model emphasizes tight integration of testing in all phases of software development. The V-model, defined by Paul Rook in 1980, aims to improve the efficiency and effectiveness of software development [11]. The model depicts the entire process as a “V”, with development activities on the left side (going downwards) and the corresponding test activities on the right side (going upwards) [11]. The stages of the lifecycle and the test cases that need to be developed in parallel include business requirements (develop acceptance test cases), high-level design (develop system test cases), low-level design (develop integration tests), and code (develop unit tests). As with TDD, the V-model specifies that the design of test cases must occur first (before the coding phase) and that the various levels of testing be done in parallel with the development phases [33]. Detractors of the V-model contend that it fails to define the sequence in which the units need to be built and tested, and that it is suited only to situations where the requirements are clearly defined [33].

2.4 Test Driven Development (TDD)

TDD is an emerging practice that is professed to produce high-quality software quickly, while dealing with inevitable design and requirement changes. The TDD approach, used sporadically by developers for decades [34], was recently popularized by the XP methodology. With TDD, automated unit test code is written before implementation code. New functionality is not considered properly implemented unless the unit test cases for the new functionality, and every other unit test case ever written for the code base, succeed. An important rule in TDD is: “If you can’t write test for what you are about to code, then you shouldn’t even be thinking about coding” [7]. Also, when a defect is found later in the development cycle or by the customer, it is TDD practice to write a unit test case that exposes the defect before fixing it.
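The defect-first rule can be sketched as follows, with a hypothetical `average` helper (not code from the thesis): when a defect is reported, a regression test reproducing it is written before the fix, and it remains in the suite afterwards.

```python
def average(values):
    # Fixed implementation. The hypothetical original divided by
    # len(values) unconditionally, so an empty list raised
    # ZeroDivisionError -- the defect the customer reported.
    if not values:
        return 0.0
    return sum(values) / len(values)

# Written BEFORE the fix, this test exposed the defect (it failed
# against the buggy version). Kept in the suite, it guards against
# the defect silently returning.
def test_average_of_empty_list_is_zero():
    assert average([]) == 0.0

def test_average_of_values():
    assert average([2, 4, 6]) == 4.0

test_average_of_empty_list_is_zero()
test_average_of_values()
```

Because the whole suite must pass before new functionality counts as done, the exposed defect cannot reappear unnoticed.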

Some believe that strict adherence to TDD can help to minimize, if not eliminate, the need for upfront design. In addition, the practice is highly flexible

