Abstract
Techniques for detecting defects in source code are fundamental to the success of any software development approach. A software development organization therefore needs to understand the utility of techniques such as reading or testing in its own environment. Controlled experiments have proven to be an effective means for evaluating software engineering techniques and gaining the necessary understanding of their utility. This paper presents a characterization scheme for controlled experiments that evaluate defect-detection techniques. The characterization scheme permits the comparison of results from similar experiments and establishes a context for cross-experiment analysis of those results. The scheme is used to structure a detailed survey of four experiments that compared reading and testing techniques for detecting defects in source code. We encourage educators, researchers, and practitioners to use the characterization scheme to develop and conduct further instances of this class of experiments. By repeating these experiments, we expect that the software engineering community will gain quantitative insights into the utility of defect-detection techniques in different environments.
Additional information
This work was conducted while the author was with the Department of Computer Science, University of Kaiserslautern, 67653 Kaiserslautern, Germany.
Cite this article
Lott, C.M., Rombach, H.D. Repeatable software engineering experiments for comparing defect-detection techniques. Empirical Software Engineering 1, 241–277 (1996). https://doi.org/10.1007/BF00127447