Abstract
In this paper we discuss the application of a range of techniques to the verification of mission-critical flight software at NASA’s Jet Propulsion Laboratory. For this type of application we want to achieve a higher level of confidence than can be achieved through standard software testing. Unfortunately, given the current state of the art, especially when efforts are constrained by the tight deadlines and resource limitations of a flight project, it is not feasible to produce a rigorous formal proof of correctness of even a well-specified stand-alone module such as a file system (much less more tightly coupled or difficult-to-specify modules). This means that we must look for a practical alternative in the area between traditional testing and proof, as we attempt to optimize rigor and coverage. The approaches we describe here are based on testing, model checking, constraint-solving, monitoring, and finite-state machine learning, in addition to static code analysis. The results we have obtained in the domain of file systems are encouraging, and suggest that for more complex properties of programs with complex data structures, it is possibly more beneficial to use constraint solvers to guide and analyze execution (i.e., as in testing, even if performed by a model checking tool) than to translate the program and property into a set of constraints, as in abstraction-based and bounded model checkers. Our experience with non-file-system flight software modules shows that methods even further removed from traditional static formal methods can be assisted by formal approaches, yet readily adopted by test engineers and software developers, even as the key problem shifts from test generation and selection to test evaluation.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
About Static Driver Verifier: http://www.microsoft.com/whdc/DevTools/tools/SDV.mspx
ACL2 Version 6.3: http://www.cs.utexas.edu/moore/acl2
AdaCore: SPARK Pro. http://www.adacore.com/sparkpro/
CodeSonar: GrammaTech static analysis. http://www.grammatech.com/products/codesonar
Hyper-V - Microsoft: http://www.microsoft.com/servers/hyper-v-server/default.mspx
JPL Laboratory for Reliable Software (LaRS): http://lars-lab.jpl.nasa.gov
Mars Science Laboratory: http://mars.jpl.nasa.gov/msl
Open Group Base Specifications Issue 6: http://www.opengroup.org/onlinepubs/009695399/
Software development testing and static analysis tools: Coverity. http://www.coverity.com
Source code analysis tools for software security & reliability: Klockwork. http://klocwork.com
VCC: A C verifier - Microsoft Research: http://research.microsoft.com/en-us/projects/vcc/
Ammann, P., Offutt, J.: Introduction to Software Testing. Cambridge University Press, Cambridge (2008)
Andrews, J., Li, F., Menzies, T.: Nighthawk: a two-level genetic-random unit test data generator. Autom. Softw. Eng. 144–153 (2007)
Andrews, J.H., Groce, A., Weston, M., Xu, R.-G.: Random test run length and effectiveness. In: Automated Software Engineering, pp. 19–28 (2008)
Andrews, J.H., Haldar, S., Lei, Y., Li, C.H.F.: Tool support for randomized unit testing. In: Proceedings of the First International Workshop on Randomized Testing, pp. 36–45. Portland (2006)
Angluin, D.: Learning regular sets from queries and counterexamples. Info. Comput. 75(2), 87106 (1987)
Ball, T., Rajamani, S.: Automatically validating temporal safety properties of interfaces. In: SPIN Workshop on Model Checking of Software, pp. 103–122 (2001)
Barringer, H., Groce, A., Havelund, K., Smith, M.: Formal analysis of log files. J. Aerosp. Comput. Info. Commun. 7(11), 365–390 (2010)
Barringer, H., Havelund, K., Rydeheard, D., Groce, A.: Rule systems for runtime verification: a short tutorial. In: Proceedings of the 9th International Workshop on Runtime Verification (RV’09), LNCS, vol. 5779, pp. 1–24. Springer (2009)
Barringer, H., Rydeheard, D., Havelund, K.: Rule systems for run-time monitoring: from Eagle to RuleR. In: Proceedings of the 7th International Workshop on Runtime Verification (RV’07), LNCS, vol. 4839, pp. 111–125. Springer, Vancouver (2007)
Barringer, H., Rydeheard, D.E., Havelund, K.: Rule systems for run-time monitoring: from Eagle to RuleR. J. Log. Comput. 20(3), 675–706 (2010)
Beck, K.: Test Driven Development: By Example. Addison Wesley, Reading (2002)
Beizer, B.: Software Testing Techniques. International Thomson Computer Press, Boston (1990)
Beyer, D., Chlipala, A.J., Henzinger, T.A., Jhala, R., Majumdar, R.: Generating tests from counterexamples. In: International Conference on Software Engineering, pp. 326–335 (2004)
Biere, A.: The Evolution from Limmat to Nanosat. Technical Report 444, Dept. of Computer Science, ETH Zŭrich (2004)
Biere, A., Cimatti, A., Clarke, E.M., Zhu, Y.: Symbolic model checking without BDDs. In: Tools and Algorithms for the Construction and Analysis of Systems, pp. 193–207 (1999)
Biermann, A.W., Feldman, J.A.: On the synthesis of finite-state machines from samples of their behaviour. IEEE Trans. Comput. 21, 592–597 (1972)
Boonstoppel, P., Cadar, C., Engler, D.: Rwset: attacking path explosion in constraint-based test generation. In: Tools and Algorithms for the Construction and Analysis of Systems, pp. 351–366 (2008)
Brooks, F.: The Mythical Man-Month: Essays on Software Engineering, 20th Anniversary Edition. Addison-Wesley Professional (1995)
Cadar, C., Dunbar, D., Engler, D.: Klee: unassisted and automatic generation of high-coverage tests for complex systems programs. In: Operating System Design and Implementation, pp. 209–224 (2008)
Cadar, C., Ganesh, V., Pawlowski, P., Dill, D., Engler, D.: EXE: automatically generating inputs of death. In: Conference on Computer and Communications Security, pp. 322–335 (2006)
Chaki, S., Clarke, E.M., Groce, A., Jha, S., Veith, H.: Modular verification of software components in C. In: International Conference on Software Engineering, pp. 385–395 (2003)
Chen, F., Roşu, G.: MOP: an efficient and generic runtime verification framework. In: Object-Oriented Programming, Systems, Languages and Applications (OOPSLA’07) (2007)
Chen, Y., Groce, A., Zhang, C., Wong, W.-K., Fern, X., Eide, E., Regehr, J.: Taming compiler fuzzers. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 197–208 (2013)
Ciupa, I., Leitner, A., Oriol, M., Meyer, B.: Experimental assessment of random testing for object-oriented software. In: Rosenblum, D.S., Elbaum, S.G. (eds.) International Symposium on Software Testing and Analysis, pp 84–94. ACM (2007)
Claessen, K., Hughes, J.: QuickCheck: a lightweight tool for random testing of haskell programs. In: ICFP, pp. 268–279 (2000)
Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press, Cambridge (2000)
Clarke, E.M., Emerson, E.: The design and synthesis of synchronization skeletons using temporal logic. In: Workshop on Logics of Programs, pp. 52–71 (1981)
Clarke, L.A., Rosenblum, D.S.: A historical perspective on runtime assertion checking in software development. ACM SIGSOFT Softw. Eng. Notes 31(3), 25–37 (2006)
Corbett, J., Dwyer, M., Hatcliff, J., Laubach, S., Pasareanu, C.S., Zheng, H.: Bandera: extracting finite-state models from Java source code. In: International Conference on Software Engineering, pp. 439–448 (2000)
Csallner, C., Smaragdakis, Y.: JCrasher: an automatic robustness tester for Java. Softw. Pract. Exp. 34(11), 1025–1050 (2004)
de Moura, L., Bjorner, N.: Z3: an efficient SMT solver. In: Conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 337–340 (2008)
DeMillo, R.A., Lipton, R.J., Sayward, F.G.: Hints on test data selection: help for the practicing programmer. Computer 4(11), 34–41 (1978)
Doong, R.-K., Frankl, P.G.: The ASTOOT approach to testing object-oriented programs. ACM Trans. Softw. Eng. Methodol. 3(2), 101–130 (1994)
Duran, J.W., Ntafos, S.C.: Evaluation of random testing. IEEE Trans. Softw. Eng. 10(4), 438–444 (1984)
Dwyer, M.B., Elbaum, S.G., Person, S., Purandare, R.: Parallel randomized state-space search. In: International Conference on Software Engineering, pp. 3–12 (2007)
Een, N., Sorensson, N.: An extensible SAT-solver. In: Symposium on the Theory and Applications of Satisfiability Testing (SAT), pp. 502–518 (2003)
Erickson, J., Joshi, R.: Proving Correctness of a Flash Filesystem in ACL2. Unpublished manuscript in preparation (2006)
Gligoric, M., Groce, A., Zhang, C., Sharma, R., Alipour, A., Marinov, D.: Comparing non-adequate test suites using coverage criteria. In: International Symposium on Software Testing and Analysis, pp. 302–313 (2013)
Godefroid, P.: Verisoft: a tool for the automatic analysis of concurrent software. In: Computer-Aided Verification, pp. 172–186 (1997)
Godefroid, P.: Compositional dynamic test generation. In: Principles of Programming Languages, pp. 47–54 (2007)
Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: Programming Language Design and Implementation, pp. 213–223 (2005)
Groce, A.: (Quickly) testing the tester via path coverage. In: Workshop on Dynamic Analysis (2009)
Groce, A., Fern, A., Pinto, J., Bauer, T., Alipour, A., Erwig, M., Lopez, C.: Lightweight automated testing with adaptation-based programming. In: IEEE International Symposium on Software Reliability Engineering, pp. 161–170 (2012)
Groce, A., Havelund, K., Smith, M.H.: From scripts to specifications: the evolution of a flight software testing effort. In: 32nd International Conference on Software Engineering (ICSE’10), pp. 129–138. ACM SIG, Cape Town (2010)
Groce, A., Holzmann, G., Joshi, R.: Randomized differential testing as a prelude to formal verification. In: International Conference on Software Engineering, pp. 621–631 (2007)
Groce, A., Holzmann, G., Joshi, R., Xu, R.-G.: Putting flight software through the paces with testing, model checking, and constraint-solving. In: International Workshop on Constraints in Formal Verification, pp. 1–15 (2008)
Groce, A., Joshi, R.: Exploiting traces in program analysis. In: Tools and Algorithms for the Construction and Analysis of Systems, pp. 379–393 (2006)
Groce, A., Joshi, R.: Extending model checking with dynamic analysis. In: International Conference on Verification, Model Checking, and Abstract Interpretation, pp. 142–156 (2008)
Groce, A., Joshi, R.: Random testing and model checking: building a common framework for nondeterministic exploration. In: Workshop on Dynamic Analysis, pp. 22–28 (2008)
Groce, A., Zhang, C., Alipour, A., Eide, E., Chen, Y., Regehr, J.: Help, help, I’m being suppressed! the significance of suppressors in software testing. In: IEEE International Symposium on Software Reliability Engineering, pp. 390–399 (2013)
Groce, A., Zhang, C., Eide, E., Chen, Y., Regehr, J.: Swarm testing. In: International Symposium on Software Testing and Analysis, pp. 78–88 (2012)
Hamlet, R.: Testing programs with the aid of a compiler. IEEE Trans. Softw. Eng. 3(4), 279–290 (1977)
Hamlet, R., Taylor, R.: Partition testing does not inspire confidence. IEEE Trans. Softw. Eng. 16(12), 1402–1411 (1990)
Hamlet, R.: Random testing. In: Encyclopedia of Software Engineering, pp. 970–978. Wiley (1994)
Hamlet, R.: When only random testing will do. In: International Workshop on Random Testing, pp. 1–9 (2006)
Havelund, K.: Runtime verification of C programs. In: Proceedings of the 1st TestCom/FATES Conference, LNCS, vol. 5047. Springer, Tokyo (2008)
Havelund, K., Roşu, G.: Efficient monitoring of safety properties. Softw. Tools Technol. Tran. 6(2), 158–173 (2004)
Heimdahl, M.P.E., Rayadurgam, S., Visser, W., George, D., Gao, J.: Auto-generating test sequences using model checkers: a case study. In: International Workshop on Formal Approaches to Testing of Software (FATES) (2003)
Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: Principles of Programming Languages, pp. 58–70 (2002)
Hoare, T.: The verifying compiler: a grand challenge for computing research. J. ACM 50(1), 63–69 (2003)
Holzmann, G.: Static source code checking for user-defined properties. In: Conference on Integrated Design and Process Technology (2002)
Holzmann, G.: The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley Professional (2003)
Holzmann, G.: The power of ten: rules for developing safety critical code. IEEE Comput. 39(6), 95–97 (2006)
Holzmann, G., Joshi, R.: Model-driven software verification. In: SPIN Workshop on Model Checking of Software, pp. 76–91 (2004)
Holzmann, G., Joshi, R., Groce, A.: Swarm verification. In: Automated Software Engineering, pp. 1–6 (2008)
Holzmann, G., Joshi, R., Groce, A.: Tackling large verification problems with the swarm tool. In: SPIN Workshop on Model Checking of Software, pp. 134–143 (2008)
Holzmann, G.J., Joshi, R., Groce, A.: Swarm verification techniques. IEEE Trans. Softw. Eng. 37(6), 845–857 (2011)
Joshi, R., Holzmann, G.: A mini-challenge: build a verifiable filesystem. In: The Conference on Verified Software: Theories, Tools, Experiments (2005)
Kaufmann, M., Manolios, P., Moore, J.S.: Computer-Aided Reasoning: An Approach. Kluwer, Norwell (2000)
Kim, M., Choi, Y., Kim, Y., Kim, H.: Formal verification of a flash memory device driver — an experience report. In: SPIN Workshop on Model Checking of Software, pp. 144–159 (2008)
Kim, M., Choi, Y., Kim, Y., Kim, H.: Pre-testing flash device driver through model checking techniques. In: International Conference on Software Testing, Verification, and Validation, pp. 475–484 (2008)
Kim, M., Kannan, S., Lee, I., Sokolsky, O.: Java-MaC: a run-time assurance tool for Java. In: Proceedings of the 1st International Workshop on Runtime Verification (RV’01) of ENTCS, vol. 55, issue 2. Elsevier (2001)
Kim, M., Kim, Y., Kim, H.: Unit testing of flash memory device driver through a SAT-based model checker. In: Automated Software Engineering, pp. 198–207 (2008)
King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–394 (1976)
Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: sel4: formal verification of an OS kernel. In: ACM Symposium on Operating Systems Principles (2009)
Kroening, D., Clarke, E.M., Lerda, F.: A tool for checking ANSI-C programs. In: Tools and Algorithms for the Construction and Analysis of Systems, pp. 168–176 (2004)
Lei, Y., Andrews, J.H.: Minimization of randomized unit test cases. In: International Symposium on Software Reliability Engineering, pp. 267–276 (2005)
Majumdar, R., Sen, K.: Hybrid concolic testing. In: International Conference on Software Engineering, pp. 416–426 (2007)
McKeeman, W.: Differential testing for software. Digit. Tech. J. Digit. Equip. Corp. 10(1), 100–107 (1998)
Mercer, E., Jones, M.: Model checking machine code with the GNU debugger. In: SPIN Workshop on Model Checking Software (2005)
Miller, B.P., Fredriksen, L., So, B.: An empirical study of the reliability of UNIX utilities. Commun. ACM 105(33(12)), 32–44 (1990)
Moskewicz, M.W., Madigan, C.F., Zhao, Y., Zhang, L., Malik, S.: Chaff: engineering an efficient SAT solver. In: Design Automation Conference, pp. 530–535 (2001)
Musuvathi, M., Park, D., Chou, A., Engler, D., Dill, D.: CMC: a pragmatic approach to model checking real code. In: Symposium on Operating System Design and Implementation (2002)
Necula, G., McPeak, S., Rahul, S., Weimer, W.: CIL: intermediate language and tools for analysis and transformation of C programs. In: International Conference on Compiler Construction, pp. 213–228 (2002)
Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL - A Proof Assistant for Higher-Order Logic, LNCS, vol. 2283. Springer (2002)
Offutt, J., Abdurazik, A.: Mutation 2000: Mutation Testing in the Twentieth and the Twenty First Centuries (2000)
Pacheco, C., Lahiri, S.K., Ernst, M.D., Ball, T.: Feedback-directed random test generation. In: International Conference on Software Engineering, pp. 75–84 (2007)
Pettichord, B.: Design for testability. In: Pacific Northwest Software Quality Conference (2002)
Qadeer, S., Rehof, J.: Context-bounded model checking of concurrent software. In: Tools and Algorithms for the Construction and Analysis of Systems, pp. 93–107 (2005)
Reeves, G., Neilson, T.: The mars rover spirit FLASH anomaly. In: IEEE Aerospace Conference (2005)
Dwyer, R.M., Hatcliff, J.: Bogor: an extensible and highly-modular model checking framework. In: European Software Engineering Conference/Symposium on the Foundations of Software Engineering, pp. 267–276 (2003)
Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In: Foundations of Software Engineering, pp. 262–272 (2005)
Smith, M., Havelund, K.: Requirements capture with RCAT. In: 16th IEEE International Requirements Engineering Conference (RE’08). IEEE Computer Society, Barcelona (2008)
Tuch, H., Klein, G., Norrish, M.: Types, bytes, and separation logic. In: Principles of Programming Languages, pp. 97–108 (2007)
Various: A Collection of NAND Flash Application Notes, Whitepapers and Articles. Available at http://www.data-io.com/NAND/NANDApplicationNotes.asp
Runtime Verification: http://www.runtime-verification.org
Visser, W., Havelund, K., Brat, G., Park, S.J., Lerda, F.: Model checking programs. Autom. Softw. Eng. 10(2), 203–232 (2003)
Visser, W., Păsăreanu, C., Pelanek, R.: Test input generation for Java containers using state matching. In: International Symposium on Software Testing and Analysis, pp. 37–48 (2006)
Xu, R.-G., Majumdar, R., Godefroid, P.: Testing for buffer overflows with length abstraction. In: International Symposium on Software Testing and Analysis, pp. 19–28 (2008)
Yang, J., Sar, C., Engler, D.: EXPLODE: a lightweight, general system for finding serious storage system errors. In: Operating System Design and Implementation, pp. 131–146 (2006)
Yang, J., Sar, C., Twohey, P., Cadar, C., Engler, D.: Automatically generating malicious disks using symbolic execution. In: IEEE Symposium on Security and Privacy, pp. 243–257 (2006)
Yang, J., Twohey, P., Engler, D., Musuvathi, M.: Using model checking to find serious file system errors. In: Operating System Design and Implementation, pp. 273–288 (2004)
Yang, X., Chen, Y., Eide, E., Regehr, J.: Finding and understanding bugs in C compilers. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 283–294 (2011)
Zeller, A., Hildebrandt, R.: Simplifying and isolating failure-inducing input. IEEE Trans. Softw. Eng. 28(2), 183–200 (2002)
Author information
Authors and Affiliations
Corresponding author
Additional information
The research described in this publication was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. Funding was also provided by NASA ESAS 6G. (c) 2008. All Rights Reserved
Rights and permissions
About this article
Cite this article
Groce, A., Havelund, K., Holzmann, G. et al. Establishing flight software reliability: testing, model checking, constraint-solving, monitoring and learning. Ann Math Artif Intell 70, 315–349 (2014). https://doi.org/10.1007/s10472-014-9408-8
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10472-014-9408-8