1 Introduction

Incremental verification addresses the unique opportunities and challenges that arise when a verification task can be performed in an incremental way, as a sequence of smaller closely related tasks. We present an implementation of the incremental verification of software with assertions that uses the insights obtained from a successful verification of earlier assertions. As a fundamental building block in storing the insights we use function summaries known to provide speed-up through localizing and modularizing verification [12, 13].

In this paper we describe the HiFrog verification tool that uses Craig interpolation [6] in the context of Bounded Model Checking (BMC) [4] for constructing function summaries. The novelty of the tool is in the unique way it combines function summaries with the expressiveness of satisfiability modulo theories (SMT). The system currently supports verification based on the quantifier-free theories of linear real arithmetics (\(\mathrm {QF\_LRA}\)) and uninterpreted functions (\(\mathrm {QF\_UF}\)), in addition to propositional logic (\(\mathrm {QF\_BOOL}\)). Compared to our earlier propositional tool FunFrog [13], the SMT summaries are smaller and more efficient in verification. They are also often significantly more human-readable, enabling their easier reuse, as well as injection of summaries provided directly by the user. The difference is due to the propositional summaries being based on correctness proofs over circuit-level representation of arithmetic operations. Theory encoding uses instead directly arithmetic symbols in the summaries. In addition, the tool offers a rich set of features such as verification of recursive programs, different ways of optimizing the summaries with respect to both size and strength, efficient heuristics for removing redundant safety properties, and easy-to-understand witnesses of property violations that can be directly mapped to bugs in the source code.

The paper provides an architectural description of the tool, an introduction to its use, and experimental evidence of its performance. The tool together with a comprehensive demo is available at http://verify.inf.usi.ch/hifrog.

Related Work. Incremental verification is extensively researched in domains such as hardware verification, deductive verification, and model checking. Due to space constraints we provide only a brief review of recent related work. The CPAchecker tool is able to migrate predicates across program versions [3]. Deductive verification tools such as Viper and Dafny offer modular verification [11] and caching the intermediate verification results [9] respectively. CBMC is a symbolic bounded model-checker for C that to a limited extent exploits incremental capabilities of a SAT solverFootnote 1, but does not use or output any reusable information like function summaries. Similar to HiFrog, ESBMC also shares the CProver infrastructure and is based on an SMT solver. To the best of our knowledge, it does not support incremental verification [5].

2 Tool Overview

HiFrog consists of two main components SMT encoder and interpolating SMT solver; and the function summaries (see Fig. 1). The components are initially configured with the theory and the interpolation algorithms. The tool then processes assertions sequentially using function summaries when possible. The results of a successful assertion verification are stored as interpolated function summaries, and failed verifications trigger a refinement phase or the printing of an error trace. This section details the tool features.

Fig. 1.
figure 1

HiFrog overview. Grey and black arrows connect different modules of the tool (dashed - optional). Blue arrows represent the flow of the input/output data. (Color figure online)

Preprocessing. The source code is parsed and transformed into an intermediate goto-program using the goto-cc symbolic compiler. The loops are unwound to the pre-determined number of iterations. HiFrog identifies the set of assertions from the source code, reads the user-defined function summaries (if any) in the smtlib2-format, and makes them available for the subsequent analysis.

SMT Encoding and Function Summarization. For a given assertion, the goto-program is symbolically executed function-per-function resulting in the “modular” Static Single Assignment (SSA) form of the unwound program, i.e., a form where each function has its own isolated SSA-representation. To reduce the size of the SSA form, HiFrog performs slicing that keeps only the variables in the SSA form that are syntactically dependent on the variables in the assertion.

When the SSA form is pruned, HiFrog creates the SMT formula in the pre-determined logic (\(\mathrm {QF\_BOOL}\), \(\mathrm {QF\_UF}\) or \(\mathrm {QF\_LRA}\)). The modularity of the SSA form comes in handy when the function summaries of the chosen logic (either user-defined, interpolation-based, or treated nondeterministically) are available. If this is the case, the call to a function with the available summary is replaced by the summary. The final SMT formula is pushed to an SMT solver to decide its satisfiability.

Due to over-approximating nature of function summaries, the program encoded with the summaries may contain spurious errors. The summary refiner identifies and marks summaries directly involved in the detected error, and HiFrog returns to the encoding stage to replace the marked summaries by the precise (up to the pre-determined logic) function representations. Note that due to refinement, HiFrog reveals nested function calls (including recursive ones) which are again replaced by available summaries. For an unsatisfiable SMT formula, HiFrog extracts function summaries using interpolation. The extracted summaries are serialized in a persistent storage so that they are available for other HiFrog runs. For a more detailed description we refer to [13].

Theories. HiFrog supports three different quantifier-free theories in which the program can be modelled: bit-precise \(\mathrm {QF\_BOOL}\), \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\). The use of theories beyond \(\mathrm {QF\_BOOL}\) allows the system to scale to larger problems since encoding in particular the arithmetic operations using bit-precision can be very expensive. As the precise arithmetics often do not play a role in the correctness of the program, substituting them with linear arithmetics, uninterpreted functions, or even nondeterministic behavior might result in a significant reduction in model-checking time (see Sect. 3). If a property is proved using one of the light-weight theories \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\), the proof holds also for the exact BMC encoding of the program. However, the loss of precision can sometimes produce spurious counterexamples due to the over-approximating encoding. The light-weight theories therefore need to be refined (i.e., using theory refiner) to \(\mathrm {QF\_BOOL}\) if the provided counter-example does not correspond to a concrete counterexample.

Obtaining Summaries by Interpolation. HiFrog relies on different interpolation frameworks for the different theories it supports. As a result the generation of propositional, \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\) interpolants can be controlled with respect to strength and size by specifying an interpolation algorithm for a theory. For propositional logic we provide the Labeled Interpolation Systems [7] including the Proof-Sensitive interpolation algorithms [1]. Interpolation for \(\mathrm {QF\_UF}\) is implemented with duality-based interpolation [2], and a similar extension is applied to the interpolation algorithm for \(\mathrm {QF\_LRA}\) based on [10]. HiFrog also provides a range of techniques to reduce the size of the generated interpolants through removing redundancies in propositional proofs [12]: the algorithms RecyclePivotsWithIntersection and LowerUnits, structural hashing, and a set of local rewriting rules.

Assertion Optimizer. In addition to incremental verification of a set of assertions, HiFrog supports the basic functionality of classical model checkers to verify all assertions at once. For the cases when the set of assertions is too large, it can be optimized by constructing an assertion implication relation and exploiting it to remove redundant assertions [8]. In a nutshell, the assertion optimizer considers pairs of spatially close assertions \(a_i\) and \(a_j\) and uses the SMT solver to check if \(a_i\) conjoined with the code between \(a_i\) and \(a_j\) implies \(a_j\) (if there is any other assertion between \(a_i\) and \(a_j\) then it is treated as assumption). If the check succeeds then \(a_j\) is proven redundant and its verification can be safely skipped.

3 HiFrog Usage

We provide a Linux binary of HiFrog reading as input a C-program, assertions to be verified, a set of parameters and the interpolated or user-defined function summaries in the SMT-LIB2 format. HiFrog exploits the CProver framework and inherits some of its options (e.g., --unwind for the loop unrolling, --show-claims and --claim for managing the assertions checks); the ability for the user to declare and to use a nondet_TYPE() function of a specific numerical type (e.g., int, long, double, unsigned, in \(\mathrm {QF\_LRA}\) only) or add a __CPROVER_assume() statement to limit the domain to a specific range of values.

HiFrog uses \(\mathrm {QF\_LRA}\) by default but can be switched to \(\mathrm {QF\_UF}\) via the --logic option.Footnote 2 HiFrog uses a variety of interpolation and proof compression algorithms to control the the precision (with --itp-uf-algorithm option for \(\mathrm {QF\_UF}\), --itp-lra-algorithm option for \(\mathrm {QF\_LRA}\), and --itp-algorithm option for propositional interpolation) and the size (with --reduce-proof) of summaries. The summary storage is controlled using the --save-summaries and --load-summaries options. In between verification runs, the summaries contained in the corresponding files for \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\) might be edited manually. Note that due to the SMT encoding constraints HiFrog does not allow interchanging summaries between the theories. Finally, HiFrog supports the identification and reporting of redundant assertions with --claims-opt, a useful feature for some automatically generated assertions [8].

In the end of each verification run, HiFrog either reports VERIFICATION SUCCESSFUL or VERIFICATION FAILED accompanied by an error trace. An error trace presents a sequence of steps with a direct reference to the code and the values of variables in these steps. In most cases when \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\) introduce a spurious error, HiFrog outputs a warning, and thus the user is advised to use HiFrog with a more precise theory. HiFrog also reports the statistics on the running time and the number of the summary-refinements performed.

Fig. 2.
figure 2

Running time by \(\mathrm {QF\_BOOL}\) against \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\).

Experimental Results. We evaluated HiFrog on a large set of C programs coming from both academic and industrial sources such as SV-COMP. All benchmarks contained multiple assertions to be checked. To demonstrate the advantages of the SMT-based summarization, here we provide data for analysis of benchmarks containing 1086 assertions from which 474 were proven to hold using \(\mathrm {QF\_BOOL}\) (meaning that those properties satisfy the system specifications). Even despite the over-approximating nature of \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\), our experiments witnessed a large amount of properties which were also proven to be correct by employing the light-weight theories of HiFrog (namely, 50.65% and 69.2% of validated properties out of 474 for \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\) respectively).

Furthermore, those experiments revealed that model checking using the \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\)-based summarization was extremely efficient. Figure 2 presents two logarithmic plots for comparison of running timesFootnote 3 of HiFrog with \(\mathrm {QF\_BOOL}\) to respectively \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\). Each point represents a pair of verification runs of a holding assertion with the two corresponding theories using the interpolation-based summaries. Note that for most of the assertions, the verification with \(\mathrm {QF\_UF}\) and \(\mathrm {QF\_LRA}\) is an order of magnitude faster than the verification with \(\mathrm {QF\_BOOL}\).