1 Introduction

Satisfiability Modulo Theories (SMT) problem is a decision problem for logical formulas with respect to combinations of background theories such as arithmetic, bit-vectors, arrays, and uninterpreted functions. Z3 is an efficient SMT solver with specialized algorithms for solving background theories. SMT solving enjoys a synergetic relationship with software analysis, verification and symbolic execution tools. This is in many respects thanks to the emphasis on supporting domains commonly found in programs and specifications. There are several scenarios where part of a query posed by these tools can be cast in terms of formulas in a supported logic. It is then useful for the tool writer to have an idea of what are available supported logics, and have an idea of how formulas are solved. But interacting with SMT solvers is not always limited to posing a query as a single formula. It may require a sequence of interactions to obtain a usable answer and the need emerges for the tool writer for having an idea of what methods and knobs are available. In summary, this tutorial aims to answer the following types of questions through examples and a touch of theory:

  • What are the available features in Z3, and what are they designed to be used for?

  • What are the underlying algorithms used in Z3?

  • How can I program applications on top of Z3?

Figure 1 shows an overall systems diagram of Z3, as of version 4.8. The top left summarizes the interfaces to Z3. One can interact with Z3 over SMT-LIB2 scripts supplied as a text file or pipe to Z3, or using API calls from a high-level programming language that are proxies for calls over a C-based API. We focus on using the Python front-end as a means of interfacing with Z3, and start out describing the abstract syntax of terms and formulas accepted by Z3 in Sect. 2. Formulas draw from symbols whose meaning are defined by a set of Theories, Sect. 3. Solvers, Sects. 45 and 6, provide services for deciding satisfiability of formula. Tactics, Sect. 7, provide means for pre-processing simplification and creating sub-goals. Z3 also provides some services that are not purely satisfiability queries. Optimization, Sect. 8, services allow users to solve satisfiability modulo objective functions to maximize or minimize values. There are also specialized procedures for enumerating consequences (backbone literals) described in Sect. 4.6.6.

Fig. 1.
figure 1

Overall system architecture of Z3

1.1 Resources

The main point of reference for Z3 is the GitHub repository

There is an interactive tutorial using the SMT-LIB2 front-end on

Examples from this tutorial that are executable can be found on

There are several systems that program with Z3. They use a variety of front-ends, some use OCaml, others C++, and others use the SMT-LIB2 text interfaces. A few instances that use the Python front-end include

1.2 Sources

The material in this tutorial is assembled from several sources. Some of the running examples originate from slides that have circulated in the SAT and SMT community. The first SAT example is shamelessly lifted from Armin Biere’s SAT tutorials and other examples appear in slides by Natarajan Shankar.

2 Logical Interfaces to Z3

Z3 takes as input simple-sorted formulas that may contain symbols with pre-defined meanings defined by a theory. This section provides an introduction to logical formulas that can be used as input to Z3.

As a basis, propositional formulas are built from atomic variables and logical connectives. An example propositional logical formula accepted by Z3 is:

figure a

The example introduces two Boolean variables and . It then creates a object and adds three assertions.

The call to produces a verdict ; there is a satisfying assignment for the formulas. A satisfying model, where is false and is true, can be extracted using . For convenience the Python front-end to Z3 contains some shorthand functions. The function sets up a solver, adds assertions, checks satisfiability, and prints a model if one is available.

Propositional logic is an important, but smaller subset of formulas handled by Z3. It can reason about formulas that combine symbols from several theories, such as the theories for arrays and arithmetic:

figure b

The formula is valid. It is true for all values of integers x, y, z, array A, and no matter what the graph of the function f is. Note that we are using as shorthand for . We can manually verify the validity of the formula using the following argument: The integer constants and are created using the function that creates a list of integer constants. Under the assumption that , the right side of the implication simplifies to

figure c

The left side is a term in the theory of arrays, which captures applicative maps. updates the array at position with the value 3. Then retrieves the contents of the array at index , which in this case is 3. Dually, the negation of is unsatisfiable and the call to Z3 produces .

Formulas accepted by Z3 generally follow the formats described in the SMT-LIB2 standard [3]. This standard (currently at version 2.6) defines a textual language for first-order multi-sorted logic and a set of logics that are defined by a selection of background theories. For example, the logic of quantifier-free linear integer arithmetic, known in SMT-LIB2 as , is a fragment of first-order logic, where formulas are quantifier free, variables range over integers, interpreted constants are integers, the allowed functions are +, −, integer multiplication, division, remainder, modulus with a constant, and the allowed relations are, besides equality that is part of every theory, also <,<=, >=, >. As an example, we provide an SMT-LIB and a Python variant of the same arbitrary formula:

figure d

Python version:

figure e

It is also possible to extract an SMT-LIB2 representation of a solver state.

figure f

produces the output

figure g

2.1 Sorts

Generally, SMT-LIB2 formulas use a finite set of simple sorts. It includes the built-in sort , and supported theories define their own sorts, noteworthy , , bit-vectors for every positive bit-width n, arrays for every sort Index and Elem, and sequences for every sort S. It is also possible to declare new sorts. Their domains may never be empty. Thus, the formula

figure h

is unsatisfiable.

2.2 Signatures

Formulas may include a mixture of interpreted and free functions and constants. For example, the integer constants 0 and 28 are interpreted, while constants x, y used in the previous example are free. Constants are treated as nullary functions. Functions that take arguments can be declared, such as creates the function declaration that takes one integer argument and its range is an integer. Functions with Boolean range can be used to create formulas.

2.3 Terms and Formulas

Formulas that are used in assertions or added to solvers are terms of Boolean sort. Otherwise, terms of Boolean and non-Boolean sort may be mixed in any combination where sorts match up. For example

figure i

could produce a solution of the form

figure j

The model assigns to False, the graph of maps all arguments to , and the graph of maps all values to . Standard built-in logical connectives are . Bi-implication is a special case of equality, so from Python, when saying for Boolean a and b it is treated as a logical formula for the bi-implication of a and b.

A set of utilities are available to traverse expressions once they are created. Every function application has a function declaration and a set of arguments accessed as children.

figure k

2.4 Quantifiers and Lambda Binding

Universal and existential quantifiers bind variables to the scope of the quantified formula. For example

figure l

has no solution because no matter what value we assigned to , there is a value for that is non-positive and smaller than that value. The bound occurrence of is unrelated to the free occurrence where is restricted to be . The equality constraint should also not be mistaken for an assignment to . It is not the case that bound occurrences of are a synonym for . Notice that the slightly different formula

figure m

has a solution where x is 1 and the free occurrence of y is 2.

Z3 supports also \(\lambda \)-binding with rudimentary reasoning support based on a model-constructing instantiation engine. \(\lambda \)s may be convenient when expressing properties of arrays and Z3 uses array sorts for representing the sorts of lambda expressions. Thus, the result of is an array from integers to integers, that produces the value in the range from to and otherwise behaves as outside the range. Z3 reasons about quantifier free formulas that contains by instantiating the body of the \(\lambda \).

figure n

Lambda binding is convenient for creating closures. Recall that meaning of , where is an expression with free occurrences of and is as a function that takes two arguments and substitutes their values for and in . Z3 uses Lambda lifting, in conjunction with Reynold’s defunctionalization, to reduce reasoning about closures to universally quantified definitions. Z3 treats arrays as general function spaces. All first-order definable functions may be arrays. Some second-order theorems can be established by synthesizing \(\lambda \) terms by instantiation. Thus,

figure o

is provable. Z3 synthesizes an instantiation corresponding to for Q.

3 Theories

We will here summarize the main theories supported in Z3. In a few cases we will give a brief taste of decision procedures used for these theories. Readers who wish to gain a more in-depth understanding of how these decision procedures are implemented may follow some of the citations.

3.1 EUF: Equality and Uninterpreted Functions

The logic of equality and uninterpreted function, EUF, is a basic ingredient for first-order predicate logic. Before there are theories, there are constants, functions and predicate symbols, and the built-in relation of equality. In the following example, f is a unary function, x a constant. The first invocation of is feasible with a model where x is interpreted as an element in S and f is an identify function. The second invocation of solve is infeasible; there are no models where f maps x to anything but itself given the two previous equalities.

figure p

Decision procedures for quantifier-free EUF formulas are usually based on union-find [57] to maintain equivalence classes of terms that are equated. Pictorially, a sequence of equality assertions \(a = b, b = c, b = s\) produce one equivalence class that captures the transitivity of equality.

figure q

It is possible to check for satisfiability of disequalities by checking whether the equivalence classes associated with two disequal terms are the same or not. Thus, adding \(a \ne d\) does not produce a contradiction, and it can be checked by comparing a’s class representative with d’s representative.

figure r

On the other hand, when asserting \(c \ne s\), we can deduce a conflict as the two terms asserted to be disequal belong to the same class. Class membership with union-find data-structures is amortized nearly constant time.

figure s

Union-find alone is insufficient when function symbols are used, as with the following example,

In this case decision procedures require reasoning with the congruence rule

$$ x_1 = y_1 \cdots x_n = y_n \ \Rightarrow \ f(x_1, \ldots , x_n) = f(y_1, \ldots , y_n) $$

As a preparation for solving our example, let us introduce constants that can be used as shorthands for sub-terms. Thus, introduce constants \(v_1, v_2, v_3, v_4\) as representatives for the four compound sub-terms.

\(a = b, b = c, d = e, b = s, d = t, v_3 \ne {v_4}\) \(v_1 := g(e), v_2 := g(d), v_3 := f(a, v_2), {v_4 := f(b, v_1)}\)

Having only the equality information available we obtain the equivalence classes:

figure t

Working bottom-up, the congruence rule dictates that the classes for \(v_1\) and \(v_2\) should be merged. Thus,

$$ e = d \ \Rightarrow \ g(e) = g(d) $$

implies the following coarser set of equivalences.

figure u

At this point, the congruence rule can be applied a second time,

$$ a = b, v_2 = v_1 \Rightarrow \ f(a, v_2) = f(b, v_1) $$

producing the equivalence classes

figure v

The classes for \(v_3\) and \(v_4\) are now merged. As our original formula required these to be distinct, congruence closure reasoning determines that the formula is unsatisfiable.

3.1.1 Congruence Closure

We have implicitly used the notion of congruence closure [20] to check satisfiability of equalities. Let us more formally define this notion. Let \(\mathcal {T}\) be a set of terms and \(\mathcal {E}\) set of equalities over \(\mathcal {T}\). A congruence closure over \(\mathcal {T}\) modulo \(\mathcal {E}\) is the finest partition cc of \(\mathcal {T}\), such that:

  • if \((s = t) \in \mathcal {E}\), then st are in the same partition in cc.

  • for \(s := f(s_1, \ldots , s_k), t := f(t_1, \ldots , t_k) \in \mathcal {T}\),

    • if \(s_i, t_i\) are in same partition of cc for each \(i = 1, \ldots k\),

    • then st are in the same partition under cc.

Definition 1

\(cc: \mathcal {T} \rightarrow 2^{\mathcal {T}}\), maps term to its equivalence class.

3.1.2 EUF Models

A satisfiable version of the running example is:

\(a = b, b = c, d = e, b = s, d = t, f(a, g(d)) \ne {f(g(e), b)}\)

It induces the following definitions and equalities: \(a = b, b = c, d = e, b = s, d = t, v_3 \ne v_4\)

\(v_1 := g(e), v_2 := g(d), v_3 := f(a, v_2), v_4 := f(v_1, b)\) and we can associate a distinct value with each equivalence class.

figure w

When presenting the formula to Z3 as

figure x

it produces a model, that may look as follows:

figure y

In the model the value is a fresh constant that is distinct from . The graph for maps the arguments to . All other arguments are mapped by the clause to . The clause is used as the default interpretation of arguments that are not listed in the interpretation. The interpretation of S is a finite set

figure z

3.2 Arithmetic

Arithmetical constraints are nearly ubiquitous in software models. Even though mainstream software operates with finite precision arithmetic, that is modeled precisely using bit-vectors, arithmetic over unbounded integers can often be used in a sound way to model software. Furthermore, arithmetic over the reals has been used for diverse areas such as models of cyber-physical systems or for axiomatic economics.

3.2.1 Solving LRA: Linear Real Arithmetic

We provide an outline of Z3’s main procedure for solving formulas over linear real arithmetic [21]. It maintains a (dual) Simplex tableau that encodes equalities of the form \(Ax = 0\). Feasibility of the equalities depends on bounds, \(lo_j \le x_j \le hi_j\), currently associated with the variables. For the following formula

figure aa

Z3 introduces auxiliary variables \(s_1, s_2\) and represents the formula as

Only bounds (e.g., \(s_1 \le 2\)) are asserted during search.

The first two equalities form the tableau. Thus, the definitions \({s_1} \equiv x + y, {s_2} \equiv x + 2y\) produce the equalities

$$ {s_1} = x + y, \ {s_2} = x + 2y $$

They are equivalent to the normal form:

$$ {s_1} - x - y = 0, \ {s_2} - x - 2y = 0 $$

where \({s_1, s_2}\) are basic (dependent) and xy are non-basic. In dual Simplex tableaux, values of a non-basic variable \(x_j\) can be chosen between \(lo_j\) and \(hi_j\). The value of a basic variable is a function of non-basic variable values. It is the unique value that satisfies the unique row where the basic variable occurs. Pivoting swaps basic and non-basic variables and is used to get values of basic variables within bounds. For example, assume we start with a set of initial values \(x = y = s_1 = s_2 = 0\) and bounds \(x \ge 0, s_1 \le 2, s_1 \ge 2\). Then \(s_1\) has to be 2 and it is made non-basic. Instead y becomes basic:

$$ {y} + x - s_1 = 0, \ {s_2} + x - 2s_1 = 0 $$

The new tableau updates the assignment of variables to \(x = 0, s_1 = 2, s_2 = 4, y = 2\). The resulting assignment is a model for the original formula.

3.2.2 Solving Arithmetical Fragments

The solvers available to reason about arithmetical constraints are wildly different depending on what fragments of arithmetic is used. We summarize the main fragments, available decision procedures, and examples in Table 1 where xy range over reals and ab range over integers.

Table 1. Arithmetic theories
Table 2. Fragments of arithmetic

There are many more fragments of arithmetic that benefit from specialized solvers. We later discuss some of the fragments where integer variables are restricted to the values \(\{0, 1\}\) when describing Pseudo-Boolean constraints. Other fragments that are not currently handled in Z3 in any special way include fragments listed in Table 2.

A user of Z3 may appreciate that a domain can be modeled using a fragment of the theory of arithmetic that is already supported, or belongs to a class where no special support is available. On a practical side, it is worth noting that Z3 uses infinite precision arithmetic by default. Thus, integers and rationals are represented without rounding. The benefit is that the representation ensures soundness of the results, but operations by decision procedures may end up producing large numerals taking most of the execution time. Thus, users who produce linear arithmetic constraints with large coefficients or long decimal expansions may face performance barriers.

3.3 Arrays

The declaration

figure ab

introduces a constant of the array sort mapping integers to integers. We can solve constraints over arrays, such as

figure ac

which produces a solution where necessarily equals .

Z3 treats arrays as function spaces, thus a function can be converted to an array using a \(\lambda \)

figure ad

If has sort \(A \times B \rightarrow C\), then has sort . A set of built-in functions are available for arrays. We summarize them together with their representation using bindings.

figure ae

3.3.1 Deciding Arrays by Reduction to EUF

Formulas using the combinators are checked for satisfiability by expanding the respective \(\lambda \) definitions on sub-terms. We illustrate how occurrences of produce constraints over EUF. In the following, assume we are given a solver with ground assertions using arrays.

For each occurrence in of and , add the following assertions:

The theory of arrays is extensional. That is, two arrays are equal if they behave the same on all selected indices. When Z3 produces models for quantifier free formulas in the theory of extensional arrays it ensures that two arrays are equal in a model whenever they behave the same on all indices. Extensionality is enforced on array terms a, b in by instantiating the axiom of extensionality.

Since the universal quantifier occurs in a negative polarity we can introduce a Skolem function that depends on and and represent the extensionality requirement as:

We can convince ourselves that asserting these additional constraints force models of a solver to satisfy the array axioms. Suppose we are given a model M satisfying all the additional asserted equalities. These equalities enforce the axioms for on all indices that occur in . They also enforce extensionality between arrays: Two arrays are equal if and only if they evaluate to the same value on all indices in .

3.4 Bit-Vectors

Let us play with some bit-fiddling. The resource

Fig. 2.
figure 2

Bit-vector addition circuit

lists a substantial repertoire of bit-vector operations that can be used as alternatives to potentially more expensive operations. Note that modern compilers already contain a vast set of optimizations that automatically perform these conversions and Z3 can be used to check and synthesize such optimizations [38]. For example, to test whether a bit-vector is a power of two we can use a combination of bit-wise operations and subtraction:

figure af

The absolute value of a variable can be obtained using addition and xor with a sign bit.

figure ag

Notice that the Python variable corresponds to the expression , the right arithmetic (signed) shift of . Notice also, that in classical first-order logic, all operations are total. In particular, for bit-vector arithmetic is fully specified, in contrast to, say C, which specifies that is undefined when is a signed integer with the value \(-2^{31}\).

3.4.1 Solving Bit-Vectors

Z3 mostly uses a bit-blasting approach to deciding bit-vectors. By bit-blasting we refer to a reduction of bit-vector constraints to propositional logic by treating each bit in a bit-vector as a propositional variable. Let us illustrate how bit-vector addition is compiled to a set of clauses. The expression , where and are bit-vectors is represented by a vector of output bits. The relation between , and is provided by clauses the encode a ripple-carry adder seen in Fig. 2. The encoding uses an auxiliary vector of carry bits that are internal to the adder.

3.4.2 Floating Point Arithmetic

Floating points are bit-vectors with an interpretation specified by the IEEE floating point standard.

figure ah

It declares a floating point number with 3 bits in the exponent and 4 for the significand. The result of adding 10 to is . We see that 10 is represented as a floating point number with exponent 3, that is the bit-vector 011. The significand is 1010.

3.5 Algebraic Datatypes

The theory of first-order algebraic data-types captures the theory of finite trees. It is characterized by the properties that:

  • All trees are finite (occurs check).

  • All trees are generated from the constructors (no junk).

  • Two trees are equal if and only if they are constructed exactly the same way (no confusion).

A basic example of a binary tree data-type is given in Fig. 3.

It may produce the solution

figure ai

Similarly, one can prove that a tree cannot be a part of itself.

figure aj
Fig. 3.
figure 3

Binary tree datatypes

3.6 Sequences and Strings

The theory of strings and sequences extend on the theory of the free monoid with a few additional functions that are useful for strings and sequences. A length operation is built-in for strings and sequences, and there are operations for converting strings to natural numbers and back.

If the lengths of a prefix and suffix of a string add up to the length of the string, the string itself must be the concatenation of the prefix and suffix:

figure ak

One can concatenate single elements to a sequence as units:

figure al

There are two solvers available in Z3 for strings. They can be exchanged by setting the parameter

  • with contributions by Thai Trinh, or

  • by Murphy Berzish.

4 Interfacing with Solvers

Solvers maintain a set of formulas and supports satisfiability checking, and scope management: Formulas that are added under one scope can be retracted when the scope is popped. In this section we describe the interface to solvers. Section 5 provides a set of use cases and Sect. 6 describes the underlying solver implementations available in Z3.

4.1 Incrementality

Solvers can be used to check satisfiability of assertions in an incremental way. An initial set of assertions can be checked for satisfiability followed by additional assertions and checks. Assertions can be retracted using scopes that are pushed and popped. Under the hood, Z3 uses a one-shot solver during the first check. If further calls are made into the solver, the default behavior is to switch to an incremental solver. The incremental solver uses the SMT core, see Sect. 6.1.1, by default. For use-cases that don’t require all features by the SMT core, it may be beneficiary to use specialized solvers, such as solvers for finite domains (bit-vectors, enumeration types, bounded integers, and Booleans) as specified using the logic.

4.2 Scopes

The operations and create, respectively revert, local scopes. Assertions that are added within a are retracted on a matching . Thus, the following session results in the verdicts , , and .

figure am

4.3 Assumptions

Alternative to scopes, it is possible to check satisfiability under the assumption of a set of literals. Thus, the session

figure an

also produces the verdict as the conjunction of \(p \rightarrow q\), \(\lnot q\), p is . The method has the same effect of adding , and it adds as an implicit assumption. Our running example becomes

figure ao

4.4 Cores

We can extract a subset of assumptions used to derive unsatisfiability. Such subsets of assumptions are known as unsatisfiable cores, or simply as a core. In the following example, the unsatisfiable core has the single element . The unrelated assumption does not appear in the core.

figure ap

Note that we invoke prior to extracting a core. Cores are only available after the last call to produced .

By default solvers do not return minimal cores. A core is minimal if there is no proper subset that is also a core. The default behavior can be changed when the solver corresponds to either the SMT Core or SAT Core (if the underlying solver is created from a sequence of pre-processing tactics, core minimization is not guaranteed to take effect). To force core minimization users can rely on setting the following parameters:

figure aq

4.5 Models

When returns Z3 can provide a model that assigns values to the free constants and functions in the assertions. The current model is accessed using and it offers access to an interpretation of the active assertions in . Consider the example:

figure ar

A possible model for is:

figure as

You can access models. They have a set of entries. Each entry maps a constant or function declaration (constants are treated as nullary functions) to an interpretation. It maps constants to a constant expression and it maps functions to a function interpretation. The stub

figure at

iterates over the assignments in a model and produces the output

figure au

Function interpretations comprise a set of entries that specify how the function behaves on selected argument combinations, and a else_value that covers arguments not listed in the entries.

figure av

It produces the output

figure aw

The easiest way to access a model is to use the method that lets you evaluate arbitrary expressions over a model. It reduces expressions to a constant that is consistent with the way the model interprets the constants and functions. For our model from above

figure ax

produces the output 2, 0, 1.

4.6 Other Methods

4.6.1 Statistics

You can gain a sneak peak at what the solver did by extracting statistics. The call

figure ay

displays values of internal counters maintained by the decision procedures. They are mostly valuable when coupled with a detailed understanding of how the decision procedures work, but may be used as an introductory view into the characteristics of a search.

4.6.2 Proofs

Proof objects, that follow a natural deduction style, are available from the Solver interface [42]. You have to enable proof production at top level in order to retrieve proofs.

figure az

The granularity of proof objects is on a best-effort basis. Proofs for the SMT Core, are relatively fined-grained, while proofs for procedures that perform quantifier elimination, for instance QSAT described in Sect. 6.4, are exposed as big opaque steps.

4.6.3 Retrieving Solver State

You can retrieve the current set of assertions in a solver using , the set of unit literals using and literals that are non-units using . The solver state can be printed to SMT-LIB2 format using .

4.6.4 Cloning Solver State

The method clones the solver state into a new solver based on the context that is passed in. It is useful for creating separate non-interfering states of a solver.

4.6.5 Loading Formulas

The methods and adds constraints to a solver state from a file or string. Files are by default assumed to be in the SMT2 format. If a file name ends with they are assumed to be in the DIMACS propositional format.

4.6.6 Consequences

Product configuration systems use constraints to describe the space of all legal configurations. As parameters get fixed, fewer and fewer configuration options are available. For instance, once the model of a car has been fixed, some options for wheel sizes become unavailable. It is furthermore possible that only one option is available for some configurations, once some parameters are fixed. Z3 can be used to answer queries of the form: Given a configuration space of values V, when fixing values \(V_0 \subseteq V\), what is the largest subset \(V_0\subseteq V_1 \subseteq V\) of values that become fixed? Furthermore, for some value \(v_1\) that is fixed, provide an explanation, in terms of the values that were fixed in \(V_0\), for why \(v_1\) got fixed. The functionality is available through the method.

figure ba

produces the result:

figure bb

In terms for SAT terminology, consequence finding produces the set of all backbone literals. It is useful for finding fixed parameters [29] in product configuration settings.

Z3 relies on a procedure that integrates tightly with the CDCL, Conflict Driven Clause Learning [56], algorithm, and it contains two implementations of the procedure, one in the SAT core, another in the SMT core. Section 6.1.1 expands on CDCL and integrations with theories (Fig. 4).

4.6.7 Cubes

You can ask Z3 to suggest a case split or a sequence of case splits through the cubing method. It can be used for partitioning the search space into sub-problems that can be solved in parallel, or alternatively simplify the problem for CDCL engines.

Fig. 4.
figure 4

Basic cube and conquer

When the underlying solver is based on the SAT Core, see Sect. 6.2, it uses a lookahead solver to select cubes [25]. By default, the cuber produces two branches, corresponding to a case split on a single literal. The SAT Core based cuber can be configured to produce cubes that represent several branches. An empty cube indicates a failure, such as the solver does not support cubing (only the SMT and SAT cores support cubing, and generic solvers based on tactics do not), or a timeout or resource bound was encountered during cubing. A cube comprising of the Boolean constant indicates that the state of the solver is satisfiable. Finally, it is possible for the method to return an empty set of cubes. This happens when the state of is unsatisfiable. Each branch is represented as a conjunction of literals. The cut-off for branches is configured using

We summarize some of the configuration parameters that depend on the value of in Table 3.

Table 3. Lookahead parameters

Heuristics used to control which literal is selected in cubes can be configured using the parameter:

5 Using Solvers

We now describe a collection of algorithms. They are developed on top of the interfaces described in the previous section.

5.1 Blocking Evaluations

Models can be used to refine the state of a solver. For example, we may wish to invoke the solver in a loop where new calls to the solver blocks solutions that evaluate the constants to the exact same assignment.

figure bc

5.2 Maximizing Satisfying Assignments

Another use of models is to use them as a guide to a notion of optimal model. A maximal satisfying solution, in short mss, for a set of formulas is a subset of that is consistent with respect to the solver state and cannot be extended to a bigger subset of without becoming inconsistent relative to . We provide a procedure, from [40], for finding a maximal satisfying subset in Fig. 5. It extends a set greedily by adding as many satisfied predicates from in each round as possible. If it finds some predicate that it cannot add, it notes that it is a backbone with respect to the current . As a friendly hint, it includes the negation of when querying the solver in future rounds.

Fig. 5.
figure 5

An algorithm for computing maximal satisfying subsets

Exercise 5a: Suppose is a list corresponding to digits in a binary number and is ordered by most significant digit down. The goal is to find an mss with the largest value as a binary number. Modify to produce such a number.

5.3 All Cores and Correction Sets

The Marco procedure [37] combines models and cores in a process that enumerates all unsatisfiable cores and all maximal satisfying subsets of a set of formulas with respect to solver . It maintains a solver that tells us which subsets of are not yet known to be a superset of a core or a subset of an .

Fig. 6.
figure 6

The MARCO algorithm for computing cores and maximal satisfying assignments

Efficiently enumerating cores and correction sets is an active area of research. Many significant improvements have been developed over the above basic implementation [1, 2, 40, 49, 53].

5.4 Bounded Model Checking

Figure 7 illustrates a bounded model checking procedure [4] that takes a transition system as input and checks if a goal is reachable. Transition systems are described as

$$ \langle Init , Trans , Goal , \mathcal {V}, \mathcal {Y} \rangle $$

where \( Init \) is a predicate over \(\mathcal {V}\), that describes the initial states, \( Trans \) is a transition relation over \(\mathcal {V} \times \mathcal {Y} \times \mathcal {V}'\). The set of reachable states is the set inductively defined as valuations s of \(\mathcal {V}\), such that either \(s\,\models \, Init \) or there is a reachable \(s_0\) and values v for \(\mathcal {Y}\), such that \(s_0, v, s\,\models \, Trans \). A goal is reachable if there is some reachable state where \(s\,\models \, Goal \) (Fig. 6).

In Python we provide the initial condition as , using variables , the transition that uses variables , and using variables . Bounded model checking unfolds the transition relation until it can establish that the goal is reachable. Bounded model checking diverges if is unreachable. The function takes an expression and a list of pairs of the form and replaces variables by in .

Fig. 7.
figure 7

Bounded model checking of a transition system

Example 1

Let us check whether there is some k, such that \(\underbrace{3 + 3 + \ldots + 3}_k = 10\) when numbers are represented using 4 bits. The corresponding transition system uses a state variable which is named in the next state. Initially and in each step the variable is incremented by 3. The goal state is .

figure bd

Bounded model checking is good for establishing reachability, but does not produce certificates for non-reachability (or safety). The IC3 [9] algorithm is complete for both reachability and non-reachability. You can find a simplistic implementation of IC3 using the Python API online

5.5 Propositional Interpolation

It is possible to compute interpolants using models and cores [12]. A procedure that computes an interpolant I for formulas A, B, where \(A \wedge B\) is unsatisfiable proceeds by initializing \(I = true \) and saturating a state \(\lceil A, B, I \rceil \) with respect to the rules:

The partial interpolant I produced by pogo satisfies \(B \vdash I\). It terminates when \(A \vdash \lnot I\). The condition \(A \wedge I \not \vdash \lnot L\) ensures that the algorithm makes progress and suggests using an implicant \(L' \supseteq L\) of \(A \wedge I\) in each iteration. Such an implicant can be obtained from a model for \(A \wedge I\) (Fig. 8).

Fig. 8.
figure 8

Propositional interpolation

Example 2

The (reverse) interpolant between \(A: x_1 = a_1 \ne a_2 \ne x_2\) and \(B: x_1 = b_1 \ne b_2 = x_2\) using vocabulary \(x_1, x_2\) is \(x_1 \ne x_2\). It is implied by B and inconsistent with A.

figure be

5.6 Monadic Decomposition

Suppose we are given a formula \(\varphi [x, y]\) using variables x and y. When is it possible to rewrite it as a Boolean combination of formulas \(\psi _1(x), \ldots , \psi _k(x)\) and \(\theta _1(y), \ldots , \theta _n(y)\)? We say that the formulas \(\psi _j\) and \(\theta _j\) are monadic; they only depend on one variable. An application of monadic decomposition is to convert extended symbolic finite transducers into regular symbolic finite transducers. The regular versions are amenable to analysis and optimization. A procedure for monadic decomposition was developed in [58], and we here recall the Python prototype (Fig. 9).

Fig. 9.
figure 9

Monadic decomposition

Example 3

A formula that has a monadic decomposition is the bit-vector assertion for x, y being bit-vectors of bit-width 2k.

$$ y > 0 \wedge (y \& (y - 1)) = 0 \wedge (x \& (y \% ((1 \ll k) - 1))) \ne 0 $$

We can compute the monadic decomposition

figure bf

6 Solver Implementations

There are five main solvers embedded in Z3. The SMT Solver is a general purpose solver that covers a wide range of supported theories. It is supplemented with specialized solvers for SAT formulas, polynomial arithmetic, Horn clauses and quantified formulas over theories that admit quantifier-elimination.

6.1 SMT Core

The SMT Solver is a general purpose solver that covers a wide range of supported theories. It is built around a CDCL(T) architecture where theory solvers interact with a SAT + EUF blackboard. Theory solvers, on the right in Fig. 10, communicate with a core that exchanges equalities between variables and assignments to atomic predicates. The core is responsible for case splitting, which is handled by a CDCL SAT solver, and for letting each theory learn constraints and equalities that are relevant in the current branch.

Fig. 10.
figure 10

Architecture of Z3’s SMT core solver.

To force using the SMT solver a user can create a simple solver using the function .

The SMT solver integrates two strategies for quantifier instantiation. By default, both strategies are enabled. To disable them, one has to disable automatic configuration mode and then disable the instantiation strategy:

figure bg

6.1.1 CDCL(T): SAT + Theories

The architecture of mainstream SMT solvers, including Z3’s SMT core, uses a SAT solver to enumerate combinations of truth assignments to atoms. The truth assignments satisfy a propositional abstraction of the formula. Theory solvers are used to check if assignment admit a model modulo the theories. The resulting architecture is known as DPLL(T) [52], but we refer to this as CDCL(T) because it really relies on SAT solvers that incorporate Conflict Driven Clause Learning [56], which goes beyond the algorithm associated with DPLL [18]. Importantly, CDCL supplies facilities for learning new clauses during search. The learned clauses block future case splits from exploring the same failed branches. Take the following example

figure bh

we obtain a propositional formula

figure bi

It is satisfiable and a possible truth assignment is

figure bj

It requires satisfiability of the following conjunction:

figure bk

It is already the case that

figure bl

is . To avoid this assignment we require also satisfying the blocking clause

figure bm

The new truth assignment

figure bn

produces

figure bo

which is satisfiable. The example illustrates the steps used in a CDCL(T) integration where the Theory Solver processes the final result of a SAT Solver. We can simulate this procedure using Z3’s API. Figure 11 shows a CDCL(T) solver that leverages a propositional solver to check a propositional abstraction and a theory solver whose role is to check conjunctions of literals produced by . Figure 12 lists auxiliary routines required to create the abstraction.

Fig. 11.
figure 11

Simple CDCL(T)

Fig. 12.
figure 12

Auxiliary routines for lazy CDCL(T)

We call it a simple CDCL(T) solver as it does not expose important features to drive performance. Importantly, efficient CDCL(T) solvers integrate theory propagation that let theories interact with the SAT solver to propagate assignments to atoms. Instead of adding blocking clauses by the time the SAT solver is done the theory solver interacts tightly with the SAT solver during back-jumping.

Exercise 6a

Dual Propagation and Implicants: The propositional assignment produced by is not necessarily minimal. It may assign truth assignments to literals that are irrelevant to truth of the set of clauses. To extract a smaller assignment, one trick is to encode the negation of the clauses in a separate dual solver. A truth assignment for the primal solver is an unsatisfiable core for the dual solver. The exercise is to augment with a dual solver to reduce assignments sent to the theory solver.

6.1.2 Theories + Theories

In practice we need to solve a combination of theories. The formulas we used in the initial example

figure bp

integrate several theory solvers. For modularity, it is desirable to maintain separate solvers per theory. To achieve this objective the main questions that an integration needs to address are:

  • Determine when the union of two theories \({T}_1 \cup {T}_2\) is consistent.

  • Given solvers for \({T}_1\) and \({T}_2\), how can we build a solver for \({T}_1 \cup {T}_2\).

We can address this objective when there is an effective theory \({T}_0\) over the shared signature of \({T}_1, {T}_2\), that when embedable into \({T}_1, {T}_2\) implies \({T}_1 \cup \, {T}_2\) is consistent. Sufficient conditions for this setting were identified by Nelson and Oppen [50]:

Theorem 1

The union of two consistent, disjoint, stably infinite theories is consistent.

Let us define the ingredients of this theorem.

Disjoint Theories. Two theories are disjoint if they do not share function/constant and predicate symbols. \(=\) is the only exception. For example,

  • The theories of arithmetic and arrays are disjoint.

    • Arithmetic symbols: .

    • Array symbols:

The process of purification can be used as a formal tool to bring formulas into signature-disjoint form. It introduces fresh symbols for shared sub-terms. A purified version of our running example is:

figure bq

In reality, purification is a no-op: the fresh variables correspond directly to nodes in the abstract syntax trees for expressions.

Stably Infinite Theories. A theory is stably infinite if every satisfiable quantifier-free formula is satisfiable in an infinite model.

  • EUF and arithmetic are stably infinite.

  • Bit-vectors are not.

Nelson-Oppen Combination. Let \({T}_1\) and \({T}_2\) be consistent, stably infinite theories over disjoint (countable) signatures. Assume satisfiability of conjunction of literals can be decided in \(O({T}_1(n))\) and \(O({T}_2(n))\) time respectively. Then

  1. 1.

    The combined theory \({T}\) is consistent and stably infinite.

  2. 2.

    Satisfiability of quantifier free conjunction of literals can be decided in \(O(2^{n^2} \times ({T}_1(n) + {T}_2(n)))\).

  3. 3.

    If \({T}_1\) and \({T}_2\) are convex, then so is \({T}\) and satisfiability in \({T}\) can be decided in \(O(n^3 \times ({T}_1(n) + {T}_2(n)))\).

Convexity. A theory \({T}\) is convex if for every finite sets S of literals, and every disjunction \(a_1 = b_1 \vee \ldots \vee a_n = b_n\):

Many theories are convex and therefore admit efficient theory combinations

  • Linear Real Arithmetic is convex.

  • Horn equational theories are convex.

    • Horn equations are formulas of the form \(a_1 \ne b_1 \vee \ldots a_n \ne b_n \vee a = b\).

Finally note that every convex theory with non trivial models is stably infinite.

But, far from every theory is convex. Notably,

  • Integer arithmetic

    • \(1 \le a \le 2, b = 1, c = 2\) implies \(a = b \vee a = c\).

  • Real non-linear arithmetic

    • \(a^2 = 1, b = 1, c = -1\) implies \(a = b \vee a = c\).

  • The theory of arrays

    • implies or .

A Reduction Approach to Theory Combination[35, 43]. Theory Combination in Z3 is essentially by reduction to a set of core theories comprising of Arithmetic, EUF and SAT. Bit-vectors and finite domains translate to propositional SAT. Other theories are reduced to core theories. We provided an example of this reduction in Sect. 3.3.

6.1.3 E-Matching Based Quantifier Instantiation

E-matching [46] based quantifier instantiation uses ground terms to find candidate instantiations of quantifiers. Take the example

figure br

The smallest sub-term that properly contains is . This pattern contains all the bound variables of the universal quantifier. Under the ground equality and instantiation of by , it equals . This triggers an instantiation by the following tautology

figure bs

Chasing the equalities we derive , which proves the implication.

The example illustrated that E-matching takes as starting point a pattern term p, that captures the variables bound by a quantifier. It derives an substitution \(\theta \), such that \(p\theta \) equals some useful term t, modulo some useful equalities. A useful source of useful terms are the current ground terms \(\mathcal {T}\) maintained during search, and the current asserted equalities during search may be used as the useful equalities. The congruence closure structure cc introduced in Sect. 3.1.1 contains relevant information to track ground equalities. For each ground term it represents an equivalence class of terms that are congruent in the current context. Now, given a pattern p we can compute a set of substitutions modulo the current congruence closure by invoking

$$ \bigcup _{t \in \mathcal {T}} match(p, t, \emptyset ) $$

where E-matching is defined by recursion on the pattern p:

It is not always possible to capture all quantified variables in a single pattern. For this purpose E-matching is applied to a sequence of patterns, known as a multi-pattern, that collectively contains all bound variables.

The secret sauce to efficiency is to find instantiations

  • with as little overhead as possible,

  • across large sets of terms, and

  • incrementally.

Z3 uses code-trees [54] to address scale bottlenecks for search involving thousands of patterns and terms.

6.1.4 Model-Based Quantifier Instantiation

E-matching provides a highly syntactic restriction on instantiations. An alternative to E-matching is based on using a current model of the quantifier-free part of the search state. It is used to evaluate the universal quantifiers that have to be satisfied in order for the current model to extend to a full model of the conjunction of all asserted constraints. We call this method Model-Based Quantifier Instantiation [8, 22, 47, 59]. Take the following example:

figure bt

It may produce a model of the form

figure bu

The interpretation of maps 3 to \(-10\), and all other values x are mapped to however \(f(11 + x)\) is interpreted (which happens to be the constant 2).

The method that allowed finding this satisfying assignment is based on a model evaluation loop. At a high level it can be described as the following procedure, which checks satisfiability of

$$ \psi \wedge \forall x \ . \ \varphi [{x}] $$

where \(\psi \) is quantifier free and for sake of illustration we have a single quantified formula with quantifier free body \(\varphi \). The Model-Based Quantifier Instantiation, MBQI, procedure is described in Fig. 13:

Fig. 13.
figure 13

Model-Based Quantifier Instantiation algorithm. Notice that this proto-algorithm code is not directly executable.

We use the notation \(t^M\) to say that t is partially evaluated using interpretation M, for example:

  • Let \(M := [y \mapsto 3, f(x) \mapsto if \ x = 1\ then \ 3\ else \ 5]\), and

  • \(t := y + f(y) + f(z)\), then

  • \(t^M = 3 + 5 + if \ z = 1\ then \ 3\ else \ 5\)

For our example formula assume we have a model of the quantifier-free constraints as follows

figure bv

The negated body of the quantifier, instantiated to the model is

figure bw

It is satisfied with the instantiation , which is congruent to under the current model. We therefore instantiate the quantifier with and add the constraint

figure bx

But notice a syntactic property of the quantifier body. It can be read as a definition for the graph of over the range . This format is an instance of guarded definitions [28]. Hence, we record this reading when creating the next model for . In the next round, , , and are instantiated as before, and evaluates to \(-10\) as before, but elsewhere follows the graph of , and thus the model for is given by .

Model-Based Quantifier Instantiation is quite powerful when search space for instantiation terms is finite. It covers many decidable logical fragments, including EPR (Effectively Propositional Reasoning), UFBV (uninterpreted functions and bit-vectors), the Array property fragment [10] and extensions [22]. We will here only give a taste with an example from UFBV [59]:

figure by

The following model is a possible solution:

figure bz

is the quantified logic of uninterpreted functions of bit-vectors. All sorts and variables have to be over bit-vectors, and standard bit-vector operations are allowed. It follows that the problem is finite domain and therefore decidable. It isn’t easy, however. The quantifier-free fragment is not only NP hard, it is NEXPTIME hard; it can be encoded into EPR [55]. The quantified fragment is another complexity jump. Related to , decision procedures for quantified bit-vector formulas were developed by John and Chakraborty in [31, 32], and by Niemetz et al. in [51].

Recall that EPR is a fragment of first-order logic where formulas have the quantifier prefix \(\exists \varvec{x} \forall \varvec{y}\), thus a block of existential quantified variables followed by a block of universally quantified variables. The formula inside the quantifier prefix is a Boolean combination of equalities, disequalities between bound variables and free constants as well as predicate symbols applied to bound variables or free constants. Noteworthy, EPR formulas do not contain functions. It is easy to see that EPR is decidable by first replacing the existentially quantified variables by fresh constants and then instantiate the universally quantified variables by all combinations of the free constant. If the resulting ground formula is satisfiable, we obtain a finite model of the quantified formula by bounding the size of the universe by the free constants. The formula \(\exists x \forall y. (p(x, y) \vee q(a, y) \vee y = a)\), where a is a free constant, is in EPR.

6.2 SAT Core

The SAT Core is an optimized self-contained SAT solver that solves propositional formulas. It takes advantage of the fact that it operates over propositional theories and performs advanced in-processing steps. The SAT solver also acts as a blackboard for select Boolean predicates that express cardinality and arithmetical (pseudo-Boolean) constraints over literals.

Generally, theories that are finite domain, are solved using the SAT solver. Z3 identifies quantifier-free finite domain theories using a designated logic . It supports propositional logic, bit-vector theories, pseudo-Boolean constraints, and enumeration data-types. For example, the following scenario introduces an enumeration type for color, and bit-vectors , . It requires that at least 2 out of three predicates are satisfied.

figure ca

is satisfiable, and a possible model is:

figure cb

Figure 14 shows the overall architecture of Z3’s SAT solver.

Fig. 14.
figure 14

Architecture of Z3’s SAT Solver

There are four main components. Central to the SAT solver is an engine that performs case splits, lemma learning and backtracking search. It is the main CDCL engine and is structured similar to mainstream CDCL solvers. It can draw on auxiliary functionality.

6.2.1 In-processing

In-processing provides a means for the SAT solver to simplify the current set of clauses using global inferences. In-processing is performed on a periodic basis. It integrates several of the techniques that have been developed in the SAT solving literature in the past decade, known as Blocked Clause Elimination, Asymmetric Literal Addition, Asymmetric Covered Clause Elimination, Subsumption, Asymmetric Branching [24].

6.2.2 Co-processing

A set of co-processors are available to support alternative means of search. The SAT Core solver can also be a co-processor of itself.

  • spawns 3 concurrent threads that use walk-sat to find a satisfying assignment while the main CDCL solver attempts to find either a satisfying assignment or produce an empty clause.

  • spawns 2 concurrent threads, in additional to the main thread, to find a proof of the empty clause or a satisfying assignment. The threads share learned unit literals and learned clauses.

  • spawns 1 concurrent thread that uses a local search heuristic that integrates unit propagation.

  • enables the lookahead solver as a simplifier during in-processing. It enables slightly more powerful techniques for learning new units and binary clauses.

The lookahead solver is used to find case splits through the Cube features, described in Sect. 4.6.7.

6.2.3 Boolean Theories

Three classes of Boolean functions are supported using specialized Boolean theory handlers. They are optional, as many problems can already be solved using the SAT core where the functions have been clausified. The cardinality and Pseudo-Boolean theory handlers are suitable for constraints where the encoding into clauses causes a significant overhead. The Xor solver is unlikely to be worth using, but is available for evaluation.

Cardinality Constraints. Cardinality constraints are linear inequalities of the form

$$ \sum _{i=1}^n F_i \ge k, \ \sum _{i=1}^n F_i \le k $$

where \(F_i\) are formulas and k is a constant between 1 and n. They say that at least k of the \(F_i\);s have to hold, and at most k of the \(F_i\)’s hold, respectively. Cardinality constraints do not have to appear at top-level in formulas. They can be nested in arbitrary sub-formulas and they can contain arbitrary formulas. For instance,

figure cc

has no solution.

The cardinality solver is enabled by setting the parameter

If the parameter is false, cardinality constraints are compiled to clauses. A few alternative encoding methods are made available, and they can be controlled using the parameter .

Pseudo-Boolean Constraints. Pseudo-Boolean constraints generalize cardinality constraints by allowing coefficients in the linear inequalities. They are of the form

$$ \sum _{i=1}^n a_iF_i \ge k, \ \sum _{i=1}^n a_iF_i \le k $$

where \(a_i\) are positive natural numbers. A value of \(a_i\) above k is legal, but can be safely truncated to k without changing the meaning of the formulas.

The constraints

$$ p + 2q + 2r \le 2 \wedge p + 2u + 3r \ge 4 \wedge u $$

can be written as

figure cd

and have a solution

figure ce

The pseudo-Boolean solver is enabled by setting the parameter

Other available options for compiling Pseudo-Boolean constraints are , , and . They compile Pseudo-Booleans into clauses.

6.3 Horn Clause Solver

The Horn Solver contains specialized solvers for Constrained Horn Clauses [5, 23, 26, 27, 39]. As a default it uses the SPACER Horn clause solver by Arie Gurfinkel to solve Horn clauses over arithmetic [36]. A Constrained Horn Clause is a disjunction of literals over a set of uninterpreted predicates and interpreted functions and interpreted predicates (such as arithmetical operations and relations ). The uninterpreted predicates, may occur negatively without restrictions, but only occur positively in at most one place.

The solver also contains a Datalog engine that can be used to solve Datalog queries (with stratified negation) over finite domains and “header spaces” that are large finite domains, but can be encoded succinctly using ternary bit-vectors. The context contains facilities for building Horn clauses, and generally a set of stratified Datalog rules, and for querying the resulting set of rules and facts. Additional information on the Fixedpoint engine can be found on https://rise4fun.com/z3/tutorial/fixedpoints.

We provide a very simple illustration of Horn clause usage here. McCarthy’s 91 function illustrates nested recursion in a couple of lines, but otherwise makes no sense: It computes a function that can be described directly as

figure cf

We will pretend this is a partial and interesting specification and prove this automatically using Horn clauses.

figure cg

Rewriting the functional program into logical form can be achieved by introducing a binary relation between the input and output of , and then representing the functional program as a logic program, that is, a set of Horn clauses. The assertions are also Constrained Horn Clauses: they contain the uninterpreted predicate negatively, but have no positive occurrences of .

figure ch

Z3 finds a solution for that is a sufficient invariant to establish the assertions.

We get a better view of the invariant for by evaluating it on symbolic inputs and .

figure ci

6.4 QSAT

The QSAT Solver is a decision procedure for satisfiability of select theories that admit quantifier elimination. It can be used to check satisfiability of quantified formulas over Linear Integer (Fig. 15), Linear Real (Fig. 16), Non-linear (polynomial) Real arithmetic (Fig. 17), Booleans, and Algebraic Data-types (Fig. 18). It is described in [6]. It is invoked whenever a solver is created for one of the supported quantified logics, or a solver is created from the tactic.

Fig. 15.
figure 15

Given a supply of 5 and 7 cent stamps. Is there a lower bound, after which all denominations of stamps can be produced? Thus, find , such that every larger or equal to can be written as a non-negative combination of 5 and 7.

Fig. 16.
figure 16

The set of reals is dense

Fig. 17.
figure 17

Quantified non-linear real polynomial arithmetic

Fig. 18.
figure 18

Checking for winning positions in a game of successors

Figure 18 encodes a simple game introduced in [16]. There is no SMT-LIB2 logic for quantified algebraic data-types so we directly instantiate the solver that performs QSAT through a tactic. Section 7 provides a brief introduction to tactics in Z3.

The solver builds on an abstraction refinement loop, originally developed for quantifier elimination in [41]. The goal of the procedure is, given a quantifier-free f, find a quantifier free G, such that \(G \equiv \exists \varvec{v} \;.\; F\). It assumes a tool, project, that eliminates \(\varvec{v}\) from a conjunction M into a satisfiable strengthening. That is, project(\(\varvec{v}, M\)) \(\Rightarrow \ \exists \varvec{v}\; .\; M\). The procedure, uses the steps:

  • Initialize: \(G \leftarrow \bot \)

  • Repeatedly: find conjunctions M that imply \(F \wedge \lnot G\)

  • Update: \(G \leftarrow G \vee \,\text {project}(\varvec{v}, M)\).

An algorithm that realizes this approach is formulated in Fig. 19.

Fig. 19.
figure 19

Quantifier elimination by core extraction and projection. Notice that this proto-algorithm code is not directly executable

QESTO [30] generalizes this procedure to nested QBF (Quantified Boolean Formulas), and the implementation in Z3 generalizes QESTO to SMT. The approach is based on playing a quantifier game. Let us illustrate the game for Boolean formulas. Assume we are given:

Then the game proceeds as follows:

  • \(\forall \): starts. \(u_1, u_2, \overline{e}_1, \overline{e}_2\,\models \,\lnot F\).

  • \(\exists \): strikes back. .

  • \(\forall \): has to backtrack. It doesn’t matter what \(u_1\) is assigned to. It is already the case that \(u_2, e_1, \overline{e}_2\,\models \,F\).

  • \(\forall \): learns \(\lnot u_2\).

  • \(\forall \): .

  • \(\exists \): counters - .

  • \(\forall \): has lost!. It is already the case that .

To summarize the approach:

  • There are two players

    • \(\forall \) - tries to satisfy \(\lnot F\)

    • \(\exists \) - tries to satisfy F

  • Players control their variables. For example, take \(\exists x_1 \forall x_2 \exists x_3 \forall x_4 \ldots F\) at round 2:

    • value of \(x_1\) is already fixed,

    • \(\forall \) fixes value of \(x_2\),

    • \(\forall \) fixes value of \(x_4\), but can change again at round 4,

    • \(\forall \) can guess values of \(x_3\) to satisfy \(\lnot F\).

  • Some player loses at round \(i + 2\):

    • Create succinct no-good to strengthen F resp. \(\lnot F\) depending on who lost.

    • Backjump to round i (or below).

The main ingredients to the approach is thus projection and strategies.

  • Projections are added to learn from mistakes. Thus, a player avoids repeating same losing moves.

  • Strategies prune moves from the opponent.

We will here just illustrate an example of projection. Z3 uses model based projection [36, 44] to find a satisfiable quantifier-free formula that implies the existentially quantified formula that encodes the losing state.

Example 4

Suppose we would want to compute a quantifier-free formula that implies \(\exists x \ . \ (2y \le x \wedge y - z \le x \wedge x \le z)\). Note that the formula is equivalent to a quantifier free formula:

\(\exists x \ . \ (2y \le x \wedge y - z \le x \wedge x \le z) \equiv (y - z \le 2y \le z) \vee (2y \le y - z \le z)\)

but the size of the equivalent formula is quadratic in the size of the original formula. Suppose we have a satisfying assignment for the formula inside of the existential quantifier. Say \(M = [x \mapsto 3, y \mapsto 1, z \mapsto 6]\). Then \(2y^M = 2\) and \((y-z)^M = -5\), and therefore \(2y > y - z\) under M. The greatest lower bound for x is therefore 2y and we can select this branch as our choice for elimination of x. The result of projection is then \(y - z \le 2y \le z.\)

6.5 NLSat

The solver created when invoking relies on a self-contained engine that is specialized for solving non-linear arithmetic formulas [34]. It is a decision procedure for quantifier-free formulas over the reals using polynomial arithmetic.

figure cj

The NLSat solver is automatically configured if the formula is syntactically in the fragment. So one can directly use it without specifying the specialized solver:

figure ck

7 Tactics

In contrast to solvers that ultimately check the satisfiability of a set of assertions, tactics transform assertions to sets of assertions, in a way that a proof-tree is comprised of nodes representing goals, and children representing subgoals. Many useful pre-processing steps can be formulated as tactics. They take one goal and create a subgoal.

7.1 Tactic Basics

You can access the set of tactics

figure cl

and for additional information obtain a description of optional parameters:

figure cm

We will here give a single example of a tactic application. It transforms a goal to a simplified subgoal obtained by eliminating a quantifier that is trivially reducible and by combining repeated formulas into one.

figure cn

Additional information on tactics is available from [45], https://rise4fun.com/Z3/tutorial/strategies and http://www.cs.tau.ac.il/~msagiv/courses/asv/z3py/strategies-examples.htm.

7.2 Solvers from Tactics

Given a tactic , the method extracts a solver object that applies the tactic to the current assertions and reports or if it is able to reduce subgoals to a definite answer.

7.3 Tactics from Solvers

There is no method that corresponds to producing a tactic from a solver. Instead Z3 exposes a set of built-in tactics for the main solvers. These are accessed through the names , , (and for quantified non-linear real arithmetic, e.g., the logic ), for and for .

7.4 Parallel Z3

The parameter enables Z3’s parallel mode. Selected tactics, including , that uses the SAT solver for sub-goals the option, when enabled, will cause Z3 to use a cube-and-conquer approach to solve subgoals. The tactics , and provide direct access to the parallel mode, but you have to make sure that is true to force them to use parallel mode. You can control how the cube-and-conquer procedure spends time in simplification and cubing through other parameters under the name-space.

The main option to toggle is . It caps the maximal number of threads. By default, the maximal number of threads used by the parallel solver is bound by the number of processes.

8 Optimization

Depending on applications, learning that a formula is satisfiable or not, may not be sufficient. Sometimes, it is useful to retrieve models that are optimal with respect to some objective function. Z3 supports a small repertoire of objective functions and invokes a specialized optimization module when objective functions are supplied. The main approach for specifying an optimization objective is through functions that specify whether to find solutions that maximize or minimize values of an arithmetical (in the case of Z3, the term has to a linear arithmetic term) or bit-vector term t. Thus, when specifying the objective \( maximize (t)\) the solver is instructed to find solutions to the variables in t that maximizes the value of t. An alternative way to specify objectives is through soft constraints. These are assertions, optionally annotated with weights. The objective is to satisfy as many soft constraints as possible in a solution. When weights are used, the objective is to find a solution with the least penalty, given by the sum of weights, for unsatisfied constraints. From the Python API, one uses the context to specify optimization problems. The context relies on the built-in solvers for solving optimization queries. The architecture of the optimization context is provided in Fig. 20.

Fig. 20.
figure 20

Optimization engines in Z3

The context provides three main extensions to satisfiability checking:

figure co

Using soft assertions is equivalent to posing an 0-1 optimization problem. Thus, the following formulations are equivalent and Z3 detects the second variant and turns it into a set of weighted soft assertions.

figure cp

8.1 Multiple Objectives

It is possible to add multiple objectives. There are three ways to combine objective functions.

figure cq

For instance, Pareto objectives can be specified as follows:

figure cr

8.2 MaxSAT

The conventional definition of MaxSAT is to minimize the number of violated soft assertions. There are several algorithms for MaxSAT, and developing new algorithms is a very active area of research. We will here describe MaxRes from [48]. It is also Z3’s default solver for MaxSAT/MaxSMT problems. As an illustration assume we are given an unweighted (all soft constraints have weight 1) MaxSAT problem \(F, F_1, \ldots , F_5\), where the first four soft constraints cannot be satisfied in conjunction with the hard constraint F. Thus, we have the case:

$$ A: F, \underbrace{F_1, F_2, F_3, F_4}_{core}, F_5 $$

The system is transformed to a weakened MaxSAT problem as follows:

$$ A': F, \ F_2 \vee F_1, F_3 \vee (F_2 \wedge F_1), F_4 \vee (F_3 \wedge (F_2 \wedge F_1)), F_5 $$

The procedure is formalized in Fig. 21. We claim that by solving \(A'\), we can find an optimal solution to A. For this purpose, consider the cost of a model with respect to a MaxSAT problem. The cost, written cost(MA) is the number of soft constraints in A that are false under M. More precisely,

Lemma 1

For every model M of F, \(cost(M, A) = 1 + cost(M, A')\)

Proof

(of Lemma 1) . To be able to refer to the soft constraints in the transformed systems \(A'\) we will give names to the new soft constraints, such that \(F_1'\) is a name for \(F_2 \vee F_1\), \(F_2'\) names \(F_3 \vee (F_2 \wedge F_1)\), \(F_3'\) is the name for \(F_4 \vee (F_3 \wedge (F_2 \wedge F_1))\) and \(F_4'\) is the new name of \(F_5\).

Consider the soft constraints in the core. Since it is a core, at least one has to be false under M. Let j be the first index among where \(M(F_j)\) is false. Then M evaluates all other soft constraints the same, e.g., \(\forall i < j: \ M(F'_i) = M(F_{i})\), and \(\forall i > j: M(F'_{i-1}) = M(F_i)\). \(\Box \)

Thus, eventually, it is possible to satisfy all soft constraints (weakening could potentially create 0 soft constraints), and a solution to the weakened system is an optimal solution.

Fig. 21.
figure 21

Core based MaxSAT using MaxRes

Weighted assertions can be handled by a reduction to unweighted MaxSAT. For example,

figure cs

Efficient implementations of MaxSAT flatten weights on demand. Given a core of soft constraints it is split into two parts: In one part all soft constraints have the same coefficient as the weight of the soft constraint with the minimal weight. The other part comprises of the remaining soft constraints. For our example, is a core and the weight of is 2, while the weight of is 3. The weight of can therefore be split into two parts, one where it has weight 2, and the other where it has weight 1. Applying the transformation for the core we obtain the simpler MaxSAT problem:

figure ct

9 Summary

This tutorial has presented an overview of main functionality exposed by Z3. By presenting some of the underlying algorithms in an example driven way we have attempted to give a taste of the underlying decision procedures and proof engines. By presenting examples of programming queries on top of Z3 we have attempted to provide an introduction to turning SMT solving into a service for logic queries that go beyond checking for satisfiability of a single formula. Tuning extended queries on top of the basic services provided by SAT and SMT solvers is a very active area of research with new application scenarios and new discoveries.