
1 Introduction

Higher-order model checking [14], or model checking of higher-order recursion schemes (HORS), has recently been applied to automated verification of higher-order functional programs [9, 11, 12, 15, 17]. A HORS is a higher-order tree grammar for generating a (possibly infinite) tree, and higher-order model checking is concerned about whether the tree generated by a given HORS satisfies a given property. Although the worst-case complexity of higher-order model checking is huge (\(k\)-EXPTIME complete for order-\(k\) HORS [14]), practical algorithms for higher-order model checking have been developed [4, 8, 16, 18], which do not always suffer from the \(k\)-EXPTIME bottleneck.

Fig. 1. Overview: indirect vs. direct style

A typical approach to applying higher-order model checking to program verification [11] is as follows. As illustrated on the left-hand side of Fig. 1, a source program, which is a call-by-value higher-order functional program, is first abstracted to a call-by-value, higher-order Boolean functional program by predicate abstraction. The Boolean functional program is further translated to a HORS, which is essentially a call-by-name higher-order functional program, and then model checked. We call this approach indirect, as it involves many steps of program transformation. The indirect approach has the advantage that, thanks to the CPS transformation used in the translation to HORS, various control structures (such as exceptions and call/cc) and evaluation strategies (call-by-value and call-by-name) can be handled uniformly. The multi-step transformations, however, incur a number of drawbacks, such as code explosion and an increase in the order of programs (where the order of a program is the largest order of its functions; a function is first-order if both its input and output are base values, second-order if it can take a first-order function as an argument, and so on). The multi-step transformations also make it difficult to propagate the result of higher-order model checking back to the source program, e.g., for counterexample-guided abstraction refinement (CEGAR) or certificate generation.

In view of these drawbacks, we advocate higher-order model checking in a more direct style, where call-by-value higher-order Boolean programs are model checked directly, without the translation to HORS, as illustrated on the right-hand side of Fig. 1. This avoids the increase in the size and order of programs (recall that the complexity of higher-order model checking is \(k\)-EXPTIME complete for order-\(k\) HORS; thus the order is the most critical parameter for the complexity). In addition, the direct approach can take advantage of optimizations based on the control structure of the original program, which is lost during the CPS transformation in the indirect style.

Our goal is then to develop an appropriate algorithm that directly solves the model-checking problem for call-by-value higher-order Boolean programs. We focus here on the reachability problem (of checking whether a given program reaches a certain program point); any safety property can be reduced to the reachability problem in a standard manner.

From a purely theoretical point of view, this goal has been achieved by Tsukada and Kobayashi [20]. They developed an intersection type system for reachability checking of call-by-value higher-order Boolean programs, which gives a better (and, in a certain sense, exact) upper bound on the worst-case complexity of the problem than the naïve indirect approach. However, their algorithm, which basically enumerates all the types of subterms, is far from practical, since the number of candidate types for a given subterm is hyper-exponential.

Now the challenge is to find an appropriate subset of types to be considered for a given program: this subset has to be large enough to correctly analyze the behaviour of the program and, at the same time, small enough to be manipulated in realistic time. In previous work [4, 18] for a call-by-name language, this subset is computed with the help of control-flow analysis, which gives an over-approximation of the behaviour of the program. A naïve adaptation of this idea to a call-by-value language, however, does not work well, because flow information tends to be less accurate for call-by-value programs: in an application \( t_1\,t_2 \), one has to over-approximate the evaluation results of both \( t_1 \) and \( t_2 \) under call-by-value, whereas under call-by-name \( t_2 \) itself is the exact actual argument. We propose an algorithm, the 0-Control-Flow-Analysis (0-CFA) guided saturation algorithm, that deeply integrates the type system and the 0-CFA. The integration reduces the inaccuracy of the flow analysis and makes the algorithm efficient, although it is technically more involved.

We have implemented the algorithm, and confirmed through experiments that for large instances, our direct approach for model checking call-by-value Boolean programs outperforms the previous indirect approach of translating a call-by-value program to HORS and then model-checking the HORS.

The contributions of this paper are summarized as follows.

  • A practical algorithm for the call-by-value reachability problem in direct style. The way to reduce type candidates using control-flow analysis is quite different from that of previous algorithms [4, 18].

  • The formalization of the algorithm and a proof of its correctness. The proof is more involved than the corresponding proof of the correctness of Tsukada and Kobayashi’s algorithm [20] due to the flow-based optimization, and also than that of the correctness of the HorSat algorithm [4], due to the call-by-value evaluation strategy.

  • Implementation and evaluation of the algorithm.

The rest of this paper is structured as follows. Section 2 defines the target language, its semantics, and the reachability problem. Section 3 gives an intersection type system that characterizes the reachability of a program. Section 4 describes the 0-CFA guided saturation algorithm, and proves its correctness. Section 5 describes the experimental results. Section 6 discusses related work, and the final section concludes the paper. Some proofs and figures are omitted in this version and are available in the long version [19].

2 Call-by-Value Reachability Problem

2.1 Target Language

We first introduce notations used in the rest of this paper. We write \(\mathbf {Lab}\), \(\mathbf {Var}\), and \(\mathbf {Fun}\) for the countable sets of labels, variables, and function symbols, respectively. The meta-variable \(\ell \) ranges over labels, \(x\) and \(y\) over variables, and \(f\) and \(g\) over function symbols. We write \(\mathrm {dom}(g)\) for the domain of a function g, and \(\tilde{x}\) for a finite sequence \(x_1,\dots ,x_k\). Let \(\rho \) be a map. We write \(\rho [ x \mapsto v ]\) for the map that sends \(x\) to \(v\) and behaves like \(\rho \) on every other argument. We write \(\emptyset \) for both the empty set and the empty map (whose domain is the empty set).

Fig. 2. Syntax

The target language of the reachability analysis in this paper is a simply-typed, call-by-value lambda calculus with Booleans, tuples, and global mutual recursion. The syntax of the language is given in Fig. 2. Each subterm carries a label from \(\mathbf {Lab}\), which is used by the control-flow analysis described later. We call labeled phrases terms and unlabeled ones expressions. The expression \(\mathrm {op}(\tilde{t})\) is a Boolean operation, such as \(t_1 \wedge t_2\), \(t_1 \vee t_2\), or \(\lnot t\), and \( \pi _i^k t \) is the i-th (zero-indexed) projection of the k-tuple t. The expression \( t_1 \oplus t_2 \) is a non-deterministic choice between \( t_1 \) and \( t_2 \). The terms \( \varOmega \) and \( \mathbf {fail}\) represent divergence and failure, respectively. The assume-expression \( \mathbf {assume}\ t_1; t_2 \) evaluates \(t_2\) only if \(t_1\) evaluates to \(\mathbf {true}\) (and diverges if \(t_1\) evaluates to \(\mathbf {false}\)).
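To make the syntax concrete, the following Haskell sketch shows one possible representation of labeled terms and expressions. It is only an illustration of Fig. 2; the constructor names are ours and need not match the representation used in the HiBoch implementation described in Sect. 5.

```haskell
-- A sketch of the labeled abstract syntax: every subterm pairs an
-- expression with its label.
type Lab = Int
type Var = String
type Fun = String

data Sort = SBool | STuple [Sort] | SArrow Sort Sort
  deriving (Eq, Show)

data Term = Term Expr Lab
  deriving (Eq, Show)

data Expr
  = EBool Bool          -- true / false
  | EVar Var            -- variable
  | EFun Fun            -- global function symbol
  | ELam Var Sort Term  -- lambda (x : kappa). t
  | EApp Term Term      -- application  t1 t2
  | ETuple [Term]       -- tuple  <t1, ..., tk>
  | EProj Int Int Term  -- i-th (zero-indexed) projection of a k-tuple
  | EOp String [Term]   -- Boolean operation  op(~t)
  | EChoice Term Term   -- non-deterministic choice  t1 (+) t2
  | EOmega              -- divergence  Omega
  | EFail               -- failure  fail
  | EAssume Term Term   -- assume t1; t2
  | ELet Var Term Term  -- let x = t1 in t2
  deriving (Eq, Show)
```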

A sort is the simple type of a term, which is either the Boolean sort \(\mathbf {bool}\), a tuple sort, or a function sort; we use the word “sort” to distinguish simple types from the intersection types introduced later. A local sort environment (resp. global sort environment) is a finite map from variables (resp. function symbols) to sorts. A global definition is a finite map from function symbols to lambda-expressions. A program consists of a global definition D, a global sort environment \(\mathcal {K}\), and a term, called the main term.

Next, we define well-sorted terms. Let \(\mathcal {K}\) be a global sort environment, \(\varSigma \) a local sort environment, and \(\kappa \) a sort. A sort judgment for a term t (resp. an expression e) is of the form \(\mathcal {K}, \varSigma \vdash t : \kappa \) (resp. \(\mathcal {K}, \varSigma \vdash e : \kappa \)). The sort system of the target language is the standard simple type system with the following primitive types: \(\mathbf {fail}: \kappa \), \(\varOmega : \kappa \), \(\mathbf {assume} : \mathbf {bool}\rightarrow \kappa \), and \(\oplus : \kappa \rightarrow \kappa \rightarrow \kappa \). The inference rules are given in the long version [19].

The depth of a sort \(\kappa \), written \(\mathtt {dep}(\kappa )\), is defined as follows: \(\mathtt {dep}(\mathbf {bool}) = 1\), \(\mathtt {dep}(\langle \kappa _1,\dots ,\kappa _k \rangle ) = \max (\mathtt {dep}(\kappa _1),\dots ,\mathtt {dep}(\kappa _k))\), and \(\mathtt {dep}(\kappa _1 \rightarrow \kappa _2) = 1 + \max (\mathtt {dep}(\kappa _1),\mathtt {dep}(\kappa _2))\). The depth of a well-sorted term t, written \(\mathtt {dep}(t)\), is the maximum depth of sorts which appear in the derivation tree of \(\mathcal {K},\varSigma \vdash t : \kappa \). Let D be a global definition, and \(\mathcal {K}\) be a global sort environment. We write \(\vdash D : \mathcal {K}\) if \(\mathrm {dom}(D) = \mathrm {dom}(\mathcal {K})\) and \(\forall f \in \mathrm {dom}(D).\,\mathcal {K},\emptyset {} \vdash D(f) : \mathcal {K}(f)\). We say program \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\) has sort \(\kappa \) if \(\vdash D : \mathcal {K}\), and \(\mathcal {K},\emptyset {} \vdash t_0 : \kappa \). We say P is well-sorted if P has some sort \(\kappa \). The depth of a well-sorted program P is the maximum depth of terms in P.
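For illustration, the depth function can be transcribed directly. The following sketch uses the Sort type from the syntax sketch above; giving the degenerate empty tuple sort depth 1 is our choice, consistent with every sort having depth at least 1.

```haskell
-- dep(kappa), following the definition above.
data Sort = SBool | STuple [Sort] | SArrow Sort Sort

dep :: Sort -> Int
dep SBool          = 1
dep (STuple ks)    = maximum (1 : map dep ks)   -- max of the component depths (>= 1)
dep (SArrow k1 k2) = 1 + max (dep k1) (dep k2)
```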

Example 1

Consider the program \(P_1 = \mathbf {let}\ \mathbf {rec}\ D_1 : \mathcal {K}_1\ \mathbf {in}\ t_1\) where:

\(P_1\) is well-sorted and has sort \(\mathbf {bool}\).

2.2 Semantics

We define the operational semantics of the language in the style of Nielson et al. [13]; this style of operational semantics is convenient for discussing flow analysis later. First, we define the following auxiliary syntactic objects:

The term \(\mathbf {close}\ p\ \mathbf {in}\ \rho \) represents a closure, and the term \(\mathbf {bind}\ \rho \ \mathbf {in}\ t\) evaluates t under the environment \(\rho \). An environment is a finite map from variables to values. A value is either a Boolean, a function symbol, a closure, or a tuple of values. Note that values (resp. pre-values) are a subclass of terms (resp. expressions). We extend the sort inference rules to these terms as follows:

figure a

A sort judgment for environments is of the form \(\mathcal {K}\vdash \rho : \varSigma \), which means that for each binding \(x \mapsto v\) in \(\rho \), v has type \(\varSigma (x)\).
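The runtime objects introduced above can be pictured as follows. This Haskell sketch is only an illustration of environments, values, and closures as described in the text; the body of a lambda-abstraction is elided, and the representation need not match the actual implementation.

```haskell
import qualified Data.Map as M

type Var = String
type Fun = String

-- Stands for a lambda-abstraction p; its body is elided in this sketch.
data Lam = Lam
  deriving (Eq, Show)

-- A value is a Boolean, a function symbol, a closure, or a tuple of values.
data Value
  = VBool Bool
  | VFun Fun
  | VClose Lam Env   -- close p in rho
  | VTuple [Value]
  deriving (Eq, Show)

-- An environment is a finite map from variables to values.
type Env = M.Map Var Value
```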

Next, we define the reduction relations. We fix a well-sorted program \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\). Let \(\rho \) be an environment, and \(\varSigma \) a local sort environment such that \(\mathcal {K}\vdash \rho : \varSigma \). The reduction relation for terms is of the form \(\rho \vdash _D t \longrightarrow t'\), where \(\mathcal {K}, \varSigma \vdash t : \kappa \) for some sort \(\kappa \). The reduction rules are given in Fig. 3. In rule (Op-2), \([\![\mathrm {op}]\!]\) denotes the Boolean function that corresponds to the operation \(\mathrm {op}\), and \(FV(p)\) denotes the set of free variables of \(p\). We write \(\rho \vdash _D t \longrightarrow ^* t'\) for the reflexive and transitive closure of \(\rho \vdash _D t \longrightarrow t'\).

Fig. 3. Reduction relation

2.3 Reachability Problem

We are interested in the reachability problem: whether a program P may execute the command \(\mathbf {fail}\). We define the set of error expressions, called \(\mathbf {Err}\), as follows:

Then, the reachability problem is defined as follows.

Definition 1

(Reachability Problem). A program \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\) is unsafe if \( \emptyset \vdash _D t_0 \longrightarrow ^* \phi ^{\ell } \) holds for some \(\phi \in \mathbf {Err}\). A well-sorted program P is called safe if P is not unsafe. Given a well-sorted program, the task of the reachability problem is to decide whether the program is safe.

Example 2

For example, \(P_1 = \mathbf {let}\ \mathbf {rec}\ D_1 : \mathcal {K}_1\ \mathbf {in}\ t_1\) in Example 1 is unsafe, and the program \(P_2 = \mathbf {let}\ \mathbf {rec}\ D_1 : \mathcal {K}_1\ \mathbf {in}\ t_2\) below is safe.

$$\begin{aligned} t_2&= (\mathbf {let}\ w = (\mathbf {true}^{20} \oplus \mathbf {false}^{21})^{22} \ \mathbf {in}\ (f^{23}\ (\lambda (x : \mathbf {bool}).\ w^{24})^{25})^{26})^{27} \end{aligned}$$

3 Intersection Type System

In this section, we present an intersection type system that characterizes the unsafety of programs, which is an extension of Tsukada and Kobayashi’s type system [20].

The sets of value types \(\sigma \) and term types \(\tau \) are defined by:

$$\begin{aligned} \sigma \,\, (\mathrm {value\, types}) \,\,&{::}{=}\,\, \mathbf {true}\mid \mathbf {false}\mid \langle \sigma _1,\dots ,\sigma _k \rangle \mid \bigwedge _{i \in I} (\sigma _i \rightarrow \tau _i) \\ \tau \,\, (\mathrm {term\, types}) \,\,&{::}{=}\,\, \sigma \mid \mathbf {fail} \end{aligned}$$

Value types are the types of values, and term types are the types of terms, as the names suggest. Intuitively, the type \(\mathbf {true}\) describes the value \(\mathbf {true}\). A type of the form \(\langle \sigma _1,\dots ,\sigma _k \rangle \) describes a tuple whose i-th element has type \(\sigma _i\). A type of the form \(\bigwedge _{i \in I} (\sigma _i \rightarrow \tau _i)\), where I is a finite set, describes a function that returns a term of type \(\tau _i\) whenever its argument has type \(\sigma _i\), for each \(i \in I\). We write \(\bigwedge \emptyset \) for this type when I is the empty set. A term type is either a value type or the special type \(\mathbf {fail}\), which describes a term that evaluates to an error term. A local type environment \(\varDelta \) (resp. global type environment \(\varGamma \)) is a finite map from variables (resp. function symbols) to value types.
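For illustration, these types could be represented as follows, with an intersection kept as the list of its arrow components. This is a sketch of our own and need not match the representation used in HiBoch.

```haskell
-- A sketch of value types and term types. An intersection
-- /\_{i in I} (sigma_i -> tau_i) is kept as the list of its arrow
-- components; the empty list plays the role of /\ emptyset.
data VType
  = VTrue                    -- the type true
  | VFalse                   -- the type false
  | VTuple [VType]           -- <sigma_1, ..., sigma_k>
  | VInter [(VType, TType)]  -- /\_{i in I} (sigma_i -> tau_i)
  deriving (Eq, Show)

data TType
  = TVal VType               -- a value type
  | TFail                    -- the special type fail
  deriving (Eq, Show)
```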

The refinement relations \(\sigma \,\,{::}\,\, \kappa \) and \(\tau \,\,{::}\,\, \kappa \) for value/term types are defined by the following rules:

figure b

We extend this refinement relation to local/global type environments in the natural way, writing \(\varDelta \,\,{::}\,\, \varSigma \) and \(\varGamma \,\,{::}\,\, \mathcal {K}\).

Fig. 4. Typing rules

There are four kinds of type judgments in the intersection type system:

  • \(\varGamma ,\varDelta \vdash t : \tau \) for term t;

  • \(\varGamma , \varDelta \vdash e : \tau \) for expression e;

  • \(\varGamma , \varDelta \vdash \tilde{t} : \tilde{\sigma }\) or \(\varGamma , \varDelta \vdash \tilde{t} : \mathbf {fail}\) for sequence \(\tilde{t}\); and

  • \(\varGamma \vdash \rho : \varDelta \) for environment \(\rho \).

The typing rules for these judgments are given in Fig. 4. Intuitively, the type judgment for terms is an “under-approximation” of the evaluation of the term. The judgment \(\varGamma , \varDelta \vdash t : \sigma \) means that there is a reduction \(\rho \vdash _D t \longrightarrow ^* v\) for some value v of type \(\sigma \), and \(\varGamma , \varDelta \vdash t : \mathbf {fail}\) means that \(\rho \vdash _D t \longrightarrow ^* \phi ^\ell \) for some error expression \(\phi \). For example, for the term \(t_1 = \langle \mathbf {true}\oplus \mathbf {false}, \mathbf {true} \rangle ^\ell \), the judgments \({\varGamma }, \varDelta \vdash t_1 : \langle \mathbf {true},\mathbf {true} \rangle \) and \({\varGamma }, \varDelta \vdash t_1 : \langle \mathbf {false},\mathbf {true} \rangle \) should hold because there are reductions \( \rho \vdash _D t_1 \longrightarrow ^* \langle \mathbf {true},\mathbf {true} \rangle ^\ell \) and \( \rho \vdash _D t_1 \longrightarrow ^* \langle \mathbf {false},\mathbf {true} \rangle ^\ell \). Furthermore, for the term \(t_2 = (\mathbf {let}\ x = \mathbf {true}\oplus \mathbf {false} \ \mathbf {in}\ \mathbf {assume}\ x; \mathbf {fail})^\ell \), we have \(\varGamma , \varDelta \vdash t_2 : \mathbf {fail}\) because \(\rho \vdash _D t_2 \longrightarrow ^* (\mathbf {bind}\ \rho [x \mapsto \mathbf {true}]\ \mathbf {in}\ \mathbf {fail})^\ell \). We remark that a term that always diverges (e.g., \(\varOmega \) and \(\mathbf {assume}\ \mathbf {false}; t\)) does not have any type. The judgments \(\varGamma , \varDelta \vdash \tilde{t} : \tilde{\sigma }\) and \(\varGamma , \varDelta \vdash \tilde{t} : \mathbf {fail}\) are auxiliary judgments, which reflect the evaluation strategy that evaluates \(\tilde{t}\) from left to right. For example, the rule (Seq-F) means that the evaluation of \(\tilde{t} = t_1 \dots t_k \) fails (i.e., \(\varGamma , \varDelta \vdash \tilde{t} : \mathbf {fail}\)) if and only if some \(t_i\) fails (i.e., \(\varGamma ,\varDelta \vdash t_i : \mathbf {fail}\)) and \(t_1,\dots ,t_{i-1}\) evaluate to some values \(\tilde{v}\) (i.e., \(\varGamma , \varDelta \vdash {t_1,\dots ,t_{i-1}} : \tilde{\sigma } \)). The judgment for environments, \(\varGamma \vdash \rho : \varDelta \), means that for each binding \([x \mapsto v]\) in \(\rho \), v has type \(\varDelta (x)\).

The type system above is an extension of Tsukada and Kobayashi's type system [20]. The main differences are:

  • Our target language supports tuples as first-class values, while in their language tuples may occur only as function arguments. By supporting them directly, we avoid the hyper-exponential explosion in the number of types that would be caused by a CPS transformation eliminating first-class tuples.

  • Our target language also supports let-expressions. Although it is possible to define them as syntactic sugar, supporting them as primitives makes our type inference algorithm more efficient.

We define some operators used in the rest of this section. Let \(\sigma \) and \(\sigma '\) be value types that are refinements of some function sort. The intersection of \(\sigma \) and \(\sigma '\), written as \(\sigma \wedge \sigma '\), is defined by:

$$ \bigwedge _{i \in I}(\sigma _i \rightarrow \tau _i) \wedge \bigwedge _{j \in J}(\sigma _j \rightarrow \tau _j) = \bigwedge _{k \in (I \cup J)}(\sigma _k \rightarrow \tau _k), $$

where \(\sigma = \bigwedge _{i \in I}(\sigma _i \rightarrow \tau _i)\) and \(\sigma ' = \bigwedge _{j \in J}(\sigma _j \rightarrow \tau _j)\). Let D be a global definition, and \(\varGamma \) and \(\varGamma '\) be global type environments. We say \(\varGamma '\) is a D-expansion of \(\varGamma \), written as \(\varGamma \lhd _D \varGamma '\), if the following condition holds:

$$ \varGamma \lhd _D \varGamma ' \iff \begin{array}{l} \mathrm {dom}(\varGamma ) = \mathrm {dom}(\varGamma '), \\ \forall f \in \mathrm {dom}(\varGamma ). \exists \sigma .\ \varGamma , \emptyset \vdash D(f) : \sigma \ \text {and}\ \varGamma '(f) = (\varGamma (f) \wedge \sigma ) \end{array} $$

This expansion soundly computes types of each recursive function. Intuitively, \(\varGamma \lhd _D \varGamma '\) means that, assuming \(\varGamma \) is a sound type environment for D, \(\varGamma '(f)\) is a sound type of f because \(\varGamma '(f)\) is obtained from \(\varGamma (f)\) by adding a valid type of D(f). We write \(\varGamma _D^\top \) for the environment \(\{ f \mapsto \bigwedge \emptyset \mid f \in \mathrm {dom}(D) \}\), which corresponds to approximating D as \(D^\top = \{ f \mapsto \lambda x : \kappa .\ \varOmega \mid f \in \mathrm {dom}(D) \}\). It is always safe to approximate the behaviour of D with \(\varGamma _D^\top \). We write \(\lhd _D^*\) for the reflexive and transitive closure of \(\lhd _D\). We say \(\varGamma \) is sound for D if \( \varGamma _D^\top \lhd _D^* \varGamma \).
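For illustration, over the type representation sketched above, the intersection operator simply merges the two lists of arrow components; collapsing duplicate components does not change the meaning of the intersection. The sketch below is ours, not the implementation's.

```haskell
import Data.List (union)

data VType = VTrue | VFalse | VTuple [VType] | VInter [(VType, TType)]
  deriving (Eq, Show)
data TType = TVal VType | TFail
  deriving (Eq, Show)

-- sigma /\ sigma', defined only when both types refine the same function
-- sort: take the union of the two sets of arrow components.
inter :: VType -> VType -> VType
inter (VInter arrows1) (VInter arrows2) = VInter (arrows1 `union` arrows2)
inter _ _ = error "inter: both arguments must refine a function sort"
```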

Theorem 1 indicates that the intersection type system characterizes the reachability of Boolean programs. The proof is similar to the proof of the corresponding theorem for Tsukada and Kobayashi’s type system [20]: see the long version [19].

Theorem 1

Let \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\) be a well-sorted program. P is unsafe if and only if there is a global type environment \(\varGamma \) that is sound for D, and that \(\varGamma , \emptyset \vdash t_0 : \mathbf {fail}\).

According to this theorem, the reachability problem can be solved by saturation-based algorithms. For example, it is easily shown that the following naïve saturation function \(\mathcal {F}_D\) suffices for deciding reachability:

$$ \mathcal {F}_D(\varGamma )(f) = \varGamma (f) \wedge \bigwedge \{\, \sigma \rightarrow \tau \mid \sigma \,\,{::}\,\, \kappa \ \text {and}\ \varGamma , [x \mapsto \sigma ] \vdash t : \tau \,\} \qquad \text {where } D(f) = \lambda x : \kappa .\ t. $$

The saturation function is effectively computable. To compute the second operand of \(\mathord {\wedge }\) in the definition of \(\mathcal {F}_D(\varGamma )(f)\), it suffices to pick each \(\sigma \) such that \(\sigma \,\,{::}\,\, \kappa \) and compute each \(\tau \) such that \(\varGamma , [x \mapsto \sigma ] \vdash t : \tau \). Note that there are only finitely many \(\sigma \) such that \(\sigma \,\,{::}\,\, \kappa \). Given a well-sorted program \(\mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\), let \(\varGamma _0 = \varGamma _D^\top \) and \(\varGamma _{i+1} = \mathcal {F}_D(\varGamma _{i})\). The sequence \(\varGamma _0,\varGamma _1,\dots , \varGamma _m, \dots \) converges at some m, because \(\varGamma _i \lhd _D \varGamma _{i+1}\) for each i and \(\lhd _D\) is a partial order on the (finite) set of type environments. Thus, the reachability is decided by checking whether \(\varGamma _m, \emptyset \vdash t_0 : \mathbf {fail}\) holds.
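The resulting decision procedure is a plain fixed-point iteration. The following Haskell sketch shows only its shape; satStep (implementing \(\mathcal {F}_D\)), gammaTop (representing \(\varGamma _D^\top \)), and derivesFail (the final check) are hypothetical helpers, not part of the actual implementation.

```haskell
-- Iterate a saturation step until the global type environment stops
-- changing; the argument is Gamma_0, and the result is the limit Gamma_m.
fixpoint :: Eq env => (env -> env) -> env -> env
fixpoint step gamma
  | gamma' == gamma = gamma          -- converged: Gamma_m = F_D(Gamma_m)
  | otherwise       = fixpoint step gamma'
  where gamma' = step gamma

-- The program is then unsafe iff Gamma_m, {} |- t0 : fail; schematically,
--   unsafe = derivesFail (fixpoint satStep gammaTop) t0
```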

4 The 0-CFA Guided Saturation Algorithm

In the following discussion, we fix a well-sorted program \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\). We assume that the variables bound in lambda-expressions and let-expressions in P are pairwise distinct, and that the labels in P are also pairwise distinct. Hence each variable x has a unique sort, which we write \(\mathrm {sort}(x)\).

This section presents an efficient algorithm for deciding the reachability problem, based on the type system in the previous section. Unfortunately, the naïve algorithm presented in Sect. 3 is impractical, mainly due to the (Fun) rule:

figure c

The rule tells us how to enumerate the type judgments for \(\lambda x : \kappa .\ t\) from those for \(t\), but there is a huge number of candidate types for the argument \(x\), because they are only required to be refinements of the sort \(\kappa \); when the depth of \(\kappa \) is \(k\), the number of candidate types is \(k\)-fold exponential. Therefore, we modify the type system to prune irrelevant type candidates.

4.1 The \({\hat{\delta }}\)-Guided Type System

A flow type environment is a function that maps each variable x to a set of value types that are refinements of \(\mathrm {sort}(x)\). Let \(\varGamma \) be a global type environment, \({\hat{\delta }}\) a flow type environment, and \(\varDelta \) a local type environment. We define the \({\hat{\delta }}\)-guided type judgment, of the form either \(\varGamma , \varDelta \vdash _{{\hat{\delta }}} t : \tau \) or \(\varGamma , \varDelta \vdash _{{\hat{\delta }}} e : \tau \). The typing rules for this judgment are the same as those for \(\varGamma , \varDelta \vdash t : \tau \), except that (Fun) is replaced by the following rule:

figure d

This modified rule derives the “strongest” type of the lambda-abstraction, assuming that \({\hat{\delta }}(x)\) is an over-approximation of the set of types of the values bound to x. The resulting type system, called the \({\hat{\delta }}\)-guided type system, is designed so that type judgments are deterministic for values, lambda-abstractions, and environments.

Proposition 1

Suppose \(\varGamma \,\,{::}\,\, \mathcal {K}\). Then,

  • \(\mathcal {K}, \emptyset \vdash v : \kappa \) implies \(\exists ! \sigma . \sigma \,\,{::}\,\, \kappa \wedge \varGamma , \emptyset \vdash _{\hat{\delta }}v : \sigma \),

  • \(\mathcal {K}, \emptyset \vdash p : \kappa \) implies \(\exists ! \sigma . \sigma \,\,{::}\,\, \kappa \wedge \varGamma , \emptyset \vdash _{\hat{\delta }}p : \sigma \), and

  • \(\mathcal {K}\vdash \rho : \varSigma \) implies \(\exists ! \varDelta . \varDelta \,\,{::}\,\, \varSigma \wedge \varGamma \vdash _{\hat{\delta }}\rho : \varDelta \).

We therefore write \([\![v]\!]_{\varGamma ,{\hat{\delta }}}\), \([\![p]\!]_{\varGamma ,{\hat{\delta }}}\), and \([\![\rho ]\!]_{\varGamma ,{\hat{\delta }}}\) for the unique type of the value v, the lambda-abstraction p, and the environment \(\rho \), respectively.

We define the \({\hat{\delta }}\)-guided saturation function \(\mathcal {G}_{D}({\hat{\delta }}, \varGamma )\) as follows:

$$ \mathcal {G}_{D}({\hat{\delta }}, \varGamma )(f) = \varGamma (f) \wedge [\![D(f)]\!]_{\varGamma ,{\hat{\delta }}} $$
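For illustration, one saturation step can be sketched as follows, where the intersection operation and the function computing \([\![D(f)]\!]_{\varGamma ,{\hat{\delta }}}\) are passed in as parameters; both are assumptions of this sketch rather than the actual HiBoch code.

```haskell
import qualified Data.Map as M

type Fun = String

-- One step of the delta^-guided saturation: every function symbol has its
-- current type intersected with the type of its definition under the
-- delta^-guided type system.
satStepG
  :: (vty -> vty -> vty)                      -- inter: sigma /\ sigma'
  -> (M.Map Fun vty -> delta -> Fun -> vty)   -- [[ D(f) ]]_{Gamma, delta^}
  -> delta                                    -- the flow type environment delta^
  -> M.Map Fun vty                            -- Gamma_i
  -> M.Map Fun vty                            -- Gamma_{i+1} = G_D(delta^, Gamma_i)
satStepG inter typeOfDef delta gamma =
  M.mapWithKey (\f sigma -> sigma `inter` typeOfDef gamma delta f) gamma
```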

It is easily shown that the \({\hat{\delta }}\)-guided type system satisfies the following soundness theorem.

Theorem 2

(Soundness). Let \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\) be a well-sorted program. Let \({\hat{\delta }}_0, {\hat{\delta }}_1, \dots \) be a sequence of flow type environments. We define a sequence of global type environments \(\varGamma _0, \varGamma _1, \dots \) as follows: (i) \(\varGamma _0 = \varGamma _D^\top \), and (ii) \(\varGamma _{i + 1} = \mathcal {G}_{D}({\hat{\delta }}_i, \varGamma _i)\) for each \(i \ge 0\). The program P is unsafe if there is some m such that \(\varGamma _m, \emptyset \vdash _{{\hat{\delta }}_m} t_0 : \mathbf {fail}\).

However, the completeness of the \({\hat{\delta }}\)-guided type system depends on the flow type environments used during saturation. For example, if we use the largest flow type environment, i.e., \({\hat{\delta }}(x) = \{ \sigma \mid \sigma \,\,{::}\,\, \mathrm {sort}(x) \}\) for every x, we obtain completeness but lose efficiency. We therefore have to find a method to compute a flow type environment \({\hat{\delta }}\) that is large enough for the \({\hat{\delta }}\)-guided type system to be complete, yet small enough to be handled efficiently.

In the call-by-name case, a sufficient condition on \({\hat{\delta }}\) to guarantee completeness can be formalized in terms of flow information [4]: for each function call \(t_1\ t_2\) and each possible value \(\lambda x.\ t\) of \(t_1\), we just need to require that \({\hat{\delta }}(x)\) contains every type derivable for the argument \(t_2\).

However, in the call-by-value case, the condition on \({\hat{\delta }}\) is more subtle, because the actual value bound to the argument x is not \(t_2\) itself but a result of evaluating \(t_2\). In order to prove that the \({\hat{\delta }}\)-guided type system is complete, \({\hat{\delta }}(x)\) must contain all the types of the values bound to x during the evaluation, i.e., \([\![v]\!]_{\varGamma ,{\hat{\delta }}} \in {\hat{\delta }}(x)\) whenever \(\rho \vdash _D t_2 \longrightarrow ^* v\). To establish this, we would have to show that the types derivable for \(t_2\) in the \({\hat{\delta }}\)-guided type system cover the types of all such values, but this fact itself follows from the completeness of the \({\hat{\delta }}\)-guided type system, which would cause circular reasoning.

In the rest of this section, we first formalize 0-CFA for our target language, propose our 0-CFA guided saturation algorithm, and prove the correctness of the algorithm.

4.2 0-CFA

We adopt the formalization of 0-CFA by Nielson et al. [13].

An abstract value is defined by:

$$ av \,\, (\mathrm {abstract\, values}) \,\,{::}{=}\,\, \mathtt {bool}\mid p \mid f \mid \langle av _1, \dots , av _k \rangle . $$

The set of abstract values is denoted as \(\widehat{\mathbf {Value}}\). An abstract value is regarded as a value without environments. The abstract value of a value v, written \(\hat{v}\), is defined by:

figure e

An abstract cache is a function from \(\mathbf {Lab}\) to \(\mathcal {P}(\widehat{\mathbf {Value}})\), and an abstract environment is a function from \(\mathbf {Var}\) to \(\mathcal {P}(\widehat{\mathbf {Value}})\). Let \(\hat{C}\) be an abstract cache, and \(\hat{\rho }\) an abstract environment. We define the relations \((\hat{C},\hat{\rho }) \models _D e^\ell \) and \((\hat{C},\hat{\rho }) \models _D \rho \), which represent that \(({\hat{C}},{\hat{\rho }})\) is an acceptable CFA result for the term \(e^\ell \) and for the environment \(\rho \), respectively.

Fig. 5. 0-CFA rules

The relations are defined co-inductively by the rules given in Fig. 5. In the (tuple) rule, \({\hat{C}}(\ell _1) \otimes \dots \otimes {\hat{C}}(\ell _k)\) denotes the set \(\{ \langle av _1,\dots , av _k \rangle \mid av _i \in {\hat{C}}(\ell _i)\ \text {for each}\ i \}\). In the (proj) rule, the i-th projection is applied elementwise to the abstract tuples in \({\hat{C}}(\ell )\). The relation \(({\hat{C}},{\hat{\rho }}) \models _D e^\ell \) is defined so that if \(e^\ell \) evaluates to a value v, then the abstract value of v is in \({\hat{C}}(\ell )\). The relation \(({\hat{C}},{\hat{\rho }}) \models _D \rho \) means that for each binding \(x \mapsto v\) in \(\rho \), \({\hat{\rho }}(x)\) contains the abstract value of v.
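As a concrete reading of the \(\otimes \) operation, the product of the component sets is a Cartesian product of abstract values; the following sketch (with lists standing for sets) is only an illustration.

```haskell
-- Abstract values, restricted to what this sketch needs.
data AbsValue = ABool | ATuple [AbsValue] | AOther
  deriving (Eq, Show)

-- C^(l_1) (x) ... (x) C^(l_k): every abstract tuple whose i-th component is
-- drawn from the i-th set.  'sequence' on lists enumerates exactly this
-- Cartesian product.
tupleProduct :: [[AbsValue]] -> [AbsValue]
tupleProduct componentSets = map ATuple (sequence componentSets)
```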

4.3 The 0-CFA Guided Saturation Algorithm

We now propose a method to compute a \({\hat{\delta }}\) large enough for the \({\hat{\delta }}\)-guided type system to be complete. Let \({\hat{C}}\) be an abstract cache. We define two relations, \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t,\varDelta )\) and \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } \rho \). The relation \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t,\varDelta )\) intuitively means that, during any evaluation of t under an environment \(\rho \) such that \(\varGamma \vdash \rho : \varDelta \), the types of the values bound to a variable x are over-approximated by \({\hat{\delta }}(x)\). The derivation rules for these relations are given in Fig. 6. We regard the rules as an algorithm that saturates \({\hat{\delta }}\), given \({\hat{C}}\), \(\varDelta \), and t. The algorithm traverses the term t with the possible \(\varDelta \), using the \({\hat{\delta }}\)-guided type system to propagate dataflow, and adds types to \({\hat{\delta }}\) using the rule (App): if t is a function call \(e_1^\ell \ t_2\), the algorithm enumerates, using \({\hat{C}}\), each lambda-abstraction \(\lambda x : \kappa .\ t_0\) that \(e_1^\ell \) may evaluate to, and propagates each type \(\sigma \) of \(t_2\) (i.e., \(\varGamma ,\varDelta \vdash _{\hat{\delta }}t_2 : \sigma \)) to \({\hat{\delta }}(x)\).

Fig. 6. Derivation rules for \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t, \varDelta )\) and \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } \rho \)

Algorithm 1 shows our algorithm for the reachability problem, named the 0-CFA guided saturation algorithm. Given a well-sorted program \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\), the algorithm initializes \(\varGamma _0\) to \(\varGamma _D^\top \), computes a 0-CFA result \(({\hat{C}},{\hat{\rho }})\) such that \(({\hat{C}},{\hat{\rho }}) \models _D t_0\), sets \(i = 0\), and enters the main loop. In the main loop, it computes \({\hat{\delta }}_i\) such that \(({\hat{C}},{\hat{\delta }}_i) \models _{D,\varGamma _i} (t_0, \emptyset )\), then sets \(\varGamma _{i+1} = \mathcal {G}_D({\hat{\delta }}_i,\varGamma _i)\) and increments i. The algorithm outputs “Unsafe” if \(\varGamma _{i}, \emptyset \vdash _{{\hat{\delta }}_i} t_0 : \mathbf {fail}\) holds for some i. Otherwise, the main loop eventually terminates when \(\varGamma _i = \varGamma _{i-1}\) holds, and the algorithm outputs “Safe”.

figure f
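The control flow of Algorithm 1 can be sketched as follows. Here computeCFA, saturateFlow, satStep, and derivesFail are hypothetical placeholders for computing \(({\hat{C}},{\hat{\rho }})\), computing \({\hat{\delta }}_i\), applying \(\mathcal {G}_D\), and checking \(\varGamma _i, \emptyset \vdash _{{\hat{\delta }}_i} t_0 : \mathbf {fail}\), respectively; the sketch is not the actual HiBoch code.

```haskell
-- The shape of the main loop: compute the 0-CFA once, then alternate flow
-- saturation and type saturation until either a fail judgment is derived or
-- the global type environment stabilizes.
data Result = Safe | Unsafe deriving Show

cfaGuidedCheck
  :: Eq gamma
  => (prog -> cfa)              -- computeCFA: (C^, rho^) with (C^, rho^) |=_D t0
  -> (cfa -> gamma -> delta)    -- saturateFlow: delta^_i with (C^, delta^_i) |=_{D,Gamma_i} (t0, {})
  -> (delta -> gamma -> gamma)  -- satStep: Gamma_{i+1} = G_D(delta^_i, Gamma_i)
  -> (delta -> gamma -> Bool)   -- derivesFail: does Gamma_i, {} |-_{delta^_i} t0 : fail hold?
  -> gamma                      -- Gamma_0 = Gamma_D^T
  -> prog                       -- the program P
  -> Result
cfaGuidedCheck computeCFA saturateFlow satStep derivesFail gamma0 prog =
    loop gamma0
  where
    cfaResult = computeCFA prog             -- the 0-CFA is computed only once
    loop gamma
      | derivesFail delta gamma = Unsafe    -- fail is derivable: report "Unsafe"
      | gamma' == gamma         = Safe      -- saturated: report "Safe"
      | otherwise               = loop gamma'
      where
        delta  = saturateFlow cfaResult gamma
        gamma' = satStep delta gamma
```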

We explain how the saturation algorithm runs for the program \(P_1\) in Example 1. Let \(\ell _1\) and \(\ell _2\) be the labels of the first application of y and the second application of y in function f. A result of 0-CFA would be \({\hat{C}}(\ell _1) = {\hat{C}}(\ell _2) = \lambda (x : \mathbf {bool}).\ \mathbf {true}\oplus \mathbf {false}\). Let . Then, \({\hat{\delta }}_0\) would be

Therefore, \(\varGamma _0, \emptyset \vdash _{{\hat{\delta }}_0} D_1(f) : (\mathbf {true}\rightarrow \mathbf {true}) \wedge (\mathbf {true}\rightarrow \mathbf {false}) \rightarrow \mathbf {fail}\) holds, and it would be . In the next iteration, there are no updates, i.e. \({\hat{\delta }}_1 = {\hat{\delta }}_0\) and \(\varGamma _1 = \varGamma _0\). Because \(\varGamma _1,\emptyset \vdash _{{\hat{\delta }}_1} t_1 : \mathbf {fail}\) holds, the algorithm outputs “Unsafe”.

4.4 Correctness of the 0-CFA Guided Saturation Algorithm

We prove the correctness of Algorithm 1. If the algorithm outputs “Unsafe”, the given program is unsafe by Theorem 2. To justify the case where the algorithm outputs “Safe”, we prove the completeness of the \({\hat{\delta }}\)-guided type system.

First, the following lemma indicates that \(({\hat{C}},{\hat{\rho }}) \models _D t\) and \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t,\varDelta )\) satisfy subject reduction, and also that the \({\hat{\delta }}\)-guided type system satisfies subject expansion. This lemma solves the problem of circular reasoning discussed at the end of Sect. 4.1.

Lemma 1

Let \(\varGamma \) be a global type environment such that \(\varGamma = \mathcal {G}_D({\hat{\delta }},\varGamma )\). Suppose that \(({\hat{C}},{\hat{\rho }}) \models _D t_1\), \(({\hat{C}},{\hat{\rho }}) \models _D \rho \), \(\rho \vdash _D t_1 \longrightarrow t_2\), \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } \rho \), and \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t_1,\varDelta )\), where \(\varDelta = [\![\rho ]\!]_{\varGamma ,{\hat{\delta }}}\). Then, (i) \(({\hat{C}},{\hat{\rho }}) \models _D t_2\), (ii) \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t_2,\varDelta )\), and (iii) for any term type \(\tau \), \(\varGamma , \varDelta \vdash _{\hat{\delta }}t_2 : \tau \) implies \(\varGamma ,\varDelta \vdash _{\hat{\delta }}t_1 : \tau \).

We also use the fact that the \({\hat{\delta }}\)-guided type system derives \(\mathbf {fail}\) for error expressions.

Lemma 2

Let \(\phi \) be a well-sorted error expression. Then, \(\varGamma , \emptyset \vdash _{\hat{\delta }}\phi : \mathbf {fail}\).

Then, we have the following completeness theorem, which justifies the correctness of Algorithm 1.

Theorem 3

Let \(P = \mathbf {let}\ \mathbf {rec}\ D : \mathcal {K}\ \mathbf {in}\ t_0\) be a well-sorted program, and \(\varGamma \) a global type environment such that \(\varGamma \,\,{::}\,\, \mathcal {K}\) and \(\varGamma = \mathcal {G}_D({\hat{\delta }},\varGamma )\). Suppose that \(({\hat{C}},{\hat{\rho }}) \models _{D}\,t_{0}\) and \(({\hat{C}},{\hat{\delta }}) \models _{D,\varGamma } (t_0,\emptyset )\). If \(\varGamma , \emptyset \vdash _{\hat{\delta }} t_0 : \mathbf {fail}\) does not hold, then P is safe.

Proof

We prove the contraposition. Assume that P is unsafe, i.e., that there is a sequence \(e_0 \dots e_n\) such that \(e_0^\ell = t_0\), \(\emptyset \vdash _D e_i^\ell \longrightarrow e_{i+1}^\ell \) for each \(0 \le i \le n-1\), and that \(e_n^\ell \) is an error term. We have \(\forall \tau . \varGamma ,\emptyset \vdash _{\hat{\delta }}e_n^\ell : \tau \implies \varGamma ,\emptyset \vdash _{\hat{\delta }}t_0 : \tau \) by induction on n and using Lemma 1. By Lemma 2, \(\varGamma , \emptyset \vdash _{\hat{\delta }}e_n^\ell : \mathbf {fail}\). Therefore, we have \(\varGamma , \emptyset \vdash _{\hat{\delta }}t_0 : \mathbf {fail}\).    \(\square \)

5 Implementation and Experiments

5.1 Benchmarks and Environment

We have implemented a reachability checker named HiBoch for call-by-value Boolean programs. In order to evaluate the performance of our algorithm, we prepared two benchmarks. The first benchmark consists of Boolean programs generated by a CEGAR-based verification system for ML programs. More precisely, we prepared fourteen instances of verification problems for ML programs, which have been manually converted from the MoCHi benchmark [17], and passed them to our prototype CEGAR-based verification system, which uses HiBoch as a backend reachability checker. During each CEGAR cycle, the system generates an instance of the reachability problem for Boolean programs by predicate abstraction, and we used these problem instances for the first benchmark.

The second benchmark consists of a series of Boolean programs generated by a template named “Flow”, which was manually designed to clarify the differences between the direct and indirect styles. More details on this benchmark are given in the long version [19].

We compared our direct method with the previous indirect method, which converts Boolean programs to HORS and checks reachability with a higher-order model checker. We use HorSat [4] as the main higher-order model checker in the indirect method; since HorSat also uses a 0-CFA-based saturation algorithm (but for HORS, not for Boolean programs), we believe that HorSat is the most appropriate target of comparison for evaluating the difference between the direct and indirect approaches. We also report the results of the indirect method using the other state-of-the-art higher-order model checkers HorSat2 [10] and Preface [16], but one should note that the difference in performance may not properly reflect the difference between the direct and indirect approaches, because HorSat2 uses a different flow analysis and Preface is not based on saturation.

The experimental environment was as follows. The machine has a 2.3 GHz Intel Core i7 CPU and 16 GB of RAM. Our implementation was compiled with the Glasgow Haskell Compiler, version 7.10.3; HorSat and HorSat2 were compiled with the OCaml native-code compiler, version 4.02.1; and Preface was run on the Mono JIT compiler, version 3.2.4. The running time of each model checker was limited to 200 s.

5.2 Experimental Result

Figures 7 and 8 show the experimental results. The horizontal axis is the size of the Boolean programs, measured as the size of their abstract syntax trees, and the vertical axis is the elapsed time of each model checker, excluding the time for converting the reachability problem instances to higher-order model checking instances.

For the first benchmark, HiBoch solves all the test cases in a few seconds. For the instances of size within 5000, HorSat2 is the fastest, and HiBoch is the second fastest, which is 4–7 times faster than HorSat (and also Preface). For the instances of size over 5000, HiBoch is the fastest by an order of magnitude. We attribute this result to the fact that these instances have functions of larger arity (where the arity is the number of function arguments). The indirect approach suffers from a huge number of combinations of argument types. Our direct approach avoids many irrelevant combinations by exploiting the structure of call-by-value programs, which is lost during the CPS transformation.

For the second benchmark, as we expected, HiBoch clearly outperforms the indirect approaches, even the one using HorSat2.

Fig. 7. Experimental result for MoCHi benchmark

Fig. 8. Experimental result for flow benchmark

6 Related Work

As mentioned already, the reachability of higher-order call-by-value Boolean programs has been analyzed by a combination of CPS transformation and higher-order model checking [11, 17]. Because the naïve CPS-transformation algorithm often generates overly complex HORS, Sato et al. [17] proposed a method called selective CPS transformation, in which the insertion of some redundant continuations is avoided. The experiments reported in Sect. 5 adopt this selective CPS transformation, but the indirect method still suffers from the complexity due to the CPS transformation.

Tsukada and Kobayashi [20] studied the complexity of the reachability problem, and showed that the problem is \(k\)-EXPTIME complete for depth-k programs. They also introduced an intersection type system and a type inference algorithm, which are the basis of our work. However, their algorithm has been designed just for proving an upper-bound of the complexity; the algorithm is impractical in the sense that it always suffers from the k-EXPTIME bottleneck, while our 0-CFA guided algorithm does not.

For first-order Boolean programs, Ball and Rajamani [2] proposed a path-sensitive dataflow algorithm and implemented the Bebop tool, which is used as a backend of SLAM [1]. It is not clear whether and how their algorithm can be extended to deal with higher-order Boolean programs.

Flow-based optimizations have been used in recent model-checking algorithms for higher-order recursion schemes [3, 4, 16, 18]. However, a naïve application of such optimizations to a call-by-value language would be less accurate, because we need to estimate the evaluation results not only of functions but also of their arguments. Our method employs the intersection type system to represent the evaluation results precisely.

Some of the recent higher-order model checkers [10, 16] use more accurate flow information. For example, Preface [16] dynamically refines flow information using type-based abstraction. We believe it is possible to integrate more accurate flow analyses [5–7] into our algorithm as well.

7 Conclusion

We have proposed a direct algorithm for the reachability problem of higher-order Boolean programs, and proved its correctness. We have confirmed through experiments that our direct approach improves the performance of the reachability analysis.

We are now developing a direct-style version of MoCHi, a fully automated software model checker for OCaml programs, on top of our reachability checker for Boolean programs, and plan to compare the overall performance with the indirect style. We expect that avoiding CPS transformations also benefits the predicate discovery phase.