1 Introduction

Specifications of programs (or other systems) are often described by temporal or modal logics such as linear temporal logic (LTL), computation tree logic (CTL) and the modal \(\mu \)-calculus [2, 8, 12, 16, 19, 22]. Formulas of these logics are built from atomic propositions representing basic properties of run-time states, e.g. whether the control is at a certain program point and whether the value of a certain global variable is positive. The set of atomic propositions in these logics is static in the sense that it remains unchanged during the evaluation of programs.

We are interested in verification of higher-order functional programs, and logics suitable for describing temporal properties of such programs. For example, consider a function \( g : (\mathtt {unit} \rightarrow \mathtt {string}) \rightarrow (\mathtt {int} \rightarrow \mathtt {int}) \), which takes (the reader function of) a read-only file and creates a function on integers. The following is a possible specification for \( g \):

  • \( g \) is allowed to access the reader function only until it returns.

The implementation

$$\begin{aligned} \mathbf {let}\,g\,r\;=\; \big (\mathbf {let}\, w\;=\; {int\_of\_string}\,(r\,()) \,\mathbf {in}\,\lambda x. x + w \big ) \end{aligned}$$

meets the specification, whereas

$$\begin{aligned} \mathbf {let}\,g\,r\;=\; \lambda x. x + ({int\_of\_string}\,(r\,())) \end{aligned}$$

violates it.

Properly describing this specification is difficult because the property refers to dynamic notions such as “the argument of the first call of \( g \).” The program

$$\begin{aligned} \begin{array}{l} \mathbf {let}\,g\,r\;=\; \big (\mathbf {let}\, w\;=\; {int\_of\_string}\,(r\,()) \,\mathbf {in}\,\lambda x. x + w \big ) \;\mathbf {in} \\ \mathbf {let}\, fp = {open}\;{filename}\;\mathbf {in} \\ \mathbf {let}\,r = \lambda \_. {read\_line}\; fp \;\mathbf {in} \\ \mathbf {let}\,v_1 = g\,r\;\mathbf {in} \\ \mathbf {let}\,v_2 = g\,r\;\mathbf {in} \\ \quad \mathbf {print}\,(v_1\,1 + v_2\,2) \end{array} \end{aligned}$$

is an example that calls \( g \) twice. The sequence of call and return events of \( g \) and \( r \) in the evaluation is written as

$$\begin{aligned} \mathbf {call}\,g_1 \,\rightarrow \, \mathbf {call}\,r_1 \,\rightarrow \, \mathbf {ret}\,r_1 \,\rightarrow \, \mathbf {ret}\,g_1 \,\rightarrow \, \mathbf {call}\,g_2 \,\rightarrow \, \mathbf {call}\,r_2 \,\rightarrow \, \mathbf {ret}\,r_2 \,\rightarrow \, \mathbf {ret}\,g_2, \end{aligned}$$

where \( g_1 \) means the first call of \( g \) and \( r_1 \) is its argument, and similarly for \( g_2 \) and \( r_2 \). The above program satisfies the property since there is no \( \mathbf {call}\,r_i \) after \( \mathbf {ret}\,g_i \) for \( i = 1,2 \). However, if one ignores the subscripts, i.e. confuses the first call of \( g \) with the second, the program may seem to violate the specification since there is a \( \mathbf {call}\,r_2 \) after \( \mathbf {ret}\,g_1 \). This means that references to dynamic notions like the subscripts in the above sequence are unavoidable for a precise description of the specification. For this reason, the specification does not seem to be expressible in the standard logics listed above.

This paper proposes a new temporal logic, named Linear Temporal Logic of Calls (LTLC for short), in which one can describe the above property. The logic is an extension of Linear Temporal Logic (LTL) with a new operator \( call \,f(x_1,\dots ,x_n).\varphi \), the call operator, where the occurrences of \( x_1, \dots , x_n \) are binding occurrences. Intuitively this formula is true just if the function \( f \) is called in the current step (i.e. the current expression is of the form \( E[f\,e_1\,\dots \,e_n] \)) and \( \varphi [e_1/x_1,\dots ,e_n/x_n] \) holds at the next step. We shall see in Example 2 an LTLC formula describing the above property.

LTLC is expressive. One can describe properties expressed by dependent refinement types in the style of [21], relational properties [1, 5] (e.g. monotonicity of a given function) and some examples in resource usage analysis [10].

Furthermore, LTLC is tractable: the LTLC model-checking problem is effectively reducible to the standard temporal verification of programs, and is thus no more difficult than the standard temporal verification problem. In particular, for programs over finite types, LTLC model checking is reducible to higher-order model checking [12, 17] and thus decidable.

Organisation of this Paper. After defining the target language in Sect. 2, we present the syntax and semantics of LTLC in Sect. 3. Section 4 shows examples of properties expressible by LTLC. Section 5 proves the reducibility result. Section 6 gives a brief discussion on other topics. After discussing related work in Sect. 7, Sect. 8 concludes the paper.

2 Target Language

This section describes the target language of this paper, which is a simply-typed call-by-name higher-order functional programming language with recursion.

We assume a set of base types, ranged over by \( b \), as well as a set \( V_b \) of values for each base type \( b \). We require that \( V_b \cap V_{b'} = \emptyset \) whenever \( b\ne b'\). Examples of base types are boolean type and (bounded or unbounded) integer type. We assume the set of base types contains the boolean type \( \mathtt {Bool}\) and \( V_{\mathtt {Bool}} = \{\, tt , ff \,\}\).

We also assume a set of binary operators \( Op \). Each binary operator \( op \in Op \) is associated with its sort \( b_1, b_2 \rightarrow b_3 \), meaning that it takes two arguments of types \( b_1 \) and \( b_2 \) and yields a value of type \( b_3 \). Examples of binary operators are \( +, -, \times \) (of sort \( \mathtt {Int}, \mathtt {Int}\rightarrow \mathtt {Int}\)) and \( =_{\mathtt {Int}} \) (of sort \( \mathtt {Int}, \mathtt {Int}\rightarrow \mathtt {Bool}\)).

Most results of this paper are independent of the choice of base types and operators. The only exception is the decidability of model checking (Theorem 4), for which we assume that base types are finite (i.e. \( V_b \) is finite for each base type \( b \)).

The set of types is given by:

$$ \tau := \star \mid \sigma \rightarrow \tau \qquad \qquad \sigma := b \mid \tau . $$

A type is the unit type \( \star \), a base type \( b \) or a function type \( \sigma \rightarrow \tau \). For technical convenience, the above syntax requires that the return type of a function is not a base type. Therefore a type \( \tau \) must be of the form \( \sigma _1 \rightarrow \dots \rightarrow \sigma _n \rightarrow \star \) for some \( n \ge 0 \) and \( \sigma _i \), \( 1 \le i \le n \). This does not lose generality because the requirement can be fulfilled by applying the CPS translation.

Assume disjoint sets \( V \) of variables and \( F \) of function names. The set of expressions is defined by the following grammar:

$$ e \quad :=\quad () \mid c \mid x \mid f \mid e_1\,e_2 \mid \mathbf {op}\mid \mathbf {if}$$

where \( x \) and \( f \) are meta-variables ranging respectively over variables and function names. The expression \( () \) is the unit value and \( c \in \bigcup _b V_b \) is a constant of a base type. Each binary operation \( op \in Op \) has an associated constructor \( \mathbf {op}\), which is in CPS (see the type system below) for technical convenience. We also have a constructor \( \mathbf {if}\) for conditional branching.

A function definition \(\mathcal {P}\) is a finite set of equations of the form \( f\,x_1\,\dots \,x_n = e \), where \( f \) is the name of the function, \( x_1, \dots , x_n \) (\( n \ge 0 \)) are formal parameters and \( e \) is the body of the function. We assume that a function definition \(\mathcal {P}\) contains at most one equation for each function name \( f \). A program is a pair \((\mathcal {P},e)\) of a function definition and an expression.

We shall consider only well-typed programs. A type environment is a finite set of type bindings of the form \( x :\sigma \) or \( f :\tau \). We shall use \( \varDelta \) for type environments of function names, and \( \varGamma \) for those of variables. A type judgement is a tuple \( \varDelta \mid \varGamma \vdash e : \sigma \). The typing rules for expressions are straightforward.
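The figure listing these rules is not reproduced here; the following sketch reconstructs representative rules, consistent with the notes below (the rules for constants, variables, function names and application are the standard ones):

$$ \begin{array}{c} \dfrac{c \in V_b}{\varDelta \mid \varGamma \vdash c : b} \qquad \dfrac{(x :\sigma ) \in \varGamma }{\varDelta \mid \varGamma \vdash x : \sigma } \qquad \dfrac{(f :\tau ) \in \varDelta }{\varDelta \mid \varGamma \vdash f : \tau } \qquad \dfrac{\varDelta \mid \varGamma \vdash e_1 : \sigma \rightarrow \tau \qquad \varDelta \mid \varGamma \vdash e_2 : \sigma }{\varDelta \mid \varGamma \vdash e_1\,e_2 : \tau } \\ \\ \dfrac{}{\varDelta \mid \varGamma \vdash () : \star } \qquad \dfrac{}{\varDelta \mid \varGamma \vdash \mathbf {if}: \mathtt {Bool}\rightarrow \star \rightarrow \star \rightarrow \star } \qquad \dfrac{ op \text { has sort } b_1, b_2 \rightarrow b_3 }{\varDelta \mid \varGamma \vdash \mathbf {op}: b_1 \rightarrow b_2 \rightarrow (b_3 \rightarrow \star ) \rightarrow \star } \end{array} $$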

Some notable points are (1) the then- and else-branches of an if-expression have to be of unit type, and (2) the binary operation \( \mathbf {op}\,e_1\,e_2\,e_3 \) in CPS takes two arguments \( e_1 : b_1 \) and \( e_2 : b_2 \) of base types and a continuation \( e_3 : b_3 \rightarrow \star \). We say that a function definition \( f\,x_1\,\dots \,x_n = e \) defines a function of type \( \tau \) under the type environment \( \varDelta \), written \( \varDelta \vdash f\,x_1\,\dots \,x_n = e : \tau \), if \( \tau = \sigma _1 \rightarrow \dots \rightarrow \sigma _n \rightarrow \star \) and \( \varDelta \mid x_1 :\sigma _1, \dots , x_n :\sigma _n \vdash e : \star \). Note that the function body \( e \) is required to have type \( \star \) (this restriction can be fulfilled by \( \eta \)-expanding the definition, which does not change the meaning of a function in the call-by-name setting). A function definition \( \mathcal {P}= \{\,f_i\,\widetilde{x}_i = e_i \}_{1 \le i \le m} \) is well-typed under \( \varDelta = \{\, f_1 :\tau _1, \dots , f_m :\tau _m \,\} \), written \( \varDelta \vdash \mathcal {P}\), if \( \varDelta \vdash f_i\,\widetilde{x}_i = e_i : \tau _i \) for every \( i \). A program \( (\mathcal {P},e) \) is well-typed if there exists \( \varDelta \) such that \( \varDelta \vdash \mathcal {P}\) and \( \varDelta \mid \emptyset \vdash e : \star \).

The operational semantics of the language is fairly straightforward. We define the small-step reduction relation \(\longrightarrow \) by the following rules:

$$ \begin{array}{ll} \dfrac{c_1 \mathbin { op } c_2 = c}{\mathbf {op}\, c_1\, c_2\, e \longrightarrow e\, c} \qquad \dfrac{}{\mathbf {if}\, tt \, e_1\, e_2 \longrightarrow e_1} \qquad \dfrac{}{\mathbf {if}\, ff \, e_1\, e_2 \longrightarrow e_2} \\ \\ \dfrac{(f \, x_1\, \dots x_n = e) \in \mathcal {P}}{f\, e_1\,\dots \,e_n \longrightarrow [e_1/x_1, \dots , e_n/x_n]e} \qquad \dfrac{}{() \longrightarrow ()} \end{array} $$

The last rule is an artificial rule, which ensures that every well-typed expression has an infinite reduction sequence, somewhat simplifying some definitions in the next section. We write \(\longrightarrow ^*\) for the reflexive, transitive closure of \(\longrightarrow \).

If \( V_b \) is finite and the equality \( =_b \) on \( b \) is in \( Op \), the case analysis of values of type \( b \) is definable in the language. We write

$$\begin{aligned} \mathbf {case}\,e\,\mathbf {of}\,c_1 \rightarrow e_1 \mid \dots \mid c_n \rightarrow e_n, \end{aligned}$$

where \( e, c_1, \dots , c_n : b \) and \( e_1, \dots , e_n : \star \), for the expression

$$\begin{aligned} \mathbf {if}\,(=_b\,e\,c_1)\,e_1 (\mathbf {if}\,(=_b\,e\,c_2)\,e_2\, (\dots (\mathbf {if}(=_b\,e\, c_n)\,e_n\,()) \dots )). \end{aligned}$$

Example 1

Recall the example in the Introduction, which was written in direct style. By abstracting away unimportant details and transforming it into CPS, we obtain the following program:

$$\begin{aligned} \begin{array}{l} \mathbf {let}\,g\,r\,k\;=\;r\,(\lambda w. k\,(\lambda x h. h\,(x + w))) \;\mathbf {in} \\ \mathbf {let}\,r\,k\;=\; k\,0 \;\mathbf {in} \\ \mathbf {let}\, p \,x\;=\;()\;\mathbf {in} \\ \qquad g\,r\,(\lambda v_1. g\,r\,(\lambda v_2. v_1\,1\,(\lambda u_1. v_2\,2\,(\lambda u_2. p\,(u_1 + u_2))))) \end{array} \end{aligned}$$

Here \( r \) is a function reading a value from a file (consisting of only 0s) and passing it to the continuation \( k \); \( p \) is the function that prints its argument (but formally does nothing). This program can be seen as a program in our language, whose function definitions are given by the sequence of \( \mathbf {let} \)s and whose initial expression \( e \) is that in the last line. The type environment for functions is given by

$$\begin{aligned}&r :(\mathtt {Int}\rightarrow \star ) \rightarrow \star ,\qquad p :\mathtt {Int}\rightarrow \star , \\&g :((\mathtt {Int}\rightarrow \star ) \rightarrow \star ) \rightarrow ((\mathtt {Int}\rightarrow (\mathtt {Int}\rightarrow \star ) \rightarrow \star ) \rightarrow \star ) \rightarrow \star \end{aligned}$$

The evaluation of the program is

$$\begin{aligned}&g\,r\,k_1 \quad \longrightarrow \quad r\,(\lambda w. k_1\,(\lambda xh. h(x+w))) \quad \longrightarrow ^*\quad k_1\,(\lambda xh. h(x+0)) \quad \longrightarrow \quad \\&g\,r\,k_2 \quad \longrightarrow \quad r\,(\lambda w. k_2\,(\lambda xh. h(x+w))) \quad \longrightarrow ^*\quad k_2\,(\lambda xh. h(x+0)) \quad \longrightarrow \cdots \end{aligned}$$

where

$$\begin{aligned} k_1&= (\lambda v_1. g\,r\,(\lambda v_2. v_1\,1\,(\lambda u_1. v_2\,2\,(\lambda u_2. p\,(u_1 + u_2))))) \\ k_2&= [(\lambda xh. h(x+0))/v_1]\;(\lambda v_2. v_1\,1\,(\lambda u_1. v_2\,2\,(\lambda u_2. p\,(u_1 + u_2)))). \end{aligned}$$

Note that \( r \) is called twice: the first (resp. the second) call of \( r \) is between the first (resp. the second) call of \( g \) and the call of the corresponding continuation \( k_1 \) (resp. \( k_2 \)).
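For readers who wish to run the example, the following is a direct OCaml transcription of the program above (a sketch only: OCaml is call-by-value, and the OCaml type unit plays the role of \( \star \)):

```ocaml
(* g passes the value read by r to the continuation k, which receives a
   function adding that value to its argument. *)
let g r k = r (fun w -> k (fun x h -> h (x + w)))

(* r models reading from a file consisting of only 0s. *)
let r k = k 0

(* p "prints" its argument but formally does nothing. *)
let p (_ : int) = ()

let () =
  g r (fun v1 ->
    g r (fun v2 ->
      v1 1 (fun u1 ->
        v2 2 (fun u2 ->
          p (u1 + u2)))))
```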

3 Linear Temporal Logic of Calls

This section defines a novel temporal logic that we call Linear Temporal Logic of Calls (LTLC for short). It is an extension of the standard linear temporal logic (LTL) by a modal operator, called the call operator, which describes a property of function calls. Let \(\mathcal {P}\) be a function definition and \( \varDelta \) the type environment for functions, fixed in the sequel.

3.1 Syntax

Assume a set \(L\) of variables that is disjoint from the sets of function names and of variables in expressions. We use \( \alpha \), \( \beta \) and \( \gamma \) for variables in \( L \). The set of LTLC formulas is defined by the following grammar:

$$ \varphi := true \mid false \mid {\lnot \varphi } \mid {\varphi \vee \varphi } \mid {\varphi \wedge \varphi } \mid {\bigcirc \varphi } \mid {\varphi \mathbin {\mathbf {U}}\varphi } \mid {\varphi \mathbin {\mathbf {R}}\varphi } \mid call \, \xi (\widetilde{\beta }).\varphi \mid p(\widetilde{\beta }) $$

where \( \xi \) is either a function name \( f \) or a variable \( \alpha \in L \). It is the standard LTL with next \( \bigcirc \), (strong) until \( \mathbin {\mathbf {U}}\) and release \( \mathbin {\mathbf {R}}\) extended by the call operator \( call \,\xi (\widetilde{\beta }).\varphi \) and predicates \( p(\widetilde{\beta }) \) on values of base types (such as the order < and equivalence \( = \) of integers). The occurrences of variables \( \widetilde{\beta } \) in \( call \,\xi (\widetilde{\beta }).\varphi \) are binding occurrences, and \( \xi \) is free.

The meaning of formulas should be clear except for \( call \,\xi (\widetilde{\beta }).\varphi \). Intuitively \( call \,f(\widetilde{\beta }).\varphi \) is true just if the current expression is of the form \( f\,\widetilde{e} \) and \( \varphi \{\widetilde{e}/\widetilde{\beta }\} \) holds in the next step \( e_f\{\widetilde{e}/\widetilde{x}\} \) (where \( f\,\widetilde{x} = e_f \in \mathcal {P}\)), although there is a subtle point that we shall discuss in the next subsection.

Each variable \( \beta \) in a formula naturally has its type, and a formula should respect the types to make sense. For example, \( call \,f(\beta ).\varphi \) would be nonsense if the function \( f \) has two arguments. We use a type system to filter out such meaningless formulas. We write \( \varTheta \) for a type environment for variables in \( L \). A type judgement is of the form \( \varDelta \mid \varTheta \vdash \varphi \), meaning that \( \varphi \) is well-formed under \( \varDelta \) and \( \varTheta \). Here \( \varDelta \) and \( \varTheta \) declare the types of function names and of variables in \( L \), respectively. Examples of typing rules are as follows:
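The figure with these rules is not reproduced here; the following is a reconstructed sketch of two representative rules, in which a call formula is well-formed only when the bound variables match the argument types of \( \xi \):

$$ \dfrac{\varDelta \mid \varTheta \vdash \varphi _1 \qquad \varDelta \mid \varTheta \vdash \varphi _2}{\varDelta \mid \varTheta \vdash \varphi _1 \wedge \varphi _2} \qquad \dfrac{(\xi :\sigma _1 \rightarrow \dots \rightarrow \sigma _n \rightarrow \star ) \in \varDelta \cup \varTheta \qquad \varDelta \mid \varTheta , \beta _1 :\sigma _1, \dots , \beta _n :\sigma _n \vdash \varphi }{\varDelta \mid \varTheta \vdash call \,\xi (\beta _1,\dots ,\beta _n).\varphi } $$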

We shall use the following abbreviations. As usual, the temporal operators “future” \( \mathbf {F}\) and “always” \( \mathbf {G}\) are introduced as derived operators, defined by

$$\begin{aligned} \mathbf {F}\varphi := true \mathbin {\mathbf {U}}\varphi \qquad \text {and}\qquad \mathbf {G}\varphi := false \mathbin {\mathbf {R}}\varphi . \end{aligned}$$

We also use a derived operator \( ifcall \) given by

$$\begin{aligned} ifcall \,\xi (\widetilde{\beta }).\varphi \quad :=\quad \lnot \mathbf {F} call \, \xi (\widetilde{\beta }). (\lnot \varphi ) \end{aligned}$$

meaning that \( call \,\xi (\widetilde{\beta }).\varphi \) holds for every call of \( \xi \) in the future. This operator can alternatively be defined by

$$\begin{aligned} ifcall \,\xi (\widetilde{\beta }).\varphi \quad =\quad \mathbf {G}( call \,\xi (\widetilde{\beta }).\varphi \,\vee \, \lnot call \,\xi (\widetilde{\beta }). true ). \end{aligned}$$
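The equivalence of the two definitions can be checked by a short calculation, using the law \( \lnot call \,\xi (\widetilde{\beta }).\psi = call \,\xi (\widetilde{\beta }).(\lnot \psi ) \vee \lnot call \,\xi (\widetilde{\beta }). true \) given in Sect. 3.4:

$$\begin{aligned} \lnot \mathbf {F}\, call \,\xi (\widetilde{\beta }).(\lnot \varphi ) \;=\; \mathbf {G}\,\lnot \big ( call \,\xi (\widetilde{\beta }).(\lnot \varphi )\big ) \;=\; \mathbf {G}\big ( call \,\xi (\widetilde{\beta }).\varphi \,\vee \, \lnot call \,\xi (\widetilde{\beta }). true \big ). \end{aligned}$$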

We write \(\bigcirc ^n \varphi \) for \(\underbrace{\bigcirc \cdots \bigcirc }_n \varphi \), meaning that \( \varphi \) holds after \( n \) steps.

Example 2

Recall the program in Example 1. The formula meaning “\( g \) does not call its first argument after returning the value” can be written as follows:

$$\begin{aligned} ifcall \,g(\alpha ,\beta )\;.\; ifcall \,\beta (\gamma )\;.\;(\lnot \mathbf {F} call \,\alpha (\delta )\;.\; true ). \end{aligned}$$

Since the programs are now written in CPS, “returning the value” means “calling the continuation \( \beta \).” The above formula says that, for every call of \( g \), if it returns the value (via the continuation \( \beta \)), then it will never call the first argument \( \alpha \).

3.2 Naïve Semantics and a Problem

Every closed expression \( \varDelta \mid \emptyset \vdash e : \star \), possibly using functions in \(\mathcal {P}\), induces the unique infinite reduction sequence

$$\begin{aligned} e \,=\, e_0 \,\longrightarrow \, e_1 \,\longrightarrow \, e_2 \,\longrightarrow \, \dots \,\longrightarrow \, e_n \,\longrightarrow \, \cdots . \end{aligned}$$

An LTLC formula \( \varphi \) describes a property on such infinite reduction sequences. Thus the satisfaction relation \( e \models \varphi \) is defined as a relation between an expression \( \varDelta \vdash e : \star \) and an LTLC formula \( \varDelta \vdash \varphi \).

The definition of the relation for the logical connectives inherited from standard LTL is straightforward. For example, \( \bigcirc \varphi \) means that \( \varphi \) holds in the next step, and thus \( e \models \bigcirc \varphi \) if and only if \( e' \models \varphi \) for the unique \( e' \) such that \( e \longrightarrow e' \).

The main issue is how to define the semantics of \( call \,f(\widetilde{\beta }).\varphi \). Intuitively \( e \models call \,f(\beta _1,\dots ,\beta _n).\varphi \) holds if and only if \( e = f\,e_1\,\dots \,e_n \) and \( e_f[e_1/x_1, \dots , e_n/x_n] \models \varphi [e_1/\beta _1,\dots ,e_n/\beta _n] \), where \( e_f \) is the body of the definition of \( f \). However this naïve definition has a problem.

Let us explain the problem by using an example. Consider the function \( doTask \) defined by

$$\begin{aligned} doTask \,f\,g \quad =\quad f\,(g\,()). \end{aligned}$$

It would be natural to expect that “\( doTask \) does not call its second argument until it has called its first argument” should be true independently of the context in which \( doTask \) is used. Formally we expect \( C[ doTask ] \models \varphi \) for every context \( C \), where \( \varphi \) is the formula given by

$$\begin{aligned} \varphi \quad =\quad call \, doTask (\alpha ,\beta )\;.\; \big ((\lnot \, call \,\beta (\gamma ). true ) \mathbin {\mathbf {U}}( call \,\alpha (\delta ). true )\big ). \end{aligned}$$

However it is not true. Consider, for example, \( C = []\,h\,h \) where \( h \) is an arbitrary function. Then

$$\begin{aligned} doTask \,h\,h \models \varphi \quad \Leftrightarrow \quad h\,(h\,()) \models (\lnot \, call \,h(\gamma ). true ) \mathbin {\mathbf {U}}( call \,h(\delta ). true ) \end{aligned}$$

but the right-hand side is false because \( h\,(h\,()) \models call \,h(\gamma ). true \) and thus \( h\,(h\,()) \not \models \lnot \, call \,h(\gamma ). true \).

The problem is caused by confusion between \( h \) as the first argument and \( h \) as the second argument. In the formula \( \varphi \), the first and second arguments of \( doTask \) are distinguished by their names, \( \alpha \) and \( \beta \). However they become indistinguishable after the substitution \( [h/\alpha , h/\beta ] \).

3.3 Formal Semantics

We use labels to correctly keep track of expressions. A label is just a variable \( \alpha \in L \) in a formula. Labelled expressions are those obtained by extending expressions with the labelling construct, as follows:

$$\begin{aligned} e&::= \dots \mid e^{\alpha }, \qquad \alpha \in L. \end{aligned}$$

For a possibly empty sequence \( S = \alpha _1 \dots \alpha _n \), we write \( e^S \) for \( ((e^{\alpha _1})\dots )^{\alpha _n} \). Given a labelled expression \( e \), we write \( \natural e \) for the (ordinary) expression obtained by removing labels in \( e \).

The labels do not affect reduction. For example,

$$\begin{aligned} (((f^{S_0}\,e_1)^{S_1}\,e_2)^{S_2}\dots \,e_n)^{S_n} \quad \longrightarrow \quad e_f[e_1/x_1,\dots ,e_n/x_n] \end{aligned}$$

provided that \( f\,x_1\,\dots \,x_n = e_f \in \mathcal {P}\). Therefore, if \( e \longrightarrow e' \) as labelled expressions, then \( \natural e \longrightarrow \natural e' \).

Now we formally define the satisfaction relation \( \models \). It is a ternary relation \( e, \rho \models \varphi \) on a labelled expression \( e : \star \), a valuation map \( \rho \) from the free variables of \( \varphi \) to labelled expressions, and an LTLC formula \( \varphi \). It is defined by induction on the complexity of formulas by the rules in Fig. 1.

Fig. 1. Semantics of formulas. The operation \( \natural \) removes labels from a given expression.

Remark 1

Given a judgement \( e, \rho \models \varphi \), one can remove the following data without changing the meaning of the judgement:

  • mapping \( \alpha \mapsto e \) from \( \rho \) if \( \alpha \) is not of a base type, and

  • label \( \beta \) in \( (d)^\beta \) from \( e \) if \( d \) is an expression of a base type.

This is because the information on a base-type variable \( \beta \) is recorded in \( \rho \), and the information on a non-base-type variable \( \alpha \) is tracked by labels in the expression. We record the information in both \( \rho \) and \( e \) just to simplify the definition (by avoiding a case split on types).

The main difference from the naïve semantics is the meaning of the call operator. Instead of substituting the actual argument \( e_i \) for \( \beta _i \) in the formula, we annotate the actual argument \( e_i \) in the expression with \( \beta _i \).

Let us see how the labelling works, using the example of the previous subsection. By the labelling semantics, we have

$$\begin{aligned} ( doTask \,h\,h), \emptyset \models \varphi \quad \Leftrightarrow \quad h^{\alpha }\,(h^{\beta }\,()), \rho \models (\lnot \, call \,\beta (\gamma ). true ) \mathbin {\mathbf {U}}( call \,\alpha (\delta ). true ) \end{aligned}$$

for some \( \rho \) (whose contents are not important here). Notice that \( h \) as the first argument of \( doTask \) can be distinguished from \( h \) as the second argument of \( doTask \): the former has the label \( \alpha \) whereas the latter is annotated by \( \beta \). Now \( h^{\alpha }\,(h^{\beta }\,()), \rho \not \models call \,\beta (\gamma ). true \) and \( h^{\alpha }\,(h^{\beta }\,()), \rho \models \lnot call \,\beta (\gamma ). true \) as expected. It is not difficult to see that \( doTask \,h\,h, \emptyset \models \varphi \) indeed holds, whatever \( h \) is.

3.4 Negation Normal Form

The negation \( \lnot \) in a formula can be pushed inwards in many cases, without changing the meaning of the formula. For example,

$$\begin{aligned} \lnot true = false \qquad \lnot (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) = (\lnot \varphi _1) \mathbin {\mathbf {R}}(\lnot \varphi _2) \quad \text {and}\quad \lnot {\bigcirc } \varphi = \bigcirc \lnot \varphi . \end{aligned}$$

Unfortunately the negation of the call operator \( \lnot call \,\xi (\widetilde{\beta }).\varphi \) cannot be pushed inwards in general, but we can restrict the shape of the formula to which the negation is applied. The formula \( call \,\xi (\widetilde{\beta }).\varphi \) does not hold if either (a) \( \xi \) is now called but the following computation violates \( \varphi \) or (b) \( \xi \) is not called in the current step. This observation can be expressed by the equation

$$\begin{aligned} \lnot call \,\xi (\widetilde{\beta }).\varphi \quad =\quad call \,\xi (\widetilde{\beta }).(\lnot \varphi ) \,\vee \, \lnot call \,\xi (\widetilde{\beta }). true . \end{aligned}$$

We shall abbreviate \( \lnot call \,\xi (\widetilde{\beta }). true \) as \( \lnot call \,\xi \).

The above argument gives an effective rewriting process, yielding a formula in the following syntax that we call the negation normal form:

$$\begin{aligned} \varphi&:= true \mid false \mid {\varphi \vee \varphi } \mid {\varphi \wedge \varphi } \mid {\bigcirc \varphi } \mid {\varphi \mathbin {\mathbf {U}}\varphi } \mid {\varphi \mathbin {\mathbf {R}}\varphi } \\&~\mid ~ call \, \xi (\widetilde{\beta }).\varphi \mid \lnot call \,\xi \mid p(\widetilde{\beta }) \mid \lnot p(\widetilde{\beta }). \end{aligned}$$

We shall use this normal form in the following section.
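To make the rewriting concrete, here is a minimal OCaml sketch over a hypothetical formula datatype (the constructor and function names are ours, not part of the paper's development):

```ocaml
(* A hypothetical AST for LTLC formulas.  Call (xi, betas, phi) represents
   call ξ(β̃).φ, binding betas in phi; the abbreviation ¬call ξ is
   represented as Not (Call (xi, betas, True)). *)
type formula =
  | True | False
  | Not of formula
  | Or of formula * formula
  | And of formula * formula
  | Next of formula
  | Until of formula * formula
  | Release of formula * formula
  | Call of string * string list * formula
  | Pred of string * string list

(* Push negations inwards; the Call case uses the equation
   ¬ call ξ(β̃).φ  =  call ξ(β̃).(¬φ) ∨ ¬ call ξ(β̃).true. *)
let rec nnf = function
  | Not True -> False
  | Not False -> True
  | Not (Not phi) -> nnf phi
  | Not (Or (p, q)) -> And (nnf (Not p), nnf (Not q))
  | Not (And (p, q)) -> Or (nnf (Not p), nnf (Not q))
  | Not (Next p) -> Next (nnf (Not p))
  | Not (Until (p, q)) -> Release (nnf (Not p), nnf (Not q))
  | Not (Release (p, q)) -> Until (nnf (Not p), nnf (Not q))
  | Not (Call (xi, bs, phi)) ->
      Or (Call (xi, bs, nnf (Not phi)), Not (Call (xi, bs, True)))
  | Not (Pred _) as phi -> phi            (* ¬p(β̃) is allowed in NNF *)
  | Or (p, q) -> Or (nnf p, nnf q)
  | And (p, q) -> And (nnf p, nnf q)
  | Next p -> Next (nnf p)
  | Until (p, q) -> Until (nnf p, nnf q)
  | Release (p, q) -> Release (nnf p, nnf q)
  | Call (xi, bs, phi) -> Call (xi, bs, nnf phi)
  | (True | False | Pred _) as phi -> phi
```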

4 Expressiveness

This section briefly explains the expressiveness of LTLC.

4.1 Dependent Refinement Types

Properties described by dependent refinement types in the style of [21] are expressible by LTLC formulas.

Example 3

Consider the type

$$\begin{aligned} T_0 \quad :=\quad (x :\{ \mathtt {Int}\mid \nu \ge 0 \}) \rightarrow \{ \mathtt {Int}\mid \nu > x \} \end{aligned}$$

for call-by-value programs. This is the type of functions \( f \) on integers such that \( f(x) > x \) for every non-negative \( x \). As the target language of this paper is call-by-name, we need to apply the CPS translation to the call-by-value programs of interest and the corresponding translation to dependent types. The resulting type is

$$\begin{aligned} T \quad :=\quad (x :\{ \mathtt {Int}\mid \nu \ge 0 \}) \rightarrow \big (\{ \mathtt {Int}\mid \nu > x \} \rightarrow \star \big ) \rightarrow \star . \end{aligned}$$

The LTLC formula \( \varphi _T \) corresponding to the judgement \( \vdash f : T \) is

$$\begin{aligned} \varphi _T \quad :=\quad ifcall f(\alpha ,\beta )\;.\; \big (\alpha \ge 0 \Rightarrow ifcall \,\beta (\gamma )\;.\; \alpha < \gamma \big ). \end{aligned}$$

We explain the general rule of the translation, focusing on the image of function types by the call-by-value CPS translation. The syntax of dependent refinement types is given by

$$\begin{aligned} T,S ~::=~ (\alpha : U) \rightarrow \big (V \rightarrow \star \big ) \rightarrow \star \qquad \quad U,V ~::=~ \{ \mathtt {Int}\mid \vartheta (\nu ) \} ~~\mid ~~ T \end{aligned}$$

where \( \nu \) is a distinguished variable and \( \vartheta (\nu ) \) is a formula of the underlying logic. The occurrence of \( \alpha \) is a binding occurrence and \( \vartheta \) may contain variables other than \( \nu \). The LTLC formula \( \varPhi _U \) is defined by the following rules:

$$\begin{aligned} \varPhi _{\{ \mathtt {Int}\mid \vartheta (\nu ) \}}(\alpha )&:= \vartheta (\alpha ) \\ \varPhi _{(\beta : U) \rightarrow (V \rightarrow \star ) \rightarrow \star }(\alpha )&:= ifcall \, \alpha (\beta , \kappa ). \,\big ( \varPhi _U(\beta ) \Rightarrow ifcall \,\kappa (\gamma )\,.\, \varPhi _V(\gamma ) \big ). \end{aligned}$$

A judgement \( \vdash f : T \) corresponds to the LTLC formula \( \varPhi _T(f) \).
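For instance, instantiating these rules with the type \( T \) of Example 3 reproduces the formula \( \varphi _T \) given there (up to renaming of bound variables):

$$\begin{aligned} \varPhi _T(f) \;&=\; ifcall \,f(x,\kappa )\,.\,\big ( \varPhi _{\{ \mathtt {Int}\mid \nu \ge 0 \}}(x) \Rightarrow ifcall \,\kappa (\gamma )\,.\, \varPhi _{\{ \mathtt {Int}\mid \nu > x \}}(\gamma ) \big ) \\ \;&=\; ifcall \,f(x,\kappa )\,.\,\big ( x \ge 0 \Rightarrow ifcall \,\kappa (\gamma )\,.\, \gamma > x \big ). \end{aligned}$$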

4.2 Relational Property

Some relational properties [1, 5], such as the relationship between two functions and that between two calls of a function, can be described by LTLC. An example of a relational property is monotonicity: if a given function \( f \) is not monotone, one can find two inputs \( x \le y \) such that \( f(x) \nleq f(y) \). Monotonicity can be naturally expressed in LTLC.

Example 4 (Monotonicity)

Assume a function \( f :\mathtt {Int}\rightarrow (\mathtt {Int}\rightarrow \star ) \rightarrow \star \). This function is monotone if, whenever \( n \le n' \), \( f\,n\,k \) calls the continuation \( k \) with the value \( m \) and \( f\,n'\,k' \) calls \( k' \) with \( m' \), then \( m \le m' \). (Recall that \( f \) is in CPS and thus “calling \( k \) with \( m \)” can be understood as “returning \( m \)”.) If \( f \) is assumed to be non-recurrent, this property can be written as

$$\begin{aligned} ifcall \,f(\alpha ,\beta )\;.\; ifcall \,\beta (\gamma ) \;.\; ifcall \,f(\alpha ',\beta ')\;.\; ifcall \,\beta '(\gamma ') \;.\; \psi (\alpha ,\gamma ,\alpha ',\gamma ') \end{aligned}$$

where \( \psi (\alpha ,\gamma ,\alpha ',\gamma ') = (\alpha \le \alpha ' \Rightarrow \gamma \le \gamma ') \wedge (\alpha \ge \alpha ' \Rightarrow \gamma \ge \gamma ') \). The meaning of this formula can be expressed in natural language as follows:

Let \( \alpha \) be the argument to the first call of \( f \), and \( \gamma \) be the “return value” of the first call. Similarly let \( \alpha ' \) be the argument to the second call of \( f \), and \( \gamma ' \) be the “return value” of the second call. We require that \( \alpha \le \alpha ' \) implies \( \gamma \le \gamma ' \), and that \( \alpha \ge \alpha ' \) implies \( \gamma \ge \gamma ' \).

A formula applicable to the case of \( f \) being recurrent is a bit complicated, since the order of two calls and returns is not determined. The formula applicable to the general case is

$$\begin{aligned}&ifcall \,f(\alpha ,\beta )\;.\; ifcall \,f(\alpha ',\beta ')\;.\; ifcall \,\beta (\gamma ) \;.\; ifcall \,\beta '(\gamma ') \;.\; \psi (\alpha ,\gamma ,\alpha ',\gamma ') \\ \wedge \;&ifcall \,f(\alpha ,\beta )\;.\; ifcall \,f(\alpha ',\beta ')\;.\; ifcall \,\beta '(\gamma ') \;.\; ifcall \,\beta (\gamma ) \;.\; \psi (\alpha ,\gamma ,\alpha ',\gamma ') \\ \wedge \;&ifcall \,f(\alpha ,\beta )\;.\; ifcall \,\beta (\gamma ) \;.\; ifcall \,f(\alpha ',\beta ')\;.\; ifcall \,\beta '(\gamma ') \;.\; \psi (\alpha ,\gamma ,\alpha ',\gamma ') \end{aligned}$$

The conjunction enumerates all possible orders of two calls and returns of \( f \).

4.3 Resource Usage Verification

The final example is verification/analysis of programs using resources, known as resource usage analysis [10]. An example of a resource is a read-only file. For simplicity, we focus on the verification of the usage of read-only files.

Let us first consider the simplest case, in which a program creates a unique resource only at the beginning. In this case, the target is a program with distinguished functions \( r, c : \star \rightarrow \star \) for reading and closing the file. The specification requires that (1) the program does not read the file after closing it, and (2) the file is closed before the termination of the program. The specification can be described by an LTLC formula:

$$\begin{aligned} \varphi (r,c) \quad :=\quad \mathbf {G}( call \,c \Rightarrow \lnot \mathbf {F} call \,r) \wedge (\lnot \mathtt {end} \mathbin {\mathbf {U}} call \,c), \end{aligned}$$

where \( \mathtt {end} \) is the event meaning the termination. Indeed this is an LTL formula when one regards \( call \,c \) and \( call \,r \) as atomic propositions.

In the general case, a program can dynamically create read-only file resources. The target program has a distinguished type \( \mathtt {File} \) for file resources and a distinguished function \( gen : (\mathtt {File} \rightarrow \star ) \rightarrow \star \). Since the possible operations on \( \mathtt {File} \) are read and close, we identify the type \( \mathtt {File} \) with \( (\star \rightarrow \star ) \times (\star \rightarrow \star ) \), the pair of the reading and closing functions. The specification requires that, for each call of \( gen \), the created resource should be used in accordance with \( \varphi \); this specification can be written as the LTLC formula

$$\begin{aligned} ifcall \, gen (\alpha )\,.\, ifcall \,\alpha (r,c)\,.\, \varphi (r,c). \end{aligned}$$

Note that \( ifcall \,\alpha (r,c) \) intuitively means that β€œif the function \( gen \) returns the value \((r,c)\)” since \( gen \) is in CPS and \( \alpha \) is the continuation.

5 LTLC Model Checking

This section focuses on the LTLC model-checking problem, i.e. the problem of deciding, given a function definition \( \mathcal {P}\), an expression \( e \) and an LTLC formula \( \varphi \), whether \( e, \emptyset \models \varphi \) (where \( \emptyset \) is the empty valuation). The main result of this section is that the LTLC model-checking problem is effectively reducible to the standard temporal verification problem, which can be solved by model checkers for higher-order programs. In particular, for programs and expressions over finite types, this reduction yields an instance of higher-order model checking [12, 17], for which several model checkers are available [6, 20].

5.1 Preliminaries: Higher-Order Model Checking

Higher-order model checking is the problem of deciding, given a higher-order tree grammar and an \( \omega \)-regular tree property, whether the (possibly infinite) tree generated by the grammar satisfies the property. Higher-order model checking has been proved to be decidable by Ong [17] and applied to many verification problems of higher-order programs (see, e.g., [12]). We prove that LTLC model checking is decidable by reducing it to higher-order model checking.

This subsection briefly reviews higher-order model checking, tailored to our purpose. See [12, 17] for formal definitions and general results.

Let \( \varSigma \) be a ranked alphabet defined by

$$\begin{aligned} \varSigma \,:=\, \{\, \top , \bot \mapsto 0,\;\; \sqcup , \sqcap \mapsto 2,\;\; \mathtt {U},\mathtt {R} \mapsto 3 \,\}. \end{aligned}$$

This means that \( \top \) is a leaf and \( \sqcup \) (resp. \( \mathtt {U} \)) is a binary (resp. ternary) branching tree constructor, and so on. A \( \varSigma \)-labelled tree is a possibly infinite tree each of whose nodes is labelled by a symbol in \( \varSigma \). We shall consider only well-ranked trees: we require that the number of children of a node labelled by \( \sqcap \) is \( 2 \), for example. We shall often use infix notation for \( \sqcup \) and \( \sqcap \), e.g. \( T_1 \sqcup T_2 \) is the tree whose root is \( \sqcup \) and whose children are \( T_1 \) and \( T_2 \).

A nondeterministic Büchi automaton \( \mathcal {A}\) is a tuple \( (Q, q_0, \delta , F) \), where \( Q \) is a finite set of states, \( q_0 \in Q \) is an initial state, \( \delta :\prod _{a \in \mathrm {dom}(\varSigma )} (Q \rightarrow \mathcal {P}(Q^{\varSigma (a)})) \) is a transition function and \( F \subseteq Q \) is the set of accepting states. A run-tree of \( \mathcal {A}\) over a tree \( T \) is an association of states \( q \in Q \) to the nodes of \( T \) that respects the transition function in a certain sense. A run-tree is accepting if each of its infinite branches contains infinitely many occurrences of an accepting state. A tree \( T \) is accepted by \( \mathcal {A}\) if there is an accepting run-tree over \( T \).

A tree-generating program is a variant of the programs introduced in Sect. 2, but has a different set of operators on the type \( \star \). The syntax of expressions is

$$ e \quad :=\quad \top \mid \bot \mid \sqcup \mid \sqcap \mid \mathtt {U} \mid \mathtt {R} \mid c \mid x \mid f \mid e_1\,e_2 \mid \mathbf {op}\,e_1\,e_2\,e_3 \mid \mathbf {if}\,e_1\,e_2\,e_3, $$

obtained by replacing \( () \) with the tree constructors in \( \varSigma \). Their types are

$$\begin{aligned} \top , \bot :\star \qquad \qquad \sqcup , \sqcap :\star \rightarrow \star \rightarrow \star \qquad \text {and}\qquad \mathtt {U}, \mathtt {R} :\star \rightarrow \star \rightarrow \star \rightarrow \star . \end{aligned}$$

So \( \star \) should be now regarded as the type for trees. The notion of function definition remains unchanged, except that the body of a function is now an expression with tree constructors.

The operational semantics is basically the same as before. The only difference is that reduction may occur under tree constructors, i.e. the following rules

$$ \dfrac{ e_1 \longrightarrow e_1' }{ (e_1 \sqcup e_2) \longrightarrow (e_1' \sqcup e_2) } \quad \text {and}\quad \dfrac{ e_2 \longrightarrow e_2' }{ (e_1 \sqcup e_2) \longrightarrow (e_1 \sqcup e_2') } $$

are added, as well as similar rules for other constructors. A program \( (\mathcal {P}, e) \) generates a possibly infinite tree as a result of possibly infinite steps of reduction.

Higher-order model checking asks one to decide, given a tree-generating program \( (\mathcal {P},e) \) and a nondeterministic Büchi automaton \(\mathcal {A}\), whether the tree generated by \( (\mathcal {P},e) \) is accepted by \( \mathcal {A}\).

Theorem 1

(Ong [17]). Given a tree-generating program \( (\mathcal {P}, e) \), all of whose base types are finite, and a nondeterministic Büchi automaton \( \mathcal {A}\), one can effectively decide whether the tree generated by \( (\mathcal {P}, e) \) is accepted by \( \mathcal {A}\).

5.2 Satisfaction Tree

Let \(\mathcal {P}\) be a function definition, fixed in this subsection. Given an expression \( e :\star \), a valuation \( \rho \) and an LTLC formula \( \varphi \) in negation normal form, we define a tree \( \mathcal {T}[e, \rho \models \varphi ] \), called the satisfaction tree, which represents the process of evaluating \( e, \rho \models \varphi \). This subsection shows that the satisfaction tree correctly captures the satisfaction relation, in the sense that \( e, \rho \models \varphi \) if and only if \( \mathcal {T}[e, \rho \models \varphi ] \) belongs to a certain \( \omega \)-regular tree language.

A satisfaction tree is a \( \varSigma \)-labelled tree. The meaning of \( \top , \bot , \sqcup \) and \( \sqcap \) should be obvious. The trees \( \top \) and \( \bot \) represent immediate truth and falsity. The tree \( T_1 \sqcap T_2 \) means that the evaluation process invokes two subprocesses, represented by \( T_1 \) and \( T_2 \), and the result is true just if the results of both subprocesses are true. The meaning of \( \sqcup \) is similar.

The constructors \( \mathtt {U} \) and \( \mathtt {R} \), corresponding respectively to \( \mathbin {\mathbf {U}}\) and \( \mathbin {\mathbf {R}}\), require some explanation. The meaning of \( \mathtt {U} \) is based on a classical but important observation: whether \( e, \rho \models \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) or not is completely determined by three judgements, namely \( e, \rho \models \varphi _1 \), \( e, \rho \models \varphi _2 \) and \( e, \rho \models \bigcirc (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) \). That is, \( e, \rho \models \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) if and only if either (a) \( e, \rho \models \varphi _1 \) and \( e, \rho \models \varphi _2 \), or (b) \( e, \rho \models \varphi _1 \) and \( e, \rho \models \bigcirc (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) \). So the process of checking \( e, \rho \models \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) invokes three subprocesses; the three subtrees of \( \mathtt {U} \) correspond to these judgements. A similar observation applies to \( \mathbin {\mathbf {R}}\).
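Stated as equations, this observation amounts to the following expansion laws (a reading aid consistent with the rules in Fig. 2 and the case analysis given later in this subsection; for \( \mathbin {\mathbf {U}}\), the disjunct with \( \bigcirc \) may not be chosen forever, which is enforced by the acceptance condition of the automaton below):

$$\begin{aligned} e, \rho \models \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \quad&\text {iff}\quad e, \rho \models \varphi _1 \;\text { and }\; \big (\, e, \rho \models \varphi _2 \;\text { or }\; e, \rho \models \bigcirc (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) \,\big ) \\ e, \rho \models \varphi _1 \mathbin {\mathbf {R}}\varphi _2 \quad&\text {iff}\quad e, \rho \models \varphi _2 \;\text { and }\; \big (\, e, \rho \models \varphi _1 \;\text { or }\; e, \rho \models \bigcirc (\varphi _1 \mathbin {\mathbf {R}}\varphi _2) \,\big ). \end{aligned}$$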

Satisfaction trees are defined co-inductively by the rules in Figs. 2, 3, 4 and 5. The meaning of the rules in Fig. 2 should now be clear. For example, the rule for \( \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) says that \( e, \rho \models \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) depends on the satisfaction of \( e, \rho \models \bigcirc (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) \), \( e, \rho \models \varphi _1 \) and \( e, \rho \models \varphi _2 \).

Fig. 2. Satisfaction tree: (1) Boolean connectives, until and release.

Figure 3 defines the rules for the call modality \( call \,\alpha (\widetilde{\beta }).\varphi \). The first two rules check if \( e \) is calling an expression labelled by \( \alpha \). If the label \( \alpha \) is found, then the arguments are recorded in \( \rho \) and labelled by \( \widetilde{\beta } \) as required; in this case, we also change the target formula to \( \bigcirc \varphi \). In the second rule, a label other than \( \alpha \) is simply ignored. The last three rules deal with the case where the label \( \alpha \) is not found; then \( e, \rho \models call \,\alpha (\widetilde{\beta }).\varphi \) is immediately false.

Fig. 3. Satisfaction tree: (2) Call modality. We assume \( \gamma \ne \alpha \). The satisfaction tree for \( call \,f(\widetilde{\beta }).\varphi \) is similar; we omit the rules here.

Figure 4 defines the rules for the negation of call. If \( e \) is calling an expression labelled by \( \alpha \), then \( e, \rho \models \lnot call \,\alpha (\_) \) is obviously false. The last three rules describe the case where the label \( \alpha \) is not found, in which case \( \lnot call \,\alpha (\_) \) holds.

Fig. 4. Satisfaction tree: (3) Negation of call modality. We assume \( \gamma \ne \alpha \). The satisfaction tree for \( \lnot call \,f(\_) \) is similar; we omit the rules here.

Figure 5 defines the rules for the next modality. It simply ignores labels and reduces the expression in one step.

Fig. 5. Satisfaction tree: (4) Next modality. It ignores labels and reduces the expression in one step. We assume that \( (f\,x_1\,\dots \,x_n = e_f) \in \mathcal {P}\).

We omit the rules for \( call \,f(\widetilde{\beta }).\varphi \) and \( \lnot call \,f(\_) \), which are basically the same as those for \( call \,\alpha (\widetilde{\beta }).\varphi \) and \( \lnot call \,\alpha (\_) \).

We formalise the meaning of a satisfaction tree by giving a nondeterministic Büchi automaton. The definition of the automaton is basically straightforward, but there is a subtlety in the meaning of \(\mathtt {U}\). Recall that \( \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) holds if either

  1. both \( \varphi _1 \) and \( \varphi _2 \) hold, or

  2. both \( \varphi _1 \) and \( \bigcirc (\varphi _1 \mathbin {\mathbf {U}}\varphi _2) \) hold.

Similarly \( \varphi _1 \mathbin {\mathbf {R}}\varphi _2 \) holds if and only if

  1. both \( \varphi _1 \) and \( \varphi _2 \) hold, or

  2. both \( \varphi _2 \) and \( \bigcirc (\varphi _1 \mathbin {\mathbf {R}}\varphi _2) \) hold.

The condition for \( \mathbin {\mathbf {U}}\) closely resembles that for \( \mathbin {\mathbf {R}}\), but there is a crucial difference which cannot be captured by the above descriptions. That is, \( \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) requires that \( \varphi _2 \) eventually holds, but \( \varphi _1 \mathbin {\mathbf {R}}\varphi _2 \) is true even if \( \varphi _1 \) never becomes true. This difference should be captured by the acceptance condition of the Büchi automaton.

The Büchi automaton \( \mathcal {A}\) has three states, \( q_0 \), \( q_1 \) and \( *\). The states \( q_0 \) and \( q_1 \) have the same behaviour except that \( q_0 \) is accepting and \( q_1 \) is not accepting. The state \( *\) accepts every tree; this state is used to describe a rule which ignores some of the children. The set of accepting states is \( \{q_0, *\} \) and the initial state is \( q_0 \). The transition rules are given by:

$$\begin{aligned} \delta _\top (q)&:= \{ () \}&\delta _\bot (q)&:= \emptyset \\ \delta _\sqcap (q)&:= \{(q_0, q_0)\}&\delta _\sqcup (q)&:= \{(q_0,*), (*,q_0)\} \\ \delta _{\texttt {U}}(q)&:= \{(q_1,q_0,*), (*,q_0,q_0)\}&\delta _{\texttt {R}}(q)&:= \{(q_0,*,q_0), (*,q_0,q_0)\}, \end{aligned}$$

where \( q = q_0 \) or \( q_1 \). We omit the rules for the state \( *\), which accepts every tree. For example, by the first option of \( \delta _{\texttt {U}} \), the tree \( \texttt {U}(T_1, T_2, T_3) \) is accepted from \( q_0 \) if \( T_1 \) is accepted from \( q_1 \) and \( T_2 \) is accepted from \( q_0 \); note that we assign \( q_1 \) to \( T_1 \), instead of \( q_0 \), because the until formula \( \varphi _1 \mathbin {\mathbf {U}}\varphi _2 \) requires that \( \varphi _2 \) eventually holds.
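As a concrete reading of this definition, here is a small OCaml sketch of the automaton (type and function names are ours; the transitions for the state \( * \) are reconstructed in the obvious way, since \( * \) accepts every tree, and \( \bot \) is given no transition because it must be rejected from \( q_0 \) and \( q_1 \)):

```ocaml
(* States of the Büchi automaton: Q0 and Q1 behave alike, but only Q0 is
   accepting; Star accepts every tree. *)
type state = Q0 | Q1 | Star
type symbol = Top | Bot | Join (* ⊔ *) | Meet (* ⊓ *) | U | R

let accepting = function Q0 | Star -> true | Q1 -> false

(* delta sym q lists the admissible tuples of states for the children of a
   node labelled sym, read from state q. *)
let delta (sym : symbol) (q : state) : state list list =
  match q, sym with
  | Star, (Top | Bot) -> [ [] ]
  | Star, (Join | Meet) -> [ [Star; Star] ]
  | Star, (U | R) -> [ [Star; Star; Star] ]
  | _, Top -> [ [] ]
  | _, Bot -> []                           (* ⊥ is rejected from Q0/Q1 *)
  | _, Meet -> [ [Q0; Q0] ]
  | _, Join -> [ [Q0; Star]; [Star; Q0] ]
  | _, U -> [ [Q1; Q0; Star]; [Star; Q0; Q0] ]
  | _, R -> [ [Q0; Star; Q0]; [Star; Q0; Q0] ]
```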

The following theorem is the first half of the reduction.

Theorem 2

\( e, \rho \models \varphi \) if and only if \( \mathcal {T}[e,\rho \models \varphi ] \) is accepted by \( \mathcal {A}\).

5.3 A Tree-Generating Program Generating the Satisfaction Tree

The previous subsection dealt only with the LTL features of LTLC: the satisfaction tree itself carries no information about the call and next modalities, which are related to the reduction of programs.

This subsection discusses a way to deal with these features missing in satisfaction trees. Technically, given a program \( (\mathcal {P},e) \) and a formula \( \varphi \), we construct a tree-generating program \( (\mathcal {P}^{\#},e') \) that generates the satisfaction tree \( \mathcal {T}[e, \emptyset \models \varphi ] \). The construction of \( (\mathcal {P}^{\#}, e') \) is effective, and if the original program and the formula use only finite base types, then so does \( (\mathcal {P}^{\#},e') \). Therefore this construction, together with the result of the previous subsection, shows that the LTLC model checking is decidable for programs and formulas over finite data types.

We first give the formal statement of the theorem, which shall be proved in the rest of this subsection.

Theorem 3

Given a program \( (\mathcal {P},e_0) \) and an LTLC formula \( \varphi \), one can effectively construct a tree-generating program \( (\mathcal {P}^{\#}, e_0') \) that generates the satisfaction tree \( \mathcal {T}[e_0,\emptyset \models \varphi ] \). Furthermore, if both the program \((\mathcal {P},e_0)\) and the formula \( \varphi \) contain only finite base types, then so does \((\mathcal {P}^{\#},e_0')\).

Let \( \varphi _0 \) be a formula of interest, fixed in the sequel. By renaming bound variables if necessary, we can assume without loss of generality that different variables in \( \varphi _0 \) have different names. Let \( L_0 \subseteq L \) be the finite set of bound variables in \( \varphi _0 \). Note that each \( \alpha \in L_0 \) is associated to its type in \( \varphi _0 \).

The idea of the translation, written \( \# \), is as follows. Recall that the satisfaction tree \( \mathcal {T}[e, \rho \models \varphi ] \) is determined by the three data, namely an expression \( e : \star \), a valuation \( \rho \) and a formula \( \varphi \). Hence the translation \( e^{\#} \) of the expression \( e \) should take two extra arguments \( \rho \) and \( \varphi \) to compute \( \mathcal {T}[e, \rho \models \varphi ] \).

Let us first consider the translation of types. Because the translation of an expression \( e \) of unit type \( \star \) takes two additional arguments, namely a formula and a valuation, the translation of the unit type should be given by

$$\begin{aligned} \star \qquad {\mathop {\longmapsto }\limits ^{\#}}\qquad ( valuation \rightarrow formula \rightarrow \star ), \end{aligned}$$

where \( valuation \) and \( formula \) are the “types” for valuations and formulas, which shall be described below. The translation can be naturally extended to base types and function types by

$$\begin{aligned} b^{\#} := b \quad \text {and}\quad (\sigma \rightarrow \tau )^{\#} := \sigma ^{\#} \rightarrow \tau ^{\#}. \end{aligned}$$

The “type” \( formula \) can be defined as an additional finite base type. An important observation is that only finitely many formulas are reachable by unfolding the definition of \( \mathcal {T}[e, \emptyset \models \varphi _0] \). It is easy to see that the following set

$$\begin{aligned} \{\, \psi , \bigcirc \psi ~\mid ~ \psi \text { is a subformula of } \varphi _0 \,\} \end{aligned}$$

is an overapproximation. So we define the values in \( V_{ formula } \) as this set. We shall write \( \lfloor {\psi }\rfloor \) for the formula \( \psi \) seen as a value in \( V_{ formula } \). We assume an operation \( =_{ formula } \) to compare formulas. Since \( formula \) is now a finite base type, one can define a function by pattern matching on formulas.

The “type” \( valuation \) can be implemented as a tuple. Note that valuations \( \rho \) reachable from \( \mathcal {T}[e,\emptyset \models \varphi _0] \) have subsets of \( L_0 \) as their domains. So a reachable valuation \( \rho \) can be represented as a tuple of length \( |L_0| \), where \( |L_0| \) is the number of elements in \( L_0 \). If \( \rho (\alpha ) \) is undefined for some \( \alpha \in L_0 \), one can fill the corresponding place in the tuple with an arbitrary expression.

Summarising the above argument, the translation of the unit type is

$$\begin{aligned} \star \quad {\mathop {\longmapsto }\limits ^{\#}}\quad (\sigma _1 \rightarrow \sigma _2 \rightarrow \dots \rightarrow \sigma _n \rightarrow formula \rightarrow \star ) \end{aligned}$$

if the set of variables \( L_0 \) in \( \varphi _0 \) is \( \{\, \alpha _1, \dots , \alpha _n \,\} \) and \( \sigma _i \) is the type for \( \alpha _i \), \( 1 \le i \le n \). We shall fix the enumeration \( \alpha _1, \dots , \alpha _n \) of \( L_0 \) in the sequel.

We give the translation of expressions. The function definition \( \mathcal {P}^{\#} \) after translation defines the following functions:

  • \( f^{\#} :\tau ^{\#} \) for each function \( f \) defined in \( \mathcal {P}\),

  • \( \alpha ^{\#} :\tau ^{\#} \rightarrow \tau ^{\#} \) for each variable \( \alpha \in L_0 \) of type \( \tau \),

  • \( \mathbf {op}^{\#} :b_1 \rightarrow b_2 \rightarrow (b_3 \rightarrow \star ^{\#}) \rightarrow \star ^{\#} \) for each operation \( op \in Op \),

  • \( \mathbf {if}^{\#} :\mathtt {Bool}\rightarrow \star ^{\#} \rightarrow \star ^{\#} \rightarrow \star ^{\#} \), the translation of \( \mathbf {if}\), and

  • \( ()^{\#} : \star ^{\#} \), the translation of the unit value.

Note that \( \alpha \in L_0 \) does not have a translation if \( \alpha \) has a base type; the label \( ({-})^{\alpha } \) is simply ignored by the translation (see Remark 1). Using these functions, the translation of expressions is given as follows:

$$\begin{aligned} c^{\#} := c \qquad x^{\#} := x \qquad (e_1\,e_2)^{\#} := e_1^{\#}\,e_2^{\#} \qquad (e^{\alpha })^{\#} := \alpha ^{\#}\,e^{\#} \quad \text {and}\quad (e^{\beta })^{\#} := e^{\#} \end{aligned}$$

where \( \alpha \) (resp. \( \beta \)) is a variable in \( L_0 \) of a non-base type (resp. a base type). The other cases are covered by the functions in \( \mathcal {P}^{\#} \) listed above: for example, \( () {\mathop {\mapsto }\limits ^{\#}} ()^{\#} \) and \( f {\mathop {\mapsto }\limits ^{\#}} f^{\#} \). A notable point is that the label annotation \( e^{\alpha } \) is translated to an application of \( \alpha ^{\#} \).

The translation of valuations should now be clear. A valuation is translated to a sequence of expressions, defined by

$$\begin{aligned} \rho \quad {\mathop {\longmapsto }\limits ^{\#}}\quad \rho (\alpha _1)^{\#} \,\dots \, \rho (\alpha _n)^{\#}. \end{aligned}$$

If \( \rho (\alpha _i) \) is undefined, then \( \rho (\alpha _i)^{\#} \) can be an arbitrary (but fixed a priori) expression of the required type. We use \( \varrho \) for sequences of this kind. We write \( \varrho [\alpha _i \mapsto e] \) for the sequence obtained by replacing the \( i \)th element in \( \varrho \) with \( e \).

What remains is to give the definitions of the functions in \( \mathcal {P}^{\#} \) so that the value tree of \( e^{\#}\,\rho ^{\#}\,\lfloor {\varphi }\rfloor \) coincides with the satisfaction tree \( \mathcal {T}[e, \rho \models \varphi ] \). Each function definition is of the form \( h\,\widetilde{x}\,\varrho \,\lfloor {\varphi }\rfloor = e \), where \( \widetilde{x} \) is the sequence of arguments in the original definition, \( \varrho \) is a sequence representation of a tuple of type \( valuation \), and \( \lfloor {\varphi }\rfloor \in V_{ formula } \) is a value of type \( formula \). All functions in \( \mathcal {P}^{\#} \) are defined by pattern matching on the final argument \( \lfloor {\varphi }\rfloor \). For example, consider the case of the final argument being \( \lfloor {\psi _1 \wedge \psi _2}\rfloor \). Because

$$\begin{aligned} \mathcal {T}[h\,\widetilde{e}, \rho \models \psi _1 \wedge \psi _2] \quad =\quad \mathcal {T}[h\,\widetilde{e}, \rho \models \psi _1] \,\sqcap \, \mathcal {T}[h\,\widetilde{e}, \rho \models \psi _2], \end{aligned}$$

the definition of \( h \) for this case has to be

$$\begin{aligned} h\,\widetilde{x}\,\varrho \,\lfloor {\psi _1 \wedge \psi _2}\rfloor \quad =\quad (h\,\widetilde{x}\,\varrho \,\lfloor {\psi _1}\rfloor ) \,\sqcap \, (h\,\widetilde{x}\,\varrho \,\lfloor {\psi _2}\rfloor ). \end{aligned}$$
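Other cases follow the same pattern; the following clauses are a sketch of the cases for \( true \), disjunction and the next modality (following Fig. 2 and Fig. 5), where \( e_h \) denotes the body of the original definition of \( h \), so that the formal parameters \( \widetilde{x} \) may occur free in \( e_h^{\#} \):

$$\begin{aligned} h\,\widetilde{x}\,\varrho \,\lfloor { true }\rfloor \;&=\; \top \\ h\,\widetilde{x}\,\varrho \,\lfloor {\psi _1 \vee \psi _2}\rfloor \;&=\; (h\,\widetilde{x}\,\varrho \,\lfloor {\psi _1}\rfloor ) \,\sqcup \, (h\,\widetilde{x}\,\varrho \,\lfloor {\psi _2}\rfloor ) \\ h\,\widetilde{x}\,\varrho \,\lfloor {\bigcirc \psi }\rfloor \;&=\; e_h^{\#}\,\varrho \,\lfloor {\psi }\rfloor . \end{aligned}$$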

As an example of a more complicated case, let us consider the rule

$$\begin{aligned} \mathcal {T}[e^{\alpha }\,e_1\,\dots \,e_n,\; \rho \models call \,\alpha (\beta _1,\dots ,\beta _n).\varphi ] \quad =\quad \mathcal {T}[e\,e_1^{\beta _1}\,\dots \,e_n^{\beta _n},\; \rho ' \models \bigcirc \varphi ] \end{aligned}$$

where \( \rho ' = \rho \cup \{\beta _i \mapsto e_i\}_{1 \le i \le n} \). Because

$$\begin{aligned} (e^{\alpha }\,e_1\,\dots \,e_n)^{\#} \quad =\quad \alpha ^{\#}\,e^{\#}\,e_1^{\#}\,\dots \,e_n^{\#}, \end{aligned}$$

the above rule can be seen as (a part of) the definition of \( \alpha ^{\#} \):

$$\begin{aligned} \alpha ^{\#}\,g\,\widetilde{x}\,\varrho \, \lfloor { call \,\alpha (\beta _1,\dots ,\beta _n).\varphi }\rfloor \quad =\quad g\,(\beta _1^{\#}\,x_1)\,\dots \,(\beta _n^{\#}\,x_n)\,\varrho '\,\lfloor {\bigcirc \varphi }\rfloor \end{aligned}$$

where \( \varrho ' = \varrho [\beta _1 \mapsto x_1]\dots [\beta _n \mapsto x_n] \). It is easy to check that

$$\begin{aligned} (e^{\alpha }\,e_1\,\dots \,e_n)^{\#}\,\rho ^{\#}\, \lfloor { call \,\alpha (\beta _1,\dots ,\beta _n).\varphi }\rfloor \quad \longrightarrow ^*\quad (e\,e_1^{\beta _1}\,\dots \,e_n^{\beta _n})^{\#}\,\rho '^{\#}\,\lfloor {\bigcirc \varphi }\rfloor \end{aligned}$$

as expected. All other cases are given in the same way.

Now the definition of the translation has been given in sufficient detail, we believe. It is not difficult to establish the following lemma.

Lemma 1

The value tree of \( e^{\#}\,\rho ^{\#}\,\lfloor {\varphi }\rfloor \) is equivalent to \( \mathcal {T}[e,\rho \models \varphi ] \), provided that \( \rho \) and \( \varphi \) are reachable from the definition of \( \mathcal {T}[e_1, \emptyset \models \varphi _0] \) for some expression \( e_1 \) of type \( \star \).

Theorem 3 is a consequence of this lemma: \( e_0' \) can be defined as \( e_0^{\#}\,\emptyset ^{\#}\,\lfloor {\varphi _0}\rfloor \).

The decidability result is a corollary of Theorems 2 and 3.

Theorem 4

Let \( (\mathcal {P},e) \) be a program and \( \varphi \) an LTLC formula. If \( (\mathcal {P},e) \) and \( \varphi \) contain only finite base types, then one can effectively decide whether \( e, \emptyset \models \varphi \).

Remark 2

Let us briefly discuss the time complexity of the algorithm. The cost of the translation is negligible; we estimate the running time of the higher-order model checking. If we fix the property automaton to \( \mathcal {A}\) of Theorem 2, the higher-order model checking can be solved in time \( O(P^2 \, \mathbf {exp}_N( poly (A\,D))) \) for some polynomial \( poly \) ([13, Section 5] adapted to our setting), where \( P \), \( N \), \( A \) and \( D \) are parameters determined by the program after translation; \( P \) is the size, \( N \) is the order, \( A \) is the maximum arity of types and \( D \) is the maximum number of values in base types. An easy calculation shows that

$$\begin{aligned} P = O(|(\mathcal {P},e)| \times |\varphi |) \qquad N \le order (\mathcal {P},e) + 2 \qquad D = O(|\varphi |) \qquad A = O(|\varphi |) \end{aligned}$$

where \( |(\mathcal {P},e)| \) and \( |\varphi | \) are the sizes of the program and of the formula.

6 Discussions

Compositional Reasoning. LTLC model checking is a kind of whole-program verification. Indeed \( C[f], \rho \models \varPhi _T(f) \), where \( \varPhi _T \) is the LTLC formula corresponding to a dependent type \( T \) (see Sect. 4.1), only means that the behaviour of \( f \) in the context \( C \) does not violate the specification \( T \); it does not ensure that \( f \) meets \( T \) in other contexts as well.

This is in contrast to a compositional approach such as a type-based one, in which \( \vdash t : T \) means that \( t \) satisfies the specification \( T \) in whatever context \( t \) is used. In this sense \( \varPhi _T(f) \) is not like a type judgement but like dynamic monitoring of a contract [9].

A way to fill the gap is to consider all possible contexts. That is, we define \( \models f : T \) to mean that \( C[f], \emptyset \models \varPhi _T(f) \) for every context \( C \). In a sufficiently expressive programming language, \( \forall C.\big (\, C[f], \emptyset \models \varPhi _T(f)\,\big ) \) is equivalent to \( C_0[f], \emptyset \models \varPhi _T(f) \) for a certain “worst” context \( C_0 \); this observation gives us a way to reduce compositional reasoning to whole-program analysis. This strategy is actually used in [23], for example.

A typical way to construct the “worst” context \( C_0 \) is to use nondeterminism [23]; intuitively \( C_0 \) is a “maximally” nondeterministic context, which may do anything allowed. Unfortunately this construction is not directly applicable to our case, since our reducibility result (in particular, Theorem 2) essentially relies on the determinism of programs.

Non-deterministic Programs. Determinism of programs is essential to our reducibility result. In fact, even the definition of the satisfaction relation becomes “incorrect” in the presence of non-determinism.

To see the reason, consider an LTLC formula

$$\begin{aligned} \varphi \quad :=\quad ifcall \,f \vee \lnot ifcall \,f, \end{aligned}$$

which is obviously true for every program. By definition, we have

$$\begin{aligned} e, \rho \models ifcall \,f \vee \lnot ifcall \,f \qquad \text{ iff }\qquad e, \rho \models ifcall \,f \quad \text{ or }\quad e, \rho \models \lnot ifcall \,f, \end{aligned}$$

for every expression \( e \). This rule is problematic in the presence of nondeterminism. For example, consider \( e = (f\,()) \oplus () \) where \( \oplus \) is the nondeterministic branching. This expression decides nondeterministically whether it calls \( f \) or not. Then \( e, \rho \models ifcall \,f \vee \lnot ifcall \,f \) but neither \( e, \rho \models ifcall \,f \) nor \( e, \rho \models \lnot ifcall \,f \).

This problem can be easily fixed by changing the definition of the satisfaction relation. In the presence of nondeterminism, it should be a relation \( \pi , \rho \models \varphi \) on an infinite reduction sequence \( \pi \) (instead of an expression \( e \)), a valuation and a formula.

However Theorem 2 cannot be adapted to the new definition. The definition of \( \mathcal {T}[e, \rho \models \varphi ] \) is so deeply tied to the current definition of satisfaction that we cannot obtain a variant of Theorem 2 applicable to the nondeterministic setting.

Actually we conjecture that LTLC model-checking for nondeterministic programs is undecidable even for programs with only finite data types. The proof of the conjecture is left for future work.

7 Related Work

LTLC model checking is a kind of temporal verification of higher-order programs, which has been extensively studied [11, 12, 14, 15, 22]. The temporal properties of higher-order programs have also been studied in the context of contracts, named temporal higher-order contracts [7].

Alur et al. proposed a linear temporal logic called CARET [2], which is designed for specifying properties of first-order programs modeled by Recursive State Machines [3] and Pushdown Systems [16]. Neither CARET nor LTLC subsumes the other. On the one hand, CARET cannot describe properties of higher-order functions, such as “a function argument of some function is eventually called.” On the other hand, CARET can describe caller properties such as “a caller of the function currently invoked never returns,” which cannot be expressed in LTLC. An extension of LTLC for specifying caller properties is left for future work. Alur and Madhusudan proposed Visibly Pushdown Languages (VPL) [4], which can specify properties of function calls and returns, and subsumes CARET. Like CARET, VPL is for first-order programs, not for higher-order programs.

Recently Satake and Unno proposed a dynamic logic for higher-order programs, named HOT-PDL [22]. Their logic is not directly comparable to ours, as it is for call-by-value programs. The gap can be partially filled by applying the CPS translation, and the formulas of their logic can be translated to LTLC formulas in many cases, although we would need to extend LTLC with an anonymous call operator \( call \,\_(\widetilde{\beta }).\varphi \) to fully capture their logic. Many LTLC formulas, such as those in Example 2 and Sect. 4.3, cannot be expressed in HOT-PDL.

Applications of HORS model checking to program verification have been studied [11, 12, 14, 15, 17, 18, 22]. Decidability of resource usage verification has been proved in [12] by using a program translation tailor-made for the resource usage verification problem. The argument in Sect. 4.3 together with Theorem 4 gives another, more principled proof of the decidability result, although the current argument proves only part of the results of [12].

8 Conclusion

We have proposed a temporal logic called LTLC, which is an extension of LTL and can specify properties of call-by-name higher-order programs. Thanks to the call operator, LTLC can describe properties of the arguments of the function currently being called. For example, LTLC can specify the order of function calls, such as “the first argument passed to the function f is called before the call of the second argument passed to f.” We have shown that LTLC model checking is decidable for finite-data deterministic programs via a reduction to HORS model checking.

The most important future work is to prove the undecidability (or possibly the decidability) of LTLC model checking for non-deterministic programs. To further widen the scope of our method, it is worth extending LTLC to specify branching properties, by embedding the call operator into CTL* or the modal \(\mu \)-calculus.