1 Introduction

In computer science, program termination is the quintessential example of a property that is undecidable, a fact that is well-known as the undecidability of the halting problem [12]. This undecidability implies that it is not possible to build a compiler that would verify whether a program terminates for any given input. Despite this undecidability, it is possible to construct algorithms that partially decide termination, i.e., they correctly answer whether an input program “terminates or not”, but may also answer “do not know”. Termination analysis of programs is an active area of research. Indeed, substantial progress in this area is regularly presented in meetings such as the International Workshop on Termination and the Annual International Termination Competition.

To formally verify correctness of termination analysis algorithms, it is often necessary to specify and prove equivalence among multiple notions of termination. Given a formal model of computation, one natural notion of termination is specified as for all inputs there exists an output provided under the operational semantics of the model. Another notion of termination could be specified considering whether or not the depth of the expansion tree of computation steps for all inputs is finite. These two notions rely on the semantics of the computational model. A more syntactic approach, attributed to Turing [13], is to verify that the actual arguments decrease in any repeating control structure, e.g., recursion, unbounded loop, fix-point, etc., of the program according to some well-founded relation. This notion is used in the majority of proof assistants, where the user must provide the well-founded relation.

The main contribution of this work is the formalization in the Prototype Verification System (PVS) [10] of the theorem of undecidability of the halting problem for a model of computation given by a functional language called PVS0. The formal development includes the definition of PVS0, its operational semantics, and the specification and proof of several concepts used in termination analysis of PVS0 programs. For the undecidability proof of the halting problem, only the semantic notions of termination are used. Turing termination for the language PVS0 is also discussed to show how semantic and syntactic termination criteria are related. The formalization is available as part of the NASA PVS Library under the directory PVS0.Footnote 1 All lemmas and theorems presented in this paper were formalized and verified in PVS.

2 Semantic Termination

The PVS0 language is a simple functional language whose expressions are described by the following grammar.

The grammar above is specified in PVS through the abstract data type:

figure a

A PVS specification of an abstract data type includes the constructors, e.g., ite, the accessors, e.g., get_cond, and the recognizers, e.g., ite?. In this data type, T is a parametric nonempty type that represents the type of values that serve as inputs and outputs of PVS0 programs. Furthermore, \(\mathtt{cnst}\) is the constructor of constant values of type T, \(\mathtt{vr}\) is the unique variable constructor, op1 and op2 are constructors of unary and binary operators, respectively, rec is the constructor of recursion, and ite is the constructor of conditional “if-then-else” expressions. The first parameter of the constructors op1 and op2 is an index representing built-in unary and binary operators, respectively.

The uninterpreted type T and uninterpreted unary and binary operators enable the encoding of arbitrary first-order PVS functions as programs in PVS0. Indeed, the operational semantics of \(\mathtt{PVS0Expr}\) is given in terms of a non-empty set \( Val \), which interprets the type T. The type \(\mathtt{PVS0}[ Val ]\) of PVS0 programs with values in \( Val \) consists of all 4-tuples of the form \((O_1,O_2,\bot , expr )\), such that

  • \(O_1\) is a list of PVS functions of type \( Val \rightarrow Val \), where \(O_1(i)\), i.e., the i-th element of the list \(O_1\), interprets the index i in the constructor \(\mathtt{op1}\),

  • \(O_2\) is a list of PVS functions of type \( Val \times Val \rightarrow Val \), where \(O_2(i)\), i.e., the i-th element of the list \(O_2\), interprets the index i in the constructor \(\mathtt{op2}\),

  • \(\bot \) is a constant of type \( Val \) representing the Boolean value false in the conditional construction ite, and

  • \( expr \) is a \(\mathtt{PVS0Expr}[ Val ]\), which is the syntactic representation of the program itself.

Henceforth, \(|O_1|\) and \(|O_2|\) represent the length of the lists \(O_1\) and \(O_2\), respectively. The choice of lists of functions for interpreting unary and binary operators helps in the enumeration of PVS0 programs, which is necessary in the undecidability proof.

Given a program \((O_1,O_2,\bot ,e_f)\) of type \(\mathtt{PVS0}[ Val ]\), the semantic evaluation of an expression e of type \(\mathtt{PVS0Expr}[ Val ]\) is given by the curried inductive relation \(\varepsilon \) of type \(\mathtt{PVS0}[ Val ]\rightarrow (\mathtt{PVS0Expr}[ Val ]\times Val \times Val ) \rightarrow \mathtt{bool}\) defined as follows.

Intuitively, the relation \(\varepsilon (O_1,O_2,\bot ,e_f)(e, v_i, v_o)\) defined below holds when given a program \((O_1,O_2,\bot ,e_f)\) the evaluation of the expression e on the input value \(v_i\) is the value \(v_o\).

In the definition of \(\varepsilon \), the parameters \(e_f\) and e are needed since the evaluation of a program \((O_1,O_2,\bot ,e_f)\) leads to evaluation of sub expressions e of \(e_f\) and, when a recursive call is evaluated, the whole expression \(e_f\) should be considered again (see the recursive case \(\mathtt{rec}({e_{1}})\) above).

For example, consider below a PVS0 program that computes the Ackermann function.

Example 1

Let \( Val \) be the set \(\mathbb {N}\times \mathbb {N}\) of pairs of natural numbers, \(\top =(1,0)\), \(\bot =(0,0)\), and a be the PVS0 program \((O_1,O_2,\bot ,e_a)\), where

It is proved in PVS that a computes the Ackermann function, i.e., for any \(n,m,k \in \mathbb {N}\), \( ackermann (m,n) = k\) if and only if \(\varepsilon (a)(e_a,(n,m),(k,i))\), for some i, where \( ackermann \) is the recursive function defined in PVS as

figure b

In the definition of a, the type \( Val \) encodes the two inputs of the Ackermann function, but also the output of the function, which is given by the first entry of the second pair.

The proof of one of the implications in the statement of Example 1 proceeds by induction using a lexicographic order on (mn). The other implication is proved using the induction schema generated for the inductive relation \(\varepsilon \). Although it is not logically deep, this proof is tedious and long. However, it is mechanizable assuming that the PVS function and the PVS0 program share the same syntactical structure. As part of the work presented in this paper, a PVS strategy that automatically discharges equivalences between PVS functions and PVS0 programs was developed. This strategy is convenient since one of the objectives of this work is to reason about computational aspects of PVS functions through their embeddings in PVS0.

Example 1 also illustrates the use of built-in operators in PVS0. Despite its simplicity, this language is not minimal from a fundamental point of view. Indeed, since the type T is generic, any PVS function can be used as a building block in the construction of a PVS0 program. This feature is justified since all PVS functions are total. Therefore, they can be considered as atomic built-in operators. However, in contrast to proof assistants based on constructive logic, PVS allows for the definition of non-computable functions. The consequences of these features will be clear in the undecidability proof of the halting problem.

The following lemma states that the semantic evaluation relation \(\varepsilon \) is deterministic.

Lemma 1

Let \( pvso \) be a program of type \(\mathtt{PVS0}[ Val ]\). For any expression e of type \(\mathtt{PVS0Expr}[ Val ]\) and all values \(v_i,v_o',v_o'' \in Val \),

The proof of this lemma uses the induction schema generated for the inductive relation \(\varepsilon \).

The relation \(\varepsilon \) is functional but not total, i.e., there are programs \( pvso \) and values \(v_i\), for which there is no value \(v_o\) that satisfies \(\varepsilon ( pvso )(pvso_e,v_i,v_o)\), where \( pvso _e\) is the program expression in \( pvso \). This suggests the following definition of the semantic termination predicate.

$$\begin{aligned} T_{\varepsilon }( pvso ,v_i) := \exists \, {v_o} \in Val \,:\, \varepsilon ( pvso )( pvso _e,{v_i},{v_o}). \end{aligned}$$

This predicate states that for a given program \( pvso \) and input \(v_i\), the evaluation of the program’s expression \( pvso _e\) on the value \(v_i\) terminates with the output value \(v_o\). The program \( pvso \) is total with respect to \(\varepsilon \) if it satisfies the following predicate.Footnote 2

$$\begin{aligned} T_{\varepsilon }( pvso ) := \forall \, v \in Val \,:\, T_{\varepsilon }( pvso ,v). \end{aligned}$$

Semantic termination can also be specified by a function \(\chi \) of type \(\mathtt{PVS0}[ Val ]\rightarrow (\mathtt{PVS0Expr}[ Val ]\times Val \times \mathbb {N}\rightarrow Val \cup \{\varvec{\diamondsuit }\})\) defined as follows.

In the definition of \(\chi \), n is the maximum number of nested recursive calls that are allowed in the evaluation of the recursive program for a given input. If this limit is reached during an evaluation, the function \(\chi \) returns the symbol \(\varvec{\diamondsuit }\), which represents a “none” value. This function can be used to define an alternative predicate for semantic termination as follows.

$$\begin{aligned}T_{\chi }( pvso ,v) := \exists \, n \in \mathbb {N}\,:\, \chi ( pvso )( pvso _e,v,n) \ne \varvec{\diamondsuit }. \end{aligned}$$

The program \( pvso \) is total with respect to \(\chi \) if it satisfies the following predicate.

$$\begin{aligned} T_{\chi }( pvso ) := \forall \, v \in Val \,:\, T_{\chi }( pvso ,v). \end{aligned}$$

The following theorem states that \(T_{\varepsilon }\) and \(T_{\chi }\) captures the same notion of termination.

Theorem 1

Let \( pvso \) be a PVS0 program of type \(\mathtt{PVS0}[ Val ]\). The following conditions hold:

  1. 1.

    For any \(v_i \in Val \) and e of type \(\mathtt{PVS0Expr}[ Val ]\), \(\varepsilon ( pvso )(e,v_i,v_o)\) if and only if \(v_o = \chi ( pvso )(e,{v_i},n)\), for some n, where \(v_o \ne \varvec{\diamondsuit }\).

  2. 2.

    For any \(v \in Val \), \(T_{\varepsilon }( pvso ,v)\) if and only if \(T_{\chi }( pvso ,v)\).

  3. 3.

    \(T_{\varepsilon }( pvso )\) if and only if \(T_{\chi }( pvso )\).

In Theorem 1, Statement 2 and Statement 3 are consequences of Statement 1. Assuming \(T_{\chi }( pvso ,v)\), the proof of the Statement 1 requires the construction of the number \(\mu ( pvso ,v) \in \mathbb {N}^+\) as follows.

$$\begin{aligned} \mu ( pvso ,v) := \min (\{n:\mathbb {N}\,|\, \chi ( pvso )( pvso _e,v,n) \ne \varvec{\diamondsuit }\}). \end{aligned}$$

This number satisfies the following property.

Lemma 2

Let \( pvso \) be a program of type \(\mathtt{PVS0}[ Val ]\) and \(v \in Val \) such that \(T_{\chi }( pvso ,v)\). For any \(n \ge \mu ( pvso ,v)\), \(\chi ( pvso )( pvso _e,{v},n) \ne \varvec{\diamondsuit }.\)

A PVS0 program \( pvso \) that satisfies \(T_{\chi }( pvso )\) (or equivalently, \(T_{\varepsilon }( pvso )\)) is said to be terminating. The following lemma shows that there are non-terminating PVS0 programs.

Lemma 3

Let \(\varDelta = (O_1,O_2,\bot ,\mathtt{rec}(\mathtt{vr}))\). For any \(v \in Val \), \(\lnot T_{\chi }(\varDelta ,v)\).

The proof proceeds by showing that if \(\chi (\varDelta )(\varDelta _e,v,n) \ne \varvec{\diamondsuit }\) for some v, where \(n = \mu (\varDelta ,v)\), then it is also the case that \(\chi (\varDelta )(\varDelta _e,v,n-1) \ne \varvec{\diamondsuit }\). By definition of \(\mu \), this is a contradiction.

Proving that a PVS0 program terminates using the semantic predicates \(T_{\varepsilon }( pvso )\) and \(T_{\chi }( pvso )\) is not very convenient. Indeed, these notions rely on actual computations of the program on a potentially infinite set of inputs. A syntactic criterion here called Turing termination relies on the existence of a well-founded relation < over a type A and measure function \(\mathcal{M}\) of type \( Val \rightarrow A\) on the parameters of the recursive function such that \(\mathcal{M}\) strictly decreases on every recursive call. This notion is adopted as the meta-theoretical definition of termination in several proof assistants. In PVS, this notion is implemented through the generation of the so called termination TCCs (Type Correctness Conditions), where the measure function and the well-founded relation are provided by the user. This notion is formalized by defining, in PVS, an algorithm that generates termination TCCs for PVS0 programs. It has been proved that this Turing termination is equivalent to \(T_{\varepsilon }( pvso )\) and, therefore, to \(T_{\chi }( pvso )\). The formal infrastructure that is needed for defining Turing termination of PVS0 programs is out of the scope of the present work.

For a PVS0 program \((O_1,O_2,\bot ,e_f)\) and two values \(v_i\) and \(v_r\), the definition of \(v_i \rightarrow v_r\) is given by:

$$ v_i \rightarrow v_r := \varepsilon (O_1,O_2,\bot ,e_f)(e_f,v_i,v_r) \wedge \mathcal{M}(v_r) < \mathcal{M}(v_i)$$

The definition above relates an input value \(v_i\) of a PVS0 program and the argument \(v_r\) of a recursive call such that \(\mathcal{M}(v_r) < \mathcal{M}(v_i)\). A key construction that is needed in the equivalence proof between the syntactic and the semantic termination criteria is the definition of the number \(\varOmega (v)\), where \(v \in Val\), as follows.

$$\varOmega (v) := \min (\{n:\mathbb {N}^+\,|\, \forall \,v' \in V\,:\,\lnot (v \rightarrow ^n v')\}).$$

Intuitively, \(\varOmega (v)\) is the length of the longest path downwards starting from v. The following lemma states a relation between \(\mu \) and \(\varOmega \).

Lemma 4

  Let \( pvso \) be a PVS0 program that satisfies Turing termination for a well-founded relation < over A and a measure function \(\mathcal{M}\). For any value \(v \in Val \), \(\mu ( pvso ,v) \le \varOmega (v)\).

3 Partial Recursive Functions

By design, the PVS0 language can directly encode any PVS function f of type \(T \rightarrow T\), where T is an arbitrary PVS type. This feature enables the use of PVS functions as built-in operators in PVS0 program. The following lemma states that any PVS function can be embedded in a terminating PVS0 program.

Lemma 5

Let f be a PVS function of type \(T \rightarrow T\). The program \( mk\_pvs0 (f) = (O_1,O_2,\bot ,e_f)\) of type \(\mathtt{PVS0}\), where \(e_f = \mathtt{op1}(0,\mathtt{vr})\) and \(O_1(0)(t) = f(t)\), satisfies the following properties:

  • \( mk\_pvs0 (f)\) is terminating, i.e., \(T_{\varepsilon }( mk\_pvs0 (f))\).

  • For any \(t\in T\), \(\varepsilon ( mk\_pvs0 (f))(e_f,t,f(t))\).

The converse of Lemma 5 is not true in general as illustrated by the following theorem.

Lemma 6

There is no PVS function f of type \(T \rightarrow T\) such that for any \(t \in T\), \(\varepsilon (\varDelta )(\mathtt{rec}(\mathtt{vr}),t,f(t))\), where \(\varDelta \) is the function defined in Lemma 3.

However, any PVS0 program, even non-terminating ones, can be encoded as a PVS function of type \(T \rightarrow T\cup \{\varvec{\diamondsuit }\}\), as stated in the following lemma.

Lemma 7

Let \( pvso \) be a, possibly non-terminating, program of type PVS0 and \( pvso _e\) the PVS0 expression of \( pvso \). The PVS function

$$f(t) := \mathtt{IF}\;T_{\chi }( pvso ,t) \;\mathtt{THEN}\;\chi ( pvso )( pvso _e,t,\mu ( pvso ,t)) \;\mathtt{ELSE}\;\varvec{\diamondsuit }$$

satisfies the following property for any \(t \in T\).

$$T_{\varepsilon }( pvso ,t) \text{ if } \text{ and } \text{ only } \text{ if } \varepsilon ( pvso )( pvso _e,t,f(t)).$$

The proofs of the previous lemmas are straightforward applications of the definitions of \(T_{\varepsilon }\), \(\varepsilon \), \(T_{\chi }\), and \(\chi \).

A consequence of Lemmas 5 and 7 is that is possible to define an oracle of type \(\mathtt{PVS0}[ Val ]\), where \( Val = \mathtt{PVS0}\), that decides if a program of type \(\mathtt{PVS0}\) is terminating or not. The existence of that oracle is stated in the following theorem.

Theorem 2

The program \( Oracle = (O_1,O_2,\bot ,e_f)\) of type \(\mathtt{PVS0}[ Val ]\), where \( Val = \mathtt{PVS0}\), \(e_f = mk\_pvs0 (\mathtt{LAMBDA}( pvso :\mathtt{PVS0}):\mathtt{IF}\;T_{\varepsilon }( pvso )\;\mathtt{THEN}\;\top \;\mathtt{ELSE}\;\bot )\), and \(\top \ne \bot \), has the following properties.

  • \( Oracle \) is a terminating \(\mathtt{PVS0}[ Val ]\) program, i.e., \(T_{\varepsilon }( Oracle )\).

  • \( Oracle \) decides termination of any \(\mathtt{PVS0}\) program, i.e., for any \( pvso \) of type \(\mathtt{PVS0}\),

    $$\begin{aligned} \chi ( Oracle )( Oracle _e, pvso ,\mu ( Oracle , pvso )) = \top \text{ if } \text{ and } \text{ only } \text{ if } T_{\varepsilon }( pvso ). \end{aligned}$$

This counterintuitive result is possible because PVS, in contrast to proof assistants based on constructive logic, allows for the definition of total functions that are non-computable, e.g., \(\mathtt{LAMBDA}( pvso :\mathtt{PVS0}):\mathtt{IF}\;T_{\varepsilon }( pvso )\;\mathtt{THEN}\;\top \mathtt{ELSE}\;\bot \). These non-computable functions can be used in the construction of terminating PVS0 programs through the built-in operators. Therefore, in order to formalize the notion of partial recursive functions in PVS0, it is necessary to restrict the way in which programs are built.

First, the basic type T is set to \(\mathbb {N}\), i.e., \( Val {} = \mathbb {N}\), where the number 0 represents the value false, i.e., \(\bot =0\). Any value different from 0 represents a true value, in particular \(\top =1\). Second, the built-in operators used in the construction of programs are restricted by a hierarchy of levels: the operators in the first level can only be defined using projections (\(\varPi _1(x,y:\mathbb {N}) := x\) and \(\varPi _2(x,y:\mathbb {N}) := y\)), successor (\(succ(x: \mathbb {N}): x+1\)), and greater or equal than (\(ge(x,y: \mathbb {N}): \mathtt{IF}\;x \ge y \;\mathtt{THEN}\;\top \; \mathtt{ELSE}\;\bot {}\)) functions. Operators in higher levels can only be constructed using programs from the previous level. This idea is formalized by the predicate as shown below.

where the function \(\kappa _{ 2 }\) is an encoding of pairs of natural numbers onto natural numbers defined as \(\kappa _{ 2 }(m,n) := (m+n+1) \times (m+n)/2 + n\).

The type PartialRecursive is defined to be a subtype of \(\mathtt{PVS0}[\mathbb {N}]{}\) containing all the programs \( pvso \) such that there is a natural n for which holds. Additionally, Computable is a subtype of PartialRecursive containing those elements that are also terminating according to the aforementioned definitions.

The following theorem states that PartialRecursive, and thus Computable, are enumerable types.

Theorem 3

There exists a PVS function of type \(\mathbb {N}\rightarrow \mathtt{PartialRecursive}\) that is surjective.

As a corollary of this theorem, the inverse of this surjective function, denoted as \(\kappa _{P}{}\) is an injective function of type \(\mathtt{PartialRecursive}\rightarrow \mathbb {N}\).

The proof of Theorem 3 is technically involved. The proof proceeds by showing that each level is enumerable. Therefore, the function \(\kappa _{P}{}\) exists since the countable union of countable sets is also countable. A similar work is presented in [4] by Foster an Smolka. They encode a lambda term into another lambda term such that it represents a natural number using the Scott numbers codification and use that encoding to formalize in Coq the Rice’s Theorem.

The function \(\kappa _{P}{}\) is used in the proof of the undecidability of the halting problem for PVS0 to encode PartialRecursive programs as inputs of a \(\mathtt{Computable}\) program. This function is a key element in the construction of the self-reference argument used in the diagonalization approach of the undecidability proof.

4 Undecidability of the Halting Problem

The classical formalization of the undecidability of the halting problem starts by assuming the existence of an oracle capable of deciding whether a program halts for any input. A Gödelization function transforms the tuple of program and input into a single input to the oracle. After that, using the oracle, another program is created such that if the encoded program halts it enters into an infinite loop. Otherwise, it produces an answer and halts. Passing this program as an input to itself results in the expected contradiction.

Here, the undecidability of the halting problem was formalized using the notions of termination for PVS0 defined in the previous section and the Cantor’s diagonalization technique.

Theorem 4

(Undecidability of the Halting Problem for PVS0). There is no program \( oracle {} = (O_1,O_2,\bot ,e_o)\) of type \(\mathtt{Computable}\) such that for all \( pvso = (O_1{}',O_2{}',\bot ,e_f)\) of type \(\mathtt{PartialRecursive}{}\) and for all \(n \in \mathbb {N}\),

$$\begin{array}{l} T_{\varepsilon }( pvso ,n) \text{ if } \text{ and } \text{ only } \text{ if } \lnot \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(\kappa _{P}( pvso ),n),\bot {}). \end{array}$$

Proof

The proof proceeds by assuming the existence of an oracle to derive a contradiction. Suppose there exists a program \( oracle {} = (O_1,O_2,\bot ,e_o)\) of type \(\mathtt{Computable}\) such as the one presented in the statement of the theorem. Then, a \(\mathtt{PVS0}[\mathbb {N}]{}\) program \( pvso {} = ( O_1{}',O_2{}',\bot {},e_f)\) can be defined, where \(O_1'(k) = O_1(k)\), for \(k < |O_1|\), \(O_2'(k) = O_2(k)\), for \(k < |O_2|\), and

  • \(O_1{}'(|O_1|)(i) = choose (\{ a: \mathbb {N}\;|\; \varepsilon ( oracle {})(e_o,i,a)\})\),

  • \(O_2{}'(|O_2|)(i,j) = choose (\{ a: \mathbb {N}\;|\; \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(i,j),a)\})\), and

  • \(e_f = \mathtt{ite}(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}),rec(\mathtt{vr}),\mathtt{vr})\).

The PVS function \( choose \) returns an arbitrary element from a non-empty set. The sets used in the definitions of \(O_1'\) and \(O_2'\) are non-empty since \( oracle \) is Computable and, therefore, terminating. The program \( pvso \) is built in such a way that it belongs to the next level from the level of \( oracle \).

Let n be the natural number \(\kappa _{P}( pvso {})\). The rest of the proof proceeds by case analysis.

  • \(\mathbf{Case\,1\mathbf{:}}\ \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),\bot {})\). This case holds if and only if \(\lnot T_{\varepsilon }( pvso ,n)\). Expanding \(T_{\varepsilon }\) one obtains

    $$\begin{aligned} \lnot \exists (v : \mathbb {N}): \exists (v_o : \mathbb {N}): \begin{array}[t]{l} \varepsilon ( pvso {})(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}), n, v_o) \;\wedge \\ \mathtt{IF}\;v_o \ne \bot {}\; \\ \mathtt{THEN}\;\varepsilon ( pvso {})(rec(\mathtt{vr}), n, v) \\ \mathtt{ELSE}\;\varepsilon ( pvso {})(\mathtt{vr}, n, v).\end{array} \end{aligned}$$
    (1)

    Expanding \(\varepsilon \) in \(\varepsilon ( pvso {})(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}), n, v_o)\) yields

    $$\begin{array}{l} choose (\{ a: \mathbb {N}\;|\; \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),a)\}) = v_o.\end{array}$$

    Since \(\varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),\bot {})\) holds, \(\bot {} = v_o\). Therefore, Formula (1) is equivalent to

    $$\begin{aligned} \lnot \exists (v : \mathbb {N}): \varepsilon ( pvso {})(\mathtt{vr}, n, v). \end{aligned}$$
    (2)

    The predicate \(\varepsilon ( pvso {})(\mathtt{vr}, n, v)\) holds if and only if \(n=v\). Hence, Formula (2) states that \(\lnot \exists (v : \mathbb {N}): n=v\), where n is a natural number. This is a contradiction.

  • \(\mathbf{Case\,2\mathbf{:}}\ \lnot \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),\bot {})\). This case holds if and only if \(T_{\varepsilon }( pvso ,n)\). From Theorem 1, \(T_{\chi }( pvso ,n).\) If the proof starts directly from \(T_{\varepsilon }( pvso ,n)\), after expanding and simplifying it, \(T_{\varepsilon }( pvso ,n)\) is obtained once again, which implies that there is not such an n, giving a contradiction. However, since PVS does not accept the definition of a function that enters into such an infinite loop, the solution is to apply the equivalence Theorem 1. Expanding the definition of \(T_\chi \) yields

    $$\exists \, m \in \mathbb {N}\,:\, \chi ( pvso )(\mathtt{ite}(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}),rec(\mathtt{vr}),\mathtt{vr}),n,m) \ne \varvec{\diamondsuit }.$$

    If there exists such m, it can be chosen as the minimal natural that makes the above proposition hold. Expanding the definition of \(\chi \) yields

    $$\begin{aligned} \left( \begin{array}{l} \mathtt{IF}\;\chi ( pvso )(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}) ,n,m) \ne \varvec{\diamondsuit }\;\mathtt{THEN}\;\\ \; \;\mathtt{IF}\;\chi ( pvso )(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}) ,n,m) \ne \bot {} \;\mathtt{THEN}\;\\ \quad \chi ( pvso )(rec(\mathtt{vr}) ,n,m) \\ \quad \mathtt{ELSE}\;\chi ( pvso )(\mathtt{vr},n,m) \\ \mathtt{ELSE}\;\varvec{\diamondsuit }\end{array}\right) \;\ne \;\varvec{\diamondsuit }. \end{aligned}$$
    (3)

    If the condition of the first if-then-else were false, then Formula (3) reduces to \(\varvec{\diamondsuit }\ne \varvec{\diamondsuit }\), which is a contradiction. Therefore, this condition must be true. After expanding and simplifying \(\chi \), \(\chi ( pvso )(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}) ,n,m)\) reduces to

    $$ choose (\{ a: \mathbb {N}\;|\; \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),a)\}).$$

    Let \(v = choose (\{ a: \mathbb {N}\;|\; \varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),a)\})\). If \(v = \bot {}\), then

    $$\varepsilon ( oracle {})(e_o,\kappa _{ 2 }(n,n),\bot {}).$$

    This is a contradiction since \(n=\kappa _{P}( pvso {})\). Thus, \(\chi ( pvso )(\mathtt{op2}(|O_2{}|,\mathtt{vr},\mathtt{vr}) ,n,m) \ne \bot {}\). Then, Formula (3) can be simplified to

    $$\chi ( pvso )(rec(\mathtt{vr}),n,m) \ne \varvec{\diamondsuit }.$$

    Finally, expanding \(\chi \) results in \(\chi ( pvso )(e_f,n,m-1) \ne \varvec{\diamondsuit }\). This contradicts the minimality of m,  completing the proof.    \(\square \)

5 Related Work

Formalization of models of computation is not a novelty. In recent work, Foster and Smolka [4] formalized, in Coq, Rice’s Theorem, which states that non-trivial semantic properties of procedures are undecidable. This theorem is a variant of Post’s Theorem, which states that a class of procedures is decidable if it is recognizable, co-recognizable, and logically decidable, and the fact that the class of total procedures is not recognizable. This work was done for a model of a functional language presented as a weak call-by-value lambda-calculus in which \(\beta \)-reduction can be applied only if the redex is not below an abstraction and if the argument is an abstraction. Results are adaptable for call-by-value functional languages and Turing-complete models formalized in Coq. Larchey-Wendling [6] gave a formalization that “the Coq type \(\mathtt{nat}^k \rightarrow \mathtt{nat}\) contains every recursive function of arity k which can be proved total in Coq.” There, nat is the Coq type for Peano naturals. This is a class of functions between the primitive functions \(\mathbb {N}^k \rightarrow \mathbb {N}\) and the partial recursive functions \(\mathbb {N}^k \rightharpoondown \mathbb {N}\). Proving that the former class of functions is a set of terms of Coq type \(\mathbb {N}^k \rightarrow \mathbb {N}\) is simple since Coq includes all required recursive schemes (basic functions plus composition and recursion). The paper advances into characterizing the type \(\mathbb {N}^k \rightharpoondown \mathbb {N}\), which is not straightforward and is related to the class of terminating functions.

Highly relevant related papers include [9] in which Norrish presented a formalization in HOL4 of several results of computability theory using as models recursive functions and the \(\lambda \)-calculus. The mechanizations include proofs of the equivalence of the computational power of both models, undecidability of the halting problem, existence of universal machines, and Rice’s Theorem in the \(\lambda \)-calculus. In addition, in [14] Xu et al. presented a formalization of computability theorems in Isabelle/HOL using as models Turing machines, abacus machines (a kind of register machines) and recursive functions. Formalized results include also undecidability of the halting problem and existence of universal Turing machines.

In contrast to those approaches, this paper deals with a computational model that is specified as a concrete functional programing language, namely PVS0. For this language, all the elements required in a classic-style proof of undecidability of termination, such as Gödelization of programs and inputs, are developed. Having a concrete programming language such as PVS0 enables the formalization and comparison of different termination analysis techniques for this language. In fact, the current formalization is part of a larger library that relates different termination criteria such as semantic termination, Turing termination [13], dependency pairs [1, 2, 15], and techniques based on the size change principle [5, 7, 11] such as calling context graphs [8] and matrix-weighted graphs [3].

6 Conclusion and Future Work

This paper presents the formalization of the undecidability of the halting problem for a simple functional language called PVS0. This formalization required the definition of several notions of termination, which are all proven to be equivalent. Since PVS0 generally allows for the encoding of non-computable functions through the use built-in operators, the undecidability proof is done on a restriction of PVS0 programs. First, the input type is restricted to natural numbers. Then, PartialRecursive and Computable programs are constructed using a layered approach where programs at level \(n+1\) can only depend on programs at level n or below. The existence of a surjective mapping from natural numbers to PVS0 programs constructed using this layered approach is formally verified. This mapping enables the definition of a Gödelization function for the type of programs \(\mathtt{PartialRecursive}\), which is crucial in the undecidability proof of the halting problem.

PVS0 is used to study termination and totality properties of PVS functions with the objective of automating the generation of measure functions in PVS recursive definitions. As part of this study, termination analysis techniques for PVS0 programs have been formalized and verified. Furthermore, strategies that automatically prove the equivalence between PVS recursive functions and PVS0 programs and that discharge termination TCCs of PVS recursive functions through their PVS0 counterpart have been developed. Future work includes extending the syntax of the PVS0 minimal language to support constructs such as let-in expressions and extending the proposed framework to support higher-order functions.

Finally, it is conjectured that the PVS0 language is Turing-complete. The proof of this property, which is work in progress, is not conceptually difficult but is technically involved. For simplicity, the PVS0 language only allows for the definition of one recursive function. Composition could be encoded using the lists of built-in operators. However, by definition, built-in operators are terminating. Therefore, to enable composition of (possibly) non-terminating programs, it is necessary to encode it in PVS0 itself using, among other things, the Gödelization of programs. This encoding increases the complexity of the proof of this conjecture.