1 Introduction

The goal of this paper is to sketch a fully classical construction of a satisfaction class for the language of first-order arithmetic. The initial introductory remarks describe the motivation for this endeavor.

It is well-known that non-standard models of arithmetic contain nonstandard arithmetical formulas. Indeed, for an arbitrary nonstandard model M there will be an \(a \in M\) such that \(M\,\models \)a is an arithmetical formula’ even though in the real world a is not a formula at all. A natural question arises whether it is possible to develop semantics for such objects (we will call them ‘formulas in the sense of the model’). First attempts in this direction were made by Robinson [8] and Krajewski [7]. To this aim, the notion of a satisfaction class has been introduced. Roughly, a satisfaction class is a subset of the model which can be treated as a reasonable interpretation of the satisfaction predicate obeying the usual Tarski-style compositional clauses.

Further work on satisfaction classes brought remarkable results. In particular, it transpired that a non-inductive satisfaction class can be constructed in an arbitrary countable recursively saturated model of arithmetic. This is the famous theorem of Kotlarski, Krajewski and Lachlan (KKL in short), which demonstrates the conservativity of non-inductive satisfaction axioms over first-order arithmetic.Footnote 1

However, the original proof of the theorem uses techniques which many readers found exotic. From the author’s experience, the machinery of (so-called) ‘approximations’, developed by KKL in their paper, remains one of the main stumbling blocks in the wider dissemination of this important result. Accordingly, the question has been asked whether the result can be proved by purely classical methods. One successful attempt in this direction has been recently made by Enayat and Visser [2]. In their paper, they showed how to construct a satisfaction class using classical techniques of formal semantics (namely, compactness and the union of elementary chain theorem).Footnote 2

In the present paper I propose to prove the theorem by the classical techniques of formal proof theory, namely, by cut elimination. Coupled with Enayat and Visser construction, this makes the fascinating field of satisfaction classes accessible to the students and the logicians, whose primary interest is either model theory or proof theory.

2 Preliminaries

The language of first-order Peano arithmetic will be denoted here as \(L_{PA}\). Primitive non-logical symbols of \(L_{PA}\) are ‘\(+\)’, ‘\(\times \)’, ‘0’, ‘S’. The expressions Var, Tm, \(Tm^c\), \(Fm_{L_{PA}}\) and \(Sent_{L_{PA}}\) will be used as referring (respectively) to the sets of variables, terms, constant terms, formulas and sentences of \(L_{PA}\). We will also use these expressions as shorthands for arithmetical predicates representing the relevant sets in PA. Given a model M, we write \(Sent_{L_{PA}}(M)\) for the set of all objects a such that \(a \in M\) and \(M\,\models \,Sent_{L_{PA}}(a)\).

The perspective adopted in this paper is that of truth, not satisfaction. Accordingly, we will consider the language \(L_T\) obtained from \(L_{PA}\) by adding the unary truth predicate ‘T(x)’ (instead of a binary satisfaction predicate). \(Sent_{L_T}\) is the set of sentences of \(L_T\).

We introduce now the basic theory of truth, denoted as \(CT^-\). The acronym ‘CT’ stands for ‘compositional truth’; the superscript indicates that in this theory we have induction only for the arithmetical language (not with the truth predicate).

Definition 1

\(CT^-\) is the theory in the language \(L_T\) axiomatized by all the usual axioms of Peano arithmetic together with the following truth axioms:

  • \( \forall s, t \in Tm^c \big ( T ( s = t)\equiv val(s)= val(t)\big ) \)

  • \(\forall \varphi \big ( Sent_{L_{PA}}(\varphi )\rightarrow (T \lnot \varphi \equiv \lnot T \varphi )\big ) \)

  • \(\forall \varphi \forall \psi \big ( Sent_{L_{PA}}(\varphi \vee \psi )\rightarrow (T (\varphi \vee \psi )\equiv (T \varphi \vee T \psi )) \big ) \)

  • \( \forall v \forall \varphi (x) \big ( Sent_{L_{PA}}(\forall v \varphi (v))\rightarrow (T (\forall v \varphi (v) )\equiv \forall x T ( \varphi (\dot{x}))) \big ) \)

In effect, the truth axioms of \(CT^-\) follow the familiar pattern of Tarski’s inductive truth definition. Let us only emphasize that the quantifier axiom given here employs numerals. A numeral is an arithmetical term of the form ‘\(S \ldots S(0)\)’; in other words, numerals are expressions obtained by preceding the symbol ‘0’ with arbitrarily many successor symbols. Accordingly, the intended meaning of the quantifier axiom is that ‘\(\forall v \varphi (v)\)’ is true iff the result of substituting an arbitrary numeral for v in \(\varphi (v)\) is true.

A ‘truth class’ in a model M is a subset T of M which makes all the axioms of \(CT^-\) true. Now, the conservativity theorem states that the truth axioms of \(CT^-\) are conservative not just over full Peano arithmetic, but also over the fragments of arithmetic sufficient for reconstructing basic theory of syntax. In particular, adding the truth axioms to \(I \varSigma _1\) (the theory just like Peano arithmetic but with induction restricted to \(\varSigma _1\) arithmetical formulas) produces a conservative extension of \(I \varSigma _1\). This is a direct corollary of the KKL theorem, which can be formulated as an expandability result concerning recursively saturated models of \(I \varSigma _1\).

Two definitions below introduce the notion of a recursive type and the concept of a recursively saturated model.

Definition 2

Let Z be a set of formulas with one free variable x and with parameters \(a_1 ... a_n\) from a model M. We say that:

  1. (a)

    Z is realized in M iff there is an \(s \in M\) such that every formula in Z is satisfied in M under a valuation assigning s to x.

  2. (b)

    Z is a type of M iff every finite subset of Z is realized in M.

  3. (c)

    Z is a recursive type of M iff apart from being a type of M, Z is also recursive.

Definition 3

M is recursively saturated iff every recursive type of M is realized in M.

The KKL theorem can now be formulated as follows.

Theorem 4

For every \(M\,\models \,I\varSigma _1\), if M is countable and recursively saturated, then there is a set \(T \subset M\) such that \((M, T)\,\models \,CT^-\).

3 From Consistent M-Logic to a Truth Class

From now on, we will work with a fixed countable and recursively saturated model M of \(I\varSigma _1\). As in the original KKL’s argument, our first step is the development of a proof system called ‘M-logic’ (ML in short). Intuitively, ML is a system which permits us to process arbitrary sentences in the sense of M, including the nonstandard ones. The system is described externally (not in the model) in the form of a sequent calculus.Footnote 3 We will use ‘\(\Rightarrow \)’ for the sequent arrow, with expressions of the form ‘\(\varGamma \Rightarrow \varDelta \)’ referring to sequents. We shall always assume that both \(\varGamma \) and \(\varDelta \) are externally finite sequences of M-sentences. Note that we do not admit formulas with free variables in the sequents.

The definition of M-logic is framed after Gentzen’s original system LK (see [3]). All the initial sequents have the form \(\varphi \Rightarrow \varphi \), for an arbitrary \(\varphi \in Sent_{L_{PA}}(M)\). The following rules of ML are copied directly from Gentzen’s system:

  • Weakening, left and right (W-left and W-right):

  • Exchange, left and right (E-left and E-right):

  • Contraction, left and right (C-left and C-right):

  • Cut:

  • \(\lnot \)-left and \(\lnot \)-right:

  • \(\wedge \)-left and \(\wedge \)-right (for arbitrary sentences A and B such that one of them is \(\varphi \)):

  • \(\vee \)-left and \(\vee \)-right (for arbitrary sentences A and B such that one of them is \(\varphi \)):

  • \(\rightarrow \)-left and \(\rightarrow \)-right:

In addition, M-logic has the following rules of inference:

  • The truth rule for literals (Tr-lit). Let \(\varphi \) be of the form \(t = s\) with \(M \models t=s\) or of the form \( t \ne s\) with \(M \models t \ne s\):

  • The M-rule, left and right (M-left, M-right):

  • \(\exists \)-right and \(\forall \)-left:

Proofs in ML are (possibly infinite) trees of finite height, where the height of a proof is defined (as usual) as the length of the maximal path. By definition, trees with no maximal path do not qualify as proofs in ML. Observe that in ML, the infinitary rules M-left and M-right replace the original rules \(\exists \)-left and \(\forall \)-right of Gentzen.Footnote 4 It should be also emphasized that in all the quantifier rules of ML we employ numerals. Thus, for example, in order to apply \(\exists \)-right, we need a sentence \(\varphi (a)\) with a numeral for a. In contrast, in Gentzen’s original system the rule \(\exists \)-right would permit us to derive \(\varGamma \Rightarrow \varDelta , \exists x \varphi (x)\) from \(\varGamma \Rightarrow \varDelta , \varphi (t)\) for an arbitrary term t, not necessarily a numeral. The effect of this modification of Gentzen’s system is that the truth class which we construct can contain term pathologies. Thus, in a model (MT) of \(CT^-\) which we eventually obtain there can exist a nonstandard formula \(\varphi (x)\) such that for some term t, \(\varphi (t)\) belongs to T (so that, loosely speaking, the model thinks that \(\varphi (t)\) is true), while the sentence \(\lnot \exists x \varphi (x)\) also belongs to T. In this way we obtain a disconcerting effect: the model thinks that \(\lnot \exists x \varphi (x)\) is true even though it considers as true some term instantiation of \(\varphi (x)\).Footnote 5 However, this accords with our formulation of \(CT^-\), where all the quantifier axioms employ numerals.

Lemma 5

If M-logic is consistent, then M can be expanded to a model of \(CT^-\).

For the proof of the lemma, we introduce first the family of unary arithmetical predicates ‘\(Pr_n(S)\)’ with the intuitive reading ‘sequent S has a proof in M-logic of height at most n’ (in short, S is n-provable). Observe that for each rule R of M-logic, the relation ‘S can be obtained by R from n-provable sequents’ can always be expressed by an arithmetical formula, provided that n-provability is arithmetically expressible. In view of this, we introduce the following definition.

Definition 6

  • \(Pr_0(S):= S\) is an initial sequent,

  • can be obtained by R from n-provable sequents).

By external induction on natural numbers it can be demonstrated that:

Observation 7

\(\forall k \in \omega ~\forall S \big [ML \vdash _k S \equiv M\,\models \,Pr_k(S) \big ]\).

We can now turn to the proof of Lemma 5.

Proof

Let \(\varphi _0, \varphi _1, \ldots \) be an enumeration of the set of M-sentences (this is the only place where the countability assumption is used).

We define:

\(T_0 = \emptyset \)

The expression ‘\(T_n\)’ on the right side of the definition (as in ‘\(ML \nvdash (T_n \rightarrow \lnot \varphi _n)\)’) stands for the conjunction of all the sentences \(\varphi _i\) or their negations, whichever of them were added on previous levels. We need to verify that whenever \(ML \nvdash (T_n \rightarrow \lnot \exists x \psi (x))\), there will exist an \(a \in M\) such that \(ML \nvdash (T_n \rightarrow \lnot \psi (a))\). This is the only place where recursive saturation is employed.

Thus, assume that \(ML \nvdash (T_n \rightarrow \lnot \exists x \psi (x))\). Define:

$$\begin{aligned} p(x) = \{\lnot Pr_k(T_n \rightarrow \lnot \psi (x)): k \in \omega \}. \end{aligned}$$

We observe that p(x) is a type. Otherwise there is a natural number k such that \(M\,\models \,\forall a Pr_k(T_n \rightarrow \lnot \psi (a))\). Hence for all a, \(ML \vdash _k T_n \rightarrow \lnot \psi (a)\). But then by the M-rule and cut, \(ML \vdash T_n \rightarrow \lnot \exists x \psi (x)\), which is a contradiction.

Since p(x) is a type, by recursive saturation there is an \(a \in M\) which realizes it and we have: \(\forall k M\,\models \,\lnot Pr_k(T_n \rightarrow \lnot \psi (a))\), hence the sentence \(T_n \rightarrow \lnot \psi (a)\) is not provable in M-logic, as required.

Checking that \((M, T)\,\models \,CT^-\) provided that M-logic is consistent is now routine and we leave it to the reader.    \(\Box \)

4 Consistency of M-Logic

At this stage all that is missing is the proof of consistency of M-logic. In KKL [5] the consistency of M-logic is proved by the technique of approximations. Here we propose cut elimination as the proof method. Let us start by the following simple observation.

Observation 8

If every sequent provable in M-logic has a cut-free proof, then M-logic is consistent.

Proof

If M-logic is inconsistent, then it proves that \(0 = 1\). By cut elimination, take a cut-free proof P of \(0=1\). It is easy to observe that every sentence in P has to be either atomic or negated atomic.Footnote 6 For a sequent S belonging to P, let the height of S in P be defined as the length of maximal path generated by S in P.Footnote 7 Let \(Tr_0(x)\) be the arithmetical truth predicate for atomic sentences and their negations. By external induction on the height of sequents in P, it can be demonstrated that for every sequent S in P, if all sentences in the antecedent of S are \(Tr_0\), then some sentence in the succedent of S is \(Tr_0\). It immediately follows that \(M\,\models \,Tr_0(0=1)\), which is impossible.    \(\Box \)

Lemma 9

For every sequent S, if S is provable in ML, then S has a cut-free proof in ML.

The aim of the remaining part of the paper is to lead the proof of Lemma 9 to the point at which it can be completed simply by repeating Gentzen’s original argument for cut elimination. It should be emphasized that we are not there yet. Our setting is that of possibly nonstandard sentences (sentences in the sense of M) and this generates an obstacle which first has to be removed.

In order to see the obstacle, let us recap the classical argument. The aim is to show that the system with the following mix rule (which is a generalized version of cut) admits mix elimination:

where \(\varSigma \) and \(\varDelta \) contain \(\varphi \) (the mix formula); \(\varSigma ^*\) and \(\varDelta ^*\) differ from \(\varSigma \) and \(\varDelta \) only in that they do not contain any occurrence of \(\varphi \). Since mix and cut produce equivalent proof systems, mix elimination gives us the desired result.

In the next stage it is demonstrated that mix can be eliminated from any proof which contains only a single application of the mix rule in the last step. This is done by double induction on the degree of proofs (main induction) and on the rank of proofs (subinduction). For proofs with mix used only in the last step, we define:

  • The left rank of the proof is the largest number of consecutive sequents in a path starting with the left-hand upper sequent of the mix and such that every sequent in the path contains the mix formula in the succedent.

  • The right rank of the proof is the largest number of consecutive sequents in a path starting with the right-hand upper sequent of the mix and such that every sequent in the path contains the mix formula in the antecedent.

  • The rank of the proof is the left rank of the proof + the right rank of the proof.

  • The degree of the proof is the syntactic complexity of the mix formula.

There is no problem in our setting with induction on the rank of proofs, since both the left and the right rank of the proof in ML will always be a (standard) natural number, restricted by the height of the proof. However, the induction on the degree of proofs is quite problematic. Since the mix formula might be non-standard, its syntactic complexity might be a non-standard element of M. Arguing externally by induction on non-standard numbers is clearly an invalid move and this is the main obstacle complicating the situation.

Our remedy is to replace the general notion of a degree with a notion relativized to a proof. Assume that we are given a proof P with mix applied only in the last step, that eliminates the (possibly non-standard) mix formula \(\varphi \). The guiding intuition formalized below is that in the cut elimination proof the syntactic shape of \(\varphi \) matters only comparatively. For example, \(\varphi \) might have the form \(\lnot \psi \). The intuition is that this will matter only provided that \(\psi \) itself (without negation) appears somewhere in P; otherwise \(\varphi \) might just as well be treated as a formula of complexity 0, even if it is non-standard.

Our objective is to make these ideas precise. In what follows the word ‘sequence’ should always be interpreted externally; in other words, sequences are finite or infinite objects in the real world, not necessarily elements of M. The length of a finite sequence \(a = (a_0 \ldots a_k)\) is the number of its elements, that is, \(lh(a) = k+1\). For an infinite sequence a we define lh(a) as \(\omega \).

Definition 10

  • \(x \vartriangleleft y\) (‘x is a direct subsentence of y’) is an abbreviation of the following arithmetical formula:

    figure a
  • Let \(\varphi \in Sent_{L_{PA}}(M)\). We say that s is a \(\vartriangleleft \)-sequence for \(\varphi \) iff \(s_0 = \varphi \) and for every \(k < lh(s)-1~s_{k+1} \vartriangleleft s_k\).

The notion of a degree can now be defined in the following way.

Definition 11

Let P be an arbitrary proof in ML with mix used only in the last step. Let \(\varphi \) be the mix formula in P. We define:

  • \(d(\varphi , P)\) (the degree of \(\varphi \) in P) \(= sup\{lh(s): s\) is a \(\vartriangleleft \)-sequence for \(\varphi \) such that for every \(k < lh(s)~ s_k \in P \}\).

  • d(P) (the degree of P) is defined as \(d(\varphi , P)\).

Lemma 12

Let P be an arbitrary proof in ML with mix used only in the last step. Then d(P) is a natural number (in other words, it is never \(\omega \)).

In order to prove the lemma, we introduce first the function str(x) (‘the structure of a formula x’). Let the letter p be a new symbol (it will be treated as a propositional variable). The function is defined as follows (Q is either \(\exists \) or \(\forall \) and \(\circ \) is an arbitrary binary connective).

Definition 13

  • \(str(\ulcorner t=s \urcorner ) = \ulcorner p \urcorner \)

  • \(str(\ulcorner \lnot \psi \urcorner ) = \lnot str(\ulcorner \psi \urcorner )\)

  • \(str(\ulcorner \varphi \circ \psi \urcorner ) = str(\ulcorner \varphi \urcorner ) \circ str(\ulcorner \psi \urcorner )\)

  • \(str(\ulcorner Qx \varphi \urcorner ) = Qx~str(\ulcorner \varphi \urcorner )\)

Intuitively, given a formula \(\varphi \), the function produces a formula which is exactly like \(\varphi \), except that the letter ‘p’ is substituted for all occurences of atomic formulas in \(\varphi \). Abbreviate \(str(\varphi ) = str(\psi )\) as \(\varphi \sim \psi \). The key property of the equivalence relation \(\sim \) is encapsulated in the following observation.

Observation 14

Let \(Z \subseteq Sent_{L_{PA}}(M)\). For every s, if s is a \(\vartriangleleft \)-sequence with elements from Z, then \(lh(s) \le card\big ( \{[\varphi ]_{\sim }: \varphi \in Z \}\big )\), where \([\varphi ]_{\sim }\) is a class of sentences \(\psi \) from Z such that \(\varphi \sim \psi \).

Let \(compl(\varphi )\) be the number of connectives and quantifiers in \(\varphi \). Observation 14 follows immediately from the following fact (we use \(\vartriangleleft ^*\) for the transitive closure of \(\vartriangleleft \)).

Fact 15

  1. (a)

    \(\forall \varphi , \psi \big (\varphi \vartriangleleft ^* \psi \rightarrow compl(\varphi ) < compl(\psi ) \big )\).

  2. (b)

    \(\forall \varphi , \psi \big (\varphi \sim \psi \rightarrow compl(\varphi ) = compl(\psi ) \big )\).

The proof of Fact 15 is done by easy induction and does not contain any surprises. For part (a), proceed with induction on the length of the \(\vartriangleleft \)-sequence s leading from \(\psi \) to \(\varphi \). Part (b) can be done by induction on the complexity of \(\psi \).

Proof of Lemma 12 (idea). Fix a proof P in ML which contains mix only in the last step. Let Z be the set of all sentences which appear in P. We demonstrate that \(\{[\varphi ]_{\sim }: \varphi \in Z \}\) is finite, which by Observation 14 guarantees the conclusion of Lemma 12.

For an arbitrary sequent S in P, let l(S) (the level of S) be the length of the path leading from S to the end sequent of P. We denote by \(S_i\) the set of all sequents in P whose level is not greater than i. Let \(Sent_i\) be defined as the set of all sentences which appear in some element of \(S_i\). Let k be the height of P. The task is to show that:

$$\begin{aligned} \forall i \le k \{[\varphi ]_{\sim }: \varphi \in Sent_i \} \text { is finite}. \end{aligned}$$

This will end the proof, since \(Sent_k = Z\).

We proceed by induction. Observe that for \(i=0\) the conclusion is trivial, as \(Sent_0\) itself is finite (\(Sent_0\) is the set of sentences which appear in the end sequent of P). The proof is concluded by demonstrating that \(\{[\varphi ]_{\sim }: \varphi \in Sent_{i+1} \}\) is finite, under the assumption that \(\{[\varphi ]_{\sim }: \varphi \in Sent_i \}\) is finite.Footnote 8    \(\Box \)

In effect, Definition 11 and Lemma 12 give us a notion of a degree of the proof which can be used in a Gentzen-style proof even in a non-standard setting. The way to proving cut elimination, and thus the consistency of ML, is now open.

I will not present the whole cut elimination proof, since it is mostly a repetition of Gentzen’s reasoning. Instead, I will restrict myself to discussing one example of a new rule (the one not present in the original Gentzen’s system).

Our task is to demonstrate that mix can be eliminated from any proof which contains only a single application of the mix rule in the last step. Let us assume (main induction) that cut can be eliminated in every proof of a degree \(< n\). Let us also assume (subinduction) that cut can be eliminated in every proof of a degree n but with rank \(< k\). Our task is to show that cut can be eliminated in proofs of degree n and rank k.

The proof starts with the case of \(k =2\) (the lowest possible rank) and proceeds by analysing subcases. Here we analyse only one subcase corresponding to a rule of ML absent in LK. Namely, let us assume that the mix formula of the form \(\forall x \varphi (x)\)Footnote 9 is obtained by a logical rule in both the succedent of the left-hand upper sequent of the mix and in the antecedent of the right-hand upper sequent of the mix. Then the last stage of the proof runs as follows:

figure b

We can then eliminate mix in the following way:

figure c

We use the inductive assumption here, namely, we show that the same end sequent can be obtained by applying mix to the formula \(\varphi (c)\), which has the degree \(n-1\) in P (the sentence \(\forall x \varphi (x)\) has the degree n). Observe that in the modified proof \(\varphi (c)\) will preserve the same degree \(n-1\); observe also that the modification did not involve adding to the proof any new formula (in general: in the present setting new proofs without mix are produced from sentences belonging to the initial proof P).