1 Introduction

Sometimes problems about a \(\uplambda \)-calculus are most easily solved through consideration of the corresponding combinatory calculus. This is because combinatory terms have, in a way, a simpler structure than \(\uplambda \)-terms, being built up from atoms by (in the untyped or simply typed case) the single operation of application rather than the two operations of application on the one hand and \(\uplambda \)-binding of variables on the other.

However it is not the case that, for every \(\uplambda \)-calculus, an equivalent combinatory calculus has as yet been formulated. While there is indeed an untyped combinatory calculus which is (in a sense that can be made precise) equivalent to the untyped \(\uplambda \eta \)-calculusFootnote 1 and something similar holds for the simply typed \(\uplambda \eta \)-calculus, there is as yet no combinatory calculus that is equivalent to the polymorphically typed \(\uplambda \eta \)-calculus.Footnote 2 The purpose of the present paper is to close this gap.

A particular example of a problem concerning a combinatory calculus which seems to be technically easier than the corresponding problem for the equivalent \(\uplambda \eta \)-calculus will be discussed in Sect. 8 below. Moreover, according to [3] (p. 656), ‘a computer handles \(\uplambda \)-terms more easily if these are translated into combinatory terms’.Footnote 3

The structure of this paper is as follows. Sections 2 through 5 are fundamental to the rest: they are devoted to defining the concept of a polymorphically typed combinatory term and a reduction relation on these terms. Sections 6 through 11 are devoted to demonstrating that results concerning polymorphically typed combinatory terms will have applications to other areas of logic, above all to the polymorphically typed \(\uplambda \)-calculus and to the proof theory of second-order logic. These later sections are to a large extent independent of each other and can be read, apart from the need for a few cross-references, in any order. Readers who are concerned about motivational matters might like to look at the later sections first.

2 Classes of terms, type-assignment systems and equational calculi

The concept of a polymorphically typed \(\uplambda \)-term is presumably familiar to most readers, but for the sake of maximum precision one particular definition of this concept will be presented in the present section. This section will also define the common substratum of all the various languages that will be studied in the sequel.

In Definition 2.1 below, the concept of a (polymorphic) semi-type is defined simultaneously with the function VarF which takes every semi-type to a (possibly empty) set of bindable type-variables. Throughout this paper, ‘\(\equiv \)’ will denote identity between syntactic objects (terms, formulae and types).

Definition 2.1

  1. (i)

    Each free type-variable \((\alpha , \beta , \gamma , \alpha _0, \alpha _1, \ldots )\) is a semi-type.VarF\((\alpha ) = \lbrace \rbrace \).

  2. (ii)

    Each bindable type-variable \((\phi , \psi , \omega , \phi _0, \phi _1, \ldots )\) is a semi-type. VarF\((\phi ) = \lbrace \phi \rbrace \).

  3. (iii)

    If A and B are semi-types, so is \((A\rightarrow B)\). VarF\((A\rightarrow B)\) = VarF\((A)\cup VarF(B)\).

  4. (iv)

    If A is a semi-type, so is \(\forall \phi \)A for every bindable type-variable \(\phi \). VarF\((\forall \phi \)A\() =\)VarF\((A)-\lbrace \phi \rbrace \).

As usual, outermost parentheses are omitted and \(A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(C\) should be understood as short for \(A\) \(\rightarrow \) \((B\) \(\rightarrow \) \(C)\). Note that in clause (iv) there is no prohibition on \(\forall \phi \) occurring in A. To ensure that a bindable variable is bound by at most one quantifier, the binding relation is defined as follows:

Definition 2.2

The relation binds, which holds between quantifier-occurrences and bindable variable-occurrences, is defined by recursion on lengths of semi-types. Suppose that \(\sharp \) is an occurrence of \(\phi \) inside the semi-type A; then

  1. (i)

    If \(\sharp \) is not within the scope of some quantifier \(\forall \phi \) occurring inside A, \(\sharp \) is bound by the outermost \(\forall \phi \) in \(\forall \phi A\).

  2. (ii)

    If \(\sharp \) is within the scope of some quantifier \(\forall \phi \) occurring inside A, then the same quantifier-occurrence binds \(\sharp \) within \(\forall \phi A\) as binds it within A.

Definition 2.3

A semi-type A is a (sc., polymorphic) type iff VarF\((A) = \lbrace \rbrace \).

Definition 2.4

A semi-type A is a simple type iff it is built up from free type-variables by clause (iii) alone of Definition 2.1.

Types can be thought of as formulae of second-order implicational logic, as on p. 345f. of [24]. The notion of substituting a type B for a free type-variable occurring in a type A can therefore be defined as is standardly done in textbooks on second-order logic (loc. cit.). Changing bindable variables in B is unnecessary, thanks to Definition 2.2.

Definition 2.5

The result of substituting the type B for the free variable \(\alpha \) in type A shall be \(A(^{\alpha }_B)\).

Notational conventions As it is usually possible to tell from the context which variable it is for which a substitution is being made, this notation can be simplified. For example, if “\(A(\alpha )\)” and “A(B)” are used in the same context, “A(B)” should be read as meaning \(A(^{\alpha }_B)\). This implies that \(\alpha \) does not occur in A(B), unless it is in B. Occasionally the notion of substitution will be applied also to semi-types: thus “\(A(^{\phi }_{\alpha })\)” and “\(A(^{\alpha }_{\phi })\)” will be used with the obvious meaning. When either of these last two symbols is used, it must be understood that the indicated occurrences of \(\phi \) are not bound by a quantifier in A, i.e. that they are in \(VarF\) \((A)\). In the case of “\(A(^{\alpha }_{\phi })\)”, this can be ensured by tacitly changing some of the binding variables in A if necessary.

Every one of the languages studied in this paper includes among its atomic terms a stock of variables. These will occasionally be called term-variables, when it is necessary to emphasize that they are distinct from the type-variables introduced by Definition 2.1. For each type A, denumerably many variables of type A are assumed, over which letters like \(``x^A\)”, “\(y^A\)”, “\(x^A_0\)”, “\(x^A_1\)”, \(\ldots \) range; or (when it is not essential to draw attention to the type) simply “x”, “y\(\ldots \). Also that there is a denumerable stock of untyped variables (x, y, etc.). When x and y are variables, \(x \equiv y\) only if x and y are either both untyped or both have the same type.

Sometimes, for example in Definition 5.4, it is useful to be able to assume that the variables within each type are ordered as an \(\omega \)-sequence, so this will be required from the outset.

Definition 2.6

  1. (i)

    The class \(\Lambda \) of untyped \(\uplambda \)-terms is built up from the untyped variables by application and \(\uplambda \)-abstraction, as on p. 3 of [9].

  2. (ii)

    The class CL of untyped combinatory terms is built up from untyped variables and the three combinators , and by application alone.

  3. (iii)

    The class \({\mathfrak {J}}\) of \(untyped\;\)J-terms is built up from the same basis as CL, but employing \(\uplambda \)-abstraction as well.

Remark

in this paper \(\uplambda x.x\) and \(\uplambda y.y\) count as distinct terms unless \(x \equiv y\).

In addition to the three classes of untyped terms distinguished by Definition 2.6, this paper will distinguish six classes of typed terms. An inductive definition of a class of typed terms can be presented in the shape of a formal system such as the one in Table 2.1 (cf. p. 16 of [2]). The expression “\(\in \Lambda _2(A)\)” should be read as “is an \(\Lambda _2\)-term of type A”. The first line of Table 2.1 shows the shape of initial statements of this formal system; the other four lines show the rules of inference.

Table 2.1: the formal system \(\Lambda _2\)

  1. (a)

    \(x^A \in \Lambda _2(A)\)         for every polymorphic type A

  2. (b)

    \(a \in \Lambda _2(B)\quad \Rightarrow \quad \uplambda x^{A}a\in \Lambda _2(A\) \(\rightarrow \) \(B)\)

  3. (c)

    \(a \in \Lambda _2(A\) \(\rightarrow \) \(B),\, b \in \Lambda _2(A)\quad \Rightarrow \quad ab\in \Lambda _2(B)\)

  4. (d)

    \(a \in \Lambda _2(A(\alpha ))\,\quad \Rightarrow \;\quad \Lambda _{\alpha ,\phi }a \in \Lambda _2(\forall \phi A(\phi ))\)   

  5. (e)

    \(a \in \Lambda _2(\forall \phi A(\phi ))\,\quad \Rightarrow \;\quad aC \in \Lambda _2(A(C))\)   for any type C

In clause (d), it is required that \(\alpha \) be in the type of no free variable in a and that \(\phi \) differ from every variable \(\psi \) such that \(\forall \psi \) occurs in A and has \(\alpha \) within its scope.

The rules presented in clauses (b)\({-}\)(e) of Table 2.1 clearly resemble the rules of a natural deduction calculus and will be called \(\rightarrow \)I, \(\rightarrow \)E, \(\forall \)I and \(\forall \)E respectively.

Definition 2.7

For any a, a shall be a \(\Lambda _2\)-term iff \(a \in \Lambda _2(A)\) for some A.

Definition 2.8

  1. (i)

    \(\Lambda _{\rightarrow }\) is the subystem of \(\Lambda _2\) in which all types are required to be simple and the only rules of inference are \(\rightarrow \)I and \(\rightarrow \)E.

  2. (ii)

    For any a, a is a \(\Lambda _{\rightarrow }\)-term iff \(a \in \Lambda _{\rightarrow }(A)\) for some simple type A.

Definition 2.9

\({\mathfrak {J}}_{\rightarrow }\) shall be the formal system defined otherwise like \(\Lambda _{\rightarrow }\) but with all statements of the shapes shown in Table 2.2 included among the initial statements.

Table 2.2

  1. (a)

    \({{{\mathbf {\mathsf{{I}}}}}}_A \in {\mathfrak {J}}_{\rightarrow }(A\) \(\rightarrow \) \(A)\)         for every simple type A

  2. (b)

    \({{{\mathbf {\mathsf{{K}}}}}}_{AB} \in {\mathfrak {J}}_{\rightarrow }(A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(A)\)         for all simple types AB

  3. (c)

    \({{{\mathbf {\mathsf{{S}}}}}}_{ABC} \in {\mathfrak {J}}_{\rightarrow }((A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(C)\) \(\rightarrow \) \((A\) \(\rightarrow \) \(B)\) \(\rightarrow \) \(A\) \(\rightarrow \) \(C)\)      for all simple types ABC

Definition 2.10

  1. (i)

    CL\(_{\rightarrow }\) is the formal system defined otherwise like \({\mathfrak {J}}_{\rightarrow }\) but with only one rule of inference, namely \(\rightarrow \)E (clause (c) of Table 2.1).

  2. (ii)

    For any a, a is a \({\mathfrak {J}}_{\rightarrow }\) -term iff \(a \in {\mathfrak {J}}_{\rightarrow }(A)\) for some simple type A.

  3. (iii)

    For any a, a is a CL\(_{\rightarrow }\)-term iff \(a \in CL_{\rightarrow }(A)\) for some simple type A.

Definition 2.11

A function \(x^A, a\,\mapsto \,[x^A]a\) which, for any term a and variable \(x^A\) of \(CL_{\rightarrow }\), outputs a new term [x]a is defined by recursion on lengths of terms.

  1. (a)

    If \(a \in CL_{\rightarrow }(B)\) and \(x^A\) does not occur in a, then \([x^A]a \equiv {{{\mathbf {\mathsf{{K}}}}}}_{BA}a\);

  2. (b)

    If \(a \equiv x^A\), then \([x^A]a \equiv {{{\mathbf {\mathsf{{I}}}}}}_{A}\);

  3. (c)

    If \(a \equiv bx^A\) for some b in which \(x^A\) does not occur, then \([x^A]a \equiv b\); and

  4. (f)

    If \(a \equiv bc\) for some bc, if a is not as described in any of clauses (a)\({-}\)(c), if \(b \in CL_{\rightarrow }(B\) \(\rightarrow \) \(C)\) while \(c \in CL_{\rightarrow }(B)\), then \([x^A]a \equiv {\mathbf {\mathsf{{S}}}}_{ABC}[x^A]b[x^A]c\).

Definition 2.12

A function \(x, a\,\mapsto \,[x]a\) which takes a term a and variable x of CL to a new term [x]a is defined by deleting all references to types from Definition 2.11 and generalizing over CL-terms instead of \(CL_{\rightarrow }\)-terms.

Remark

the clauses of these definitions are labelled (a), (b), (c) and (f) because of tradition. See p. 190 of [4] and cf., for example, Definition 2.18 on p. 26 of [9].

Discussion: the formal systems \(\Lambda _2\), \(\Lambda _{\rightarrow }\), \({\mathfrak {J}}_{\rightarrow }\) and CL\(_{\rightarrow }\) are all paradigms of what will here be called “systems á la Church” or “Church-style systems”. See, for example, p. 16 of [2]. Usually nowadays ([1, 19]), these names are given to systems in which the terms have shapes like \(\uplambda x^A.x\), in which typed variables occur in binding, untyped in bound, positions.Footnote 4 Such systems will, however, play no part in the present paper. It is for this reason that our definition of the set of terms of \(\Lambda _2\) is taken from [7] (p. 82f.) rather than [1] or [19], while the style of formulation is that used on p. 16 of [2].

While \(\Lambda _2\) is the polymorphic extension of \(\Lambda _{\rightarrow }\), polymorphic extensions of \({\mathfrak {J}}_{\rightarrow }\) and CL\(_{\rightarrow }\) have not yet been defined. The goal of the present paper is to close this gap.

Seven classes of term have been defined in this section: the three untyped classes mentioned in Definition 2.6 and the four typed classes mentioned in the discussion following Definition 2.11. In the following two definitions, the letter “\(\Sigma \)” ranges over these seven classes.

Definition 2.13

A \(\Sigma \) -equation is an equation \(a = b\), where a, b \(\in \Sigma \) and (if \(\Sigma \) is a typed class) a, b have the same type.

Definition 2.14

For any \(\Sigma \), a calculus of \(\Sigma \)-terms is a deductive system having as axioms some nonempty set of \(\Sigma \)-equations and rules of inference expressing general properties of equality, including possibly some form of rule \((\xi )\) appropriate to \(\Sigma \).

Thus, for example, the familiar untyped calculi \(\uplambda \beta \) and \(\uplambda \eta \) are both calculi of \(\Lambda \)-terms.

3 A problem and its solution

The main problem in finding a Church-style system that stands in the same sort of relation to CL\(_{\rightarrow }\) as \(\Lambda _2\) stands in to \(\Lambda _{\rightarrow }\) is that it is not easy to see how anything corrresponding to rules \(\rightarrow \)I and \(\forall \)I can be derivable within it. The general shape of \(\forall \)I must be: from \(\Gamma \vdash a_0 : A\)(\(\alpha \)) infer \(\Gamma \vdash a_1 : \forall \phi A\)(\(\phi \)), but what will be the relationship between \(a_0\) and \(a_1\)? \(a_0 \equiv a_1\) is not possible as, in a Church-style system, each term must determine its type uniquely. The two obvious strategies are:

  1. (1)

    To introduce a new combinator \(\Theta \) and stipulate that \(a_1 \equiv \Theta a_0\).

  2. (2)

    To postulate some new operation of term-construction, distinct from application, which, applied to \(a_0\) (alongside possibly other arguments), yields \(a_1\).

Both strategies run into difficulties. The class of combinators \(\Theta \) envisaged in (1), if applicable to arbitrary terms, would have to inhabit between them all types of the shape A(\(\alpha \))\(\rightarrow \) \(\forall \phi A\)(\(\phi \)); but formulae of this shape are not in general admissible in a sound logical calculus. As for (2), it is not clear how a calculus of combinatory terms constructed in accordance with (2) could admit the rule of \(\beta \)-conversion. In the combinatory calculi that have been constructed so far, the fact that the set of terms is closed under the function \(x^B, a \mapsto [x^B]a\) or \(x, a \mapsto [x]a\) (Definitions 2.11 and 2.12) is what makes \(\beta \)-conversion admissible; but these functions are in general applicable only to terms built up by application alone and would not necessarily be applicable if a new term-building operation were allowed.

Rejection of (2) motivates experimenting with the idea that \(a_1\), a term of type \(\forall \phi A\)(\(\phi \)), should be built up from atoms without using any term-constructing operations other than application. This problem is a bit like the problem of finding a formal system that generates the first-order logical validities without any quantifier-introducing rules of inference. This problem had been solved at least by the time Quine wrote [13]; however the solution depends on stipulating that the set of logical axioms shall be closed under prefixing of quantifiers. In the present context, this would be like insisting that there must be, not only for every type of the shape \(A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(A\) but also for every type of the shape \(\forall \phi \)(\(A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(A\)) and so on, a typed instance of K which inhabits that type. Since this contradicts the usual way of thinking about combinators, the solution adopted here will be to postulate, after all, a new term-constructing operation which, applied to a term of type A(\(\alpha \)) yields a term of type \(\forall \phi A\)(\(\phi \)), but to allow this new operation to be applied only to terms which contain no variables. In this way, it does not impede the definition of an operation which generalizes the one defined in 2.11. At the same time, we also admit new combinators with the property that (1) envisages \(\Theta \) as having, but with the restriction that they can be applied to a term of type A only if \(\alpha \) does not occur in A. The detailed working-out of these ideas is found in Sect. 4 below.

4 The formal systems CL\(_2\) and \({\mathfrak {J}}_2\)

The goal of the next few definitions is to define the Church-style system \({\mathfrak {J}}_2\), which extends \({\mathfrak {J}}_{\rightarrow }\) and which will be, in a sense to be explored presently, equivalent to \(\Lambda _2\). The shapes of the initial statements of \({\mathfrak {J}}_2\) are enumerated in the following Table, which extends Table 2.2. The expression “\(\in {\mathfrak {J}}_2(A)\)” should be read as “is a \({\mathfrak {J}}_2\)-term of type A”.

Table 4.1

  1. (a)

    \(x^A \in {\mathfrak {J}}_{2}(A)\)         for every type A

  2. (b)

    \({{{\mathbf {\mathsf{{I}}}}}}_A \in {\mathfrak {J}}_{2}(A\) \(\rightarrow \) \(A)\)         for every type A

  3. (c)

    \({{{\mathbf {\mathsf{{K}}}}}}_{AB} \in {\mathfrak {J}}_{2}(A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(A)\)         for all types AB

  4. (d)

    \({{{\mathbf {\mathsf{{S}}}}}}_{ABC} \in {\mathfrak {J}}_{2}((A\) \(\rightarrow \) \(B\) \(\rightarrow \) \(C)\) \(\rightarrow \) \((A\) \(\rightarrow \) \(B)\) \(\rightarrow \) \(A\) \(\rightarrow \) \(C)\)      for all types ABC

  5. (e)

    \({{{\mathbf {\mathsf{{U}}}}}}^{\forall \phi B(\phi )\rightarrow B(C)} \in {\mathfrak {J}}_2(\forall \phi \) \(B(\phi )\) \(\rightarrow \) \(B(C))\) under the conditions described below

  6. (f)

    \(\Xi _{\forall \phi (F\rightarrow G)} \in {\mathfrak {J}}_2(\forall \phi (F\) \(\rightarrow \) \(G\) \()\) \(\rightarrow \) \(\forall \phi F\) \(\rightarrow \) \(\forall \phi G)\)

  7. (g)

    \(\Theta _{\forall \phi B} \in {\mathfrak {J}}_2(B\) \(\rightarrow \) \(\forall \phi B)\)      for every type B

In line (e) \(B(\alpha )\) and C must be types and \(\phi \) a variable such that no quantifier in \(B(\alpha )\) that binds \(\phi \) should have \(\alpha \) in its scope.

In (f), FG must be semi-types and \(\phi \) a variable such that \((VarF(F)\cup VarF(G)){-}\lbrace \phi \rbrace = \emptyset \).

In (g), the requirement that B be a type (not a semi-type) entails that the outermost quantifier in \(\forall \phi B\) does not actually bind anything.

The rules of inference of \({\mathfrak {J}}_2\) are shown in Table 4.2:

Table 4.2

  1. (a)

    \(a \in {\mathfrak {J}}_2(B)\quad \Rightarrow \quad \uplambda x^{A}a\in {\mathfrak {J}}_2(A\) \(\rightarrow \) \(B)\)

  2. (b)

    \(a \in {\mathfrak {J}}_2(A\) \(\rightarrow \) \(B),\, b \in {\mathfrak {J}}_2(A)\quad \Rightarrow \quad ab\in {\mathfrak {J}}_2(B)\)

  3. (c)

    \(a \in {\mathfrak {J}}_2(A(\alpha ))\,\quad \Rightarrow \;\quad \Lambda _{\alpha ,\phi }a \in {\mathfrak {J}}_2(\forall \phi A(\phi ))\)

  4. (d)

    \(a \in {\mathfrak {J}}_2(A(\alpha ))\,\quad \Rightarrow \;\quad \Pi _{\alpha ,\phi }a \in {\mathfrak {J}}_2(\forall \phi A(\phi ))\)

In (c), \(\alpha \) must not occur in the type of any term-variable free in a. In (d), a may not contain any free term-variables. In lines (c) and (d), it is required that the variable \(\phi \) be distinct from every \(\psi \) that occurs in a quantifier \(\forall \psi \) in A having \(\alpha \) within its scope.

Remark

the difference in function between the prefixes \(\Lambda _{\alpha ,\phi }\) and \(\Pi _{\alpha ,\phi }\) will become clear in the next section. It will turn out that whether rule (c) or rule (d) is used in building up a \({\mathfrak {J}}_2\)-term makes a difference to how the term may be reduced.

Definition 4.1

The subsystem of \({\mathfrak {J}}_2\) in which only clauses (b) and (d) of Table 4.2 are used shall be \(CL _2\).

Definition 4.2

\({\mathfrak {J}}_2\)-term” and “CL\(_2\)-term” shall be defined analogously to “\(\Lambda _2\)-term” etc (see Sect. 2 above); and “\(\in CL _2(A)\)” analogously to “\(\in \Lambda _2(A)\)’.

Thus the CL\(_2\)-terms are precisely the atoms shown in Table 4.1 together with well-formed terms built up from them by the two operations \(a, b\mapsto ab\) and \(a \mapsto \Pi _{\alpha ,\phi }a\).

Definition 4.3

Whenever a and b are \(CL_2\)-terms, b is an \(\alpha \)/\(\phi \)-generalization of a iff:

  1. (i)

    b \(\equiv \,\Pi _{\alpha ,\phi }\) \(a\); or

  2. (ii)

    b \(\equiv \,\Theta _{\forall \phi A}\) \(a\); or

  3. (iii)

    There are terms \(a_0, a_1\), \(b_0, b_1\) and semi-types F, G such that:

(\(\alpha \)):

\(b_0\) and \(b_1\) are \(\alpha \)/\(\phi \)-generalizations of \(a_0, a_1\) respectively; and

(\(\beta \)):

\(b_0 \in CL _2(\forall \phi (\)F\(\rightarrow \)G)) and \(b_1 \in CL _2(\forall \phi F)\); and

(\(\gamma \)):

a \(\equiv a_0a_1\) and \(b \equiv (\Xi _{\forall \phi (F\rightarrow G)}b_0)b_1\), abbreviated as usual to \(\Xi _{\forall \phi (F\rightarrow G)}b_0b_1\).

Definition 4.4

A free type-variable \(\alpha \) is used as an eigenvariable in a \(CL_2\)-term a iff there are b and \(\phi \) such that \(\Pi _{\alpha ,\phi }\) \(b\) is a subterm of a.

Proposition 4.5

For any \(CL_2\)-term a and any type-variables \(\alpha \) and \(\phi \) such that

  1. (i)

    \(\alpha \) is not used as an eigenvariable in a; and

  2. (ii)

    \(B(\alpha )\) is the type of a; and

  3. (iii)

    \(\alpha \) does not occur in the type of any term-variable inside a; and

  4. (iv)

    No occurrence of \(\forall \phi \) in B has \(\alpha \) within its scope,

there is an \(\alpha \)/\(\phi \)-generalization b of a having type \(\forall \phi B(\phi )\).

Proof

if a contains no variables, we can take \(\Pi _{\alpha ,\phi }a\) for b. If \(\alpha \) does not occur in \(B(\alpha )\), we can take \(\Theta _{\forall \phi B}a\) for b. The case where a is a variable with \(\alpha \) in its type does not arise. The only remaining possibility, therefore, is that \(a \in CL _2B(\alpha )\) in virtue of line (b) of Table 4.2, which means a is a CL\(_2\)-term because both its immediate subterms are. Then the theorem is proven by induction on the construction of a and applying clause (iii) of Definition 4.3.

Discussion: if \(\alpha \) is used as an eigenvariable in a, then there will always be another term, differing from a only in that eigenvariables have been changed, in which \(\alpha \) is no longer used as an eigenvariable. Proposition 4.5 means in effect that the third rule in Table 2.1 can be mimicked in CL\(_2\). We will express this by saying that \(\forall \)I is derivable in both these systems. In general, there will be more than one way of forming an \(\alpha \)/\(\phi \)-generalization of a given term. It is possible, however, to define a function from terms to their \(\alpha \)/\(\phi \)-generalizations by requiring that, whenever clauses (i) and (ii) of Definition 4.3 are applied to suberms of a, the subterms chosen shall be as large as possible. The \(\alpha \)/\(\phi \)-generalization of a formed in this way shall be called \(\wedge _{\alpha ,\phi }a\).

If a is a CL\(_2\)-term of type C and \(x^A\) a CL\(_2\)-variable, the term [\(x^A\)]a is defined by:-

Definition 4.6

  1. (a)

    If \(x^A\) does not occur in a, then \([x^A]a\,\equiv \,{{{\mathbf {\mathsf{{K}}}}}}_{CA}\) \(a\).

  2. (b)

    If a \(\equiv x^A\), then \([x^A]a\,\equiv \,{{{\mathbf {\mathsf{{I}}}}}}_A\).

  3. (c)

    If a \(\equiv bx^A\) for some b in which \(x^A\) does not occur, then \([x^A]a\,\equiv \) b.

  4. (f)

    If a \(\equiv bc\) and a is not as described in clause (a), (b) or (c), if \(b \in CL _2(B\) \(\rightarrow \) \(C)\) and \(c \in CL _2(B)\), then \([x^A]a\,\equiv \,{{{\mathbf {\mathsf{{S}}}}}}_{ABC}[x^A]b[x^A]c\).

Proposition 4.7

Every CL\(_2\)-term has one of the four shapes distinguished in Definition 4.6.

Proof

by Definition 4.1, using the fact that a term of the shape \(\Pi _{\alpha ,\phi }a\) contains no term-variables.

Corollary

for every CL\(_2\)-term a of type C, [\(x^A\)]a is defined and has type \(A\) \(\rightarrow \) \(C\).

This corollary may be expressed by saying that rule \(\rightarrow \)I is derivable in CL\(_2\).

Remark

the formulation of \({\mathfrak {J}}_2\) implements the ideas outlined informally in the last paragraph of Sect. 3. The “new term-forming operation” mentioned there is precisely the operation which takes b to \(\Pi _{\alpha ,\phi }b\) when there are no variables in b. The “new combinators” mentioned two sentences later are those introduced in line (g) of Table 4.1. Other new combinators, introduced in line (f) of Table 4.1, are needed in order to make these ideas work.

5 Reduction of \(CL_2\)-terms

In section 6F2 of [4], the relation of strong reduction on CL-terms is defined by first embedding the CL-terms in the wider class of \({\mathfrak {J}}\)-terms (see Definition 2.6 above) and then defining strong reduction over the \({\mathfrak {J}}\)-terms. Similarly, a form of strong reduction on CL\(_2\)-terms will be defined here by first defining a reduction relation on \({\mathfrak {J}}_2\)-terms and then restricting it to CL\(_2\)-terms.

The class of \({\mathfrak {J}}_2\)-terms which may be reduced is defined by reference to the class Ir, which comprises precisely the CL\(_2\)-terms which may not be reduced. First an auxiliary definition:

Definition 5.1

For any \({\mathfrak {J}}_2\)-terms a, b, we say that b is a head of a or occurs in head position within a iff one of the following holds.

  1. (i)

    a is an atom and \(b \equiv a\); or

  2. (ii)

    a \(\equiv a_0a_1\) and b is either a itself or a head of \(a_0\); or

  3. (iii)

    a \(\equiv \uplambda x^Ac\) and b is a head of c; or

  4. (iv)

    a \(\equiv \Lambda _{\alpha ,\phi }c\) and b is a head of c; or

  5. (v)

    a \(\equiv \Pi _{\alpha ,\phi }c\) and b is a head of c.

A maximal head of a is a head of a that is not a proper part of any larger head of a.

The inductive definition of the class Ir itself will be presented in a similar style to the definition of the class of \({\mathfrak {J}}_2\)-terms; here “\(\in \) Ir(A)” should be read as “is an irreducible term of type A”.

Table 5.1

  1. (a)

    \(x^A \in {Ir}(A)\)   for every typed variable \(x^A\) of CL\(_2\)

  2. (b)

    \(a \in {Ir}(B)\quad \Rightarrow \quad [x^A]a \in \) Ir(\(A\) \(\rightarrow \) \(B\))   subject to restrictions (see below)

  3. (c)

    \(a \in {Ir}(B(\alpha ))\,\Rightarrow \, a' \in {Ir}(\forall \phi B(\phi ))\;\) whenever \(a'\) is an \(\alpha /\phi \)-generalization of a

  4. (d)

    \(a \in {Ir}(A\) \(\rightarrow \) \(B),\,b \in {Ir}(A) \,\Rightarrow \, ab \in {Ir}(B)\quad \) subject to restrictions (see below)

  5. (e)

    \(a \in {Ir}(\forall \phi B(\phi ))\,\Rightarrow \, {{{\mathbf {\mathsf{{U}}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}a \in {Ir}(B(C))\,\) subject to restrictions (below)

  6. (f)

    \(a \in {Ir}(\forall \phi (F\) \(\rightarrow \) \(G)),\,b \in {Ir}(\forall \phi F) \,\Rightarrow \, \Xi _{\forall \phi (F\rightarrow G)}ab \in {Ir}(\forall \phi G)\,\)

  7. (g)

    \(a \in {Ir}(\forall \phi (F\) \(\rightarrow \) \(G))\,\,\Rightarrow \, \Xi _{\forall \phi (F\rightarrow G)}a \in {Ir}(\forall \phi F\) \(\rightarrow \) \(\forall \phi G)\,\)

In (b) it is required that at least one of the two following conditions should hold:-

  1. (3)

    \([x^A]a\) is not formed from a by clause (c) of Definition 4.6 alone; or

  2. (4)

    a does not have a variable in head position and, if a has a \({{{\mathbf {\mathsf{{U}}}}}}\)-term in head position, a has the shape \({{{\mathbf {\mathsf{{U}}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}x\).

In (d) it is required that a should have either a variable or a \({{{\mathbf {\mathsf{{U}}}}}}\)-term in head position but not be itself a \({{{\mathbf {\mathsf{{U}}}}}}\)-term.

In (e) and (g), it is required that a should not be a generalization and, in (f), that at least one of ab should not be a generalization.

The motivation for these restrictions, and indeed for the way the rules presented in Table 5.1 are formulated, will be discussed in Sect. 9.

Proposition 5.2

Every irreducible term is a \(CL_2\)-term

Proof

if every occurrence of “Ir” in Table 5.1 is replaced with “CL\(_2\)”, then the resulting rules are all either primitive or admissible rules of CL\(_2\). This can be checked using, in particular, Proposition 4.5 and the Corollary to 4.7.

Definition 5.3

A \({\mathfrak {J}}_2\)-term is reducible iff it is not in Ir.

Five classes of \({\mathfrak {J}}_2\)-terms will now be identified as redexes and a contractum will be associated with every redex. The first three classes of redex correspond to what are called, at p. 223 of [4], “type I (resp. II resp. III) redexes”; but it may be inadvisable in the present context to overwork the word “type”. So the five classes will here be called: (I) weak redexes; (II) open redexes; (III) \(\uplambda '\)-redexes; (IV) \(\wedge \)-redexes; (V) \(\Lambda \)-redexes. Throughout Table 5.2 and Definitions 5.4 to 5.7, it is required that the terms abc be in Ir (a requirement which means that the reduction relation defined here somewhat resembles the strict or standard reduction of [23]).

The weak redexes comprise all CL\(_2\)-terms of the shapes displayed in the left-hand column of Table 5.2, while the shapes of the corresponding contracta are on the right. In the last row it is required that a, besides being irreducible, should be an \(\alpha /\phi \)-generalization of \(b \in \) CL\(_2(B(\alpha )\)).

Table 5.2

Redex

Contractum

\({{{\mathbf {\mathsf{{S}}}}}}_{ABC}abc\)

ac(bc)

\({{{\mathbf {\mathsf{{K}}}}}}_{AB}ab\)

a

\({{{\mathbf {\mathsf{{I}}}}}}_Aa\)

a

\({{{\mathbf {\mathsf{{U}}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}a\)

\(b(^{\alpha }_C)\)

The second class of redexes is defined by:

Definition 5.4

  1. (i)

    An open \(CL_2\)-term is one of the shape \({{{\mathbf {\mathsf{{S}}}}}}_{ABC}ab\) or \({{{\mathbf {\mathsf{{S}}}}}}_{ABC}a\), where a, b \(\in \) Ir.

  2. (ii)

    Any openFootnote 5, reducible \(CL_2\)-term a of type \(A\) \(\rightarrow \)B is an open redex. Its contractum is \(\uplambda x^A.ax^A\), where \(x^A\) is the first term-variable of type A that does not occur in a and is also distinct from every \(\uplambda \)-binding variable that has a within its scope.

Definition 5.5

A \(\uplambda '\)-redex is any \({\mathfrak {J}}_2\)-term of the shape \(\uplambda x.a\), where \(a \in Ir\). Its contractum is [x]a.

Definition 5.6

A \(\wedge \)-redex is any CL\(_2\)-term of the shape \(\Xi _{\forall \phi (F\rightarrow G)}(\wedge _{\alpha ,\phi }a)(\wedge _{\alpha ,\phi }b)\), where \(ab \notin \) Ir. Its contractum is \(\Lambda _{\alpha ,\phi }(ab)\).

Remark

this reduction rule does for reducible terms of the shape \(\wedge _{\alpha ,\phi } a\) roughly the same job as Definition 5.4 does for open reducible terms. Despite what the notation might suggest, \(\wedge _{\alpha ,\phi } a\) will in general have a radically different form from \(\Lambda _{\alpha ,\phi } a\), since the latter is formed from a by a single term-building operation while the operation which forms \(\wedge _{\alpha ,\phi } a\) from a may be complicated and, in general, a will not be a subterm of \(\wedge _{\alpha ,\phi } a\).

Definition 5.7

A \(\Lambda \)-redex is any \({\mathfrak {J}}_2\)-term of the shape \(\Lambda _{\alpha ,\phi } a\), where \(a \in Ir\). Its contractum is \(\wedge _{\alpha ,\phi } a\).

Corollaries a redex has the same type as its contractum. No redex has more than one contractum.

Let the maximal head of a \({\mathfrak {J}}_2\)-term a be written as \(Xa_1a_2...a_n\), where X is an atom and \(a_1, a_2\ldots a_n\) are arbitrary \({\mathfrak {J}}_2\)-terms. Then we define:

Definition 5.8

A subterm b of a is an active redex within a iff one of the following holds.

  1. (i)

    \(a_i \in \) Ir for each i such that \(1 \le i\le n\) and b is in head position within a and b is a weak redex; or

  2. (ii)

    \(a_i \in \) Ir for each i such that \(1 \le i\le n\) and \(Xa_1a_2...a_n\) is an open redex or a \(\wedge \)-redex and \( b \equiv Xa_1a_2...a_n\); or

  3. (iii)

    \(b \equiv \uplambda x^{A}(Xa_1a_2...a_n)\) for some \(x^A\) and b is a \(\uplambda '\)-redex; or

  4. (iv)

    \(b \equiv \Lambda _{\alpha ,\phi }(Xa_1a_2...a_n)\) for some \(\alpha , \phi \) and b is a \(\Lambda \)-redex; or

  5. (v)

    \(j < n\), \(a_1, a_2,\ldots a_j \in \) Ir, \(a_{j+1} \notin \) Ir and b is an active redex in \(a_{j+1}\).

Corollary

No \({\mathfrak {J}}_2\)-term contains more than one active redex.

Definition 5.9

  1. (i)

    a \(\triangleright _1\) b shall hold iff b is derived from a by replacing the active redex in a with its contractum.

  2. (ii)

    A reduction (sequence) for a shall be any sequence \(a, a_1, a_2,\ldots \) with the property that, for a any nonterminal \(a_i\), \(a_i \triangleright _1 a_{i+1}.\)

  3. (iii)

    a \(\triangleright \) b shall mean that a and b are, respectively, the first and last elements of some reduction sequence.

  4. (iv)

    A complete reduction sequence is one which either continues ad infinitum or else has a last element which contains no active redex.

  5. (v)

    The last element (if it exists) of any complete reduction sequence for a shall be the normal form of a.

Proposition 5.10

Suppose \(a \in {\mathfrak {J}}_2(A)\); then the following are all equivalent.

  1. (i)

    \(a \in Ir(A)\).

  2. (ii)

    a contains no active redex.

  3. (iii)

    a contains no redexes.

  4. (iv)

    a is a normal form.

Proof

that (i) implies (iii) is easily seen by induction on the deduction of \(a \in \) Ir(A) (Table 5.1). (iii) implies (ii) because every active redex is a redex. That (ii) implies (i) is proven by induction on the deduction of \(a \in {\mathfrak {J}}_2(A)\).

If \(a \equiv \uplambda xb\) or \(a \equiv \Lambda _{\alpha ,\phi }b\), then a is its own active redex if \(b \in \) Ir. If \(b \notin \) Ir, then by the (contraposition of the) I.H., there is an active redex in b and therefore in a.

If \(a \equiv Xa_1a_2...a_n\) and some \(a_i\) is reducible, then, again by the contraposition of the I.H., the leftmost such \(a_i\) contains the active redex of a. From now on, therefore, we assume that every \(a_i\) within \(Xa_1a_2...a_n\) is irreducible. If \(Xa_1a_2...a_n\) begins with a combinator but has no active redex, then a must have one of the following shapes: \({{{\mathbf {\mathsf{{S}}}}}}_{ABC}\), \({{\mathbf {\mathsf{{K}}}}}_{AB}\), \({{\mathbf {\mathsf{{K}}}}}_{AB}b\) (with \(b \in \) Ir), \({{\mathbf {\mathsf{{I}}}}}_{A}\), \({{\mathbf {\mathsf{{S}}}}}_{ABC}b\), \({{\mathbf {\mathsf{{S}}}}}_{ABC}bc\). In the first four cases, \(a \in \) Ir is then easily shown using Table 5.1. If \(a \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}b\) or \(a \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}bc\) where \(b, c \in \) Ir and if a is not a redex, then \(a \in \) Ir by contraposing Definition 5.4. Similarly, if \(Xa_1a_2...a_n\) begins with a \({{\mathbf {\mathsf{{U}}}}}\)-term, a variable, a \(\Xi \)-term or a \(\Theta \)-term and contains no active redex, \(a \in \) Ir can be shown using the rules of Table 5.1.

If \(a \equiv \Pi _{\alpha ,\phi }b\) for some b which has no active redex, then \(a \in \) Ir follows easily from the I.H. and rule (c) of Table 5.1.

(ii) and (iv) are equivalent because, by Definition 5.9, a is a normal form iff there is no active redex in a.

Proposition 5.11

If \(a'\) is the normal form of a and \(\alpha \) is not used as an eigenvariable in the construction of a, then \(a'(^{\alpha }_C)\) is the normal form of \(a(^{\alpha }_C)\), for any type C.

Proof

by induction on the length of the reduction sequence leading from a to \(a'\).

6 Relations between the calculi of \(\Lambda _2\)-terms and CL\(_2\)-terms

In this section, \(M, N, P, Q, M_0, M_1,\ldots \) shall be \(\Lambda _2\)-terms while \(a, b, c, d, e, a_0, a_1,\ldots \) shall be \({\mathfrak {J}}_2\)-terms. \(M =_{\Lambda _2} N\) shall mean that the equation \(M = N\) is provable in the calculus of \(\Lambda _2\)-terms defined on p. 83 of [7], with \(\eta \)-conversion included.

The H-transformation defined by the following definition takes each \(\Lambda _2\)-term M to a CL\(_2\)-term \(M_H\).

Definition 6.1

  1. (i)

    If M is a variable, then \(M_H \equiv M\).

  2. (ii)

    If M \(\equiv \) PQ, then \(M_H \equiv P_HQ_H\).

  3. (iii)

    If M \(\equiv \uplambda x^AN\), then \(M_H \equiv [x^A](N_H)\) as defined by Definition 4.6.

  4. (iv)

    If \(M \equiv \Lambda _{\alpha ,\phi }N\), then \(M_H \equiv \wedge _{\alpha ,\phi }(N_H)\).

  5. (v)

    If \(M \equiv NC\) for \(N \in \Lambda _2(\forall \phi A(\phi ))\) and some type C, then \(M_H \equiv {{\mathbf {\mathsf{{U}}}}}^{\forall \phi A(\phi )\rightarrow A(C)}(N_H)\).

Proposition 6.2

When a \(CL_2\)-term b is itself a redex, the active redex within b is b itself.

Proof

immediate from Definition 5.8.

Proposition 6.3

If \(M_H \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}ab\), then \(M =_{\Lambda _2} \uplambda z^A.Nz^A(Pz^A)\) for some NP such that \(N_H \equiv a\), \(P_H \equiv b\).

Proof

assume first that M contains no \(\eta \)-redexes. Then inspection of the right-hand terms of Definition 6.1 shows that M must be either \(\uplambda z^A.L\) or \(L_0L_1\) for some \(L, L_0, L_1\). In the former case, since \(M_H \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}ab\), L must be \(K_0K_1\) for some \(K_0, K_1\). So \(M \equiv \uplambda z^A(K_0K_1) =_{\Lambda _2} \uplambda z^A.((\uplambda v^AK_0)z^A((\uplambda v^AK_1)z^A))\) where \((\uplambda v^A.K_0)_H \equiv a\) and \((\uplambda v^A.K_1)_H \equiv b\).

If \(M \equiv L_0L_1\), for some \(L_0, L_1\), then \(L_{0H} \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}a\) and \(L_{1H} \equiv b\) and either \(L_0 \equiv K_0K_1\) for some \(K_0, K_1\) with \(K_{0H} \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}\) and \(K_{1H} \equiv a\) or \(L_0 \equiv \uplambda y^{A\rightarrow B}K\) for some K. In the latter case, inspection of Definition 6.1 shows that \((\uplambda y^{A\rightarrow B}K)_H \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}a\) is possible only if \(K \equiv \uplambda z^A.J(y^{A\rightarrow B}z^A)\) for some J in which \(y^{A\rightarrow B}\) does not occur. As before, \(\uplambda y^{A\rightarrow B}z^A.J(y^{A\rightarrow B}z^A) =_{\Lambda _2} \uplambda y^{A\rightarrow B}z^A.(\uplambda v^AJ)z^A(y^{A\rightarrow B}z^A)\) where \((\uplambda v^AJ)_H \equiv a\). Then \((\uplambda y^{A\rightarrow B}K)L_1 =_{\Lambda _2} \uplambda z^A.(\uplambda v^AJ)z^A(L_1z^A)\), where \((L_1)_H \equiv b\).

If \(L_0 \equiv K_0K_1\), then \(K_{0H} \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}\) while \(K_{1H} \equiv a\). In this case, inspection of Definition 6.1 shows that \(K_0 \equiv \uplambda x^{A\rightarrow B\rightarrow C}y^{A\rightarrow B}z^A.x^{A\rightarrow B\rightarrow C}z^A(y^{A\rightarrow B}z^A)\) is a necessary condition of \(K_{0H} \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}\). So \(L_0L_1 \equiv K_0K_1L_1 =_{\Lambda _2} \uplambda z^A(K_{1}z^A(L_1z^A))\), where \(K_{1H} \equiv a\) and \(L_{1H} \equiv b\).

If M contains \(\eta \)-redexes and \(M'\) is the term obtained from M by contracting them all, then by Definitions 4.6 and 6.1\(M_H \equiv M'_H\) and the foregoing argument applies.

Proposition 6.4

If \(M_H \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}abc\), then \(M =_{\Lambda _2} (\uplambda z^A.Nz^A(Pz^A))Q\) for some NPQ such that \(N_H \equiv a\), \(P_H \equiv b\) and \(Q_H \equiv c\).

Proof

as before, the general case can be reduced to the case where M contains no \(\eta \)-redexes. Then inspection of the right-hand terms of Definition 6.1 shows that M can only have the shape \(M_0Q\), where \(Q_H \equiv c\) and \(M_{0H} \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}ab\). Now apply Proposition 6.3.

Proposition 6.5

If \(e \equiv M_H\) is a weak redex which contracts to \(e'\), then M \(=_{\Lambda _2}\) R for some R such that \(R_H \equiv e'\).

Proof

e must have one of the four shapes shown in Table 5.2. If \(e \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}abc\), then by Proposition 6.4, \(M =_{\Lambda _2} NQ(PQ)\) for some NPQ such that \((NQ(PQ))_H \equiv ac(bc)\). If \(e \equiv {{\mathbf {\mathsf{{K}}}}}_{AB}ab\) then either \(M \equiv (\uplambda v^BM_1)M_2\) where \(M_{1H} \equiv a\) and \(M_{2H} \equiv b\) or \(M \equiv M_0M_1M_2\), where \(M_{0H} \equiv {{\mathbf {\mathsf{{K}}}}}_{AB}\), \(M_{1H} \equiv a\) and \(M_{2H} \equiv b\). In either case, \(M =_{\Lambda _2} M_{1}\) and \(M_{1H} \equiv a \equiv e'\). If \(e \equiv {{\mathbf {\mathsf{{I}}}}}_{A}a\) then \(M \equiv (\uplambda v^A.v^A)N\), where \(N_H \equiv a \equiv e'\). Finally, if \(e \equiv {{\mathbf {\mathsf{{U}}}}}^{\forall \phi A(\phi )\rightarrow A(C)}\wedge _{\alpha ,\phi }a\) then \(M \equiv (\Lambda _{\alpha ,\phi }N)C\) for some N having type \(A(\alpha )\) such that \(N_H \equiv a\). Then \(M =_{\Lambda _2} N(^{\alpha }_C)\) and \(N(^{\alpha }_C)_H \equiv a(^{\alpha }_C)\).

Proposition 6.6

If \(M_H \equiv \wedge _{\alpha ,\phi }a\), then \(M =_{\Lambda _2} \Lambda _{\alpha ,\phi }N\) for some N such that \(N_H \equiv a\).

Proof

by Definition 4.3, \(\wedge _{\alpha ,\phi }a\) will have one of the following shapes: \(\Pi _{\alpha ,\phi }a\), \(\Theta _{\forall \phi B}a\), \(\Xi _{\forall \phi (F\rightarrow G)}bc\). In the first and third cases, the stronger proposition \(M \equiv \Lambda _{\alpha ,\phi }N\) holds. For \(\Pi _{\alpha ,\phi }a\) can only be the H-transform of \(\Lambda _{\alpha ,\phi }N\) for some N such that \(N_H \equiv a\). In the third case, it might seem that \(\wedge _{\alpha ,\phi }a\) could also be the H-transform of \(M_0M_1\), where, for example, \(M_{0H} \equiv \Xi _{\forall \phi (F\rightarrow G)}b\) and \(M_{1H} \equiv b\). However there is no \(\Lambda _2\)-term of which \(\Xi _{\forall \phi (F\rightarrow G)}b\) is the H-transform, just as if there is none of which \(\Xi _{\forall \phi (F\rightarrow G)}\) is. An atom of this last shape could only be obtained by the H-transformation of Definition 6.1 if an application of clause (iv) yielded \(\Xi _{\forall \phi (F\rightarrow G)}xy\) and this were followed by two applications of clause (iii); but an application of clause (iv) can only yield terms of the shape \(\Xi _{\forall \phi (F\rightarrow G)}bc\) where b and c are themselves generalizations, which means they cannot be variables.

If \(\wedge _{\alpha ,\phi }a \equiv \Theta _{\forall \phi B}a\), then a can indeed be a variable and M can be an applicative term \(M_0M_1\). In this case, \(M_0 \equiv \uplambda x^B\Lambda _{\alpha ,\phi }.x^B\) so that \(M_{0H} \equiv [x^B]\Theta _{\forall \phi B}x^B \equiv \Theta _{\forall \phi B}\), while \(M_{1H} \equiv a\). But then \(M_0M_1\) \(\beta \)-contracts to \(\Lambda _{\alpha ,\phi }M_1\) where \(M_{1H} \equiv a\).

Definition 6.7

If b is the active redex in a, then d(ba), the depth of b within a, is defined to be:

  1. (i)

    0 if b is in head position within a;

  2. (ii)

    \(d(b,e)\) \(+\) \(1\) if \(a \equiv ce\) and b occurs inside e.

Proposition 6.8

If \(M_H \equiv a\) and b is the normal form of a, then there is an N such that \(M =_{\Lambda _2} N\) and \(N_H \equiv b\).

Proof

as a is an H-transform, by Definition 6.1a will be a CL\(_2\)-term. Moreover, \(\Xi \)-terms will occur in a only at the head of generalizations: that is, in a context \(\Xi _{\forall \phi (F\rightarrow G)}cd\), where c and d are themselves generalizations.

The proposition is trivial if there are no redexes in a. Assume then that there are redexes in a and let n be the number of terms following a in the reduction sequence which leads from a to b, let p be the depth of the active redex within a and let q be the number of times the operation \(c \mapsto \Pi _{\alpha ,\phi }c\) is used in the construction of a. The proposition is now proven by transfinite induction on \(\omega ^n+p+q\).

Only composite terms can be reduced, so a has either the shape \(\Pi _{\alpha , \phi }a_0\) or the shape \(a_0a_1\). In the former case, the proposition easily follows from the I.H., so we assume now that \(a \equiv a_0a_1\).

Case 1 a is \(ce_1e_2...e_m\), where \(0 \le m\) and the active redex in a is c, which is a weak redex. If \(M_H \equiv ce_1e_2...e_m\), then by Definition 6.1\(M \equiv PQ_1Q_2...Q_m\) for some \(P, Q_1, Q_2,\ldots Q_m\) such that \(P_H \equiv c\) etc. By Proposition 6.5, there is a \(P'\) such that \(PQ_1Q_2...Q_m =_{\Lambda _2} P'Q_1Q_2...Q_m\) and \(P'_H \equiv c'\), where \(c'\) is the contractum of c. By the hypothesis of the induction on n, there is a term N such that \(N_H\) is the normal form of \(c'e_1e_2...e_m\) and \(P'Q_1Q_2...Q_m =_{\Lambda _2} N\). The conclusion follows using the transitivity of =.

Case 2 a is \({{\mathbf {\mathsf{{S}}}}}_{ABC}c_1...c_i\) where \(i \in \lbrace 1,2\rbrace \) and \(c_1,\ldots c_i \in \) Ir. If i = 2, then a reduces in two steps to \(\uplambda x^A.c_1x^A(c_2x^A)\). By Proposition 6.3 there are terms NP such that \(M =_{\Lambda _2} \uplambda z^A.Nz^A(Pz^A)\) and \(N_H \equiv c_1\), \(P_H \equiv c_2\). By the stipulations of Sect. 5, a has a normal form only if \(c_1x^A(c_2x^A)\) does. By Definition 6.1, \((Nx^A(Px^A))_H \equiv c_1x^A(c_2x^A)\). Since the normal form of \(c_1x^A(c_2x^A)\) is reached in \(\le n{-}3\) steps, the I.H. of the induction on n entails there is a term Q such that \(Nx^A(Px^A) =_{\Lambda _2} Q\) and \(Q_H\) is the normal form of \(c_1x^A(c_2x^A)\). By rule \((\xi )\), \(\uplambda x^A.Nx^A(Px^A) =_{\Lambda _2} \uplambda x^A.Q\) while, by Definition 6.1, (\(\uplambda x^A.Q)_H \equiv [x^A]Q_H\) and, by the stipulations of Sect. 5, \([x^A]Q_H\) is the normal form of a. It follows that \(\uplambda x^A.Q\) is the term required to make the proposition true.

If i = 1, \(M \equiv \uplambda v^{A\rightarrow B}K\) for some K such that \(K_H \equiv {{\mathbf {\mathsf{{S}}}}}_{ABC}c_1v^{A\rightarrow B}\) for some \(v^{A\rightarrow B}\) not in \(c_1\). By the argument of the last paragraph, there is then a Q such that \(K =_{\Lambda _2} \uplambda x^A.Q\) and (\(\uplambda x^A.Q)_H \equiv e\) for some e which is the normal form of \({{\mathbf {\mathsf{{S}}}}}_{ABC}c_1v^{A\rightarrow B}\). In that case, \([v^{A\rightarrow B}]e\) is the normal form of \({{\mathbf {\mathsf{{S}}}}}_{ABC}c_1\) and (\(\uplambda v^{A\rightarrow B}x^A.Q)_H \equiv [v^{A\rightarrow B}]e\).

Case 3 a is a \(\wedge \)-redex, say \(\wedge _{\alpha ,\phi }b\). By Definition 5.6 and Proposition 6.2, \(a\;\triangleright _1\;\Lambda _{\alpha ,\phi }b\). Since a is a redex it follows from Definition 5.6 that b is reducible. There is therefore a reduction of \(n{-}2\) steps taking b to its normal form \(b'\). By Proposition 6.6, \(M \equiv \Lambda _{\alpha ,\phi }P\) for some P such that \(P_H \equiv b\), so by the I.H. on n there is a Q such that \(P =_{\Lambda _2} Q\) and \(Q_H \equiv b'\). By rule (\(\xi \)) for type-abstraction, \(M \equiv \Lambda _{\alpha ,\phi }P =_{\Lambda _2} \Lambda _{\alpha ,\phi }Q\), so \(\Lambda _{\alpha ,\phi }Q\) is the term that makes the proposition true in this case.

Case 4 by Definition 5.8, the only remaining possibility is that \(a \equiv Xd_1d_2...d_m\) for some atom X and the active redex in a is inside \(d_k\). In this case, there are various possible shapes that M may have but, by Propositions 6.3, 6.4 and 6.6, M will have a component \(Q_k\) such that \(Q_{kH} \equiv d_k\). By Definition 5.8, the reduction of a must begin by repeatedly contracting the active redex in \(d_k\) until its normal form \(d'\) is reached, at which point either we have reached the normal form of a or else there is an active redex in \(Xd_1d_2...d'...d_m\). In the latter case, since less than n steps are needed in order to reach \(d'\), by the I.H. of the induction on n there is a \(\Lambda _2\)-term \(Q'\) such that \(Q_k =_{\Lambda _2} Q'\) and \(Q'_H \equiv d'\), from which follows that M can be converted to a term \(M'\) such that \(M'_H \equiv Xd_1d_2...d'...d_m\) by converting the component \(Q_k\) to \(Q'\) while, by another application of the I.H. on n, \(M'\) can in turn be converted to a term N such that \(N_H\) is the normal form of \(Xd_1d_2...d'...d_m\) and therefore of a.

If n reduction steps are needed in order to reduce \(d_k\) to its normal form, then use is made of the I.H. on the depth of the active redexes, since this is smaller for \(d_k\) than it is for a.

Proposition 6.9

If \(M_H \in \) Ir and there are no \(\eta \)-redexes in M, then there are no \(\beta \)-redexes in M either.

Proof

by induction on the construction of M.

If \(M \equiv M_0M_1\), then \(M_H \equiv M_{0H}M_{1H}\). Because \(M_{0H}M_{1H}\) is in Ir, it follows from the definition of Ir (Table 5.1) together with Proposition 5.10 that \(M_{0H}\) and \(M_{1H}\) contain no redexes. It follows by the I.H. that \(M_0\) and \(M_1\) contain no redexes of \(\Lambda _2\). Nor can \(M_0M_1\) be a \(\beta \)-redex because, if it were, either \(M_0\) would be an \(\eta \)-redex or \(M_{0H}M_{1H}\) would be a weak redex and therefore not in Ir.

If \(M \equiv \uplambda v^AN\) for some \(v^A\) and N, then by Definition 6.1\(M_H\) is \({{\mathbf {\mathsf{{S}}}}}_{ABC}ab\), \({{\mathbf {\mathsf{{K}}}}}_{BA}a\) or \({{\mathbf {\mathsf{{I}}}}}_{A}\) for some abBC (there would be other possibilities if M were an \(\eta \)-redex but it is not). So one of the following must hold: \(N \equiv N_0N_1, N_H \equiv a\) or \(N \equiv v^A\). In the second and third cases, N clearly contains no \(\beta \)-redex (using the I.H. in the case where \(N_H \equiv a\)). By definition of Ir, \({{\mathbf {\mathsf{{S}}}}}_{ABC}ab\) is in Ir only if \(N_H\) is. By the I.H., \(N_H\) is in Ir only if N contains no \(\beta \)-redex.

If \(M \equiv \Lambda _{\alpha ,\phi }N\) then \(M_H \equiv \wedge _{\alpha ,\phi }N_H\). By the definition of Ir (Table 5.1; especially row (c) and the condition appended to (f)) \(\wedge _{\alpha ,\phi }N_H\) is in Ir only if \(N_H\) is. By the I.H., there are no \(\beta \)-redexes in N.

If \(M \equiv NC\) for some type C, then by Definition 6.1\(M_H \equiv {{\mathbf {\mathsf{{U}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}N_H\) for some B and \(\phi \). Since this is in Ir, so is \(N_H\) and therefore, by the I.H., N contains no \(\beta \)-redexes.

Proposition 6.10

If \(M_H \in \) Ir, there are no type-redexes in M.

Proof

by induction on the construction of M.

If \(M \equiv NC\) for some type C, then by Definition 6.1\(M_H \equiv {{\mathbf {\mathsf{{U}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}N_H\) for some B and \(\phi \). Since this term is in Ir it is, by Proposition 5.10, not a redex and therefore \(N_H\) is not a generalization. By Definition 6.1, it follows that N is not formed by \(\Lambda \)-abstraction.

That no subterm of M is a type-redex is shown, both in this case and in general, using the definition of Ir and the I.H., just as in the proof of 6.9.

Theorem 1

If \(\Sigma \) is a class of \(CL_2\)-terms such that every term in \(\Sigma \) has a normal form, so does every \(\Lambda _2\)-term M such that \(M_H\) is in \(\Sigma \).

Proof

if \(M_H \in \Sigma \), by Proposition 6.8 there is an N such that \(M =_{\Lambda _2} N\) and \(N_H\) is the normal form of \(M_H\). Let all \(\eta \)-redexes in N be contracted and let \(N'\) be the result. By Definition 4.6, \(N_H \equiv N'_H\). By Propositions 6.9 and 6.10, \(N'\) contains no redexes.

7 Normalizability of CL\(_2\)-terms

Theorem 1 shows (or, at least, so it will be argued in Sect. 8) that interesting consequences follow from the assumption that all CL\(_2\)-terms are normalizable. The theorem will, admittedly, be of limited interest if there are CL\(_2\)-terms which are not normalizable. The purpose of this section is to apply the method invented by Girard ([6]) and used in [1, 7, 12, 19, 24] to show that there are no such terms.

Definition 7.1

The rank of a type A, henceforth \(\rho (A)\), is defined to be:-

  1. (i)

    0 if A is a type-variable.

  2. (ii)

    max\(\lbrace \rho (B)\), \(\rho (C)\rbrace + 1\) if \(A \equiv B\) \(\rightarrow \) \(C\).

  3. (iii)

    \(\rho (B(\alpha )) + 1\) if \(A \equiv \forall \phi B(\phi )\).

Definition 7.2

For each type A, a subset S of \(CL_2(A)\) is regular iff:

  1. (i)

    Every term in S has a normal form.

  2. (ii)

    S contains every term of type A that can be built up by starting with either a variable or a term of the shape \(\Xi _{\forall \phi (F\rightarrow G)}ab\) which is irreducible and not a generalization and applying the two operations:

(\(\alpha \)):

a \(\mapsto ab_1b_2...b_n\) for some normalizable \(CL_2\)-terms \(b_1, b_2,\ldots b_n\).

(\(\beta \)):

a \(\mapsto Xa\), for some \({{\mathbf {\mathsf{{U}}}}}\)-term X.

  1. (iii)

    If \(a' \in \) S and \(a \triangleright a'\), then \(a \in \) S.

  2. (iv)

    If \(a \in \) S and \(a \triangleright a'\), then \(a' \in \) S.

Definition 7.3

A valuation (sc., of the free type-variables) is a function \(\xi \) with the property that, for every \(\alpha \), there is an A such that \(\xi (\alpha )\) is a regular subset of \(CL_2(A)\).

The letters \(\xi , \xi ', \xi '',\ldots \) will range over valuations.

Definition 7.4

For each type A, \(A^{\xi }\) is the type defined by:

  1. (i)

    If A \(\equiv \alpha \), then \(A^{\xi }\) is the type of the elements of \(\xi (\alpha )\).

  2. (ii)

    If A \(\equiv B\) \(\rightarrow \) \(C\), then \(A^{\xi } \equiv B^{\xi }\) \(\rightarrow \) \(C^{\xi }\).

  3. (iii)

    If A \(\equiv \forall \phi (B(^{\alpha }_{\phi }))\), then \(A^{\xi } \equiv \forall \phi (B^{\xi '}(^{\alpha }_{\phi }))\), where \(\xi '\) assigns an arbitrary regular set of type \(\alpha \) to \(\alpha \) and otherwise agrees with \(\xi \).

Corollary

if \(\xi (\alpha )\) is a set of type C, then \((A(\alpha ))^{\xi } \equiv A^{\xi }(C)\).

Definition 7.5

For any \(CL_2\)-term a and any valuation \(\xi \), the term \(a^{\xi }\) shall be constructed as follows:

  1. (i)

    If a is an atom of type A described on some row of Table 4.1, \(a^{\xi }\) shall be the atom of type \(A^{\xi }\) on the same row.

  2. (ii)

    If \(a \equiv a_0a_1\), then \(a^{\xi }\) shall be \(a_0^{\xi }\) \(a_1^{\xi }\)

  3. (iii)

    If \(a \equiv \Pi _{\alpha ,\phi }a_0\), then \(a^{\xi }\) shall be \(\Pi _{\alpha ,\phi }(a_0^{\xi '})\), where \(\xi '\) takes \(\alpha \) to a set of type \(\alpha \) but otherwise agrees with \(\xi \).

Corollary

if \(a \in \) CL\(_2(A)\) then \(a^{\xi } \in \) CL\(_2(A^{\xi })\).

The next definition defines the sequence of sets Comp\(_{\xi ,A}\), read as “the set of terms that are computable relative to A and \(\xi \)”, for each \(\xi \) and A.Footnote 6 The definition is by recursion on \(\rho (A)\). In clause (iii), the type \(\forall \phi B(\phi )\) shall be \((\forall \phi A(\phi ))^{\xi }\).

Definition 7.6

  1. (i)

    If \(A \equiv \alpha \), then Comp\(_{\xi ,A} = \xi (\alpha )\).

  2. (ii)

    Comp\(_{\xi ,A\rightarrow B} = \lbrace a :\) if b \(\in \) Comp\(_{\xi ,A}\) then ab \(\in \) Comp\(_{\xi ,B}\rbrace \).

  3. (iii)

    Comp\(_{\xi ,\forall \phi A(\phi )} = \lbrace a :\) for each type C, \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}a \in \) Comp\(_{\xi ' ,A(\alpha )}\) for every valuation \(\xi '\) which assigns a regular set of type C to \(\alpha \) and otherwise agrees with \(\xi \rbrace \).

Proposition 7.7

For each A and \(\xi \), Comp\(_{\xi ,A}\) is a subset of CL\(_2(A^{\xi })\).

Proof

by induction on \(\rho (A)\). If \(A \equiv B\) \(\rightarrow \) \(C\), then by clause (ii) of Definition 7.6, Comp\(_{\xi ,A}\) is a set of terms each of which, applied to a term in Comp\(_{\xi ,B}\) yields a term in Comp\(_{\xi ,C}\). Applying the I.H., this is equivalent to saying Comp\(_{\xi ,A}\) is a set of terms of type \(B^{\xi }\) \(\rightarrow \) \(C^{\xi }\). Suppose A is a quantified type and \(A^{\xi } \equiv \forall \phi B(\phi )\). Then the elements of Comp\(_{\xi ,A}\) must have type \(\forall \phi B(\phi )\) in order for \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}a\) to be well-formed.

Definition 7.8

For any term b of type B and any A and \(\xi \) such that \(A^{\xi } \equiv B\), a term c is an \((A,\xi )\)-supplementation of b iff one of the following conditions is fulfilled:

  1. (i)

    A is a type-variable and \(c \equiv b\).

  2. (ii)

    \(A \equiv A_0\) \(\rightarrow \) \(A_1\) and c is an \((A_1,\xi )\)-supplementation of bd for some d in Comp\(_{A_0 ,\xi }\).

  3. (iii)

    \(A \equiv \forall \phi D(\phi )\) for some D, which means that B is also a quantified type, say \(\forall \phi E(\phi )\). Moreover, there is a C such that \(\xi (\alpha )\) is of type C, \((D(\alpha ))^{\xi } \equiv E(C)\) and c is a \((D(\alpha ),\xi )\)-supplementation of \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi E(\phi )\rightarrow E(C)}b\).

It follows easily from this definition that an (\(A,\xi \))-supplementation of a term in CL\(_2(A^{\xi })\) will have the type of the terms in \(\xi (\beta )\), for some \(\beta \). Further work, though, is needed to show that it will also be in \(\xi (\beta )\) for some \(\beta \).

Proposition 7.9

For any term b of type B and any A and \(\xi \) such that \(A^{\xi } \equiv B\), if every \((A,\xi )\)-supplementation of b is in \(\xi (\alpha )\) for some \(\alpha \) occurring in A, then \(b \in Comp_{\xi ,A}\).

Proof

by induction on \(\rho (A)\). If A is a type-variable, the proposition is obvious. Suppose \(A \equiv A_0\) \(\rightarrow \) \(A_1\) and let d be in Comp\(_{\xi ,A_0}\). By the I.H., if every \((A_1 ,\xi )\)-supplementation of bd is in Comp\(_{\xi ,\alpha }\) for some \(\alpha \) occurring in \(A_1\), then bd is in Comp\(_{\xi ,A_1}\). By Definitions 7.6 and 7.8, this is equivalent to the proposition to be proven.

To deal with the case where \(A \equiv \forall \phi D(\phi )\), note that the last sentence of clause (iii) of Definition 7.8 is equivalent (using the corollary to Definition 7.4) to the following:

There are a C and a \(\xi '\) such that \(\xi '\) takes \(\alpha \) to a set of type C and otherwise agrees with \(\xi \); and c is a \((D(\alpha ),\xi ')\)-supplementation of \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi E(\phi )\rightarrow E(C)}b\).

Suppose now \(A \equiv \forall \phi D(\phi )\) while \(A^{\xi } \equiv \forall \phi E(\phi )\) and every \((\forall \phi D(\phi ),\xi )\)-supplementation of b is in \(\xi (\beta )\) for some \(\beta \) occurring in A. That is, by Definition 7.8, for every C and \(\xi '\) such that \(\xi '\) takes \(\alpha \) to a set of type C and otherwise agrees with \(\xi \), every (\(D(\alpha ),\xi '\))-supplementation of b is in \(\xi (\beta )\) for some \(\beta \) occurring in A. Since \(\alpha \) does not occur in A, \(\beta \) is distinct from \(\alpha \) and “\(\xi (\beta )\)” may here be replaced with “\(\xi '(\beta )\)”. Then by the I.H. \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi E(\phi )\rightarrow E(C)}b\) is in Comp\(_{\xi ' ,D(\alpha )}\) for every such \(\xi '\). But this is precisely the condition for b to be in Comp\(_{\xi ,\forall \phi D(\phi )}\).

Proposition 7.10

  1. (i)

    For every \(\xi \) and A, every element of Comp\(_{\xi ,A}\) has a normal form.

  2. (ii)

    For every term-variable \(x^A\) and every \(\xi \), \((x^A)^{\xi } \in \) Comp\(_{\xi ,A}\).

Proof

the two halves are proven simultaneously by induction on \(\rho (A)\). If A is a type-variable \(\alpha \), then \((x^A)^{\xi } \in \) Comp\(_{\xi ,A}\) by clause (ii) of Definition 7.2 while every element of Comp\(_{\xi ,A}\) has a normal form by clause (i) of the same definition.

Suppose \(\rho (A)>\) 0. Then an (\(A,\xi \))-supplementation of \((x^A)^{\xi }\) is built up by starting with \((x^A)^{\xi }\) and applying the two operations \(a \mapsto ab_1b_2...b_n\) for computable \(b_1, b_2,\ldots b_n\) and \(a \mapsto Xa\), for some \({{\mathbf {\mathsf{{U}}}}}\)-term X. The terms \(b_1, b_2,\ldots b_n\) used in the first operation all have normal forms by part (i) of the I.H., so every (\(A,\xi \))-supplementation of \((x^A)^{\xi }\) is in \(\xi (\beta )\) for some \(\beta \) by clause (ii) of Definition 7.2, which is sufficient by Proposition 7.9 for \((x^A)^{\xi }\) to be in Comp\(_{\xi ,A}\).

Suppose \(a \in \) Comp\(_{\xi ,A}\) and \(A \equiv A_0\) \(\rightarrow \) \(A_1\). Any variable \(x^B\), where \(B \equiv A_0^{\xi }\), is in Comp\(_{\xi ,A_0}\) by part (ii) of the I.H. By Definition 7.6 therefore, \(ax^B\) is in Comp\(_{\xi ,A_1}\) and therefore has a normal form, say \(a'\), by part (i) of the I.H. So \([x^B]a'\) is irreducible by clause (b) of Table 5.1 and is also, by Sect. 5, a term to which a reduces.

Finally if \(a \in \) Comp\(_{\xi ,A}\), if \(A \equiv \forall \phi B(\phi )\) and if \(B^{\xi }(\alpha ) \equiv D(\alpha )\), then by clause (iii) of Definition 7.6, \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi D(\phi )\rightarrow D(\alpha )}a \in \) Comp\(_{\xi ' ,B(\alpha )}\) where \(\xi '\) is a valuation which takes \(\alpha \) to a set of type \(\alpha \) and otherwise agrees with \(\xi \). \(B(\alpha )\) has a lower rank than \(\forall \phi B(\phi )\), so by part (i) of the I.H. \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi D(\phi )\rightarrow D(\alpha )}a\) has a normal form, which can only be the case if \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi D(\phi )\rightarrow D(\alpha )}a'\), where \(a'\) is the normal form of a, is a term of its reduction sequence.

Proposition 7.11

Every \((A,\xi )\)-supplementation of a term b in \(Comp_{\xi ,A}\) belongs to \(\xi (\beta )\) for some \(\beta \) which occurs in A.

Proof

by induction on \(\rho (A)\). If A is a type-variable \(\alpha \), then \(\xi (\alpha )\) = Comp\(_{\xi ,A}\). Suppose \(A^{\xi } \equiv B\). If \(A \equiv A_0\) \(\rightarrow \) \(A_1\) then \(B \equiv B_1\) \(\rightarrow \) \(B_1\), where \(B_i \equiv A_i^{\xi }\) for i = 1,2. Let c be an arbitrary term in Comp\(_{\xi ,A_0}\); then by the I.H. every \((A_1,\xi )\)-supplementation of bc is in \(\xi (\beta )\) for some \(\beta \); but by Definition 7.8 this is all that is required. If \(A \equiv \forall \phi D(\phi )\) then \(B \equiv \forall \phi E(\phi )\), for some \(E(\alpha )\) such that \(D(\alpha ) \equiv E(\alpha )^{\xi '}\), for any valuation \(\xi '\) which takes \(\alpha \) to a set of terms of type \(\alpha \) and otherwise agrees with \(\xi \). Suppose that \(\xi \) takes \(D(\alpha )\) to E(C); then, by the I.H., every \((D(\alpha ),\xi )\)-supplementation of \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi E(\phi )\rightarrow E(C)}b\) is in \(\xi (\beta )\) for some \(\beta \); but then by Definition 7.8 so is every \((A,\xi )\)-supplementation of b.

Proposition 7.12

Comp\(_{\xi ,A}\) is a regular set for every \(\xi \) and A.

Proof

by Proposition 7.10, Comp\(_{\xi ,A}\) has property (i) of the four properties mentioned in Definition 7.2.

That it has property (ii) will be shown by induction on \(\rho (A)\). The basis is obvious. Let \(A \equiv A_0\) \(\rightarrow \) \(A_1\) and a be a term of type \(A^{\xi }\) that is built up in the manner described in clause (ii) of Definition 7.2. Let b be a normalizable term of type \(A^{\xi }_0\); now by the I.H. Comp\(_{\xi ,A_1}\) is a regular set and therefore contains ab, since that term is of the shape described in clause (ii) of Definition 7.2. If this is so for every normalizable b, then a fortiori (by Proposition 7.10) it is so for every b that is in Comp\(_{\xi ,A_0}\). But the latter is the very condition required by Definition 7.6 for a to be in Comp\(_{\xi ,A}\). A fairly similar argument deals with the case where A is a quantified type.

That Comp\(_{\xi ,A}\) has properties (iii) and (iv) of the four properties mentioned in Definition 7.2 can be shown by considering an arbitrary \((A,\xi )\)-supplementation of a term in Comp\(_{\xi ,A}\) and applying Propositions 7.9 and  7.11.

Corollary

if a is in Comp\(_{\xi ,A}\), then so is every term to which a reduces.

Proposition 7.13

Comp\(_{\xi ,A(D)} = Comp_{\xi ',A(\alpha )}\) for every \(\xi '\) such that \(\xi '(\alpha )\) = Comp\(_{\xi , D}\) and \(\xi '\) otherwise agrees with \(\xi \).

Proof

by induction on \(\rho (A(\alpha ))\) (cf. lemma 11.3.6 in [24]).

Proposition 7.14

If a \(\in CL_2(A)\) is an atomic term, then \(a^{\xi } \in Comp_{\xi ,A}\) for every \(\xi \).

Proof

if a is a variable, this is Proposition 7.10. If a is an atomic constant, six cases are distinguished according to which kind of constant a is.

(Case 1) \(a \equiv {{\mathbf {\mathsf{{S}}}}}_{CDE}\). By Definition 7.6 and Proposition 7.9, it suffices to show that, whenever \(c^{\xi }, d^{\xi }, e^{\xi }\) belong to Comp\(_{\xi ,C\rightarrow D\rightarrow E}\), Comp\(_{\xi ,C\rightarrow D}\), Comp\(_{\xi ,C}\) respectively and \(f({{\mathbf {\mathsf{{S}}}}}^{\xi }_{CDE}c^{\xi }d^{\xi }e^{\xi })\) is an \((A ,\xi )\)-supplementation of \({{\mathbf {\mathsf{{S}}}}}^{\xi }_{CDE}\), then \(f({{\mathbf {\mathsf{{S}}}}}^{\xi }_{CDE}c^{\xi }d^{\xi }e^{\xi }) \in \xi (\beta )\) for some \(\beta \).

By Proposition 7.10, each of \(c^{\xi }, d^{\xi }, e^{\xi }\) has a normal form; let \(c'^{\xi }, d'^{\xi }, e'^{\xi }\) be the respective normal forms of \(c^{\xi }, d^{\xi }, e^{\xi }\). By the corollary to Proposition 7.12, \(c'^{\xi }, d'^{\xi }, e'^{\xi }\) belong to Comp\(_{\xi ,C\rightarrow D\rightarrow E}\), Comp\(_{\xi ,C\rightarrow D}\) and Comp\(_{\xi ,C}\) respectively. By clause (ii) of Definition 7.6, \(c'^{\xi }e'^{\xi }(d'^{\xi }e'^{\xi }) \in \) Comp\(_{\xi ,E}\) and therefore (by Proposition 7.11) there is a \(\beta \) such that \(f(c'^{\xi }e'^{\xi }(d'^{\xi }e'^{\xi })) \in \xi (\beta )\), where f is as in the last paragraph. By clause (iii) of Definition  7.2, \(f({{\mathbf {\mathsf{{S}}}}}^{\xi }_{CDE}c^{\xi }d^{\xi }e^{\xi }) \in \xi (\beta )\) follows.

Cases 2 and 3, i.e. where a is a typed instance of \({{\mathbf {\mathsf{{K}}}}}\) or \({{\mathbf {\mathsf{{I}}}}}\) are treated likewise.

Case 4 \(a \equiv {{\mathbf {\mathsf{{U}}}}}^{\forall \phi B(\phi )\rightarrow B(C)}\). Let \((\forall \phi B(\phi )\) \(\rightarrow \) \(B(C))^{\xi } \equiv \forall \phi D(\phi )\) \(\rightarrow \) \(D(E)\) and let b be in Comp\(_{\xi ,\forall \phi B(\phi )}\). By clause (iii) of Definition 7.6, \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi D(\phi )\rightarrow D(E)}b\) is in Comp\(_{\xi ',B(\alpha )}\) for every \(\xi '\) that takes \(\alpha \) to a set of type E and otherwise agrees with \(\xi \), including the particular \(\xi '\) that takes \(\alpha \) to Comp\(_{\xi ,C}\). By Proposition 7.13, \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi D(\phi )\rightarrow D(E)}b \in \) Comp\(_{\xi ,B(C)}\), from which the conclusion follows by clause (ii) of Definition 7.6.

Case 5 \(a \equiv \Xi _{\forall \phi (F\rightarrow G)}\). Let \(D(\alpha )\) and \(E(\alpha )\) be types such that \(\forall \phi D(\phi ) \equiv \forall \phi F\) and \(\forall \phi E(\phi ) \equiv \forall \phi G\). Let bc belong to Comp\(_{\xi ,\forall \phi (F\rightarrow G)}\), Comp\(_{\xi ,\forall \phi F}\) respectively and let \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)}(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc))\) be an \((A ,\xi )\)-supplementation of \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}\). Under these assumptions, it suffices to show this last term belongs to \(\xi (\beta )\) for some \(\beta \).

The assumptions on b and c entail, by Proposition 7.10, that b and c have normal forms; let the latter be \(b'\) and \(c'\) respectively.

Subcase 5a in both of the reduction sequences leading from b to \(b'\) and from c to \(c'\), there are \(\alpha /\phi \)-generalizations; let \(\wedge _{\alpha ,\phi }e_0\) and \(\wedge _{\alpha ,\phi }e_1\) be the first such generalizations which appear in the two respective reduction sequences.

The computability of b and c also entails, by clause (iii) of Definition 7.6, that \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(\alpha )\rightarrow E^{\xi }(\alpha )}b({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow D^{\xi }(\alpha )}c)\) is in Comp\(_{\xi ',E(\alpha )}\) for every \(\xi '\) that takes \(\alpha \) to a set of type \(\alpha \). This term therefore has a terminating complete reduction that proceeds through \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(\alpha )\rightarrow E^{\xi }(\alpha )}\) \((\wedge _{\alpha ,\phi }e_0')\) \(({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow D^{\xi }(\alpha )}\)\((\wedge _{\alpha ,\phi }e_1'))\), where \(e_0'\) and \(e_1'\) are the respective normal forms of \(e_0\) and \(e_1\). Let its normal form, therefore, be \(e'\).

By Sect. 5, the term \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)}(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc))\) is reduced by first replacing the component \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc\) with successive reducts as follows:

  • \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc\) \(\triangleright \) \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}(\wedge _{\alpha ,\phi }e_0)c\)

  •       \(\triangleright \) \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}(\wedge _{\alpha ,\phi }e_0')c\quad \)

  •       \(\triangleright \) \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}(\wedge _{\alpha ,\phi }e_0')(\wedge _{\alpha ,\phi }e_1')\)

  •       \(\triangleright \) \(\Lambda _{\alpha ,\phi }(e_0'e_1')\quad \) by Definitions 5.6 and 5.8, assuming \(e'_0e_1' \notin \) Ir

  •       \(\triangleright \) \(\Lambda _{\alpha ,\phi }(e')\quad \quad \) because \(e_0'e_1'\;\triangleright \; e'\) (see above)

  •       \(\triangleright \) \(\wedge _{\alpha ,\phi }(e')\quad \)

The term \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)}(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc))\) therefore reduces to \(f(e'(^{\alpha }_C))\).

On the other hand, \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)}b({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow D^{\xi }(C)}c)\) is in Comp\(_{\xi ,D(\alpha )}\) in consequence of the assumptions \(b \in \) Comp\(_{\xi ,\forall \phi (F\rightarrow G)}\) and \(c \in \) Comp\(_{\xi ,\forall \phi F}\), so the term \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)}b({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow D^{\xi }(C)}c))\), being a \((D(\alpha ),\xi )\)-supplementation of it, is in \(\xi (\beta )\) for some \(\beta \) by Proposition 7.11. By Sect. 5, it reduces to \(f(e'_0(^{\alpha }_C)e'_1(^{\alpha }_C))\), which, by Proposition 5.11, also reduces to \(f(e'(^{\alpha }_C))\). By clause (iv) of Definition 7.2, this is in \(\xi (\beta )\) for some \(\beta \), hence by clause (iii) of the same Definition so is any term that reduces to \(f(e'(^{\alpha }_C))\).

Subcase 5b \(\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}b'c'\) is not an \(\alpha /\phi \)-generalization. Then, by clause (ii) of Definition 7.2, \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)} (\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}b'c'))\), being irreducible, belongs to \(\xi (\beta )\) for every \(\beta \). By clause (iii), so does \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (D^{\xi }(\phi )\rightarrow E^{\xi }(\phi ))\rightarrow D^{\xi }(C)\rightarrow E^{\xi }(C)} (\Xi ^{\xi }_{\forall \phi (F\rightarrow G)}bc))\).

Case 6 \(a \equiv \Theta _{\forall \phi B}\). An arbitrary (\(\forall \phi B,\xi \))-supplementation of this has the shape \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (B^{\xi }(\phi )\rightarrow B^{\xi }(C)}(\Theta ^{\xi }_{\forall \phi B}b))\), where \(b \in \) Comp\(_{\xi ,B}\) and therefore has a normal form \(b'\). Moreover, \(b'\) is also the normal form of \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (B^{\xi }(\phi )\rightarrow B^{\xi }(C)}(\Theta ^{\xi }_{\forall \phi B}b)\), since, as we saw in Sect. 4, the outermost quantifier in \(\forall \phi B\) does not actually bind anything. So \(f(b')\) is a well-formed term. Moreover, by the corollary to Proposition 7.12 together with Proposition 7.11, \(f(b')\) is in \(\xi (\beta )\) for some \(\beta \). By clause (iii) of Definition 7.2, so is \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (B^{\xi }(\phi )\rightarrow B^{\xi }(C)}(\Theta ^{\xi }_{\forall \phi B}b))\).

Proposition 7.15

For every term a of \(CL_2(A)\) and every \(\xi \), \(a^{\xi }\) is in Comp\(_{\xi ,A}\).

Proof by induction on the construction of terms. If a is formed by application, then the conclusion follows from the I.H. by clause (ii) of Definition 7.6. Suppose \(a \equiv \Pi _{\alpha ,\phi }b\) and let b have type \(B(\alpha )\). Then an arbitrary (\(\forall \phi B(\phi ),\xi \))-supplementation of a has the shape \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (B^{\xi }(\phi )\rightarrow B^{\xi }(C)}(\Pi _{\alpha ,\phi }b)^{\xi })\). By the I.H.,

$$\begin{aligned} b^{\xi }\in {Comp}_{\xi , B(\alpha )}\;\text {for every}\; \xi . \end{aligned}$$
(5)

Let \(\xi \) now be fixed and let D be such that \(D^{\xi } \equiv C\). Then, instantiating (5), \(b^{\xi '} \in \) Comp\(_{\xi ',B(\alpha )}\) for any valuation \(\xi '\) that takes \(\alpha \) to Comp\(_{\xi ,D}\) and otherwise agrees with \(\xi \). So by Proposition 7.13\(b^{\xi '} \in \) Comp\(_{\xi ,B(D)}\) and therefore \(f(b^{\xi '})\) is in \(\xi (\beta )\) for some \(\beta \). \(b^{\xi '}\) is the same term as \((b(^{\alpha }_D))^{\xi }\) and \(f({{\mathbf {\mathsf{{U}}}}}^{\forall \phi (B^{\xi }(\phi )\rightarrow B^{\xi }(C)}(\Pi _{\alpha ,\phi }b)^{\xi })\) reduces to \(f((b'(^{\alpha }_D))^{\xi })\), where \(b'\) is the normal form of b. However, by clauses (iii) and (iv) of Definition 7.2, any term that reduces to \(f((b'(^{\alpha }_D))^{\xi })\) is in \(\xi (\beta )\) for some \(\beta \).

Theorem 2

Every \(CL_2\)-term has a normal form.

Proof

by Propositions 7.10 and 7.15.

8 The importance of Theorem 1 and the possible utility of CL\(_2\)-terms

Theorems 1 and 2 together entail that every \(\Lambda _2\)-term has a normal form. It may be objected that this has already been repeatedly proven ([1, 6, 7, 12, 19, 24]). The answer to this objection is that, for some purposes, a proof of Theorem 2 like the one given here will not be optimal. There are advantages in being able to produce normalizability proofs that can be formalized in primitive recursive arithmetic (PRA) augmented only by transfinite induction (applied to predicates in the language of PRA) over some initial segment of the recursive ordinal numbers. The predicates “\(\in \) Comp\(_{\xi ,A}\)”, used in Sect. 7, are of course very far from being formalizable in the language of PRA.

In the terminology used by proof theorists, a normalizability proof of the kind envisaged here is called “ordinally informative”.Footnote 7

Definition 8.1

The n-th polymorphic Church numeral is the \(\Lambda _2\)-term \(\Lambda _{\alpha ,\phi }\uplambda y^{\alpha \rightarrow \alpha }\uplambda x^{\alpha }(\underbrace{y^{\alpha \rightarrow \alpha }(y^{\alpha \rightarrow \alpha }...(y^{\alpha \rightarrow \alpha }}_{n}x^{\alpha })...))\), abbreviated henceforth to \(c_n\).

Definition 8.2

For any m-place function f of natural numbers, f is represented by the \(\Lambda _2\)-term M iff, for every m-tuple \(c_{n_1}, c_{n_2},\ldots , c_{n_m}\) of polymorphic Church numeralsFootnote 8

$$\begin{aligned} Mc_{n_1}c_{n_2}...c_{n_m}\quad =_{\Lambda _2}\quad c_{f(n_1, n_2,\ldots ,n_m)} \end{aligned}$$

Definition 8.3

A class \(\Phi \) of numerical functions is represented by a class \(\Sigma \) of terms of \(\Lambda _2\) iff each function in \(\Phi \) is represented by some term in \(\Sigma \).

Up till now, ordinally informative proofs of normalizability have been developed much less for \(\Lambda _2\)-terms than for the terms of G\(\ddot{{o}}\)del’s T,Footnote 9 which can however be embedded in \(\Lambda _2\) ([19], p. 283). A number of writers have exploited this work to prove theorems along the following lines: if all the terms in \(\Sigma \) can be shown to have normal forms by means of transfinite induction up to some ordinal number \(\alpha \) and \(\Sigma \) represents the class \(\Phi \), then each function in \(\Phi \) can be defined by ordinal recursion up to some ordinal below \(\alpha \). See for example [11, 17, 22].

Ordinal recursion has been felt to be a natural and illuminating way of defining subrecursive classes ([15], chapter 3) and convenient for proving the identity of classes that have been defined in different ways (ibid., chapter 4). So ordinally informative proofs of normalizability of classes of \(\Lambda _2\)-terms can be expected to show not only that the terms have normal forms but also that the functions they represent can be characterized in a way that is intelligible to people who are not acquainted with \(\Lambda _2\) and arguably more illuminating even to people who are.

There is another advantage which at least some ordinally informative proofs of normalizability have over proofs of the kind presented in Sect. 7. If it should prove possible to define a function \({\mathcal {F}}\) which takes terms in class \(\Sigma \) to recursive ordinal numbers in such a way that:

  1. (6)

    If M reduces to N, then \({\mathcal {F}}(M) > {\mathcal {F}}(N)\).

then this very easily yields not only a proof that the terms of \(\Sigma \) all have normal forms but also a definition by ordinal recursion of a function which outputs an upper bound to the number of reduction steps needed to reach the normal form.Footnote 10

Theorem 1 above entails that an ordinally informative proof of the normalizability of some class \(\Sigma \) of CL\(_2\)-terms is also an ordinally informative proof of the normalizability of the set of \(\Lambda _2\)-terms that have H-images in \(\Sigma \). It remains to discuss why there might be any advantages in trying to achieve such a result for CL\(_2\) first rather than tackling \(\Lambda _2\) directly. To some extent, the answer has to be that the best way to tackle \(\Lambda _2\) is not yet known but it makes sense for researchers to arm themselves in advance with any tool which might potentially be useful. But one point can be made already. The most obvious form for the definition of an ordinal assignment function \({\mathcal {F}}\) to take is for it to follow the construction of terms: to each operation in the construction of terms (term-application, \(\uplambda \)-abstraction, type-abstraction and type-application) shall correspond a function taking the \({\mathcal {F}}\)-images of immediate subterms of M to \({\mathcal {F}}(M)\) itself. As CL\(_2\)-terms are built up by fewer and simpler operations than \(\Lambda _2\)-terms—only term-application and a restricted form of type-abstraction denoted by \(\Pi _{\alpha ,\phi }\)—it is a reasonable expectation that the invention of a suitable ordinal assignment function for CL\(_2\)-terms will be a simpler matter than for \(\Lambda _2\)-terms.

Not all of the ordinal assignment functions in the proof-theoretic literature do follow this obvious pattern, but some do, notably in [10, 16] (pp. 105–112). The normalizability proof in [10] is for \(\mathbf{T} \) formulated using \(\Lambda _{\rightarrow }\)-terms, in [16] it is for \(\mathbf{T} \) formulated using CL\(_{\rightarrow }\)-terms. The assignment in [10] is more complicated in two respects: firstly what is assigned to each term in [16] is a finite sequence of ordinal numbers whereas in [10] it is a finite sequence of functions of ordinal numbers; in the second place in [10] the finite sequence assigned to a term is calculated by means of two operations, corresponding to term-application and \(\uplambda \)-application respectively, whereas in [16] only one operation is needed, corresponding to application alone.

9 Irreducible terms; connections to second-order logic

Any deduction formed using rules (\(a){-}(e)\) of Table 5.1 can be transformed into a proof in the natural deduction calculus \(\rightarrow \) \(\forall ^2 \mathbf{Nip} ^2\) (described in [24], p. 345), in the following way: every statement \(a \in \) Ir\(_2(A)\) should be transformed into a sequent that has, on the left, the types of the variables in a and, on the right, just the formula A. This transformation will be called the statements-to-sequents transformation. Under it, the rules (\(a){-}(e)\) of Table 5.1 become rules of \(\rightarrow \) \(\forall ^2 \mathbf{Nip} ^2\). It will be permitted, however, that in translating from a statement to a sequent, formulae may be inserted on the left of the sequent which are not necessarily types of free variables in the subject of the statement.

Provided that the given deduction is constructed in accordance with the restrictions stated after Table 5.1, it is possible to see that its image under the statements-to-sequents transformation will actually be a normal proof in \(\rightarrow \) \(\forall ^2 \mathbf{Nip} ^2\).

If the term a in a particular application of rule (b) of Table 5.1 satisfies the suffixed condition (3), then \([x^A]a\) has a combinator in head position. If a satisfies condition (4) but not (3), then either \([x^A]a\) is a \({{\mathbf {\mathsf{{U}}}}}\)-term or \([x^A]a\) has in head position a combinator or a \(\Theta \)-term. There are no other possibilities. It might seem at first that \([x^A]a\) could have a \(\Xi \)-term in head position, but this is so only if a has the shape \(\Xi _{\forall \phi (F\rightarrow G)}bx^{\forall \phi F}\) or \(\Xi _{\forall \phi (F\rightarrow G)}x^{\forall \phi (F\rightarrow G)}\). However, by Definition 4.3, no term of any such shape is a generalization and it was stipulated that rules (f) and (g) have not been used in the deduction.

On the other hand, the conditions on rule (d) in Table 5.1 were that, in the left premiss \(a \in {Ir}(A\) \(\rightarrow \) \(B)\) of an application of this rule, a had to have either a variable or a \({{\mathbf {\mathsf{{U}}}}}\)-term in head position, but could not be itself a \({{\mathbf {\mathsf{{U}}}}}\)-term. This means that a conclusion of rule (b) cannot be a left premiss of an application of rule (d). In the terminology used to talk about natural deduction calculi this is expressed by saying that the main premiss of an application of \(\rightarrow \)-elimination may not be the conclusion of an application of \(\rightarrow \)-introduction.

The argument of the last two paragraphs may be summarized by saying that, in a deduction using the rules of Table 5.1, or in its image under the statements-to-sequents transformation, the conclusion of an introduction inference may not be the sole premiss of a \(\forall \)E-inference nor the main premiss of a \(\rightarrow \)E-inference. When the only connectives present are \(\forall \) and \(\rightarrow \) this entails ([24], p. 179) that the deduction is normal.

This equivalence can now be used to show that the set of sequents provable in \(\rightarrow \) \(\forall ^2\)Nip\(^2\) is just the image under the statements-to-sequents transformation of the set of statements provable in CL\(_2\).

Proposition 9.1

If a sequent is provable in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\), it is the image under the statements-to-sequents transformation of a statement provable in CL\(_2\).

Proof

every initial sequent is the image of a statement of the shape \(x^A \in \) CL\(_2(A)\). The rules \(\forall \)I (second-order) and \(\rightarrow \)I are admissible by Propositions 4.5 and 4.7 respectively. The rule \(\rightarrow \)E is a primitive rule of CL\(_2\). As for \(\forall \)E: if \(a \in \) CL\(_2(\forall \phi A(\phi ))\) is deducible in CL\(_2\), then, in virtue of the initial statements and rules of CL\(_2\), so is \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi A(\phi )\rightarrow A(C)}a \in A(C)\); while, if the statements-to-sequents transformation takes \(a \in \) CL\(_2(\forall \phi A(\phi ))\) to \(\Gamma \vdash \, \forall \phi A(\phi )\), it also takes \({{\mathbf {\mathsf{{U}}}}}^{\forall \phi A(\phi )\rightarrow A(C)}a \in A(C)\) to \(\Gamma \vdash A(C)\).

Proposition 9.2

If a sequent \(\Gamma \vdash A\) is the image under the statements-to-sequents transformation of a statement \(a \in CL_2(A)\), provable in CL\(_2\), then \(\Gamma \vdash A\) is also provable in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\).

Proof

by Theorem 2, a has a normal form \(a'\) which also has type A. By Proposition 5.10 and the first corollary following Definition 5.7, there is then a deduction \({\mathfrak {D}}\) of \(a' \in \) Ir(A) in Ir. Taking advantage, if necessary, of the right to add extra formulae on the left (see above), \(\Gamma \vdash A\) is also the image under the statement-to-sequents transformation of \(a' \in \) Ir(A). Suppose, first, that the deduction \({\mathfrak {D}}\) uses only rules \((a){-}(e)\). Then, applying the statements-to-sequents transformation to the whole of \({\mathfrak {D}}\), we obtain, as the foregoing discussion showed, a proof in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\) (and, indeed, a normal proof). Even if \({\mathfrak {D}}\) makes use of rules (f) and (g), the images of these rules under the statement-to-sequents transformation are rules admissible in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\).

The proof of Proposition 9.1 introduces in effect a mapping which takes derivations in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\) to CL\(_2\)-terms. The image of a derivation \({\mathfrak {D}}\) under this mapping will be called the term which encodes \({\mathfrak {D}}\). It is used in the next theorem.

Theorem 3

If a class \(\Sigma \) of \(CL_2\)-terms contains normalizable terms only and the derivation \({\mathfrak {D}}\) is encoded by a term in \(\Sigma \), then the conclusion of \({\mathfrak {D}}\) is also the conclusion of some normal derivation.

Proof

if a \(CL_2\)-term c is formed by the method described in the proof of Proposition 9.1, it will contain no subterms which begin with a \(\Xi \)-term but which are not generalizations. Moreover, a term of the shape \(\Xi _{\forall \phi (F\rightarrow G)}ab\) where a and b are generalizations does not lose that shape under reduction of terms, so if c has the normal form \(c'\) and a deduction of \(c' \in \) Ir(A) is formed, the deduction will make no use of rules (f) and (g) and therefore, by the discussion at the beginning of this section, the image of \(c'\) under the statements-to-sequents transformation will be a normal derivation in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\). This suffices to establish the theorem.

It is hoped (see the last section) that future proofs of normalizability will correlate, for some proper subsets \(\Sigma \) of the set of all CL\(_2\)-terms, each such \(\Sigma \) with an initial case of transfinite induction which suffices to prove the normalizability of all the terms in \(\Sigma \). Just as Theorem 1 showed that, for each such \(\Sigma \), a proof of the normalizability of all terms in \(\Sigma \) entails the normalizability of a certain related class of \(\Lambda _2\)-terms, so theorem 3 shows that the normalizability of terms in \(\Sigma \) entails the normalizability of derivations encoded by terms in \(\Sigma \). It is shown, further, in [24] (especially pp. 346 and 351f.), that normalizability of \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\) entails normalizability of intuitionist second-order logic generally.

Although the discussion at the beginning of this section showed that applying the statements-to-sequents transformation to a deduction in Ir using rules (a)–(e) only yields a proof in \(\rightarrow \) \(\forall ^2 \mathbf {Nip} ^2\), this does not apply to deductions in CL\(_2\), which in shape resemble axiomatic deductions of second order logic rather than natural deduction ones.

Section 5.3 of [19] indicates how CL\(_{\rightarrow }\)-terms can be read as encoding axiomatic deductions in intuitionist implicational logic. A popular way (perhaps first used in [13], pp. 80–85) of axiomatizing quantificational logic (first or second order), in such a way that no specifically quantificational rules of inference are needed but only modus ponens, is to stipulate that every generalization of an axiom is also an axiom and to adopt, as specifically quantificational axioms, all formulae of the following shapes:

  • \(\forall \phi A(\phi )\rightarrow A(C)\)

  • \(A\rightarrow \forall \phi A\)   where \(\phi \) does not occur free in A

  • \(\forall \phi (F\rightarrow G)\rightarrow \forall \phi F\rightarrow \forall \phi G\)   where FG are as in Table 4.1

But these three schemata are just the schemata exemplified by the types of \({{\mathbf {\mathsf{{U}}}}}\)-terms, \(\Theta \)-terms and \(\Xi \)-terms respectively (Table 4.1). Closure of the set of axioms under generalization is obtained by using the function \(a \mapsto \Pi _{\alpha ,\phi }a\) (Table 4.2).

10 Further possible applications

The Church-style and de Bruijn-style systems mentioned in Sect. 2 above are often (e.g. [1], Sect. 3) contrasted with Curry-style systems, also called theories of type-assignment. Section 4.1 of [1] defines a theory of this kind, which assigns polymorphic types to \(\Lambda \)-terms. This can easily be turned into a theory which assigns polymorphic types to CL-terms (i.e., untyped combinatory terms), simply by reading the variables which, in the formulation of the theory, range over \(\Lambda \)-terms as ranging over their H-transforms instead (here “H-transform” refers not to the H-transformation of Definition 6.1 but to a simpler H-transformation which takes \(\Lambda \) into CL: see p. 212 of [4]). The theory of type-assignment formed in this way shall be called Curry-style CL\(_2\).

It is obvious that from a term a in CL\(_2(A)\), a unique deduction in CL\(_2\) of \(a \in \) CL\(_2(A)\) can be reconstructed. Thus a can be read as encoding a deduction of the fact that a itself has the type it does have. Somewhat similarly, a CL\(_2\)-term of type A can be read as encoding a deduction in Curry-style CL\(_2\) that a certain CL-term can be assigned the type A. How it can be so read should be clear from a reading of Sect. 5A of [8], where \(\Lambda _{\rightarrow }\)-terms are used to encode deductions in a Curry-style theory that assigns simple types to those \(\Lambda \)-terms that can be formed from \(\Lambda _{\rightarrow }\)-terms by erasing their types.

Normalizability moreover is carried over from \(\Lambda _{\rightarrow }\)-terms to their erasures ([8], Sect. 5C). Something similar has been done with CL\(_2\)-terms (see [21] for a first attempt), but the notion of erasure that needs to be defined is more complicated and things do not work as neatly as they do in the case of \(\Lambda _{\rightarrow }\)-terms. In any event, it has to be admitted that not many people will necessarily wish to master the machinery of the present paper for the sake of this application alone.

11 Conclusion

An ordinally informative proof of the normalizability of a class of CL\(_2\)-terms will yield an ordinally informative proof of the normalizability both of a certain class of \(\Lambda _2\)-terms and of a certain fragment of second-order intuitionist logic. There is some reason, moreover, to expect that such proofs will be easier to achieve by concentrating on CL\(_2\)-terms than by tackling either \(\Lambda _2\)-terms or second-order logic directly. It is hoped that future work will vindicate this claim.