1 Introduction

Due to recent advances in the field of automated reasoning, formal systems for mechanizing mathematical reasoning are attracting a lot of interest [see, e.g., Kamareddine (2003), Campbell et al. (2008), Constable et al. (1986), Rudnicki (1992)]. Most of these systems go beyond first-order logic (FOL), because the latter is too weak for this task: one cannot even provide in it a categorical characterization of the most basic concept of mathematics—the natural numbers. While FOL is too weak, using second-order logic (SOL) for this task has many disadvantages too. First of all, SOL has doubtful semantics, as it is based on debatable ontological commitments. Moreover, it does not seem satisfactory that dealing with basic notions (such as the natural numbers) requires the use of the strong notions involved in SOL, such as quantification over all subsets of infinite sets. In addition, SOL is difficult to deal with from a proof-theoretical point of view.

The above considerations indicate that the most suitable framework for mechanizing mathematical reasoning should be provided by some logic which is intermediate between FOL and SOL. There are several natural candidates for this task that have been suggested in the literature, such as weak second-order logic, \(\omega \)-logic, cardinality logic, etc. In Shapiro (1991) it was shown how the natural numbers can be categorically characterized in all these logics. We believe that the most suitable choice among them for the task of formalizing mathematics is ancestral logic (AL)—the logic obtained by augmenting FOL with the concept of transitive closure of a given relation. Indeed, ancestral logic provides the most suitable framework for the formalization of the recursive definitions of fundamental concepts which are characteristic for basic mathematics [see, e.g., Avron (2003), Shapiro (1991), Smith (2008)].

The expressive power of ancestral logic is equivalent to that of the other candidates, in the sense that any class of infinite structures definable by one of them can be defined by ancestral logic. However, there are several reasons to prefer it over the others. One of them is that it seems to be the easiest choice from a proof-theoretic point of view. Another important reason is simply the simplicity of the notion of transitive closure. Any person, even with no mathematical background whatsoever, can easily grasp the concept of the ancestor of a given person (or, in other words, the idea of transitive closure of a certain binary relation).

Here are some examples of the use of transitive closure in every day life:

  • The transitive closure of the relation “x is a child of y” is: “x is a descendant of y”. We often use this transitive relation to make inferences, such as: if a disease is hereditary, i.e. transferred from parent to child, and one of my ancestors had this disease, then I’ll have this disease too.

  • A mathematical example: Understanding the concept of the natural numbers is basically understanding that every number is a descendant of zero through the successor relation. Also, understanding the concept of a well-formed formula in formal logic involves applying certain operations again and again starting from a class of atoms. Thus, the understanding of basic arithmetic and basic logic relies on the understanding of the idea of the transitive closure.

The examples above (especially the last one) show that any system designed for capturing the ability to do mathematics must provide the means to create the transitive closure of a relation and to make appropriate inferences regarding it. The examples also show that our basic understanding of the transitive closure operator involves two components: the ability to construct a new binary relation from a given one (the transitive closure of the given relation), and the ability to infer that if a certain property is hereditary between objects in a given relation, then it will also be hereditary between objects which are related by the new relation.

Most of the works on ancestral logic have so far been carried out in the context of finite model theory [see, e.g., Ebbinghaus et al. (1995)]. Clearly, the focus on finite structures renders these works irrelevant for the task of formalizing mathematics. Moreover, most of this research has been dedicated to model theory, while for mechanizing mathematics we need useful proof systems.

In this paper we suggest a complementary view of ancestral logic by investigating it from a proof-theoretical point of view. We first review the basic definitions and present some of the most important model-theoretic properties of ancestral logic. Then, we go on to develop useful proof systems for ancestral logic. A first step in this direction was done in Avron (2003) where a Gentzen-style system for the non-reflexive transitive closure operator was presented. Therein it was stated that: “a major research task here is to find out what other rules (if any) should be added in order to make the system ‘complete’ in some reasonable sense”. The main goal of this work is to provide an answer to this question. We show that the system proposed in Avron (2003) is too weak, as it fails to prove certain fundamental properties of the transitive closure operator. We then take further steps towards a useful proof system for ancestral logic by proposing a stronger system, \(TC_{G}\), which is sound for this logic and apparently encompasses all forms of reasoning for this logic that are used in practice. \(TC_{G}\) is proven to be equivalent to Hilbert-style systems previously suggested in the literature for the reflexive transitive closure, in the sense that there are translation algorithms between them that preserve provability. In the context of our system we also investigate two crucial proof-theoretical properties: cut elimination and the existence of a constructive consistency proof. Unfortunately, it turns out that the generalization of PA’s induction rule employed in our system for ancestral logic renders Gentzen’s standard methods for analyzing PA inapplicable. Nevertheless, we do show that in the case of arithmetic the ordinal number of the systems is \(\varepsilon _{0}\), like the ordinal of PA.

2 First-order languages augmented by a transitive closure operator

A standard mathematical definition of the transitive closure of a binary relation is as follows.

Definition 1

Let X be a set and \(R\subseteq X\times X\) be a binary relation on X. The transitive closure operator \(TC_{R}\) of the relation R is the smallest relation \(TC_{R}\subseteq X\times X\) such that the following holds:

  1. 1.

    \(R\subseteq TC_{R}\).

  2. 2.

    \(TC_{R}\) is transitive.

The relation \(TC_{R}\) exists for any binary relation R. To see this, note that there exists at least one transitive relation containing R, the trivial one: \(X\times X\). Furthermore, the intersection of any family of transitive relations is again transitive. Hence, the transitive closure of R is the intersection of all transitive relations containing R.

Definition 1 is an impredicative definition. A more constructive, predicative characterization can be obtained as follows:

Definition 2

Let X be a set and \(R\subseteq X\times X\) be a binary relation on X. The transitive closure operator\(TC_{R}\) of the relation R is defined by

$$\begin{aligned} TC_{R}=\underset{{\scriptscriptstyle {\scriptstyle n\in \mathbb {N}^{+}}}}{\bigcup }R^{n}, \end{aligned}$$

where \(R^{n}\) is defined by

$$\begin{aligned} R^{n}={\left\{ \begin{array}{ll} R, \qquad \qquad \,\,\,\mathrm {if}\,\, n=1,\\ R^{n-1}\circ R, \,\,\mathrm {otherwise}. \end{array}\right. } \end{aligned}$$

Ancestral logic is an extension of first-order logic obtained by augmenting FOL with a transitive closure operator. The essential idea in embedding the concept of the transitive closure operator into a logical framework is that one may treat a first-order formula with two (assigned) free variables as a definition of a binary relation. Below are the corresponding formal definitions of a first-order logic augmented by the transitive closure operator, and its semantics.

In this paper \(\sigma \) denotes a first-order signature with equality and \(L^{1}\left( \sigma \right) \) is the first-order language based on \(\sigma \). A structure for a first-order language based on \(\sigma \) is an ordered pair \(M=\left\langle D,I\right\rangle \), where D is a non-empty set of elements (the domain) and I is an interpretation function on \(\sigma \). To avoid confusion regarding parentheses, we use ( , ) for parentheses in a formal language, and [ , ] for parentheses in the metalanguage.

Definition 3

Let \(\sigma \) be a signature for a first-order language with equality, and let \(M=\left\langle D,I\right\rangle \) be a structure for \(\sigma \) and v an assignment in M.

  • The language \(L_{TC}\left( \sigma \right) \) is defined as the first-order language based on \(\sigma \), with the addition of the TC operator defined by: for any formula \(\varphi \) in \(L_{TC}\left( \sigma \right) \), xy distinct variables, and st terms, \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) is a formula in \(L_{TC}\left( \sigma \right) \). The free occurrences of x and y in \(\varphi \) become bound in this formula.

  • The pair \(\left\langle M,v\right\rangle \) is said to satisfy \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) if there exist \(a_{0},...,a_{n}\in D\) (\(n>0\)) such that \(v[s]=a_{0},\, v[t]=a_{n}\), and \(\varphi \) is satisfied by M and \(v[x:=a_{i},\, y:=a_{i+1}]\)Footnote 1 for \(0\le i\le n-1\).

The logic obtained is called Ancestral Logic and it is denoted by \(\mathcal {L}_{TC}\).

The first (as far as we know) to suggest expanding first-order logic by the TC operator was Martin in Martin (1943, 1949). Actually, Martin used a generalized form of the transitive closure operator. He expanded first-order logic by adding for each \(n\in \mathbb {N}\) a \(TC^{n}\) operator which, when applied to an 2n-ary predicate, produce a new 2n-ary predicate. In Myhill (1952), Myhill presented a first-order logic augmented only by the operator \(TC^{1}\), but together with the introduction of ordered pairs into the language. The expressive power of the logic presented by Martin turns out to be the same as that of the logic presented by Myhill.

In the semantics presented in this paper, \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) requires that there should be at least one \(\varphi \)-step between s and t. However, another well-studied form of the transitive closure operator is its reflexive form, RTC.

Definition 4

Let \(\sigma \) be a first-order signature, and let \(M=\left\langle D,I\right\rangle \) be a structure for \(\sigma \) and v an assignment in M.

  • The language \(L_{RTC}\left( \sigma \right) \) is defined as \(L_{TC}\left( \sigma \right) \) with TC replaced by RTC.

  • The pair \(\left\langle M,v\right\rangle \) is said to satisfy\(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \) if \(s=t\) or there exist \(a_{0},...,a_{n}\in D\) (\(n>0\)) such that \(v[s]=a_{0},\, v[t]=a_{n}\), and \(\varphi \) is satisfied by M and \(v[x:=a_{i},\, y:=a_{i+1}]\) for \(0\le i\le n-1\).

Similarly, the logic obtained is denoted by \(\mathcal {L}_{RTC}\).

In the presence of equality, the two forms of the transitive closure operator are definable in terms of each other. The reflexive transitive closure operator is definable using the non-reflexive form by

$$\begin{aligned} \left( RTC_{x,y}\varphi \right) \left( s,t\right) :=\left( TC_{x,y}\varphi \right) \left( s,t\right) \vee s=t, \end{aligned}$$

while the non-reflexive TC operator is definable by either one of the following forms (which can be easily shown to be equivalent):

$$\begin{aligned} \left( TC_{x,y}\varphi \right) \left( s,t\right) := & {} \exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \\= & {} \exists z\left( \left( RTC_{x,y}\varphi \right) \left( s,z\right) \wedge \varphi \left\{ \frac{z}{x},\frac{t}{y}\right\} \right) \\= & {} \exists z\exists u\left( \left( RTC_{x,y}\varphi \right) \left( s,z\right) \wedge \varphi \left\{ \frac{z}{x},\frac{u}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( u,t\right) \right) , \end{aligned}$$

where uz are fresh variables.Footnote 2

One difference between the two forms is the ability to define quantifiers. As shown in Avron (2003), the existential quantifier can be defined using the TC operator by:

$$\begin{aligned} \exists x\varphi :=\left( TC_{u,v}\left( \varphi \left\{ \frac{u}{x}\right\} \vee \varphi \left\{ \frac{v}{x}\right\} \right) \right) \left( s,t\right) \end{aligned}$$

However, it cannot be defined using the RTC operator, as proven below.

Proposition 1

The existential quantifier is not definable in the quantifier-free fragment of \(\mathcal {L}_{RTC}\).

Proof

Take \(\sigma \) to consist of a constant symbol 0 and a unary predicate symbol P. It can be easily shown by induction that each quantifier-free sentence \(\psi \) in \(\mathcal {L}_{RTC}^{\sigma }\)Footnote 3 is logically equivalent to one of the following sentences: \(P\left( 0\right) \), \(\lnot P\left( 0\right) \), \(0=0\), or \(0\ne 0\). Since \(\exists xP\left( x\right) \) is clearly not logically equivalent to any of these four sentences, we conclude that the existential quantifier cannot be defined in the quantifier-free fragment of \(\mathcal {L}_{RTC}\).\(\square \)

A simple compactness argument shows that in general the transitive closure operator, TC, is not first-order definable [see, e.g., Fagin (1974), Aho and Ullman (1979)]. However, there are cases in which there is a first-order sentence equivalent to \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \). The obvious case is when \(\varphi \) is a valid formula, since then \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) is also a valid formula. This case is a special case of the following type of formulas.

Definition 5

Let \(\varphi \) be a formula in \(L^{1}\left( \sigma \right) \). \(\varphi \) is called transitive (w.r.t x and y) if for every structure M for \(\sigma \), every assignment v for M, and every \(a,b,c\in D\): if \(M,v\left[ x:=a,y:=b\right] \models \varphi \) and \(M,v\left[ x:=b,y:=c\right] \models \varphi \), then \(M, v\left[ x:=a,y:=c\right] \models \varphi \).

Example 1

The formula \(P\left( x\right) \wedge P\left( y\right) \) is transitive, while \(P\left( x\right) \vee P\left( y\right) \) is not.

Proposition 2

If \(\varphi \) is a transitive formula in \(L^{1}\left( \sigma \right) \), then \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) is definable in \(L^{1}\left( \sigma \right) \).

Proof

The result follows immediately from the fact that for every transitive formula \(\varphi \) and st terms in \(L^{1}\left( \sigma \right) \), \(M,v\models \left( TC_{x,y}\varphi \right) \left( s,t\right) \) iff \(M,v\models \varphi \left\{ \frac{s}{x},\frac{t}{y}\right\} \), for every structure M for \(\sigma \) and assignment v for M.\(\square \)

As mentioned, \(\exists xP\left( x\right) \) is definable by the formula \(\left( TC_{u,v}\left( P\left( u\right) \vee P\left( v\right) \right) \right) \left( s,t\right) \). Thus, \(P\left( u\right) \vee P\left( v\right) \) is an example of a non-transitive formula whose transitive closure is first-order definable. Thus far, there is no complete characterization of the set of first-order logic formulas whose transitive closure is definable in first-order logic.

Though the TC operator cannot be defined in first-order logic, it is definable in second-order logic. Thus, the formula \(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \) can be defined by the second-order formula \(\forall X\left( \left( Xs\wedge \forall x\forall y\left( \varphi \left( x,y\right) \wedge Xx\rightarrow Xy\right) \right) \rightarrow Xt\right) \), from which follows that the TC operator is also second-order definable.

The concept of the transitive closure operator is embedded in our understanding of the natural numbers. Therefore, it is only natural to explore the expressive power of various first-order languages for arithmetic augmented by the TC operator. Let 0 be a constant symbol and s a unary function symbol. It is known that in \(\mathcal {L}_{TC}^{\left\{ 0,s\right\} }\), together with the standard axioms for the successor function, the following sentence categorically characterizes the natural numbers:

$$\begin{aligned} \forall x\left( x=0\vee \left( TC_{w,u}\left( s(w)=u\right) \right) \left( 0,x\right) \right) \end{aligned}$$
(1)

In Avron (2003) it was shown that all recursive functions and relations are definable in \(\mathcal {L}_{TC}^{\left\{ 0,s,+\right\} }\), where \(+\) is a binary function symbol, though in the absence of ordered pairs, one cannot define addition in \(\mathcal {L}_{TC}^{\left\{ 0,s\right\} }\). This implies that the upward Lwenheim–Skolem theorem fails for ancestral logic, and AL is finitary, i.e. the compactness theorem fails for it. Moreover, AL is not even arithmetic, thus any formal deductive system which is sound for AL is incomplete.

Though more expressive than FOL, ancestral logic does not offer all the wealth of SOL. Thus, it follows from the downward Lwenheim–Skolem theorem that the real numbers cannot be characterized up to isomorphism in it (while they can be characterized in SOL). The same is true for the notion of well-ordering.

An important indication that the expressive power of ancestral logic captures a very significant and natural fragment of SOL is provided by the fact that it is equivalent Shapiro (1991) to several other intermediate logics between first-order logic and second-order logic that have been suggested and investigated in the literature. This includes weak second-order logic, logics with a “cardinality quantifier”, and logics with Henkin quantifiers [see Shapiro (1991), Henkin (1961), Yasuhara (1966)]. The advantages of ancestral logic over these logics are that it is particularly natural (as explained above), and, no less important, it seems much easier to develop an adequate proof system for it than for the other logics mentioned above. This will be demonstrated in the next section.

3 Formal proof systems for ancestral logic

3.1 Gentzen-style systems for ancestral logic

Among the various intermediate logics between FOL and SOL mentioned in the introduction, ancestral logic seems to be the easiest choice from a proof-theoretical point of view. We now turn to substantiate this claim. Ideally, we would like to have a consistent, sound, and complete axiomatic system for ancestral logic. However, since there can be no sound and complete system for it, one should instead look for useful and effective partial formal systems that are still adequate for formalizing mathematical reasoning. The systems defined in this section are extensions of Gentzen-style system for classical first-order logic with equality, \(\mathcal {LK}_{=}\)Gentzen (1935).

In what follows the letters \(\varGamma ,\Delta \) represent finite (possibly empty) multisets of formulas, \(\varphi ,\psi ,\phi \) arbitrary formulas, xyzuv variables, and rst terms. For convenience, we shall denote a sequent of the form \(\varGamma \Rightarrow \left\{ \varphi \right\} \) by \(\varGamma \Rightarrow \varphi \), and employ other standard abbreviations, such as \(\varGamma ,\Delta \) instead of \(\varGamma \cup \Delta \). To improve readability, in some derivations we omit the context from the sequents. Also, for readability, frequently we shall not distinguish between the sequents \(\varphi \wedge \psi ,\varGamma \Rightarrow \Delta \) and \(\varphi ,\psi ,\varGamma \Rightarrow \Delta \), or \(\varGamma \Rightarrow \Delta ,\varphi \vee \psi \) and \(\varGamma \Rightarrow \Delta ,\varphi ,\psi \), as they are provable from one another using cuts. We employ the following standard abbreviation for a sub-proof P ending with the sequent S:

In Martin (1943, 1949), Myhill (1952) Martin and Myhill suggested two equivalent Hilbert-style systems for ancestral logic in which the reflexive transitive closure operator was taken as primitive. Below is a Gentzen-style proof system for the RTC operator which is equivalent to the Hilbert-style systems presented in these original papers.

Definition 6

(The system\({RTC_{G}}\)) The system \(RTC_{G}\) is defined by adding to \(\mathcal {LK}_{=}\) the axiom

$$\begin{aligned} \varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,s\right) \end{aligned}$$
(2)

and the following inference rules:

(3)
(4)
(5)

In all three rules we assume that the terms which are substituted are free for substitution and that no forbidden capturing occurs. In Rule (5) x should not occur free in \(\varGamma \) and \(\Delta \), and y should not occur free in \(\varGamma ,\Delta \) and \(\psi \).

Rule (5) is a generalized induction principle which states that if t is a \(\varphi \)-descendant of s or equal to it, then if s has some hereditary property which is passed down from one object to another if they are \(\varphi \)-related, then t also has that property.

We next show that \(RTC_{G}\) is adequate for RTC, in the sense that it does give the RTC operator the intended meaning of the reflexive transitive closure, and can derive all fundamental rules concerning the RTC operator that have been suggested in the literature (as far as we know).

Proposition 3

The following rules are derivable in \(RTC_{G}\)Footnote 4:

$$\begin{aligned}&\frac{\varGamma \Rightarrow \Delta ,\varphi \left\{ \frac{s}{x},\frac{r}{y}\right\} \,\,\,\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( r,t\right) }{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) }\nonumber \\&\quad \frac{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,r\right) \,\,\,\varGamma \Rightarrow \Delta ,\varphi \left\{ \frac{r}{x},\frac{t}{y}\right\} }{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) } \end{aligned}$$
(6)
$$\begin{aligned}&\frac{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) }{\varGamma \Rightarrow \Delta ,s=t,\exists z\left( \left( RTC_{x,y}\varphi \right) \left( s,z\right) \wedge \varphi \left\{ \frac{z}{x},\frac{t}{y}\right\} \right) }\nonumber \\&\quad \frac{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) }{\varGamma \Rightarrow \Delta ,s=t,\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) } \end{aligned}$$
(7)
$$\begin{aligned}&\frac{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) }{\varGamma \Rightarrow \Delta ,\left( RTC_{y,x}\varphi \right) \left( t,s\right) }\,\,\,\,\,\,\,\,\frac{\left( RTC_{x,y}\varphi \right) \left( s,t\right) ,\varGamma \Rightarrow \Delta }{\left( RTC_{y,x}\varphi \right) \left( t,s\right) ,\varGamma \Rightarrow \Delta } \end{aligned}$$
(8)
$$\begin{aligned}&\frac{\varGamma \Rightarrow \Delta ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) }{\varGamma \Rightarrow \Delta ,\left( RTC_{u,v}\varphi \left\{ \frac{u}{x},\frac{v}{y}\right\} \right) \left( s,t\right) }\,\,\,\,\,\,\,\,\frac{\left( RTC_{x,y}\varphi \right) \left( s,t\right) ,\varGamma \Rightarrow \Delta }{\left( RTC_{u,v}\varphi \left\{ \frac{u}{x},\frac{v}{y}\right\} \right) \left( s,t\right) ,\varGamma \Rightarrow \Delta } \end{aligned}$$
(9)
$$\begin{aligned}&\frac{\varGamma ,\varphi \Rightarrow \Delta ,\psi }{\varGamma ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow \Delta ,\left( RTC_{x,y}\psi \right) \left( s,t\right) } \end{aligned}$$
(10)
$$\begin{aligned}&\frac{\left( RTC_{x,y}\varphi \right) \left( s,t\right) ,\varGamma \Rightarrow \Delta }{\left( RTC_{u,v}\left( RTC_{x,y}\varphi \right) \left( u,v\right) \right) \left( s,t\right) ,\varGamma \Rightarrow \Delta } \end{aligned}$$
(11)
$$\begin{aligned}&\frac{\varphi \left\{ \frac{s}{x}\right\} ,\varGamma \Rightarrow \Delta }{\left( RTC_{x,y}\varphi \right) \left( s,t\right) ,\varGamma \Rightarrow s=t,\Delta }\,\,\,\,\,\,\,\,\frac{\varphi \left\{ \frac{t}{y}\right\} ,\varGamma \Rightarrow \Delta }{\left( RTC_{x,y}\varphi \right) \left( s,t\right) ,\varGamma \Rightarrow s=t,\Delta } \end{aligned}$$
(12)

Conditions:

  • In all the rules we assume that the terms which are substituted are free for substitution and that no forbidden capturing occurs.

  • In (7) z should not occur free in \(\varGamma ,\Delta \) and \(\varphi \left\{ \frac{s}{x},\frac{t}{y}\right\} \).

  • In (9) the conditions are the usual ones concerning the \(\alpha \)-rule.

  • In (10) xy should not occur free in \(\varGamma ,\Delta \).

  • In (11) uv should not occur free in \(\varphi \).

  • In (12) y should not occur free in \(\varGamma ,\Delta \) or s in the left rule, and x should not occur free in \(\varGamma ,\Delta \) or t in the right rule.

Proof

In the following proof we omit the contexts \(\varGamma ,\Delta \) from the sequents in the derivations.

  • The first rules in (6):

    The proof of the second rule in (6) is analogous.

  • The first rule in (7): Consider the following proof \(P_{1}\):

    The sequent \(\left( RTC_{x,y}\varphi \right) \left( s,w\right) ,\varphi \left\{ \frac{w}{x},\frac{y}{y}\right\} \Rightarrow \left( RTC_{x,y}\varphi \right) \left( s,y\right) \) is provable in \(RTC_{G}\) using (6). Thus, we can construct the following \(P_{2}\):

    Denote by \(A\left( y\right) \) the formula \(\exists w\left( \left( RTC_{x,y}\varphi \right) \left( s,w\right) \wedge \varphi \left\{ \frac{w}{x},\frac{y}{y}\right\} \right) \vee s=y\). From \(P_{1}\) and \(P_{2}\) we obtain a proof of the sequent \(A\left( y\right) ,\varphi \left\{ \frac{y}{x},\frac{z}{y}\right\} \Rightarrow A\left\{ \frac{z}{y}\right\} \), from which, using Rule (5), we get \((*)\, A\left\{ \frac{s}{y}\right\} ,\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow A\left\{ \frac{t}{y}\right\} \). Since \(\Rightarrow A\left\{ \frac{s}{y}\right\} \) is derivable from the equality axiom \(\Rightarrow s=s\), applying a cut on it and on \((*)\), followed by another cut on the results and the premiss of Rule (7), we get the desired sequent.

The proof of the second rule in (7) is symmetric.

  • The left rule in (8): Clearly, \(s=t\Rightarrow \left( RTC_{y,x}\varphi \right) \left( t,s\right) \) is provable in \(RTC_{G}\) using Axiom (2). The sequent \(\varphi \left( x,y\right) ,\left( RTC_{y,x}\varphi \right) \left( x,s\right) \Rightarrow \left( RTC_{y,x}\varphi \right) \left( y,s\right) \) is also provable in \(RTC_{G}\) using (6). Thus, we can construct the following proof : 

    The sequent \(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow s=t,\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \) is provable in \(RTC_{G}\) using Rule (7). From this, by two cuts, we obtain:

    $$\begin{aligned} \vdash _{RTC_{G}}\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow \left( RTC_{y,x}\varphi \right) \left( t,s\right) \end{aligned}$$

    An additional cut on the premiss of Rule (8) results in the desired sequent. The proof of the right rule is symmetric.

  • The left rule in (9): In \(RTC_{G}\) the sequent \(s=t\Rightarrow \left( RTC_{u,v}\varphi \left\{ \frac{u}{x},\frac{v}{y}\right\} \right) \left( s,t\right) \) is provable. By a method similar to the one used in the proof of (8) we get:

    $$\begin{aligned} \vdash _{RTC_{G}}\exists z\left( \left( RTC_{x,y}\varphi \right) \left( s,z\right) \wedge \varphi \left\{ \frac{z}{x},\frac{t}{y}\right\} \right) \Rightarrow \left( RTC_{u,v}\varphi \left\{ \frac{u}{x},\frac{v}{y}\right\} \right) \left( s,t\right) \end{aligned}$$

    Applying cuts and Rule (7) results in:

    $$\begin{aligned} \vdash _{RTC_{G}}\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow \left( RTC_{u,v}\varphi \left\{ \frac{u}{x},\frac{v}{y}\right\} \right) \left( s,t\right) \end{aligned}$$

    An additional cut on the premiss of Rule (9) results in the desired sequent. The proof of the right rule is symmetric.

  • Rule (10): Consider the following two proofs: \(P_{1}\):

    \(P_{2}:\)

    Thus from \(P_{1}\) and \(P_{2}\) we can obtain a proof of the sequent \(\varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \Rightarrow \left( RTC_{x,y}\psi \right) \left( s,t\right) \), from which we can obtain a proof of \(\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \Rightarrow \left( RTC_{x,y}\psi \right) \left( s,t\right) \). Clearly, the sequent \(s=t\Rightarrow \left( RTC_{y,x}\psi \right) \left( s,t\right) \) is provable in \(RTC_{G}\) using Axiom (2). Using Rule (7) we get \(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow s=t,\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \), and two cuts result in a proof of \(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow \left( RTC_{x,y}\psi \right) \left( s,t\right) \)

  • Rule (11): Rule (4) entails the existence of a proof in \(RTC_{G}\) of the sequent \(\left( RTC_{x,y}\varphi \right) \left( s,u\right) ,\left( RTC_{x,y}\varphi \right) \left( u,v\right) \Rightarrow \left( RTC_{x,y}\varphi \right) \left( s,v\right) \). Applying Rule (5) results in a proof of \(\left( RTC_{x,y}\varphi \right) \left( s,s\right) ,\left( RTC_{u,v}\left( RTC_{x,y}\varphi \right) \left( u,v\right) \right) \left( s,t\right) \Rightarrow \left( RTC_{x,y}\varphi \right) \left( s,t\right) \). Since \(\Rightarrow \left( RTC_{x,y}\varphi \right) \left( s,s\right) \) is an axiom of \(RTC_{G}\), a cut results in a proof of \(\left( RTC_{u,v}\left( RTC_{x,y}\varphi \right) \left( u,v\right) \right) \left( s,t\right) \Rightarrow \left( RTC_{x,y}\varphi \right) \left( s,t\right) \).

  • The left rule in (12): From the sequent \(\varphi \left\{ \frac{s}{x}\right\} \Rightarrow \), by standard rules of \(\mathcal {LK}_{=}\), we can derive the sequent: \(\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \Rightarrow \), where z is a fresh variable. By Rule (7) we can obtain \(\vdash _{RTC_{G}}\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow s=t,\exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi \right) \left( z,t\right) \right) \). Thus, a cut results in a proof of \(\left( RTC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow s=t\). The proof of the right rule in (12) is analogous.

\(\square \)

In Avron (2003) a Gentzen-style system for the non-reflexive transitive closure operator was presented. Below is a proof system for the non-reflexive transitive closure operator which is an extansion of the one suggested in Avron (2003).Footnote 5

Definition 7

(The system\(TC{}_{G}\)) The system \(TC_{G}\) is defined by adding to \(\mathcal {LK}_{=}\) the axiom

$$\begin{aligned} \left( TC_{x,y}\varphi \right) \left( s,t\right) \Rightarrow \varphi \left\{ \frac{s}{x},\frac{t}{y}\right\} \vee \exists z\left( \varphi \left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\varphi \right) \left( z,t\right) \right) \end{aligned}$$
(13)

and the following inference rules

(14)
(15)
(16)

The same restrictions on the rules in \(RTC_{G}\) apply here, and in Axiom (13) z is a fresh variable.

Proposition 4

In \(TC_{G}\) all the TC-counterparts of the rules in Proposition 3 are derivable.

We denote the system presented in Avron (2003), which is obtained from \(TC_{G}\) by discarding Axiom (13), by \(TC'_{G}\). We start by showing that the system \(TC_{G}\) suggested here is indeed a proper extension of \(TC'_{G}\).

Proposition 5

Axiom (13) is independent of the other rules in \(TC_{G}\), i.e. it is unprovable in \(TC'_{G}\).

Proof

Suppose the sequent (13) is derivable in \(TC'_{G}\). It is easy to see that all the rules in \(TC'_{G}\) remain valid and derivable in \(RTC_{G}\) if we replace the operator TC with RTC. Hence, the corresponding sequent for RTC is provable in \(RTC_{G}\). However, it is obviously not valid, since \(\left( RTC_{x,y}\varphi \right) \left( s,s\right) \) holds for all s and \(\varphi \). In general, any sequent that is valid only for the TC operator and not for the RTC operator will not be derivable in \(TC'_{G}\).\(\square \)

Since each of the two forms of the transitive closure operator can be expressed in terms of the other, it is interesting to explore the connection between the two systems.

Definition 8

Define recursively two interpretations, \('\) from \(\mathcal {L}_{RTC}\) to \(\mathcal {L}_{TC}\) and \(*\) from \(\mathcal {L}_{TC}\) to \(\mathcal {L}_{RTC}\), as follows:

  • \(\varphi '=\varphi ^{*}=\varphi \), for \(\varphi \) atomic formula.

  • \(\left( \lnot \varphi \right) ^{*}=\lnot \varphi ^{*}\) and \(\left( \lnot \varphi \right) '=\lnot \varphi '\).

  • \(\left( \varphi \circ \psi \right) ^{*}=\varphi ^{*}\circ \psi ^{*}\) and \(\left( \varphi \circ \psi \right) '=\varphi '\circ \psi '\), where \(\circ \in \left\{ \wedge ,\vee ,\rightarrow \right\} \).

  • \(\left( Qx\varphi \right) ^{*}=Qx\varphi ^{*}\) and \(\left( Qx\varphi \right) '=Qx\varphi '\), where \(Q\in \left\{ \forall ,\exists \right\} \).

  • \(\left( \left( TC_{x,y}A\right) \left( s,t\right) \right) ^{*}=\exists z\left( A^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}A^{*}\right) \left( z,t\right) \right) \)

  • \(\left( \left( RTC_{x,y}A\right) \left( s,t\right) \right) '=\left( TC_{x,y}A'\right) \left( s,t\right) \vee s=t\)

We use the standard abbreviations: \(\varGamma ^{*}\) for \(\left\{ \varphi ^{*}|\varphi \in \varGamma \right\} \) and \(\varGamma '\) for \(\left\{ \varphi '|\varphi \in \varGamma \right\} \).

First we show that the above interpretations preserve provability (i.e., any theorem of \(TC{}_{G}\) can be translated into a theorem of \(RTC_{G}\), and vice versa), and as such, they are considered as translations between the two systems [see, e.g. Prawitz and Malmnäs (1968)].Footnote 6

Proposition 6

The following holds:

  1. 1.

    \(\vdash _{TC_{G}}\varGamma \Rightarrow \Delta \) implies \(\vdash _{RTC_{G}}\varGamma ^{*}\Rightarrow \Delta ^{*}\).

  2. 2.

    \(\vdash _{RTC_{G}}\varGamma \Rightarrow \Delta \) implies \(\vdash _{TC_{G}}\varGamma ^{'}\Rightarrow \Delta ^{'}\).

Proof

The proof of (1) is carried out by induction on the proof in \(TC_{G}\). We state here only the cases concerning the TC operator.

  • Axiom (13): We need to show that \(\exists z\left( \varphi ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi ^{*}\right) \left( z,t\right) \right) \Rightarrow \varphi ^{*}\left\{ \frac{s}{x},\frac{t}{y}\right\} \vee \exists z\left( \varphi ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \exists w\left( \varphi ^{*}\left\{ \frac{z}{x},\frac{w}{y}\right\} \wedge \left( RTC_{x,y}\varphi ^{*}\right) \left( w,t\right) \right) \right) \) is provable in \(RTC_{G}\). This can easily be obtained from Rule (7) using the standard rules of \(\mathcal {LK}_{=}\).

  • Rule (14): An application of Rule (14) can be transformed into the following derivation:

  • Rule (15): Rule (6) entails the existence of a proof in \(RTC_{G}\) of the sequent \(\exists z\left( \varphi ^{*}\left\{ \frac{r}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi ^{*}\right) \left( z,t\right) \right) \Rightarrow \left( RTC_{x,y}\varphi ^{*}\right) \left( r,t\right) \). A cut on the hypothesis \(\Rightarrow \exists z\left( \varphi ^{*}\left\{ \frac{r}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi ^{*}\right) \left( z,t\right) \right) \) results in a proof of \(\Rightarrow \left( RTC_{x,y}\varphi ^{*}\right) \left( r,t\right) \). Then, we can construct the following derivation:

    The desired sequent is now obtained by one more cut on the second hypothesis \(\Rightarrow \exists z\left( \varphi ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( RTC_{x,y}\varphi ^{*}\right) \left( z,r\right) \right) \).

  • Rule (16): An application of Rule (16) can be transformed into the following derivation:

The proof of (2) is also carried out by induction, and again, we only present here the cases concerning the RTC operator.

  • Axiom (2): The interpretation of the axiom is \(\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,s\right) \vee s=s\), which is easily derivable from the equality axioms.

  • Rule (3): An application of Rule (3) can be transformed into the following derivation:

  • Rule (4): By classical reasoning, to show that \(\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\) is provable from \(\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,r\right) ,s=r\) and \(\Rightarrow \left( TC_{x,y}\varphi '\right) \left( r,t\right) ,r=t\), it suffices to show that the following four sequents are provable:

    • \(\left( TC_{x,y}\varphi '\right) (s,r),r=t\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\), which is derivable using equality rules and Weakening.

    • \(\left( TC_{x,y}\varphi '\right) (r,t),s=r\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\), which is derivable using equality rules and Weakening.

    • \(\left( TC_{x,y}\varphi '\right) (s,r),\left( TC_{x,y}\varphi '\right) (r,t)\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\), which is derivable using rule (15) and Weakening.

    • \(r=t,s=r\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\), which is derivable using equality rules and Weakening.

    Using cuts we obtain a proof of \(\Rightarrow \left( TC_{x,y}\varphi '\right) \left( s,t\right) ,s=t\).

  • Rule (5): An application of rule (5) can be transformed into the following derivation:

\(\square \)

Proposition 7

The following holds:

  1. 1.

    \(\vdash _{TC_{G}}\left( \varphi ^{*}\right) '\Rightarrow \varphi \) and \(\vdash _{TC_{G}}\varphi \Rightarrow \left( \varphi ^{*}\right) '\).

  2. 2.

    \(\vdash _{RTC_{G}}\left( \varphi '\right) ^{*}\Rightarrow \varphi \) and \(\vdash _{RTC_{G}}\varphi \Rightarrow \left( \varphi '\right) ^{*}\).

Proof

The proofs of both (1) and (2) are carried out by induction on \(\varphi \). If \(\varphi \) does not contain the TC or RTC operator, then \(\left( \varphi '\right) ^{*}\) and \(\left( \varphi ^{*}\right) ^{'}\) are equal to \(\varphi \), hence provably equivalent to it.

For (1) assume that \(\varphi :=\left( RTC_{x,y}A\right) \left( s,t\right) \). Thus, \(\left( \varphi '\right) ^{*}\) is the formula \(\exists z\left( \left( A'\right) ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge RTC_{x,y}\left( A'\right) ^{*}\left( z,t\right) \right) \vee s=t\). By the induction hypothesis we have \(\vdash _{RTC_{G}}\left( A'\right) ^{*}\Rightarrow A\), thus by (10) the sequent \(\left( RTC_{x,y}\left( A'\right) ^{*}\right) \left( s,t\right) \Rightarrow \left( RTC_{x,y}A\right) \left( s,t\right) \) is also provable in \(RTC_{G}\). It is easy to check that the sequent \(\exists z\left( \left( A'\right) ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge RTC_{x,y}\left( A'\right) ^{*}\left( z,t\right) \right) \vee s=t\Rightarrow \left( RTC_{x,y}\left( A'\right) ^{*}\right) \left( s,t\right) \) is provable in \(RTC_{G}\) [(using (6) and (2)]. A cut on the last two sequents results in a proof of \(\exists z\left( \left( A'\right) ^{*}\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge RTC_{x,y}\left( A'\right) ^{*}\left( z,t\right) \right) \vee s=t\Rightarrow \left( RTC_{x,y}A\right) \left( s,t\right) \). For the converse, denote \(\exists z\left( \left( A'\right) ^{*}\left\{ \frac{u}{x},\frac{z}{y}\right\} \wedge RTC_{x,y}\left( A'\right) ^{*}\left( z,w\right) \right) \vee s=t\) by \(\psi \) (notice that \(\left( \varphi '\right) ^{*}\) is \(\psi \left\{ \frac{s}{u},\frac{t}{w}\right\} \)). It is easy to see that \(\psi \left\{ \frac{s}{u},\frac{x}{w}\right\} ,\left( A'\right) ^{*}\Rightarrow \psi \left\{ \frac{s}{u},\frac{y}{w}\right\} \) is provable in \(RTC_{G}\). Applying Rule (5) results in a proof of the sequent \(\psi \left\{ \frac{s}{u},\frac{s}{w}\right\} ,\left( RTC_{x,y}\left( A'\right) ^{*}\right) \left( s,t\right) \Rightarrow \psi \left\{ \frac{s}{u},\frac{t}{w}\right\} \). The sequent \(\Rightarrow \psi \left\{ \frac{s}{u},\frac{s}{w}\right\} \) is clearly provable using the equality axiom, thus, a cut entails a proof of the sequent \(\left( RTC_{x,y}\left( A'\right) ^{*}\right) \left( s,t\right) \Rightarrow \left( \varphi '\right) ^{*}\). As before, by the induction hypothesis we have that \(\vdash _{RTC_{G}}A\Rightarrow \left( A'\right) ^{*}\), so by (10) the sequent \(\left( RTC_{x,y}A\right) \left( s,t\right) \Rightarrow \left( RTC_{x,y}\left( A'\right) ^{*}\right) \left( s,t\right) \) is also provable in \(RTC_{G}\), and by one cut we obtain a proof of \(\left( RTC_{x,y}A\right) \left( s,t\right) \Rightarrow \left( \varphi '\right) ^{*}\).

For (2) assume that \(\varphi :=\left( TC_{x,y}A\right) \left( s,t\right) \). Hence, \(\left( \varphi ^{*}\right) '\) is the formula \(\exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\left( A^{*}\right) '\left( z,t\right) \vee z=t\right) \right) \). It is easy to check that the sequent \(\exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\left( A^{*}\right) '\left( z,t\right) \vee z=t\right) \right) \Rightarrow \left( TC_{x,y}\left( A^{*}\right) '\right) \left( s,t\right) \) is provable in \(TC_{G}\). By the induction hypothesis we have \(\vdash _{TC_{G}}\left( A^{*}\right) '\Rightarrow A\), so by the TC-counterpart of (10) the sequent \(\left( TC_{x,y}\left( A^{*}\right) '\right) \left( s,t\right) \Rightarrow \left( TC_{x,y}A\right) \left( s,t\right) \) is also provable in \(TC_{G}\). Applying a cut results in a proof of the sequent \(\exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\left( A^{*}\right) '\left( z,t\right) \vee z=t\right) \right) \Rightarrow \left( TC_{x,y}A\right) \left( s,t\right) \). For the converse, notice that Axiom (13) entails the provability of \(\left( TC_{x,y}\left( A^{*}\right) '\right) \left( s,t\right) \Rightarrow \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{t}{y}\right\} \vee \exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\left( A^{*}\right) '\right) \left( z,t\right) \right) \). Clearly, the sequent \(\left( A^{*}\right) ' \left\{ \frac{s}{x},\frac{t}{y}\right\} \Rightarrow \exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge z=t\right) \) is provable in \(TC_{G}\), and again, using the induction hypothesis on A together with the TC-counterpart of (10) we get that \(\left( TC_{x,y}A\right) \left( s,t\right) \Rightarrow \left( TC_{x,y}\left( A^{*}\right) '\right) \left( s,t\right) \) is provable in \(TC_{G}\). By cuts we get \(\vdash _{TC_{G}}\left( TC_{x,y}A\right) \left( s,t\right) \Rightarrow \exists z\left( \left( A^{*}\right) '\left\{ \frac{s}{x},\frac{z}{y}\right\} \wedge \left( TC_{x,y}\left( A^{*}\right) '\left( z,t\right) \vee z=t\right) \right) \).\(\square \)

Theorem 1

\(TC_{G}\) and \(RTC_{G}\) are equivalent in the following sense:

  1. 1.

    \(\vdash _{RTC_{G}}\varGamma \Rightarrow \Delta \) iff \(\vdash _{TC_{G}}\varGamma ^{'}\Rightarrow \Delta ^{'}\).

  2. 2.

    \(\vdash _{TC_{G}}\varGamma \Rightarrow \Delta \) iff \(\vdash _{RTC_{G}}\varGamma ^{*}\Rightarrow \Delta ^{*}\).

Proof

The left-to-right implications are simply Proposition 6. For the right-to-left implication, consider \(\vdash _{TC_{G}}\varGamma ^{'}\Rightarrow \Delta ^{'}\). By 6 we get that \(\vdash _{RTC_{G}}\left( \varGamma '\right) ^{*}\Rightarrow \left( \Delta '\right) ^{*}\). Since by Proposition 7 we have that \(\vdash _{RTC_{G}}\left( \varphi '\right) ^{*}\Rightarrow \varphi \) and \(\vdash _{RTC_{G}}\varphi \Rightarrow \left( \varphi '\right) ^{*}\) for any formula \(\varphi \), using cuts we get that \(\vdash _{RTC_{G}}\varGamma \Rightarrow \Delta \). The proof of (2) is similar.\(\square \)

3.2 On cut elimination and constructive consistency proofs

Next we examine some fundamental proof-theoretical properties of \(TC_{G}\) , the most important of which is cut elimination.Footnote 7 The cut rule is the following:

$$\begin{aligned} \dfrac{\varGamma \Rightarrow \Delta ,\varphi \,\,\,\,\,\varphi ,\varGamma \Rightarrow \Delta }{\varGamma \Rightarrow \Delta }\,(cut) \end{aligned}$$

The formula \(\varphi \) is called the cut formula. Intuitively, we may view this rule as allowing the use of lemmas, such as \(\varphi \), in proofs.

The cut-elimination theorem (also known as Gentzen’s Hauptsatz) is a central proof theoretical property of a sequent calculus, originally proved by Gerhard Gentzen (1935) for the system \(\mathcal {LK}\) and for the system \(\mathcal {LJ}\) for intuitionistic logic. The cut elimination theorem states that any proof can be effectively transformed into a proof with the same end-sequent without using the cut rule. Thus, a cut-free proof is “direct” in the sense that it avoids intermediate results (which may be more general than the final theorem). The cut elimination theorem has some immediate consequences. Any system which admits cut elimination enjoys the sub-formula property, which states that given a cut-free proof of a sequent, every formula that appears in the proof is a sub-formula of a formula in the end-sequent. Another consequence is the separation property: any provable sequent has a proof using only the logical rules or axioms for the logical operators occurring in the end-sequent. These properties are essential for a system which aims to have an effective proof search procedure. For \(\mathcal {LK}_{=}\), an alternative version of the cut elimination theorem must be used due to the presence of the equality axioms. A cut is said to be inessential if the cut formula is of the form \(s=t\), otherwise it is called an essential cut. A system with equality is said to admit cut elimination if all essential cuts are admissible.

In semantical proofs of cut elimination one usually establishes not only closure under cut, but also completeness. However, this type of proof does not provide a constructive method for eliminating cuts from a given proof. In contrast, syntactic proofs of cut elimination do not just show that the cut rule remains admissible if it is deleted from the list of the rules of the system, they provide algorithms for transforming any proof containing essential cuts into an essential-cut-free proof. The standard syntactic cut elimination proofs Gentzen (1935), Takeuti (2013), Pohlers (2009), Troelstra and Schwichtenberg (2000) use a method of going over a given proof and “reducing” it to a proof which is less complicated in some sense, until all essential cuts are eliminated. What is reduced can be the complexity of the cut formula, the “depth” of the proof, the ordinal of the proof, or some other measure for the complexity of the proof.

For example, Gentzen’s classic proof of the cut-elimination theorem for first-order logic Gentzen (1935) uses a double induction: the main induction is on the number of logical connectives and quantifiers in the cut formula, and the sub-induction is on the “rank” of the cut, which is some measure depending on the place of the cut in the proof. A reduction step is defined for every derivation ending with an application of the cut rule. For instance, a cut on a compound formula is replaced by cuts on its sub-formulas, which necessarily contain a smaller number of connectives. For example, the derivation

is reduced to

By the induction hypothesis, this cut on \(\varphi \) can be eliminated, hence the original cut on \(\varphi \wedge \psi \) can also be eliminated.

Following this standard method in the case of \(TC_{G}\), a reduction step should be defined for every derivation ending with an application of the cut rule. Consider, for example, the following derivation. (For convenience, we shall omit the context from the sequents in all the derivations from this point on.)

The natural reduction of this derivation is

The cut on the formula \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) is replaced by a cut on the formula \(\varphi \left\{ \frac{s}{x},\frac{t}{y}\right\} \), which is of smaller complexity. Hence, in this case we have a natural reduction of the proof.

However, let us examine the following derivation:

The natural reduction is carried out by constructing the derivation

and using the sub-proofs \(P_{1}\) and \(P_{2}\) to obtain a proof of \(\psi \left\{ \frac{s}{x}\right\} \Rightarrow \psi \left\{ \frac{t}{x}\right\} \) by applying two cuts. Hence the cut on the formula \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) is replaced here by three cuts on the formulas \(\left( TC_{x,y}\varphi \right) \left( r,t\right) \), \(\left( TC_{x,y}\varphi \right) \left( s,r\right) \), and \(\psi \left\{ \frac{r}{x}\right\} \). It is unclear what kind of measure can be used here in order to achieve a reduction in the proof. The number of applications of Rule (15) has gone down by one, yet the duplication of the derivation \(P_{3}\) and the application of the induction rule might offset this. Moreover, while the two new cut formulas, \(\left( TC_{x,y}\varphi \right) \left( r,t\right) \) and \(\left( TC_{x,y}\varphi \right) \left( s,r\right) \), are of complexity equal to that of the original cut formula and there is reduction of the depth, the real difficulty is that the new cut formula \(\psi \left\{ \frac{r}{x}\right\} \) is not related at all to the original cut formula. Thus it can be of larger complexity than \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \), unless we force some constraints on the applicability of the induction rule.

This difficulty is not unique to \(TC_{G}\). It occurs often in systems with an induction rule, since the use of the induction rule often complicates the reduction of cuts. In order to avoid this problem Gentzen applied a different method for \(PA_{G}\) [Gentzen-style system for PA Gentzen (1969), Takeuti (2013)]. Instead of proving full cut elimination for \(PA_{G}\), Genten proved a weaker version of the cut elimination theorem from which follows the consistency of \(PA_{G}\). A system is said to be consistent if it does not admit a proof of the absurd, i.e. the empty sequent. In \(PA_{G}\), as well as in \(TC_{G}\), formulas never disappear, except in cuts (the only other simplification allowed is contraction, in which a repetition is reduced). From this follows that there can be no cut-free proof of the empty sequent. Thus, by proving a weaker version of the cut elimination theorem which provides an algorithm for eliminating cuts only from proofs ending with the empty sequent, one establishes a constructive consistency proof of the system.Footnote 8

A crucial step in Gentzen consistency proof for \(PA_{G}\) is the elimination of all occurrences of \(PA_{G}\)’s induction rule from the end-piece of the proof.Footnote 9 First, all free variables which are not used as eigenvariables in the end-piece of the proof are replaced by constants. Then, any application of the induction rule up to a specific natural number is replaced by a corresponding number of structural inference rules. The transformation is done in the following way. Assume that the following application of \(PA_{G}\)’s induction rule appears within an end-piece

Since all free variables were eliminated, t is a closed term and hence there is a term \(s(\ldots (s(0))\) such that \(\Rightarrow s(\ldots (s(0))=t\) is provable in \(PA_{G}\) without essential cuts or induction. Therefore, there is also a proof of \(\psi (s(\ldots (s(0)))\Rightarrow \psi (t)\) without essential cuts or induction. Let \(P'\left( b\right) \) be the proof which is obtained from \(P'\) by replacing a by b throughout the proof. Each occurrence of the induction rule is replaced by

These consecutive cuts are carried on up to the sequent \(\psi \left\{ \frac{0}{x}\right\} \Rightarrow \psi \left\{ \frac{s(\ldots (s(0))}{x}\right\} \). Then one more cut is used on the sequent \(\psi (s(\ldots (s(0)))\Rightarrow \psi (t)\) to obtain a proof of \(\psi \left\{ \frac{0}{x}\right\} \Rightarrow \psi \left\{ \frac{t}{x}\right\} \).

Can a similar method be applied to the TC-induction rule? The problem is that Gentzen’s transformation uses special features of the natural numbers that generally do not exist in \(TC_{G}\). To see this, notice that the induction rule Rule (16) entails all instances of \(PA_{G}\)’s induction rule by taking \(\varphi \) to be \(s\left( x\right) =y\). However, in the general case \(\varphi \) is an arbitrary formula. Thus, unlike in \(PA_{G}\), we do not have a “built in” measure for the \(\varphi \)-distance between two arbitrary closed terms s and t, . The path from s to t through \(\varphi \)-steps is not known apriori. Moreover, it does not have to be unique.

Unfortunately, this generalization of the induction principle renders this standard method for analyzing \(PA_{G}\) inapplicable. Thus, one should look for useful fragments of \(TC_{G}\) in which cuts can be eliminated from proofs of the empty sequent. One such fragment can be obtained via restricting \(TC_{G}\)’s induction rule by allowing only \(\varphi \)’s of the form \(y=t\), where x is the only free variable in t. In this way we force a deterministic path of \(\varphi \)-steps between any two closed terms. Obviously, this system is still adequate for the task of mechanizing mathematics, as its restricted induction rule still includes that of \(PA_{G}\). Exploring this direction is left for future research.

Another key proof-theoretical method which arises from Gentzen’s consistency proof for \(PA_{G}\) is the assignment of ordinals to proof systems. In Gentzen’s method, each system is assigned the least ordinal number needed for its constructive consistency proof. This provides a measure for a complexity of a system which is useful for comparing different proof systems. The constructive consistency proof of \(PA_{G}\) entails that the ordinal number of \(PA_{G}\) is at most \(\varepsilon _{0}\), and another theorem of Gentzen Gentzen (1943) shows that it is exactly \(\varepsilon _{0}\).

Definition 9

The system \(TC_{A}\) is obtained by augmenting \(TC_{G}\) with the standard axioms for successor, addition, and multiplication, together with the axiom characterizing the natural numbers in ancestral logic Axiom (1).Footnote 10

We next show that for the standard language of PA the system \(TC_{A}\) is equivalent to \(PA_{G}\), in the sense that there is a provability preserving translation algorithm between them.Footnote 11 For the translation we use a beta function which allows us to encode in PA finite sequences [(this idea is taken from Smith (2008)]. Recall that we can express facts about sequences of numbers in PA by using a \(\beta \)-function such that for any finite sequence \(k_{0},k_{1},...,k_{n}\) there is some c such that for all \(i\le n\), \(\beta (c,i)=k_{i}\). Thus, our motivation is that \(\left( TC_{x,y}\varphi \right) \left( s,t\right) \) holds iff for some n, there is a sequence \(k_{0},k_{1},...,k_{n}\) such that \(k_{0}=I\left[ s\right] \), \(k_{n}=I\left[ t\right] \), and each pair of consecutive terms are in the relation defined by \(\varphi \). Accordingly, let B be a wff of the language of PA with three free variables which captures in PA a \(\beta \)-function. For each formula \(\varphi \) of the language of PA define \(\varphi ^{\beta }:=\varphi \), and define \(\left( \left( TC_{x,y}\varphi \right) \left( s,t\right) \right) ^{\beta }\) to be the formula:

$$\begin{aligned}&\exists z\exists c\left( B\left( c,0,s\right) \wedge B\left( c,s\left( z\right) ,t\right) \wedge \forall u\le z\exists v\exists w \right. \\&\left. \quad \times \left( B\left( c,u,v\right) \wedge B\left( c,s\left( u\right) ,w\right) \wedge \varphi ^{\beta }\left\{ \frac{v}{x},\frac{w}{y}\right\} \right) \right) \end{aligned}$$

Proposition 8

\(\vdash _{TC_{A}}\varphi \Rightarrow \varphi ^{\beta }\) and \(\vdash _{TC_{A}}\varphi ^{\beta }\Rightarrow \varphi \).

Proof

The proof is carried out by induction. If \(\varphi \) does not contain the TC operator, \(\varphi ^{\beta }\) is equal to \(\varphi \). Let \(\varphi \) be \(\left( TC_{x,y}\psi \right) \left( s,t\right) \). Denote by \(\phi \left( a,b\right) \) the formula \(\exists z\exists c\big (B\left( c,0,s\right) \wedge B\left( c,s\left( z\right) ,t\right) \wedge \forall u\le z\exists v\exists w\left( B\left( c,u,v\right) \wedge B\big (c,s\left( u\right) ,w\right) \wedge \varphi ^{\beta } \left\{ \frac{v}{x},\frac{w}{y}\right\} \big )\big )\). It is easy to check that \(\psi ^{\beta }\left\{ \frac{a}{x},\frac{b}{y}\right\} \Rightarrow \phi \left( a,b\right) \) is provable and from this follows that \(\phi \left\{ \frac{s}{a},\frac{x}{b}\right\} ,\psi ^{\beta }\Rightarrow \phi \left\{ \frac{s}{a},\frac{y}{b}\right\} \) is also provable. Hence, by Rule (16) we get \(\phi \left\{ \frac{s}{a},\frac{z}{b}\right\} ,\psi ^{\beta }\left\{ \frac{z}{x},\frac{t}{y}\right\} \Rightarrow \phi \left\{ \frac{s}{a},\frac{t}{b}\right\} \). Now, using Axiom (13) we can get a proof of \(\left( TC_{x,y}\psi ^{\beta }\right) \left( s,t\right) \Rightarrow \phi \left\{ \frac{s}{a},\frac{t}{b}\right\} \) (notice that \(\phi \left\{ \frac{s}{a},\frac{t}{b}\right\} \) is exactly \(\varphi ^{\beta }\)). By the induction hypothesis we have \(\vdash _{TC_{A}}\psi \Rightarrow \psi ^{\beta }\) and by (10) the sequent \(\left( TC_{x,y}\psi \right) \left( s,t\right) \Rightarrow \left( TC_{x,y}\psi ^{\beta }\right) \left( s,t\right) \) is also provable in \(TC_{A}\). Thus, a cut results in a proof of \(\left( TC_{x,y}\psi \right) \left( s,t\right) \Rightarrow \varphi ^{\beta }\). For the converse, notice that in \(TC_{A}\) all instances of \(PA_{G}\)’s induction rule are derivable. Denote by \(G\left( z\right) \) the formula \(\exists c\big (B\left( c,0,s\right) \wedge B\left( c,s\left( z\right) ,a\right) \wedge \forall u\le z\exists v\exists w\left( B\left( c,u,v\right) \wedge B\big (c,s\left( u\right) ,w\right) \wedge \varphi ^{\beta } \left\{ \frac{v}{x},\frac{w}{y}\right\} \big )\big )\rightarrow \left( TC_{x,y}\psi ^{\beta }\right) \left( s,t\right) \). It is straightforward to verify that \(G\left( 0\right) \) is provable using Rule (14) and \(G\left( z\right) \Rightarrow G\left( s\left( z\right) \right) \) is provable using the TC-counterpart of Rule (6). By applying PA’s induction rule and one cut we get a proof of \(G\left( z\right) \). We can then substitute t for a and introduce \(\forall \) to get a proof of \(\forall zG\left\{ \frac{t}{a}\right\} \), from which by standard \(\mathcal {LK}\) rules we can infer \(\varphi ^{\beta }\Rightarrow \left( TC_{x,y}\psi ^{\beta }\right) \left( s,t\right) \). By the induction hypothesis we have \(\vdash _{TC_{A}}\psi ^{\beta }\Rightarrow \psi \) thus by (10) the sequent \(\left( TC_{x,y}\psi ^{\beta }\right) \left( s,t\right) \Rightarrow \left( TC_{x,y}\psi \right) \left( s,t\right) \) is also provable in \(TC_{A}\). Hence, a cut results in a proof of \(\varphi ^{\beta }\Rightarrow \left( TC_{x,y}\psi \right) \left( s,t\right) \).\(\square \)

We use the standard abbreviations: \(\varGamma ^{\beta }\) for \(\left\{ \varphi ^{\beta }|\varphi \in \varGamma \right\} \).

Theorem 2

\(\vdash _{TC_{A}}\varGamma \Rightarrow \Delta \) iff \(\vdash _{PA_{G}}\varGamma ^{\beta }\Rightarrow \Delta ^{\beta }\). In particular, for \(\varGamma ,\Delta \) in the language of PA, \(\vdash _{TC_{A}}\varGamma \Rightarrow \Delta \) iff \(\vdash _{PA_{G}}\varGamma \Rightarrow \Delta \).

Proof

It is easy to check that all the inference rules for the TC-operator apply equally to their \(\beta \)-translated formulas in \(PA_{G}\), and the \(\beta \)-translated analogue of Axiom (1) is also a theorem of \(PA_{G}\). Therefore corresponding to any proof of \(\varGamma \Rightarrow \Delta \) in \(TC_{A}\) there is a parallel proof in \(PA_{G}\) of \(\varGamma ^{\beta }\Rightarrow \Delta ^{\beta }\). For the right-to-left implication first notice that \(TC_{A}\) is an extension of \(PA_{G}\), i.e. any TC-less formula that is provable in \(PA_{G}\) is also provable in \(TC_{A}\) (since Rule (16) together with Axiom (1) entail all instances of \(PA_{G}\)’s induction rule). From this together with Prop. 8 using cuts we can convert any proof of \(\varGamma ^{\beta }\Rightarrow \Delta ^{\beta }\) in \(PA_{G}\) to a proof of \(\varGamma \Rightarrow \Delta \) in \(TC_{A}\).\(\square \)

Corollary 1

The ordinal number of the system \(TC_{A}\) is \(\varepsilon _{0}\).

Proof

Using the translation algorithm between \(TC_{A}\) and \(PA_{G}\) and the constructive proof of Proposition 8 (which obviously requires ordinal less than \(\omega ^{\omega }\)), any constructive proof of the consistency of \(PA_{G}\) with transfinite induction up to some ordinal greater that \(\omega ^{\omega }\) can be transformed into a constructive proof of the consistency of \(TC_{A}\) which uses transfinite induction up to the same ordinal, and vice versa. Hence, \(TC_{A}\) and \(PA_{G}\) have the same ordinal, which is known to be \(\varepsilon _{0}\).\(\square \)

4 Conclusions and further research

In this paper we reviewed the expressive power of logics augmented by a transitive closure operator and explored their reasoning potential. This work focused on working out this potential by presenting effective sound proof systems for ancestral logic that are strong enough for various mathematical needs. Our next goal is to improve the computational efficiency of these systems, in order to make them suitable for mechanization.

In the last section the property of cut-elimination for \(TC_{G}\) was discussed. Further research is required in order to determine what kinds of useful fragments of \(TC_{G}\) do admit cut-elimination. One possible option (already mentioned) is to restrict the induction rule of \(TC_{G}\) by allowing only \(\varphi \)’s of the form \(y=t\) where x is the only free variable that occurs in t. Another option is to find out what are the conditions on a formula \(\varphi \) and terms st so that there is a proof in \(TC_{G}\) for the sequent \(\Rightarrow \left( TC_{x,y}\varphi \right) \left( s,t\right) \) without the induction rule, and then restrict the induction rule by those conditions.

The consistency of \(TC_{A}\) certainly implies the consistency of \(TC_{G}\), and the ordinal of \(TC_{G}\) is therefore at most \(\upvarepsilon _{0}\). It seems almost certain that it is exactly \(\upvarepsilon _{0}\). However, presenting a full rigorous proof that the ordinal of \(TC_{G}\) is not less than \(\upvarepsilon _{0}\) is left for future work.

As declared, we believe that ancestral logic should suffice for most of applicable mathematics. Substantiating this claim by creating formal systems based on AL and formalizing in them large portions of mathematics, is a further future work. A promising candidate for serving as the basis for the system is the predicative set theory PZF, presented in Arnon (2004, 2008), which resembles ZF and is suitable for mechanization. The key element of PZF is that it uses syntactic safety relations between formulas and sets of variables. The underlying logic of PZF is ancestral logic, which makes it possible to provide inductive definitions of relations and functions which are sets. An important criterion for the adequacy of AL for the task of formalizing mathematics is the extent to which this will be done in a natural way, as close as possible to real mathematical practice.