1 Introduction

Let Φ be a signature and A be a first-order Φ-structure. Any such structure A has an associated recursion theory, that is to say a family of functions and relations that are recursive over A.

We say that a function is recursive over A in case it is computed by a system of recursive equations whose non-logical symbols come from Φ. We call this system a (McCarthy) recursive program after John McCarthy, who pioneered the idea in [6]. In the case that A is

$$\mathbf{N}_{u}:=(\mathbb{N},0,1,S,Pd,=) $$

where $S$ is the successor and $Pd$ the predecessor function, we recover classical recursion theory.
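
For example (an illustration in the notation of Section 2, ours rather than McCarthy's), addition is recursive over $\mathbf{N}_{u}$: it is computed by the one-equation program

$$add(\mathtt{x},\mathtt{y})=\mathtt{p}(\mathtt{x},\mathtt{y})\ \text{where}\ \{\mathtt{p}(\mathtt{x},\mathtt{y})=\text{if } \mathtt{y}=0 \text{ then } \mathtt{x} \text{ else } S(\mathtt{p}(\mathtt{x},Pd(\mathtt{y})))\}, $$

reading the test $\mathtt{y}=0$ as the equality primitive of $\mathbf{N}_{u}$, with 0 standing for "true."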

There are several other ways to understand computability over A. For example, we may consider iterative programs. A program over a given structure is iterative if, roughly speaking, it can be expressed using recursion no more complicated than “loops,” such as for loops or while loops. Iterative programs express a broad class of familiar algorithms, but not the ones that use more complicated “divide-and-conquer” recursion. There is a restricted class of recursive programs, called tail recursive programs, which compute exactly the iterative functions.
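
For a concrete, if informal, contrast, consider the following two sketches (ours, in Python; the names and the encoding are not from the paper). Both compute $2^{n}$, but only the first is tail recursive.

```python
def pow2_tail(n, acc=1):
    # Tail recursive: the recursive call IS the returned value, so the
    # recursion is just a loop over the state (n, acc).
    if n == 0:
        return acc
    return pow2_tail(n - 1, 2 * acc)

def pow2_dc(n):
    # Not tail recursive: the recursive results are used inside a larger
    # expression -- the "divide-and-conquer" shape tail recursion disallows.
    if n == 0:
        return 1
    return pow2_dc(n - 1) + pow2_dc(n - 1)
```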

Over any structure, every tail recursive function is automatically recursive. In other words,

$$\text{tail}(\mathbf{A})\subseteq\text{rec}(\mathbf{A}) $$

for all A, where tail(A) and rec(A) are the sets of tail recursive and recursive partial functions over A, respectively.

In the classical setting, the converse is true:

$$\text{tail}(\mathbf{N}_{u})=\text{rec}(\mathbf{N}_{u}). $$

However, this is not true in general. Examples of structures for which it fails can be found in [4, 5, 9, 10].

It is natural to ask whether there is a difference in complexity between recursion and tail recursion, even when they might agree in terms of computability. In other words:

Is there a structure A with rec(A)=tail(A), a partial function f on A, and a natural measure of efficiency on recursive programs, such that some recursive program computes f more efficiently than any tail recursive program?

In this paper we give a positive answer to this question, thereby resolving an open problem of Lynch and Blum [5].

Related Work

Several different groups of authors have studied recursion theory over general first-order structures. Each of them draws the distinction between classes corresponding to recursion and tail recursion. In particular, there are:

  • Finite algorithmic procedures on A (FAP(A)) of Harvey Friedman [1], augmented with a stack (FAPS(A)) by Moldestad, Stoltenberg-Hansen, and Tucker [3]

  • While(A) and While*(A) of Tucker and Zucker [11]

  • Program schemata and recursive schemata in the field of schematology in computer science. (See, for example, the book of Greibach [2])

Each pair of computability classes corresponds to recursion and tail recursion respectively. In the first two cases, instead of realizing tail recursive programs as a restriction of more general recursive programs, the fundamental object is an abstract register machine or while-program, which corresponds to tail recursion. The full power of recursion is recovered by adding an additional stack or array data structure.

In fact, these models not only compute the same functions as recursive and tail recursive programs, but they do so about as efficiently according to natural measures of complexity (see Section 2.3). Hence our separation results are robust, and not an artifact of our model.

Organization of This Paper

In Section 2, we review recursive programs, tail recursion, and complexity measures on recursive programs. Section 3.1 contains the bulk of our technical work and concludes with the abstract theorem about tail recursive programs over the natural numbers which we will use to prove our lower bound, and Section 3.2 applies that theorem to an explicit example. We then make concluding remarks and state some open problems.

2 Recursion over Abstract Structures

In this section we will precisely define our basic objects of study, namely recursive programs and recursive functions over first-order structures. We follow a simplified version of the exposition in Moschovakis [7, 8]. First, we review a few preliminaries.

Pointed Structures

We require that our structures be pointed, i.e., that they contain distinct constants 0 and 1. This simplifies the development of the theory considerably. By identifying boolean values with {0,1}, we forgo predicates in favor of functions into {0,1}, and we do not need to keep track of the sorts of terms, which in our setting generalize both the terms and the atomic formulas of first-order logic.

Expansions

If A is a structure and f is a function on A, then (A,f) denotes the structure with f as an additional primitive.

In fact, all the structures we will work with will be some expansion of what we have decided to call (for better or for worse) Predecessor arithmetic, namely

$$\mathbf{N}_{Pd}:=(\mathbb{N},0,1,Pd,eq_{0}) $$

where

$$eq_{0}(n)=\left\{\begin{array}{ll} 0 & \text{if } n=0\\ 1 & \text{otherwise} \end{array}\right. $$

2.1 Syntax and Semantics of Recursive Programs

Recursive Programs

Recursive programs are the syntactic objects that define recursive functions. Terms are the building blocks of recursive programs, and extend terms in first-order logic by function-valued variables and conditional statements.

Terms and Their Semantics

Definition 1

A Φ-term M is generated by the grammar

$$M:=\mathtt{x}\mid\phi(M_{1},\dots,M_{n})\mid\mathtt{p}(M_{1},\dots,M_{n})\mid\text{ if } M_{0} \text{ then } M_{1} \text{ else } M_{2} $$

where $\mathtt{x}$ is a variable, $\mathtt{p}$ is a function-valued variable or recursive variable, and $\phi\in\Phi$. Without loss of generality, assume that variables and recursive variables are all of the form $\mathtt{x}_{i}$ or $\mathtt{p}_{i}$ for $i\in\mathbb{N}$.

A Φ-term without recursive variables is called explicit, and an explicit Φ-term without conditionals is called algebraic. An algebraic Φ-term is simply a term in the first-order setting; it denotes a function over a Φ-structure A.
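
To fix ideas, here is one possible encoding of $\Phi$-terms as an abstract syntax tree (a sketch of ours; the paper itself works purely syntactically). A term is explicit when it contains no `Rec` node, and algebraic when it additionally contains no `Cond` node.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Var:            # a variable x_i
    i: int

@dataclass(frozen=True)
class Prim:           # phi(M_1, ..., M_n) for a primitive phi in Phi
    name: str
    args: Tuple

@dataclass(frozen=True)
class Rec:            # p_i(M_1, ..., M_n) for a recursive variable p_i
    i: int
    args: Tuple

@dataclass(frozen=True)
class Cond:           # if M_0 then M_1 else M_2
    test: object
    then: object
    other: object
```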

Within an explicit term, a conditional

$$\text{ if } M_{0} \text{ then } M_{1} \text{ else }M_{2} $$

denotes $M_{1}$ if the denotation of $M_{0}$ is 0, and $M_{2}$ if the denotation of $M_{0}$ is not 0. Explicit terms also denote functions, and such functions are themselves called explicit.

A non-explicit term becomes explicit when we substitute explicit functions (or partial functions) for its recursive variables. Accordingly, we take the denotation of a non-explicit term to be the functional that maps each tuple of partial functions to the resulting partial function.

If \(\bar {\mathtt {x}}\) and \(\bar {\mathtt {p}}\) are tuples of variables and recursive variables respectively, and if for some term M all of its variables and recursive variables occur in \(\bar {\mathtt {x}}\) and \(\bar {\mathtt {p}}\), then (following Moschovakis) we write \(M(\bar {\mathtt {x}},\bar {\mathtt {p}})\) and call it a full extended term.

The advantage of this notation is that it makes substitution of terms very easy to write. For example, given some elements \(\bar {\mathtt {x}}\) from a Φ-structure, we may form the term with parameters \(M(\bar {x},\bar {\mathtt {p}})\). Or, we might substitute additional Φ-terms for the variables \(\bar {\mathtt {x}}\).

We trust that this method of notating substitution is natural to our readers and does not require further explanation.

Anatomy of a Deterministic Recursive Program

Definition 2

A Φ-recursive program is a (K+1)-tuple of Φ-terms

$$E=(E_{0}(\bar{\mathtt{x}}_{0},\bar{\mathtt{p}}),E_{1}(\bar{\mathtt{x}}_{1},\bar{\mathtt{p}}),\dots,E_{K}(\bar{\mathtt{x}}_{K},\bar{\mathtt{p}})) $$

where $\bar{\mathtt{p}}=(\mathtt{p}_{1},\dots,\mathtt{p}_{K})$ and the arity of $\mathtt{p}_{i}$ is $|\bar{\mathtt{x}}_{i}|$ for $1\le i\le K$.

We shall typically write a program E as

$$E_{0}(\bar{\mathtt{x}}_{0},\bar{\mathtt{p}})\ \text{where} $$
$$\{\mathtt{p}_{1}(\bar{\mathtt{x}}_{1})=E_{1}(\bar{\mathtt{x}}_{1},\bar{\mathtt{p}}) $$
$$\vdots $$
$$\mathtt{p}_{K}(\bar{\mathtt{x}}_{K})=E_{K}(\bar{\mathtt{x}}_{K},\bar{\mathtt{p}})\}. $$

The term $E_{0}$ is called the head of $E$. The system of equations below the head is the body of $E$.

Denotational Semantics

Given a Φ-structure A, a Φ-recursive program E computes a partial function over its domain A:

A solution to the body of a recursive program is a tuple of partial functions $\bar{p}$ on $A$ that is fixed by the functionals denoted by the $E_{i}$, i.e.,

$$\mathbf{A}\models p_{i}(\bar{x})=E_{i}(\bar{x},\bar{p}) $$

for all $\bar{x}$ and each $1\le i\le K$. Such a solution must always exist (among partial functions!), and moreover there must be a least solution $\bar{p}_{0}$ which is extended by all the others.

Finally, the partial function computed by $E$ over $\mathbf{A}$ is defined to be the partial function denoted by $E_{0}(\cdot,\bar{p}_{0})$; i.e., the partial function obtained when the least solution $\bar{p}_{0}$ to the body is fed into the head $E_{0}$.

For more detail and proofs, see Chapter 2A of Moschovakis [7].
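
As a quick illustration (our example, not the paper's): over $\mathbf{N}_{Pd}$, the body $\{\mathtt{p}(\mathtt{x})=\mathtt{p}(\mathtt{x})\}$ is solved by every partial function, and its least solution is the nowhere-defined function. By contrast, the body $\{\mathtt{p}(\mathtt{x})=\text{if } \mathtt{x} \text{ then } 0 \text{ else } \mathtt{p}(Pd(\mathtt{x}))\}$ forces $p(0)=0$ and $p(n)=p(n-1)$ for $n>0$, so its least (indeed only) solution is the constant zero function.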

Definition 3

For a Φ-structure A, a partial function is recursive if it is computable by some Φ-recursive program E over A. Let rec(A) be the set of all recursive partial functions over A.

Informal Operational Semantics

It will be helpful to informally describe how a human might evaluate a recursive program on a given input. This will also motivate the definition of the sequential logical complexity in Section 2.3.

We give a procedure to evaluate a term relative to a fixed Φ-structure A and Φ-recursive program

$$E=(E_{0}(\bar{\mathtt{x}}_{0},\bar{\mathtt{p}}),E_{1}(\bar{\mathtt{x}}_{1},\bar{\mathtt{p}}),\dots,E_{K}(\bar{\mathtt{x}}_{K},\bar{\mathtt{p}})) $$

Given a term \(M(\bar {x},\bar {\mathtt {p}})\), proceed according to the following rules:

(Base):

If M is a single variable, then \(M(\bar {x},\bar {\mathtt {p}})\) is a single element of A. Return that element.

(Conditional):

If $M\equiv$ if $M_{0}$ then $M_{1}$ else $M_{2}$, then evaluate $M_{0}(\bar{x},\bar{\mathtt{p}})$. If that is zero, evaluate $M_{1}(\bar{x},\bar{\mathtt{p}})$, otherwise evaluate $M_{2}(\bar{x},\bar{\mathtt{p}})$, and return the result.

(Primitive-call):

If $M\equiv\phi(M_{1},\dots,M_{n})$, evaluate $M_{i}(\bar{x},\bar{\mathtt{p}})$ for $1\le i\le n$, apply $\phi$ to the results, and return the value.

(Recursive-call):

If $M\equiv\mathtt{p}_{i}(M_{1},\dots,M_{n})$, evaluate $M_{j}(\bar{x},\bar{\mathtt{p}})$ to obtain $y_{j}$ for $1\le j\le n$, evaluate $E_{i}(\bar{y},\bar{\mathtt{p}})$, and return the result.

For a given input $\bar{x}$, our initial term will be $E_{0}(\bar{x},\bar{\mathtt{p}})$. When we have evaluated it, the result will either be an element of $A$, or the procedure will not terminate. This procedure defines a partial function which is identical to the one given by the denotational semantics.
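
These rules translate directly into a recursive evaluator. The sketch below (ours, reusing the dataclass encoding of terms above; `prims` maps primitive names to functions, `bodies` maps each index $i$ to the body term $E_i$, and variables are indexed from 0) makes the informal procedure concrete. As in the paper, nontermination of the program shows up as nontermination of the evaluator.

```python
def evaluate(M, env, prims, bodies):
    if isinstance(M, Var):                      # (Base)
        return env[M.i]
    if isinstance(M, Cond):                     # (Conditional)
        if evaluate(M.test, env, prims, bodies) == 0:
            return evaluate(M.then, env, prims, bodies)
        return evaluate(M.other, env, prims, bodies)
    if isinstance(M, Prim):                     # (Primitive-call)
        args = [evaluate(N, env, prims, bodies) for N in M.args]
        return prims[M.name](*args)
    if isinstance(M, Rec):                      # (Recursive-call)
        ys = [evaluate(N, env, prims, bodies) for N in M.args]
        return evaluate(bodies[M.i], dict(enumerate(ys)), prims, bodies)
    raise TypeError("not a term")
```

To run a program on input $\bar{x}$, one calls `evaluate` on the head $E_0$ with `env = dict(enumerate(xs))`.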

2.2 Tail Recursion

Recursive programs which are tail recursive express exactly the iterative algorithms over a given structure. The partial functions they compute form an extremely robust class, admitting (as we have seen) several different equivalent definitions.

Tail recursive functions are closed under composition and, more generally, expansions: if f is tail recursive over (A,g) and g is tail recursive over A, then f is tail recursive over A.

Definition 4

A Φ-recursive program E is tail recursive in case it is of the form

$$f(\bar{\mathtt{x}}_{0})=\mathtt{p}(G(\bar{\mathtt{x}}_{0}))\ \text{where} $$
$$\big\{\mathtt{p}(\bar{\mathtt{x}}_{1})=\text{if} ~\tau(\bar{\mathtt{x}}_{1})\ \text{then} ~o(\bar{\mathtt{x}}_{1})\ \text{else}~\mathtt{p}(F(\bar{\mathtt{x}}_{1}))\big\} $$

where $G$ is an explicit $\Phi$-term of arity $n=|\bar{\mathtt{x}}_{0}|$ and co-arity $k=|\bar{\mathtt{x}}_{1}|$ (the co-arity of a tuple of terms is the number of its components), $F$ is an explicit $\Phi$-term of arity and co-arity $k$, and $\tau$ and $o$ are explicit terms of arity $k$.

Definition 5

A partial function \(f:A^{n}\rightharpoonup A\) is tail recursive in case it is computable by some tail recursive program over A. The set of all tail recursive partial functions is denoted tail(A).
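
Unwinding Definition 4 (a sketch of ours): a tail recursive program is exactly a while loop over a $k$-tuple of state variables, with $G$ initializing the state, $\tau$ the loop guard, $F$ the loop body, and $o$ the output map. Recall that a conditional selects its then-branch when the test evaluates to 0.

```python
def run_tail(G, F, tau, o, xs):
    state = G(xs)             # initialize the k-tuple of state variables
    while tau(state) != 0:    # tau(state) == 0 means "finished"
        state = F(state)      # one iteration of the loop
    return o(state)           # read off the output
```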

2.3 Complexity Measures on Recursive Programs

We measure the complexity of a recursive program on a particular input by the number of logical steps performed in the execution, namely, the total number of times we have to apply a primitive, parse a conditional, or make a recursive call. Following Moschovakis, we call this the sequential logical complexity.

For a $\Phi$-structure $\mathbf{A}$ and a $\Phi$-recursive program $E$, let $M=M(\bar{x},\bar{\mathtt{p}})$ be a term with parameters. The sequential logical complexity $L^{s}(M)=L^{s}(\mathbf{A},E,M)$, which measures the number of logical steps it takes to evaluate $M$, is specified by the following rules. These are easily verified against the informal operational semantics above.

  1. If $M$ is a single parameter $x$, then $L^{s}(M)=0$.

  2. If $M\equiv\phi(M_{1},\dots,M_{n})$, then $L^{s}(M)=1+L^{s}(M_{1})+\dots+L^{s}(M_{n})$.

  3. If $M\equiv\mathtt{p}_{i}(M_{1},\dots,M_{n})$, then $L^{s}(M)=1+L^{s}(M_{1})+\dots+L^{s}(M_{n})+L^{s}(E_{i}(\bar{M}_{1},\dots,\bar{M}_{n}))$, where $\bar{M}_{j}$ is the denotation of $M_{j}(\bar{x},\bar{\mathtt{p}})$.

  4. If $M\equiv$ if $M_{0}$ then $M_{1}$ else $M_{2}$, then

     $$L^{s}(M)=\left\{\begin{array}{ll} 1+L^{s}(M_{0})+L^{s}(M_{1}) & \text{if } \bar{M}_{0}=0\\ 1+L^{s}(M_{0})+L^{s}(M_{2}) & \text{otherwise} \end{array}\right. $$

The number of logical steps it takes to evaluate a program $E$ on input $\bar{x}$ is $L^{s}(\mathbf{A},E,E_{0}(\bar{x},\bar{\mathtt{p}}))$, $E_{0}(\bar{x},\bar{\mathtt{p}})$ being of course the first term in the computation. Therefore, define

$$l^{s}(\mathbf{A},E,\bar{x}):=L^{s}(\mathbf{A},E,E_{0}(\bar{x},\bar{\mathtt{p}})). $$

This will be our fundamental measure of complexity.
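
In code (again a sketch of ours, extending the evaluator of Section 2.1): the four rules become a step-counting evaluator that returns a value together with $L^{s}$.

```python
def evaluate_counting(M, env, prims, bodies):
    if isinstance(M, Var):                            # rule 1: variables are free
        return env[M.i], 0
    if isinstance(M, Prim):                           # rule 2
        rs = [evaluate_counting(N, env, prims, bodies) for N in M.args]
        return prims[M.name](*[v for v, _ in rs]), 1 + sum(s for _, s in rs)
    if isinstance(M, Rec):                            # rule 3
        rs = [evaluate_counting(N, env, prims, bodies) for N in M.args]
        ys = dict(enumerate(v for v, _ in rs))
        v, bs = evaluate_counting(bodies[M.i], ys, prims, bodies)
        return v, 1 + sum(s for _, s in rs) + bs
    if isinstance(M, Cond):                           # rule 4
        t, s0 = evaluate_counting(M.test, env, prims, bodies)
        v, s1 = evaluate_counting(M.then if t == 0 else M.other, env, prims, bodies)
        return v, 1 + s0 + s1
    raise TypeError("not a term")
```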

Sequential Logical Complexity of Tail Recursive Programs

If we have a tail recursive program E given by

$$f(\bar{\mathtt{x}}_{0})=\mathtt{p}(G(\bar{\mathtt{x}}_{0})) \text{ where } $$
$$\big\{\mathtt{p}(\bar{\mathtt{x}}_{1})=\text{ if } \tau(\bar{\mathtt{x}}_{1}) \text{ then } o(\bar{\mathtt{x}}_{1}) \text{ else } \mathtt{p}(F(\bar{\mathtt{x}}_{1}))\big\}, $$

then the sequential logical complexity of this program on input $\bar{x}$ is linearly related to the number of times the recursive variable $\mathtt{p}$ gets called. We state the following (easy) lemma without proof:

Lemma 1

For a given \(\bar {x}\) , let n be the least k such that \(\tau (F^{k}(G(\bar {x})))=0\) . Then

$$n\le l^{s}(\mathbf{A},E,\bar{x})\le Cn+D $$

for some constants C and D independent of \(\bar {x}\).

Tail Recursion, Equivalent Models, and Their Complexity

We hypothesize that in any reasonable computing formalism over abstract structures, there will be a natural way of counting the number of logical steps.

For example, a program schema may be thought of as a flowchart whose edges are labelled by assignment statements, and whose nodes are labelled by conditionals. Given some input, the computation is modelled by a path through the flowchart. The number of steps in the computation can be thought of as the length of that path.

Under this assumption, our thesis becomes:

Let \(\mathcal {M}\) be a model of computation for abstract structures that computes exactly the tail recursive functions over any structure. Then the number of logical steps it takes to evaluate a function f on input \(\bar {x}\) over a structure A using some instance of \(\mathcal {M}\) is bounded below by some linear function of \(l^{s}(\mathbf {A},E,\bar {x})\), for some tail recursive program E computing f.

This thesis becomes a theorem when we substitute in any concrete model of tail recursive computation, for example finite algorithmic procedures, program schemata, or while-programs. This shows that the lower bounds we demonstrate for tail recursive programs apply more broadly.

3 Gaps in Complexity

On the structure $(\mathbf{N}_{Pd},\gamma)$ the total function

$$ f(n,x)=\gamma^{2^{n}}(x) $$
(1)

can be computed by the recursive program $E_{R}$:

$$\begin{array}{ll}f(n,x)=p(n,x)\ \text{where}\\ p(n,x)=\left\{\begin{array}{ll} \gamma(x) & \text{if } n=0\\ p(n-1,p(n-1,x)) & \text{otherwise.} \end{array}\right. \end{array} $$
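
In Python (our transcription, with $\gamma$ passed in as a parameter), the nested "divide-and-conquer" call makes exactly $2^{n}$ applications of $\gamma$:

```python
def f(n, x, gamma):
    # E_R: doubling recursion; applies gamma exactly 2**n times.
    if n == 0:
        return gamma(x)
    return f(n - 1, f(n - 1, x, gamma), gamma)

# Sanity check with gamma = successor: f(n, x) == x + 2**n.
assert f(3, 0, lambda t: t + 1) == 8
```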

It is easy to show:

Lemma 2

$l^{s}((\mathbf{N}_{Pd},\gamma),E_{R},n,x)=O(2^{n})$. In other words, there are constants $c$ and $d$ such that $E_{R}$ takes at most $d+c\cdot 2^{n}$ logical steps on the input $(n,x)$.

For any function \(g:\mathbb {N}\rightarrow \mathbb {N}\), we will define γ such that for infinitely many n, there is an x such that computing f(n,x) takes at least g(n) logical steps by any tail recursive program. This will give a positive answer to our main question, i.e., that even when all recursive functions are tail recursive, there can be arbitrarily large gaps in complexity between the two.

3.1 Combinatorics of $(\mathbf{N}_{Pd},\gamma)$-Tail Recursion

For this section fix a natural number k and assume that γ is strictly increasing (so that $\gamma(x)\ge x$ for all $x$; the γ constructed in Section 3.2 has this property).

Definition 6

For tuples $\bar{m},\bar{n}\in\mathbb{N}^{k}$ and $a\in\mathbb{N}$, define $\bar{m}\sim_{a}\bar{n}$ in case $\min(m_{i},a)=\min(n_{i},a)$ for each $1\le i\le k$.

In other words, \(\bar {m}\sim _{a}\bar {n}\) if the two tuples “agree on parameters less than a.” It is clearly an equivalence relation.
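
For instance (our example, with $k=2$ and $a=5$): $(3,9)\sim_{5}(3,100)$, since $\min(9,5)=\min(100,5)=5$, but $(3,9)\not\sim_{5}(4,9)$, since $\min(3,5)=3\ne 4=\min(4,5)$.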

It is (reasonably) straightforward to show that:

Lemma 3

Let $f:\mathbb{N}\to\mathbb{N}$ and $u,v,w,w^{\prime}\in\mathbb{N}$, and suppose $w^{\prime}-w$ is less than or equal to both $f(u)-u$ and $f(v)-v$ (as integers). Then

$$\min(u,w)=\min(v,w)\implies\min(f(u),w^{\prime})=\min(f(v),w^{\prime}). $$
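
A proof sketch (ours): if $u,v\ge w$, then $f(u)\ge u+(w^{\prime}-w)\ge w^{\prime}$ and likewise $f(v)\ge w^{\prime}$, so both sides of the conclusion equal $w^{\prime}$; otherwise $\min(u,w)=\min(v,w)$ forces $u=v$, and the conclusion is immediate.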

3.1.1 $a$-Equivalence and $(\mathbf{N}_{Pd},\gamma)$-Terms

If we imagine for a moment that a were infinity, the relation ∼ a would simply become equality. In this case, the following two statements would be trivially true:

$$\bar{m}\sim_{a}\bar{n}\implies F(\bar{m})\sim_{a}F(\bar{n}) $$

if F were a term of arity and co-arity k, and

$$\bar{m}\sim_{a}\bar{n}\implies F(\bar{m})=0\iff F(\bar{n})=0 $$

if F were a term of arity k. In the real world, we can still recover analogues to these statements, which are stated in Corollary 1.

Lemma 4

Suppose $F$ is an explicit $(\mathbf{N}_{Pd},\gamma)$-term of arity $k$. Then there is a constant $b$ depending only on $F$ such that for all $a>b$,

$$ \bar{m}\sim_{a}\bar{n}\implies\min(F(\bar{m}),a-b)=\min(F(\bar{n}),a-b). $$
(2)

Proof

We may suppose that $F$ is a term in the variables $\mathtt{x}_{1}$ through $\mathtt{x}_{k}$, in which case $F(\bar{m})$ is the value of the term obtained when $m_{i}$ is substituted for $\mathtt{x}_{i}$. The proof is by induction on the construction of $F$.

If $F$ is a constant 0 or 1, then the conclusion of equation (2) is trivially satisfied for any $b$ and $a>b$, so we may take $b$ to be 0. If $F$ is the variable $\mathtt{x}_{i}$, then $F(\bar{m})=m_{i}$ and $F(\bar{n})=n_{i}$, and $\bar{m}\sim_{a}\bar{n}\implies\min(m_{i},a)=\min(n_{i},a)$ by definition, so we may also take $b$ to be 0.

Suppose that $F$ is $\varphi(F^{\prime})$, where $\varphi$ is $\gamma$ or $Pd$. Then by induction there is some $b^{\prime}$ such that for all $a>b^{\prime}$, $\bar{m}\sim_{a}\bar{n}\implies\min(F^{\prime}(\bar{m}),a-b^{\prime})=\min(F^{\prime}(\bar{n}),a-b^{\prime})$.

Now suppose that $a>b^{\prime}+1$. I claim that

$$\min(F^{\prime}(\bar{m}),a-b^{\prime})=\min(F^{\prime}(\bar{n}),a-b^{\prime}) $$
$$ \implies\min(\varphi(F^{\prime}(\bar{m})),a-(b^{\prime}+1))=\min(\varphi(F^{\prime}(\bar{n})),a-(b^{\prime}+1)) $$
(3)

This follows from Lemma 3 with $f=\varphi$, $w=a-b^{\prime}$, $w^{\prime}=a-(b^{\prime}+1)$, $u=F^{\prime}(\bar{m})$, and $v=F^{\prime}(\bar{n})$. The hypotheses of Lemma 3 are satisfied as

$$(a-(b^{\prime}+1))-(a-b^{\prime})=-1\le\varphi(x)-x $$

for any $x$, when $\varphi$ is either $Pd$ or $\gamma$ (for $\gamma$ we use that it is strictly increasing, so $\gamma(x)\ge x$).

Hence for $a>b^{\prime}+1$, if $\bar{m}\sim_{a}\bar{n}$, then by induction we have $\min(F^{\prime}(\bar{m}),a-b^{\prime})=\min(F^{\prime}(\bar{n}),a-b^{\prime})$, and by equation (3) we have $\min(F(\bar{m}),a-(b^{\prime}+1))=\min(F(\bar{n}),a-(b^{\prime}+1))$. Hence we may take $b=b^{\prime}+1$.

Suppose that $F$ is $eq_{0}(F^{\prime})$. By induction, there is some $b^{\prime}$ such that for all $a>b^{\prime}$, $\bar{m}\sim_{a}\bar{n}\implies\min(F^{\prime}(\bar{m}),a-b^{\prime})=\min(F^{\prime}(\bar{n}),a-b^{\prime})$. Fix $a>b^{\prime}$ and assume $\bar{m}\sim_{a}\bar{n}$. Then $\min(F^{\prime}(\bar{m}),a-b^{\prime})=\min(F^{\prime}(\bar{n}),a-b^{\prime})$, and since $a-b^{\prime}\ge 1$, we may decrease along the second coordinate (in fact this is a consequence of Lemma 3) to obtain $\min(F^{\prime}(\bar{m}),1)=\min(F^{\prime}(\bar{n}),1)$. But this says exactly that $eq_{0}(F^{\prime}(\bar{m}))=eq_{0}(F^{\prime}(\bar{n}))$, and hence (trivially) that $\min(F(\bar{m}),a-b^{\prime})=\min(F(\bar{n}),a-b^{\prime})$. We have shown that

$$\forall a>b^{\prime}\ \ \bar{m}\sim_{a}\bar{n}\implies\min(F(\bar{m}),a-b^{\prime})=\min(F(\bar{n}),a-b^{\prime}), $$

so we may take $b=b^{\prime}$.

Finally, suppose that $F$ is the conditional statement “if $F_{0}$ then $F_{1}$ else $F_{2}$.” By induction, there are $b_{0}$, $b_{1}$, and $b_{2}$ such that for $i\in\{0,1,2\}$, if $a>b_{i}$ and $\bar{m}\sim_{a}\bar{n}$, we have $\min(F_{i}(\bar{m}),a-b_{i})=\min(F_{i}(\bar{n}),a-b_{i})$.

Let $b=\max\{b_{0},b_{1},b_{2}\}$ and suppose $a>b$ and $\bar{m}\sim_{a}\bar{n}$. We then want to show that $\min(F(\bar{m}),a-b)=\min(F(\bar{n}),a-b)$.

Since $a>b_{0}$, $\min(F_{0}(\bar{m}),a-b_{0})=\min(F_{0}(\bar{n}),a-b_{0})$. Since $a-b_{0}\ge 1$, we can decrease the second coordinate to obtain $\min(F_{0}(\bar{m}),1)=\min(F_{0}(\bar{n}),1)$, and hence $F_{0}(\bar{m})=0\iff F_{0}(\bar{n})=0$.

Suppose that $F_{0}(\bar{m})=F_{0}(\bar{n})=0$. Then $F(\bar{m})=F_{1}(\bar{m})$ and $F(\bar{n})=F_{1}(\bar{n})$. Since $a>b_{1}$, $\min(F_{1}(\bar{m}),a-b_{1})=\min(F_{1}(\bar{n}),a-b_{1})$, so $\min(F(\bar{m}),a-b_{1})=\min(F(\bar{n}),a-b_{1})$. Finally, since $b\ge b_{1}$ and hence $a-b\le a-b_{1}$, decreasing along the second coordinate gives $\min(F(\bar{m}),a-b)=\min(F(\bar{n}),a-b)$.

The case that $F_{0}(\bar{m}),F_{0}(\bar{n})>0$ follows similar lines. □

Corollary 1

Suppose $F$ is an explicit $(\mathbf{N}_{Pd},\gamma)$-term of arity $k$. Then there is a constant $b$ depending only on $F$ such that for all $a>b$,

$$\bar{m}\sim_{a}\bar{n}\implies eq_{0}(F(\bar{m}))=eq_{0}(F(\bar{n})). $$

Suppose $G$ is an explicit $(\mathbf{N}_{Pd},\gamma)$-term of arity and co-arity $k$. Then there is a constant $b$ depending only on $G$ such that for all $a>b$,

$$\bar{m}\sim_{a}\bar{n}\implies G(\bar{m})\sim_{a-b}G(\bar{n}). $$

Proof

By Lemma 4, there is some $b$ such that if $\bar{m}\sim_{a}\bar{n}$ and $a>b$, then $\min(F(\bar{m}),a-b)=\min(F(\bar{n}),a-b)$. Since $a-b\ge 1$, decreasing along the second coordinate gives $\min(F(\bar{m}),1)=\min(F(\bar{n}),1)$, which is exactly the statement $eq_{0}(F(\bar{m}))=eq_{0}(F(\bar{n}))$.

To prove the second statement, suppose that $G=(G_{1},\dots,G_{k})$, where each $G_{i}$ is a term of arity $k$. Then by Lemma 4, for each $1\le i\le k$ there is a $b_{i}$ such that if $a>b_{i}$ and $\bar{m}\sim_{a}\bar{n}$, then $\min(G_{i}(\bar{m}),a-b_{i})=\min(G_{i}(\bar{n}),a-b_{i})$.

Let $b=\max\{b_{i}:1\le i\le k\}$, and suppose $a>b$ and $\bar{m}\sim_{a}\bar{n}$. Then for each $i$, $\min(G_{i}(\bar{m}),a-b_{i})=\min(G_{i}(\bar{n}),a-b_{i})$, and hence, decreasing along the second coordinate, $\min(G_{i}(\bar{m}),a-b)=\min(G_{i}(\bar{n}),a-b)$. But this is exactly the statement that $G(\bar{m})\sim_{a-b}G(\bar{n})$. □

3.1.2 $a$-Equivalence and $(\mathbf{N}_{Pd},\gamma)$-Tail Recursion

For the remainder of Section 3.1, fix a tail recursive program E

$$f(\mathtt{x}_{1},\dots,\mathtt{x}_{n})=\mathtt{p}(G(\mathtt{x}_{1},\dots,\mathtt{x}_{n}))\ \text{where} $$
$$ \{\mathtt{p}(\bar{\mathtt{x}})=\text{if} ~\tau(\bar{\mathtt{x}})\ \text{then} ~o(\bar{\mathtt{x}})\ \text{else} ~\mathtt{p}(F(\bar{\mathtt{x}}))\} $$
(4)

where $\bar{\mathtt{x}}=(\mathtt{x}_{1},\dots,\mathtt{x}_{k})$; $G$, $\tau$, $o$, and $F$ are explicit $(\mathbf{N}_{Pd},\gamma)$-terms; and $G$ and $F$ have co-arity $k$.

What we will do is examine the tuples of the form \(F^{j}(G(\bar {x}))\) such that \(\tau (F^{j}(G(\bar {x})))\ne 0\), which are the tuples we evaluate upon in the course of the computation on input \(\bar {x}\). If some tuple is repeated, i.e., \(F^{i}(G(\bar {x}))=F^{j}(G(\bar {x}))\) for some i<j, then the computation fails to halt.

In this section we obtain an analogue of this result when equality is replaced by ∼ a -equivalence.

By Corollary 1 applied to $\tau$ and $F$, there are constants $b$ and $c$ such that for any $a>\max(b,c)$,

$$ \bar{m}\sim_{a}\bar{n}\implies eq_{0}(\tau(\bar{m}))=eq_{0}(\tau(\bar{n})) $$
(5)

and

$$ \bar{m}\sim_{a}\bar{n}\implies F(\bar{m})\sim_{a-b}F(\bar{n}). $$
(6)

Corollary 2

Fix \(\bar {m}\) and \(\bar {n}\) , and suppose \(\bar {m}\sim _{ab+c}\bar {n}\) for some a>0. Then \(\tau (F^{j}(\bar {m}))=0\iff \tau (F^{j}(\bar {n}))=0\) for 0≤j<a.

Proof

Fix $0\le j<a$. Then $F^{j}(\bar{m})\sim_{(a-j)b+c}F^{j}(\bar{n})$: when $j=0$, this is the hypothesis $\bar{m}\sim_{ab+c}\bar{n}$, and when $j>0$, it follows by applying equation (6) $j$ times. (We take $F^{0}$ to be the identity function.)

By equation (5), since $(a-j)b+c>c$, we have that $eq_{0}(\tau(F^{j}(\bar{m})))=eq_{0}(\tau(F^{j}(\bar{n})))$, or in other words $\tau(F^{j}(\bar{m}))=0\iff\tau(F^{j}(\bar{n}))=0$. □

Definition 7

For \(\bar {m}\in \mathbb {N}^{k}\) and \(u<v\in \mathbb {N}\), define

$$\mathfrak{G}(\bar{m},u,v)\iff\{m_{1},\dots,m_{k}\}\cap(u,v)=\emptyset. $$

(Here (u,v) is an interval, i.e. the set \(\{n\in \mathbb {N}:u<n<v\}\).)

Then we have immediately that if $\mathfrak{G}(\bar{m},u,v)$, $\mathfrak{G}(\bar{n},u,v)$, and $\bar{m}\sim_{u}\bar{n}$, then $\bar{m}\sim_{v}\bar{n}$.

Lemma 5

Fix \(\bar {m}\) . Suppose that \(\bar {m}\sim _{ab+c}F^{i}(\bar {m})\) for some a,i>0. Then the least j such that \(\tau (F^{j}(\bar {m}))=0\) , if it exists, cannot be in the interval [i,i+a).

Proof

By Corollary 2, for 0≤j<a,

$$\tau(F^{j}(\bar{m}))=0\iff\tau(F^{i+j}(\bar{m}))=0. $$

Therefore, if $\tau(F^{j}(\bar{m}))=0$ for some $i\le j<a+i$, then also $\tau(F^{j-i}(\bar{m}))=0$ with $j-i<j$ (as $i>0$), so such a $j$ cannot be least. □

Lemma 6

Fix $\bar{m}$. Suppose that there exists a $j$ such that $\tau(F^{j}(\bar{m}))=0$, and let $j_{0}$ be the least such $j$. Suppose further that $\mathfrak{G}(F^{j}(\bar{m}),a,a^{\prime}b+c)$ holds for some $a>0$ and $a^{\prime}\ge a$, and for all $j\le(a+1)^{k}$. Then $j_{0}$ cannot be in the interval $((a+1)^{k},a^{\prime})$.

Proof

We first claim that $F^{n_{1}}(\bar{m})\sim_{a}F^{n_{2}}(\bar{m})$ for some $0\le n_{1}<n_{2}\le(a+1)^{k}$. Otherwise, the $(a+1)^{k}+1$ tuples $\{F^{j}(\bar{m}):0\le j\le(a+1)^{k}\}$ would each lie in their own $\sim_{a}$-equivalence class. However, there are only $(a+1)^{k}$ different $\sim_{a}$-equivalence classes: for each of the $k$ coordinates, there are $a+1$ choices, one for each of $0,1,\dots,a-1$, and one for numbers $\ge a$.

Hence $F^{n_{1}}(\bar{m})\sim_{a}F^{n_{2}}(\bar{m})$ for some $0\le n_{1}<n_{2}\le(a+1)^{k}$. By assumption, $\mathfrak{G}(F^{n_{1}}(\bar{m}),a,a^{\prime}b+c)$ and $\mathfrak{G}(F^{n_{2}}(\bar{m}),a,a^{\prime}b+c)$, so $F^{n_{1}}(\bar{m})\sim_{a^{\prime}b+c}F^{n_{2}}(\bar{m})$. By Lemma 5 (applied to the tuple $F^{n_{1}}(\bar{m})$ with $i=n_{2}-n_{1}$), the least $j$ such that $\tau(F^{n_{1}+j}(\bar{m}))=0$ cannot be in the interval $[n_{2}-n_{1},n_{2}-n_{1}+a^{\prime})$. We may assume $j_{0}>(a+1)^{k}\ge n_{1}$, since otherwise the conclusion is immediate; then this least $j$ is $j_{0}-n_{1}$, so $j_{0}-n_{1}\notin[n_{2}-n_{1},n_{2}-n_{1}+a^{\prime})$ and $j_{0}\notin[n_{2},n_{2}+a^{\prime})$. Since $((a+1)^{k},a^{\prime})\subseteq[n_{2},n_{2}+a^{\prime})$, $j_{0}$ is not contained in $((a+1)^{k},a^{\prime})$. □

This result may seem technical but the philosophy behind it is quite clear:

Suppose every element of a tuple $\bar{m}$ is either very small or very big. Then the computation of $\mathtt{p}$ on $\bar{m}$ halts either very quickly or only after a very long time.

3.2 An Explicit Complexity Gap

We will now apply the results of Section 3.1 to any tail recursive program computing the function $f$ of equation (1). We shall show that, for any increasing function $g$, we can define $\gamma$ such that any tail recursive program $E$ computing $f$ over $(\mathbf{N}_{Pd},\gamma)$ performs at least $g(n)$ logical steps on infinitely many inputs $(n,x)$. (Recall that a general recursive program needs only $O(2^{n})$ logical steps, for any $\gamma$.)

We first describe how to obtain $\gamma$ from $g$. Namely, let $\gamma:\mathbb{N}\to\mathbb{N}$ be any strictly increasing function such that, for every function $L:x\mapsto ux+v$ with $u,v\in\mathbb{N}$, the interval $(n,2^{n}+L(g(n)))$ is disjoint from the range of $\gamma$ for arbitrarily large $n$. The fact that such a function exists requires proof.

Lemma 7

For every increasing function $g:\mathbb{N}\to\mathbb{N}$ there is an increasing function $\gamma:\mathbb{N}\to\mathbb{N}$ such that for all $u,v\in\mathbb{N}$ there exist infinitely many $n$ for which $(n,2^{n}+g(n)u+v)$ is disjoint from the range of $\gamma$.

Proof

We define γ by recursion. Let γ(0) be 0, and define

$$\gamma(i+1):=1+\max\{2^{\gamma(i)+1}+g(\gamma(i)+1)u+v\mid u,v\le i\}, $$

so that for $n=\gamma(i)+1$ we have $\gamma(i)<n<2^{n}+g(n)u+v<\gamma(i+1)$ for all $u,v\le i$. Clearly $\gamma$ is increasing.

Now fix $u,v\in\mathbb{N}$. Then for all $i\ge\max(u,v)$, if $n=\gamma(i)+1$, then $(n,2^{n}+g(n)u+v)$ is disjoint from the range of $\gamma$. This is because $\gamma(i)<n$, $2^{n}+g(n)u+v<\gamma(i+1)$, and $\gamma$ is increasing, so there is no $j$ with $\gamma(i)<\gamma(j)<\gamma(i+1)$. □
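
For concreteness, the construction can be carried out mechanically (a sketch of ours; `g` is the given increasing function, and `make_gamma` memoizes the recursion of Lemma 7):

```python
def make_gamma(g):
    # gamma(0) = 0; gamma(i+1) = 1 + max over u, v <= i of
    # 2**(gamma(i)+1) + g(gamma(i)+1)*u + v, as in the proof of Lemma 7.
    values = [0]
    def gamma(i):
        while len(values) <= i:
            j = len(values) - 1          # we are defining gamma(j+1)
            n = values[-1] + 1           # n = gamma(j) + 1
            values.append(1 + max(2**n + g(n) * u + v
                                  for u in range(j + 1)
                                  for v in range(j + 1)))
        return values[i]
    return gamma
```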

Having defined $\gamma$ from $g$, we suppose that $E$ is a tail recursive program computing $f$ over $(\mathbf{N}_{Pd},\gamma)$, and we now describe the infinite family of inputs on which $E$ runs slowly. Let $L(n)$ be the linear function $nb+c$, where $b$ and $c$ are obtained from $E$ as in Section 3.1.2. By definition, $(n,2^{n}+L(g(n)))$ is disjoint from the range of $\gamma$ for arbitrarily large $n$. Let $n_{1}<n_{2}<\dots$ be an infinite increasing sequence witnessing this, and let $x_{i}:=2^{n_{i}}+L(g(n_{i}))$. The pairs $(n_{i},x_{i})$ for $i\in\mathbb{N}$ will be the infinitely many inputs on which $E$ will perform at least $g(n_{i})$ logical steps.

For all $i,j$, let $\bar{m}_{i,j}:=F^{j}(G(n_{i},x_{i}))$. The $\bar{m}_{i,j}$ are the tuples that get called upon in the course of the computation on input $(n_{i},x_{i})$. Then we have:

Lemma 8

For \(0\le j<2^{n_{i}}\) , \(\mathfrak {G}(\bar {m}_{i,j},n_{i},L(g(n_{i})))\).

Proof

Let $\bar{m}_{i,j}=(m_{1}^{(i,j)},\dots,m_{k}^{(i,j)})$. Let $m$ range over $m_{\ell}^{(i,j)}$ for $i\in\mathbb{N}$, $0\le j<2^{n_{i}}$, and $1\le\ell\le k$. Then $m=b(y)$, where $b$ is an algebraic $\{Pd,\gamma,eq_{0}\}$-term of length less than $2^{n_{i}}$ and $y$ is $0$, $1$, $n_{i}$, or $x_{i}$. I claim that all such $m$ are contained in the set

$$[0,n_{i}]\cup[x_{i}-2^{n_{i}},x_{i}]\cup\bigcup_{\vartheta\in\gamma[\mathbb{N}]}[\vartheta-2^{n_{i}},\vartheta]. $$

In case $b$ is a $\{Pd\}$-term, this is obvious. Otherwise $b=Pd^{j}\circ\gamma\circ b^{\prime}$ or $b=Pd^{j}\circ eq_{0}\circ b^{\prime}$ for some $j<2^{n_{i}}$ and algebraic $\{Pd,\gamma,eq_{0}\}$-term $b^{\prime}$. Define $m^{\prime}:=b^{\prime}(y)$. In the first case, $m=Pd^{j}(\gamma(m^{\prime}))$, which lies in $[\vartheta-2^{n_{i}},\vartheta]$ for $\vartheta=\gamma(m^{\prime})$ in the range of $\gamma$. In the second case, $m$ is 0 or 1.

Since \(x_{i}-2^{n_{i}}=L(g(n_{i}))\), each parameter m is contained in the set

$$[0,n_{i}]\cup[L(g(n_{i})),L(g(n_{i}))+2^{n_{i}}]\cup\bigcup_{\vartheta\in\gamma[\mathbb{N}]}[\vartheta-2^{n_{i}},\vartheta]. $$

However, if we take the intersection with $(n_{i},L(g(n_{i})))$,

$$[0,n_{i}]\cap(n_{i},L(g(n_{i})))=\emptyset $$
$$[L(g(n_{i})),L(g(n_{i}))+2^{n_{i}}]\cap(n_{i},L(g(n_{i})))=\emptyset. $$

If \(z\in (n_{i},L(g(n_{i})))\cap [\vartheta -2^{n_{i}},\vartheta ]\) for some \(\vartheta \in \gamma [\mathbb {N}]\), then

$$n_{i}<z\le\vartheta\le z+2^{n_{i}}<L(g(n_{i}))+2^{n_{i}}. $$

This contradicts the fact that the range of $\gamma$ and $(n_{i},L(g(n_{i}))+2^{n_{i}})$ are disjoint for each $n_{i}$. □

We can now prove our main theorem. The proof uses the notion of value-depth complexity (see Chapter 4 of Moschovakis [7]). In our case, the value-depth complexity of a natural number $y$ given natural numbers $(x_{1},\dots,x_{k})$ is the length of the shortest term $T$ such that $(\mathbf{N}_{Pd},\gamma)\models y=T(x_{i})$ for some $1\le i\le k$. The value-depth complexity of $y$ given $\bar{x}$ is a lower bound for the number of logical steps a program $E$ takes on input $\bar{x}$ if $(\mathbf{N}_{Pd},\gamma)\models E(\bar{x})=y$.

Theorem 1

For sufficiently large $i$, the number of sequential logical calls made by the tail recursive program $E$ on input $(n_{i},x_{i})$ is at least $g(n_{i})$. I.e.,

$$l^{s}((\mathbf{N}_{Pd},\gamma),E,n_{i},x_{i})\ge g(n_{i}). $$

Proof

By Lemma 8 and the definition of L, for all i,

$$\mathfrak{G}(\bar{m}_{i,j},n_{i},g(n_{i})b+c) $$

for $0\le j<2^{n_{i}}$, and hence for $0\le j\le(n_{i}+1)^{k}$ if $i$ is sufficiently large. If $j_{0,i}$ is the least $j$ such that $\tau(\bar{m}_{i,j})=0$, we can apply Lemma 6 (with $a=n_{i}$ and $a^{\prime}=g(n_{i})$) to conclude that $j_{0,i}$ is not in the interval $((n_{i}+1)^{k},g(n_{i}))$ for sufficiently large $i$.

Moreover, $j_{0,i}$ grows exponentially in $n_{i}$. This is because the value-depth complexity of $\gamma^{2^{n_{i}}}(x_{i})$ given $n_{i}$ and $x_{i}$ is $2^{n_{i}}$, and so $2^{n_{i}}$ is a lower bound for the number of calls to $\gamma$. But $j_{0,i}$ is bounded below by a linear function of the number of logical steps, by Lemma 1.

Therefore, for sufficiently large $i$, $j_{0,i}>(n_{i}+2)^{k}$, so $j_{0,i}\ge g(n_{i})$. In other words, on input $(n_{i},x_{i})$, the tail recursion (4) makes at least $g(n_{i})$ recursive calls before halting. But by Lemma 1, this shows that the number of logical steps is also at least $g(n_{i})$. □

4 Conclusion

For each increasing function $g$, we have exhibited a structure $(\mathbf{N}_{Pd},\gamma)$ and a recursive function $f$ over it such that for each tail recursive program $E$ computing $f$, there is an infinite family of inputs $\{(n_{i},x_{i}):i\in\mathbb{N}\}$ with $n_{i}<n_{i+1}$ for each $i$, such that the number of logical steps in the computation $E(n_{i},x_{i})$ is at least $g(n_{i})$. On the other hand, the recursive program $E_{R}$ always makes $O(2^{n})$ logical steps on inputs $(n,x)$, which is optimal up to a linear factor.

Hence we have found structures with arbitrarily large complexity gaps. On the other hand, with an increasing function γ we can compute the successor by a tail recursion, so in terms of computability there is no difference between recursion and tail recursion over (N P d ,γ). Therefore, we have a positive answer to our original question.

We might imagine some ways to strengthen this specific result. For example, we believe that there should be a difference in complexity even when we count just the number of times $\gamma$ is called, instead of the total number of logical steps. This would be somewhat cleaner. It would also be nice to show that there is an infinite family of inputs on which all tail recursive programs run slowly, instead of having the family of inputs depend on the program.

5 Future Work

More generally, there are several additional types of functions for which it would be desirable to find complexity gaps. Namely, we would like to find gaps among unary functions and among relations, i.e., functions into {0,1}. The usual reductions (using pairing functions and the graph relation) fail to preserve complexity measures in the right way when applied to this example.

What would be most interesting is a difference in complexity over a standard structure on the natural numbers, like $\mathbf{N}_{u}$ or the related structure of binary string arithmetic. Since Turing machines and other concrete machine models of computation are naturally expressed by tail recursive programs over these structures, such a complexity difference might imply that such machines cannot faithfully express all algorithms without some overhead. This has foundational importance, since the standard way to formalize the complexity of algorithms uses Turing machines.

A final open question is whether we can find a complexity difference lurking among everyday computational problems, such as searching and sorting. For example, among pure “comparison sorts,” the ones naturally expressed by iterative algorithms, like selection sort and insertion sort, seem to use $O(n^{2})$ comparisons on lists of length $n$, whereas a general recursive algorithm like mergesort uses $O(n\log n)$ comparisons.

We might conjecture that, in a sufficiently abstract model of sorting, one could show a gap in the number of comparisons between recursion and tail recursion. Unfortunately, this is false as stated, but we wonder whether there is anything in this spirit.