
1 Introduction

Consider algorithms that request and receive data from an external source in the course of their computations. These algorithms abound and can be found in all sorts of monitoring and control systems. We suppose these algorithms are modelled by Turing machines with oracles that are physical systems, and whose oracle queries ask and obtain data by means of some process to measure a physical quantity. Essentially, through a measurement procedure, the Turing machine will access a sequence of approximations to a real number.

Starting in [5, 6], we began a theoretical investigation of such physical oracles, focussing on classic deterministic physical experiments. To guide our thinking we conceived an abstract experimenter using some physical equipment to undertake an abstract experiment to measure a physical quantity. The Turing machine modelled the experimental procedure and the data from the oracle modelled observations of the equipment: see [5, 6, 10, 12, 13, 16] inter alia.

Technically, we examined what is involved in an algorithm requesting and receiving data from a physical process, especially the interface properties concerning

  (a) the error margins involved in the data: the queries could have infinite precision, being exact or having finite but vanishingly small errors, or finite precision with a fixed error margin;

  (b) the time taken by the algorithm to acquire the data: the queries need not take one computational step or unit time, but may take time depending on the size of the query.

We also placed complexity constraints on the computations, especially polynomial time.

The computational power of many of these physical oracles has been established using non-uniform complexity classes. These have the general form \(\mathcal {B}/\mathcal {F}\), consisting of a complexity class \(\mathcal {B}\) equipped with a class \(\mathcal {F}\) of special oracles called advice functions. An advice function is a map \(f: \mathbb {N}\rightarrow \varSigma ^*\) that provides extra data f(n) to the Turing machine when computing with inputs of size \(n \in \mathbb {N}\). Advice functions are suitable for representing real numbers (in binary, say). Typically, we take \(\mathcal {B}\) to be the class P, defined by polynomial time deterministic Turing machines; or to be the class BPP, defined by polynomial time Turing machines governed by fair probability distributions. We take \(\mathcal {F}\) to be based on logarithms.
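As a toy illustration (ours, not taken from the cited works), an advice function may supply, for inputs of size n, the first \(O(\log n)\) bits of the binary expansion of a fixed real parameter; the advice depends only on the size of the input, and larger inputs receive longer prefixes of the same real. A minimal sketch, with an arbitrary computable stand-in for the real:

```python
import math

# A fixed real parameter in (0, 1); in the theory it may be non-computable,
# here we pick a computable stand-in purely for illustration.
SIGMA = math.sqrt(2) - 1

def binary_expansion(x: float, k: int) -> str:
    """First k bits of the binary expansion of x in (0, 1)."""
    bits = []
    for _ in range(k):
        x *= 2
        bit = int(x)
        bits.append(str(bit))
        x -= bit
    return "".join(bits)

def advice(n: int) -> str:
    """Advice for inputs of size n: O(log n) bits of the parameter."""
    k = max(1, math.ceil(math.log2(n + 1)))
    return binary_expansion(SIGMA, k)

print(advice(8), advice(1024))  # the shorter string is a prefix of the longer
```

Note that advice(8) is a prefix of advice(1024): this prefix property is exactly what the starred classes such as \( P /\log \!\star \) require.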

Through a detailed investigation of protocols between analogue and digital components of many types of system (see [8, 13]), we established the computational power of these oracles as follows.

For infinite precision measurements, in deterministic polynomial time, the computational power was shown to be \( P /\log \!\star \). However, in the more realistic case of finite fixed precision measurements in deterministic polynomial time, the computational power was shown to be \( BPP //\log \!\star \). This was done for a wide variety of physical oracles and led to a thesis proposing \( BPP //\log \!\star \) as a limit to computation [15]. The probabilistic form of \( BPP //\log \!\star \) is due to the use of probabilities to handle fair choices of data from within the fixed-size error intervals of the deterministic physical oracle. Probabilistic oracles are the subject of [4].

Our attempts to model measurement algorithmically addressed a longstanding question, first formulated by Geroch and Hartle in their intriguing paper [20]: What are the physically measurable numbers? Are the measurable numbers computable numbers? Measurement is a scientific activity supported by a full theory developed throughout the last century as a chapter of mathematical logic (see [21]). Our computational theory of measurement started in [9, 10] and focussed on the time needed to make a measurement; here we consider the amount of data involved in making a measurement.

The data provided by the oracle is constrained by

  1. (i)

    the size of responses to queries, and

  2. (ii)

    the frequency of calls to the oracle.

The size of the data can be controlled by the size |f(n)| of the values of the advice functions. We will show that, for \( BPP //\log \!\star \) and inputs of size n, the number of advice bits translates into a modest number of calls to the oracle, poly-logarithmic in n.

In this paper, we also introduce the possibility of using physical oracles whose behaviour is modelled stochastically, as one finds in statistical mechanics. Imagine a physical experiment modelled by a random walk on the line, as discussed in [19]. The oracle is non-deterministic and can be connected to a Turing machine that can be deterministic or non-deterministic: we will need both. Specifically, we will use Turing machines and fair probabilistic Turing machines.

Let \(\log ^{(k)}\) be the class of advice functions \(f: \mathbb {N}\rightarrow \varSigma ^*\) such that \(|f(n)| \in O(\log ^{(k)}(n))\). Let \(\mathrm {poly}(\log ^{(k)})\) be the class of polynomial functions in \(\log ^{(k)}\). We prove the following:

Theorem 1

The class of sets decidable in polynomial time by RW fair probabilistic Turing machines that can make up to \(\mathrm {poly}(\log ^{(k)}(n))\) calls to the RW oracle, for inputs of size n, is exactly \( BPP //\log ^{(k+1)}\!\star \).

The hierarchy of complexity classes we establish starts at \( BPP //\log \!\star \) and descends arbitrarily close to \( BPP \).

We show strict separation: for every \(k \ge 0\), \(\log ^{(k+1)} \prec \log ^{(k)}\). In particular, this is true for \(k\ge 1\) and we have the following infinite descending chain

$$ \cdots \prec \log ^{(4)} \prec \log ^{(3)} \prec \log ^{(2)} \prec \log , $$

which generates a hierarchy of classes, as stated in the following theorem.


Theorem 2

The classes of sets decided by RW fair probabilistic Turing machines that can make up to

$$\begin{aligned} \cdots \subsetneq \mathrm {poly}(\log ^{(3)}(n))\subsetneq \mathrm {poly}(\log ^{(2)}(n))\subsetneq \mathrm {poly}(\log (n))\subsetneq \mathrm {poly}(n) \end{aligned}$$

calls to the RW oracle coincide with the descending chain of sets

$$\begin{aligned} \cdots \subsetneq BPP //\log ^{(4)}\!\star \subsetneq BPP //\log ^{(3)}\!\star \subsetneq BPP //\log ^{(2)}\!\star \subsetneq BPP //\log \!\star \ , \end{aligned}$$

respectively.

Thus, while measuring a physical magnitude, a number of bits of the binary representation of a real number that is small relative to the size of the input can give rise to hyper-computation.

It is striking to what extent the class \( BPP //\log \!\star \) arises naturally in exploring physical systems and in physically inspired computational models. However, other non-uniform classes have also been found useful. The computational power of deterministic neural networks having access to real numbers in polynomial time was proposed to be \( P /\text{ poly }\) in [24]. These results contrast with our many results involving \( P /\log \!\star \): our reduction of power in deterministic time is due to the fact that measurement takes time in non-linear systems, while the systems considered in [24] are piecewise linear. However, inspired by the work in [24], the authors of [26] specify hardware presumably designed to be capable of computing a non-decidable fragment of \( BPP //\log \!\star \). In our view such systems will not support programming, since programming in this context would require introducing a real number into the system with unbounded precision. Eventually, such systems will be capable of emergent computation due to arbitrary unknown reals (if real numbers exist in Nature) specifying their components. Emergent computational activities might well be relevant in learning tasks.

2 Random Walk Oracles

2.1 Random Walk

Consider the random walk experiment (RWE) of having a particle moving along an axis. The particle is sent from position \(x = 0\) to position \(x = 1\). Then, at each positive integer coordinate, the particle moves right, with probability \(\sigma \), or left, with probability \(1 - \sigma \), as outlined in Fig. 1. If the particle ever returns to its initial position \(x = 0\), then it is absorbed. In this process, the particle takes steps of one unit, at time intervals also of one unit, postulated to be the time step of a Turing machine transition (see [25]).

Fig. 1. Random walk on the line with absorption at \(x = 0\).

Fig. 2. Probabilities of the particle being at various distances from the origin, for the case \(\sigma = 1/4\).

We are interested in the probability that the particle is absorbed (see [22]). Let \(p_i\) be the probability of absorption when the particle is at \(x = i\). In our model, the particle is launched from \(x = 0\) but only starts its random walk at \(x = 1\). It is easy to see that \(p_1 = (1 - \sigma ) + \sigma p_2\). From \(x = 2\), to be absorbed, the particle must first move from \(x = 2\) to \(x = 1\) (not necessarily in one step), and then from \(x = 1\) to \(x = 0\) (again, not necessarily in one step). Both movements are made, independently, with probability \(p_1\); thus \(p_2\) is just \(p_1^2\). More generally, we have \(p_k = p_1^k\). Therefore, the probability for the one-dimensional random walk with absorption at \(x = 0\) satisfies the equation

$$ p_1 = (1 - \sigma ) + \sigma p_1^2 , $$

with solutions \(p_1 = 1\) and \(p_1 = \frac{1 - \sigma }{\sigma }\). For \(\sigma = \frac{1}{2}\), the solutions coincide and \(p_1 = 1\). For \(\sigma < \frac{1}{2}\), the second solution is impossible, because \(\frac{1 - \sigma }{\sigma } > 1\), so, we must have \(p_1 = 1\). For \(\sigma = 1\), the particle always moves to the right, so \(p_1 = 0\). Thus, for the sake of continuity of \(p_1\), for \(\sigma > \frac{1}{2}\), we must choose \(p_1 = \frac{1 - \sigma }{\sigma }\). Consequently, we get

$$ p_1 = \left\{ \begin{array}{ll} 1 &{} \text{ if } \sigma \le \frac{1}{2} \\ \frac{1 - \sigma }{\sigma } &{} \text{ if } \sigma > \frac{1}{2} \end{array}\right. . $$

So, if \(\sigma \le \frac{1}{2}\), with probability 1 the particle always returns, but the number of steps is unbounded. In Fig. 2, we illustrate this situation, for the case \(\sigma = 1/4\), giving the possible locations of the particle, and the respective probabilities, after the first steps.
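A minimal Monte Carlo sketch (ours, for illustration only; the cut-off max_steps is an artefact of the simulation) comparing the closed form for \(p_1\) with simulated absorption frequencies:

```python
import random

def absorbed(sigma: float, max_steps: int = 2_000) -> bool:
    """Simulate one particle launched from x=0 to x=1; True if it returns
    to x=0 within max_steps (a finite cut-off standing in for 'eventually')."""
    x = 1  # the launch step x=0 -> x=1 has already happened
    for _ in range(max_steps):
        x += 1 if random.random() < sigma else -1
        if x == 0:
            return True
    return False

def p1_closed_form(sigma: float) -> float:
    return 1.0 if sigma <= 0.5 else (1 - sigma) / sigma

for sigma in (0.25, 0.5, 0.75):
    trials = 5_000
    est = sum(absorbed(sigma) for _ in range(trials)) / trials
    print(f"sigma={sigma}: simulated {est:.3f}, "
          f"closed form {p1_closed_form(sigma):.3f}")
```

For \(\sigma = 1/2\) the simulated value falls slightly short of 1: absorption is certain, but the return time is unbounded, which is exactly the phenomenon noted above.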

2.2 Machines with Random Walk Oracles

We will combine the RWE with both Turing machines and fair probabilistic Turing machines. Probabilistic Turing machines have been around since the 1950s and have a number of equivalent formulations. For example, the machine may randomly choose between the available transitions at each step with probability \(\frac{1}{2}\). Perhaps the most elegant and easiest way to describe them is to say that they have access to a fair independent coin toss oracle, returning values ‘heads’ or ‘tails’ with probability \(\frac{1}{2}\). Whilst the various definitions of the machines can be shown to be equivalent, the different criteria in use for recognising strings are not.

Definition 1

Consider any form of Turing machine that gives probabilistic results, e.g. a Turing machine with any form of random oracle. A set \(A\subset \{0,1\}^*\) is accepted by such a Turing machine \(\mathcal {M}\) in polynomial time if there is a \(\gamma < 1/2\) such that, for every input w, \(\mathcal {M}\) halts in polynomial time and

  • If \(w \in A\), \(\mathcal {M}\) accepts w with error probability bounded by \(\gamma \);

  • If \(w \notin A\), \(\mathcal {M}\) rejects w with error probability bounded by \(\gamma \).

For example, fair probabilistic Turing machines are used to define the class BPP with the criterion that, on any given run of the algorithm, the probability of giving the wrong answer, whether the answer is accept or reject, is at most (say) \(\frac{1}{3}\). Fair probabilistic Turing machines are required for our main theorems.

Now, let us consider a Turing machine coupled with a random walk experiment, as introduced in [19]. To use the RWE as an oracle, we assume that the probability \(\sigma \) that the particle moves forward encodes some advice. Unlike the scatter machine experiments in [1, 6, 12], the RWE does not need to be initialized with any parameters, i.e., the Turing machine does not provide the oracle with any dyadic rational; it just “pulls the trigger” to start the experiment. We consider both a Turing machine with added RWE oracle, a RW Turing machine, and a fair probabilistic Turing machine with added RWE oracle, a RW fair probabilistic Turing machine.

For every unknown \(\sigma \in (0, 1)\), the time that a particle takes to be absorbed is unbounded. We introduce a constant time schedule to bound the oracle consultation time. If the particle is absorbed during that time, the finite control of the Turing machine changes to the ‘yes’ state, otherwise, the finite control changes to the ‘no’ state. The experiment has two possible outcomes and a constant time schedule.

We analyse the probability of ‘yes’.

A path of the random walk is a possible sequence of moves that the particle makes until it is absorbed. Note that all such paths consist of an even number of steps. Paths of the random walk along the positive x-axis with absorption at \(x = 0\) are isomorphic to a specific set of well-formed sequences of parentheses. For instance, in a random walk of length 6, the particle could behave as ((())) or (()()), where a movement to the right is represented by “(” and a movement to the left is represented by “)”. The first opening parenthesis corresponds to the first move of the particle from \(x = 0\) to \(x = 1\). The probability of absorption in exactly 6 steps is the sum of the probabilities of these two paths. All paths of a certain length have the same probability; namely, for every even number n, the probability of each path of length n is

$$ \sigma ^{\frac{n}{2} - 1}(1 - \sigma )^{\frac{n}{2}} . $$

Therefore, we only need to know the number of possible paths for each length, i.e., the number of well-formed sequences of parentheses satisfying some properties. In [17], the authors generalize the Catalan numbers and prove the following interesting result:

Proposition 1

(Blass and Braun [17]) For every \(\ell , w \in \mathbb {Z}\), \(\ell \ge w \ge 0\), let X be the number of strings consisting of \(\ell \) left and \(\ell \) right parentheses, starting with w consecutive left parentheses, and having the property that every nonempty, proper, initial segment has strictly more left than right parentheses. Then

$$\begin{aligned} X = \frac{w}{2 \ell - w} \left( {\begin{array}{c}2 \ell - w\\ \ell \end{array}}\right) . \end{aligned}$$

Note that when \(w = \ell = 0\), the undefined fraction \(w /(2 \ell - w)\) is to be interpreted as 1, since this gives the correct value \(X = 1\), corresponding to the empty string of parentheses. From this proposition, we derive the probability q(t) that the particle is absorbed at (even) time \(t + 1\), for odd \(t \ge 1\). It suffices to take \(\ell = (t + 1)/2\) and \(w = 1\):

$$ q(t) = \frac{1}{t} \left( {\begin{array}{c}t\\ \frac{t + 1}{2}\end{array}}\right) (1 - \sigma )^{\frac{t + 1}{2}} \sigma ^{\frac{t + 1}{2} - 1} . $$

Therefore, the probability that the particle is absorbed during the time schedule T is given by

$$ F(\sigma , T) = \sum _{\begin{array}{c} t = 1 \\ t \text { odd} \end{array}}^{T - 1} \frac{1}{t} \left( {\begin{array}{c}t\\ \frac{t + 1}{2}\end{array}}\right) (1 - \sigma )^{\frac{t + 1}{2}} \sigma ^{\frac{t + 1}{2} - 1} . $$

This is the probability of getting the outcome ‘yes’ from the oracle. Figure 3 allows us to understand the behaviour of the probability \(F(\sigma , T)\) as a function of \(\sigma \). We see that, as T increases, \(F(\sigma , T)\) increases as well, since the longer the machine waits, the more likely it is that a particle is absorbed. We can also see that as T approaches infinity, \(F(\sigma , T)\) approaches the probability \(p_1\) that the particle is absorbed, which makes sense, since \(p_1\) represents a probability of absorption with unbounded time. For analytical reasons, we will consider only \(\sigma \in [ \frac{1}{2}, 1 ]\), corresponding to a variation of \(p_1\) from 1 to 0. Note that we could consider any interval contained in [0, 1]. For every T, this probability is a function of \(\sigma \) that satisfies the following conditions:

  (a) \(F(\bullet , T) \in C^1([\frac{1}{2}, 1])\),

  (b) for every \(\sigma \in [ \frac{1}{2}, 1 ]\), \(F^{\prime }(\sigma , T) \ne 0\), and

  (c) n bits of \(F(\bullet , T)\) are computable in time \(O(2^n)\).

These conditions are the basis of an axiomatisation SPO of stochastic physical oracles in the forthcoming paper [4], from which we take the following theorem:

Theorem 3

For every set A, \(A \in BPP //\log \!\star \) if, and only if, it is decidable by a RW Turing machine in polynomial time.

Fig. 3. Graphs of \(F(\sigma , T)\) for \(T = 2\), \(T = 10\) and \(T = 100\).
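For illustration, the curves of Fig. 3 can be reproduced by evaluating the finite sum for \(F(\sigma , T)\) directly; a minimal sketch (ours), which also exhibits the convergence of \(F(\sigma , T)\) to \(p_1\) as T grows:

```python
from math import comb

def F(sigma: float, T: int) -> float:
    """Probability of absorption within the time schedule T: sum over odd
    t <= T-1 of (1/t) C(t, (t+1)/2) (1-sigma)^((t+1)/2) sigma^((t+1)/2 - 1)."""
    total = 0.0
    for t in range(1, T, 2):  # odd t from 1 up to T-1
        half = (t + 1) // 2
        total += comb(t, half) / t * (1 - sigma) ** half * sigma ** (half - 1)
    return total

def p1(sigma: float) -> float:
    """Unbounded-time absorption probability derived in Sect. 2.1."""
    return 1.0 if sigma <= 0.5 else (1 - sigma) / sigma

for T in (2, 10, 100):
    print(T, [round(F(s / 10, T), 3) for s in range(5, 10)])
print("  ", [round(p1(s / 10), 3) for s in range(5, 10)])
# F(sigma, T) increases with T and approaches p1(sigma) from below.
```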

3 Computational Resources

Suppose that there is a limit on the number of particles that the RW Turing machine can launch, i.e., a bound on the number of oracle calls that the machine can make. We now study how the precision in the measurement of \(\sigma \) depends on the number of oracle calls.

Theorem 4

A RW Turing machine, or a RW fair probabilistic Turing machine, that can make up to \(\xi (n)\) calls to the RW oracle, on input w of size \(|w| = n\), can read \(\frac{1}{2} \log (\xi (n))+c\) bits of the unknown parameter \(\sigma \), where c is a fixed constant, in polynomial time.

Proof

The proof is common to both types of Turing machine. We know that each particle has probability of absorption \(F(\sigma , T)\) in time T. Thus, if we make \(\xi (n)\) oracle calls on an input of size n, the number of times \(\alpha \) that the experiment returns ‘yes’ is a random variable with binomial distribution. Let us consider \(X = \alpha /\xi (n)\), the random variable that represents the relative frequency of absorption (‘yes’). We have the expected value \(\mathbb {E}[X] = \mathbb {E}[\alpha ]/\xi (n) = {\xi (n) F(\sigma , T)}/\xi (n) = F(\sigma , T)\) and the variance \(\mathbb {V}[X] = \mathbb {V}[\alpha ]/\xi (n)^2 = {\xi (n) F(\sigma , T)(1 - F(\sigma , T))}/\xi (n)^2 = F(\sigma , T)(1 - F(\sigma , T))/\xi (n)\). Chebyshev’s inequality states that, for every \(\delta > 0\),

$$ P(|X - \mathbb {E}[X]| > \delta ) \le \frac{\mathbb {V}[X]}{\delta ^2} \le \frac{F(\sigma , T)(1 - F(\sigma ,T))}{\xi (n) \delta ^2} \le \frac{F(\sigma , T)}{\xi (n) \delta ^2} . $$

Let k be the number of bits of \(\sigma \) to be read. This means that we have to find \(\sigma \) up to an error of \(2^{- k - 5}\). To do this, we first estimate the probability \(F(\sigma , T)\) up to an error \(\delta \), and then run a bisection algorithm to find the value of \(\sigma \) (this may require polynomial time). The value of \(\delta \) needed to ensure the required accuracy of \(\sigma \) depends on the lower bound of the derivative of F. To allow for this we set \(\delta =C\,2^{-k}\) for some \(C>0\), and then

$$ P(|X - F(\sigma , T)| > C\,2^{-k}) \le \frac{2^{2 k }C^{-2}\, F(\sigma , T)}{\xi (n)} \le \frac{2^{2 k }C^{-2}}{\xi (n)} , $$

and if we want an error probability of at most \(\gamma \), we set

$$ \frac{2^{2 k }C^{-2}}{\xi (n)} \le \gamma . $$

Applying logarithms, we get

$$ 2 k -2\, \log (C) - \log (\xi (n)) \le \log (\gamma ) \ , $$

therefore,

$$ k \le \frac{\log (\xi (n)) + \overbrace{\log (\gamma ) +2\,\log (C) }^{\text{ constant } \text{ value }}}{2} . $$

For the RW Turing machine, for every \(\sigma \), \(F(\sigma , T)\) increases with T and the term \(\log (1/F(\sigma , T))\) decreases; contrary to what one might expect, for every input word w of size n, the longer we wait for the particles to return, the less precision we can obtain for \(\sigma \). We take the particular case that in every oracle call the machine will wait exactly two time steps for the particle to return (\(T = 2\)). Therefore, \(F(\sigma , 2) = (1 - \sigma )\). Now, with \(k \in O(\log (\xi (n)))\), we have

$$ P(|(1 - X) - \sigma | > 2^{- k - 5}) = P(|X - (1 - \sigma )| > 2^{- k - 5}) \le \gamma . $$

Thus, with the value \(1 - X\), we can estimate \(\sigma \). \(\Box \)
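A sketch (ours) of the estimation procedure in this proof, for the schedule \(T = 2\), where \(F(\sigma , 2) = 1 - \sigma \): the relative frequency X of ‘yes’ answers over \(\xi (n)\) oracle calls yields the estimate \(1 - X\) of \(\sigma \), accurate to roughly \(\frac{1}{2}\log (\xi (n)) + c\) bits. The parameter value below is an arbitrary illustrative choice.

```python
import random

def rw_oracle_T2(sigma: float) -> bool:
    """One oracle call with schedule T=2: the particle is absorbed at time 2
    (answer 'yes') exactly when its first random step is to the left."""
    return random.random() >= sigma  # 'yes' with probability 1 - sigma

def estimate_sigma(sigma: float, xi: int) -> float:
    """Estimate sigma from xi oracle calls via the relative frequency of 'yes';
    by Chebyshev, about (1/2) log2(xi) + c bits of sigma are reliable."""
    yes = sum(rw_oracle_T2(sigma) for _ in range(xi))
    return 1 - yes / xi

sigma = 0.6180339887  # the unknown parameter (illustrative value)
for xi in (10**2, 10**4, 10**6):
    print(xi, round(estimate_sigma(sigma, xi), 6))
# Each factor of 100 in the number of calls buys roughly log2(100)/2,
# i.e. about 3, further reliable bits of sigma.
```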

This result suggests that a non-collapsing hierarchy of classes can be defined by the magnitude of the number of queries to the oracle. As we want this to be a hierarchy built on BPP and within \( BPP //\log \!\star \), we must ensure that all of the machines we consider can compute BPP. Thus we consider a RW oracle added to a probabilistic Turing machine, to give an RW fair probabilistic Turing machine.

4 Lower and Upper Bounds

We encode advice functions in order to compare RW Turing machines with Turing machines with advice. We define the iterated logarithmic functions \(\log ^{(k)}(n)\):

  • \(\log ^{(0)}(n) = n\);

  • \(\log ^{(k + 1)}(n) = \log (\log ^{(k)}(n))\).

Similarly, we define the iterated exponential \(\exp ^{(k)}(n)\):

  • \(\exp ^{(0)}(n) = n\);

  • \(\exp ^{(k + 1)}(n) = 2^{\exp ^{(k)}(n)}\).

The iterated exponential is a well-known bound on the number of computation steps of elementary functions (see, e.g., [23]). For every \(k \in \mathbb {N}\), the functions \(\log ^{(k)}\) and \(\exp ^{(k)}\) are inverses of each other. Let \(\log ^{(k)}\) also denote the class of advice functions f such that \(|f(n)| \in O(\log ^{(k)}(n))\).
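These definitions translate directly into code; a minimal sketch (ours):

```python
from math import log2

def log_k(k: int, n: float) -> float:
    """Iterated logarithm: log^(0)(n) = n, log^(k+1)(n) = log2(log^(k)(n))."""
    for _ in range(k):
        n = log2(n)
    return n

def exp_k(k: int, n: float) -> float:
    """Iterated exponential: exp^(0)(n) = n, exp^(k+1)(n) = 2^(exp^(k)(n))."""
    for _ in range(k):
        n = 2.0 ** n
    return n

n = 65536.0
for k in range(4):
    # exp^(k) inverts log^(k): the second value is always n again
    print(k, log_k(k, n), exp_k(k, log_k(k, n)))
```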

Let c(w) be the encoding of a single word w. We define the encoding \(y(f) = \lim y(f)(n) \) for an advice function \(f \in {\log ^{(k)}\!\star }\) in the following way:

  • \(y(f)(0) = 0.c(f(0))\);

  • if \(f(n + 1) = f(n)s\), then

    $$ y(f)(n + 1) = \left\{ \begin{array}{ll} y(f)(n) c(s) &{} \text{ if } n + 1 \text { is not of the form } \exp ^{(k)}(m) \\ y(f)(n) c(s) 001 &{} \text{ if } n + 1 \text { is of the form } \exp ^{(k)}(m) \end{array}\right. $$

So, for example, if we want to encode a function \(f \in \log \log \!\star \), we just have to place the separator 001 when \(n + 1\) is of the form \(2^{2^m}\), for some \(m \in \mathbb {N}\).

For every k and for every \(f \in {\log ^{(k)}\!\star }\), we have that \(y(f) \in \mathcal {C}_3\). Also, for every n, in order to extract the value of f(n), we only need to find the number \(m \in \mathbb {N}\) such that \(\exp ^{(k)}(m - 1)< n \le \exp ^{(k)}(m)\) and then read y(f) in triplets, until we find the \((m + 1)\)-th separator. Then we need only ignore the separators and replace each triplet 100 by 0 and each triplet 010 by 1. Since \(f \in {\log ^{(k)}\!\star }\), we know that \(|f(\exp ^{(k)}(m))| = O(\log ^{(k)}(\exp ^{(k)}(m))) = O(m)\). We conclude that \(3 O(m) + 3 (m + 1) = O(m)\) bits are enough to get the value of \(f(\exp ^{(k)}(m))\) and, consequently, \(O(\log ^{(k)}(n))\) bits to get the value of f(n).
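A sketch (ours) of this encoding, under the triplet convention read off from the preceding paragraph (100 for 0, 010 for 1, 001 as separator); the advice function used in the illustration is an arbitrary stand-in:

```python
from math import floor, log2

def c(word: str) -> str:
    """Triplet encoding of a binary word: 0 -> '100', 1 -> '010'."""
    return "".join("100" if b == "0" else "010" for b in word)

def exp_k(k: int, n: int) -> int:
    for _ in range(k):
        n = 2 ** n
    return n

def encode(f, k: int, N: int) -> str:
    """Digits of y(f) after processing inputs 0..N, for a prefix advice
    function f in log^(k)*: each step appends the encoding of the new
    suffix s (f(n) = f(n-1)s), plus a separator '001' when n = exp^(k)(m)."""
    marks, m = set(), 0
    while exp_k(k, m) <= N:
        marks.add(exp_k(k, m))
        m += 1
    y = c(f(0))
    for n in range(1, N + 1):
        y += c(f(n)[len(f(n - 1)):])  # encode only the new suffix
        if n in marks:
            y += "001"
    return y

# Illustration with f in log*: |f(n)| = floor(log2(n)) + 1, the prefixes of
# an arbitrary bit string standing in for the binary digits of a real.
GAMMA = "01101001100101101001011001101001"
f = lambda n: GAMMA[: (1 if n == 0 else floor(log2(n)) + 1)]
print(encode(f, 1, 16))
```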

Definition 2

Denote by \(\mathrm {poly}(g(n))\) the class of functions \(f:\mathbb {N} \rightarrow \mathbb {N}\) for which there is a polynomial p(x) so that \(f(n)\le p(g(n))\) for all \(n\in \mathbb {N}\).

We can use this to prove the following result:

Theorem 5

(Lower bounds) For every k, every set in \( BPP //\log ^{(k+1)}\!\star \) is decidable in polynomial time by a RW fair probabilistic Turing machine that can make up to \(\xi (n) \in \mathrm {poly}(\log ^{(k)}(n))\) RW oracle calls on inputs of size n.

Proof

Let A be an arbitrary set in \( BPP //\log ^{(k+1)}\!\star \) and \(\mathcal {M}\) a probabilistic Turing machine with advice \(f \in {\log ^{(k+1)}\!\star }\), which decides A in polynomial time with error probability bounded by \(\gamma _1 \in (0, 1/2)\).

Let \(\mathcal {M}^{\prime }\) be a RW fair probabilistic Turing machine with unknown parameter y(f), the encoding of f, and let \( \gamma _2 \in \mathbb {R}\) be such that \(\gamma _1 + \gamma _2 < 1/2\). Let w be a word such that \(|w| \le n\). Theorem 4 assures that \(\mathcal {M}^{\prime }\) can estimate, up to additive constants, \(\frac{1}{2}\log (\xi (n)) = \frac{1}{2}\log \big ((\log ^{(k)}(n))^m\big )\) bits of y(f) (which, for m large, gives an arbitrary constant multiple of \(\log ^{(k + 1)}(n)\)); thus, \(\mathcal {M}^{\prime }\) can read f(n) in scheduled protocol time \(T = 2\) and in machine polynomial time, with an error probability bounded by \(\gamma _2\). By definition, the machine can also make a sequence of fair coin tosses of polynomial length. Therefore, \(\mathcal {M}^{\prime }\) can decide A in polynomial time, with error probability bounded by \(\gamma _1 + \gamma _2 < 1/2\). \(\Box \)

Taking the special case \(k=0\), we have the following complementary result to Theorem 3:

Corollary 1

Every set in \( BPP //\log \!\star \) is decidable in polynomial time by a RW fair probabilistic Turing machine that can make up to \(\xi (n) \in \mathrm {poly}(n)\) RW oracle calls on inputs of size n.

In order to state and prove upper bounds, we need the following auxiliary result. This uses the query tree \(\mathcal {T}\), a tree that branches in two, ‘yes’ and ‘no’, every time a query is made. The probability of taking a path down the tree is just the product of the probabilities of the edges taken at every vertex.

Theorem 6

Let A be the set decided by a RW Turing machine, or RW fair probabilistic Turing machine, \(\mathcal {M}\) with unknown parameter \(\sigma \) that can make up to \(\xi (n)\) calls to the RW oracle, for inputs of size n, with error probability bounded by \(\gamma < 1/4\). If \(\mathcal {M}^{\prime }\) is an identical RW machine, except with unknown parameter \(\tilde{\sigma }\) and the probability of absorption \(\tilde{F}\), such that

$$ |F(\sigma , T) - \tilde{F}(\tilde{\sigma }, T)| < \frac{1}{8 \xi (n)} , $$

then, for any word of size \(\le n\), the probability of \(\mathcal {M}^{\prime }\) making an error when deciding A is \(\le 3/8\).

Proof

We know that \(\mathcal {M}\) and \(\mathcal {M}^{\prime }\) make at most \(\xi (n)\) calls to the oracle, so that the query tree \(\mathcal {T}\) associated to both has maximum depth \(\xi (n)\). Let w be of size not greater than n. Let D be the assignment of probabilities to the edges of \(\mathcal {T}\) corresponding to the unknown parameter \(\sigma \) and ‘yes’ probability \(F(\sigma , T)\) and \(D^{\prime }\) be the assignment of probabilities given by the unknown parameter \(\tilde{\sigma }\) and ‘yes’ probability \(\tilde{F}(\tilde{\sigma }, T)\). Since \(|F(\sigma , T) - \tilde{F}(\tilde{\sigma }, T)| < 1/(8 \xi (n))\), the probabilities assigned by D and \(D^{\prime }\) to any given edge differ by at most

$$ \kappa = \frac{1}{8 \xi (n)} . $$

Invoking Proposition 11 of [1], we have two different cases:

  • \(w \notin A\): In this case, an incorrect result corresponds to \(\mathcal {M}^{\prime }\) accepting w. The probability of acceptance \(P_{A}(\mathcal {T}, D^{\prime })\) for \(\mathcal {M}^{\prime }\) is

    $$\begin{aligned} P_{A}(\mathcal {T}, D^{\prime })&\le P_{A}(\mathcal {T},D) + |P_{A}(\mathcal {T},D^{\prime }) - P_{A}(\mathcal {T},D)| \\&\le \gamma + \xi (n) \kappa \\&\le \gamma + \xi (n)\frac{1}{8\,\xi (n)} = \frac{1}{4} + \frac{1}{8} = \frac{3}{8} \end{aligned}$$
  • \(w \in A\): In this case, an incorrect result corresponds to \(\mathcal {M}^{\prime }\) rejecting w. The probability of rejection \(P_{R}(\mathcal {T}, D^{\prime })\) for \(\mathcal {M}^{\prime }\) is

    $$\begin{aligned} P_{R}(\mathcal {T}, D^{\prime })&\le P_{R}(\mathcal {T}, D) + |P_{R}(\mathcal {T}, D^{\prime }) - P_{R}(\mathcal {T}, D)| \\&\le \gamma + \xi (n) \kappa \\&\le \gamma + \xi (n) \frac{1}{8\,\xi (n)} = \frac{1}{4} + \frac{1}{8} = \frac{3}{8} \end{aligned}$$

In both cases, the error probability is bounded by 3/8. \(\Box \)

Let \(F(\sigma , T) \!\!\downharpoonleft _{m}\) denote the first m bits of the probability \(F(\sigma , T)\). The next theorem is a corollary of the previous:

Theorem 7

Let A be the set decided by RW fair probabilistic Turing machine \(\mathcal {M}\) with unknown parameter \(\sigma \) that can make up to \(\xi (n)\) calls to the RW oracle, for inputs of size n, with error probability bounded by \(\gamma < 1/4\). If \(\mathcal {M}_n\) is an identical fair probabilistic Turing machine, with unknown parameter \(\tilde{\sigma }\), but with the exception that the probability that the oracle returns ‘yes’ is given by \(F(\sigma , T) \!\!\downharpoonleft _{\log \xi (n) + 3}\), then \(\mathcal {M}_n\) decides the same set as \(\mathcal {M}\) in the same time, but with error probability bounded by 3 / 8.

Now we state and prove upper bounds.

Theorem 8

(Upper bounds) For every k, every set decided in polynomial time by a RW Turing machine, or RW fair probabilistic Turing machine, that can make up to \(\xi (n) = \mathrm {poly}(\log ^{(k)}(n))\) calls to the RW oracle, where n is the size of the input, is in \( BPP //\log ^{(k+1)}\!\star \).

Proof

Let A be a set decided in polynomial time p(n) and with error probability bounded by 1/4 by a RW Turing machine \(\mathcal {M}\) with unknown parameter \(\sigma \) that can make up to \(\xi (n) \in \mathrm {poly}(\log ^{(k)}(n))\) calls to the oracle. We specify a probabilistic Turing machine \(\mathcal {M}^{\prime }\) with advice \(f(n) = F(\sigma ,T) \!\!\downharpoonleft _{\log \xi (n) + 3}\) to decide A. We have \(f \in {\log ^{(k+1)}\!\star }\).

By Theorem 7, we know that an RW Turing machine with ‘yes’ probability f(n) decides the same as \(\mathcal {M}\) for words of size \(\le n\), but with error probability \(\le 3/8\). The value \(f(n) = F(\sigma , T) \!\!\downharpoonleft _{\log \xi (n) + 3}\) is a dyadic rational with denominator \(2^{\log \xi (n) + 3}\). Thus, \(m = 2^{\log \xi (n) + 3} f(n) \in [0, 2^{\log \xi (n) + 3})\) is an integer. Consider \(\kappa = \log \xi (n) + 3\) fair coin tosses, interpreted as a sequence of bits. The machine \(\mathcal {M}^{\prime }\) then tests if \(\tau _1 \tau _2 \dots \tau _\kappa < m\), where \(\tau _1 \tau _2 \dots \tau _\kappa \) is now interpreted as an integer. If the test is true, the machine returns ‘yes’, otherwise it returns ‘no’. The probability of returning ‘yes’ is \(m / 2^\kappa = f(n)\), as required. The time taken is polynomial in n. \(\Box \)
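A sketch (ours) of the coin-toss comparison used at the end of the proof: \(\kappa \) fair coin tosses realise any dyadic probability \(m/2^{\kappa }\) exactly.

```python
import random

def bernoulli_dyadic(m: int, kappa: int) -> bool:
    """Return True with probability exactly m / 2**kappa: interpret kappa
    fair coin tosses as a uniform integer tau in [0, 2**kappa) and compare."""
    tau = 0
    for _ in range(kappa):
        tau = (tau << 1) | (random.random() < 0.5)
    return tau < m

# Example: simulate 'yes' probability f(n) = 5/8 with kappa = 3 tosses.
trials = 100_000
freq = sum(bernoulli_dyadic(5, 3) for _ in range(trials)) / trials
print(freq)  # ~ 0.625
```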

From Theorems 5 and 8, we get the following corollary:

Theorem 9

The class of sets decidable in polynomial time by RW fair probabilistic Turing machines that can make up to \(\mathrm {poly}(\log ^{(k)}(n))\) calls to the RW oracle, for inputs of size n, is exactly \( BPP //\log ^{(k+1)}\!\star \).

As we want the RW Turing machines to run in polynomial time, the maximum number of oracle calls that we can allow is polynomial. For that bound, the corresponding class is \( BPP //\log \!\star \). Thus, by restricting the number of queries to the oracle more and more, we obtain a fine structure within \( BPP //\log \!\star \). Observe that if k is a very large number, the machine is allowed to make only a few calls to the oracle, but then the advice is smaller, so the number of bits that the machine needs to read is also smaller.

5 The Hierarchy

We explore some properties of advice classes (see [3, 7, 24]).

If \(f: \mathbb {N}\rightarrow \varSigma ^*\) is an advice function, then we use |f| to denote its size, i.e., the function \(|f|:\mathbb {N}\rightarrow \mathbb {N}\) such that \(|f|(n) = |f(n)|\), for every \(n \in \mathbb {N}\). For a class of functions, \(\mathcal {F}\), \(|\mathcal {F}| = \{|f|: f \in \mathcal {F}\}\).

Definition 3

A class \(\mathcal {F}\) of advice functions is said to be a class of reasonable advice functions if:

  1. for every \(f \in \mathcal {F}\), |f| is computable in polynomial time;

  2. for every \(f \in \mathcal {F}\), |f| is bounded by a polynomial;

  3. for every \(f \in \mathcal {F}\), |f| is increasing;

  4. \(|\mathcal {F}|\) is closed under addition and multiplication by positive integers;

  5. for every polynomial p with positive integer coefficients and every \(f \in \mathcal {F}\), there exists \(g \in \mathcal {F}\) such that \(|f|\circ p\le |g|\).

Definition 4

Let r and s be two total functions. We say that \(r \prec s\) if \(r \in o(s)\). Let \(\mathcal {F}\) and \(\mathcal {G}\) be classes of advice functions. We say that \(\mathcal {F}\prec \mathcal {G}\) if there exists a function \(g \in \mathcal {G}\) such that, for every \(f \in \mathcal {F}\), \(|f| \prec |g|\).

We have \(\log ^{(k + 1)} \prec \log ^{(k)}\), for all \(k \ge 0\). Now, we just need to know the relation between the non-uniform complexity classes of \( BPP \) induced by the relation \(\prec \) on the advice classes. Recall that a set is said to be tally if it is a language over an alphabet of a single symbol (e.g. \(\{ 0 \}\)). Now, consider the set of finite sequences over the alphabet \(\varSigma \) ordered first by size and then alphabetically. The characteristic function of a set \(A \subseteq \varSigma ^*\) is the unique infinite sequence \(\chi _A: \mathbb {N}\rightarrow \{ 0, 1 \}\) such that, for every n, \(\chi _A (n)\) is 1 if, and only if, the n-th word in that order is in A. The characteristic function of a tally set A is a sequence where the i-th bit is 1 if, and only if, the word \(0^i\) is in A. The following theorem generalizes the related theorem of [3, 7, 24], where it is proved for the deterministic case.

Theorem 10

If \(\mathcal {F}\) and \(\mathcal {G}\) are two classes of reasonable sublinear advice functions such that \(\mathcal {F}\prec \mathcal {G}\), then \( BPP //\mathcal {F}\subsetneq BPP //\mathcal {G}\).

Proof

Trivially, \( BPP //\mathcal {F}\subseteq BPP //\mathcal {G}\). Let \( linear \) be the set of advice functions of size linear in the size of the input and \(\eta . linear \) be the class of advice functions of size \(\eta n\), where n is the size of the input and \(\eta \) is a number such that \(0< \eta < 1\). There is an infinite sequence \(\gamma \) whose set of prefixes is in \( BPP // linear \) but not in \( BPP //\eta . linear \) for some \(\eta \) sufficiently small. Let \(g \in \mathcal {G}\) be a function such that, for every \(f \in \mathcal {F}\), \(|f| \prec |g|\). We prove that there is a set in \( BPP //g\) that does not belong to \( BPP //f\), for any \(f \in \mathcal {F}\).

A tally set T is defined in the following way: for each \(n \ge 1\),

$$ \beta _n = \left\{ \begin{array}{ll} \gamma \!\!\downharpoonleft _{|g|(n)}0^{n - |g|(n)} &{} \text{ if } |g|(n) \le n \\ 0^n &{}\text{ otherwise } \end{array}\right. . $$

T is the tally set with characteristic string \(\beta _1 \beta _2 \beta _3 \ldots \). With advice \(\gamma \!\!\downharpoonleft _{|g|(n)}\), it is easy to decide T: we can reconstruct the sequence \(\beta _1 \beta _2 \ldots \beta _n\), of \((n^2 + n)/2\) bits, and then simply check whether its n-th bit is 1 or 0. We conclude that \(T \in P/g \subseteq BPP//g\).
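A sketch (ours) of the construction, with an arbitrary stand-in for \(\gamma \) and an assumed sublinear size function |g|:

```python
from math import floor, log2

def beta(n: int, gamma: str, g_size) -> str:
    """Block beta_n of the characteristic string of the tally set T."""
    gn = g_size(n)
    return gamma[:gn] + "0" * (n - gn) if gn <= n else "0" * n

def characteristic_prefix(N: int, gamma: str, g_size) -> str:
    """beta_1 beta_2 ... beta_N, the first (N^2 + N)/2 bits of chi_T."""
    return "".join(beta(n, gamma, g_size) for n in range(1, N + 1))

gamma = "1011001110001111"            # stand-in for the hard sequence gamma
g = lambda n: floor(log2(n)) + 1      # an assumed sublinear size |g|
print(characteristic_prefix(6, gamma, g))  # '1' + '10' + '100' + '1010' + ...
```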

We prove that the same set does not belong to \( BPP //f\). Suppose that some probabilistic Turing machine \(\mathcal {M}\) with advice f, running in polynomial time, decides T with probability of error bounded by

$$\begin{aligned} 2^{-\log (4|g|(n))} = \frac{1}{4|g|(n)} \end{aligned}$$

Since \(|f| \in o(|g|)\), for all but finitely many n we have \(|f|(n) < \eta |g|(n)\) for arbitrarily small \(\eta \), meaning that we can compute, for all but finitely many n, |g|(n) bits of \(\gamma \) using advice of length \(\eta . |g|(n)\), contradicting the fact that the set of prefixes of \(\gamma \) is not in \( BPP //\eta . linear \). The reconstruction of the binary sequence \(\gamma \!\!\downharpoonleft _{|g|(n)}\) is provided by the following procedure:

(Procedure: for \(j = 1, \ldots , |g|(n)\), query \(\mathcal {M}\) on the tally word \(0^{(n^2 - n)/2 + j}\) and take the answer as the j-th bit of \(\gamma \!\!\downharpoonleft _{|g|(n)}\).)
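A sketch (ours) of this reconstruction, under the simplifying assumption of an error-free decider for T (in the proof itself access is probabilistic, through \(\mathcal {M}\)):

```python
from math import floor, log2

def reconstruct_gamma_prefix(n: int, g_size, query) -> str:
    """Recover gamma restricted to |g|(n) bits: the j-th such bit is bit
    (n^2 - n)/2 + j (1-indexed) of the characteristic string of T, i.e. the
    answer of the decider of T on the tally word of that length."""
    offset = (n * n - n) // 2
    return "".join("1" if query(offset + j) else "0"
                   for j in range(1, g_size(n) + 1))

# Error-free illustration against the construction of the previous sketch.
gamma = "1011001110001111"
g = lambda n: floor(log2(n)) + 1
beta = lambda n: gamma[: g(n)] + "0" * (n - g(n)) if g(n) <= n else "0" * n
table = "".join(beta(n) for n in range(1, 13))
print(reconstruct_gamma_prefix(6, g, lambda i: table[i - 1] == "1"))  # '101'
```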

The queries are answered by simulating the machine \(\mathcal {M}\), which is a probabilistic Turing machine with error probability bounded by \(2^{-\log (4|g|(n))} = \frac{1}{4|g|(n)}\). Thus, the probability of error of the reconstruction procedure \(\mathcal {M}^{\prime }\) is bounded by

$$ \frac{1}{4|g|(\frac{n^2 - n}{2})} + \cdots + \frac{1}{4|g|(\frac{n^2 - n}{2} + |g|(n))} . $$

As |g| is increasing, the error probability is bounded by

$$ \frac{1}{4|g|(\frac{n^2 - n}{2})} \times |g|(n), $$

which, for \(n \ge 3\), is bounded by

$$ \frac{1}{4|g|(n)} \times |g|(n) = \frac{1}{4} . $$

\(\Box \)

As we are considering prefix advice classes, it is useful to derive the following corollary:

Theorem 11

If \(\mathcal {F}\) and \(\mathcal {G}\) are two classes of reasonable sublinear advice functions such that \(\mathcal {F}\prec \mathcal {G}\), then \( BPP //\mathcal {F}\star \subsetneq BPP //\mathcal {G}\star \).

Proof

The proof of Theorem 10 is also a proof that \( BPP //\mathcal {F}\subsetneq BPP //\mathcal {G}\star \), because the advice function used is \(\gamma \!\!\downharpoonleft _{|g|(n)}\), which is a prefix advice function. Since \( BPP //\mathcal {F}\star \subseteq BPP //\mathcal {F}\), the statement follows. \(\Box \)

We have already seen that, for all \(k \ge 0\), \(\log ^{(k+1)} \prec \log ^{(k)}\). In particular, this is true for \(k\ge 1\) and we have the following infinite descending chain

$$ \cdots \prec \log ^{(4)} \prec \log ^{(3)} \prec \log ^{(2)} \prec \log . $$

Therefore, by Theorem 11, we have also the descending chain of sets

$$ \cdots \subsetneq BPP //\log ^{(4)}\!\star \subsetneq BPP //\log ^{(3)}\!\star \subsetneq BPP //\log ^{(2)}\!\star \subsetneq BPP //\log \!\star , $$

that, according to Theorem 9, coincide with the classes of sets decided by RW fair probabilistic Turing machines that can make up to

$$ \cdots \subsetneq \mathrm {poly}(\log ^{(3)}(n))\subsetneq \mathrm {poly}(\log ^{(2)}(n))\subsetneq \mathrm {poly}(\log (n))\subsetneq \mathrm {poly}(n) $$

calls to the RW oracle, respectively.

6 Conclusion

Summary. We introduced RW fair probabilistic Turing machines, specified as fair probabilistic Turing machines having access to a random walk experiment on a line. We then proved that the class of sets decidable in polynomial time by RW fair probabilistic Turing machines that can make up to \(\mathrm {poly}(\log ^{(k)}(n))\) calls to the oracle is exactly \( BPP //\log ^{(k+1)}\!\star \), where \(\log ^{(k)}\) is the class of advice functions f such that \(|f(n)| \in O( \log ^{(k)}(n))\).

We proved that, if \(\mathcal {F}\) and \(\mathcal {G}\) are two classes of reasonable sublinear advice functions such that \(\mathcal {F}\prec \mathcal {G}\), then \( BPP //\mathcal {F}\subsetneq BPP //\mathcal {G}\). Although this result was already discussed for the deterministic case in [3, 7, 24], the probabilistic case seems not to have been considered.

Then, we presented a fine structure of \( BPP //\log \!\star \) based on counting oracle calls:

$$ \cdots \subsetneq BPP //\log ^{(4)}\!\star \subsetneq BPP //\log ^{(3)}\!\star \subsetneq BPP //\log ^{(2)}\!\star \subsetneq BPP //\log \!\star , $$

that coincides with the structure of classes of sets decided by RW fair probabilistic Turing machines that can make up to

$$\cdots \subsetneq \mathrm {poly}(\log ^{(3)}(n)) \subsetneq \mathrm {poly}(\log ^{(2)}(n))\subsetneq \mathrm {poly}(\log (n)) \subsetneq \mathrm {poly}(n) $$

calls to the RW oracle, respectively.

Open Problem. Together with the transfinite chain of advice classes presented in [7, 18], we also have a transfinite chain of non-uniform probabilistic classes:

$$ \cdots \subsetneq BPP //\log ^{(2\omega )}\!\star \subsetneq \cdots \subsetneq BPP //\log ^{(\omega )}\!\star \subsetneq \cdots \subsetneq BPP //\log ^{(2)}\!\star \subsetneq BPP //\log \!\star . $$

In fact, the chain of non-uniform classes can be continued, where \(\log ^{(\omega )} = \bigcap _{k \in \mathbb {N}} \log ^{(k)}\) is a non-empty class (as shown in [7, 18] for diverse transfinite classes). However, we do not know if there is a correspondence between these complexity classes and the classes decided by RW fair probabilistic Turing machines with bounded number of oracle calls, since we only proved such a correspondence for advice classes of the form \(\log ^{(k)}\), with \(k \in \mathbb {N}\). At present, we do not know how to encode a function \(f \in \log ^{(\omega )}\!\star \) into a real number.