Definition of the Subject

One of the cornerstones in the theory of automata is the early result of John von Neumann, who solved the logical problem of nontrivial self‐reproduction. He employed a mathematical device which is a multitude of interconnected identical finite-state machines operating in parallel to form a larger machine. He showed that it is logically possible for such a nontrivial computing device to replicate itself ad infinitum [97]. Such devices are commonly called cellular automata (abbreviated, CA), and can be considered as homogeneously structured models for massively parallel computing systems. The global behavior of cellular automata is achieved by local interactions only. While the underlying rules are quite simple, the global behavior may be rather complex. In general, it is unpredictable.

The data supplied to CAs can be arranged as strings of symbols. Instances of problems to solve can be encoded as strings with a finite number of different symbols. Furthermore, complex answers to problems can be encoded as binary sequences such that the answer is computed bit by bit. In order to compute one piece of the answer, the set of possible inputs is split into two sets associated with the binary outcome. From this point of view, the computational capabilities of CAs are studied in terms of string acceptance, that is, the determination to which of the two sets a given string belongs. These investigations are with respect to and with the methods of language theory. They originated in Stephen N. Cole [18,19] and Alvy R. Smith [76,80]. Over the years substantial progress has been achieved, but there are still some basic open problems with deep relations to other fields. So, exploring the capabilities of cellular automata may benefit the understanding of the nature of parallelism and nondeterminism.

Introduction

In general, the specification of a cellular automaton includes the type and specification of the cells, their interconnection scheme (which can imply a dimension of the system), the local rules which are formalized as local transition function, and the input and output modes. With an eye towards language acceptance, we consider one‐dimensional synchronous devices with nearest neighbor connections whose cells are deterministic finite-state machines. They are commonly called cellular automata in case of parallel input mode, and iterative arrays (abbreviated, IA) if the input mode is sequential. If each cell is connected to only one of its neighbors, say to the right one, then the flow of information through the array is from right to left. The corresponding device is a one-way cellular automaton (abbreviated, OCA). In any case the number of cells is determined by the length of the input string; there is one cell per symbol. If the input is in parallel, then all cells fetch their input symbol during a pre-initial step. To this end, the set of symbols has to be a subset of the set of states. Sometimes for practical reasons and for the design of systolic algorithms, a sequential input mode is more convenient than the parallel one. In iterative arrays the leftmost cell is distinguished to be the communication cell. It is equipped with a one-way read-only input tape.

In order to obtain the binary answer of a system we have to overcome a problem with the end of the computation. It follows from the definitions that the machines never halt. A way to cope with the situation is to define a predicate on configurations. The answer depends on whether a configuration satisfies the predicate or not. Here we apply the common predicate that requires a border cell or the communication cell to be in some state designated to be an accepting state. Further predicates are studied, e. g., in [38,82], while more general input modes are considered in [57]. Due to the bounded number of cells in a computation, the time complexity is exponentially bounded. After exceeding the bound, the computation runs into a loop and is rather useless. With respect to the wide language classes obeying an exponential or polynomial time bound, parallel devices cannot take advantage of their large number of processing elements. They are just a factor to be multiplied with the size of the input. So, there is a particular interest in fast computations, that is, in real-time and linear-time computations. Real time is determined by the shortest time necessary for nontrivial computations, whereas linear time is real time multiplied by an arbitrary but fixed constant greater than or equal to one. In addition, we consider general computations without time bounds which, actually, are exponentially time bounded. The following well-known example from [17] joins several notions. It uses signals in order to construct the mapping n ↦ 2n in time; i. e., the leftmost cell recognizes the time steps 2n, n ≥ 1. The time constructor is then extended to a real-time CA and IA.

Example 1

The unary language consisting of the strings {\( { a^{2^{n}} |\, n \ge 1} \)} is accepted by IAs as well as by CAs in real time.

At initial time the leftmost cell emits a signal which moves with speed \( \frac{1}{3} \) to the right; i. e., the signal alternates from moving one cell to the right and staying for two time steps in a cell (see Fig. 1). In addition, another signal is emitted which moves with maximal speed (speed 1). It bounces between the slow signal and the leftmost cell. It is easy to see that the signal passes through the leftmost cell exactly at the time steps 2n, n ≥ 1. Finally, an iterative array can accept its input when the last input symbol is read at one of these time steps. Similarly, in a CA computation the rightmost cell can emit a third signal which moves with maximal speed to the left. This signal arrives at the leftmost cell at the time step which corresponds to the length of the input. If it meets the bouncing signal at its arrival, the input is accepted. □

Figure 1
figure 1_54

Space-time diagram showing signals of a real-time two-way acceptor of the language {\( { a^{2^{n}} |\,n \ge 1 } \)}

More results about mappings that are constructible in the above sense can be found in [13,63,93]. In [27,94] the series of prime numbers is constructed in real-time devices.

The investigations of iterative arrays and cellular automata as cellular language acceptors originated in [18,19] and [76,80]. In [18,19], it is shown that the family of languages accepted by real-time IAs is closed under intersection, union, and complementation, but is not closed under concatenation and reversal; and in [76,80], where among other results the identity of the sequential complexity class DSPACE(n) (i. e., the class of languages accepted by deterministic Turing machines whose tape is bounded by the length of the input n) and the family of languages accepted by CAs without time bound is shown. A long-standing open problem is the question whether or not one-way information flow is a strict weakening of two-way information flow for unbounded time. In [16,32] it is proved that a PSPACE‐complete language is accepted by OCAs, from which one can draw conclusions about the hardness of sequential OCA simulations. Furthermore, in the same papers strong closure properties are derived for the family of OCA languages. In addition, it is a proper superset of the context-free languages which, in turn, are of great practical relevance.

The proofs are based on characterizations of the parallel language families by certain types of customized sequential machines. Such machines have been developed for all classes of acceptors which are here under consideration [32,36,37]. In particular, speed-up theorems are given that allow to speed up the time beyond real time linearly. Therefore, linear-time computations can be sped up close to real time. Nevertheless, for OCAs and IAs linear time is strictly more powerful than real time. The problem is still open for CAs. In fact, it is an open question whether real-time CAs are strictly weaker than unbounded time CAs. If both classes coincide, then a PSPACE‐complete language would be accepted in polynomial time! Apart from that it is known that linear-time CAs can be simulated by unbounded time OCAs [16,32].

The rest of the article is organized as follows. In the following section (Sect. “Cellular Language Acceptors”), some basic notions and formal definitions are given. The problem in connection with the end of computations is discussed in more detail, and honest time complexities are derived informally. Then, in Sect. “Tools and Techniques” selected tools and techniques are presented that can be applied to prove or to disprove that certain languages are accepted by certain devices. In particular, it is shown that two-way devices can simulate the data structure stack without loss of time. It is often much harder to disprove the acceptance of languages than to prove it, since the technique of a suitable construction is trivially not applicable. Some techniques based on counting and pumping arguments are shown and applied in order to obtain witness languages. In Sect. “Computational Capacities” computational capacity aspects are investigated. A basic hierarchy of language families defined by cellular automata and iterative arrays is established. The levels are compared with well-known linguistic families. Next, Sect. “Closure Properties” is devoted to exploring the closure properties of the language families in question. For example, it turns out that the languages accepted by unbounded time one-way and two-way cellular automata as well as iterative arrays share the strong closure properties of the class DSPACE(n) which characterizes the deterministic context‐sensitive languages. Decidability problems are considered in Sect. “Decidability Problems”. By reductions of Turing machine problems, it follows that almost all interesting properties are undecidable. In fact, they are not even semidecidable for the weakest devices in question. This emphasizes once more the power gained in parallelism even if the number of cells is bounded. In general, their behavior is unpredictable. Finally, in Sect. “Future Directions”, some future directions of cellular language acceptors are discussed.

Cellular Language Acceptors

We denote the set of nonnegative integers by ℕ. In connection with formal languages, strings are called words. Let A denote the set of all words over a finite alphabet A. The empty word is denoted by λ, and we set \( { A^+ = A^{\ast} - \{\lambda \} } \). For the reversal of a word w we write w R and for its length we write |w|. For the number of occurrences of a symbol a in w we use the notation |w| a . We use ⊆ for inclusions and ⊂ for strict inclusions. In order to avoid technical overloading in writing, two languages L and L′ are considered to be equal, if they differ at most by the empty word, i. e., \( L - \{ \lambda \} = L^{\prime} - \{\lambda \} \). Throughout the article, two automata or grammars are said to be equivalent if and only if they accept or generate the same language.

The cells of a cellular automaton are identified by positive integers. In order to handle the application of the local transition function to the border cells, we assume that the missing neighbors are in a permanent so-called border state . A formal definition is:

Definition 2

two-way cellular automaton (CA) is a system \( \langle S,\delta,\texttt{\#},A,F\rangle \), where

  1. 1.

    S is the finite, nonempty set of cell states,

  2. 2.

    \( \texttt{\#}\notin S \) is the permanent boundary state,

  3. 3.

    \( A\subseteq S \) is the nonempty set of input symbols,

  4. 4.

    \( F\subseteq S \) is the set of accepting states, and

  5. 5.

    \( \delta:(S\cup\{\texttt{\#}\})\times S\times (S\cup\{\texttt{\#}\}) \to S \) is the local transition function.

Figure 2
figure 2_54

A (two-way) cellular automaton

If the flow of information is restricted to one-way, the resulting device is a one-way cellular automaton (abbreviated, OCA). In such devices, the next state of each cell depends on the state of the cell itself and the state of its immediate neighbor to the right.

Figure 3
figure 3_54

A one-way cellular automaton

configuration of a cellular automaton 〈 S, δ, #, A, F 〉 at time t ≥ 0 is a description of its global state, which is formally a mapping c t {1, …, n} → S, for n ≥ 1. The configuration at time 0 is defined by the given input w = a 1 ⋯ a n  ∈ A +. We set c 0(i) = a i , for 1 ≤ i ≤ n. Configurations may be represented as words over the set of cell states in their natural ordering. For example, the initial configuration for w is represented by # a 1a 2 ⋯ a n #. Successor configurations are computed according to the global transition function Δ. Let c t , t ≥ 0, be a configuration with n ≥ 2, then its successor c t + 1 is defined as follows:

$$ \begin{aligned} c_{t + 1} &= \Delta(c_t)\\ &\iff \begin{cases} c_{t+1}(1) = \delta(\texttt{\#}, c_t(1),c_t(2)) \\ c_{t+1}(i) = \delta(c_t(i-1),c_t(i),c_t(i+1))\:, \\ \quad i\in\{2,\dots,n-1\}\\ c_{t+1}(n) = \delta(c_t(n-1),c_t(n),\texttt{\#}) \end{cases} \end{aligned} $$

for CAs, and

$$ \begin{aligned} c_{t + 1} &= \Delta(c_t) \\ &\iff \begin{cases} c_{t+1}(i) = \delta(c_t(i),c_t(i+1))\:,\\ \quad i\in\{1,\dots,n-1\}\\ c_{t+1}(n) = \delta(c_t(n),\texttt{\#}) \end{cases} \end{aligned} $$

for OCAs. For n = 1, the next state of the sole cell is δ(#, c t (1), #). Thus, Δ is induced by δ.

A computation can be represented as a space-time diagram , where each row is a configuration and the rows appear in chronological ordering.

In order to define iterative arrays formally we have to provide an initial (quiescent) state for the cells. We assume that once the whole input is consumed an end-of-input symbol is supplied permanently.

Definition 3

An iterative array (IA) is a system \( \langle S, \delta, \delta_0, s_0, \texttt{\#}, \triangleleft, A, F \rangle \), where

  1. 1.

    S is the finite, nonempty set of cell states,

  2. 2.

    \( s_0\in S \) is the quiescent state,

  3. 3.

    \( \texttt{\#}\notin S \) is the permanent boundary state,

  4. 4.

    \( \triangleleft\notin A \) is the end-of-input symbol,

  5. 5.

    A is the finite, nonempty set of input symbols,

  6. 6.

    \( F \subseteq S \) is the set of accepting states,

  7. 7.

    \( \delta: S\times S\times (S\cup\{\texttt{\#}\}) \to S \) is the local transition function for non‐communication cells satisfying \( \delta(s_0,s_0,s_0) = \delta(s_0,s_0,\texttt{\#})=s_0 \),

  8. 8.

    \( \delta_0:(A\cup \{\triangleleft\})\times S \times (S\cup\{\texttt{\#}\})\to S \) is the local transition function for the communication cell.

Figure 4
figure 4_54

An iterative array

A configuration of an iterative array 〈 S, δ, δ0, s 0, #, ◃, A, F 〉 at time t ≥ 0 is a pair (w t ,c t ), where w t  ∈ A is the remaining input sequence and c t  : {1, …, n} → S, for n ≥ 1, is a mapping that maps the single cells to their current states. The configuration (w 0,c 0) at time 0 is defined by the input word w 0 and the mapping c 0(i) = s 0, 1 ≤ i ≤ n. The global transition function Δ is induced by δ and δ0 as follows: Let (w t ,c t ), t ≥ 0, be a configuration and i ∈ {2, …, n −1}. Then

$$ \begin{aligned} (w_{t+1}, c_{t+1}) &= \Delta((w_t,c_t)) \\ \iff \! &\begin{cases} c_{t+1}(1) = \delta_0(a,c_t(1),c_t(2))\\ c_{t+1}(i) = \delta(c_t(i-1),c_t(i),c_t(i+1))\\ c_{t+1}(n) = \delta(c_t(n-1),c_t(n),\texttt{\#}) \end{cases} \end{aligned} $$

where a = ◃, w t + 1 = λ if w t = λ, and a = a 1, w t  + 1 = a 2 ⋯ a n if w t  = a 1 ⋯ a n .

An input w is accepted by a CA, OCA, or an IA \( \mathcal{M} \) if at some time i during its course of computation the leftmost cell enters an accepting state. The language accepted by \( \mathcal{M} \) is denoted by L(\( \mathcal{M} \)). Let t : ℕ → ℕ, t(n) ≥ n (t(n) ≥ n + 1 for IAs) be a mapping. If all w ∈ L(\( \mathcal{M} \)) are accepted with at most t(|w|) time steps, then L(\( \mathcal{M} \)) is said to be of time complexity t.

Observe that time complexities do not have to meet any further conditions. This general treatment is made possible by the way of acceptance. An input w is accepted if the leftmost cell enters an accepting state at some time i ≤ t(|w|). But what if afterwards, a final configuration has been reached? Subsequent states of the leftmost cell are not relevant.

Following the different approach to gather the result of a computation at time step t(|w|) by the outside world does not yield the desired outcome in general. In this case, the intrinsic computation may be hidden in the determination of the time step t(|w|). That is, computational power may be added from the outside world. For example, let L ∈ {a}+ be an arbitrary language. Then L is accepted by some OCA with time complexity

$$ t(n) = \begin{cases} n & {\text{if}}\quad a^n \notin L \\ n + 1 & {\text{if}}\quad a^n \in L \end{cases}\:, $$

where the local transition function is easily designed to realize the behavior depicted in the space-time diagram of Fig. 5.

Figure 5
figure 5_54

An OCA accepting any unary language. Here + is an accepting and − a non‐accepting state

So, it is reasonable to consider only such time complexities t that allow the leftmost cell to recognize the time step t(n). For example, the identity t(n) = n is an honest time complexity for OCAs and CAs. A signal which is initially emitted by the rightmost cell and moves with maximal speed arrives at the leftmost cell exactly at time step n. By slowing down the signal to speed \( \frac{x}{y} \), i. e., the signal alternating moves x cells to the left and stays for y −x time steps in a cell, it is seen that the time complexities \( \frac{x}{y} \cdot n \), for any positive integers x,y, are also honest. Another example are exponential time complexities t(n) = k n, for any integer k ≥ 2. Without going too deep into technical details, a corresponding device can be set up as a kary counter. The rightmost cell simulates the least significant digit and adds one to the counter at every time step. The neighboring cell to the left observes when a carry-over appears, increases its own digit and so on. Then, the leftmost cell produces a first carry-over exactly at time step k n.

Figure 6
figure 6_54

Principle of a pushdown store simulation. Subfigures are in row-major order

The family of languages that are accepted by IAs (and CAs, OCAs) with time complexity t is denoted by \( \mathcal{L} \) t (IA) (and \( \mathcal{L} \) t (CA), \( \mathcal{L} \) t (OCA), respectively). The index is omitted for arbitrary time. Actually, arbitrary time is exponential time due to the space bound. If t is the function n + 1 (the function n), acceptance is said to be in real time and we write \( \mathcal{L}_\mathrm{rt} \)(IA) (\( \mathcal{L}_\mathrm{rt} \)(CA), \( \mathcal{L}_\mathrm{rt} \)(OCA)). Since for nontrivial computations an IA has to read at least one end-of-input symbol, real time has to be defined as (n + 1)-time. The linear-time languages \( \mathcal{L}_\mathrm{lt} \)(IA) are defined according to \( \mathcal{L}_\mathrm{lt} {\text{(IA)}} = \bigcup_{k \in \mathbb{Q}, k \geq 1} \mathcal{L}_{k \cdot n} \)(IA), and similarly for CAs and OCAs.

Tools and Techniques

An elementary technique in automata theory is the usage of multiple tracks. Basically, this means to consider the state set as Cartesian product of some smaller sets. Each component of a state is called a register, and the same register of all cells together forms a track.

The first goal of this section is to show how to simulate pushdown stores , i. e., stores obeying the principle last in first out, by IAs and CAs in real time. Assume without loss of generality that at most one symbol is pushed onto or popped from the stack at each time step. We distinguish one cell that simulates the top of the pushdown store. It suffices to use three additional tracks for the simulation. Let the three pushdown registers of each cell be numbered one, two and three from top to bottom. Each cell prefers to have only the first two registers filled. The third register is used as a buffer. In order to reach that charge it obeys the following rules (cf. Fig. 6).

a):

If all three registers of its left (upper) neighbor are filled, it takes over the symbol from the third register of the neighbor and stores it in its first register. The old contents of the first and second registers are shifted to the second and third register.

b):

If there is only the first register of its left (upper) neighbor filled, the cell erases its first register and shifts the contents of the second and third registers to the first and second register. Observe that the erased symbol is taken over by the left neighbor.

c):

Possibly, more than one of these actions are superimposed.

From the simulation, it follows immediately that real-time IAs as well as real-time CAs accept all languages accepted by sequential pushdown automata as long as they work in real time.

Theorem 4

Given some real-time deterministic pushdown automaton, an equivalent real-time IA and CA can effectively be constructed, i. e., every real-time deterministic context-free language belongs to the families \( \mathcal{L}_{\text{rt}}(\text{IA}) \) and \( \mathcal{L}_{{\text{rt}}}({\text{CA}} \)).

Now we turn to a technique for disproving that languages are accepted. In general, the method is based on equivalence classes which are induced by formal languages. If some language induces a number of equivalence classes which exceeds the number of classes distinguishable by a certain device, then the language is not accepted by that device. First we give the definition of an equivalence relation which applies to real-time IAs.

Definition 5

Let \( L \subseteq A^* \) be a language and \( l \geq 1 \) be a constant. Two words \( w \in A^* \) and \( w^{\prime} \in A^* \) are l‑equivalent with respect to L if and only if

$$ wu\in L \iff w^{\prime}u \in L $$

for all \( u \in A^*, |u| \leq l \). The number of l‑equivalence classes with respect to L is denoted by \( E(L,l) \).

In [19] the following upper bound for the number of equivalence classes distinguishable by real-time IA is derived.

Lemma 6

If \( L\in\mathcal{L}_{{\text{rt}}}({\text{IA}}) \), then there exists a constant \( p \geq 1 \) such that \( E(l,L)\leq p^{l} \).

Proof

Let \( \mathcal{M} \) be a real-time IA with state set S. In order to determine an upper bound for the number of l‑equivalence classes with respect to L(\( \mathcal{M} \)), we consider the possible configurations of \( \mathcal{M} \) after reading all but l input symbols. The remaining computation depends on the last l input symbols and the states of the cells 1, …, l + 2. For the l + 2 states there are at most |S|l + 2 different possibilities. Setting p = |S|3, we derive |S|l + 2 ≤ |S|3l = p l, and obtain at most p l different possibilities. Since the number of equivalence classes is not affected by the last l input symbols, there are at most p l equivalence classes. □

Example 7

The language

$$ \begin{aligned} L = \{\texttt{\&} x_k \texttt{\&} \cdots\texttt{\&} x_1 \texttt{?} y_1\texttt{\&} \cdots\texttt{\&} y_k\texttt{\&} \mid k\geq 1, x_i^R = y_i z_i \\ \text{and}\quad x_i, y_i, z_i \in \{a,b\}^*\} \end{aligned} $$

does not belong to \( \mathcal{L}_\mathrm{rt} \)(IA).

For a pair of different prefixes w = & x k & ⋯&x 1 ? and w′ = &x k & ⋯ & x1? with |x i | = |x i | = k, 1 ≤ i ≤ k, there exists at least one 1 ≤ j ≤ k such that x j  ≠ x j . This implies w& j −1x R j & k −j + 1 ∈ L and w′ & j −1x R j & k −j + 1 ∉ L. Since there are \( { 2^{k^{2}} } \) different prefixes of the given form, language L induces at least \( { 2^{k^2} } \) classes.

On the other hand, if L would be accepted by some real-time IA, then by Lemma 6 there is a constant p ≥ 1 such that E(L,2k) ≤ p 2k. Since L is infinite, we may choose k large enough such that \( { 2^{k^2} > p^{2k}} \), which is a contradiction.□

Now we change to real-time OCAs. The next result is a tool which allows us to show that languages do not belong to the family \( \mathcal{L}_\mathrm{rt} \)(OCA). It is based on pumping arguments for cyclic strings [68].

Lemma 8

Let L be a real-time OCA language. Then there exists a constant \( p \geq 1 \) such that any pair of a word w and an integer k that meets the condition \( w^k \in L \) and \( k \mathchar"313E p^{|w|} \) implies that there is some \( 1\leq q \leq p^{|w|} \) such that \( w^{k+jq} \in L \), for all \( j \geq 0 \).

Proof

For a given real-time OCA \( \mathcal{M} \) = 〈 S, δ, #, A, F 〉, we set p = |S|2. Let w k ∈ L(\( \mathcal{M} \)), where k > p |w|. Then we consider an accepting computation of \( \mathcal{M} \) on input w k. The initial configuration is represented by # w k #. Clearly, a cyclic left part of some configuration leads again to a cyclic left part, though the new left part gets one cell shorter at any time step. Therefore, after |w| time steps the left part of the configuration which still may influence the overall computation result is represented by # w k−11 s 1, where |w 1| = |w| and s 1 ∈ S. After another |w| time steps, we obtain # w k−22 s 2, where |w 2| = |w| and s 2 ∈ S. In general, the relevant part of a configuration at time i · |w|, 1 ≤ i ≤ k, is represented by # w k −i i s i , where |w i| = |w| and s i  ∈ S. In addition, state s k is an accepting one.

Since the number of different words w i is bounded by |S||w|, for \( { w_is_i } \) there are at most

$$ |S|^{|w|+1}\leq |S|^{2\left\lceil\frac{|w|+1}{2}\right\rceil} \leq p^{|w|} $$

different possibilities. Now \( k > p^{|w|} \) implies that \( w_is_i = w_ls_l \) for some \( 1\leq i < l \leq k \). Therefore, there is a loop and, for \( q=l-i \), the word \( { w^{k+jq} } \) is accepted, for any \( { j \ge 0 } \).□

Example 9

The language \( L = \{a^{2^n} \mid n \geq 1 \} \) is not accepted by any real-time OCA.

Contrarily assume there is a real-time OCA accepting L. Then we set \( w = a, k=2^p \) and observe that the conditions of Lemma 8 are met. Therefore, \( a^{2^p+q} \) as well as \( a^{2^p+2q} \) belong to L. If \( 2^p+q \) is not a power of two, we obtain a contradiction. So, let \( 2^p+q = 2^{p+r} \), for some \( r\geq 1 \). We derive \( 2^{p+r} < 2^{p+r}+q = 2^{p+r}+2^{p+r}-2^p=2^{p+r+1}-2^p < 2^{p+r+1} \), and conclude that \( 2^p+2q \) is strictly in between two consecutive powers of two. Hence, \( a^{2^p+2q} \) does not belong to L. □

Next we come back to equivalence classes. By a similar idea as for iterative arrays an equivalence relation can be defined such that the number of equivalence classes distinguishable by real-time OCAs is bounded. But due to the nature of OCA computations, both the prefixes as well as the suffixes of inputs have to be regarded [84,85]. We continue with the equivalence relation:

Definition 10

Let \( L\subseteq A^* \) be a language and \( X\subseteq A^* \) and \( Y\subseteq A^* \) be two sets of words. Two words \( w\in A^* \) and \( w^{\prime}\in A^* \) are \( (L,X,Y) \)-equivalent if and only if

$$ xwy \in L \iff xw^{\prime}y \in L $$

for all \( x \in X \) and \( y \in Y \).

The next step is to derive an upper bound for the number of equivalence classes which can be distinguished by some real-time OCA.

Theorem 11

Let \( L \subseteq A^* \) be a real-time OCA language and \( X = A^{m_1} \) and \( Y=A^{m_2} \) be two sets of words for positive integers m 1 and m 2. Then there exists a constant \( p\geq 1 \) such that the number N of \( (L,X,Y) \)‐equivalence classes is bounded by

$$ N \leq p^{|X|} p^{(m_{2}+1)|Y|} \:. $$

Proof

For any word w and any x ∈ X and y ∈ Y the input xwy implies exactly one real-time OCA configuration at time |w| (see Fig. 7). At this time only the leftmost |xy| + 1 cells can still influence the overall computation result, i. e., the cells 1, …, |xy| + 1. The cells 1, …, |x| are not affected by the cells |xw| + 1, …, |xwy| up to time |w|. So, they are not affected by y up to that time. Furthermore, the cells |x| + 1, …, |xy| + 1 are not affected by the cells 1, …, |x|, that is, not by x. On the other hand, different w may cause different configurations for x and y, where x and y are independent of each other.

Figure 7
figure 7_54

Dependencies in the proof of Theorem 11

It follows that w and w′ are equivalent, if the cells 1, …, |xy| + 1 at time step |w| respectively |w′| are in the same states for all x ∈ X and y ∈ Y. For any y ∈ Y, there are |S|\( { ^{m_2} } \) + 1 different configurations for the cells |x| + 1, …, |xy| + 1. So, there are altogether at most \( |S|^{(m_{2}+1)|Y|} \) different possibilities to distinguish the words w.

The possibilities for the cells 1, …, |x| are as follows. For a word x = x \( { _{m_1} } \) … x 1 the state s i of cells 1 ≤ i ≤ m 1 at time |w| depends only on x i , …, x 1 and w. So, for s i there are at most |S|\( { ^{{|A|}^i} } \) different possibilities to distinguish the words w. Together we obtain

$$ \prod_{i=1}^{m_1} |S|^{|A|^i} < |S|^{|A|^{m_1+1}} $$

possibilities. For p = |S||A| this implies \( |S|^{|A|^{m_1+1}} = p^{|A|^{m_1}} = p^{|X|} \). Thus, the number of equivalence classes is bounded by \( p^{|X|} p^{(m_{2}+1)|Y|} \). □

As an example we show that the language \( L_d \subset \{0,1,(,),|\}^{+} \) whose words are of the form

$$ x ( x_{1} | y_{1} ) \cdots ( x_{n} | y_{n} ) y \:, $$

where x,x i ,y,y i  ∈ {0,1}, for 1 ≤ i ≤ n, and (x|y) = (x i |y i ) for at least one i ∈ {1, …, n}, is not a real-time OCA language. It can be thought of as a dictionary. The task for the OCA is to check whether the pair (x|y) occurs in the dictionary or not.

Example 12

Let X = Y = {0,1}. Two words \( w = (x_{1} | y_{1}) \cdots (x_{n}| y_{n}) \) and \( w^{\prime}=( x^{\prime}_{1}| y^{\prime}_{1})\cdots( x^{\prime}_{m} | y^{\prime}_{m}) \) are (L d , X, Y)‐equivalent if and only if \( \{( x_{1}| y_{1}),\dots, ( x_{n} | y_{n})\} = \{( x^{\prime}_{1}| y^{\prime}_{1}),\dots, ( x^{\prime}_{m} | y^{\prime}_{m})\} \).

First assume that the two sets are equal. Let x ∈ X and y ∈ Y, then xwy ∈ L d implies (x|y) = (x i |y i ), for some i. Since the two sets are equal, we have (x|y) = (x j |y j ), for some j. Therefore, xwy ∈ L d implies xwy ∈ L d and vice versa, i. e., w and w′ are \( (L_d,X,Y) \)-equivalent.

Now assume the two sets are different. Without loss of generality, we can assume that there exist x ∈ X and y ∈ Y with (x|y) = (x i |y i ), for some i, but \( (x|y) \neq (x^{\prime}_{j}|y^{\prime}_{j}) \), for all j = 1, …, m. Then xwy ∈ L d but xwy ∉ L d and, thus, w and w′ are not (L d ,X,Y)-equivalent.

In order to derive a lower bound for the number of (L d , X, Y) equivalence classes induced by L d , let \( m_1,m_2\geq 1 \), \( X=\{0,1\}^{m_1} \) and \( Y=\{0,1\}^{m_2} \). Then the number N of (L d ,X,Y)‐equivalence classes is at least \( 2^{2^{m_1+m_2}} \).

For fixed m 1 ≥ 1 and m 2 ≥ 1, we consider all words of the form \( (x_{1}| y_{1}) \cdots( x_{k} | y_{k}) \) with \( x_{i} \in \{0,1\}^{m_{1}} \) and \( y_{i} \in \{0,1\}^{m_{2}} \), for all \( i \in \{1,\dots,k\} \), and \( (x_{i}| y_{i}) \neq (x_{j} | y_{j}) \), for i ≠ j. These words are said to be of type (m 1,m 2). Following the argumentation above, two words are equivalent if and only if the sets of subwords are equal. There are \( 2^{2^{m_{1} + m_{2}}} \) words of type (m 1,m 2) which belong to different equivalence classes with respect to L d , \( X=\{0,1\}^{m_{1}} \) and \( Y=\{0,1\}^{m_{2}} \).

Now assume that L d is accepted by some real-time OCA. For all m 1,m 2 ≥ 1 there is a constant p such that the number of equivalence classes is at most \( p^{2^{m_1}} p^{(m_{2} + 1) 2^{m_2}} \). For large enough m 1 and m 2 we obtain a contradiction to the lower bound \( 2^{2^{m_1+m_2}} \). □

Helpful tools of a different nature are speed-up theorems . Strong results are obtained in [34,36], where the parallel language families are characterized by certain types of customized sequential machines. Such machines have been developed for all classes of acceptors which are here under consideration. In particular, speed-up theorems are given that allow to speed up the time beyond real time linearly. Therefore, linear-time computations can be sped up close to real time. The question whether or not real time is strictly weaker than linear time is discussed in detail later.

Theorem 13

Let \( \mathcal{M} \) be a CA, OCA, or IA obeying time complexity \( rt + r(n) \), where \( r \colon \mathbb{N} \to \mathbb{N} \) is a mapping and rt denotes real time. Then for all \( k\geq 1 \) an equivalent device \( \mathcal{M}^{\prime} \) of the same type obeying time complexity \( rt + \lfloor\frac{r(n)}{k}\rfloor \) can effectively be constructed.

The next two examples are frequently used applications of the linear speed-up theorem.

Example 14

Let k 0 ≥ 1 and \( \mathcal{M} \) be a device in question with time complexity rt + k 0. Then there is an equivalent real-time device \( \mathcal{M} \)′ of the same type. It suffices to set k = k 0 + 1 and to apply Theorem 13 in order to obtain \( rt + \lfloor \frac{k_0}{k} \rfloor = rt + \lfloor \frac{k_0}{k_0+1} \rfloor = rt \) for the time complexity of \( \mathcal{M} \)′. □

Example 15

Let k 0 ≥ 1 and \( \mathcal{M} \) be a device in question with time complexity rt + k 0 · rt. Then for all rational numbers ε > 0 there is an equivalent device \( \mathcal{M} \)′ of the same type with time complexity \( \lfloor(1 + \epsilon) \cdot rt \rfloor \). We set \( k=\big\lceil\frac{k_0}{\epsilon}\big\rceil \) and apply Theorem 13 in order to obtain \( rt + \Big\lfloor\frac{k_0\cdot rt}{\lceil k_0/\epsilon\rceil}\Big\rfloor \leq rt+\Big\lfloor\frac{k_0\cdot rt}{k_0/\epsilon}\Big\rfloor = rt+\lfloor \epsilon \cdot rt\rfloor =\lfloor(1+\epsilon)\cdot rt\rfloor \). □

Computational Capacities

In this section we explore the computational capacities of real-time, linear-time, and unbounded time devices. The goal is to establish a hierarchy of language families and to compare the levels with well-known linguistic families of the Chomsky hierarchy language family. The properness of some inclusions are long-standing open problems with deep relations to sequential complexity problems. In order to establish the hierarchy we start at the upper and lower end. Straightforward constructions of linearly space-bounded Turing machines from IAs, of IAs from CAs, and of CAs from linearly space-bounded Turing machines show the following lemma [80].

Lemma 16

The families \( \mathcal{L}({\text{CA}}) \) and \( \mathcal({\text{IA}}) \) are identical with the deterministic context‐sensitive languages, i. e., with the complexity class DSPACE(n).

At the lower end we consider the regular languages. Since already the communication cell is a deterministic finite-state machine, clearly, the regular languages are a subset of \( \mathcal{L} \) rt(IA). In Theorem 4 the stronger result that any real-time deterministic context-free language belongs to \( \mathcal{L} \) rt(IA) has been shown. So, the properness of the following inclusion is obvious.

Corollary 17

The regular languages are a proper subset of \( \mathcal{L}_{{\text{rt}}}({\text{IA}}) \).

The question arises whether the real-time condition of Theorem 4 can be relaxed in order to obtain a larger sub-family of the context-free languages which is accepted by real-time IAs. The answer is negative. The following lemma marks a sharp boundary between context-free languages acceptable and non‐acceptable by real-time IAs. Recall that linear context-free languages are accepted by nondeterministic one-turn pushdown automata. Since here we deal with deterministic devices, the alternative grammar characterization is more suitable for our purposes.

A grammar \( \mathcal{G} \) = 〈 N,T,S,P 〉 is said to be linear context free, if the productions in P are of the forms (X → a), (X → Ya), or (X → aY), where X,Y ∈ N are nonterminals and a ∈ T is a terminal symbol (e. g. [72]).

Lemma 18

There exists a deterministic, linear context-free language that does not belong to \( \mathcal{L}_{\text{rt}}({\text{IA}}) \).

Proof

Clearly, language

$$ \begin{aligned} L = \{\texttt{\&} x_k \texttt{\&} \cdots \texttt{\&} x_1 \texttt{?} y_1\texttt{\&} \cdots \texttt{\&} y_k \texttt{\&} \mid k \geq 1, x_i^R = y_iz_i \\ \text{and}\quad x_i, y_i, z_i \in \{a,b\}^*\} \end{aligned} $$

is deterministic and linear context free. By Example 7 it does not belong to \( \mathcal{L} \) rt(IA). □

We turn to real-time OCAs. Similar as above, the linear context-free languages serve as sub-family of the context-free languages which is contained in \( \mathcal{L} \) rt(OCA) [76].

Theorem 19

Given some linear context-free grammar, an equivalent real-time OCA can effectively be constructed, i. e., every linear context-free language belongs to \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \).

Proof

Let \( \mathcal{G} = \langle N,T,S,P \rangle \) be a linear context-free grammar, and \( w = a_1 a_2 \cdots a_n \) be a word in \( L(\mathcal{G}) \). If \( X \Rightarrow^* a_i\cdots a_j \), \( 1\leq i< j \leq n \), then there exists a \( Y\in N \) such that either \( (Y\Rightarrow^* a_i\cdots a_{j-1} \mathrel{\land} X\Rightarrow Ya_j) \) or \( (Y\Rightarrow^* a_{i+1}\cdots a_{j} \mathrel{\land} X \Rightarrow a_iY) \). Based on this fact we define sets of nonterminals for the given word w as follows:

$$ \begin{aligned} N(i,i) &= \{ X \in N \mid (X \to a_i) \in P\}\:, \quad 1 \leq i \leq n\\ N(i,j) &= N_1(i,j) \cup N_2(i,j)\:, \quad 1 \leq i < j \leq n\:, \\ \text{where}\enskip &N_1(i,j) = \{X \in N \mid (X \to Ya_j) \in P\\ &\quad \mathrel{\land} Y \in N(i,j - 1)\} \\ \text{and}\enskip &N_2(i,j) = \{X \in N \mid (X \to a_i Y) \in P\\ &\quad \mathrel{\land} Y \in N(i+1,j)\}\:. \end{aligned} $$

So, the word a 1 ⋯ a n is generated by \( \mathcal{G} \) if and only if S ∈ N(1,n).

A real-time OCA \( \mathcal{M} = \langle S^{\prime},\delta, \texttt{\#},T,F \rangle \) accepting L(\( \mathcal{G} \)) behaves as follows. It analyzes its input a 1 ⋯ a n in such a way that the cells successively compute certain sets N(i,j). In the first step, all cells 1 ≤ i ≤ n compute N(i,i) and (a i , a i ). In subsequent steps k ≥ 2, they compute N(i, i + k −1) and (a i , a i +k −1) if k ≤ n + 1 −i, and keep their state otherwise. The new values can be determined by the state of the cell itself and the state of its right neighbor, i. e., N(i, i + k −2), (a i , a i + k −1) and N(i + 1, i + k −1), (a i + 1, a i + k ). Thus, the leftmost cell computes N(1, n) at time step n. □

Corollary 20

The regular languages are a proper subset of \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \).

Again, the question arises whether the condition of Theorem 19 can be relaxed in order to obtain a larger sub-family of the context-free languages that is accepted by real-time OCAs. Again, the answer is negative.

Lemma 21

There exists a two-linear context-free language that does not belong to \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \).

Proof

Consider the linear language \( L = \{a^nb^n \mid n \geq 1\}\cup\{a^n bva b^n \mid n \geq 1, v \in \{a,b\}^*\} \). By counting arguments, in [85] it is shown that the two-linear concatenation \( LL \) is not accepted by any real-time OCA. □

Beginning with the regular languages at the bottom of the hierarchy, next we deal with the proper supersets given by real-time deterministic and linear context-free languages. Since both families are known to be incomparable with respect to inclusion, the hierarchy splits into two strands. Unfortunately we have to continue with these two strands for a moment.

Theorem 22

The families \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) and \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) are incomparable.

Proof

By Lemma 18 there exists a linear context-free language not accepted by any real-time IA, but by Theorem 19 it belongs to a real-time OCA.

Conversely, one can show that the two-linear language of Lemma 21 belongs to \( \mathcal{L} \) rt(IA). □

The next step is to go beyond \( \mathcal{L} \) rt(OCA) and \( \mathcal{L} \) rt(IA). Structurally this concerns CAs, but first we increase the time complexity. For the proof of infinite hierarchies in between real time and linear time, it is necessary to control the lengths of words with respect to some internal substructures. The following notion of constructibility expresses the idea that the length of a word relative to the length of a subword should be computable. To this end, a function \( f\colon\mathbb{N} \to \mathbb{N} \) is said to be OCA‐constructible, if there exist an λ-free homomorphism h and a language L ∈ \( \mathcal{L} \) rt(OCA) such that \( h(L) = \{a^{f(n)-n}b^{n} \mid n \geq 1\} \). Since constructible functions describe the length of the whole word dependent on the length of a subword, it is obvious that each constructible function must be greater than or equal to the identity. At a glance, this notion of constructibility might look somehow unusual or restrictive. However, λ-free homomorphisms are very powerful, so that the family of (in this sense) constructible functions is very rich. Examples and the next theorem are presented in [45].

Theorem 23

Let \( r_1,r_2 \colon \mathbb{N} \to \mathbb{N} \) be two increasing functions. If \( r_2 \cdot \log(r_2) \in o(r_1) \) and \( r_1^{-1} \) is OCA constructible, then \( \mathcal{L}_{n + r_2(n)}(\text{OCA}) \subset \mathcal{L}_{n+r_1(n)}(\text{OCA}) \).

Example 24

Let 0 ≤ p < q ≤ 1 be two rational numbers. Clearly, n p · log(n p) is of order o(n q). Moreover, the inverse of n q is OCA‐constructible. Thus, an application of Theorem 23 yields the strict inclusion

$$ \mathcal{L}_{n + n^p}({\text{OCA}}) \subset \mathcal{L}_{n + n^q}({\text{OCA}})\:. $$

Example 25

Let i < j be two positive integers, then \( \log^{[j]}(n)\cdot \log^{[j+1]}(n) \) is of order \( o(\log^{[i]}(n)) \). Since the inverse of \( \log^{[i]}(n) \) is OCA‐constructible, we obtain the strict inclusion

$$ \mathcal{L}_{n+\log^{[j]}(n)}({\text{OCA}}) \subset \mathcal{L}_{n+\log^{[i]}(n)}({\text{OCA}})\:. $$

Similar results are known for iterative arrays. The infinite strict hierarchies in the range between real time and linear time are established in [9]. The constructibility is defined differently. A strictly increasing function \( f \colon \mathbb{N} \to \mathbb{N} \) is IA‐constructible, if there exists an IA with infinitely many cells to the right, such that the leftmost cell enters some state from a distinguished subset of states exactly at all time steps f(i), 1 ≤ i. Example 1 shows that the function 2n is IA‐constructible.

Theorem 26

Let \( r_1,r_2 \colon \mathbb{N} \to \mathbb{N} \) be two increasing functions. If \( r_2 \in o(r_1) \) and \( r_1^{-1} \) is IA‐constructible, then \( \mathcal{L}_{n + r_2(n)}({\text{IA}}) \subset \mathcal{L}_{n + r_1(n)} ({\text{IA}}) \).

The following example is based on natural functions.

Example 27

Since the family of IA‐constructible functions is closed under composition and contains 2n and n k, k ≥ 1, the functions \( \log^{[i]}(n) \), i ≥ 1, and \( \sqrt[k]{n} \) are inverses of constructible functions. Actually, the inverses of 2n and n k are \( \lceil {\log(n)} \rceil \) and \( \lceil {n^{\frac{1}{k}}} \rceil \) but for convenience we simplify the notation. Therefore, an application to the hierarchy theorem yields

$$ \begin{aligned} \mathcal{L}_{{\text{rt}}}({\text{IA}}) \subset \cdots \subset \mathcal{L}_{n + \log^{[i+1]}(n)}({\text{IA}}) \subset \mathcal{L}_{n + \log^{[i]}(n)}({\text{IA}})\\ \subset \cdots \subset \mathcal{L}_{{\text{lt}}}({\text{IA}}) \end{aligned} $$

and

$$ \begin{aligned} \mathcal{L}_\mathrm{rt}({\text{IA}}) \subset \cdots \subset \mathcal{L}_{n+n^{\frac{1}{i+1}}}({\text{IA}}) \subset \mathcal{L}_{n+n^{\frac{1}{i}}}({\text{IA}}) \\ \subset \cdots \subset \mathcal{L}_\mathrm{lt}({\text{IA}})\:, \end{aligned} $$

or in combination, e. g.,

$$ \begin{aligned} &\mathcal{L}_{{\text{rt}}}({\text{IA}}) \subset \cdots \subset \mathcal{L}_{n+(\log^{[j+1]}(n))^{\frac{1}{i+1}}}({\text{IA}}) \\ &\quad\subset \mathcal{L}_{n+(\log^{[j+1]}(n))^{\frac{1}{i}}}({\text{IA}}) \subset \cdots \subset \mathcal{L}_{n+(\log^{[j]}(n))^{\frac{1}{i+1}}} ({\text{IA}}) \\ &\quad\subset \mathcal{L}_{n+(\log^{[j]}(n))^{\frac{1}{i}}}({\text{IA}}) \subset \cdots \subset \mathcal{L}_{{\text{lt}}}({\text{IA}})\:. \end{aligned} $$

Example 9 reveals a unary language not accepted by any real-time OCA. Note that the witness language \( \{a^{2^n} \mid n\geq 1\} \) is not context free. Next, we generalize this observation and show that even massively parallel real-time OCAs cannot accept more unary languages than a single deterministic finite-state machine [73].

Theorem 28

Each unary real-time OCA language is regular.

Proof

Let A = {a} be an alphabet, L ⊆ A + a language, and \( \mathcal{M} = \langle S,\delta, \texttt{\#} ,A,F \rangle \) be an OCA accepting L in real time. We construct an equivalent deterministic finite-state machine \( \mathcal{E} \) with state set S × S, initial state s 0, set of accepting states F′, and transition function δ' as follows.

$$ \begin{aligned} &s_0 = (\texttt{\#},a)\:,\qquad F^{\prime} = \{(s_1,s_2)\in S\times S\mid s_1\in F\}\:, \\ &\quad\delta^{\prime}((s_1,s_2),a) = (\delta(s_2,s_1), \delta(s_2,s_2))\:,\\ &\quad\text{for all } (s_1,s_2)\in S \times S \end{aligned} $$

In order to give evidence that the construction is correct, we depict two correspondent computations in Fig. 8. □

Figure 8
figure 8_54

A finite-state machine simulating a real-time OCA on unary input

Next we can join the two strands of the hierarchy again. The superfamily is \( \mathcal{L} \) rt(CA).

Theorem 29

The family \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) is properly included in \( \mathcal{L}_{\text{rt}}(CA) \).

Proof

The inclusion follows for structural reasons. For the properness we argue as follows. Since the language \( L=\{{a}^{{2^n}}\mid n\geq 1\} \) is not regular, it does not belong to \( \mathcal{L} \) rt(OCA). On the other hand, Example 1 shows that it belongs to \( \mathcal{L} \) rt(CA). □

Theorem 30

The family \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) is properly included in \( \mathcal{L}_{\text{rt}}({\text{CA}}) \).

Proof

First we give evidence of the inclusion \( \mathcal{L} \) rt(IA) ⊆ \( \mathcal{L} \) rt(CA). A real-time CA can be set up in such a way that it shifts its input successively to the left on an additional track. So, the leftmost cell receives the input symbol by symbol. Therefore, the CA can simulate the iterative array, where the leftmost cell simulates the communication cell.

The properness of the inclusion follows by the inclusion \( \mathcal{L} \) rt(OCA) ⊂ \( \mathcal{L} \) rt(CA) and the incomparability of \( \mathcal{L} \) rt(OCA) and \( \mathcal{L}_\mathrm{rt}(IA) \). □

Once we know that, in general, a real-time CA language cannot be accepted by any real-time OCA, the question arises of how much time is necessary for that, if possible at all. The next result gives an upper bound for the time [17,95]. Admittedly, to this end the input has to be reversed. Alternatively, one could reverse the neighborhood of the cells in an OCA. Then the rightmost cell indicates the result of the computation. In this case the input could remain as it is. In any case, the condition cannot be relaxed since it is an open problem whether the corresponding language families are closed under reversal.

Theorem 31

A language is accepted by a linear-time OCA if and only if its reversal is accepted by a CA in real time.

Proof

Let \( \mathcal{M} \) be a real-time CA. The cells of a linear-time OCA \( \mathcal{M} \)′ accepting L(\( \mathcal{M} \))R collect the information necessary to simulate one transition of \( \mathcal{M} \), in an intermediate step. Therefore, the first step of \( \mathcal{M} \) is simulated in the second step of \( \mathcal{M} \)′. We obtain a behavior as depicted in Fig. 9.

Figure 9
figure 9_54

Intermediate steps in the construction of the proof of Theorem 31

Altogether, \( \mathcal{M} \)′ cannot simulate the last step of \( \mathcal{M} \). So, the construction has to be extended slightly. Each cell has an extra register that is used to simulate transitions of \( \mathcal{M} \) under the assumption that the cell is the leftmost one (see Fig. 10). The transitions of the real leftmost cell now correspond to the missing transitions of the previous simulation. □

Figure 10
figure 10_54

Example of a linear-time OCA simulation of a real-time CA computation on reversed input

Climbing up one level, the hierarchy continues with two-way linear-time devices.

Theorem 32

A language is accepted by a linear-time IA if and only if it is accepted by a linear-time CA.

Proof

The inclusion \( \mathcal{L} \) lt(IA) ⊆ \( \mathcal{L} \) lt(CA) follows by the proof of Theorem 30.

Conversely, a linear-time iterative array can simulate a linear-time CA as follows. In the first phase, it reads the input and stores it successively in its cells. Next, the iterative array starts a time optimal firing squad synchronization algorithm (see for example [62,98]). That is, an algorithm which is initiated by the communication cell, and which synchronizes the n cells within \( 2n-2 \) time steps. Finally, all cells start the simulation of the CA at the same time. Clearly, the iterative array obeys a linear time bound if the cellular automaton does. □

The next inclusion does not follow for structural reasons. It is proved in [16,32] in terms of simulations of equivalent sequential machines. The properness is an open problem.

Theorem 33

Each linear-time CA language belongs to the family \( \mathcal{L}({\text{OCA}}) \).

The family \( \mathcal{L} \)(OCA) is very powerful. It contains the context-free languages as well as a PSPACE‐complete language [16,32]. Altogether we obtained the hierarchy depicted in Fig. 11, where the only known proper inclusions are at the lower end. Nevertheless, even the real-time OCA languages contain important families, e. g., the Dyck languages  [73] and the bracketed context-free languages [26]. Furthermore, the non‐semilinear language \( \{(a^ib)^* \mid i \geq 0\} \) [73] and the inherently ambiguous language \( \{a^ib^jc^k\mid i=j \mbox{ or } j=k \mbox{ for } i,j,k\geq 1\} \) [80] belong to \( \mathcal{L} \) rt(OCA). On the other hand, the context-free languages have been shown to be incomparable with \( \mathcal{L} \) rt(OCA). Whether or not they are a subset of the family \( \mathcal{L} \) rt(CA) is an open question raised in [80].

Figure 11
figure 11_54

Basic hierarchy of language families. A solid arrow indicates a proper inclusion and a dashed arrow an inclusion. In addition, the linear languages (LIN) are properly included in the context-free languages (CFL). Deterministic and real-time deterministic context-free languages are denoted by DCFL and DCFLλ, regular languages by REG, and deterministic context‐sensitive languages by DCSL

We conclude this section with a real-time OCA based characterization of the recursively enumerable languages and its implication to incomparability [58].

Lemma 34

A language is recursively enumerable if and only if there exists a homomorphism h and a language \( L^{\prime}\in \mathcal{L}_\mathrm{rt}({\text{OCA}}) \) such that \( L = h(L^{\prime}) \). The same holds for \( \mathcal{L}_\mathrm{rt} ({\text{IA}}) \).

Proof

It is known that the recursively enumerable languages can be represented by h(L 1 ∩ L 2), where h is a homomorphism and L 1, L 2 are either linear context-free languages [4] or real-time deterministic context-free languages [29]. In fact, in the latter paper deterministic pushdown automata are constructed that use λ-moves. But the constructions can easily be modified to obtain real-time pushdown automata. So, the assertions follow since \( \mathcal{L} \) rt(OCA) contains the linear context-free languages, \( \mathcal{L} \) rt(IA) contains the real-time deterministic context-free languages, and both families are closed under intersection (Theorem 39). □

The homomorphic characterization of the recursively enumerable languages reveals the following general incomparability result.

Theorem 35

Both families \( \mathcal{L}_\text{rt}({\text{OCA}}) \) and \( \mathcal{L}_\text{rt}({\text{IA}}) \) are incomparable to each language family \( \mathcal{L} \) that contains the context-free languages, is itself properly contained in the recursively enumerable languages, and is closed under homomorphism.

Proof

Since there is a context-free language not belonging to \( \mathcal{L} \) rt(OCA), we obtain \( \mathcal{L} \) ⊈ \( \mathcal{L} \) rt(OCA). Conversely, if \( \mathcal{L} \) rt(OCA) ⊆ \( \mathcal{L} \), then the recursively enumerable languages are a subset of \( \mathcal{L} \) by Lemma 34. This contradicts the proper containment in the recursively enumerable languages. The same argumentation applies to \( \mathcal{L} \) rt(IA). □

Example 36

The families \( \mathcal{L} \) rt(OCA) and \( \mathcal{L} \) rt(IA) are incomparable to the languages of indexed grammars, certain grammars with regulated rewriting, certain contextual grammars, and certain Lindenmayer systems, e. g., ET0L systems. □

Closure Properties

Closure properties of families of formal languages indicate their robustness under certain operations. A family of languages is closed under some operation, if any application of the operation on languages from the family yields again a language from the family. It is effectively closed if the result of the operation can be constructed from the given language(s). The knowledge of closure properties often reveals insights in particular features, structures, and capabilities. Moreover, positive closure properties are a useful tool for modular constructions and decompositions in a natural way. Negative properties may serve, for example, as valuable basis for extensions. That is, for building the smallest family of languages which is closed under the operation and contains the family in question. In any case closure properties are filigree tools for dealing with language families.

Boolean Operations

The operations union, intersection, complementation and set difference are commonly called Boolean operations or elementary operations.

For deterministic devices, the closure under complementation is often shown by interchanging accepting and non‐accepting states. But, in general, this requires halting computations.

Theorem 37

For \( X\in \{{\text{CA}}, {\text{OCA}}, {\text{IA}}\} \), the families \( \mathcal{L}_{\text{rt}}(X) \) and \( \mathcal{L}_{\text{lt}}(X) \) are effectively closed under complementation.

Proof

We show the closure exemplarily for linear-time OCAs. The other proofs are similar. The reason that the complementary device cannot be constructed by simply interchanging accepting and non‐accepting states is that an input is accepted when the leftmost cell enters an accepting state at some arbitrary time step. So, in general, the leftmost cell will enter accepting as well as non‐accepting states during a computation. To cope with this problem we modify a given linear-time OCA \( \mathcal{M} \) in the following way. At initial time a signal is emitted by the rightmost cell. The signal moves on an extra track with speed \( \frac{1}{2} \) to the left. It will arrive at the leftmost cell at twice real time, i. e., linear time. This is the time step at which we wish to make the final decision whether to accept or to reject the input. To this end, the leftmost cell has to remember if it has entered an accepting state at some time before. So, we use a copy \( S^{\prime} \) of the state set S of \( \mathcal{M} \), and modify the local transition function to drive the leftmost cell into a state of S′ when it enters an accepting state. Subsequently, the normal behavior of the leftmost cell is simulated, except that states of S′ are used instead of states of S. Now the modified automaton \( \mathcal{M} \)′ accepts input if and only if the leftmost cell is in some state of S′ when the signal arrives. In order to accept the complement of L(\( \mathcal{M} \)) = L(\( \mathcal{M} \)′), it suffices to let the automaton accept input if and only if the leftmost cell is in some state of S when the signal arrives. □

Devices without any time bounds are, in fact, exponentially time bounded due to the space limitation.

Lemma 38

For X ∈ {CA, OCA, IA}, the families \( \mathcal{L}(X) \) are effectively closed under complementation.

Proof

The construction of Theorem 37 is modified as follows. Some device in question with n cells and state set S may run through at most |S|n different configurations until its behavior becomes cyclic. So, it suffices to check whether the input is accepted during the first |S|n time steps. As discussed briefly in Sect. “Cellular Language Acceptors”, exponential time complexities are honest. That is, the leftmost cell can recognize the time step |S|n. To this end, an |S|ary counter is set up on an extra track. The rightmost cell simulates the least significant digit and adds one to the counter at every time step. The neighboring cell to the left observes when a carry-over appears, increases its own digit and so on. The time step at which the first carry-over appears in the leftmost cell is the desired one. □

Now we turn to intersection, union, and set difference.

Theorem 39

For \( X \in \{{\text{CA}}, {\text{OCA}}, {\text{IA}}\} \), the families \( \mathcal{L}_{\text{rt}}(X) \), \( \mathcal{L}_{\text{lt}}(X) \), and \( \mathcal(X) \) are effectively closed under intersection, union, and set difference.

Proof

In order to prove the closures, the well-known two-track technique is applicable. That is, on two different tracks acceptors for the languages in question are simulated independently of each other. By the same construction as in the previous proof, the leftmost cell can remember whether the single tracks accept or not. The accepting states are composed by the accepting and non‐accepting components for the single tracks in the usual way. For example, to show intersection both components have to be accepting ones.□

Reversal

The closure under reversal is of crucial importance. It is an open problem for \( \mathcal{L} \) rt(CA) and, equivalently, for \( \mathcal{L} \) lt(OCA). Moreover, it is linked with the open closure property under concatenation for the same family and, hence, with the question whether linear-time CAs are more powerful than real-time CAs. So, it remains open whether the computational capacities of CAs differ if the rightmost or the leftmost cell indicates acceptance.

Theorem 40

The family \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) is effectively closed under reversal.

Proof

Let \( \mathcal{M} \) be some real-time OCA. In order to obtain a real-time OCA \( \mathcal{M} \)′ for the language L(\( \mathcal{M} \))R, the arguments of the local transition function are interchanged. That is, δ'(s 2, s 1) = s 3 if δ(s 1, s 2) = s 3. In addition, we have to pay special attention to the boundary state. The further construction is as in the proof of Theorem 31, with the exception that we do not need to collect information in intermediate steps, since the information flow is one-way in both devices. (see Fig. 12). □

Figure 12
figure 12_54

Construction showing the closure of real-time OCA languages under reversal

As mentioned above, the closure under reversal can be interpreted in two different ways. On one hand, one can construct an OCA that accepts the reversal of a given OCA language. On the other hand, one can construct an OCA that accepts the same language with the rightmost cell. The latter point of view yields immediately the closure under reversal of two-way linear-time cellular automata.

Theorem 41

The family \( \mathcal{L}_{\text{lt}}({\text{CA}}) = \mathcal{L}_{\text{lt}}({\text{IA}}) \) is effectively closed under reversal.

Proof

Let \( \mathcal{M} \) be some linear-time CA. In order to obtain a linear-time CA \( \mathcal{M} \)′ for the language L(\( \mathcal{M} \))R, the first and third arguments of the local transition function are interchanged. That is, δ'(s 3, s 2, s 1) = s 4 if δ(s 1, s 2, s 3) = s 4. The resulting device accepts L(\( \mathcal{M} \))R with the rightmost cell. Then the result is sent as a signal to the leftmost cell. Altogether, \( \mathcal{M} \)′ still obeys a linear time bound. □

The closure under reversal of the devices without time bounds follows from the known closures of the characterizing linguistic language family, i. e., the deterministic context‐sensitive languages DSPACE(n).

Corollary 42

The family \( \mathcal{L}({\text{CA}}) = \mathcal{L}({\text{IA}}) \) is effectively closed under reversal.

In [16,32] unbounded time OCAs are simulated by a variant of unbounded time one-way iterative arrays and vice versa. Moreover, it is shown that the family of accepted languages forms an AFL, i. e., an abstract family of languages, (e. g. [72]) which is in addition closed under reversal.

Lemma 43

The family \( \mathcal{L}({\text{OCA}}) \) is effectively closed under reversal.

In order to show negative closure properties it is sometimes convenient to have witness languages not belonging to the family in question. By Example 7 the language

$$ \begin{aligned} L=\{\texttt{\&} x_k \texttt{\&} \cdots\texttt{\&} x_1 \texttt{?} y_1\texttt{\&} \cdots\texttt{\&} y_k\texttt{\&} \mid k\geq 1, x_i^R=y_iz_i \\ \text{and}\quad x_i, y_i, z_i \in \{a,b\}^*\} \end{aligned} $$

is known not to belong to \( \mathcal{L} \) rt(IA).

Theorem 44

The family \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) is not closed under reversal.

Proof

It suffices to show that L R belongs to \( \mathcal{L} \) rt(IA). To this end, a real-time deterministic pushdown automaton accepting L R is easily constructed. Then by Theorem 4 the containment L R ∈ \( \mathcal{L} \) rt(IA) follows. □

If the answer to the open reversal closure of \( \mathcal{L} \) rt(CA) is negative, we have to deal with two different language families. Since the properness of the inclusion \( \mathcal{L}_{\text{rt}}(\text{CA}) \subseteq \mathcal{L}_{\text{lt}}(\text{CA}) \) is also open, the problem gains in importance. A negative answer of the former problem would imply a proper inclusion. A language L ∈ \( \mathcal{L} \) rt(CA) whose reversal does not belong to \( \mathcal{L} \) rt(CA) may serve as witness since \( \mathcal{L} \) lt(CA) is closed under reversal by Theorem 41. In fact, the following stronger relation is shown in [33] by a long proof.

Theorem 45

The family \( \mathcal{L}_\text{rt}({\text{CA}}) \) is closed under reversal if and only if \( \mathcal{L}_\text{rt}({\text{CA}}) \) and \( \mathcal{L}_{\text{lt}}({\text{CA}}) \) are identical.

Concatenation

Concerning the closure properties under concatenation, the situation is similar to reversal. Either the properties are trivial due to the characterizations by well-known language families, or they are negative, or open problems. For devices without time bounds we have, again, the closures of the deterministic context‐sensitive languages DSPACE(n).

Corollary 46

The family \( \mathcal{L}({\text{CA}}) = \mathcal{L}({\text{IA}}) \) is effectively closed under concatenation.

Since any AFL is closed under concatenation and, as mentioned before, \( \mathcal{L} \)(OCA) is an AFL [16,32] whose closure properties are shown by simulations, the next lemma follows immediately.

Lemma 47

The family \( \mathcal{L}({\text{OCA}}) \) is effectively closed under concatenation.

In order to show that \( \mathcal{L} \) rt(IA) is not closed under concatenation, once more the witness is

$$ \begin{aligned} L=\{\texttt{\&} x_k \texttt{\&} \cdots\texttt{\&} x_1 \texttt{?} y_1\texttt{\&} \cdots\texttt{\&} y_k\texttt{\&} \mid k\geq 1, x_i^R=y_iz_i \\ \text{and}\quad x_i, y_i, z_i \in \{a,b\}^*\}\:. \end{aligned} $$

Theorem 48

The family \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) is not closed under concatenation.

Proof

In contrast to the assertion, assume that \( \mathcal{L} \) rt(IA) is closed under concatenation. The language

$$ L_1 = \{\texttt{\&} x (\texttt{\&}\{a,b\}^+)^k\texttt{?} \texttt{\&}^ky \texttt{\&} \!\mid\! x^R=yz,\,x, y, z \in \{a,b\}^*\} $$

is clearly a real-time deterministic context-free and, thus, a real-time IA language. The language

$$ L_2 = (\texttt{\&}\{a,b\}^+)^* $$

is regular and, therefore, accepted by some real-time IA, too. Due to the assumption, the language

$$ L_3 = L_2 L_1\texttt{\&}^* $$

belongs also to \( \mathcal{L} \) rt(IA). Next, the language

$$ L_4 = \{(\texttt{\&}\{a,b\}^+)^k\texttt{?} (\{a,b\}^*\texttt{\&})^k \mid k\geq 1\} $$

is a real-time deterministic context-free language and therefore accepted by some real-time IA. Finally, from the closure under intersection we obtain

$$ L_5 = L_3 \cap L_4 \in \mathcal{L}_{\text{rt}}({\text{IA}})\:. $$

However, the language L 5 ⊂ L contains the words used to show L ∉ \( \mathcal{L} \) rt(IA). We conclude L 5 ∉ \( \mathcal{L} \) rt(IA), a contradiction. □

The question whether or not the family \( \mathcal{L}_\mathrm{rt} \)(OCA) is closed under concatenation was open for a long time. It has been solved negatively in [84]. Here we utilize the language L d of Example 12. Recall that L d  ⊂ {0,1,(,),|}+ is the language whose words are of the form

$$ x(x_{1} | y_{1}) \cdots (x_{n} | y_{n}) y \:, $$

where \( x,x_{i},y,y_{i} \in \{0,1\}^{*} \), for 1 ≤ i ≤ n, and (x|y) = (x i |y i ), for at least one i ∈ {1, …, n}.

The following example will be helpful.

Example 49

The language \( L_c=\{w \mathop{\bullet} w \mid w \in \{0,1\}^+\} \) is a real-time OCA language.

An acceptor for L c has several tracks. On one track the input is preserved. On two different tracks the input is successively shifted to the left. The shifting is in such a way that a symbol moves to the left one cell per time step until it passes through the center cell with input •. Subsequently, it moves to the left every other time step. In order to achieve this slow-down, the second track is used. Symbols are received in the first register, then shifted to the second one and, finally, send to the left neighbor. In addition, at initial time a signal is emitted at the right border. When the signal has passed through the center cell, it starts to compare the original input symbol with the symbol to be shifted out of the cell next.

Let the input be \( a_m \cdots a_1 \mathop{\bullet} a^{\prime}_m \cdots a^{\prime}_1 \). At time step m + 1 + i, 1 ≤ i ≤ m, the signal arrives in the cell with input a i . The symbol a i takes m + 1 −i time steps to arrive at the center, and is to be shifted out of the cell carrying the original input a i at time m + 1 −i + 2i. So, the signal compares a i with a i as required. □

Theorem 50

The family \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) is not closed under concatenation.

Proof

Consider the language L 1 whose words are of the form

$$ x(x_{1}|y_{1}) \cdots (x_{n}|y_{n}) (x|, $$

where \( x, x_i, y_i \in \{0,1\}^* \), for 1 ≤ i ≤ n.

Similar to Example 49 we obtain a simple construction of a real-time OCA accepting L 1. For symmetry reasons the language L 2 whose words are of the form

$$ y) ( x_{1}| y_{1}) \cdots ( x_{n} | y_{n}) y\:, $$

where \( y,x_i,y_i \in \{0,1\}^* \), for 1 ≤ i ≤ n, is a real-time OCA language, too.

However, the concatenation L 1 L 2 equals L d , which does not belong to \( \mathcal{L} \) rt(OCA). □

The question whether or not one of the families \( \mathcal{L}_\mathrm{rt}(\mathrm{CA}) = \mathcal{L}_{\mathrm{lt}}^R (\mathrm{OCA}) \) or \( \mathcal{L}_{\mathrm{lt}} (\mathrm{CA}) = \mathcal{L}_{\mathrm{lt}} (\mathrm{IA}) \) is closed under concatenation is another famous open problem in this field. Nevertheless, it is shown in [33] that the closure of \( \mathcal{L} \) rt(CA) under reversal implies its closure under concatenation. Since in this case we obtain \( \mathcal{L} \) rt(CA) = \( \mathcal{L} \) lt(CA), the family of linear-time CA languages were also closed under concatenation.

Theorem 51

If the family \( \mathcal{L}_{\text{rt}}({\text{CA}}) \) is closed under reversal, then it is closed under concatenation.

Proof

If \( \mathcal{L} \) rt(CA) is closed under reversal, then by Theorem 45 we have the identity \( \mathcal{L} \) rt(CA) = \( \mathcal{L} \) lt(CA). So, it suffices to construct a linear-time CA \( \mathcal{M} \) for the concatenation of two real-time CA languages L 1 and L 2. By the closure under reversal, Theorem 31, and the speed-up theorems, there are 2n‑time OCAs \( \mathcal{M} \) 1 and \( \mathcal{M} \) 2 for L R1 and L 2.

During a first phase, automaton \( \mathcal{M} \) reverses its input, say a 1 ⋯ a n , on an extra track. This takes n time steps.

During a second phase, automaton \( \mathcal{M} \) simulates automaton \( \mathcal{M} \) 1 on a n  ⋯ a 1 and \( \mathcal{M} \) 2 on a 1 ⋯ a n in parallel. When some cell enters an accepting state during the simulations, the cell is marked on the corresponding track. If a cell at position n + \( 1 \) −i carrying the input symbol a i is marked by the simulation of \( \mathcal{M} \) 1, we have a i  ⋯ a 1 ∈ L R1 and, thus, a 1 ⋯ a i  ∈ L 1. If a cell at position i carrying the input symbol a i is marked by the simulation of \( \mathcal{M} \) 2, we have a i  ⋯ a n  ∈ L 2. So, the input a 1 ⋯ a n belongs to the concatenation L 1 L 2 if and only if \( \mathcal{M} \) 1 marks a cell at position n + 1 −i and \( \mathcal{M} \) 2 a cell at position i + 1, for 1 ≤ i < n. In order to check this condition, \( \mathcal{M} \) reverses the result of the simulation of \( \mathcal{M} \) 1. Now a cell at position i is marked if and only if previously a cell at position n + 1 −i was marked. Therefore, it suffices to verify by a signal whether two adjacent cells are marked. □

The concatenation closure for unary real-time CA languages has been solved in the affirmative [33].

Table 1 summarizes some closure properties of the language families in question.

Table 1 Summary of closure properties. Concatenation REG denotes the concatenation with regular languages at the right, REG concatenation at the left, hom denotes homomorphisms, gsm generalized sequential machine mappings, and inj. length-pres. abbreviates injective length‐preserving. A + indicates closure, a − non-closure, and a question mark an open problem

Decidability Problems

It is well known that all nontrivial decidability problems for Turing machines are undecidable [69]. Moreover, many of them are not even semidecidable, e. g., neither finiteness nor infiniteness. Now we turn to explore undecidable properties for cellular automata and iterative arrays. Most of the early results are shown in [73] by reductions of the the Post Correspondence Problem. In terms of trellis automata the undecidability of emptiness, equivalence, and universality is derived in [22]. Here we present improved results that show the non‐ semidecidability of the properties. Almost all results in this section are obtained by Andreas Malcher in [58,59].

In [30] large Turing machine computations have been encoded in small grammars. These encodings and variants thereof are of tangible advantage for our purposes. To this end, we consider valid computations of Turing machines. Roughly speaking, these are histories of accepting Turing machine computations. It suffices to consider deterministic Turing machines with a single tape and a single read-write head. Without loss of generality and for technical reasons, one can assume that any accepting computation has at least three and, in general, an odd number of steps. Therefore, it is represented by an even number of configurations. Moreover, it is assumed that the Turing machine cannot print blanks, and that a configuration is halting if and only if it is accepting.

Let S be the state set of some Turing machine \( \mathcal{M} \), where s 0 is the initial state, T ∩ S = ∅ is the tape alphabet containing the blank symbol, A ⊂ T is the input alphabet, and F ⊆ S is the set of accepting states. Then a configuration of \( \mathcal{M} \) can be written as a word of the form T ST such that t 1 ⋯ t i st i + 1 ⋯ t n is used to express that \( \mathcal{M} \) is in state s, scanning tape symbol t i + 1, and t 1 to t n is the support of the tape inscription. The set of valid computations VALC(\( \mathcal{M} \)) is now defined to be the set of words of the form w 1 $ w 3 $ ⋯ $ w 2k − 1 ¢ w R2k $ ⋯ $ w R4 $ w R2 , where w i are configurations, $ and ¢ are symbols not appearing in w i , w 1 is an initial configuration of the form s 0 A , w 2k is an accepting configuration of the form T FT , and w i + 1 is the successor configuration of w i , for 1 ≤ i ≤ 2k. The set of invalid computations INVALC(\( \mathcal{M} \)) is the complement of VALC(\( \mathcal{M} \)) with respect to the coding alphabet {$,¢} ∪ T ∪ S. The following lemma shows some of the important properties of valid computations.

Lemma 52

Let \( \mathcal{M} \) be some Turing machine.

  1. 1.

    \( L(\mathcal{M}) \) is empty if and only if \( VALC(\mathcal{M}) \) is empty.

  2. 2.

    \( L(\mathcal{M}) \) is finite if and only if \( VALC(\mathcal{M}) \) is finite.

  3. 3.

    \( L(\mathcal{M}) \) is finite if and only if \( VALC(\mathcal{M}) \) is context free.

  4. 4.

    \( L(\mathcal{M}) \) is finite if and only if \( INVALC(\mathcal{M}) \) is regular.

  5. 5.

    \( VALC(\mathcal{M}) \) can be represented by the intersection of two real-time deterministic, linear context-free languages, such that both deterministic pushdown automata and both linear context-free grammars can effectively be constructed from \( \mathcal{M} \).

  6. 6.

    \( INVALC(\mathcal{M}) \) is a linear context-free language, such that its grammar can effectively be constructed from \( \mathcal{M} \).

Proof

Assertions 1 and 2 are immediate observations. In order to show assertion 3 assume that L(\( \mathcal{M} \)) is finite. Then VALC(\( \mathcal{M} \)) is finite and, clearly, context free. If conversely L(\( \mathcal{M} \)) is infinite, then an application of the pumping lemma shows that VALC(\( \mathcal{M} \)) is not context free [30]. Now, assertion 4 is shown as follows. If L(\( \mathcal{M} \)) is finite, then VALC(\( \mathcal{M} \)) is finite. Therefore, INVALC(\( \mathcal{M} \)) is co-finite and, thus, regular. Conversely, if INVALC(\( \mathcal{M} \)) is regular, then VALC(\( \mathcal{M} \)) is regular since the regular languages are closed under complementation. Therefore, VALC(\( \mathcal{M} \)) is context free which implies that L(\( \mathcal{M} \)) is finite. The two deterministic, linear context-free languages for assertion 5 are constructed in [4]. Assertion 6 has been shown in [30] for a similar definition of invalid computations. The proof can easily be adapted. □

Lemma 53

Let \( \mathcal{M} \) be a Turing machine. Then both languages \( VALC(\mathcal{M}) \) and \( INVALC(\mathcal{M}) \) are accepted by real-time OCAs as well as by real-time IAs.

Proof

Given some Turing machine \( \mathcal{M} \), by Lemma 52 item 6 we can effectively construct a linear context-free grammar for the language INVALC(\( \mathcal{M} \)). Due to Theorem 19, a real-time OCA accepting INVALC(\( \mathcal{M} \)) can be constructed from the grammar. Since \( \mathcal{L} \) rt(OCA) is effectively closed under complementation, we obtain a real-time OCA that accepts the valid computations of \( \mathcal{M} \).

Similarly, by Lemma 52 item 5 we can effectively construct two real-time deterministic pushdown automata whose intersection represents the language VALC(\( \mathcal{M} \)). Due to Theorem 4, two real-time IAs can be constructed from the pushdown automata. Since the family \( \mathcal{L} \) rt(IA) is effectively closed under intersection and complementation, we obtain real-time IAs that accept the valid and the invalid computations of \( \mathcal{M} \). □

Now we are prepared to reduce the finiteness and infiniteness problems of Turing machines to some of the decidability problems in question.

Theorem 54

For any language family that effectively contains \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) or \( \mathcal{L}_{\text{rt}}({\text{IA}}) \), emptiness, universality, finiteness, infiniteness, context‐freeness, and regularity are not semidecidable.

Proof

Given some Turing machine \( \mathcal{M} \), by Lemma 53 we can effectively construct a real-time OCA and a real-time IA for the language VALC(\( \mathcal{M} \)). Therefore, by Lemma 51 item 1, if emptiness were semidecidable for real-time OCAs or IAs, then emptiness is semidecidable for Turing machines, too.

Since L(\( \mathcal{M} \)) = A if and only if the complement of L(\( \mathcal{M} \)) is empty, the non‐semidecidability of universality follows from the effective closure under complementation and the non‐semidecidability of emptiness.

In the same way as emptiness and universality the non‐semidecidability of finiteness and infiniteness follows.

Since we have constructed real-time OCAs and IAs for INVALC(\( \mathcal{M} \)) as well as for VALC(\( \mathcal{M} \)), the finiteness problem for Turing machines immediately reduces to the context‐freeness and to the regularity problem for \( \mathcal{L} \) rt(OCA) and \( \mathcal{L} \) rt(IA) by Lemma 52 items 3 and 4.□

Theorem 55

For any language family that effectively contains \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) or \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) equivalence and inclusion are not semidecidable.

Proof

It is easy to construct a real-time OCA and a real-time IA that accept the empty language. So, the semidecidability of equivalence would imply the semidecidability of emptiness. Since L(\( \mathcal{M} \)) = L(\( \mathcal{M} \)′) if and only if L(\( \mathcal{M} \)) ⊆ L(\( \mathcal{M} \)′) and L(\( \mathcal{M} \)′) ⊆ L(\( \mathcal{M} \)), inclusion is not semidecidable either. □

Next the question arises whether some structural properties of cellular language acceptors are (semi)decidable. For example, whether or not a real-time two-way language is a real-time one-way language. We also compare sequential input mode and two-way information flow with parallel input mode and one-way information flow from a decidability point of view. The questions turn out to be not even semidecidable. So the resources inherent in the structures of cellular automata seem to be fairly different.

Let \( \mathcal{M} \) be some Turing machine. We consider the language

$$ L_{\mathcal{M}} = \{w^{|w|!} \mid w \in \triangleright \text{VALC}(\mathcal{M})\triangleleft\}\:, $$

where ▹ and ◃ are new symbols not appearing in VALC(\( \mathcal{M} \)), and deal with its acceptance.

Lemma 56

Given some Turing machine \( \mathcal{M} \), a real-time IA accepting \( L_{\mathcal{M}} \) can effectively be constructed from \( \mathcal{M} \), i. e., the language \( L_{\mathcal{M}} \) belongs to \( \mathcal{L}_{\text{rt}}({\text{IA}}) \).

Proof

We set B to be the alphabet of VALC(\( \mathcal{M} \)), and represent \( L_{\mathcal{M}} \) as intersection of the three languages

$$ \begin{aligned} L_{\mathcal{M},1} &= \{w \mid w \in (\triangleright\text{VALC}(\mathcal{M})\triangleleft)^*\}\:,\\ L_{\mathcal{M},2} &= \{w^i \mid w \in \triangleright B^*\triangleleft,\, i\geq 2,\, i \quad\text{even}\}, \text{and}\\ L_{\mathcal{M},3} &= \{wu \mid w \in \triangleright B^*\triangleleft,\, u \in (\triangleright B^*\triangleleft)^*, |wu|_{\triangleright}=|w|! \}. \end{aligned} $$

Since \( \mathcal{L} \) rt(IA) is closed under intersection, it remains to be shown that each of the languages is accepted by some real-time IA. Language \( L_{\mathcal{M},1} \) is the marked iteration of ▹VALC(\( \mathcal{M} \)) followed by a single ◃, where ◃ is the marking symbol. Since VALC\( (\mathcal{M}) \in \mathcal{L}_\mathrm{rt} \)(IA) and due to its closure under concatenation with single symbols and under marked iteration [73], we obtain \( L_{\mathcal{M}, 1} \in \mathcal{L}_\mathrm{rt} \)(IA).

The copy language \( \{vv \mid v\in B^*\} \) belongs to \( \mathcal{L} \) rt(IA) [19]. So, \( L = \{vv\mid v\in \triangleright B^*\triangleleft\} \) is accepted by some real-time IA. Since \( \mathcal{L}_\mathrm{rt}{\text{(IA)}} \) is closed under marked iteration and intersection, the languages \( L^* \) and \( \triangleright B^*\triangleleft L^*\triangleright B^*\triangleleft \) as well as their intersection are accepted by some real-time IA. Here the marking is hidden in a regular structure. The new word starts after the second symbol ◃, respectively. Clearly, the intersection equals \( L_{\mathcal{M},2} \).

The construction of a real-time IA accepting the language \( L_{\mathcal{M},3} \) makes use of the IA‐constructibility of the factorials (cf. Example 1 and the discussion before Theorem 26). Recall, that this means that there is an IA which indicates by states of the leftmost cell the time steps n!, for n ≥ 1. For further results on IA‐constructibility we refer to [13,27,63].

Since all constructions and closures are effective, the assertion follows. □

Lemma 57

Let \( \mathcal{M} \) be some Turing machine. Then \( L_{\mathcal{M}} \) belongs to \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) if and only if \( L(\mathcal{M}) \) is finite.

Proof

We observe that a finite language is regular and that the regular languages are accepted by real-time OCAs. Conversely, if L(\( \mathcal{M} \)) is infinite, then we apply Lemma 8 in order to show \( L_{\mathcal{M}} \) ∉ \( \mathcal{L} \) rt(OCA). Assume the contrary, and let p be the constant of Lemma 8. Clearly, due to the infinity of L(\( \mathcal{M} \)) there is some w ∈ ▹VALC(\( \mathcal{M} \))◃ such that |w|! > p |w|. We conclude w |w|! ∈ \( L_{\mathcal{M}} \), and the conditions of Lemma 8 are met with k = |w|!. Therefore, there is some 1 ≤ q ≤ p |w| such that w |w|!+q ∈ \( L_{\mathcal{M}} \). But |w|! < |w|!+q < (|w|+1)! and, thus, |w|!+q is not a factorial which implies the contradiction w |w|!+q ∉ \( L_{\mathcal{M}} \). □

Now we may obtain the next (un)decidability result.

Theorem 58

For any language family \( \mathcal{L} \) that effectively contains \( \mathcal{L}_{\text{rt}}({\text{IA}}) \), it is not semidecidable whether \( L \in \mathcal{L} \) is a real-time OCA language.

Proof

If the problem in question were semidecidable, then the finiteness for Turing machines is also semidecidable. To this end, given some Turing machine \( \mathcal{M} \) we construct a real-time IA for the language \( L_{\mathcal{M}} \) according to Lemma 56. If \( L_{\mathcal{M}} \) is accepted by some real-time OCA, then \( L_{\mathcal{M}} \) is finite by Lemma 57. This implies the finiteness of VALC(\( \mathcal{M} \)) and, thus, the finiteness of L(\( \mathcal{M} \)). □

Since the families \( \mathcal{L} \) rt(OCA) and \( \mathcal{L} \) rt(IA) are incomparable with respect to set inclusion, there is a natural interest to know whether the incomparability is also with respect to the decidability in question. Therefore, we turn to the converse question of Theorem 58, i. e., whether it is (semi)decidable that a real-time OCA language is accepted by some real-time IA. First we need some preliminaries.

A Turing machine \( \mathcal{M} \) is converted into a Turing machine \( \mathcal{M} \)′ such that the input alphabet A′ of \( \mathcal{M} \)′ contains at least two symbols and, furthermore, \( \mathcal{M} \)′ accepts any input of length n if and only if there is at least one input of length n accepted by \( \mathcal{M} \). Clearly, this conversion is always effectively possible. Extending a frequently used witness language, we set

$$ \begin{aligned} L^{\prime}_{\mathcal{M}}= \{\texttt{\&} x_k \texttt{\&} \cdots\texttt{\&} x_1 \texttt{?} y_1\texttt{\&} \cdots\texttt{\&} y_k\texttt{\&} \mid k\geq 1, x_i^R=y_iz_i \\ \text{and}\quad y_iz_i \in (A^{\prime})^*\quad\text{and}\quad x_i \in {\text{VALC}}(\mathcal{M}^{\prime})^R\}\:, \end{aligned} $$

where & and ? are new symbols not appearing in VALC(\( \mathcal{M} \)′).

Lemma 59

Let \( \mathcal{M} \) be some Turing machine. Then \( L^{\prime}_{\mathcal{M}} \) belongs to \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) if and only if \( L(\mathcal{M}) \) is finite.

Proof

If L(\( \mathcal{M} \)) is finite, then so is L(\( \mathcal{M} \)′) and, thus, VALC(\( \mathcal{M} \)′)R is finite, say VALC(\( \mathcal{M} \)′)R = {v 1, v 2, …, v r }. A real-time deterministic pushdown automaton accepting \( L^{\prime}_{\mathcal{M}} \) has r different stack symbols representing the elements in {v 1, v 2, …, v r }. It reads the input until the ? appears. For any occurring v ∈ VALC(\( \mathcal{M} \)′)R the corresponding stack symbol is pushed onto the stack. After reading the ?, the pushdown automaton matches each y i with the suffix of the v ∈ VALC(\( \mathcal{M} \)′)R which is identified by the symbol at the top of the stack. By Theorem 4, we obtain \( L^{\prime}_{\mathcal{M}} \) ∈ \( \mathcal{L} \) rt(IA).

Now let L(\( \mathcal{M} \)) be infinite. Then L(\( \mathcal{M} \)′), VALC(\( \mathcal{M} \)′)R, and \( L^{\prime}_{\mathcal{M}} \) are infinite, as well. In order to show that in this case \( L^{\prime}_{\mathcal{M}} \) does not belong to \( \mathcal{L} \) rt(IA) we apply Lemma 6 as follows. Assume in contrast to the assertion that \( L^{\prime}_{\mathcal{M}} \) is accepted by some real-time IA with state set S. Every v ∈ VALC(\( \mathcal{M} \)′)R has a suffix of the form $us 0, where s 0 is the initial state and u = input(v) is the reversal of the input. Let |u| = k. Moreover, due to the construction of \( \mathcal{M} \)′, for every input word with length k there is an element in VALC(\( \mathcal{M} \)′)R. We denote the set of these elements by V(k) and conclude |V(k)| = |A′|k. For two different prefixes w = &x k & ⋯ &x 1? and w′ = &x k & ⋯ &x1? with x i , x i  ∈ V(k), 1 ≤ i ≤ k, there exists at least one 1 ≤ j ≤ k such that x j \( \ne \)x j . Therefore, w & j −1s 0 \( { \textit{input} (x_j)^R } \) & k −j + 1 ∈ \( L^{\prime}_{\mathcal{M}} \) and w′ & j −1 s 0input(x j )R& k −j + 1 ∉ \( L^{\prime}_{\mathcal{M}} \). Since the number of such prefixes is \( |A^{\prime}|^{k^2} \) and |A′| ≥ 2, we obtain at least \( 2^{k^2} \) different 2k‑equivalence classes with respect to \( L^{\prime}_{\mathcal{M}} \). On the other hand, there is a constant p ≥ 1 such that E(\( L^{\prime}_{\mathcal{M}} \),2k) ≤ p 2k. Since L(\( \mathcal{M} \)) is infinite, we may choose k large enough such that \( 2^{k^2} \mathchar"313E p^{2k} \), which is a contradiction. □

Lemma 60

Given some Turing machine \( \mathcal{M} \), a real-time OCA accepting \( L^{\prime}_{\mathcal{M}} \) can effectively be constructed from \( \mathcal{M} \), i. e., the language \( L^{\prime}_{\mathcal{M}} \) belongs to \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \).

Proof

The language \( L^{\prime}_{\mathcal{M}} \) can be represented as the intersection of L 1 and L 2, where L 1 = {& x k & ⋯ & x 1 ? y 1 & ⋯ & y k & | k ≥ 1, x R i  = y i z i and \( x_i, y_i, z_i \in (A^{\prime})^*\} \) and \( L_2 = (\texttt{\&} {\text{VALC}}(\mathcal{M}^{\prime})^R)^* \texttt{?} ((A^{\prime})^* \texttt{\&})^* \). Since L 1 is a linear context-free language, it belongs to \( \mathcal{L} \) rt(OCA). The family \( \mathcal{L} \) rt(OCA) contains VALC(\( \mathcal{M} \)′), is closed under reversal, marked iteration and right concatenation with regular sets [73]. Therefore, L 2 belongs to \( \mathcal{L} \) rt(OCA), as well. From its closure under intersection we derive \( L^{\prime}_{\mathcal{M}} \) ∈ \( \mathcal{L} \) rt(OCA).□

Similarly to Theorem 58 we obtain the next undecidability of a structural property.

Theorem 61

For any language family \( \mathcal{L} \) that effectively contains \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) it is not semidecidable whether \( L \in \mathcal{L} \) is a real-time IA language.

Proof

If the problem in question were semidecidable, then also the finiteness for Turing machines. To this end, given some Turing machine \( \mathcal{M} \), we construct a real-time OCA for the language \( L^{\prime}_{\mathcal{M}} \) according to Lemma 60. If \( L^{\prime}_{\mathcal{M}} \) is accepted by some real-time IA, then L(\( \mathcal{M} \)) is finite by Lemma 59. □

In general, a family \( \mathcal{L} \) of languages possesses a  pumping lemma in the narrow sense if for each L ∈ \( \mathcal{L} \) there exists a constant n ≥ 1 computable from L such that each z ∈ L with |z| > n admits a factorization z = uvw, where |v| ≥ 1 and uv i w′ ∈ L, for infinitely many i ≥ 0. The prefix u′ and the suffix w′ depend on u, w and i.

Theorem 62

Any language family whose word problem is semidecidable and that effectively contains \( \mathcal{L}_{\text{rt}}({\text{OCA}}) \) or \( \mathcal{L}_{\text{rt}}({\text{IA}}) \) does not possess a pumping lemma (in the narrow sense).

Proof

Let \( \mathcal{M} \) be a real-time OCA or IA and assume there is a pumping lemma. Clearly, L(\( \mathcal{M} \)) is infinite if and only if it contains some w with |w| > n. So, we can semidecide infiniteness by first computing n and then verifying for all words longer than n whether they belong to L(\( \mathcal{M} \)). If at least for one word the answer is in the affirmative, then by pumping infinitely many words belong to L(\( \mathcal{M} \)). □

Example 63

The families \( \mathcal{L} \) rt(OCA) and \( \mathcal{L} \) rt(IA) itself as well as, e. g., the families \( \mathcal{L} \) rt(CA), \( \mathcal{L} \) lt(OCA), \( \mathcal{L} \) lt(CA) = \( \mathcal{L} \) lt(IA), and \( \mathcal{L} \)(CA) = \( \mathcal{L} \)(IA) = DSPACE(n) do not possess a pumping lemma. □

Theorem 64

There is no minimization algorithm converting some CA, OCA or IA with arbitrary time complexity to an equivalent automaton of the same type with a minimal number of states.

Proof

For a given input alphabet A, we consider a minimal CA or OCA accepting the empty language. It has |A| states and no accepting states. Assume there is a minimization algorithm. Then we can minimize an arbitrary CA and OCA and check whether the result has |A| states and no accepting states. In this case the accepted language is empty. If the minimal automaton has |A| states and at least one accepting state, there is an input such that the leftmost cell is initially accepting. So, the input is accepted and the accepted language is not empty. Hence emptiness is decidable, which is a contradiction to Theorem 54. Similar arguments apply for IAs. □

It remains to be mentioned that there is a nontrivial decidable property of (unbounded) cellular automata. It is known that injectivity of the global transition function is equivalent to the reversibility of the automaton. It is shown in [3] that global reversibility is decidable for one‐dimensional CAs, whereas the problem is undecidable for higher dimensions [41].

Future Directions

The investigation of cellular language acceptors obeying a linear space bound reveals the hierarchy of language families in between the regular and the deterministic context‐sensitive languages established in Sect. “Computational Capacities” (see Fig. 11). If the space bound is omitted, that is, if there is a potentially unlimited number of cells, then computation universality is achieved by direct simulation of Turing machines [78]. In particular, the universality can be achieved in spite of additional structural and computational limitations [2,60,64,66]. Similarly, some space bound supposed for cellular language acceptors does not yield to new language families. A Turing machine sweeping back and forth over the nonempty part of the tape can simulate the parallel device obeying the same space bound.

On the other hand, if the cellular language acceptor is simultaneously s(n) space and t(n) time bounded, a Turing machine simulation takes s(n) · t(n) time. A challenging question for further investigations is to identify languages and language classes for which homogeneously structured massive parallelism can significantly decrease the time complexity of sequential devices. Of particular interest are languages which allow a maximal saving. That is, for a sequential time complexity t(n), the parallel time complexity is bounded by t(n)/s(n), where s(n) is the parallel space complexity. For example, in case of unary real-time languages, OCAs cannot do better than deterministic finite-state machine. Conversely, it is well known that any one-tape Turing machine takes at least O(n 2) time to accept the language of palindromes \( \{w\mid w=w^R, w\in\{a,b\}^*\} \). Since it is a linear context-free language, it is accepted by some real-time OCA, achieving the maximal saving in time.

From a more general point of view, central questions for future studies concern the power of additional limited resources at the disposal of time or space bounded computations. For example, nondeterminism, dimensions, the number of bits communicated to neighboring cells, or the restriction to reversible computations, all these can be seen as limited resources. We discuss some approaches in more detail.

Traditionally, nondeterministic devices have been viewed as having as many nondeterministic guesses as time steps. The studies of this concept of unlimited nondeterminism led, for example, to the famous open LBA-problem or the unsolved question whether or not P equals NP. In order to gain further understanding of the nature of nondeterminism , in [28,44] it has been viewed as an additional limited resource. In [11,46,52] cellular automata started to be considered from this point of view.

In classical computations the states of the neighboring cells are communicated in one time step. That is, the number of bits exchanged is determined by the number of states. A natural and interesting restriction is to limit the number of bits to some constant being independent of the number of states. Iterative arrays with restricted inter-cell communication have been investigated in [93,94], where algorithmic design techniques for sequence generation are shown. In particular, several important infinite, non-regular sequences such as exponential or polynomial, Fibonacci and prime sequences can be generated in real time. Connectivity recognition problems are dealt with in [92], whereas in [99] the computational capacity of one-way cellular automata with restricted inter-cell communication is considered. First results concerning formal language aspects of IAs with restricted inter-cell communication are shown in [53,54].

Finally, we turn to reversibility . Reversibility in the context of computing devices means that deterministic computations are also backward deterministic. Roughly speaking, in a reversible device no information is lost and every configuration occurring in any computation has at most one predecessor. Many different formal models have been studied in connection with reversibility. An early result on general reversible CAs is the possibility to make any CA, possibly irreversible, reversible by increasing the dimension. In detail, in [91] it is shown that any k‑dimensional CA can be embedded into a (k + 1)-dimensional reversible CA. This result has significantly been improved by showing how to make irreversible one‐dimensional CAs reversible without increasing the dimension [65]. Furthermore, it is known that even reversible one‐dimensional one-way CAs are computationally universal [64,66]. These results concern cellular automata with unbounded space. Moreover, in order to obtain a reversible device the neighborhood as well as the time complexity may be increased. In [23] it is shown that the neighborhood of a reversible CA is at most n −1 when the given reversible CA has n states. Additionally, this upper bound is shown to be tight. Cellular language acceptors with bounded space that are reversible on the core of computation, that is, from initial configuration to the configuration given by the time complexity, are introduced in [55,56]. At first glance, such a setting should simplify matters. However, it is quite the contrary, and such real-time reversibility is undecidable. There are many properties and relations still to be discovered in this setting.