1 Introduction

The motivation for insertion–deletion systems comes both from linguistics [26] (see [29] as a textbook on this topic) and from biology, more specifically from DNA processing and RNA editing. In particular, in the theoretical process of mismatched annealing of DNA sequences, certain segments of the strands are either inserted or deleted [31]. Likewise, in RNA editing, some fragments of messenger RNA are inserted or deleted [3, 4]. Another mathematical motivation for insertion operations can be seen in [15], where this operation and its iterated variant were introduced as a generalization of concatenation and Kleene’s closure. The deletion operation was first studied in [18] from a formal language perspective, where the deletion is defined as a quotient-like operation that does not necessarily happen at the right end of the string. Insertion and deletion operations together were introduced in a formal language theoretic fashion in [20]. The corresponding grammatical mechanism is called insertion–deletion system (abbreviated as ins–del system). Informally, if a string \(\eta \) is inserted between two parts \(w_1\) and \(w_2\) of a string \(w_1w_2\) to get \(w_1 \eta w_2\), we call the operation insertion, whereas if a substring \(\delta \) is deleted from a string \(w_1 \delta w_2\) to get \(w_1w_2\), we call the operation deletion. Several variants of ins–del systems have been considered in the literature, like ins–del P systems [2, 23], tissue P systems with ins–del rules [25], context-free ins–del systems [27], matrix ins–del systems [10, 24, 32], random context and semi-conditional ins–del systems [17], etc. All the mentioned papers (as well as [19, 33]) characterize the recursively enumerable languages (i.e., they show computational completeness) using ins–del systems. We refer to the survey article [34] for details of variants of ins–del systems; this survey also discusses some proof techniques for showing computational completeness results.

One of the important variants of ins–del systems is graph-controlled ins–del systems (abbreviated as GCID systems), introduced in [13] and further studied in [16]. In such a system, the concept of components is introduced, which are associated with insertion or deletion rules. The transition is performed by choosing any applicable rule from the set of rules of the current component and by moving the resultant string to the target component specified in the rule in order to continue processing it.

If the underlying graph of a GCID system establishes a path structure (loops, multiple edges and directions are ignored), then such a GCID system can be seen as a special form of a P system, namely, an ins–del P system. As P systems (a model for membrane computing) draw their origins from modeling computations of biological systems, considering insertions and deletions in this context is particularly meaningful. There is one small technical issue, namely, that in a P system there is usually no specific initial membrane where the computation begins, since the membranes evolve in a so-called maximally parallel way. However, if the collection of axioms in each membrane except one is empty, then this exceptional membrane can be viewed as an initial membrane to begin with, so that such a system works in the same way as a GCID system, where the membranes of the P system correspond to the components of the GCID system. For more details, see [30].

The mentioned connections motivate the study of GCID systems. Much research has since been devoted to restricting the computational resources as far as possible while still maintaining computational completeness. To be more concrete, typical questions are: To what extent can we limit the length of contexts and/or the insertion or deletion strings in a rule? How many components are needed with the limited size? Are there trade-offs between these questions? All this is formalized in the following.

In this paper, our objective is to find the sizes \((k;n,i',i'';m,j',j'')\) with which GCID systems can (still) describe the recursively enumerable languages, with the control graphs of the systems being paths, so that these results will correspond to computational completeness results of particular ins–del P systems, as noted above. The descriptional complexity of a GCID system is measured by its size \((k;n,i',i'';m,j',j'')\), where the parameters from left to right denote (i) the number of components, (ii) the maximal length of the insertion string, (iii) the maximal length of the left context used in insertion rules, (iv) the maximal length of the right context used in insertion rules, and the last three parameters follow a similar representation as (ii), (iii), (iv) with respect to deletion instead of insertion. The generative power of GCID systems for insertion/deletion lengths satisfying \(n+m\in \{2,3\}\) has also been studied in [11, 12, 16]. However, the control graph is not a path in many of these cases, so that they cannot also be viewed as ins–del P systems.

The main objective of this paper is to characterize the recursively enumerable languages (denoted as \(\mathrm {RE})\) by size-bounded GCID systems whose underlying (undirected) control graph is a path. This objective can be seen as a sort of syntactic restriction on GCID systems, on top of the usually considered numerical values limiting the descriptional complexity. We are interested in the question of which types of resources of path-structured GCID (as well as of membrane computing, i.e., P) systems are sufficient to characterize \(\mathrm {RE}\). We prove that GCID systems of sizes \((k;n,i',i'';1,j',j'')\) with \(i',i'',j',j''\in \{0,1\}\), and (i) \(k=3,~ n=1,~i'+i''=1,~ j'+j''=2,\) (ii) \(k=3,~ n=2,~i'+i''=1,~j'+j''=1,\) (iii) \(k=4, ~n=2,~i'+i''=0,~j'+j''=1\), (iv) \(k=4,~ n=1,~i'+i''=1,~j'+j''=1,\) all characterize \(\mathrm {RE}\) with a path as a control graph. Previously, such results were only known for GCID systems with arbitrary control graphs [11]. Our results may also revive interest in the conjecture of Ivanov and Verlan [16] which states that \(\mathrm {RE} \ne \mathrm {GCID}(s)\) if \(k=2\) in \(s=(k;1,i',i'';1,j',j'')\), with \(i',i'',j',j''\in \{0,1\}\) and \(i'+i''+j'+j''\le 3\). In the same situation, this statement is known to be true if \(k=1\). If the conjecture were true, then our results for \(k=3\) would be optimal. We will make this and other open problems explicit throughout the paper. Also, at the end of this paper, the reader can find tables summarizing the state-of-the-art in this area.

This study could also raise interest in considering other forms of syntactic restrictions, for instance, cycles (somehow reminiscent of time-variant control [5]) or stars and similar special trees, or variants of our notion of paths.

A preliminary version of this paper appeared in [8]; this version not only contains all proof details but also some new results.

2 Preliminaries

We assume that the readers are familiar with the standard notations used in formal language theory. Here, we recall a few notations for the understanding of the paper. Let \(\mathbb {N}\) denote the set of positive integers, and \([1\ldots k]=\{i\in \mathbb {N}:1\le i\le k\}\). Given an alphabet (finite set) \(\varSigma \), \(\varSigma ^*\) denotes the free monoid generated by \(\varSigma \). The elements of \(\varSigma ^*\) are called strings or words; \(\lambda \) denotes the empty string. For a string \(w \in \varSigma ^*\), |w| is the length of w and \(w^R\) denotes the reversal (mirror image) of w. \(L^R\) and \(\mathcal {L}^R\) are also understood for languages L and language families \(\mathcal {L}\), collecting all reversals of words from L and all reversals of languages from \(\mathcal {L}\), respectively. We also make use of the shuffle operation \(\sqcup \!\sqcup \) to describe the effect of insertions at a random position in the string. For the computational completeness results, we are using as our main tool the fact that type-0 grammars in Special Geffert Normal Form (SGNF) are known to characterize the recursively enumerable languages.
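
As a small example of the shuffle notation (recalling the standard definition), \(ab \,\sqcup \!\sqcup \, c=\{cab,\,acb,\,abc\}\); in particular, for a single symbol r and a string w, \(r\,\sqcup \!\sqcup \,w=\{w_1rw_2\mid w=w_1w_2\}\) is exactly the set of strings obtained by inserting r at an arbitrary position of w.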

Definition 1

A type-0 grammar \(G=(N,T,P,S)\) is said to be in SGNF if

  • N decomposes as \(N=N'\cup N''\), where \(N''=\{A_1,B_1,A_2,B_2\}\) and \(N'\) contains at least the two nonterminals S and \(S'\);

  • the only non-context-free rules in P are \(AB \rightarrow \lambda \), where \(AB\in \{A_1B_1,A_2B_2\}\);

  • the context-free rules are of the form (i) \(S'\rightarrow \lambda \), or (ii) \(X \rightarrow Y_1Y_2\), where \(X\in N'\) and \(Y_1Y_2\in ((N'{\setminus }\{X\})(T \cup N'')) \cup ((T\cup N'')(N'{\setminus }\{X\}))\).

The way the normal form is constructed is described in [13], based on [14]. We assume, without loss of generality, in this paper that the context-free rules \(r:X\rightarrow Y_1Y_2\) either satisfy \(Y_1\in \{A_1,A_2\}\) and \(Y_2\in N'\), or \(Y_1\in N'\) and \(Y_2\in \{B_1,B_2\}\cup T\). This also means that the derivation in G undergoes two phases: in phase I, only context-free rules are applied. This phase ends with applying the context-free deletion rule \(S' \rightarrow \lambda \). Then, only non-context-free deletion rules are applied in phase II. Notice that the symbol from \(N'\), as long as present, separates \(A_1\) and \(A_2\) from \(B_1\) and \(B_2\); this prevents a premature start of phase II. We write \(\Rightarrow _r\) to denote a single derivation step using rule r, and \(\Rightarrow _G\) (or \(\Rightarrow \) if no confusion arises) denotes a single derivation step using any rule of G. Then, \(L(G)=\{w\in T^*\mid S\Rightarrow ^* w\}\), where \(\Rightarrow ^*\) is the reflexive transitive closure of \(\Rightarrow \).
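
For orientation (this is a direct consequence of the restriction on \(Y_1Y_2\) stated above, not an additional assumption), every sentential form reached in phase I has the shape

$$\begin{aligned} S\Rightarrow ^* \alpha X\beta \quad \text {with } \alpha \in \{A_1,A_2\}^*,\ X\in N',\ \beta \in (\{B_1,B_2\}\cup T)^*, \end{aligned}$$

so that, after erasing \(X=S'\), the rules \(A_1B_1\rightarrow \lambda \) and \(A_2B_2\rightarrow \lambda \) of phase II can only cancel matching pairs at the border between \(\alpha \) and \(\beta \).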

2.1 Graph-controlled insertion–deletion systems

Definition 2

A graph-controlled insertion–deletion system (GCID for short) with k components is a construct \(\varPi = (k,V,T,A,H,i_0,i_f,R)\), where,

  • k is the number of components,

  • V is an alphabet,

  • \(T \subseteq V\) is the terminal alphabet,

  • \(A \subset V^*\) is a finite set of axioms,

  • H is a set of labels associated (in a one-to-one manner) to the rules in R,

  • \(i_0 \in [1 \ldots k]\) is the initial component,

  • \(i_f \in [1 \ldots k]\) is the final component, and

  • R is a finite set of rules of the form \(l : (i, r, j)\), where l is the label of the rule, r is an insertion rule of the form \((u, \eta , v)_I\) or a deletion rule of the form \((u, \delta , v)_D\), where \((u,v) \in V^* \times V^*\), \(\eta ,\delta \in V^+\) and \(i, j \in [1 \ldots k]\).

If the initial component itself is the final component, then we call the system returning. The pair \((u,v)\) is called the context, \(\eta \) is called the insertion string, \(\delta \) is called the deletion string and \(x \in A\) is called an axiom. If one of u or v is \(\lambda \) for all the insertion (deletion) contexts, then we call the insertion (deletion) one-sided. If both u and v are \(\lambda \) for every insertion (deletion) rule, then the corresponding insertion (deletion) can be performed freely anywhere in the string and is called context-free insertion (context-free deletion). We write rules in R in the form \(l : (i, r, j)\), where \(l \in H\) is the label associated to the rule. Often, the component is part of the label name. This will also (implicitly) define H. We shall omit the label l of a rule wherever it is not necessary for the discussion.

We now describe how GCID systems work. Applying an insertion rule of the form \((u,\eta ,v)_I\) means that the string \(\eta \) is inserted between u and v; this corresponds to the rewriting rule \(uv \rightarrow u \eta v\). Similarly, applying a deletion rule of the form \((u, \delta ,v)_D\) means that the string \(\delta \) is deleted between u and v; this corresponds to the rewriting rule \(u \delta v \rightarrow uv\). A configuration of \(\varPi \) is represented by \((w)_i\), where \(i\in [1\ldots k]\) is the number of the current component and \(w\in V^*\) is the current string. In that case, we also say that w has entered component Ci. We write \((w)_i \Rightarrow _{l} (w')_j\) or \((w')_j \Leftarrow _{l} (w)_i\) if there is a rule l : (irj) in R, and \(w'\) is obtained by applying the insertion or deletion rule r to w. By \({(w)_i}_{\Leftarrow _{l'}} ^{\Rightarrow _{l}} (w')_j\), we mean that \((w')_j\) is derivable from \((w)_i\) using rule l and \((w)_i\) is derivable from \((w')_j\) using rule \(l'\). For brevity, we write \((w)_i\Rightarrow (w')_j\) if there is some rule l in R such that \((w)_i\Rightarrow _l (w')_j\). To avoid confusion with traditional grammars, we write \(\Rightarrow _*\) for the transitive reflexive closure of \(\Rightarrow \) between configurations. The language \(L(\varPi )\) generated by \(\varPi \) is defined as \(\{w\in T^*\mid (x)_{i_0}\Rightarrow _* (w)_{i_f} ~\text {for some}~ x \in A\}\). For returning GCID systems \(\varPi \) with initial component C1, we also write \((w)_1\Rightarrow '(w')_1\), meaning that there is a sequence of derivation steps \((w)_1\Rightarrow (w_1)_{c_1}\Rightarrow \cdots \Rightarrow (w_k)_{c_k}\Rightarrow (w')_1\) such that, for all \(i\in \{1,\dots ,k\}\), \(c_i\ne 1\).
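
To make the effect of a single rule application concrete, the following is a minimal Python sketch (our own illustration; the rule encoding and the function name apply_rule are ad hoc and not taken from the paper or any library). It computes all successor configurations of a string w under one rule, treating symbols as single characters:

# A rule is encoded as (i, (u, s, v), kind, j): source component i,
# contexts u and v, insertion/deletion string s, kind "I" or "D",
# and target component j.
def apply_rule(w, rule):
    i, (u, s, v), kind, j = rule
    successors = []
    if kind == "I":                      # (u, s, v)_I: insert s between u and v
        pattern = u + v
        for pos in range(len(w) - len(pattern) + 1):
            if w[pos:pos + len(pattern)] == pattern:
                successors.append((w[:pos + len(u)] + s + w[pos + len(u):], j))
    else:                                # (u, s, v)_D: delete s between u and v
        pattern = u + s + v
        for pos in range(len(w) - len(pattern) + 1):
            if w[pos:pos + len(pattern)] == pattern:
                successors.append((w[:pos + len(u)] + w[pos + len(u) + len(s):], j))
    return successors

# Context-free insertion, e.g. (1, ("", "Y", ""), "I", 2) applied to "aXb",
# yields ("YaXb", 2), ("aYXb", 2), ("aXYb", 2) and ("aXbY", 2).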

The underlying control graph of a graph-controlled insertion–deletion system \(\varPi \) with k components is defined to be a graph with k nodes labelled C1 through Ck, with a directed edge from node Ci to node Cj if there exists a rule of the form \((i, r, j)\) in R of \(\varPi \). We also associate a simple undirected graph on k nodes to a GCID system of k components as follows: there is an undirected edge from a node Ci to Cj (\(i \ne j\)) if there exists a rule of the form \((i,r_1,j)\) or \((j,r_2,i)\) in R of \(\varPi \) (hence, loops and multi-edges are excluded). We call a returning GCID system with k components path-structured if its underlying undirected control graph has the edge set \(\left\{ \{Ci,C(i+1)\} \mid i\in [1\dots k-1] \right\} \). Incidentally, when the underlying control graph forms a path or tree structure, this system can be interpreted as an insertion–deletion P system [23], where several membranes (possibly, with nesting membrane structures) contain objects (in our case, strings) and based on the rules available in the membrane, they evolve (in parallel) and are sent to adjacent membranes, as specified in the rule as its target membrane. For more details, see [30]. We also mention the paper [1] on length P systems, where a linear membrane structure with multiset objects in the form of vectors/numbers is considered.
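
The path-structure condition is easy to check mechanically; here is a small Python sketch (again our own illustration, with rules abstracted to triples \((i,r,j)\) of source component, rule, and target component):

def is_path_structured(k, rules):
    # Undirected control graph: an edge {Ci, Cj} for every rule (i, r, j) with i != j;
    # loops and multiple edges are ignored.
    edges = {frozenset((i, j)) for (i, _rule, j) in rules if i != j}
    # Path-structured means the edge set is exactly {C1-C2, C2-C3, ..., C(k-1)-Ck}.
    return edges == {frozenset((i, i + 1)) for i in range(1, k)}

# Example: rules routing 1->2, 2->3, 3->2, 2->1 (plus a loop on C1) form a path on three components.
assert is_path_structured(3, [(1, "r1.1", 2), (2, "r2.1", 3), (3, "r3.1", 2), (2, "r2.2", 1), (1, "h1.1", 1)])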

Let us mention one technicality with our definition: a GCID system with three components C1, C2, C3 and (un)directed arcs between C1 and C2, as well as between C2 and C3, would qualify as path-structured if it is returning with respect to any initial (and at the same time, final) component, while, when starting with C2, the whole computation looks star-like rather than path-like. This observation might motivate further studies on returning path-structured GCID systems with initial component C1. Some, but not all, of our simulations yield path-structured GCID systems following this more restricted notion.

The descriptional complexity of a GCID system is measured by its size \(s=(k;n,i',i'';m,j',j'')\), where the parameters represent resource bounds as given in Table 1. Slightly abusing notation, the language class generated by GCID systems of size s is denoted by \(\mathrm {GCID}(s)\) and the class of languages describable by path-structured GCID systems of size s is denoted by \(\mathrm {GCID}_P(s)\).

Table 1 Size \((k;n,i',i'';m,j',j'')\) of a GCID system
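
For illustration, and with the same ad hoc rule encoding (i, (u, s, v), kind, j) as in the sketch after Definition 2, the size parameters listed in Table 1 could be computed as follows (this helper is ours, not part of the formal definition):

def size(k, rules):
    # Rules encoded as (i, (u, s, v), kind, j) with kind "I" (insertion) or "D" (deletion).
    def bounds(kind):
        ctxs = [(u, s, v) for (_i, (u, s, v), knd, _j) in rules if knd == kind]
        longest = lambda xs: max((len(x) for x in xs), default=0)
        return (longest([s for u, s, v in ctxs]),
                longest([u for u, s, v in ctxs]),
                longest([v for u, s, v in ctxs]))
    n, i1, i2 = bounds("I")   # insertion string / left context / right context
    m, j1, j2 = bounds("D")   # deletion string / left context / right context
    return (k, n, i1, i2, m, j1, j2)

# For instance, size(3, [(1, ("", "f", ""), "I", 2), (2, ("f", "A", "B"), "D", 3)])
# returns (3, 1, 0, 0, 1, 1, 1).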

A first thought might be that the path structure is not restricting the power of GCID systems at all. For instance, assume that we have a rule \((3,(a,x,b)_{I},1)\) (going from C3 directly to C1 without passing through C2) that violates the path structure of an otherwise path-structured GCID system. Adding a new symbol A, one might try to substitute this rule by the three rules \((3,(a,A,b)_I,2)\), \((2,(A,x,b)_I,1)\) and \((1,(a,A,x)_D,1)\); see also the derivation sketched after the following list. However, there are at least two major concerns against this strategy:

  • In all interesting situations considered in this paper, we do not allow both a left and a right context for insertion and deletion strings; in other words, we do not consider the size (n, 1, 1; m, 1, 1) in this paper. Note that ins–del systems of this size (with \(n,m \ge 1\)) already describe \(\mathrm {RE}\).

  • Especially when trying to do the same simulation with fewer contexts, there is a danger that, instead of following the three rules in order (as intended), the derivation might be interrupted by applying other rules. These unintended derivations might be malicious, in the sense of producing terminal words that do not belong to the language of the simulated grammar. Of course, this problem depends on the concrete GCID system, but it shows the difficulties of this approach in general.
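
For concreteness, the intended interplay of the three substitute rules mentioned before this list is the following derivation, which reproduces in three steps (C3 to C2 to C1, then staying in C1) what \((3,(a,x,b)_{I},1)\) achieves in one step:

$$\begin{aligned} (\alpha ab\beta )_3 \Rightarrow (\alpha aAb\beta )_2 \Rightarrow (\alpha aAxb\beta )_1 \Rightarrow (\alpha axb\beta )_1. \end{aligned}$$

The second concern above is precisely that a derivation may deviate from this sequence.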

In fact, the danger of having malicious derivations in certain simulations will be the main concern in the proofs that follow. As such derivations must be ruled out, quite detailed induction proofs are given in the next section, which compiles the main results of this paper.

3 Computational completeness

In this section, we will demonstrate our computational completeness claims by explaining simulations of type-0 grammars in SGNF, as done in previously published constructions. One of the main techniques will be the introduction of specific symbols called markers, associated to each rule that is to be simulated, so that we can argue that a certain sequence of rule applications has to be followed by the simulating GCID system, plus taking care of the positions where the rule applications take place.

Let us look at a simple example first, taken from a more involved construction discussed below. In order to simulate the context-sensitive deletion rule \(f_1:A_1B_1\rightarrow \lambda \) from the given SGNF grammar, Table 2 suggests the rule \(f_11.1:(1,(\lambda ,f_1,\lambda )_I,2)\) in the first component. Here we can also see some conventions that we follow to label the simulating rules: \(f_11.1\) is the first rule of component C1 that is responsible for simulating the original rule \(f_1\). This rule \(f_11.1\) inserts the rule marker \(f_1\) anywhere in the current string, which is then moved to C2. If now \(f_12.2\) is applied (i.e., the second rule of component C2 that is responsible for simulating \(f_1\)), then the rule marker \(f_1\) is deleted, so this operation would simply undo the previous insertion operation. To see progress, \(f_12.1\) should be applied. The corresponding deletion rule \((f_1,A_1,B_1)_D\) deletes some \(A_1\) to the left of some \(B_1\), but notice that after a successful application of this rule, \(f_1\) will be situated left of \(B_1\), which is exactly the situation expected when moving the string into C3; the rule \((f_1,B_1,\lambda )_D\) is only applicable when, in the original string, \(f_1\) was placed immediately to the left of the substring \(A_1B_1\), which has by now been deleted (or, in a sense, replaced by \(f_1\)). We also say that the applications of rules \(f_12.1\) and \(f_13.1\) are guarded by the occurrence of the rule marker \(f_1\). Notice that after successfully deleting \(A_1B_1\) in the simulation, we can now either delete \(f_1\) from the string and move it back to C1 in order to finish the simulation, or simply re-start by applying \(f_12.1\) if this is possible.
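
Putting these rules together (with the target components as indicated by the discussion above: C1 to C2 to C3, then back to C2 and finally to C1), the intended simulation of \(f_1:A_1B_1\rightarrow \lambda \) on a string \(\alpha A_1B_1\beta \) reads:

$$\begin{aligned} (\alpha A_1B_1\beta )_1 \Rightarrow _{f_11.1} (\alpha f_1A_1B_1\beta )_2 \Rightarrow _{f_12.1} (\alpha f_1B_1\beta )_3 \Rightarrow _{f_13.1} (\alpha f_1\beta )_2 \Rightarrow _{f_12.2} (\alpha \beta )_1. \end{aligned}$$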

To simplify the presentation and proofs of our further results, the following observations from [11] are used.

Proposition 1

[11] Let \(k,n,i',i'',m,j',j''\) be non-negative integers. The following statements are true.

  1. \(\mathrm {GCID}_P(k;n,i',i'';m,j',j'') = [\mathrm {GCID}_P(k;n,i'',i';m,j'',j')]^R\);

  2. \(\mathrm {RE}=\mathrm {GCID}_P(k;n,i',i'';m,j',j'')\) iff \(\mathrm {RE}=\mathrm {GCID}_P(k;n,i'',i';m,j'',j')\).

3.1 GCID systems with insertion and deletion length one

In [33], it has been proved that ins–del systems of size (1, 1, 1; 1, 1, 1) characterize \(\mathrm {RE}\). If we desire to have one-sided context for insertion/deletion, then it is proved in [21, 28] that ins–del systems of size (1, 1, 1; 1, 1, 0) or (1, 1, 0; 1, 1, 1) cannot characterize \(\mathrm {RE}\). It is therefore obvious that we need at least 2 components in a graph-controlled ins–del system of sizes (1, 1, 1; 1, 1, 0) and (1, 1, 0; 1, 1, 1) to characterize \(\mathrm {RE}\). In [11], we characterized \(\mathrm {RE}\) by path-structured GCID systems of size \((3;1,1,1;\;1,1,0)\). Also, in [16], it was shown that \(\mathrm {GCID}_P(3;1,2,0;1,1,0)=\mathrm {RE}\) and \(\mathrm {GCID}_P(3;1,1,0;1,2,0)=\mathrm {RE}\). We now complement these results.

Table 2 Path-structured GCID systems of size (3; 1, 1, 0; 1, 1, 1) simulating type-0 grammars G in SGNF

Theorem 1

\(\mathrm {RE} = \mathrm {GCID}_P(3;1,1,0;1,1,1) = \mathrm {GCID}_P(3;1,0,1;1,1,1)\).

Before giving the full proof, we sketch some basic features of the construction, as displayed in Table 2. (i) We make use of a dummy symbol \(\varDelta \) to rectify a control graph structure that would otherwise not be a path: if r1.3 were not available, then r2.3.c could be replaced by \(r1.4: (1,(r',Y_2,\lambda )_{I},3)\), moving directly to C3 and thus leaving the path structure. (Here, the symbols r and \(r'\) serve as markers, specific to the simulation of the rule labeled r.) (ii) We use explicit left-end and right-end markers \(\kappa '\) and \(\kappa \), respectively, to rule out some premature rule applications in C2. (iii) We return to C1 several times when simulating the application of a context-free rule r. This makes the proof that no malicious derivations are possible rather delicate, as many cases have to be considered.

At first glance, the reader might think that the simulation should be straightforward (as the authors themselves initially did, given how many resources are available). However, this is not the case. The problem is that any rule of a component can be applied whenever a string enters that component. Since insertion is only left-context-sensitive, the insertion string can be adjoined any number of times to the right of this context, similar to context-free insertion. This issue is handled by inserting some markers and then inserting \(Y_1\) and \(Y_2\) (from rule \(X\rightarrow Y_1Y_2\)) after the markers. We have to be careful, since a back-and-forth transition may insert many \(Y_1\)’s and/or \(Y_2\)’s after the marker.
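
As a small illustration of this effect (ignoring the movement between components for the moment): an insertion rule of the form \((u,x,\lambda )_I\), if it can be applied repeatedly, produces

$$\begin{aligned} uv \Rightarrow uxv \Rightarrow uxxv \Rightarrow \cdots \Rightarrow ux^nv \end{aligned}$$

for every \(n\ge 1\); it is exactly this kind of repetition that the markers are used to detect and rule out.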

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF as in Def. 1. The rules of P are assumed to be labelled bijectively with labels from the set \([1 \ldots |P|]\). We construct a graph-controlled insertion–deletion system \(\varPi \) as follows such that \(L(\varPi ) = L(G)\): \(\varPi = (3,V,T,\{\kappa 'S\kappa \},H,1,1,R)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \{r,r' :r \in [1 \ldots |P|]\}\cup \{\kappa ',\kappa \}\). The simulation rules for different cases are specified in Table 2, which completes the description of R and V. This table should be read as follows.

  • If G contains a context-free rule of the form \(r=X\rightarrow Y_1Y_2\) (with the special linearity properties derived from G being in SGNF), then r is simulated by the rules whose labels are prefixed with r.

  • Similarly, the two genuinely context-sensitive rules \(A_1B_1\rightarrow \lambda \) and \(A_2B_2\rightarrow \lambda \) are simulated as described by the GCID rules prefixed with f.

  • h1.1 takes care of simulating the only context-free deletion rule, and \(\kappa 1.1\) and \(\kappa '1.1\) take care of the boundary markers.

Clearly, \(\varPi \) has size (3; 1, 1, 0; 1, 1, 1). With the rules of Table 2, we prove \(L(G)\subseteq L(\varPi )\) by showing how the different types of rules are simulated. Let us look into the context-free rules first. The simulation of the deletion rule h is obvious and hence omitted. Applying some rule \(r:X\rightarrow Y_1Y_2\), with \(X\in N'\), to \(w=\alpha X\beta \), where \(\alpha ,\beta \in (N''\cup T)^*\), yields \(w'=\alpha Y_1Y_2\beta \) in G. We assume that \(\alpha 'c'=\kappa '\alpha \) and \(c\beta '=\beta \kappa \) where \(\alpha '\kappa ,\kappa 'c\kappa ,\kappa 'c'\kappa ,\kappa '\beta ' \in \{\kappa '\}(N''\cup T)^*\{\kappa \}\). The stated equality is important in the following sense: if \(\kappa '\alpha = \kappa '\alpha _1\alpha _2 \dots \alpha _n\) then we set \(\alpha '=\kappa '\alpha _1\alpha _2 \dots \alpha _{n-1}\) and \(c'=\alpha _n\). If \(\alpha =\lambda \), then \(c'=\kappa '\). Hence \(c' \in N'' \cup T \cup \{\kappa '\}\). Similarly, when \(\beta \kappa = \beta _1\beta _2 \dots \beta _m\kappa \), we set \(\beta '=\beta _2 \beta _3\dots \beta _m\kappa \) and \(c=\beta _1\). If \(\beta =\lambda \), then \(c=\kappa \). So \(c \in N'' \cup T \cup \{\kappa \}\). The main aim of assuming the equality is to single out the last symbol of \(\kappa '\alpha \) and the first symbol of \(\beta \kappa \) and use the singled out symbol as left or right context in the rules r2.3.c or \(r2.4.c'\); see Table 2.
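
For instance, for the (hypothetical) phase-I string \(w=A_1XbB_1\), i.e., \(\alpha =A_1\) and \(\beta =bB_1\), this decomposition yields \(\alpha '=\kappa '\), \(c'=A_1\), \(c=b\) and \(\beta '=B_1\kappa \); the symbols \(c'=A_1\) and \(c=b\) are then the contexts used in the rules \(r2.4.c'\) and r2.3.c, respectively.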

In \(\varPi \), we can find the following simulation:

$$\begin{aligned} \begin{array}{llll} (\kappa ' w\kappa )_1 &{} \Rightarrow _{r1.1} (\alpha 'c' Xrc\beta ' )_2 &{}\Rightarrow _{r2.1} (\alpha 'c'rc\beta ')_1 &{} \Rightarrow _{r1.2} (\alpha 'c' rr'c\beta ')_2\\ &{}\Rightarrow _{r2.2} (\alpha 'c' r'c\beta ')_1 &{}\Rightarrow _{r1.3} (\alpha 'c' r'\varDelta c\beta ')_1 &{}\Rightarrow _{r1.4} (\alpha 'c' r'Y_2\varDelta c\beta ')_2\\ &{}\Rightarrow _{r2.3.c} (\alpha 'c' r'Y_2 c\beta ')_3 &{}\Rightarrow _{r3.1} (\alpha 'c' r'Y_1Y_2 c\beta ')_2 &{}\Rightarrow _{r2.4.c'} (\alpha 'c' Y_1Y_2 c\beta ')_1\\ &{}=\qquad (\kappa ' \alpha Y_1Y_2\beta \kappa )_1 &{}=\qquad (\kappa ' w'\kappa )_1\,. \end{array} \end{aligned}$$

For the non-context-free case, the simulation of \(f:AB\rightarrow \lambda \) is straightforward; hence, details are omitted. By induction, this proves that whenever \(S\Rightarrow ^*w\) in G, then there is a derivation \((\kappa ' S\kappa )_1\Rightarrow '_*(\kappa ' w\kappa )_1\) in \(\varPi \), and finally \((\kappa ' w\kappa )_1\Rightarrow ' (w)_1\) is possible.

We now show \(L(\varPi ) \subseteq L(G)\); this is crucial, since it proves that \(\varPi \) not only produces the intended strings but also does not produce any unintended ones.

Conversely, consider a configuration \((w)_1\), with \((\kappa ' S\kappa )_1\Rightarrow '_*(w)_1\). We assume now that w starts with \(\kappa '\) and ends with \(\kappa \), and that these are the only occurrences of these special letters in w. As no rules ever introduce these letters, there cannot be any other occurrences of these letters in w. If w contains no occurrence of \(\kappa \) or \(\kappa '\), still further derivations might be possible, as the only purpose of these two special letters is to allow for tests as in rules r2.3 and r2.4. The only danger is that such derivations might get stuck when starting with strings w that do not contain \(\kappa \) or \(\kappa '\), but this causes no problems, as there is always the possibility to circumvent this blocking by keeping \(\kappa \) and \(\kappa '\) in until the very end.

We now discuss five situations for w and prove in each case that, whenever \((w)_1\Rightarrow '(w')_1\), then \(w'\) satisfies one of these five situations, or from \((w')_1\) no final configuration can be reached. Inductively, the claim follows.

As \(S\in N'\), the base case \(\kappa 'S\kappa \) is covered in case (iii) presented below.

(i) Assume that w contains one occurrence of \(r'\) (the primed marker of some context-free rule r), but no occurrence of unprimed markers of context-free rules, and no occurrence of any nonterminal from \(N'\), neither an occurrence of \(\varDelta \). Hence, \(w=\kappa '\alpha r'\beta \kappa \) for appropriate strings \(\alpha ,\beta \in (N''\cup T)^*\).

Then, the rules (i.a) r1.3, (i.b) r1.4, as well as the simulation initiation rules like (i.c) f1.1 are applicable. Let us discuss these possibilities now.

Subcase (i.c): If f1.1 is applied, then, say, f is introduced to the right of some occurrence of A. In C2, one can then try to apply (i.c.1) f2.1, (i.c.2) f2.2, or (i.c.3) \(r2.4.c'\) for an appropriate \(c'\). However, as we are still simulating phase I of G, B cannot be to the right of A, so that Subcase (i.c.1) cannot occur.

Subcase (i.c.2) simply undoes the effect of previously applying f1.1, so that we can ignore its discussion. In Subcase (i.c.3), we are back in C1 with a string that contains no symbols from \(N'\), nor any variants of context-free rule markers, nor any \(\varDelta \), but one non-context-free rule marker. We will discuss this in Case (v) below, proving that such a derivation cannot terminate.

Subcase (i.b): If we apply r1.4 to w immediately, we are trapped in C2.

Subcase (i.a): We first apply r1.3 once. Now, we are in a very similar situation as before, but one \(\varDelta \) is added to the right of \(r'\). This means that continuing with f1.1 will get stuck again in C2. In order to make progress, we should finally apply r1.4. Now, we are in the configuration \((\kappa '\alpha r'Y_2\varDelta ^n\beta \kappa )_2\) for some \(n\ge 1\). As \(Y_1\ne Y_2\), \(r2.4.c'\) is not applicable for any \(c'\). The only remaining option is r2.3.c, which can only be applied if \(n=1\), i.e., if we applied r1.3 exactly once before; otherwise, the derivation is stuck in C2. Hence, \((\kappa '\alpha r'Y_2\varDelta \beta \kappa )_2\Rightarrow (\kappa '\alpha r'Y_2\beta \kappa )_3\Rightarrow (\kappa '\alpha r'Y_1Y_2\beta \kappa )_2\Rightarrow (\kappa '\alpha Y_1Y_2\beta \kappa )_1\) is enforced. This corresponds to the intended derivation: the assumed occurrence of \(r'\) in the string was replaced by \(Y_1Y_2\), which brings us to the situation of Case (iii).

(ii) Assume that w contains one occurrence of r (the unprimed marker of some context-free rule r), but no occurrence of primed markers of context-free rules, and no occurrence of any nonterminal from \(N'\), neither an occurrence of \(\varDelta \). Hence, \(w=\kappa '\alpha r\beta \kappa \) for appropriate strings \(\alpha ,\beta \in (N''\cup T)^*\).

Similarly to what was discussed in Case (i), trying to start a simulation of some non-context-free rule gets stuck in C2, in particular, as we are simulating phase I of G and there is no nonterminal from \(N'\) in the current string. Hence, we are now forced to apply r1.2. This means that in C2, we have to apply r2.2, leading us to \((w')_1\) with \(w'=\kappa '\alpha r'\beta \kappa \), a situation already discussed in Case (i).

(iii) Assume that w contains one occurrence \(X\in N'\), but no occurrence of unprimed or primed markers of context-free rules, and no occurrence of \(\varDelta \). Hence, \(w=\kappa '\alpha X\beta \kappa \) for appropriate strings \(\alpha ,\beta \in (N''\cup T)^*\).

As we are still simulating phase I of G, we are now forced to apply r1.1 or simulate the context-free deletion rule (whose discussion is trivial and hence omitted; the important point is that this switches to phase II of the simulation of G). This means that in C2, we have to apply r2.1, leading us to \((w')_1\) with \(w'=\kappa '\alpha r\beta \kappa \) for some context-free rule \(r:X\rightarrow Y_1Y_2\), a situation already discussed in Case (ii).

(iv) Assume that \(w\in \{\kappa '\} (N''\cup T)^*\{\kappa \}\). Now, it is straightforward to analyze that we have to follow the simulation of one of the non-context-free deletion rules, or finally apply the rule deleting the special symbols \(\kappa , \kappa '\).

(v) Assume that w contains no primed or unprimed markers of context-free rules, nor a symbol from \(N'\), nor any \(\varDelta \) but contains a non-context-free rule marker. This means we have to apply some rule f1.1, but although this might successfully simulate a non-context-free deletion rule, it will bring us back to C1 with a non-context-free rule marker in the string. Hence, we are back in Case (v), so that this type of derivation can never terminate.

The second claim follows by Proposition 1. The underlying graph of the simulation is shown in Fig. 1a. The corresponding undirected graph is a path and hence the presented GCID system is path-structured. \(\square \)

Fig. 1 Control graphs underlying the GCID systems (characterizing RE) in this paper. a Control graph of Theorems 1, 5, 6, b control graph of Theorems 2, 3

In [13], it was shown that GCID systems of sizes \((4;1,1,0;\;1,1,0)\) and \((4;1,1,0;\;1,0,1)\) describe \(\mathrm {RE}\), with the underlying control graph not being a path. In [11], the number of components was reduced from 4 to 3, however, with the underlying graph still not being a path. In the next two theorems we characterize \(\mathrm {RE}\) by path-structured GCID systems of sizes (4; 1, 1, 0; 1, 1, 0) and (4; 1, 1, 0; 1, 0, 1). The former result also complements an earlier result of [16], which stated that \(\mathrm {GCID}_P(3;1,2,0;1,1,0) =\mathrm {GCID}_P(3;1,1,0;1,2,0) = \mathrm {RE}\). We trade off the number of components against the length of the left context of the insertion/deletion.

Theorem 2

\(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;1,1,0) = \mathrm {GCID}_P(4;1,0,1;1,0,1)\).

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF as in Def. 1. The rules of P are assumed to be labelled bijectively with labels from the set \([1 \ldots |P|]\). We construct a graph-controlled ins–del system \(\varPi \) as follows such that \(L(\varPi ) = L(G)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \{p,p',p'',p''' :p \in [1 \ldots |P|]\}\cup \{\kappa \}\). The set of rules R (of \(\varPi \)) is defined as shown in Table 3. The four columns of Table 3 correspond to the four components of \(\varPi \). The rows correspond to the simulation of \(r:X\rightarrow Y_1Y_2\), \(f:AB\rightarrow \lambda \) and of the context-free deletion rule \(h: S' \rightarrow \lambda \). The last row deletes the left-end marker \(\kappa \) introduced in the axiom.

Clearly, \(\varPi \) has size (4; 1, 1, 0; 1, 1, 0). We now prove that \(L(G) \subseteq L(\varPi )\). To this end, we show that if \(w\Rightarrow w'\) in G, with \(w,w'\in (N\cup T)^*\), then \((\kappa w)_1\Rightarrow ' (\kappa w')_1\) according to \(\varPi \). From this fact, the claim follows by a simple induction argument, because finally the left-end marker \(\kappa \) can be removed. As the claim is evident for rule h, we only need to discuss \(w\Rightarrow w'\) due to using a context-free rule (Case CF) or due to using a non-context-free rule (Case \(\overline{\mathrm {CF}}\)).

Table 3 GCID rules of size (4; 1, 1, 0; 1, 1, 0) with axiom \(\kappa S\) and \(c\in N''\cup T\cup \{\kappa \}\)

Case CF: The intended simulation works as follows:

$$\begin{aligned} \begin{array}{lllllll} (\kappa \alpha X\beta )_1 &{}\Rightarrow _{r1.1}&{} (\kappa \alpha rX\beta )_2 &{}\Rightarrow _{r2.1}&{} (\kappa \alpha rXr'\beta )_3 &{}\Rightarrow _{r3.1}&{} (\kappa \alpha rr'\beta )_4 \\ &{}\Rightarrow _{r4.1}&{} (\kappa \alpha rr'r''\beta )_3 &{}\Rightarrow _{r3.2}&{} (\kappa \alpha rr'r''r'''\beta )_2 &{}\Rightarrow _{r2.2}&{} (\kappa \alpha rr''r'''\beta )_2\\ &{}\Rightarrow _{r2.3}&{} (\kappa \alpha rr'''\beta )_3 &{}\Rightarrow _{r3.3}&{} (\kappa \alpha rr'''Y_2\beta )_4 &{}\Rightarrow _{r4.2}&{} (\kappa \alpha r'''Y_2\beta )_3\\ &{}\Rightarrow _{r3.4}&{} (\kappa \alpha r'''Y_1Y_2\beta )_2 &{}\Rightarrow _{r2.4.c}&{} (\kappa \alpha Y_1Y_2\beta )_1. \end{array} \end{aligned}$$

Here, c is the last symbol of \(\kappa \alpha \), possibly \(\kappa \).

Case \(\overline{\mathrm {CF}}\): Let us consider \(f:AB\rightarrow \lambda \). This means that \(w=\alpha AB\beta \) and \(w'=\alpha \beta \) for some \(\alpha ,\beta \in (N\cup T)^*\). Within \(\varPi \), this can be simulated as follows.

$$\begin{aligned} \begin{array}{lllllll} (\kappa w)_1 &{}=&{} (\kappa \alpha AB\beta )_1 &{}\Rightarrow _{f1.1}&{} (\kappa \alpha fAB\beta )_2 &{}\Rightarrow _{f2.1}&{} (\kappa \alpha fAf'B\beta )_3 \\ &{}\Rightarrow _{f3.1}&{} (\kappa \alpha fAf'\beta )_4 &{}\Rightarrow _{f4.1}&{} (\kappa \alpha ff'\beta )_3 &{}\Rightarrow _{f3.2}&{} (\kappa \alpha f \beta )_2\\ &{}\Rightarrow _{f2.2}&{} (\kappa w')_1. \end{array} \end{aligned}$$

The converse inclusion \(L(\varPi )\subseteq L(G)\) follows by an inductive argument as in the previous theorem. Let us consider a configuration \((w)_1\). Although \(\kappa 1.1\) can be applied at any time in such a configuration, the only purpose of the new symbol \(\kappa \) is to make sure that there is always a symbol to the left of the current deletion position; hence, we can assume from now on that w starts with \(\kappa \), and, as no rule will ever introduce \(\kappa \), we can also assume that this is the only place where \(\kappa \) occurs in w. Hence, we will not discuss \(\kappa 1.1\) any further from now on.

We will now discuss \((w)_1\Rightarrow '(w')_1\) for different cases for w, proving that, whenever the derivation may lead to some final configuration, then \(w'\) is falling into one of the cases that we discussed, which proves the validity of the overall discussion by induction.

(i) If w contains no occurrence of any symbol from \(N'\) and no (variants of) rule markers, i.e., \(w\in \{\kappa \}(N''\cup T)^*\), then applying r1.1 to w for any applicable context-free rule r will see no continuation of the derivation in C2. Therefore, we have to use rule f1.1. Due to the fact that all rules are guarded in C2, \((w)_1\Rightarrow _{f1.1}(w_1)_2\Rightarrow (w_2)_s\) either means that \(s=1\) and \(w_2=w\) (by applying f2.2), which means that we observe no progress, or we used f2.1 so that \(s=3\). In the latter case, we can continue with f3.1 only in C3, and in C4, we have to pick f4.1. By doing so, we have verified the following choices made when previously inserting f and \(f'\) into w: (a) f has been inserted to the left of an occurrence of A; (b) \(f'\) has been inserted in between an occurrence of A and an occurrence of B. Upon returning to C3, we are in a configuration \((\hat{w})_3\) where, in comparison to w, \(\hat{w}\) is obtained by applying the two rewriting rules \(A\rightarrow f\) and \(B\rightarrow f'\). We could now either apply f3.1 (i.b.1) or f3.2 (i.b.2). In the former case (i.b.1), we actually perform a loop that, generalizing what we said above, leads to deleting a subword \(A^n\) to the right of f, as well as a subword \(B^n\) to the right of \(f'\), upon re-entering C3. If now f3.2 is applied, then this verifies that in fact a subword of the form \(A^nB^n\) for some \(n\ge 1\) was deleted from w. Notice that this also captures Case (i.b.2). Now, we are back in C2 with a string that satisfies the conditions of string \(w_1\) discussed at the beginning of Case (i), so that we either return to C1 with a string \(w'\) that was obtained from w by deleting the subword \(A^nB^n\), or we restart the derivation as discussed with \((w_2)_3\) above.

(ii) and (iii): If w contains exactly one occurrence of a symbol from \(N'\), i.e., \(w\in \{\kappa \}(N''\cup T)^*N'(N''\cup T)^*\), then again we have the opportunity to apply some rule r1.1, belonging to some context-free rule r (Case (ii)), or to apply some rule designed to start the simulation of a non-context-free rule, say, of f (Case (iii)).

In Case (ii), it might be that the context-free deletion rule is directly simulated, i.e., \(S'\in N'\) is deleted. In that case, \((w)_1\Rightarrow (w')_1\) directly corresponds to some rule application of G, i.e., \(w\Rightarrow w'\) in G is true. Otherwise, \((w)_1\Rightarrow (w_1)_2\), and a rule marker r is randomly inserted into w, i.e., \(w_1\in r\,\sqcup \!\sqcup \,w\). This potential source of nondeterministic development is clarified within C2 and C3. As all rules are guarded, we can continue only if some \(X\in N'\) occurs in w that is the left-hand side of some context-free rule \(\hat{r}\). So, for some \(\alpha \in \{\kappa \}(N''\cup T)^*\) and \(\beta \in (N''\cup T)^*\), \(w=\alpha X\beta \), and if \((w_1)_2\Rightarrow (w_2)_3\), then \(w_2\in r\,\sqcup \!\sqcup \,\alpha X\hat{r}'\beta \). Again, checking all rules in C3 reveals that only r3.1 can be applicable, i.e., we now know that \(w_2=\alpha rX\hat{r}'\beta \), and if we apply r3.1, then the resultant string \(w_3=\alpha r\hat{r}'\beta \) moves to C4. If we now apply r4.2, then the derivation gets stuck in C3, as no rule in that component can deal with a primed marker of a context-free rule. Hence, we have to apply \(\hat{r}4.1\), so that \(w_4=\alpha r\hat{r}'\hat{r}''\beta \) moves back to C3. Only one rule is applicable here, the one that can handle double-primed markers, so that \((w_4)_3\Rightarrow (w_5)_2\) means we have applied \(\hat{r}3.2\). Hence, \(w_5=\alpha r\hat{r}'\hat{r}''\hat{r}'''\beta \). In particular, as \(\hat{r}''\) is immediately to the left of \(\hat{r}'''\), rule \(\hat{r}2.4.c\) is not applicable for any c. So, the only applicable rule is r2.2, assuming that \(\hat{r}=r\). (The case \(\hat{r}\ne r\) has no continuation.) Hence, \(w_5=\alpha rr'r''r'''\beta \), and \((w_5)_2\Rightarrow (w_6)_2\Rightarrow (w_7)_3\) with \(w_6=\alpha rr''r'''\beta \) and \(w_7=\alpha rr'''\beta \) is the only possible continuation. Now, assume that \(r:X\rightarrow Y_1Y_2\) is the context-free rule whose markers occur in \(w_7\). Only r3.3 is applicable now, followed by r4.2 and r3.4. This means we observe \((w_7)_3\Rightarrow (w_8)_4\Rightarrow (w_9)_3\Rightarrow (w_{10})_2\), with \(w_{10}=\alpha r'''Y_1Y_2\beta \). Thus, \((w_{10})_2\Rightarrow (w')_1\) such that \(w\Rightarrow w'\) in G by applying r.

In Case (iii), we can follow a similar reasoning as in Case (i), which means that we would faithfully simulate applications of context-sensitive deletion rules, except for one possible deviation: upon reaching C2, we might now apply r2.1 for some suitable context-free rule r. This can happen each time we arrive at C2, so we have to separately discuss these possibilities. The first such opportunity is given if we start with, say, f1.1 and if we apply r2.1 next for some context-free rule \(r:X\rightarrow Y_1Y_2\), where \(X\in N'\) is occurring in w. This results in \((w)_1\Rightarrow (w_1)_2\Rightarrow (w_2)_3\), where \(w_2\in f\,\sqcup \!\sqcup \,\alpha Xr'\beta \), assuming \(w=\alpha X\beta \). It is easy to check that this derivation gets stuck in C3. The second possibility to deviate from (i) is when we return to C2 later. As discussed in Case (i), the effect of a derivation using f-simulation rules only, ending up in C2 and starting in configuration \((w)_1\), amounts to replacing some subword \(A^nB^n\), for any \(n\ge 0\), by the rule marker f. Notice that the case \(n=0\) corresponds to a random insertion of f, a case that was already discussed above. This derivation gets stuck in C3 for each n when trying to apply r2.1 instead of f2.1 or f2.2.

Hence, in each of the cases (i), (ii), (iii), whenever \((w)_1\Rightarrow (w')_1\), then \(w\Rightarrow ^* w'\) in G. Also, if we start with a string w satisfying one of the conditions described in these cases, then \(w'\) satisfies one of these cases, as well. As Case (ii) corresponds to the start situation, we can conclude by induction that, whenever \((\kappa S)_1\Rightarrow _*(v)_1\) for some \(v\in T^*\), then \(S\Rightarrow ^* v\) in G. The second claim follows by Proposition 1. The underlying graph of the simulation is shown in Fig. 1b. \(\square \)

Theorem 3

\(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;1,0,1) = \mathrm {GCID}_P(4;1,0,1;1,1,0)\).

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF. The rules of P are assumed to be labelled bijectively with labels from the set \([1 \ldots |P|]\). We construct a graph-controlled ins–del system \(\varPi \) such that \(L(\varPi ) = L(G)\), with \(\varPi = (4,V,T,\{S\},H,1,1,R)\). The alphabet V of \(\varPi \) satisfies \(V \subset N \cup T \cup \{p,p',p'',p''' :p \in [1 \ldots |P|]\}\). The set of rules R (of \(\varPi \)) is defined as shown in Table 4. \(\varPi \) has the claimed size. The intended simulation of a context-free rule is as follows.

$$\begin{aligned} \begin{array}{lllllll} (\alpha X\beta )_1&{}\Rightarrow _{r1.1}&{} (\alpha Xr\beta )_2 &{}\Rightarrow _{r2.1}&{} (\alpha r\beta )_1 &{}\Rightarrow _{r1.2}&{} (\alpha rr'\beta )_2 \\ &{}\Rightarrow _{r2.2}&{} (\alpha r'\beta )_1 &{}\Rightarrow _{r1.3}&{} (\alpha r'r''\beta )_2 &{}\Rightarrow _{r2.3}&{} (\alpha r' r''Y_2\beta )_3\\ &{}\Rightarrow _{r3.1}&{} (\alpha r' Y_2 \beta )_4 &{}\Rightarrow _{r4.1}&{} (\alpha r'r'''Y_2\beta )_3 &{}\Rightarrow _{r3.2}&{} (\alpha r'r'''Y_1Y_2\beta )_2\\ &{}\Rightarrow _{r2.4} &{} (\alpha r'Y_1Y_2\beta )_2 &{}\Rightarrow _{r2.5}&{} (\alpha Y_1Y_2\beta )_1. \end{array} \end{aligned}$$

The intended simulation of a non-context-free rule is as follows.

$$\begin{aligned} \begin{array}{lllllll} (\alpha AB\beta )_1&{}\Rightarrow _{f1.1}&{} (\alpha ABf\beta )_2 &{}\Rightarrow _{f2.1}&{} (\alpha Af'Bf\beta )_3 &{}\Rightarrow _{f3.1}&{} (\alpha f'Bf\beta )_4\\ &{}\Rightarrow _{f4.1}&{} (\alpha f'f\beta )_3 &{}\Rightarrow _{f3.2}&{} (\alpha f\beta )_2 &{}\Rightarrow _{f2.2}&{} (\alpha \beta )_1. \end{array} \end{aligned}$$

This shows that \(L(G) \subseteq L(\varPi )\). The main complication for the correctness proof is the fact that we may return to C1 with strings containing rule markers. This necessitates a detailed discussion of four different situations for w when considering \((S)_1\Rightarrow '_*(w)_1\Rightarrow '(w')_1\) according to \(\varPi \).

Consider some derivation \((S)_1 \Rightarrow '_{*} (w)_1\) in \(\varPi \). Clearly, we can decompose this derivation into \((w_i)_1\Rightarrow ' (w_{i+1})_1\), such that \(S=w_0\), \(w=w_m\), and the chosen derivation moves exactly the strings \(w_0\), ..., \(w_m\) into component C1. Note that markers are attached when applying rules in C1, except for the (only) context-free deletion rule. The rule markers introduced for the (non-)context-free rules prevent the rules from interfering with one another. All rules in each component are guarded by markers, and hence a rule can be applied only if the current string carries the appropriate marker. Let us make this more precise. Consider a configuration \((w)_1\) such that \((S)_1 \Rightarrow '_{*} (w)_1\) in \(\varPi \). Again, we will discuss several cases for a derivation \((w)_1\Rightarrow '(w')_1\) showing that \(w'\) falls under one of these cases whenever the whole discussed derivation may lead to a final configuration. An easy induction argument then shows that this case distinction is complete.

Table 4 GCID rules of size (4; 1, 1, 0; 1, 0, 1) simulating a type-0 grammar in SGNF

(i) If w contains no symbols from \(N'\) nor unprimed or primed markers of context-free rules, then only f1.1 is possibly applicable. As all rules in C2 and C3 are guarded, a successful complete simulation of \(f:AB\rightarrow \lambda \) is enforced. Observe that also \(w'\) contains no symbols from \(N'\) nor unprimed or primed markers of context-free rules, but if w had contained an occurrence of a marker of non-context-free rules, then \(w'\) would also contain such an occurrence.

(ii) If w contains one occurrence of a primed marker of a context-free rule r, but no symbols from \(N'\), nor unprimed markers of context-free rules, then r1.3 or f1.1 are possibly applicable. On applying r1.3, we arrive at \((w_1)_2\) with \(r'r''\) as a substring of \(w_1\), i.e., \(w_1=\alpha r'r''\beta \) for some \(\alpha ,\beta \in (N''\cup T)^*\). Observe that r2.5 is not applicable to this configuration. By the use of rule markers, the rules r2.3, r3.1, r4.1, r3.2, r2.4, r2.5 must be applied in this order, leading to \((w)_1\Rightarrow '(w')_1\) such that \(w'\) is obtained from w by applying the context-free rule r as intended. Alternatively, we could start \((w)_1\Rightarrow (w_1)_2\) by applying, say, f1.1. If we now apply f2.1, there is no alternative but applying f3.1 in C3, which would mean that we end up (on applying f2.1 and f3.1) simulating a rule application of \(AB\rightarrow \lambda \). As the resulting string \(w_1'\) enjoys the same properties as \(w_1\) in C2, let us explore further alternatives for \(w_1\), containing both an occurrence of \(r'\) and an occurrence of, say, f. In fact, now r2.5 could be applicable, which would yield \((w)_1\Rightarrow ' (w')_1\), where \(w'\) now contains no occurrences of symbols from \(N'\) nor unprimed or primed markers of context-free rules, but one occurrence of a marker of non-context-free rules. Inductively, to such configurations, the reasoning of Case (i) would apply, which means that such derivations can never lead to terminal strings, as whenever \((w')_1\Rightarrow '_*(w'')_1\), then \(w''\) also contains an occurrence of a marker of non-context-free rules.

(iii) If w contains one occurrence of an unprimed marker of a context-free rule r, but no symbols from \(N'\) nor primed markers of context-free rules, then r1.2 or f1.1 are possibly applicable. On applying, say, f1.1, we are now enforced to (correctly) simulate applications of the rule \(f:AB\rightarrow \lambda \) as described in Case (i), as in particular no rules in C2 or C3 can work with the marker r without \(r'\) or a fitting nonterminal from \(N'\). As such simulations lead to configurations in C2 that are similar to the ones under current study, we need to explore only \((w)_1\Rightarrow (w_1)_2\) due to applying r1.2. As \(w_1\) contains no occurrence of a nonterminal from \(N'\), r2.2 must be applied, leading to \((w')_1\), where \(w'\) is obtained from w by replacing the only occurrence of the marker r by the marker \(r'\). Additionally, \(w'\) inherits from w the properties of not containing further markers of context-free rules nor symbols from \(N'\). Hence, a possible further discussion would continue as in Case (ii).

(iv) If w contains no occurrences of primed or unprimed rule markers but one symbol \(X\in N'\), then r1.1 or f1.1 are possibly applicable, for some context-free rule r including the context-free deletion rule. Hence, \(w=\alpha X\beta \) with \(\alpha ,\beta \in (N''\cup T)^*\). If \((w)_1\Rightarrow (w')_1\) due to applying h1.1, then \(w'=\alpha \beta \), and we have to continue the derivation as discussed in Case (i). Otherwise, we might apply f1.1. In that case, we have to follow a (correct) simulation of some non-context-free rule f (possibly a number of times) in the arising derivation \((w)_1\Rightarrow (w')_1\), as there are no rules in C2 or C3 that can deal with nonterminals from \(N'\) without referring to a fitting context-free rule marker (or double- or triple-primed variants thereof). Therefore, we are left with discussing the application of a rule r1.1, which starts the simulation of a context-free rule r with left-hand side X. In that case, \((w)_1\Rightarrow (w_1)_2\) with \(w_1=\alpha Xr \beta \). Now, due to the use of rule markers, the application of r2.1 is enforced, leading to the configuration \((w')_1\) with \(w'=\alpha r\beta \), i.e., \(w'\) was obtained from w by replacing X by r. Now, the discussion continues as in Case (iii).

The observations above on the non-interference of context-free and non-context-free rule simulations show that no malicious derivations are possible. Hence, \(L(\varPi ) \subseteq L(G)\).

The second claim follows by Proposition 1. The underlying graph of the simulation is shown in Fig. 1b; obviously, it is a path. \(\square \)

3.2 GCID systems with insertion length two

In [13], it is shown that \(\mathrm {GCID}(4;2,0,0;1,1,0)=\mathrm {RE}\) with the underlying control graph not even being a tree. In this subsection, we show that, even if we restrict the control graph to be a path, \(\mathrm {GCID}_P(4;2,0,0;1,1,0)\) systems will still characterize \(\mathrm {RE}\). Further, if we allow a context (either left or right) for insertion, then we can still describe \(\mathrm {RE}\) while decreasing the number of components from 4 to 3, yet obtaining path-structured GCID systems.

Notice that we do need some context somewhere, as it is known that a language as simple as \(\{a^nb\mid n\ge 1\}\) cannot be described by any \(\mathrm {GCID}_P\) system of size (k; 2, 0, 0; 2, 0, 0) for any k; see [22, Thm. 17].

Theorem 4

\(\mathrm {RE} = \mathrm {GCID}_P(4;2,0,0;1,1,0) = \mathrm {GCID}_P(4;2,0,0;1,0,1)\).

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF. The rules of P are assumed to be labelled bijectively with labels from the set \([1 \ldots |P|]\). We construct a graph-controlled insertion–deletion system \(\varPi \) as follows such that \(L(\varPi ) = L(G)\): \(\varPi = (4,V,T,\{S\},H,1,1,R)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \{r,r',r'',r''' :r \in [1 \ldots |P|]\}\). The set of rules R (of \(\varPi \)) is given in Table 5.

Table 5 GCID rules of size (4; 2, 0, 0; 1, 1, 0) simulating a type-0 grammar in SGNF

The intended derivation of a context-free rule \(r: X \rightarrow Y_1Y_2\) is as follows.

$$\begin{aligned} \begin{array}{lllllll} (\alpha X\beta )_1 &{}\Rightarrow _{r1.1}&{} (\alpha Xrr'\beta )_2 &{}\Rightarrow _{r2.1}&{} (\alpha Y_1r''Xrr'\beta )_3 &{}\Rightarrow _{r3.1}&{} (\alpha Y_1r''rr'\beta )_4\\ &{}\Rightarrow _{r4.1}&{} (\alpha Y_1r''r'\beta )_3 &{}\Rightarrow _{r3.2}&{} (\alpha Y_1r''\beta )_2 &{}\Rightarrow _{r2.2}&{} (\alpha Y_1Y_2r'''r''\beta )_3 \\ &{}\Rightarrow _{r3.3}&{} (\alpha Y_1Y_2r'''\beta )_2 &{}\Rightarrow _{r2.3}&{} (\alpha Y_1Y_2\beta )_1. \end{array} \end{aligned}$$

The intended derivation of a non-context-free deletion rule is as follows.

$$\begin{aligned} \begin{array}{lllllll} (\alpha AB\beta )_1&{}\Rightarrow _{f1.1}&{} (\alpha fAB\beta )_2 &{}\Rightarrow _{f2.1}&{} (\alpha ff'AB\beta )_3 &{}\Rightarrow _{f3.1}&{} (\alpha ff'B\beta )_4\\ {} &{} \Rightarrow _{f4.1}&{} (\alpha ff'\beta )_3 &{}\Rightarrow _{f3.2}&{} (\alpha f\beta )_2 &{}\Rightarrow _{f2.2}&{} (\alpha \beta )_1. \end{array} \end{aligned}$$

This shows that \(L(G) \subseteq L(\varPi )\). Conversely, consider some derivation \((S)_1 \Rightarrow '_{*} (w)_1\Rightarrow ' (w')_1\) in \(\varPi \). By induction, assume that w contains at most one symbol from \(N'\) and no (unprimed or primed variants of) rule markers. In the following, let \(r:X\rightarrow Y_1Y_2\), \(s:X'\rightarrow Y_1'Y_2'\) and \(t:X''\rightarrow Y_1''Y_2''\) be three (not necessarily distinct) context-free rules, with the corresponding markers r, s and t, while \(f:AB\rightarrow \lambda \) is a non-context-free deletion rule. Notice (*) that whenever we discuss a configuration \((u)_2\), we could (in principle) always paste in a (correct) simulation of some non-context-free deletion rule f, leading to \((u')_2\) with \(u\Rightarrow _{f}u'\).

First scenario: \((w)_1\Rightarrow _{r1.1}(w_1)_2\Rightarrow _{s2.1}(w_2)_3\). Hence, first \(rr'\) and then \(Y_1's''\) have been introduced into w to obtain \(w_2\). The only rule in C3 that is potentially applicable is s3.1, which deletes the only occurrence from \(N'\) from \(w_2\) (which hence also checks that there is exactly one such occurrence in w, as otherwise no rule in C3 is applicable, because w contains no primed rule markers), so that the configuration \((w_3)_4\) can be described as being obtained from \(w=\alpha X'\beta \) by inserting \(rr'\) into \(\alpha Y_1's''\beta \). Now in C4, it is checked whether \(r=s\), as only in that case, and if the previous insertions were carried out in the right places, can we arrive at \((w_4)_3\) with \(w_4=\alpha Y_1r''r'\beta \). As \(r''r'\) is a substring of \(w_4\), the only applicable rule in C3 would be r3.2. This enforces the subsequent configuration \((w_5)_2\) with \(w_5=\alpha Y_1r''\beta \).

According to (*), we could now digress and paste in several (correct) simulations of applications of non-context-free deletion rules, which does not interfere with the flow of the argument. In particular, observe that if \((w_5)_2\Rightarrow _{f2.1}(x)_3\), then \((x)_3\Rightarrow _{f3.1}(y)_4\) is enforced, ignoring the back-loop \((x)_3\Rightarrow _{f3.2}(w_5)_2\).

Notice that we also might try to start some new simulation of a context-free rule t by \((w_5)_2\Rightarrow _{t2.1}(x)_3\). In order to continue, we must have \(Y_1=X''\), leading to \((y)_4\) with \(y=\alpha t''r''\beta \), allowing no further continuation.

So, we are left with discussing an application of r2.2, leading to \((w_6)_3\) with \(w_6=\alpha Y_1Y_2r'''r''\beta \), where the correct position of the insertion is tested in C3, as otherwise no rule is applicable there. This also enforces \((w_6)_3\Rightarrow _{r3.3}(w_7)_2\), with \(w_7=\alpha Y_1Y_2r'''\beta \).

Clearly, by applying r2.3, we could now arrive at \((w_8)_1\) with \(w\Rightarrow _{r}w_8\) as desired, where \(w_8\) contains no (primed) rule markers and only one occurrence of a symbol from \(N'\), concluding the induction step for this case.

Also, according to (*), we could paste in several (correct) simulations of applications of non-context-free deletion rules, which does not interfere with the overall argument, since, as discussed above, such simulations are enforced to be completed.

Finally, restarting some new simulation of a context-free rule would again get stuck in C3 or in C4, as described above.

Second scenario: \((w)_1\Rightarrow _{r1.1}(w_1)_2\Rightarrow _{s2.2}(w_2)_3\). It is easy to check that now the derivation is stuck in C3.

Third scenario: \((w)_1\Rightarrow _{r1.1}(w_1)_2\Rightarrow _{f2.1}(w_2)_3\). As noted in (*), we can paste in several correct simulations of non-context-free deletion rules, not changing the main argument of this proof. Notice that no other rules are applicable in C3 but f3.1 (or f3.2, but this makes no progress), and similarly in C4.

Fourth scenario: \((w)_1\Rightarrow _{f1.1}(w_1)_2\Rightarrow _{r2.1}(w_2)_3\). Now, the only possible continuation is via \((w_2)_3\Rightarrow _{r3.1}(w_3)_4\), but then the derivation is stuck in C4, as \(w_3\) contains neither \(f'\) nor a double-primed rule marker to the left of an unprimed marker of the same rule.

Fifth scenario: \((w)_1\Rightarrow _{f1.1}(w_1)_2\Rightarrow _{r2.2}(w_2)_3\). This attempted derivation is immediately stuck in C3.

Sixth scenario: \((w)_1\Rightarrow _{f1.1}(w_1)_2\Rightarrow _{f2.1}(w_2)_3\). As described above, this may now start an intended simulation of a non-context-free deletion rule. Any attempt to intercalate other rules will get stuck, apart from those undoing the previous step, which can be ignored.

Seventh scenario: \((w)_1\Rightarrow _{f1.1}(w_1)_2\Rightarrow _{f2.2}(w_2)_1\). As \(w_2=w\), this scenario can be ignored.

As the reader can easily check, these are all possibilities of derivations, starting with \((w)_1\). This proves that whenever \((w)_1\Rightarrow '(w')_1\), then \(w\Rightarrow ^+ w'\) in G, which shows \(L(\varPi )\subseteq L(G)\) by induction.

The claimed path structure, which can easily be verified by inspection, is depicted in Fig. 1b, disregarding the loop over C2. Proposition 1 shows that \(\text {GCID}_P\) systems of size (4; 2, 0, 0; 1, 0, 1) are computationally complete. \(\square \)

Theorem 5

\(\mathrm {RE} = \mathrm {GCID}_P(3;2,1,0;1,0,1)= \mathrm {GCID}_P(3;2,0,1;1,1,0)\).

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF as in Def. 1. The rules of P are assumed to be labelled bijectively with labels from \([1 \ldots |P|]\). We construct a \(\mathrm {GCID}_P\) system \(\varPi = (3,V,T,\{S\},H,1,1,R)\) of size (3; 2, 1, 0; 1, 0, 1) such that \(L(\varPi ) = L(G)\). Here, let \(V\subset N\cup T\cup [1 \ldots |P|]\) contain in particular those rule labels used in the rules listed in Table 6. The three columns of the table correspond to the three components of \(\varPi \), and the rows correspond to the rules simulating \(r: X\rightarrow Y_1Y_2\), \(f: AB \rightarrow \lambda \) and \(h: S' \rightarrow \lambda \). We now prove that \(L(G) \subseteq L(\varPi )\). As the claim is evident for \(h:S'\rightarrow \lambda \), we show that if \(w\Rightarrow w'\) in G, then \((w)_1\Rightarrow ' (w')_1\) according to \(\varPi \) in the two remaining cases.

Case CF: Here, \(w=\alpha X\beta \) and \(w'=\alpha Y_1Y_2\beta \) for some \(\alpha ,\beta \in (N''\cup T)^*\). The simulation of \(r:X\rightarrow Y_1Y_2\) performs as follows:

$$\begin{aligned} \begin{array}{l} (\alpha X\beta )_1 ~_{\Rightarrow _{r1.1}}^{\Leftarrow _{r2.2}} (\alpha Xr\beta )_2 \Rightarrow _{r2.1} (\alpha r\beta )_3 \Rightarrow _{r3.1} (\alpha rY_1Y_2\beta )_2 \Rightarrow _{r2.2} (\alpha Y_1Y_2\beta )_1\,. \end{array} \end{aligned}$$

Note the role of the right context r in the deletion rule r2.1. If the marker r were not required as context for this deletion, then after applying r3.1, when we come back to C2, we could apply r2.1 again and could end up with a malicious derivation.

Case \(\overline{\mathrm {CF}}\): Here \(w=\alpha AB\beta \) and \(w'=\alpha \beta \) for some \(\alpha ,\beta \in (N\cup T)^*\). The rule \(f:AB\rightarrow \lambda \) can be simulated as follows.

$$\begin{aligned} \begin{array}{c} (\alpha AB\beta )_1 ~_{\Rightarrow _{f1.1}}^ {\Leftarrow _{f2.2}} (\alpha ABf\beta )_2 \Rightarrow _{f2.1} (\alpha Af\beta )_3 \Rightarrow _{f3.1} (\alpha f\beta )_2 \Rightarrow _{f2.2} (\alpha \beta )_1\,. \end{array} \end{aligned}$$
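The concrete rules of Table 6 are not reproduced in this section; still, to make the two displayed derivations tangible, the following small Python replay applies contextual insertions and deletions whose shapes are merely inferred from these displays (insertions with a left context of length one, deletions with a right context of length one, marker deletions assumed context-free). It is an illustrative sketch for a toy instance with \(r:X\rightarrow YZ\) and \(f:AB\rightarrow \lambda \), not the construction itself; moreover, it fixes one position per step, whereas \(\varPi \) chooses positions nondeterministically.

def ins(s, left, x):
    # Insert x immediately to the right of the (first) left context.
    i = s.find(left)
    assert i >= 0, "left context not found"
    return s[:i + len(left)] + x + s[i + len(left):]

def dele(s, x, right):
    # Delete the (first) occurrence of x that has the given right context.
    i = s.find(x + right)
    assert i >= 0, "pattern not found"
    return s[:i] + s[i + len(x):]

alpha, beta = "a", "b"          # stand-ins for the context strings

# Case CF, following (alpha X beta)_1 => ... => (alpha Y1 Y2 beta)_1:
w = alpha + "X" + beta
w = ins(w, "X", "r")            # r1.1: insert marker r right of X, go to C2
w = dele(w, "X", "r")           # r2.1: delete X left of r, go to C3
w = ins(w, "r", "YZ")           # r3.1: insert Y1Y2 right of r, go to C2
w = dele(w, "r", "")            # r2.2: delete marker r, back to C1
assert w == alpha + "YZ" + beta

# Case non-CF, following (alpha A B beta)_1 => ... => (alpha beta)_1:
w = alpha + "AB" + beta
w = ins(w, "B", "f")            # f1.1: insert marker f right of B, go to C2
w = dele(w, "B", "f")           # f2.1: delete B left of f, go to C3
w = dele(w, "A", "f")           # f3.1: delete A left of f, go to C2
w = dele(w, "f", "")            # f2.2: delete marker f, back to C1
assert w == alpha + beta
print("both displayed derivations replayed successfully")

Note that treating the marker deletions r2.2 and f2.2 as context-free is our assumption for this sketch; Table 6 may specify them differently.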

We now prove the converse \(L(\varPi ) \subseteq L(G)\). Consider some derivation \((S)_1 \Rightarrow '_{*} (w)_1\) in \(\varPi \). We claim that then \(S\Rightarrow ^* w\) in G; in particular, \(w\in (N''\cup T)^*(N'\cup \{\lambda \})(N''\cup T)^*\). This trivially holds for \(w=S\). So, suppose we have some \(w\in (N''\cup T)^*(N'\cup \{\lambda \})(N''\cup T)^*\) for which \((S)_1\Rightarrow '_*(w)_1\) in \(\varPi \), as well as \(S\Rightarrow ^* w\) in G by induction hypothesis. Let \((w)_1\Rightarrow '(w')_1\) in \(\varPi \), so that \((S)_1\Rightarrow '_*(w')_1\) in \(\varPi \). As \((w)_1\Rightarrow '(w')_1\), inspecting the rules tells us that there are various ways in which \(w=w'\) is possible. Clearly, then \(w'\) enjoys the same properties as w. So, assume \(w'\ne w\) from now on. The derivation \((w)_1\Rightarrow '(w')_1\) has to start like \((w)_1\Rightarrow (w_1)_j\) in \(\varPi \). If \(j=1\), then rule h1.1 was applied, so in fact \(w_1=w'\), and the effect is just the same as applying \(S'\rightarrow \lambda \) in G, so that \(w'\) satisfies the claim. Otherwise, \(j=2\). Here, there are two different ways to continue, which we are going to discuss next.

Table 6 GCID rules of size (3; 2, 1, 0; 1, 0, 1) simulating a type-0 grammar in SGNF

Applying r1.1 to obtain \(w_1\) from w, for some \(r :X \rightarrow Y_1Y_2\). Hence, w contains an occurrence of X, and this is also the only place where any symbol from \(N'\) can appear by assumption. So, \(w=\alpha X\beta \) for some \(\alpha ,\beta \in (N''\cup T)^*\), and \(w_1=\alpha Xr \beta \) is transferred to C2. All rules in C2 are guarded by a rule marker. Hence, the only rules applicable now are r2.1 and r2.2, yielding a string \(w_2\). On applying the latter rule, \(w_2=w\) is returned to C1, so that \(w'=w\), a case ruled out before. Hence, \(w_2=\alpha r\beta \) is moved to component C3. Again, all rules in C3 are guarded by rule markers, so that the only applicable rule is r3.1. The resulting string \(w_3= \alpha rY_1Y_2\beta \) moves to C2. The rule r2.1 cannot be applied, since the only nonterminal of \(w_3\) occurs to the right of r, whereas r2.1 requires the nonterminal to appear immediately to the left of r. Hence, r2.2 has to be applied to \(w_3\), yielding \(w_4= \alpha Y_1Y_2\beta \), which moves to C1, i.e., \(w'=w_4\). Obviously, \(w\Rightarrow w'\) using rule \(r:X \rightarrow Y_1Y_2\) of G. Also, \(w'\in (N''\cup T)^*(N'\cup \{\lambda \})(N''\cup T)^*\).

Applying f1.1 to obtain \(w_1\) from w, for \(f:AB\rightarrow \lambda \). Hence, w contains some occurrence of B, i.e., \(w=\alpha B\beta \), which is the occurrence checked when inserting f, i.e., \(w_1=\alpha Bf\beta \) is the string moved to C2. Applying f2.2 now would mean that \(w'=w\), contradicting our assumptions. Hence, f2.1 is applied, so that the resulting string \(w_2=\alpha f\beta \) moves to C3. Due to the use of rule markers, the only potentially applicable rule is f3.1. As the derivation is continued (by assumption of the existence of \(w'\)), \(\alpha =\alpha ' A\) holds. Hence, \(w_3=\alpha ' f\beta \) moves to C2. Now, if f2.2 is applied, \(w_4=\alpha '\beta \) moves to C1, i.e., \(w'=w_4\). As \(w=\alpha B\beta =\alpha ' AB\beta \), \(w'\) can be obtained from w by applying \(AB\rightarrow \lambda \), which shows the claim in this case. Recall that \(w'\) was obtained by first inserting f into w and then applying the rules f2.1, f3.1, and f2.2 in this order. So, if we would choose to apply f2.1 to \(w_3\) (instead of f2.2), this means that \(\alpha '=\alpha ''B\), and \(w_4=\alpha ''f\beta \). But then,

$$\begin{aligned} \begin{array}{l} (w_3)_2 = (\alpha ' f\beta )_2 \Rightarrow _{f2.2} (\alpha '\beta )_1= (\alpha ''B\beta )_1\Rightarrow _{f1.1} (\alpha ''Bf\beta )_2 \Rightarrow _{f2.1} (w_4)_3 \end{array} \end{aligned}$$

is also possible, so we can arrive at the same situation by first finishing the simulation of one application of \(AB\rightarrow \lambda \) and then starting another such application. Therefore, using the rule sequence f1.1, followed by \((f2.1, f3.1)^m\), and finalized by f2.2, in order to obtain \(w'\) from w in C1, simulates an m-fold application of \(f:AB\rightarrow \lambda \). Hence, \(w\Rightarrow ^m w'\) in G and \(w'\) satisfies the required properties.
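To illustrate the m-fold case with a concrete (added) instance, take \(m=2\) and \(w=\gamma ABAB\beta \) for some strings \(\gamma ,\beta \); inserting f behind the second B, the rule sequence f1.1, \((f2.1, f3.1)^2\), f2.2 then reads:

$$\begin{aligned} \begin{array}{l} (\gamma ABAB\beta )_1 \Rightarrow _{f1.1} (\gamma ABABf\beta )_2 \Rightarrow _{f2.1} (\gamma ABAf\beta )_3 \Rightarrow _{f3.1} (\gamma ABf\beta )_2\\ \Rightarrow _{f2.1} (\gamma Af\beta )_3 \Rightarrow _{f3.1} (\gamma f\beta )_2 \Rightarrow _{f2.2} (\gamma \beta )_1, \end{array} \end{aligned}$$

corresponding to \(\gamma ABAB\beta \Rightarrow ^2 \gamma \beta \) in G.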

This concludes the proof that \(L(G)=L(\varPi )\). Proposition 1 shows that also path-structured GCID systems of size (3; 2, 0, 1; 1, 1, 0) are computationally complete. Figure 1a depicts the control graph of the simulation. \(\square \)

Theorem 6

\(\mathrm {RE} = \mathrm {GCID}_P(3;2,1,0;1,1,0) = \mathrm {GCID}_P(3;2,0,1;1,0,1)\).

Table 7 GCID rules of size (3; 2, 1, 0; 1, 1, 0) simulating a type-0 grammar in SGNF

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF as in Def. 1. The rules of P are assumed to be labelled bijectively with labels from \([1 \ldots |P|]\). We construct a GCID system \(\varPi = (3,V,T,\{S\},H,1,1,R)\) of size (3; 2, 1, 0; 1, 1, 0) such that \(L(\varPi ) = L(G)\). V again contains rule markers, apart from the symbols of G. We refer to Table 7 for the rules of \(\varPi \). The control graph is shown in Fig. 1a. The three columns of the table correspond to the three components of \(\varPi \), and the rows correspond to the rules simulating \(r: X \rightarrow Y_1Y_2\), \(f: AB \rightarrow \lambda \) and \(h: S' \rightarrow \lambda \).

We now prove that \(L(G) \subseteq L(\varPi )\) as follows. We show that if \(w\Rightarrow w'\) in G, then \((w)_1\Rightarrow ' (w')_1\) according to \(\varPi \). From this fact, the claim follows by a simple induction argument. \(w\Rightarrow w'\) could be due to different rule types, as discussed above. As the correctness is evident for \(h:S'\rightarrow \lambda \), it remains to discuss two more cases. We first discuss the simulation of context-free rules and then of non-context-free rules.

Context-free rules \(r:X\rightarrow Y_1Y_2\). Here, \(w=\alpha X\beta \) and \(w'=\alpha Y_1Y_2\beta \) for some \(\alpha ,\beta \in (N''\cup T)^*\). The simulation performs as follows:

$$\begin{aligned} (\alpha X\beta )_1 ~_{\Rightarrow _{r1.1}}^{\Leftarrow _{r2.2}} (\alpha rX\beta )_2 \Rightarrow _{r2.1} (\alpha r\beta )_3 \Rightarrow _{r3.1} (\alpha rY_1Y_2\beta )_2 \Rightarrow _{r2.2} (\alpha Y_1Y_2\beta )_1\,. \end{aligned}$$

Non-context-free rules \(f:AB\rightarrow \lambda \). This means that \(w=\alpha AB\beta \) and \(w'=\alpha \beta \) for some \(\alpha ,\beta \in (N\cup T)^*\). Within \(\varPi \), this can be simulated as follows.

$$\begin{aligned} (\alpha AB\beta )_1 ~^{\Leftarrow _{f2.2}}_{\Rightarrow _{f1.1}} (\alpha fAB\beta )_2 \Rightarrow _{f2.1} (\alpha fB\beta )_3 \Rightarrow _{f3.1} (\alpha f\beta )_2 \Rightarrow _{f2.2} (\alpha \beta )_1\,. \end{aligned}$$

Conversely, a derivation \((w)_1\Rightarrow '(w')_1\) has to start like \((w)_1\Rightarrow (w_1)_j\) in \(\varPi \). If \(j=1\), then the applied rule is h1.1, so \(S'\) is deleted from w to obtain \(w'\), and this exactly corresponds to an application of the context-free rule \(S'\rightarrow \lambda \). More precisely, in the assumed derivation \((w)_1\Rightarrow ' (w')_1\), if some rule from C1 other than h1.1 is applied to w, the rule inserts a rule marker into the string w and branches to C2. The introduction of rule markers in C1 takes care of the non-interference between the non-context-free and the context-free rules. We now discuss the possibilities in detail. For \((w)_1\Rightarrow '(w')_1\), inspecting the rules tells us that there are various ways in which \(w=w'\) is possible. Clearly, then \(w'\) enjoys the same properties as w. So, assume \(w'\ne w\) from now on.

Applying r1.1 to \((w)_1 = (\alpha X\beta )_1\) for \(\alpha ,\beta \in (N'' \cup T)^*\) and \(X \in N'\), the only nonterminal from \(N'\), inserts a marker r randomly into the string \(w=\alpha X\beta \), yielding \((w_1)_2\). In C2, all rules are guarded by markers, and the only applicable rules are r2.1 and r2.2, yielding a string \(w_2\). If the rule r2.2 is applied, then \(w_2=w\) is returned to C1, so that \(w'=w\), a case ruled out before. Hence, on applying r2.1, \(w_2 = \alpha r\beta \) is moved to C3. Again, all rules are guarded by a marker, and the only applicable rule is r3.1, which inserts \(Y_1Y_2\) into \(w_2\) to yield \(w_3 = \alpha rY_1Y_2\beta \), which moves back to C2. At this point, the rule r2.1 is not applicable, since \(Y_1 \ne X\) by SGNF. The only other rule applicable to \(w_3\) is r2.2, which deletes the marker r in \(w_3\), yields \(\alpha Y_1Y_2\beta = w'\) and moves to C1. Also, \(w \Rightarrow w'\) using the rule \(r: X \rightarrow Y_1Y_2\) of G. Note that \(w' \in (N'' \cup T)^*N'(N''\cup T)^*\).

Applying f1.1 to \(w=\alpha AB\beta \), we obtain a string \(w_1\) by inserting f anywhere within w; \(w_1\) is transferred to component C2. If now f2.2 were applied, the string moved back to C1 could be described as undoing the insertion of f, which leads back to our starting point, a case ruled out before. So, we can assume that f2.1 is applied. To make it applicable, \(w_1\) should be of the form \(\alpha fA\beta '\), producing a string \(w_2=\alpha f\beta '\) that is moved to C3. Notice that the possible strings \(w_2\) can be described as being obtained from w by replacing some occurrence of A in w by f. Now, in C3, to make the rule f3.1 applicable to \(w_2=\alpha f\beta '\), \(w_2\) should be of the form \(\alpha fB\beta \), producing a string \(w_3=\alpha f\beta \) that is moved back to C2. Notice that \(w_3\) can be described as being obtained from w by replacing some occurrence of AB in w by f. If f2.2 is applied to \(w_3\), then the marker f is deleted and the resulting string \(w_4=\alpha \beta \) is moved back to C1. This means that \((w)_1\Rightarrow '(w')_1\) implies \(w\Rightarrow w'\) in G, as required. Recall that \(w'\) was obtained by first inserting f into w and then applying the rules f2.1, f3.1, and f2.2 in this order. So, if we would choose to apply f2.1 to \(w_3\) (instead of f2.2), this means that \(\beta =AB\beta ''\), and \(w_4=\alpha f\beta ''\). But then,

$$\begin{aligned} (w_3)_2 = (\alpha f\beta )_2 = (\alpha fAB\beta '')_2 \Rightarrow _{f2.1} (\alpha fB\beta '')_3 \Rightarrow _{f3.1} (\alpha f\beta '')_2 =(w_4)_2 \end{aligned}$$

is also possible, so we can arrive at the same situation by first finishing the simulation of one application of \(AB\rightarrow \lambda \) and then starting the simulation of another such application. Therefore, using the rule sequence f1.1, followed by \((f2.1, f3.1)^m\), and finalized by f2.2, in order to obtain \(w'\) from w in C1, simulates an m-fold application of \(f:AB\rightarrow \lambda \). Hence, \(w\Rightarrow ^m w'\) in G and \(w'\) satisfies the required properties.
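As an added illustration paralleling the one given in the proof of Theorem 5, take again \(m=2\) and \(w=\gamma ABAB\beta \); with the marker now sitting to the left of the deleted symbols and f inserted directly in front of the first A, the sequence f1.1, \((f2.1, f3.1)^2\), f2.2 reads:

$$\begin{aligned} \begin{array}{l} (\gamma ABAB\beta )_1 \Rightarrow _{f1.1} (\gamma fABAB\beta )_2 \Rightarrow _{f2.1} (\gamma fBAB\beta )_3 \Rightarrow _{f3.1} (\gamma fAB\beta )_2\\ \Rightarrow _{f2.1} (\gamma fB\beta )_3 \Rightarrow _{f3.1} (\gamma f\beta )_2 \Rightarrow _{f2.2} (\gamma \beta )_1, \end{array} \end{aligned}$$

again corresponding to \(\gamma ABAB\beta \Rightarrow ^2 \gamma \beta \) in G.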

In particular, once a marker r or f is introduced in C1, no rule in C2 can be applied if the appropriate marker is not present, because all rules are guarded. This also shows that, once the context-free rule simulation has started, it cannot continue with C2-rules stemming from the non-context-free rule simulation, nor vice versa. This shows that no malicious derivations are possible, so that each derivation (sub)sequence of \((w_i)_1\Rightarrow ' (w_{i+1})_1\) corresponds either to a non-context-free or to a context-free rule application.

The claimed size of \(\varPi \) can be verified by inspecting Table 7. Proposition 1 shows that also path-structured GCID systems of size (3; 2, 0, 1; 1, 0, 1) are computationally complete. The underlying graph of the simulation can be seen in Fig. 1a and is hence a path. \(\square \)

3.3 Consequences in ins–del P systems

Graph-controlled insertion–deletion systems whose underlying control graph is a path are equivalent to insertion–deletion P systems. The components of the former correspond to the membranes of the latter, and the path structure of the former corresponds to the balanced nesting of the parentheses (that represent membranes) in the latter. The family of languages generated by ins–del P systems with k membranes and size \((n,i',i'';m,j',j'')\), where the size parameters have the same meaning as in GCID systems, is denoted by \(\mathrm {ELSP}_k(\mathrm {INS}_n^{i',i''}\mathrm {DEL}_m^{j',j''})\). This notation was used in [16], based on [30]. The results of Theorems 1 to 6 are summarized in the following corollary using the notation of [16, 30].

Corollary 1

For \(i', i'', j',j'' \in \{0,1\}\) with \(i'+i'' = j'+j''=1\), the following ins–del P systems are computationally complete.

1. (Thm. 1) \(\mathrm {RE}= \mathrm {ELSP}_3(\mathrm {INS}_1^{i',i''}\mathrm {DEL}_1^{1,1})\).

2. (Thms. 2, 3) \(\mathrm {RE}= \mathrm {ELSP}_4(\mathrm {INS}_1^{i',i''}\mathrm {DEL}_1^{j',j''})\).

3. (Thm. 4) \(\mathrm {RE}= \mathrm {ELSP}_4(\mathrm {INS}_2^{0,0}\mathrm {DEL}_1^{j',j''})\).

4. (Thms. 5, 6) \(\mathrm {RE}= \mathrm {ELSP}_3(\mathrm {INS}_2^{i',i''}\mathrm {DEL}_1^{j',j''})\). \(\square \)

Table 8 shows how the above results improve on or complement the existing results in the domain of ins–del P systems or path-structured GCID systems. It is, however, open, to mention just one example, whether \(\mathrm {ELSP}_3(\mathrm {INS}_2^{0,0}\mathrm {DEL}_1^{1,0})\) equals \(\mathrm {RE}\), or how \(\mathrm {ELSP}_4(\mathrm {INS}_2^{0,0}\mathrm {DEL}_1^{1,0})\) and \(\mathrm {ELSP}_3(\mathrm {INS}_1^{1,0}\mathrm {DEL}_2^{0,0})\) relate to each other. Similar questions can be distilled from the subsequent tables, too.

Table 8 Results in insertion–deletion P systems

4 Further corollaries and summary

4.1 Consequences in GCID systems

Theorem 7

(i) \(\mathrm {GCID}(4;1,0,1;1,0,1)=\mathrm {RE}\); (ii) \(\mathrm {GCID}(4;1,0,1;1,1,0)=\mathrm {RE}\); (iii) \(\mathrm {GCID}(4;2,0,0;1,0,1)=\mathrm {RE}\); (iv) \(\mathrm {GCID}_P(4;1,0,1;2,0,0)=\mathrm {RE}\).

Proof

In [13], it was proved that GCID systems of sizes (4; 1, 1, 0; 1, 1, 0), (4; 1, 1, 0; 1, 0, 1), (4; 2, 0, 0; 1, 1, 0) and (4; 1, 1, 0; 2, 0, 0) equal \(\mathrm {RE}\), where only for the last size the control graph of the system is a path. The theorem follows by Proposition 1. \(\square \)

Corollary 2

The following consequences are trivial.

1. [11] \(\mathrm {RE} = \mathrm {GCID}(3;1,1,0;1,1,0) = \mathrm {GCID}(3;1,0,1;1,0,1)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}(3;2,1,0;1,1,0) = \mathrm {GCID}(3;2,0,1;1,0,1)\)

    (b) \(\mathrm {RE} = \mathrm {GCID}(3;1,1,0;2,1,0) =\mathrm {GCID}(3;1,0,1;2,0,1)\)

2. [11] \(\mathrm {RE} = \mathrm {GCID}(3;1,1,0;1,0,1) = \mathrm {GCID}(3;1,0,1;1,1,0)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}(3;2,1,0;1,0,1)= \mathrm {GCID}(3;2,0,1;1,1,0)\)

    (b) \(\mathrm {RE} = \mathrm {GCID}(3;1,1,0;2,0,1) = \mathrm {GCID}(3;1,0,1;2,1,0)\)

3. [11] \(\mathrm {RE} = \mathrm {GCID}_P(5;1,1,1;1,0,0) = \mathrm {GCID}_P(5;1,0,0;1,1,1)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}_P(5;2,1,1;1,0,0) = \mathrm {GCID}_P(5;1,0,0;2,1,1)\)

4. [11] \(\mathrm {RE} = \mathrm {GCID}_P(3;1,1,1;1,1,0) = \mathrm {GCID}_P(3;1,1,1;1,0,1)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}_P(3;2,1,1;1,1,0) = \mathrm {GCID}_P(3;2,1,1;1,0,1)\)

5. (Thm. 1) \(\mathrm {RE} = \mathrm {GCID}_P(3;1,1,0;1,1,1) = \mathrm {GCID}_P(3;1,0,1;1,1,1)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}_P(3;1,1,0;2,1,1) = \mathrm {GCID}_P(3;1,0,1;2,1,1)\)

6. (Thm. 2) \(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;1,1,0) = \mathrm {GCID}_P(4;1,0,1;1,0,1)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;2,1,0) = \mathrm {GCID}_P(4;1,0,1;2,0,1)\)

7. (Thm. 3) \(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;1,0,1) = \mathrm {GCID}_P(4;1,0,1;1,1,0)\) implies

    (a) \(\mathrm {RE} = \mathrm {GCID}_P(4;1,1,0;2,0,1) = \mathrm {GCID}_P(4;1,0,1;2,1,0)\)

Table 9 Analysis of the generative power of GCID systems of sizes \((k;1,i',i'';1,j',j'')\)
Table 10 Analysis of the generative power of GCID systems for \(n=1\) and \(m=2\)

5 Summary and open problems

In this paper, we focused on examining the computational power of graph-controlled ins–del systems with paths as control graphs, which naturally correspond to a variant of P systems. We lowered the resource requirements needed to describe all recursively enumerable languages. Fixing the parameters \(n,i',i'',m,j',j''\) of a GCID size \((k;n,i',i'';m,j',j'')\), Tables 9, 10 and 11 present the number of components k and the type of control graph with which \(\mathrm {GCID}(k;n,i',i'';m,j',j'')\) describes \(\mathrm {RE}\), as our main focus was to reduce the number of components required for computational completeness while keeping the ID size fixed. However, it is open whether these resource bounds are optimal.

Here, we considered the underlying graph of \(\mathrm {GCID}\) systems to be path-structured only. One may also consider tree structures, which may give additional power, especially to ins–del P systems. Moreover, the resources used in the results on ins–del P systems need not be optimal, since in an ins–del P system each membrane can contain initial strings, and these all evolve in parallel, which may reduce the required size.

Table 11 Analysis of the generative power of GCID systems for \(n=2\) and \(m=1\)

The reader may have noticed that we discussed in detail the case of insertion strings of length two, but a similar discussion for the case of deletion strings of length two is missing. We can trivially inherit some computational completeness results from the case when both insertion and deletion strings have length one only. However, it is open whether

$$\begin{aligned} \mathrm {RE} = \mathrm {GCID}_P(3;1,1,0;2,1,0) = \mathrm {GCID}_P(3;1,1,0;2,0,1), \end{aligned}$$

to state one concrete question in that direction. It is also interesting to observe that we never used the possibility to introduce finite sets of axioms in our simulations; we only used singleton sets as sets of axioms. Also, only in two simulations (in this paper) did we make use of the possibility to have strings of length greater than one as axioms. Is it always the case that these seemingly small modifications in the definition of GCID systems make no difference for the described language class?

In view of the connections with P systems, it would be also interesting to study Parikh images of (restricted) graph-controlled ins–del systems, as started out for matrix-controlled ins–del systems in [7]. This also relates to the macroset GCID systems considered in [6].

Furthermore, as it seems to be quite difficult to achieve computational completeness results for the remaining open cases, attention has now turned to the question of what kinds of known grammatical mechanisms could be simulated with such weaker forms of (regulated) ins–del systems. We refer the reader to [9, 12] for first results in this direction in connection with graph control.