1 Introduction

Insertion-deletion systems are a computational model based on the operations of insertion and deletion of substrings in a string. Initially motivated on linguistic grounds, they more recently became quite popular as a theoretical model for DNA-based computations, as the basic operations fit well into this area. For further discussions on the history of this model, as well as giving insights into the rich literature of this area, we refer to [7, 14, 15].

In a nutshell, the rules of an insertion-deletion system (or ins-del system) can be of two types: insertion or deletion, i.e., either, a string is specified that may be inserted in a prescribed context within the current string, or it may be deleted relative to the context conditions. The potential biological meaning of such a rule should be clear. The main research question is under which restrictions can computational completeness results still be obtained. For instance, it is known [13] that for each recursively enumerable language (or \(\mathrm {RE}\) language for short), there exists an ins-del system where only single symbols are inserted or deleted, and the allowed context conditions (to the left or to the right) are again (at most) single symbols. However, if we disallow checking contexts both to the left and to the right, then not all RE languages can be described; cf. [14]. In such situations, several regulation mechanisms have been studied and shown to achieve computational completeness results. From the viewpoint of biocomputing, let us only mention ins-del P systems [8, 9], sometimes in disguise [4], tissue P systems with ins-del rules [10] and semi-conditional ins-del systems [6].

Meduna and Svec have reported on the use of several variants of context conditions in regulated rewriting in the textbook [11]. Here, (simple) semi-conditional rules are of particular importance. In the semi-conditional case, the conditions are sets of words and a rule can be applied if all words from its permitting condition are present and no word from the forbidden condition is present in the string. A semi-conditional grammar is said to be simple if each rule has only either a permitting condition or a forbidden condition. Let the maximum length of a string in the permitting and forbidden set be denoted by i and j, respectively; then the ordered pair (ij) is called the degree of the semi-conditional grammar. From a biological point of view, these conditions can be interpreted as global context conditions, as opposed to the local context conditions traditionally represented within the ins-del rules themselves.

Ivanov and Verlan initiated the study of semi-conditional ins-del systems in [6]. They proved that with degree (2, 2), inserting and deleting single symbols without any local context is sufficient to describe any RE language. Conversely, extending previous computational incompleteness results on non-regulated ins-del systems, it was shown in the same paper that ins-del systems that may insert or delete single symbols in one-sided single-symbol context are not able to describe the regular language \(\{ab\}^+\), assuming that these systems can also globally check for single symbols only, i.e., if they are of degree (1, 1).

No previous computational completeness results have been known for other degrees. This motivates the present study. We think that it might be possible to also globally check for the presence or absence of short molecular parts (strings) within biocomputational devices. Furthermore, we managed to cope with the already mentioned simple restriction on semi-conditional rules. Clearly, this additional restriction is a technical challenge. More specifically, we prove that simple semi-conditional ins-del systems of degree (2, 1) are computationally complete if (i) strings of length two may either be inserted or deleted without any local conditions, or (ii) only single symbols (with one-sided single-symbol local context) may be inserted, but strings of length two may be deleted without any local conditions, or (iii) only single symbols (with one-sided single-symbol local context) may be inserted and single symbols (with two-sided single-symbol local context) may be deleted. We finally present a trade-off result for systems of degree (3, 1).

2 Preliminaries

Let \(\mathbb {N}\) denote the set of non-negative integers, and \([1\ldots k]=\{i\in \mathbb {N}:1\le i\le k\}\). If \(\varSigma \) is an alphabet (finite set), then \(\varSigma ^*\) denotes the free monoid generated by \(\varSigma \). The elements of \(\varSigma ^*\) are called strings or words; \(\lambda \) denotes the empty string. The morphism from the monoid \(\varSigma ^*\) to \(\mathbb {N}\) (with addition), defined by \(a\mapsto 1\) for \(a\in \varSigma \) is called length of a word; usually, we write |w|. \(\varSigma ^{\le i}\) collects all words over \(\varSigma \) of length at most i. A word v is a subword of \(x\in \varSigma ^*\) if there are words uw such that \(x=uvw\). Let \( sub (x)\subseteq \varSigma ^*\) denote the set of all subwords of \(x\in \varSigma ^*\). We also use the shuffle operation to describe the effect of insertions at a random position in the string. \(w^R\) denotes the reversal of \(w\in \varSigma ^*\). For the computational completeness results, we are using the fact that type-0 grammars in SGNF are known to characterize the class \(\mathrm {RE}\) of recursively enumerable languages.

Definition 1

([5]). A type-0 grammar \(G=(N,T,P,S)\) is said to be in Special Geffert Normal Form, or SGNF for short, if N decomposes as \(N=N'\cup N''\), where \(N''=\{A,B,C,D\}\) and \(N'\) contains at least the two nonterminals S and \(S'\), the only non-context-free rules in P are the two erasing rules \(AB\rightarrow \lambda \) and \(CD \rightarrow \lambda \), the context-free rules are of the following forms:

$$X \rightarrow Yb \text { or }X\rightarrow bY, \text { where } X,Y\in N',~X \ne Y,~b\in T\cup N'', \text { or }S'\rightarrow \lambda .$$

The way the normal form is constructed is described in [5]. Also, the derivation of a string is done in two phases. In phase I, the context-free rules are applied repeatedly; this phase is completed by applying the rule \(S' \rightarrow \lambda \) in the derivation. In phase II, only the non-context-free erasing rules are applied repeatedly until a terminal string is reached. From its invention, this normal form turned out to be a very tool for proving computational completeness results for (regulated) ins-del systems.

Definition 2

([7, 12]). An insertion-deletion system, or ins-del system for short, is a construct \(\gamma =(V,T,A,R)\), where V is an alphabet, \(T \subseteq V\) is the terminal alphabet, A is a finite language over V, R is a finite set of triplets of the form \((u,\eta ,v)_{ins}\) or \((u,\delta ,v)_{del}\), where \((u,v) \in V^* \times V^*\), \(\eta ,\delta \in V^+\).

The pair (uv) is called the context, \(\eta \) is called the insertion string, \(\delta \) is called the deletion string and \(x \in A\) is called an axiom. If one of the u or v is \(\lambda \) for all the insertion (deletion) contexts, then we call the insertion (deletion) one-sided. If both \(u,v=\lambda \) for every insertion (deletion) rule, then it means that the corresponding insertion (deletion) can be done freely anywhere in the string and is called context-free insertion (context-free deletion). The descriptional complexity of an ins-del system is measured by its size \(s=(n,i',i'';m,j',j'')\), where the parameters represent resource bounds as given in Table 1.

Table 1. Parameters in the size of ins-del system.

Definition 3

([6]). A semi-conditional insertion-deletion system of degree (ij), \(i,j \ge 0\) is a construct \(\varPi = (V,T,A,R)\), where V is a finite alphabet, \(T \subseteq V\) is the terminal alphabet, \(A \subseteq V^*\) is a finite set of axioms, R is a finite set of rules of the form \([(u,s,v)_t,\mathcal {P},\mathcal {F}]\) where \(u, s, v \in V^*\), \(t\in \{ ins , del \}\), \(\mathcal {P,F}\) are finite subsets of \(V^*\).

The set \(\mathcal {P}\) is called the permitting set and \(\mathcal {F}\) is called the forbidden set. For clarity, we often use unique labels for rules, even identifying a rule with its label, i.e., if \(r\in R\) is a rule (label), then \(r: [(u_r,s_r,v_r)_{t_r},\mathcal {P}_r,\mathcal {F}_r]\). The ordered pair (ij) is called the degree of the semi-conditional ins-del system \(\varPi \) where i is the smallest integer such that \(\bigcup _{r\in R}\mathcal {P}_r\subseteq V^{\le i}\) and j is the smallest integer such that \(\bigcup _{r\in R}\mathcal {F}_r\subseteq V^{\le j}\). We write \(x \Rightarrow _r y\) if \(\mathcal {P}_r\subseteq sub (x)\) and \(\mathcal {F}_r\cap sub (x)=\emptyset \) and either

  1. 1.

    \(t_r= ins \) and \(x = x_1u_rv_rx_2,~y = x_1u_rs_rv_rx_2\), for some \(x_1,x_2 \in V^*\); or

  2. 2.

    \(t_r= del \) and \(x = x_1u_rs_rv_rx_2,~y = x_1u_rv_rx_2\), for some \(x_1,x_2 \in V^*\).

The language generated by a semi-conditional insertion-deletion system \(\varPi \) is

$$L(\varPi ) = \{w \in T^* \mid x \Rightarrow ^* w \text { for some } x \in A \}\,,$$

where \(\Rightarrow ^*\) is the reflexive and transitive closure of \(\Rightarrow =\bigcup _{r\in R}\Rightarrow _r\). The families of languages generated by semi-conditional insertion-deletion systems of degree at most (ij) having ID size at most \(s =(n,i',i'';m,j',j'')\) is denoted as \(\mathrm {SC}_{i,j}\mathrm {ID}(s)\). If, for each \(r\in R\), either \(\mathcal {P}_r=\emptyset \) or \(\mathcal {F}_r=\emptyset \), then the semi-conditional ins-del system is said to be simple. The families of languages generated by such simple semi-conditional insertion-deletion (denoted in short as SSCID) systems of degree at most (ij) and ID size at most s is denoted as \(\mathrm {SSC}_{i,j}\mathrm {ID}(s)\).

Example 1

Consider the non context-free language \(L_1=\{a^nb^{n}c^n \mid n \ge 1\}\). We construct a simple semi-conditional ins-del system \(\varPi \) of degree (1, 1) and ID size (3, 1, 1; 1, 0, 0) describing \(L_1\) as follows: \(\varPi = (\{A,B,a,b,c\}, \{a,b,c\}, \{abc\}, R)\) where the set of rules of R are given in Table 2.

Table 2. SSCID rules describing \(\{a^nb^nc^n \mid n \ge 1\}\).

We will now explain the working of the rules in Table 2. From the rules, we can see that r1 can be applied in the absence of B and r2 can be applied in the presence of A, thus, r1 has to be applied before r2 is applied. Note that in r1, as the contexts are a and b, once aAb is introduced between a and b, the rule r1 cannot (immediately) be applied again until A is deleted. Similarly, rule r2 cannot be applied for a second time unless B is deleted. Starting from the axiom abc, the only applicable rule is r1 which will results in aaAbbc. Now, r3 cannot be applied, as deleting A requires the presence of B and this symbol is not introduced yet. So, the only applicable rule is r2 which results in aaAbbBcc. Now, r4 cannot be applied as it requires the absence of A and A is still present in the derived string. The only applicable rule is hence r3 which deletes the A and then the only applicable rule is r4 which deletes the B and results to aabbcc. A sample derivation is given below for better understanding the system.

$$\begin{aligned} abc \Rightarrow _{r1} aaAbbc \Rightarrow _{r2} aaAbbBcc \Rightarrow _{r3} aabbBcc \Rightarrow _{r4} aabbcc=a^2b^2c^2. \end{aligned}$$

The above process is repeated and as the rules are applied in a deterministic manner, it is easy to see that \( L(\varPi ) = L_1\).    \(\square \)

Remark 1

The purpose of Example 1 is to explain how the system works and the size used in this example does not necessarily correspond to computational completeness results obtained in this paper. On the other hand, if a type-0 grammar (in SGNF) is given for \(L_1\), then \(L_1\) can be simulated by a simple semi-conditional ins-del system with the sizes that are shown in the computational completeness result.    \(\square \)

The results of this paper and a sketch on how they complement the existing results of [6] are given in Table 3.

Table 3. Comparing the results of [6] and this paper.

3 Main Results

In order to make some of our results simple, we claim the following, similar to other regulation mechanisms, as for example in [4].

Theorem 1

If \(s=(n,i',i'';m,j',j'')\) is some ID size and (ij) is some degree, then \(\mathrm {SSC}_{i,j}\mathrm {ID}(s)=[\mathrm {SSC}_{i,j}\mathrm {ID}(s')]^R\), with \(s'= (n,i'',i';m,j'',j')\), and moreover, \(\mathrm {SSC}_{i,j}\mathrm {ID}(s)=\mathrm {RE}\) if and only if \(\mathrm {SSC}_{i,j}\mathrm {ID}(s')=\mathrm {RE}\).

In order to show that simple semi-conditional ins-del systems of certain sizes describe RE, we make use of the fact that RE languages can be generated by grammars in Special Geffert Normal Form where the rules are of the type (i) \(p: X \rightarrow bY\) (ii) \(q: X \rightarrow Yb\) (iii) \(f: AB \rightarrow \lambda \) (iv) \(g: CD \rightarrow \lambda \) and (v) \(h: S' \rightarrow \lambda \), where \(p,q,f,g,h \in [1 \dots |P|]\) are labels associated with each type of rule of SGNF. We provide a simulation of these rules by rules of simple semi-conditional ins-del system. The simulation of type \(g: CD \rightarrow \lambda \) rules is similar to the simulation of f-type rules. Also, we always simulate the h type rule by \([(\lambda ,S',\lambda )_{del},\emptyset ,\mathcal {M}]\), with \(\mathcal {M}\in \{\mathcal {M}'',\mathcal {M}'''\}\) as defined below. Therefore, in the following proofs we mostly discuss the simulations of rules of type pqf and we let

$$\begin{array}{lcllcl} M &{}=&{} \{m \mid m \in [1 \dots |P|]\}, &{}\quad M' &{}=&{} \{m' \mid m \in [1 \dots |P|]\}, \\ M'' &{}=&{} \{m'' \mid m \in [1 \dots |P|]\}, &{}\quad M''' &{}=&{} \{m''' \mid m \in [1 \dots |P|]\},\\ \mathcal {M}''&{}=&{}M \cup M'\cup M'',&{}\quad \mathcal {M}'''&{}=&{}M \cup M'\cup M''\cup M'''. \end{array}$$

We first recall from [6] that \( \mathrm {SC}_{2,2}\mathrm {ID}(1,0,0;1,0,0) = \mathrm {RE}\). In the following we decrease the degree to (2, 1) and further make the system simple but at the cost of increasing the insertion and deletion lengths from one to two. The computational completeness of \(\mathrm {SSC}_{0,2}\mathrm {ID}(2,0,0;2,0,0)\) is under study.

Theorem 2

\(\mathrm {SSC}_{2,1}\mathrm {ID}(2,0,0;2,0,0)= \mathrm {RE}\).

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF in which the rules of P are labelled uniquely by numbers \([1 \ldots |P|]\). We construct an SSCID system \(\varPi = (V,T,\{S\},R)\) of degree (2, 1) and ID size (2, 0, 0; 2, 0, 0) as follows such that \(L(\varPi ) = L(G)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \mathcal {M}'''\). The set of rules of R in \(\varPi \) is given as follows. (i) For every rule of type \(p: X \rightarrow bY\) in G, the simulating rules are stated in Fig. 1(a). (ii) For every rule of type \(q: X \rightarrow Yb\) in G, the simulating rules are stated in Fig. 1(b). (iii) Rules of type \(f: AB \rightarrow \lambda \) are simulated by the (SSC)ID rule \(f1 = [(\lambda ,AB,\lambda )_{del}, \emptyset , \emptyset ]\).

Fig. 1.
figure 1

Simulating context-free rules of SGNF by \(\mathrm {SSC}_{2,1}\mathrm {ID}(2,0,0;2,0,0)\).

We now proceed to prove that \(L(\varPi ) = L(G)\). We initially prove that \(L(G) \subseteq L(\varGamma )\) by showing that \(\varPi \) correctly simulates the application of the rules of the types pqf. We focus on the p rule simulation, as this is the most complicated one. The application of \(p:X\rightarrow bY\) to \(\alpha X \beta \) derives \(\alpha bY\beta =w\), which is correctly simulated by \(\varPi \) as follows:

$$\begin{array}{c} \alpha X\beta \Rightarrow _{p1} \alpha pp'X\beta \Rightarrow _{p2} \alpha p\beta \Rightarrow _{p3} \alpha bp''p\beta \Rightarrow _{p4} \alpha bYp'''p''p\beta \Rightarrow _{p5} \alpha bYp''p\beta \Rightarrow _{p6} w. \end{array}$$

Simulation idea: We insert strings of length two in a random manner, such that one symbol of it acts as a marker to stitch to the correct position in the string. The correct position is verified with permitting strings or deletion strings of length two, which verifies that the previously introduced string has been inserted only at a particular correct position. For example, \(pp'\) is randomly inserted by rule p1 and the rule p2 demands that this insertion happens to the left of the only non-terminal X present in the string. Similarly, the permitting string in p5 demands to have the substring \(p'''p''\) present in the string, thus \(Yp'''\) (see rule p4) is inserted between b and \(p''\) and \(bp''\) itself is inserted by rule p3. The forbidden strings in insertion rules prevent from using of the same rule again and also indirectly bring the order among the applications of the rules.

We now prove the converse inclusion \(L(\varPi ) \subseteq L(G)\) by showing that the rules stated in Fig. 1(a) can only be used in the intended way.

Consider a sentential form \(w_0 = \alpha X\beta \) derivable in \(\varPi \) and G, where \(X\in N'\) and \(\alpha ,\beta \in (N''\cup T)^*\). Notice that, from the perspective of G, we are (still) in phase I. The only applicable rule is p1 (or any other insertion rule r1 where the left-hand side of rule r ix X) since other insertion rules like p3 or p4 forbid the presence of any non-terminal of \(N'\). All deletion rules of Fig. 1 require the presence of rule markers (i.e. elements of \(\mathcal {M}'''\)), but \( sub (w_0)\cap \mathcal {M}'''=\emptyset \). On applying the rule p1, \(pp'\) is inserted anywhere in the string thus yielding , with \(pp'\in sub (w_1)\). We cannot apply any insertion rule r1, r3 or r4, as \(p'\in \mathcal {F}_{r1}\cap \mathcal {F}_{r3}\cap \mathcal {F}_{r4}\). In particular, this rules out repeated applications of p1. Also, we cannot apply rule h1 now, as here (and also in any of the further steps discussed below) some rule marker is present in the string. Hence, we must apply a deletion rule of Fig. 1 to \(w_1\). The application of any r5 or r6 requires \(r''\) to appear, which is not the case for \(w_1\). By the uniqueness of rule labels, the only applicable rule is p2 which actually fixes the position of \(pp'\) on the left of X, thereby deleting \(p'X\). Hence, we obtain a unique string \(w_2\) satisfying \(w_1\Rightarrow _{p2} w_2= \alpha p\beta \). Now, there is a choice in applying r3 or r4 for some rule r. We focus on \(r=p\) in the following, as this is the only possible fruitful continuation, as we will soon see. If p4 is applied to \(w_2\), we get and now p3 cannot be applied, as \(p'''\in sub(w_2')\cap \mathcal {F}_{p3}\).

The derivation is stuck, as no other rule can be applied. In particular, p5 is not applicable, since \(p'' \notin sub(w_2')\). Thus, the only applicable rule on \(w_2\) is p3 which inserts \(bp''\) randomly into \(w_2\) yielding , with \(bp''\in sub (w_3)\). The re-application of p3 on \(w_3\) is stopped since \(p''\) is a member of its forbidden set. On applying the only possible rule p4 on \(w_3\),Footnote 1 \(Yp'''\) is randomly inserted, resulting in , with \(Yp''',bp''\in sub (w_4)\). A careful case analysis reveals that now p5 is the only applicable rule.Footnote 2 Since p5 demands that \(p'''p''\in sub (w_4)\), this crucial rule application fixes several of our previous choices: (a) Recall that we could have applied any rule r3 (instead of p3) and any rule \(\bar{r}4\) (instead of p4). But if we would have chosen \(\bar{r}\ne r\), then the substring \(r'''r''\) would not be present in \(w_4\). We will see in the next step that only \(r=p\) is possible, which we will therefore use already in the following to avoid clumsy formulations. (b) Previously, we had the choice inserting \(Yp''',bp''\) anywhere into \(w_2\). However, \(p'''p''\in sub (w_4)\) ensures that \(Yp'''\) must have been inserted between b and \(p''\). Hence, we know that \(bYp'''p''\in sub (w_4)\). Now, \(w_4\Rightarrow _{p5} w_5\) yields \(bYp''\in sub (w_5)\). With symbols from \(M\cup M''\) being present in \(w_5\), we understand that only rule p6 is applicable. Also, the deletion operation fixes that the right-hand side bY introduced with rules r3 and r4 corresponds to that of p, as this deletion is only possible if \(r=p\). Similarly, \(bp''\) must have been inserted to the left of p due to \(p''p\in sub (w_5)\). Applying \(p_6\) on \(w_5\) deletes the markers \(p''p\), thus yielding \(w_6 = \alpha bY\beta \). This series of rule applications that yields \(w_6 = \alpha bY\beta \) from \(w_0 = \alpha X\beta \) corresponds to the rewriting rule \(X \rightarrow bY\) of G.

Consider now a sentential form \(w_0\) derivable both in \(\varPi \) and in G, with \(N'\cap sub (w_0)=\emptyset \). This means that the derivation of grammar G is in phase II. Hence, \(w_0=xyt\), where \(x\in \{A,C\}^*\), \(y\in \{B,D\}^*\), \(t\in T^*\). Clearly, if \(w_0\in T^*\), no further derivation is possible. If AB or CD are substrings of \(w_0\), we can (directly) apply f1 or g1, this way removing this substring as intended. Alternatively, we can apply r1 for some context-free rule r of G. As we have considered above, we would have to apply r2 next, but this is not possible due to the absence of symbols from \(N'\). Hence, any such attempt will get stuck.

By induction, the previous arguments (that basically present the induction steps) show that \(L(\varPi ) \subseteq L(G)\), thus proving the theorem.    \(\square \)

Next, we recall from [6] that \(\mathrm {SC}_{1,1}\mathrm {ID}(1,1,0;2,0,0) \ne \mathrm {RE}\). In the following we show that computational completeness can be achieved if we increase the degree of the system from (1, 1) to (2, 1), even when maintaining simplicity. The computational completeness of \(\mathrm {SSC}_{0,2}\mathrm {ID}(1,1,0;2,0,0)\) is open for investigation.

Theorem 3

\(\mathrm {SSC}_{2,1}\mathrm {ID}(1,1,0;2,0,0) = \mathrm {SSC}_{2,1}\mathrm {ID}(1,0,1;2,0,0) = \mathrm {RE}.\)

The reader might wonder why we could not deduce this result by sequentializing the construction of Theorem 2 or even by starting from a \(\mathrm {SSC}_{2,1}\mathrm {ID}(2,0,0;2,0,0)\) system. In fact, as long as special symbols like rule labels are introduced as in rule p1 in Fig. 1(a), where a string of two rule labels is inserted (in this example \(pp'\)) we might do the following. First, introduce the left one of them (in this example it is p) with the context conditions of the previous simulation (in this example it is \(\mathcal{M}'''\)), and then introduce the right one (in this example it is \(p'\)) in the context of the left one (in this example it is p). One can avoid repetitions by having this newly introduced marker (in this example it is \(p'\)) in the forbidden context. This trick can only work if we do not expect that this symbol (that we now check for not showing up in the string) may not already be present in the string. In our example we do not expect \(p'\) to be present before we introduced it, so we can sequentialize p1 in the described way. However, this expectation is not met, for instance, when trying to sequentialize rule p3 in Fig. 1(a) in a similar fashion. Here, we would need different ideas. In more general terms, this prevents us from starting out from a \(\mathrm {SSC}_{2,1}\mathrm {ID}(2,0,0;2,0,0)\) system in our simulation for proving the claimed computational completeness result for \(\mathrm {SSC}_{2,1}\mathrm {ID}(1,1,0;2,0,0)\). Hence, we now show a different simulation, starting from type-0 grammars in SGNF again.

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF. The rules of P are labelled uniquely by numbers \([1 \ldots |P|]\). We construct an SSCID system \(\varPi = (V,T,\{S\},R)\) of degree (2, 1) and ID size (1, 1, 0; 2, 0, 0) as follows such that \(L(\varPi ) = L(G)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \mathcal {M}''\). The set of rules R of \(\varPi \) is given as follows: (i) For every rule of type \(p: X \rightarrow bY\) in G, the simulating rules are stated in Fig. 2(a), (ii) For every rule of type \(q: X \rightarrow Yb\) in G, the simulating rules are stated in Fig. 2(b), (iii) Rules of type \(f: AB \rightarrow \lambda \) is simulated by the SSCID rules \(f1 = [(\lambda ,AB,\lambda )_{del},\emptyset ,\emptyset ]\).

Fig. 2.
figure 2

Simulation of context-free rules of SGNF by \(\mathrm {SSC}_{2,1}\mathrm {ID}(1,1,0;1,0,0)\).

We first explain the idea behind the construction of q rule simulation in \(\varPi \) as follows. We introduce three markers \(q,q',q''\) in order to have \(qq'q''\) present in the string. The X of \(N'\) is deleted before \(q'\) is introduced. So, the effect of executing q1 through q4 is the same as that of applying the rewriting rule \(X\rightarrow qq'q''\). Then, Y is inserted in between q, \(q'\) and b is inserted in between \(q'\) and \(q''\). Note that b cannot be introduced for a second time, as the string will be having \(q'bq''\) and not \(q'q''\) (see rule q5). On deleting the markers, first q is deleted in the presence of the \(Yq'\) and \(bq''\) to ensure that Y and b are correctly introduced. Then, the markers \(q'\) and \(q''\) are deleted in this order. The order of deletion is important since otherwise, the rules q3 and/or q4 can be applied again and a malicious string can be obtained by using the rules q5 and/or q6.

One can show that \(L(G) \subseteq L(\varPi )\) by an inductive argument. The main point is to understand the simulation of a context-free rule, say, of type q:

$$\begin{array}{c} \alpha X\beta \Rightarrow _{q1} \alpha Xq\beta \Rightarrow _{q2} \alpha q\beta \Rightarrow _{q3} \alpha qq'\beta \Rightarrow _{q4} \alpha qq'q''\beta \Rightarrow _{q5} \\ \alpha qq'bq''\beta \Rightarrow _{q6} \alpha qYq'bq''\beta \Rightarrow _{q7} \alpha Yq'bq''\beta \Rightarrow _{q8} \alpha Ybq''\beta \Rightarrow _{q9} \alpha Yb\beta . \end{array}$$

To show the converse inclusion \(L(G) \supseteq L(\varPi )\), consider a string \(w_0\) derivable both in G and in \(\varPi \). We discuss possible derivations for \(w_0\) in \(\varPi \) and have to show that these either get stuck or correspond to derivation steps in G, which would then entail the claim by induction. Observe that any rules \(r_j\) for \(j>1\) require that \( sub (w_0)\cap \mathcal {M}''\ne \emptyset \), either by the permitting context, or because this is a requirement of the ins-del rules themselves. Hence, if \(N'\cap sub (w_0)=\emptyset \), i.e., the SGNF grammar G would work in phase II, we have to apply one of h1, f1, g1, which directly corresponds to an erasing rule of G.

Therefore, we now consider a sentential form \(w_0 = \alpha X\beta \) derivable in \(\varPi \) and G, where \(X\in N'\) and \(\alpha ,\beta \in (N''\cup T)^*\). The only applicable rules are some rules q1 that insert the marker q to the right of X, thus yielding \(w_1 = \alpha Xq\beta \). Notice that now (and also within the future discussions) always a marker from \(\mathcal {M}''\) is present in the string, which disables applying rule h1 prematurely. No rule r3 is applicable, as \(N'\cap sub (w_1)\ne \emptyset \). For any of the rules r4, r5, r6, r7, r8 to be applicable, \(M'\cap sub (w_1)\ne \emptyset \) is necessary, which is not the case. Similarly, r9 is not applicable. Hence, the only applicable rule is q2 which deletes X yielding the string \(w_2 = \alpha q\beta \). Again, none of the rules r4, r5, r6, r7, r8 is applicable, as \(M'\cap sub (w_1)=\emptyset \). The presence of the marker q disables r1 and r9. As \(N'\cap sub (w_1)=\emptyset \), no rule r2 is applicable. Due to the uniqueness of the rule labels, q3 is hence the only applicable rule, with \(w_2\Rightarrow _{q3} w_3=\alpha qq'\beta \). As \(q,q'\) are present in \(w_3\), any rule like r1, r3, r8, r9 is disabled. The absence of symbols from \(N'\cup M''\) disables applying r2, r4, r5, r7. Label uniqueness leaves us with applying either q4 or q6. Hence, if \(w_3\Rightarrow w_4\) in \(\varPi \), then \(w_4\in \{\alpha qq'q''\beta ,\alpha qYq'\beta \}\). If \(w_4=\alpha qYq'\beta \), a case analysis reveals that if \(w_4\Rightarrow w_5\) in \(\varPi \), then this must be due to applying \(q_4\), i.e., \(w_5=\alpha qYq'q''\beta \). Now, q5 is the only applicable rule, so that \(w_6=\alpha qYq'bq''\beta \) is enforced. Alternatively, on \(w_4=\alpha qq'q''\beta \), only rules q5 and q6 can apply. However, the order of application of q5, q6 does not matter, because if q5 is applied, then only q6 can be applied next, and vice versa. Hence, if \(w_4\Rightarrow w_5\Rightarrow w_6\) in \(\varPi \), \(w_6=\alpha qYq'bq''\beta \) is again enforced.

The presence of symbols from \(M,M',M''\) and \(N'\) in the substring \(qYq'bq'\) within \(w_6\) prevents applying any of the insertion rules, as well as of any r8 or r9. Because we can assume that \(X\ne Y\) in any rule \(q:X\rightarrow Yb\) or \(p:X\rightarrow bY\) of G, no rule r2 can be applied at this point. The only applicable rule on \(w_6\) is hence q7 which deletes the marker q, thus yielding \(w_7 = \alpha Yq'bq''\beta \). Let us stress that q7 could not have been applied at any earlier point, as it also checks that both \(Yq'\) and \(bq''\) are present within the sentential form. Following the application of q7, the rules q8, q9 are applied in a deterministic way which will delete the markers \(q', q''\), respectively, from \(w_7\) thus finally yielding \(w_9=\alpha Yb\beta \). A case-by-case analysis shows that no other rules are applicable within a derivation \(w_7\Rightarrow w_8\Rightarrow w_9\) within \(\varPi \). This series of rule applications yielding \(w_9 = \alpha Yb\beta \) from \(w_0 = \alpha X\beta \) corresponds to the rewriting rule \(X \rightarrow Yb\). The second claim \(\mathrm {SSC}_{2,1}\mathrm {ID}(1,0,1;2,0,0) = \mathrm {RE}\) follows now with Theorem 1.    \(\square \)

It is shown in [6] that \(\mathrm {SC}_{1,1}\mathrm {ID}(1,1,0;1,1,1) \ne \mathrm {RE}\). Analogous to the previous theorem, we show in the following that computational completeness of the system with ID (1, 1, 0; 1, 1, 1) can be achieved if we increase the degree of the systems from (1, 1) to (2, 1). We prove the result even for simple semi-conditional ins-del systems. Thus, the size in the following result is optimal. The computational completeness of \(\mathrm {SSC}_{0,2}\mathrm {ID}(1,1,0;1,1,1)\) is under investigation.

Theorem 4

\(\mathrm {SSC}_{2,1}\mathrm {ID}(1,1,0;1,1,1) = \mathrm {SSC}_{2,1}\mathrm {ID}(1,0,1;1,1,1) =\mathrm {RE}.\)

Proof

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF. The rules of P are labelled uniquely by numbers \([1 \ldots |P|]\). We construct an SSCID system \(\varPi = (V,T,\{S\},R)\) of degree (2, 1) and ID size (1, 1, 0; 1, 1, 1) as follows such that \(L(\varPi ) = L(G)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \mathcal {M}''\). The set of rules R of \(\varPi \) is given as follows: (i) For every rule of type \(p: X \rightarrow bY\) in G, the simulating rules are stated in Fig. 2(a). (ii) For every rule of type \(q: X \rightarrow Yb\) in G, the simulating rules are stated in Fig. 2(b). (iii) Rules of type \(f: AB \rightarrow \lambda \) in G are simulated by rules as stated in Fig. 3.

Fig. 3.
figure 3

How to simulate \(f:AB\rightarrow \lambda \) by \(\mathrm {SSC}_{0,1}\mathrm {ID}(1,1,0;1,1,1)\)

We now proceed to prove that \(L(\varPi ) = L(G)\). We initially prove that \(L(G) \subseteq L(\varPi )\) by showing that \(\varPi \) correctly simulates the application of the rules of the types pqf. The working of the simulation rules for the cases p and q are already explained in Theorem 3. Hence, we now explain only the working of f.

The idea behind the construction of f rules is follows. We want to pin AB with the markers and to obtain a substring of the form \(fAf'Bf''\). Though f is inserted at random, the correct position of f insertion is taken care with rule f4. Rule f6 is applicable only when A is deleted, since \(f'\) is inserted to the right of A and f cannot be present to the left of \(f'\) unless A is deleted. As only one \(f'\) is present in between A and B in the string (see rules f4 and f5) this makes sure that the A and B that are next to each other only gets deleted. Also, as f4 and f5 have both left and right context for deleting, we cannot delete more than one A and one B. To delete \(f'\) the presence of \(f''\) is required which ensures the presence of B. Finally, the markers \(f',f''\) and f are deleted. Note that the permitting sets for all the rules in the simulation of f rule are empty.

: The rule \(f: AB \rightarrow \lambda \) of G is simulated by rules of \(\varPi \) as stated in Fig. 3 as follows:

$$\begin{array}{l} \alpha AB\beta \Rightarrow _{f1} \alpha fAB\beta \Rightarrow _{f2} \alpha fAf'B\beta \Rightarrow _{f3} \alpha fAf'Bf''\beta \Rightarrow _{f4} \\ \alpha ff'Bf''\beta \Rightarrow _{f5}\alpha ff'f''\beta \Rightarrow _{f6} \alpha ff''\beta \Rightarrow _{f7} \alpha f\beta \Rightarrow _{f8} \alpha \beta . \end{array}$$

By induction, this shows that \(L(G) \subseteq L(\varPi )\).

To show the reverse inclusion \(L(G) \supseteq L(\varPi )\), assume that \(w_0\) can be derived both in G and in \(\varPi \). Hence, \(w_0\in (N''\cup T)^*(N'\cup \{\lambda \})(N''\cup T)^*\). If \(N'\cap sub (w_0)\ne \emptyset \), from the perspective of G, we are still simulating phase I. We have to work through the explanations and case distinctions considered in Theorem 3 once more. A problem could arise if in a sentential form \(w_i\) considered in these discussions, \((\mathcal {M}''\cup N')\cap sub (w_i)= \emptyset \), as then rules like f1 become applicable. However, this is never the case, so that there is no danger in starting a simulation of an f- or g-rule prematurely (i.e., when still simulating phase I).

Hence, \(w_0\in (N''\cup T)^*\). If \(w_0\in T^*\), nothing remains to be shown. Hence, w.l.o.g., we consider a sentential form \(w_0 = \alpha AB \beta \) in \(\varPi \) (and in G), where \(A,B \in N''\) and \(\alpha , \beta \in (N'' \cup T)^*\). At first glance, it may seem that we could start the simulation with one of the three rules f1 or f2 or f3. If we apply f2 and f3 (in this sequence, as first applying f3 would block f2, and actually any derivation starting with f3 on \(w_0\) is immediately blocked), then the only applicable rule is f5 which will delete B between \(f'\) and \(f''\), yielding \(f'f''\) as a substring of some \(w'''\), with \(w_0\Rightarrow _{f2}w'\Rightarrow _{f3}w''\Rightarrow _{f5}w'''\). Alternatively, this process yielding \(w'''\) can be described by applying the rewriting rule \(B\rightarrow f'f''\) to \(w_0\). The marker f has neither been introduced earlier nor could be inserted later, because its insertion rule f1 demands absence of \(f', f''\) in particular. But, in the absence of f, it is impossible to delete the markers \(f',f''\) using the rules f6 and f7, respectively.

Hence, in order to make a productive move, we have to begin by applying rule f1 to \(w_0=\alpha AB\beta \), which randomly inserts the marker f. So, if \(w_0\Rightarrow _{f1}w_1\), then . Notice that f1 cannot be applied again on \(w_1\), nor can g1 be, as these rules require all rule marker symbols to be absent. This kind of reasoning reminds valid for the whole derivation that we are going to discuss, disabling unwanted premature starts of other simulations throughout. The only rules that are applicable on \(w_1\) are f2, f3, or f8. As applying f8 simply deletes the f marker introduced in the previous derivation step, this gives no overall progress, so that we can ignore this as an unnecessary detour of the derivation process. Now we apply rules f2 and f3 to \(w_1\) in order, as applying f3 first would lead to a blockage of the derivation. We remark here that it is possible that on applying f2, the marker \(f'\) may be placed after any occurrence of A in \(w_1\). Similar is the case with the application of rule f3 with respect to B. Hence in general, if \(w_0\Rightarrow _{f1}w_1\Rightarrow _{f2}w_2\Rightarrow _{f3}w_3\), then with \(Af',Bf''\in sub (w_3)\). By the forbidden context conditions, none of the insertion rules are applicable to \(w_3\). In order to apply f4, \(fAf'\in sub (w_3)\) is necessary, and in order to apply f5, \(f'Bf''\in sub (w_3)\). The only way to get rid of the introduced markers again is to apply f6, f7 and f8 (in this order). But before being able to apply f6, the substrings \(fAf'\) and \(f'Bf''\) of \(w_3\) have to be transformed to \(ff'\) and \(f'f''\), respectively, so that f4 and f5 have to be applied in any order. Hence, we find \(w_3\Rightarrow w_4\Rightarrow w_5\), with \(w_5\) could have been alternatively derived from \(w_0\) by applying the rewriting rule \(AB\rightarrow ff'f''\). Hence, \(w_5=\alpha ff'f''\beta \), because \(w_0=\alpha AB\beta \) was also derivable in G, and any such string contains the substring AB only in one place. It is not hard to see that f6 is the only applicable rule now. Application of the rules f6, f7, f8 in a deterministic manner (i.e., each time there is no other rule that applies, and there is only one location in the current string that may be transformed) finally yields \(w_8=\alpha \beta \). This series of rule applications, yielding \(w_8\) from \(w_0 = \alpha AB\beta \), corresponds to applying the rewriting rule \(AB \rightarrow \lambda \) of G. By induction, the claim \(L(\varPi )\subseteq L(G)\) follows.

Theorem 1 now entails \(\mathrm {SSC}_{2,1}\mathrm {ID}(1,0,1;1,1,1) =\mathrm {RE}.\)    \(\square \)

In the previous theorem, the insertion had one-sided context and deletion had both the left and right contexts. In this case computational completeness was achieved with degree (2, 1). If we further wish to have one-sided context for deletion as well, then computational completeness is achieved with increasing the degree to (3, 1). These are the first \(\mathrm {RE}\) results ever for degree (3, 1).

Theorem 5

\(\mathrm {SSC}_{3,1}\mathrm {ID}(1,1,0;1,1,0) = \mathrm {SSC}_{3,1}\mathrm {ID}(1,0,1;1,0,1) = \mathrm {RE}.\)

Proof

The proof is very similar to the previous one. We will first show that \(\mathrm {SSC}_{3,1}\mathrm {ID}(1,1,0;1,1,0) = \mathrm {RE}.\) The second part then follows from Theorem 1.

Consider a type-0 grammar \(G=(N,T,P,S)\) in SGNF. The rules of P are labelled uniquely by numbers \([1 \ldots |P|]\). We construct an SSCID system \(\varPi = (V,T,\{S\},R)\) of degree (3, 1) and ID size (1, 1, 0; 1, 1, 0) as follows such that \(L(\varPi ) = L(G)\). The alphabet of \(\varPi \) is \(V \subset N \cup T \cup \mathcal {M}''\). The set of rules R of \(\varPi \) is given as follows: (i) For every rule of type \(p: X \rightarrow bY\) in G, the simulating rules are stated in Fig. 2(a). (ii) For every rule of type \(q: X \rightarrow Yb\) in G, the simulating rules are stated in Fig. 2(b). (iii) Rules of type \(f: AB \rightarrow \lambda \) in G are simulated as stated in Fig. 4(a). The idea behind the construction of f rules is very similar to the working of the rules in Fig. 3 and hence omitted. However we now highlight the difference in the two simulations (stated in Figs. 3 and 4(a)). Rules f1, f2, f3, f7, f8 in both the simulations are the same. If rules f4, f5, f6 of the former simulation deletes a symbol say \(\alpha \) between the contexts \(c_1\) and \(c_2\) using the deletion rule \((c_1,\alpha ,c_2)_{del}\), then the same is taken care by the rules f4, f5, f6 (respectively) of the latter simulation by their permitting string \(c_1\alpha c_2\). Some formal arguments are presented below.

: The intended derivation is the same as the one given in Theorem 4. This already shows that \(L(G) \subseteq L(\varPi )\) by induction.

To show the reverse inclusion, we consider a sentential form \(w_0 = \alpha AB \beta \) in \(\varPi \), where \(A,B \in N''\) and \(\alpha , \beta \in (N'' \cup T)^*\). As in the proof of Theorem 4, we end up applying f1, f2, f3, in this order, to arrive at with \(Af',Bf''\in sub (w_3)\). The only way to continue is to apply rules f4 or f5 (in any order), with f4 guaranteeing that \(fAf'\in sub (w_3)\) and with f5 guaranteeing that \(f'Bf''\in sub (w_3)\). Altogether, if f4 and f5 could have been applied, then \(w_3=\alpha fAf'Bf''\beta \), as there is only one position in \(w_0\) where the substring AB could occur. Now, \(w_3\Rightarrow _{f4}w_4\Rightarrow _{f5}w_5=\alpha ff'f''\beta \), and the same result is obtained when first applying f5 and then f4. A simple case analysis shows that only f6 is applicable now, yielding \(w_6=\alpha ff''\beta \). From this point on, the argument continues again as in Theorem 3.    \(\square \)

Theorem 6

\(\mathrm {SSC}_{3,1}\mathrm {ID}(1,1,0;1,0,1) = \mathrm {SSC}_{3,1}\mathrm {ID}(1,0,1;1,1,0) = \mathrm {RE}.\)

Proof

Along with the simulations presented in Figs. 2(a) and (b), we present a simulation of f rule in Fig. 4(b) which is a reflection of the simulation stated in Fig. 4(a) in order to prove that \(\mathrm {SSC}_{3,1}\mathrm {ID}(1,1,0;1,0,1)=\mathrm {RE}\) and hence we are not giving a formal proof. The claim \(\mathrm {SSC}_{3,1}\mathrm {ID}(1,0,1;1,1,0)=\mathrm {RE}\) again follows with Theorem 1.    \(\square \)

Fig. 4.
figure 4

Simulation of the rule \(f:AB \rightarrow \lambda \)

4 Conclusion and Future Work

In this paper, we introduced the mechanism of simple semi-conditional restrictions on the application of rules of ins-del systems. We described recursively enumerable languages with simple semi-conditional ins-del systems of degrees (2, 1) and (3, 1), as shown in Table 3, ignoring symmetric results obtainable from Theorem 1. We list below some most challenging problems in this area.

  • While Ivanov and Verlan could prove that semi-conditional ins-del systems of degree (2, 2) and ID size (1, 0, 0; 1, 0, 0) are computationally complete, it is open if simple semi-conditional ins-del systems of degree (2, 2) and ID size (1, 0, 0; 1, 0, 0) characterize \(\mathrm {RE}\).

  • Again, Ivanov and Verlan could prove that semi-conditional ins-del systems of degree (1, 1) and ID size (2, 0, 0; 1, 1, 0) are computationally complete, but even with degree (2, 1), it is unclear whether simple semi-conditional ins-del systems of this size characterize \(\mathrm {RE}\).

  • With more limited resources, it seems to be difficult if not impossible to characterize \(\mathrm {RE}\). In such situations, it would be good to see if we can at least describe all context-free languages or nice sub-classes thereof, as attempted in similar situations in [1,2,3].

We also pose the following, a more general, open problem for further study: Given the degree (ij) satisfying \(i,j \ge 1\) and \(3 \le i+j \le 4\), with what sizes does a simple semi-conditional ins-del system characterize \(\mathrm {RE}\)?