
1 Introduction

Context-free (CF) grammars have been studied extensively, since they serve as formal models in many areas of computer science. One of their useful properties is that their membership problem can be solved efficiently. These grammars were invented by Noam Chomsky to describe the structure of sentences in natural languages. However, it turned out that certain natural languages contain phenomena, such as cross-serial dependencies, that cannot be handled by CF grammars (see e.g. [12]). The more powerful context-sensitive (CS) grammars are able to model cross-serial dependencies, but their membership problem is already PSPACE-complete. In [11] Joshi gave several properties that a grammar should have in order to model natural languages: the ability to handle limited cross-serial dependencies, the constant growth of the associated language, and the polynomial-time solvability of the membership problem. Grammars satisfying these properties are called mildly context-sensitive grammars.

One way to enrich CF grammars with context sensitivity and, in turn, to raise their generative power is to control their derivations by context conditions. For example, in random context grammars (RCG’s) [18] two sets of nonterminals, a permitting set P and a forbidding set Q, are associated with every context-free rule. A rule is then applicable if it is applicable in the context-free sense, every nonterminal in P occurs in the current sentential form, and no nonterminal in Q occurs in it. If in an RCG each rule is associated with an empty forbidding set (resp. permitting set), then the grammar is called a permitting (resp. forbidding) RCG.

Since RCG’s with erasing rules are as powerful as Turing machines, a restricted variant of them was recently introduced and investigated [15]. In these grammars the permitting and forbidding sets are associated with the nonterminals rather than with the rules. Moreover, one of these sets is always a singleton and the other one is empty. We will call these grammars restricted random context grammars (rRCG’s) in this paper. It turned out that even with this very limited ability to control the derivations these grammars are equivalent to random context grammars [2, 15]. Moreover, permitting rRCG’s are as powerful as permitting RCG’s [2], and the same holds for the forbidding variants if erasing rules are allowed [8].

In [16] a variant of random context grammars, called semi-conditional grammars (SCG’s), was introduced. In these grammars every rule r is associated with two words, a permitting word \(v_1\) and a forbidding word \(v_2\), and r is applicable only if \(v_1\) is a subword of the sentential form, but \(v_2\) is not. Moreover, an SCG G is of degree (i, j) if the length of its permitting words is at most i and that of its forbidding words is at most j. It was shown in [16] that SCG’s without erasing rules and with degree (1, 2) or (2, 1) are as powerful as CS grammars. This clearly means that these grammars are too powerful to meet all the conditions of mild context-sensitivity. On the other hand, it was also shown in [16] that SCG’s of degree (1, 1) cannot generate all languages in CS. The invention of SCG’s was motivated by the grammars of Kelemen [13], where only a permitting word is associated with each rule (we call these grammars permitting semi-conditional grammars in this paper). In [16] it remained an open question whether permitting SCG’s can generate all CS languages. In this paper we show that there is a CS language that cannot be generated by any permitting SCG if the use of erasing rules is not allowed. Some results concerning the grammars mentioned in this introduction are given in Fig. 1.

Fig. 1. A comparison of the power of some variants of grammars mentioned in the introduction. Arrows with solid lines represent strict inclusions, while arrows with dashed lines indicate inclusions which are not known to be strict. References to the presented equalities or strict inclusions are also given. Inclusions represented by dashed lines follow from definitions. \(\mathrm {SCG}_\lambda \), \(\mathrm {RCG}_\lambda \), and \(\mathrm {rRCG}_\lambda \) (resp. \(\mathrm {SCG}\), \(\mathrm {RCG}\), and \(\mathrm {rRCG}\)) denote the classes of the corresponding grammars with erasing rules (resp. with no erasing rules). For a class of grammars \(\mathrm {C}\), \(\mathcal{L}(\mathrm {C})\) denotes the class of languages generated by grammars in \(\mathrm {C}\), and \(\mathrm {pC}\) (resp. \(\mathrm {fC}\)) denotes that subclass of \(\mathrm {C}\), where only permitting (resp. forbidding) context conditions are used.

The proof of our result is based on a pumping lemma similar to the one presented in [6], where it was shown that the class of languages generated by permitting RCG’s with no erasing rules is strictly included in the class of languages generated by RCG’s. In the proof of the pumping lemma in [6] the following property was essential: sufficiently long derivations of a permitting RCG with no erasing rules always contain two sentential forms \(\alpha \) and \(\beta \) such that \(\beta \) is derived from \(\alpha \) and, for every nonterminal A, \(|\alpha |_A\le |\beta |_A\) (here \(|\alpha |_A\) and \(|\beta |_A\) denote the number of occurrences of A in \(\alpha \) and \(\beta \), respectively). This property follows from Dickson’s lemma [4] which states that any infinite sequence \(v_1,v_2,\ldots \) of n-vectors over the natural numbers contains an infinite sub-sequence \(v_{i_1}\le v_{i_2}\le \ldots \), where \(\le \) is the componentwise ordering of n-vectors.

In the proof of our pumping lemma (Lemma 2) we need to find sentential forms \(\alpha \) and \(\beta \) in a derivation of a permitting SCG G that satisfy a stronger condition: if u is a permitting word of G, then \(\beta \) should contain at least as many occurrences of u as \(\alpha \) does. To do so we will use Higman’s lemma [10], which ensures that any infinite sequence \(v_1,v_2,\ldots \) of words contains an infinite subsequence \(v_{i_1}\le _s v_{i_2}\le _s \ldots \), where \(\le _s\) is the subsequence (or scattered subword) relation. However, to find appropriate \(\alpha \) and \(\beta \) we cannot apply Higman’s lemma directly to the sentential forms of a derivation, but rather to certain carefully defined words obtained from these sentential forms.

The paper is organized as follows. First, we introduce the necessary notions and notations. Then, in Sect. 3 we present the main result of the paper. Finally, we give some concluding remarks in Sect. 4. Due to space reasons, some proofs are omitted in the paper. The interested reader can find them in [9].

2 Preliminaries

We define here the necessary notions; however, we assume that the reader is familiar with the basic concepts of the theory of formal languages. For a comprehensive guide we refer to [17]. An alphabet \(\varSigma \) is a finite, nonempty set of symbols whose elements are also called letters. Words over \(\varSigma \) are finite sequences of letters in \(\varSigma \). As usual, \(\varSigma ^*\) denotes the set of all words over \(\varSigma \) including the empty word \(\varepsilon \). For a letter \(a\in \varSigma \) and a word \(u\in \varSigma ^*\), |u| denotes the length of u and \(|u|_a\) is the number of occurrences of a in u. \(\mathbb {N}\) denotes the set of natural numbers. For \(n,m\in \mathbb {N}\), \(n<m\), [n, m] denotes the set \(\{n,n+1,\ldots ,m\}\). If \(n=1\), then [n, m] is denoted by [m]. The set of positions in u (\(\mathrm {pos}(u)\) for short) is [|u|].

Let \(u\in \varSigma ^*\). A word v is a scattered subword of u, if v can be obtained from u by erasing some (possibly zero) letters. Moreover, v is a subword of u if there are words \(u_1,u_2\in \varSigma ^*\) such that \(u=u_1vu_2\). Let \(i\in \mathrm {pos}(u)\) and \(m\ge 1\) be such that \(i+m-1\in \mathrm {pos}(u)\). Then \(\mathrm {subw}(u,i,m)\) denotes that subword of u which starts on the ith position and has length m. It will always be clear from the context whether we consider an arbitrary subword of u or that one which starts on a certain position. Those subwords of u that have length m are also called m-subwords. The subsequence relation \(\le _s\) over \(\varSigma ^*\) is a binary relation defined as follows. For \(u,v\in \varSigma ^*\), \(u\le _s v\), if u is a scattered subword of v. Let \(f:[k]\rightarrow [l]\) \((k,l\ge 1)\) be a (partial) function. The domain and range of f, denoted by dom(f) and ran(f), respectively, are defined as follows: \(dom(f)=\{i\in [k]\mid \exists j\in [l]: f(i)=j\}\) and \(ran(f)=\{i\in [l]\mid \exists j\in [k]: f(j)=i\}\). If \(I\subsetneq [k]\), then \(f|_I\) denotes the restriction of f to I. Let \(u,v\in \varSigma ^*\) and \(f:\mathrm {pos}(v)\rightarrow \mathrm {pos}(u)\) be a (partial) function. If, for every \(i\in dom(f)\), \(\mathrm {subw}(v,i,1)=\mathrm {subw}(u,f(i),1)\), then we call f letter-preserving. A well-quasi-ordering (wqo for short) on a set S is a reflexive, transitive binary relation \(\le \) such that any infinite sequence \(a_1,a_2,\ldots \) \((a_i\in S, i\ge 1)\) contains a pair \(a_j\le a_k\) with \(j<k\). The following result is due to [10] (see also [14]).
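To make these notions concrete, here is a minimal Python sketch; it assumes that words are encoded as Python strings with 1-based positions as in the text above, and the function names subw, is_scattered_subword, and higman_pair are ad hoc choices of this sketch.

def subw(u, i, m):
    # The subword of u of length m starting at position i (1-based).
    return u[i - 1:i - 1 + m]

def is_scattered_subword(v, u):
    # True iff v <=_s u, i.e. v can be obtained from u by erasing some letters.
    it = iter(u)
    return all(any(a == b for b in it) for a in v)

def higman_pair(words):
    # Return some (i, j) with i < j and words[i] <=_s words[j], or None if there is none.
    # By Proposition 1 below, such a pair exists whenever the sequence is long enough.
    for j in range(len(words)):
        for i in range(j):
            if is_scattered_subword(words[i], words[j]):
                return i, j
    return None

print(subw("AABBAABAC", 3, 3))                    # BBA
print(is_scattered_subword("ABC", "AABBAC"))      # True
print(higman_pair(["BA", "AC", "ABCA"]))          # (0, 2), since BA <=_s ABCA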

Proposition 1

Let \(\varSigma \) be an alphabet. Then \(\le _s\) is a wqo on \(\varSigma ^*\). Consequently, for every infinite sequence \(u_1,u_2,\ldots \) \((u_i\in \varSigma ^*, i\ge 1)\), there is an infinite subsequence \(u_{i_1}\le _s u_{i_2}\le _s\ldots \).

A semi-conditional grammar (SCG for short) is a 4-tuple \(G=(V,\varSigma , R, S)\), where V and \(\varSigma \) are alphabets of the nonterminal and terminal symbols, respectively (it is assumed that \(V\cap \varSigma =\emptyset \)), \(S\in V\) is the start symbol, and R is a finite set of production rules of the form \((A\rightarrow \alpha ,p,q)\), where \(A\in V, \alpha \in (V\cup \varSigma )^+\) (that is \(A\rightarrow \alpha \) is a usual non-erasing context-free rule), and \(p,q\in (V\cup \varSigma )^*\). For such a rule r, the words p and q are called the permitting and forbidding contexts of r, respectively. The right-hand side of r (denoted by \(\mathrm {rhs}(r)\)) is \(\alpha \). We will often denote \(V\cup \varSigma \) by \(V_G\). The derivation relation \(\Rightarrow _G\) of G is defined as follows. For every word \(u_1,u_2, \alpha \in V_G^*\) and \(A\in V\), \(u_1Au_2\Rightarrow _G u_1\alpha u_2\) if and only if there is a rule \((A\rightarrow \alpha , p,q)\in R\) such that (i) p is a subword of \(u_1Au_2\), and (ii) if \(q\not =\varepsilon \), then q is not a subword of \(u_1Au_2\). We will often write \(\Rightarrow \) instead of \(\Rightarrow _G\) when G is clear from the context. As usual, the reflexive, transitive closure of \(\Rightarrow \) is denoted by \(\Rightarrow ^*\) and the language generated by G is \(L(G)= \{u\in \varSigma ^*\mid S\Rightarrow ^* u\,\}\). A word \(\alpha \in V_G^*\) is called a sentential form of G (or just a sentential form if G is clear from the context). A derivation der from \(\alpha \) to \(\beta \) is a sequence \(\alpha _1\Rightarrow \alpha _2\Rightarrow \ldots \Rightarrow \alpha _{n+1}\) of sentential forms, for some \(n\ge 0\) such that \(\alpha _1=\alpha \) and \(\alpha _{n+1}=\beta \). The length of der (denoted by |der|) is n. Let \(\alpha \) and \(\beta \) be sentential forms and der a derivation from \(\alpha \) to \(\beta \). The sentential form vector of der (denoted by \(\mathrm {sfv}(der)\)) is \((u_1,\ldots ,u_k)\) \((k=|\alpha |, u_i\in V_G^*, i\in [k]\)), such that \(\beta =u_1\ldots u_k\) and, for every \(i\in [k]\), \(u_i\) is derived from \(\mathrm {subw}(\alpha ,i,1)\). Let \(der: \alpha _1\Rightarrow \ldots \Rightarrow \alpha _{n}\) \((n\ge 1)\) and \(der':\alpha _n\Rightarrow \ldots \Rightarrow \alpha _{k}\) \((k\ge n)\). Then \(der\,der'\) denotes the derivation \(\alpha _1\Rightarrow \ldots \Rightarrow \alpha _{k}\).

If, for every rule \((A\rightarrow \alpha ,p,q)\) in R, \(q=\varepsilon \), then G is a permitting SCG, or a pSCG for short. Let \(G=(V,\varSigma ,R,S)\) be a pSCG. The set of permitting contexts of G is \(pw(G)=\{p\mid (A\rightarrow \alpha ,p,q)\in R\}\) and \(\mathrm {max}_{pw(G)}=\mathrm {max}\{|u|\mid u\in pw(G)\}\). We denote by \(\mathcal{L}(\mathrm {pSCG})\) and \(\mathcal{L}(\mathrm {CS})\) the families of languages generated by pSCG’s and context-sensitive grammars, respectively.
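As a small illustration of the derivation relation, the following Python sketch performs one derivation step of an SCG. It assumes that every symbol is a single character, that a sentential form is a string, and that a rule \((A\rightarrow \alpha ,p,q)\) is encoded as a tuple (A, alpha, p, q); all names are ad hoc and the sketch is not meant as a complete implementation.

def applicable(rule, w):
    # Is (A -> alpha, p, q) applicable to the sentential form w?
    A, alpha, p, q = rule
    if A not in w:                 # the context-free part must be applicable
        return False
    if p != "" and p not in w:     # the permitting context p must be a subword of w
        return False
    if q != "" and q in w:         # a nonempty forbidding context q must not occur in w
        return False
    return True

def successors(rule, w):
    # All sentential forms obtained from w by one application of rule.
    A, alpha, p, q = rule
    if not applicable(rule, w):
        return []
    return [w[:i] + alpha + w[i + 1:] for i, X in enumerate(w) if X == A]

r = ("A", "AB", "B", "")           # a pSCG rule: (A -> AB, B, epsilon)
print(successors(r, "BACA"))       # ['BABCA', 'BACAB']
print(successors(r, "ACCA"))       # []  (the permitting context B is missing)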

3 The Main Result

Here we show that pSCG’s are strictly weaker than context-sensitive grammars by proving that the language \(L=\{a^{2^{2^n}}\mid n\ge 0\}\) cannot be generated by any pSCG (Theorem 4). The proof, roughly, consists of the following main steps. First we define the notion of m-embedding (Definition 1). Intuitively, a word \(\alpha \) can be m-embedded to a word \(\beta \) if there is an injective mapping of the m-subwords of \(\alpha \) to the m-subwords of \(\beta \) such that this mapping preserves the order of these words and satisfies certain additional conditions. Then we show that if a pSCG \(G=(V,\varSigma ,R,S)\) with \(m=\mathrm {max}_{pw(G)}\) has a derivation der from \(\alpha \) to \(\beta \) \((\alpha , \beta \in V_G^*)\) such that \(|\alpha |< |\beta |\) and \(\alpha \) can be m-embedded to \(\beta \), then der can be “pumped” so that the obtained derivation is a valid derivation of G (cf. Lemma 2, which we will often refer to as our pumping lemma). Finally, we show that sufficiently long derivations of G always contain sentential forms \(\alpha \) and \(\beta \) such that \(\alpha \) can be m-embedded to \(\beta \) (Lemma 3). To this end we will use the fact that \(\le _s\) is a wqo on \(V_G^*\).

Definition 1

Let \(\varSigma \) be an alphabet, \(\alpha ,\beta \in \varSigma ^*\), \(k=|\alpha |\), \(l=|\beta |\), and \(m\ge 1\). An m-embedding of \(\alpha \) to \(\beta \) is a strictly increasing function \(g:[k-m+1]\rightarrow [l]\) such that the following (partial) mapping \(f:\mathrm {pos}(\beta )\rightarrow \mathrm {pos}(\alpha )\) is letter-preserving and well defined: for every \(i\in [k-m+1]\) and \(\kappa \in [0,m-1]\), \(f(g(i)+\kappa )=i+\kappa \). If g is an m-embedding, then the above f is denoted by \(\mathrm {inv}_m(g)\). Moreover, if an m-embedding of \(\alpha \) to \(\beta \) exists, then we denote this by \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \).

Example 1

Here we give two examples to demonstrate the notion of m-embedding.

(1) Let \(\alpha =BAAB\) and \(\beta =BAAAB\). Any 3-subword of \(\alpha \) is a subword of \(\beta \), too. Due to the letter-preserving property, only the following g can be a 3-embedding of \(\alpha \) to \(\beta \): \(g(1)=1\) and \(g(2)=3\). The mapping \(f=\mathrm {inv}_m(g)\) is letter-preserving, but not well defined. Indeed, with \(i=1\) and \(\kappa =2\) we get \(f(g(i)+\kappa )=f(3)=3\), while with \(i=2\) and \(\kappa =0\), \(f(g(i)+\kappa )=f(3)=2\). This implies that there is no 3-embedding of \(\alpha \) to \(\beta \).

(2) Let \(\alpha =ABBAC\), \(\beta =AABBAABAC\) and g be the following strictly increasing function: \(g(1)=2, g(2)=3\), and \(g(3)=7\). Then the mapping \(f=\mathrm {inv}_m(g)\) is letter-preserving and well defined: \(f(i)=i-1\) \((i\in [2,5])\) and \(f(i)=i-4\) \((i\in [7,9])\). Thus g is a 3-embedding of \(\alpha \) to \(\beta \).
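The two items above can also be checked mechanically. The following Python sketch builds the mapping f from a candidate g and returns it exactly when g is an m-embedding; positions are 1-based and g is encoded as a dictionary, which is an assumption of this sketch.

def inv_and_check(alpha, beta, m, g):
    # Return inv_m(g) as a dict if g is an m-embedding of alpha to beta, else None.
    k = len(alpha)
    if sorted(g) != list(range(1, k - m + 2)):
        return None                                  # dom(g) must be [k-m+1]
    if any(g[i] >= g[i + 1] for i in range(1, k - m + 1)):
        return None                                  # g must be strictly increasing
    f = {}
    for i in range(1, k - m + 2):
        for kap in range(m):
            pos, val = g[i] + kap, i + kap
            if pos > len(beta) or beta[pos - 1] != alpha[val - 1]:
                return None                          # f would not be letter-preserving
            if pos in f and f[pos] != val:
                return None                          # f would not be well defined
            f[pos] = val
    return f

# Item (1): the only candidate g is not a 3-embedding of BAAB to BAAAB.
print(inv_and_check("BAAB", "BAAAB", 3, {1: 1, 2: 3}))            # None
# Item (2): g is a 3-embedding of ABBAC to AABBAABAC.
print(inv_and_check("ABBAC", "AABBAABAC", 3, {1: 2, 2: 3, 3: 7}))
# {2: 1, 3: 2, 4: 3, 5: 4, 7: 3, 8: 4, 9: 5}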

The following properties of m-embeddings will be useful in what follows.

Proposition 2

Let \(\varSigma \) be an alphabet, \(m\ge 1\), and \(\alpha ,\beta \in \varSigma ^*\). Assume that g is an m-embedding of \(\alpha \) to \(\beta \) and \(f=\mathrm {inv}_m(g)\). Then the following statements hold.

(i) For every \(i\in \mathrm {pos}(\alpha )\), \(|\{t\in \mathrm {pos}(\beta )\mid f(t)=i\}|\le m\).

(ii) If \(|\alpha |=|\beta |\), then \(\alpha =\beta \).

(iii) If \(i,j\in [|\alpha |-m+1]\) with \(j=i+1\), then \(g(j)-g(i)=1\) or \(g(j)-g(i)\ge m\).

Proof

See [9].

We will also need the following operation which inserts words into certain positions of a word. Let \(\varSigma \) be an alphabet and \(\alpha =X_1\ldots X_k\) \((k\ge 1, X_i\in \varSigma , i\in [k])\). Let moreover \(u_1,\ldots ,u_l\in \varSigma ^*\) and \(f:[k]\rightarrow [l]\) be a (partial) function. The substitution of \(\mathbf {u}=(u_1,\ldots , u_l)\) into \(\alpha \) by f (denoted by \(\mathrm {subst}(\mathbf {u},\alpha ,f)\)) is the word \(\beta =v_1\ldots v_{k}\), where \(v_i\) \((i\in [k])\) is defined as follows. If f(i) is defined, then let \(v_i=u_{f(i)}\), and let \(v_i=X_i\) otherwise.

Sometimes we will need to extend a function f used in a substitution. An extension of f (with respect to \(\alpha \)) is a function \(\hat{f}\) defined as follows. For every \(i\in dom(f)\), \(\hat{f}(i)=f(i)\), and for every \(i\in [k]-dom(f)\), \(\hat{f}\) is either undefined or defined as follows: if there is a \(j\in dom(f)\) such that \(\mathrm {subw}(\alpha ,i,1)=\mathrm {subw}(\alpha ,j,1)\), then take such a j and let \(\hat{f}(i)=f(j)\). Notice that f is always an extension of itself.

Example 2

Let \(\varSigma =\{A,B,C\}\), \(\alpha =ABCBB\) and \(\mathbf{u}=(u_1,u_2,u_3)\), where \(u_1=AA, u_2=ABC, u_3=CC\). Let furthermore \(f:[5]\rightarrow [3]\) be the following partial function: \(f(2)=f(3)=3\), \(f(5)=1\). Then \(\mathrm {subst}(\mathbf{u},\alpha ,f)=Au_3u_3Bu_1=ACCCCBAA\), and f has two possible extensions \(\hat{f}\) other than f itself: in both, \(\hat{f}(1)\) is undefined, \(\hat{f}(2)=\hat{f}(3)=3\), and \(\hat{f}(5)=1\), while \(\hat{f}(4)\) is either 1 or 3, resulting in \(\mathrm {subst}(\mathbf{u},\alpha ,\hat{f})\) being equal to \(AC^4A^4\) or \(AC^6A^2\), respectively.
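The following Python sketch recomputes this example; it encodes the partial function f as a dictionary on 1-based positions, and the names subst and extensions are ad hoc choices of this sketch.

from itertools import product

def subst(u, alpha, f):
    # Replace the i-th letter of alpha by u[f(i)] wherever f(i) is defined.
    return "".join(u[f[i] - 1] if i in f else alpha[i - 1] for i in range(1, len(alpha) + 1))

def extensions(f, alpha):
    # All extensions f-hat of f with respect to alpha (including f itself).
    choices = []
    for i in range(1, len(alpha) + 1):
        if i in f:
            choices.append([(i, f[i])])
        else:
            same = [f[j] for j in f if alpha[j - 1] == alpha[i - 1]]
            choices.append([(i, None)] + [(i, v) for v in sorted(set(same))])
    return [{i: v for i, v in combo if v is not None} for combo in product(*choices)]

u = ("AA", "ABC", "CC")
alpha, f = "ABCBB", {2: 3, 3: 3, 5: 1}
print(subst(u, alpha, f))                          # ACCCCBAA
for fh in extensions(f, alpha):
    print(fh, subst(u, alpha, fh))
# f itself gives ACCCCBAA; the two proper extensions give ACCCCAAAA and ACCCCCCAA.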

The following lemma will be crucial in the proof of our pumping lemma.

Lemma 1

Let \(G=(V,\varSigma ,R,S)\) be a pSCG, \(m=\mathrm {max}_{pw(G)}\), and \(\alpha , \alpha ', \beta \in V_G^*\). Assume that \(\alpha \Rightarrow _G^*\alpha '\) and \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \). Let der be a derivation from \(\alpha \) to \(\alpha '\), g an m-embedding of \(\alpha \) to \(\beta \), and \(f=\mathrm {inv}_m(g)\). Then, for every extension \(\hat{f}\) of f, \(\beta \Rightarrow _G^*\beta _{\hat{f}}\), where \(\beta _{\hat{f}}=\mathrm {subst}(\mathrm {sfv}(der),\beta ,\hat{f})\).

Proof

Let \(\beta '=\mathrm {subst}(\mathrm {sfv}(der),\beta ,f)\). We first show that \(\beta \Rightarrow _G^*\beta '\) by induction on \(n=|der|\). If \(n=0\), then one can see that \(\beta '=\beta \), and thus the statement trivially holds. Assume that it holds for n. We prove it for \(n+1\). In this case der can be written as \(der=der_1der_2\), where \(der_1\) is \(\alpha _0\Rightarrow _G\ldots \Rightarrow _G\alpha _n\), \(der_2\) is \(\alpha _n\Rightarrow _G\alpha _{n+1}\), \(\alpha _0=\alpha \), and \(\alpha _{n+1}=\alpha '\). Let \(\beta _n=\mathrm {subst}(\mathrm {sfv}(der_1),\beta ,f)\). By the induction hypothesis, there is a derivation \(der_1'\) from \(\beta \) to \(\beta _n\). Let \((u_1,\ldots ,u_k)=\mathrm {sfv}(der_1)\) (\(k=|\alpha |\)). Assume that G rewrites a nonterminal A during \(\alpha _n\Rightarrow _G\alpha _{n+1}\) using a rule \(r=(A\rightarrow \gamma ,p,\varepsilon )\) (see Fig. 2 for an example).

Let \(i\in [k]\) and \(\kappa \in \mathrm {pos}(u_i)\) be such that the rewritten A occurs on the \(\kappa \)th position of \(u_i\). Let \(i_1<i_2<\ldots <i_\xi \) be all the positions in \(\mathrm {pos}(\beta )\) with \(f(i_j)=i\) (\(j\in [\xi ]\)). Let \((v_1,\ldots ,v_l)=\mathrm {sfv}(der_1')\) (\(l=|\beta |\)). Then, for every \(j\in [\xi ]\), \(v_{i_j}=u_i\) and thus, for every such j, there is a position \(\kappa _{j}\in \mathrm {pos}(\beta _n)\) satisfying that \(\kappa _j\) corresponds to the \(\kappa \)th position in \(v_{i_j}\). Clearly \(\mathrm {subw}(\beta _n,\kappa _{j},1)=A\) and \(\beta '=\beta _{n+1}=\mathrm {subst}((\gamma ), \beta _n, h)\), where \(h:\mathrm {pos}(\beta _n)\rightarrow \{1\}\) is defined as follows: \(h(j)=1\) if \(j\in \{\kappa _{1},\ldots ,\kappa _{\xi }\}\), and it is undefined otherwise. Therefore, to prove \(\beta _n\Rightarrow _G^*\beta '\) it is enough to show that G can use r to rewrite each nonterminal A that occurs on a position \(\kappa _{j}\) \((j\in [\xi ])\) in \(\beta _n\).

Since G can apply r at the step \(\alpha _n\Rightarrow _G\alpha _{n+1}\), \(\alpha _n\) should contain the permitting context p. Then there are \(\mu \in [k]\) and \(\nu \in [0,m-1]\) such that p occurs in the subword \(u_\mu \ldots u_{\mu +\nu }\) of \(\alpha _n\) (notice that G has no erasing rules). Since g is an m-embedding of \(\alpha \) to \(\beta \), it is clear that \(v_{g(\mu )}\ldots v_{g(\mu )+\nu }=u_\mu \ldots u_{\mu +\nu }\). Thus, there is at most one index \(j\in [\xi ]\) such that \(v_{g(\mu )}\ldots v_{g(\mu )+\nu }\) contains that A which occurs on the \(\kappa _j\)th position in \(\beta _n\). If no such j exists, then G can rewrite all A’s occurring on positions \(\kappa _j\) \((j\in [\xi ])\) in \(\beta _n\), since the permitting context p is always present as a subword of \(v_{g(\mu )}\ldots v_{g(\mu )+\nu }\). Otherwise, let \(j\in [\xi ]\) be such that the subword \(v_{g(\mu )}\ldots v_{g(\mu )+\nu }\) includes that A which occurs on the \(\kappa _j\)th position of \(\beta _n\). Then G should first rewrite those A’s in \(\beta _n\) that occur on positions other than \(\kappa _j\) and, at the last step, that A which occurs on the \(\kappa _j\)th position. Therefore \(\beta _n\Rightarrow _G^*\beta _{n+1}=\beta '\), which implies that \(\beta \Rightarrow _G^*\beta '\).

To finish the proof of the lemma, consider a derivation \(der'\) from \(\beta \) to \(\beta '\). Looking at the inductive proof of \(\beta \Rightarrow ^*\beta '\), one can see that, for each derivation step in \(der'\), there is a \(j\in dom(f)\) such that \([j,j+m-1]\subseteq dom(f)\), and the necessary permitting context is in a sentential form derived from \(\mathrm {subw}(\beta ,j,m)\). In other words, those letters of \(\beta \) that occur on positions not included in dom(f) do not occur in the permitting contexts used during \(der'\). Assume that \(i\in \mathrm {pos}(\beta )-dom(f)\) is such that \(\hat{f}(i)=f(j)\) for some \(j\in dom(f)\). Let u be the f(j)th word in \(\mathrm {sfv}(der)\) and \(X=\mathrm {subw}(\beta ,j,1)\). Then G derives u during \(der'\) from this X. On the other hand, by the definition of \(\hat{f}\), \(\mathrm {subw}(\beta ,i,1)=X\). Thus, \(der'\) can be extended to a derivation where G, using the appropriate rules simultaneously, derives u also from that X which occurs on the ith position of \(\beta \). Continuing in this way, one can see that \(der'\) can be extended to a derivation of \(\beta _{\hat{f}}\) from \(\beta \), which completes the proof of the lemma.

Fig. 2. The inductive proof of Lemma 1 assuming \(m=3\).

To prove our pumping lemma we need the following preparation. Let \(g:\alpha \,{{\rightsquigarrow }_{m}}\,\beta \) be an m-embedding and let \(k=|\alpha |\). The mapping

$$\mathrm {cmp}_g(i)={\left\{ \begin{array}{ll}g(i)&{} \text { for }i\in [k-m+1]\\ g(k-m+1)+i-(k-m+1)&{} \text { for }i\in [k-m+2,k]\end{array}\right. }$$

is called the completion of g. Note that \(dom(\mathrm {cmp}_g)=pos(\alpha )\) and by the definition of an m-embedding \(\mathrm {cmp}_g\) is letter-preserving. If \(f=\mathrm {inv}_m(g)\) then \(f(\mathrm {cmp}_g(i))=i\) holds for \(i\in [k]\).

Proposition 3

Let \(\alpha \in \varSigma ^*, |\alpha |=k\), \(z_i\in \varSigma ^*, z_i\ne \varepsilon \) \((i\in [k])\) and \(\beta =z_1\cdots z_k\) with \(|\beta |=l\). Suppose that \(g:\alpha \,{{\rightsquigarrow }_{m}}\,\beta \) with \(f=\mathrm {inv}_m(g)\) and \(\bar{g}=\mathrm {cmp}_g\).

Let us introduce the notations \(x_i={\left\{ \begin{array}{ll}z_{f(i)}&{}\text { if }i\in dom(f)\\ \mathrm {subw}(\beta ,i,1)&{}\text { if }i\not \in dom(f) \end{array}\right. }\),

\(\zeta (i,r)=\sum _{j=1}^{i-1}|z_j|+r\) \((i\in [k],r\in [|z_i|])\) and \(\xi (i,r)=\sum _{j=1}^{i-1}|x_j|+r\) \((i\in [l], r\in [|x_i|])\). Then for the mapping \(g'(\zeta (i,r)):=\xi (\bar{g}(i),r)\) \((\zeta (i,r)\in [l-m+1])\),

$$\begin{aligned} {g':\beta \,{{\rightsquigarrow }_{m}}\, \mathrm {subst}({\mathbf u},\beta ,f)} \end{aligned}$$

holds, where \({{\mathbf u}=(z_1,\ldots ,z_k)}\).

Proof

See [9].

Lemma 2

Let \(G=(V,\varSigma ,R,S)\) be a pSCG and \(m=\mathrm {max}_{pw(G)}\). Suppose that \(\alpha \Rightarrow ^*\beta \), \(\beta \Rightarrow ^*\gamma \), \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \), and \(|\alpha |<|\beta |\) hold for some \(\alpha ,\beta \in V_G^*\) and \(\gamma \in \varSigma ^*\). Then there is a \(\gamma '\in \varSigma ^*\) such that (i) \(\alpha \Rightarrow ^*\gamma '\) and (ii) \(|\gamma |< |\gamma '|\le (m+1)|\gamma |.\)

Proof

Let \(k=|\alpha |\), \(l=|\beta |\), and g be an m-embedding of \(\alpha \) to \(\beta \). Let moreover \(f=\mathrm {inv}_m(g)\), and \(der'\), \(der''\) be any derivations from \(\alpha \) to \(\beta \) and from \(\beta \) to \(\gamma \), respectively. Applying Proposition 3 with these parameters and \(\varSigma =V_G\), \(\mathbf {u}=\mathrm {sfv}(der')\), we get \(g':\beta \,{{\rightsquigarrow }_{m}}\, \beta '\), where \(\beta '=\mathrm {subst}(\mathbf {u},\beta ,f)\). Let \(f'=\mathrm {inv}_m(g')\). By Lemma 1, \(\beta \Rightarrow ^* \beta '\).

Let \(\hat{f}'\) be the following function. For every \(\tau \in pos(\beta ')\), if \(\tau \in dom(f')\), then let \(\hat{f}'(\tau )=f'(\tau )\). Otherwise let \(\tau =\xi (i,r)\), for some \(i\in [l]\) and \(r\in [|x_i|]\), and we define \(\hat{f}'(\tau )\) as follows. If \(i\in dom(f)\), then let \(\hat{f}'(\tau )=\zeta (f(i),r)\), and let \(\hat{f}'(\tau )=i\), otherwise. Notice that \(\hat{f}'\) is a letter-preserving function from \(\beta '\) to \(\beta \). Indeed, if \(i\in dom(f)\), then \(x_i=z_{f(i)}\), and \(x_i=\mathrm {subw}(\beta ,i,1)\), otherwise. Let \(\tau \in pos(\beta ')-dom(f')\). Since \(g'\) is an m-embedding of \(\beta \) to \(\beta '\), there is a \(\tau '\in pos(\beta ')\) with \(f'(\tau ')=\hat{f}'(\tau )\). Then \(\mathrm {subw}(\beta ',\tau ,1)=\mathrm {subw}(\beta ,\hat{f}'(\tau ),1)=\mathrm {subw}(\beta , f'(\tau '),1)=\mathrm {subw}(\beta ',\tau ',1)\). Thus \(\hat{f}'\) is an extension of \(f'\).

Now let \(\gamma '=\mathrm {subst}((v_1,\ldots ,v_l),\beta ',\hat{f}')\), where \((v_1,\ldots ,v_l)=\mathrm {sfv}(der'')\). By Lemma 1, \(\beta '\Rightarrow _G^*\gamma '\). This, together with \(\alpha \Rightarrow _G^*\beta \) and \(\beta \Rightarrow _G^*\beta '\), implies \(\alpha \Rightarrow _G^*\gamma '\), i.e., Statement (i) of the lemma holds. Statement (ii) can be seen as follows. Since \(g'\) is an m-embedding of \(\beta \) to \(\beta '\), for every \(i\in [l]\), there is a \(\tau \in pos(\beta ')\) with \(f'(\tau )=i\). Thus, each \(v_i\) \((i\in [l])\) is substituted for a position in \(\beta '\) by \(f'\). Therefore \(|\gamma |=|v_1\ldots v_l|\le |\mathrm {subst}((v_1,\ldots ,v_l),\beta ',f')|\le |\mathrm {subst}((v_1,\ldots ,v_l),\beta ',\hat{f}')|=|\gamma '|\). Moreover, since \(|\alpha |<|\beta |\), there is an \(i\in [k]\) such that \(|z_i|\ge 2\). Let \(j\in [l]\) with \(f(j)=i\). Then \(|x_j|\ge 2\), so \(|\beta |=l<l+1\le \sum _{s=1}^l|x_s|=|\beta '|\). Since \(\hat{f}'\) is defined on all \(|\beta '|>l\) positions of \(\beta '\), every \(v_i\) occurs among the factors of \(\gamma '\), and each of these factors is nonempty, \(|\gamma |=|\gamma '|\) cannot hold; consequently \(|\gamma |<|\gamma '|\).

On the other hand, by (i) of Proposition 2, for every \(i\in [k]\), \(z_i\) is substituted for at most m different positions in \(\beta \) by f. Moreover, one can see that, for every \(i\in dom(f)\), \(\hat{f}'\) is an injective function from \([\xi (i,1),\xi (i,|x_i|)]\) to [l]. Furthermore, \(\hat{f}'\) is injective on the set \(\{\tau \mid \tau =\xi (i,1), i\in [l]-dom(f)\}\), too. Consequently, for every \(i\in [l]\), \(v_i\) is substituted for at most \(m+1\) different positions in \(\beta '\) by \(\hat{f}'\). Therefore, \(|\gamma '|\le (m+1)|\gamma |\) should hold finishing the proof of Statement (ii).

Next we demonstrate some of the constructions used in the previous proof.

Example 3

Let \(G=(V,\varSigma ,R,S)\) be a pSCG, \(\alpha =ABBA\), \(\beta =EABBFBACA\) \((A,B,C,E,F\in V\cup \varSigma )\), and \(m=2\). Let \(\gamma \in \varSigma ^*\), and assume that G has two derivations \(der'\) and \(der''\) from \(\alpha \) to \(\beta \) and from \(\beta \) to \(\gamma \), respectively. Clearly \(|\alpha |<|\beta |\) and \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \) with the following m-embedding g: \(g(1)=2\), \(g(2)=3\), and \(g(3)=6\). Then, according to Lemma 2, we can give a \(\gamma '\in \varSigma ^*\) with the following properties: (i) \(\alpha \Rightarrow ^*\gamma '\) and (ii) \(|\gamma |< |\gamma '|\le (m+1)|\gamma |.\)

Assume, for instance, that \(\mathrm {sfv}(der')=(z_1,z_2,z_3,z_4)\), where \(z_1=E\), \(z_2=AB\), \(z_3=BF\), and \(z_4=BACA\). Let \(f=\mathrm {inv}_m(g)\). Then \(\beta '=\mathrm {subst}(\mathrm {sfv}(der'),\beta ,f)=x_1\ldots x_9\), where \(x_1=E\), \(x_2=E\), \(x_3=AB\), \(x_4=BF\), \(x_5=F\), \(x_6=BF\), \(x_7=BACA\), \(x_8=C\), and \(x_9=A\). Now, if we define \(g'\) according to the proof of Lemma 2, then we get that \(dom(g')=[1,8]\), and \(g'(i)=i+1\) if \(i\in [1,3]\), and \(g'(i)=i+4\), otherwise. It is easy to verify that \(g'\) is an m-embedding of \(\beta \) to \(\beta '\). Let \(f'=\mathrm {inv}_m(g')\). Then \(dom(f')=[2,5]\cup [8,13]\). Let us define now \(\hat{f}'\) according to the proof of Lemma 2, that is, \(\hat{f}'(1)=1\), \(\hat{f}'(6)=\hat{f}'(7)=5\), \(\hat{f}'(14)=8\), and \(\hat{f}'(15)=9\). One can check that, for every \(\tau \in [1,15]-dom(f')\), \(\mathrm {subw}(\beta ',\tau ,1)=\mathrm {subw}(\beta ,\hat{f}'(\tau ),1)\). Assume that \(\mathrm {sfv}(der'')=(v_1,\ldots ,v_9)\), where \(v_i\in \varSigma ^*\) \((i\in \mathrm {pos}(\beta ))\). Then \(\gamma '=\mathrm {subst}(\mathrm {sfv}(der''),\beta ',\hat{f}')=v_1v_1\ldots v_5v_5\ldots v_9v_8v_9\). By (i) of Lemma 2, \(\alpha \Rightarrow ^*\gamma '\), and it is easy to check that \(\gamma '\) satisfies Statement (ii) too.
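The computations of this example can be replayed with the following Python sketch, which recomputes \(\beta '\), \(g'\), \(dom(f')\), and the values of \(\hat{f}'\) outside \(dom(f')\); the dictionary-based encoding, the 1-based positions, and all names are assumptions of the sketch.

def subst(u, alpha, f):
    # Replace the i-th letter of alpha by u[f(i)] wherever f(i) is defined.
    return "".join(u[f[i] - 1] if i in f else alpha[i - 1] for i in range(1, len(alpha) + 1))

alpha, beta, m = "ABBA", "EABBFBACA", 2
g = {1: 2, 2: 3, 3: 6}                                   # the 2-embedding of alpha to beta
f = {g[i] + kap: i + kap for i in g for kap in range(m)} # f = inv_2(g)
z = ("E", "AB", "BF", "BACA")                            # sfv(der'), so beta = z1 z2 z3 z4
k, l = len(alpha), len(beta)

x = [z[f[i] - 1] if i in f else beta[i - 1] for i in range(1, l + 1)]  # x_1, ..., x_9
beta_p = subst(z, beta, f)
print(beta_p)                                            # EEABBFFBFBACACA (= x_1 ... x_9)

cmp_g = {i: (g[i] if i <= k - m + 1 else g[k - m + 1] + i - (k - m + 1)) for i in range(1, k + 1)}
def zeta(i, r): return sum(len(z[j]) for j in range(i - 1)) + r
def xi(i, r): return sum(len(x[j]) for j in range(i - 1)) + r

g_p = {zeta(i, r): xi(cmp_g[i], r)
       for i in range(1, k + 1) for r in range(1, len(z[i - 1]) + 1)
       if zeta(i, r) <= l - m + 1}
print(g_p)                  # {1: 2, 2: 3, 3: 4, 4: 8, 5: 9, 6: 10, 7: 11, 8: 12}

f_p = {g_p[i] + kap: i + kap for i in g_p for kap in range(m)}
print(sorted(f_p))          # [2, 3, 4, 5, 8, 9, 10, 11, 12, 13], i.e. dom(f') = [2,5] and [8,13]

f_hat = dict(f_p)           # extend f' exactly as in the proof of Lemma 2
for i in range(1, l + 1):
    for r in range(1, len(x[i - 1]) + 1):
        tau = xi(i, r)
        if tau not in f_hat:
            f_hat[tau] = zeta(f[i], r) if i in f else i
print({t: f_hat[t] for t in (1, 6, 7, 14, 15)})          # {1: 1, 6: 5, 7: 5, 14: 8, 15: 9}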

The following proposition together with Lemma 3 will be used to show that sufficiently long derivations of pSCG’s always contain sentential forms \(\alpha \) and \(\beta \) satisfying the conditions of Lemma 2. The statement can be seen using Proposition 1 (see also the proof of Lemma 2 in [6]).

Proposition 4

Let \(\varSigma \) be an alphabet and \(n_1,n_2,\ldots \) an infinite sequence of numbers in \(\mathbb {N}\). Then there is \(M\in \mathbb {N}\) such that, for every sequence \(v_1,v_2,\ldots , v_n\) where \(n\ge M\) and \(v_i\in \varSigma ^*\) with \(|v_i|\le n_i\) \((i\in [n])\), there are numbers \(i<j\) in [M] satisfying \(v_i\le _s v_j\).

Let \(G=(V,\varSigma ,R,S)\) be a pSCG and \(m\ge 1\). We will apply the above result to appropriate derivations of G in order to find sentential forms \(\alpha \) and \(\beta \) satisfying \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \). However, Proposition 4 ensures only that we can find such \(\alpha \) and \(\beta \) which satisfy \(\alpha \le _s\beta \). Clearly, this does not imply \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \). Thus we will apply Proposition 4 not directly to the derivations of G but to sequences of words derived from these derivations. To this end we will use two functions \(\mathrm {wdo}\) and p defined below.

Let \(\varSigma \) be an alphabet and \(m\ge 1\). We denote by \(\varSigma ^{\le m}\) the set of all words in \(\varSigma ^*\) with length at most m. Since \(\varSigma ^{\le m}\) is a finite set, we will treat it as an alphabet. Now let \(\mathrm {wdo}:{\varSigma }^*\rightarrow (\varSigma ^{\le m})^*\) be defined as follows. Let \(u\in {\varSigma }^*\). If \(|u|< m\), then let \(\mathrm {wdo}(u)=u\) (that is, u on the right-hand side is considered as a letter in \(\varSigma ^{\le m}\)). If \(|u|\ge m\), then let \(\mathrm {wdo}(u)=\mathrm {subw}(u,1,m)\ldots \mathrm {subw}(u,|u|-m+1,m)\) (again, \(\mathrm {subw}(u,i,m)\) \((i\in [1,|u|-m+1])\) is considered as a letter in \(\varSigma ^{\le m}\)). The name \(\mathrm {wdo}\) comes from the word window, since for a word u, \(\mathrm {wdo}(u)\) is the word whose letters are obtained by moving a window of length m along u from left to right. The intuition behind the definition of \(\mathrm {wdo}\) is the following: if \(\mathrm {wdo}(\alpha )\le _s\mathrm {wdo}(\beta )\), then every m-subword of \(\alpha \) has to be an m-subword of \(\beta \) too. On the other hand, \(\mathrm {wdo}(\alpha )\le _s\mathrm {wdo}(\beta )\) still does not imply \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \) (see, for example, the first item in Example 1). Thus we will use the following function p before applying \(\mathrm {wdo}\) to the sentential forms of G. Let \(\varSigma \) be an alphabet. Then \(\hat{\varSigma }\) denotes the alphabet \(\{a^{(i)}\mid a\in \varSigma , i\in [0,m-1]\}\). Now let \(p:\varSigma ^*\rightarrow {\hat{\varSigma }}^*\) be defined as follows. For a word \(u=a_1\ldots a_k\in \varSigma ^*\) \((a_i\in \varSigma , i\in [k])\), \(p(u)=a_1^{(1 \mod m)}\ldots a_k^{(k \mod m)}\). Intuitively, p associates the number \(i\mod m\) with the ith letter of u (we put this number in parentheses in order not to confuse it with the usual notation of the iteration of a letter). We will see in the proof of the next lemma that for two sentential forms \(\alpha \) and \(\beta \) of G, \(\mathrm {wdo}(p(\alpha ))\le _s \mathrm {wdo}(p(\beta ))\) implies \(\alpha \,{{\rightsquigarrow }_{m}}\, \beta \).
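A small Python sketch of \(\mathrm {wdo}\) and p may help to see how they are used; here m is passed as an explicit parameter, letters of \(\varSigma ^{\le m}\) are encoded as Python strings or tuples, and letters of \(\hat{\varSigma }\) as pairs. These encodings and the names are assumptions of the sketch.

def p(u, m):
    # Attach the index i mod m to the i-th letter of u (1-based).
    return tuple((a, i % m) for i, a in enumerate(u, start=1))

def wdo(u, m):
    # The word of all length-m windows of u; if |u| < m, u itself is one letter.
    return (u,) if len(u) < m else tuple(u[i:i + m] for i in range(len(u) - m + 1))

def is_scattered_subword(v, u):
    it = iter(u)
    return all(any(a == b for b in it) for a in v)

alpha, beta, m = "BAAB", "BAAAB", 3
print(is_scattered_subword(wdo(alpha, m), wdo(beta, m)))              # True
print(is_scattered_subword(wdo(p(alpha, m), m), wdo(p(beta, m), m)))  # False
# wdo alone does not exclude a 3-embedding of BAAB to BAAAB (there is none, cf. the
# first item of Example 1), but wdo composed with p does, in line with the implication
# used in the proof of Lemma 3.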

Lemma 3

Let \(G=(V,\varSigma ,R,S)\) be a pSCG and \(m\ge 1\). Then there is \(M\in \mathbb {N}\) such that the following holds. For every derivation \(\alpha _0\Rightarrow \alpha _1\Rightarrow \ldots \Rightarrow \alpha _n\) of G with \(n\ge M\), there are \(i<j\) in [M] such that \(\alpha _i \,{{\rightsquigarrow }_{m}}\, \alpha _j\).

Proof

Let \(\rho =\mathrm {max}\{|\mathrm {rhs}(r)|\mid r\in R\}\) and consider the sequence \(n_1,n_2,\ldots \) where \(n_i=i\rho \) \((i\ge 1)\). Let moreover M be the number given in Proposition 4 and \(\alpha _0\Rightarrow \alpha _1\Rightarrow \ldots \Rightarrow \alpha _n\) be a derivation of G with \(n\ge M\). Clearly, \(|\mathrm {wdo}(p(\alpha _i))|\le n_i\), for every \(i\in [n]\). Then, by Proposition 4, there are numbers \(i<j\) in [M] such that \(\mathrm {wdo}(p(\alpha _i))\le _s \mathrm {wdo}(p(\alpha _j))\). We show that \(\alpha _i \,{{\rightsquigarrow }_{m}}\, \alpha _j\). To simplify the notation, let us denote \(\mathrm {wdo}(p(\alpha _i))\) and \(\mathrm {wdo}(p(\alpha _j))\) by u and v, respectively. If \(|u|=1\), then \(|\alpha _i|\le m\) and \(\alpha _i\) is a subword of \(\alpha _j\). In this case \(\alpha _i \,{{\rightsquigarrow }_{m}}\, \alpha _j\) trivially holds. Assume now that \(|u|\ge 2\), and let \(k=|u|\) and \(l=|\alpha _j|\). Since u is a scattered subword of v, there are \(i_1<\ldots <i_k\) in \(\mathrm {pos}(v)\) such that \(u=\mathrm {subw}(v,i_1,1)\ldots \mathrm {subw}(v,i_k,1)\). Then let \(g:[k]\rightarrow [l]\) be a strictly increasing function defined as \(g(\nu )=i_\nu \) \((\nu \in [k])\). Notice that \(k=|\alpha _i|-m+1\). Let moreover \(f:\mathrm {pos}(\alpha _j)\rightarrow \mathrm {pos}(\alpha _i)\) be a (partial) function defined as \(f(g(\nu )+\kappa )=\nu +\kappa \) \((\nu \in [k], \kappa \in [0,m-1])\). To see that g is an m-embedding of \(\alpha _i\) to \(\alpha _j\), it is enough to show that f is letter-preserving and well-defined.

Let \(\nu \in [k]\). Using the definition of \(\mathrm {wdo}\), we get that \(\mathrm {subw}(p(\alpha _i),\nu ,m)=\mathrm {subw}(p(\alpha _j),g(\nu ),m)\) and, in turn, \(\mathrm {subw}(\alpha _i,\nu ,m)=\mathrm {subw}(\alpha _j,g(\nu ),m)\). Thus f is letter-preserving. Now, let \(\nu \in [k-1]\). Using again the definition of \(\mathrm {wdo}\), we get that \(\mathrm {subw}(p(\alpha _i),\nu ,m)=\mathrm {subw}(p(\alpha _j),g(\nu ),m)\) and \(\mathrm {subw}(p(\alpha _i),\nu +1,m)=\mathrm {subw}(p(\alpha _j),g(\nu +1),m)\). Thus, the upper index added by p to the first letter of \(\mathrm {subw}(\alpha _i,\nu ,m)\) should match that of \(\mathrm {subw}(\alpha _j,g(\nu ),m)\). A similar observation holds for the words \(\mathrm {subw}(\alpha _i,\nu +1,m)\) and \(\mathrm {subw}(\alpha _j,g(\nu +1),m)\). This implies that either \(g(\nu +1)-g(\nu )=1\) or \(g(\nu +1)-g(\nu )\ge m\) should hold. It is easy to see that in both cases the definition of f is consistent. Therefore g is an m-embedding of \(\alpha _i\) to \(\alpha _j\).

Theorem 4

\(\mathcal{L}(\mathrm {pSCG})\subsetneq \mathcal{L}(\mathrm {CS})\).

Proof

By [16], \(\mathcal{L}(\mathrm {pSCG})\subseteq \mathcal{L}(\mathrm {CS})\). Thus, since \(L=\{a^{2^{2^n}}\mid n\ge 0\}\) is clearly included in \(\mathcal{L}(\mathrm {CS})\), it is enough to show that \(L\not \in \mathcal{L}(\mathrm {pSCG})\). Assume on the contrary that \(L\in \mathcal{L}(\mathrm {pSCG})\) and let G be a pSCG with \(L(G)=L\). Let moreover \(m=\mathrm {max}_{pw(G)}\). Since L is not a context-free language, we can assume that \(m\ge 1\). Then let M be the number given in Lemma 3, and \(u=a^{2^{2^{mMN}}}\), where \(N=\mathrm {max}\{|\mathrm {rhs}(r)|\mid r\in R\}\). Let moreover \(der:S=\alpha _0\Rightarrow \alpha _1\Rightarrow \ldots \Rightarrow \alpha _n=u\) be one of the shortest derivations of G from S to u. Clearly \(n\ge M\). Thus, by Lemma 3, there are \( i<j\) in [M] such that \(\alpha _i \,{{\rightsquigarrow }_{m}}\, \alpha _j\). We can assume that \(|\alpha _i|<|\alpha _j|\). Indeed, assume on the contrary that this is not the case. Then, since G has no erasing rules, \(|\alpha _i|=|\alpha _j|\). This, using (ii) of Proposition 2, implies that \(\alpha _i=\alpha _j\). This yields that \(der':\alpha _0\Rightarrow \ldots \Rightarrow \alpha _i\Rightarrow \alpha _{j+1}\Rightarrow \ldots \Rightarrow \alpha _n\) is also a derivation of G from S to u with \(|der'|<n\). However, this contradicts the assumption that der is a shortest derivation from S to u. Applying Lemma 2, we get that there is a \(u'\in \varSigma ^*\) such that \(\alpha _i\Rightarrow ^* u'\) and \(|u|< |u'|\le (m+1)|u|\). Since \(S\Rightarrow ^*\alpha _i\), also \(S\Rightarrow ^*u'\) holds. Consequently, \(u'\in L\).

Clearly, the shortest word \(v\in L\) with \(|u|<|v|\) is \(a^{2^{2^{mMN+1}}}\). On the other hand, since \(m+1<2^{2^{mMN}}\), we have \(|u'|\le (m+1)2^{2^{mMN}}<2^{2^{mMN}}\cdot 2^{2^{mMN}}=2^{2^{mMN+1}}=|v|\). Thus \(|u'|<|v|\), yielding \(u'\not \in L\), which is a contradiction. Therefore \(L\not \in \mathcal{L}(\mathrm {pSCG})\).

4 Conclusions

In this paper we have investigated permitting semi-conditional grammars, introduced by Kelemen [13]. We showed that these grammars are strictly weaker than context-sensitive grammars when erasing rules are not allowed. However, it is still open whether this remains true if erasing rules are allowed. In [19] it was shown that allowing erasing rules does not increase the generative power of permitting random context grammars. Deciding whether this also holds for permitting semi-conditional grammars is a possible topic for future work. It is also an interesting question whether, for example, the inclusion \(\mathcal{L}(\mathrm {pRCG})\subseteq \mathcal{L}(\mathrm {pSCG})\) depicted in Fig. 1 is strict.