Composition Closure of Linear Extended Top-down Tree Transducers

Engelfriet, Joost; Fülöp, Zoltán; Maletti, Andreas

doi:10.1007/s00224-015-9660-2


Published: 29 December 2015

Volume 60, pages 129–171 (2017)
Theory of Computing Systems

Abstract

Linear extended top-down tree transducers (or synchronous tree-substitution grammars) are popular formal models of tree transformations that are extensively used in syntax-based statistical machine translation. The expressive power of compositions of such transducers with and without regular look-ahead is investigated. In particular, the restrictions of ε-freeness, strictness, and nondeletion are considered. The composition hierarchy turns out to be finite for all ε-free (all rules consume input) variants of these transducers except for the nondeleting ε-free transducers. The least number of transducers needed for the full expressive power of arbitrary compositions is presented. In all remaining cases (incl. the nondeleting ε-free transducers) the composition hierarchy does not collapse.


1 Introduction

Top-down tree transducers are simple formal models that encode tree transformations (i.e., relations between trees). They were introduced in [23, 24] and intensively studied thereafter (see [14–16] for an overview). Roughly speaking, a top-down tree transducer processes the input tree symbol-by-symbol, and specifies in its rules how to translate an input symbol into an output tree fragment together with instructions on how to process the subtrees of the input symbol. This asymmetry between input and output (single symbol vs. tree fragment) was removed in extended top-down tree transducers (xt), which were introduced and studied in [1, 2]. In such a transducer the input side of a rule can now also contain a tree fragment, in which each variable can occur at most once as a placeholder for a subtree. In particular, the tree fragment can even be just a variable, which matches every input tree, and such rules are called ε-rules because they do not process any part of the input tree. In this contribution we only consider linear xt (l-xt), in which the output side of each rule contains each variable at most once as well. Restricted variants of l-xt are used in most approaches to syntax-based machine translation [18, 19].

We also add regular look-ahead [7] (i.e., the ability to check a regular property for the subtrees in an input tree fragment) to l-xt, so our most expressive model is the linear extended top-down tree transducer with regular look-ahead (l-xt^R). Contrary to most of the literature [7, 17] we present our models as synchronous grammars [4] because we sometimes use the auxiliary link structure in our proofs. Instead of variables in the input side and a state-variable combination in the output side of a rule, we use states directly, with the restriction that all states that occur in the output side must also occur in the input side. Moreover, each state that occurs in both sides must occur exactly once in the input side and exactly once in the output side, which corresponds to the classical linearity condition. In this way, for each rule the states establish links (a state links its occurrence in the output side with its occurrence in the input side), which form an injection from the state occurrences in the output side to the state occurrences in the input side. Regular look-ahead is specified only for the state occurrences (in the input side) that do not participate in the injection (i.e., those states that exclusively occur in the input side). A derivation of the grammar simultaneously generates an input tree and an output tree, which can contain states that are (possibly) linked by explicit links. A rule application expands two linked state occurrences at the same time, thus generating new input and output fragments with new (linked) state occurrences. Moreover, every unlinked state (in the input tree) is expanded into a tree from its regular look-ahead. Example 2 shows an l-xt^R, for which we illustrate a few derivation steps in Fig. 2. The tree transformation computed by the example l-xt^R is described in Example 8. In the following, we use l-XT^R and l-XT to denote the classes of all tree transformations computed by l-xt^R and l-xt, respectively.

The expressive power of the various subclasses of l- XT^R is already well understood [13, 17]. However, in practice complex systems are often specified with the help of compositions of tree transformations [22] because it is much easier to develop (or train) small components that manage a part of the overall transformation. Consequently, [19] and others declare that closure under composition is a very desirable property for classes of tree transformations (especially in the area of natural language processing). If a class $\mathcal {C}$ of tree transformations is closed under composition, then any composition chain τ ₁ ;⋯ ;τ _n of tree transformations τ ₁,…, τ _n of $\mathcal {C}$ can be replaced by a single tree transformation $\tau \in \mathcal {C}$. If $\mathcal {C}$ represents the class of all tree transformations computable by a device, then closure under composition means that we can replace any composition chain specified by several devices by just a single device, which enables an efficient modular development. Unfortunately, neither l- XT^R nor l-XT is closed under composition [2, 3, 17].

In general, for a class $\mathcal {C}$ of tree transformations (that contains the identity transformations) we obtain a composition hierarchy $\mathcal {C} \subseteq \mathcal {C}^{2} \subseteq \mathcal {C}^{3} \subseteq \dotsb $, where $\mathcal {C}^{n}$ denotes the class of n-fold compositions of transformations from $\mathcal {C}$. The class $\mathcal {C}$ might be closed under composition at power n (i.e., $\mathcal {C}^{n} = \mathcal {C}^{n+1}$) or its composition hierarchy might be infinite (i.e., $\mathcal {C}^{n} \subsetneq \mathcal {C}^{n+1}$ for all n). The former case yields that $\mathcal {C}^{n} = \mathcal {C}^{m}$ for all m ≥ n, which means that the composition hierarchy of $\mathcal {C}$ collapses at power n. In particular, $\mathcal {C}$ is closed under composition if its composition hierarchy collapses at power 1. We note that in practice (e.g., in statistical machine translation) the classes that are closed under composition at a small power are also important because for such classes we can limit the length of composition chains [22]. In this contribution, we investigate the composition hierarchy of the classes l- XT^R and l-XT together with their subclasses determined by any combination of the properties: ε-freeness, strictness, and nondeletion, which are abbreviated by ‘ ’, ‘s’, and ‘n’, respectively. Roughly speaking, ε-freeness requires that there are no ε-rules, strictness guarantees that the output side of each rule contains at least one output symbol, and nondeletion requires that for each rule exactly the same states occur in the input and output side. We use the property abbreviations in front of l- XT^R and l-XT to obtain the class of all tree transformations computable by such restricted l- xt^R and l-xt, respectively. For instance, denotes the class of all tree transformations computed by ε-free and strict l- xt^R.
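The collapse claim can be verified by a short induction on m. The following derivation is only a sketch; it assumes the recursive description $\mathcal {C}^{k+1} = \mathcal {C}^{k} \ ; \mathcal {C}$ of the powers and the hypothesis $\mathcal {C}^{n} = \mathcal {C}^{n+1}$. If $\mathcal {C}^{m} = \mathcal {C}^{n}$ for some m ≥ n, then

$$\mathcal{C}^{m+1} = \mathcal{C}^{m} \ ; \mathcal{C} = \mathcal{C}^{n} \ ; \mathcal{C} = \mathcal{C}^{n+1} = \mathcal{C}^{n}~, $$

so the equality propagates from m to m + 1, and the base case m = n is trivial.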

It is known that none of our considered classes is closed under composition [3, Section 3.4]. In addition, it is known [3, Theorem 6.2] that the class is closed at power 2. We complete the picture as follows. For each of the remaining classes, we either provide the least power at which the class is closed under composition or show that the composition hierarchy of the class is infinite (denoted by ∞). Our results (together with the mentioned existing result) are presented in Table 1.

Table 1 Characterization of the composition hierarchies


Our contribution is organized as follows. Section 2 recalls the necessary concepts and introduces our notation. We continue in Section 3 with the formal introduction of our main model (l- xt^R) including its syntax and semantics and the restrictions that we consider later. In addition, we recall some known equalities between certain fundamental classes of tree transformations in preparation for our first main results. In Section 4 we give a power at which the classes , , , and of tree transformations are closed under composition (see Table 1). This is completed in Section 5, where we conclude that these powers are minimal. In Section 6 we prove that the composition hierarchy of each of the remaining classes is infinite. Finally, we present the Hasse diagram of all the ε-free classes in Section 7.

2 Preliminaries

We denote the set of all nonnegative integers by $\mathbb {N}$. In the following, let S be a set. The power set of S is the set $\mathcal P(S) = \{S^{\prime } \mid S^{\prime } \subseteq S\}$ of all subsets of S. For an element s of S, we identify the singleton set {s} with s, whenever convenient; this should not lead to confusion. The cardinality of S is denoted by |S|. The set of all words (finite sequences) over S is $S^{\ast } = \bigcup _{n \in \mathbb {N}} S^{n}$, where S ⁰ = {ε} contains only the empty word ε. The length of a word w ∈ S ^∗ is the unique $n \in \mathbb {N}$ such that w ∈ S ⁿ. We write |w| for the length of w. The concatenation of two words v, w ∈ S ^∗ is denoted by v.w or simply vw.

For sets S and T, every subset of S×T is a relation from S to T. Given relations R ₁ ⊆ S×T and R ₂ ⊆ T×U, the inverse of R ₁ is the relation $R_{1}^{-1} = \{ (t, s) \mid (s, t) \in R_{1}\}$, the domain of R ₁ is

$$\text{dom}(R_{1}) = \{s \in S \mid \exists t \in T : (s, t) \in R_{1}\}~, $$

and the composition of R ₁ and R ₂ is the relation

$$R_{1} \ ; R_{2} =\left\{ (s, u) \mid \exists t \in T: (s, t) \in R_{1},\, (t, u) \in R_{2} \right\} \subseteq S \times U~. $$

Given a relation R⊆S×S, the powers of R are defined by R ⁰={(s, s)∣s ∈ S} and R ⁿ⁺¹ = R ⁿ ;R for $n \in \mathbb {N}$. The reflexive and transitive closure of R is $R^{\ast } = \bigcup _{n \in \mathbb {N}} R^{n}$. These notions and notations are lifted to classes $\mathcal {C}_{1}$ and $\mathcal {C}_{2}$ of relations in the usual manner. Namely, we let $\mathcal {C}_{1}^{-1} = \{ R_{1}^{-1}\mid R_{1} \in \mathcal {C}_{1}\}$ and $\mathcal {C}_{1} \ ; \mathcal {C}_{2} = \{R_{1} \ ; R_{2} \mid R_{1} \in \mathcal {C}_{1},~R_{2} \in \mathcal {C}_{2}\}$. Moreover, the powers of a class $\mathcal {C}$ are defined by $\mathcal {C}^{1} = \mathcal {C}$ and $\mathcal {C}^{n+1} =\mathcal {C}^{n} \ ; \mathcal {C}$ for n ≥ 1. Note that we do not consider the 0-th power for classes. The composition hierarchy [resp. composition closure] of $\mathcal {C}$ is the family $(\mathcal {C}^{n} \mid n \geq 1)$ [resp. the class $\bigcup _{n \geq 1} \mathcal {C}^{n}$]. The classes $\mathcal {C}$ of tree transformations that we will discuss always contain the identity relations. For such a class, $\mathcal {C}^{n} \subseteq \mathcal {C}^{n+1}$ for all n ≥ 1. If $\mathcal {C}^{n}=\mathcal {C}^{n+1}$, then $\mathcal {C}$ is closed under composition at power n. For n = 1 we shorten this to just $\mathcal {C}$ is closed under composition. If $\mathcal {C}$ is closed under composition at power n, then $\mathcal {C}^{n}$ is the composition closure of $\mathcal {C}$.
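The operations on relations can be tried out on finite sets of pairs. The following Python sketch is illustrative only: the function names, the explicit `universe` parameter (standing for the set S), and the example relation `succ` are assumptions that do not appear in the text.

```python
def compose(r1, r2):
    """R1 ; R2: pairs (s, u) with an intermediate t, (s, t) in R1 and (t, u) in R2."""
    return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}


def inverse(r):
    """The inverse relation, obtained by swapping the components of each pair."""
    return {(t, s) for (s, t) in r}


def domain(r):
    """All first components of pairs in r."""
    return {s for (s, _) in r}


def power(r, n, universe):
    """R^0 is the identity on `universe`; R^(n+1) = R^n ; R."""
    result = {(s, s) for s in universe}
    for _ in range(n):
        result = compose(result, r)
    return result


def star(r, universe):
    """Reflexive and transitive closure as the union of all powers.

    The loop stops once no new pair is added, which happens after finitely
    many rounds because `universe` is assumed to be finite.
    """
    closure = {(s, s) for s in universe}
    while True:
        extended = closure | compose(closure, r)
        if extended == closure:
            return closure
        closure = extended


# Hypothetical example: the successor relation on {0, 1, 2, 3}.
universe = {0, 1, 2, 3}
succ = {(0, 1), (1, 2), (2, 3)}
succ_twice = compose(succ, succ)
succ_star = star(succ, universe)
```

Here `succ_twice` relates each number to its second successor, and `succ_star` is the usual order ≤ restricted to the four numbers.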

An alphabet Σ is a nonempty and finite set, of which the elements are called symbols. The alphabet Σ is ranked if there additionally is a mapping rk$ : {\Sigma } \to \mathbb {N}$ that assigns a rank to each symbol. We let

$${\Sigma}_{k}= \{\sigma \in {\Sigma} \mid \textnormal{rk}(\sigma) = k\} $$

for every $k \in \mathbb {N}$. Often the mapping ‘rk’ is obvious from the context, so we typically denote ranked alphabets by Σ alone. If it is not obvious, then we use the notation σ ^(k) to indicate that the symbol σ has rank k. For the rest of this paper, Σ, Δ, and Γ will denote arbitrary ranked alphabets if not specified otherwise.

For every set T, let ${\Sigma }(T) = \{\sigma ({t_{1}, \dotsc , t_{k}} ) \mid k \in \mathbb {N},~ \sigma \in {\Sigma }_{k},~ t_{1}, \dotsc , t_{k} \in T\}$. Instead of σ() with σ ∈Σ₀ we will simply write σ. Let S be a set of “states” with S∩Σ = ∅, to be used as additional leaf labels. The set T _Σ(S) of Σ-trees with states in S is the smallest set U such that S⊆U and Σ(U)⊆U. We write T _Σ for T _Σ(∅), and any subset of T _Σ(S) is a tree language. Given a unary symbol γ ∈Σ₁ and a tree t ∈ T _Σ(S), we write γ ^k(t) for the tree γ(⋯γ(t)⋯ ), in which γ occurs k times on top of t.

The set $\text {pos}(t) \subseteq \mathbb {N}^{\ast }$ of positions of t ∈ T _Σ(S) is inductively defined by pos(s) = {ε} for every s ∈ S and

$$\text{pos} \left( \sigma({t_{1}, \dotsc, t_{k}} ) \right) = \left\{\varepsilon \right\} \cup \bigcup\limits_{i = 1}^{k} \left\{iw \mid w \in \text{pos}(t_{i}) \right\} $$

for every $k \in \mathbb {N}$, σ ∈Σ_k, and t ₁,…, t _k∈ T _Σ(S). The positions of t are partially ordered by the prefix order ≼ on $\mathbb {N}^{\ast }$; i.e., for words $w_{1}, w_{2} \in \mathbb {N}^{\ast }$, we have w ₁≼w ₂ if and only if there exists $w^{\prime }_{1} \in \mathbb {N}^{\ast }$ such that $w_{1}w^{\prime }_{1} = w_{2}$. As usual we write w ₁≺w ₂ if w ₁ is a proper prefix of w ₂; i.e., w ₁≼w ₂ and w ₁≠w ₂. For words $w_{1}, w_{2} \in \mathbb {N}^{\ast }$, we denote the longest common prefix of w ₁ and w ₂ by lcp (w ₁, w ₂). Note that lcp (w ₁, w ₂) ∈pos(t) for all w ₁, w ₂∈pos(t) because pos(t) is prefix-closed. The size |t| of a tree t ∈ T _Σ(S) is |pos(t)|; i.e., the number of its positions. Its height ht(t) is max{|w|∣w ∈pos(t)}; i.e., the maximal length of its positions. Let t, u ∈ T _Σ(S) and w ∈pos(t). The label of t at w is t(w), the subtree of t rooted at w is t|_w, and the tree that is obtained from t by replacing the subtree t|_w at w by u is denoted by t[w←u]. Formally, s(ε) = s|_ε = s and s[ε←u] = u for every s ∈ S, and for all $k \in \mathbb {N}$, σ ∈Σ_k, and t ₁,…, t _k∈ T _Σ(S) we have

(i) if w = ε, then
$$\begin{array}{@{}rcl@{}} \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)(w) &=& \sigma \\ \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)|_{w} &=& \sigma({t_{1}, \dotsc, t_{k}}) \\ \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)[w \gets u] &=& u \end{array} $$
(ii) if w = iv with 1 ≤ i ≤ k and v ∈pos(t _i), then
$$\begin{array}{@{}rcl@{}} \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)(w) &=& t_{i}(v) \\ \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)|_{w} &=& t_{i}|_{v} \\ \left( \sigma({t_{1}, \dotsc, t_{k}}) \right)[w \gets u] &=& \sigma({t_{1}, \dotsc, t_{i-1}} , t_{i}[v \gets u], {t_{i+1}, \dotsc, t_{k}} ). \end{array} $$
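The inductive definitions of positions, labels, subtrees, and replacement translate almost literally into recursive functions. The sketch below rests on an encoding that is an assumption of this illustration and not part of the text: a state is a plain string, a tree σ(t₁,…,t_k) is the tuple `(symbol, t1, ..., tk)`, and a position is a tuple of 1-based child indices with `()` playing the role of ε. All positions passed to the functions are assumed to be valid.

```python
def pos(t):
    """The set of positions of t."""
    if isinstance(t, str):                      # a state is a leaf
        return {()}
    return {()} | {(i,) + w
                   for i, child in enumerate(t[1:], start=1)
                   for w in pos(child)}


def label(t, w):
    """t(w): the label of t at position w."""
    if not w:
        return t if isinstance(t, str) else t[0]
    return label(t[w[0]], w[1:])                # child i is stored at index i


def subtree(t, w):
    """t|_w: the subtree of t rooted at position w."""
    return t if not w else subtree(t[w[0]], w[1:])


def replace(t, w, u):
    """t[w <- u]: replace the subtree of t at position w by u."""
    if not w:
        return u
    i = w[0]
    return t[:i] + (replace(t[i], w[1:], u),) + t[i + 1:]


def size(t):
    """Number of positions of t."""
    return len(pos(t))


def height(t):
    """Maximal length of a position of t."""
    return max(len(w) for w in pos(t))


# Hypothetical example tree sigma(gamma(q), alpha) with the state q.
t = ("sigma", ("gamma", "q"), ("alpha",))
t_filled = replace(t, (1, 1), ("alpha",))
```

Since trees are immutable tuples, `replace` returns a new tree and leaves `t` unchanged.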

For 1 ≤ i ≤ rk(t(w)), the tree t|_{wi} is the i-th direct subtree below w in t. For every subset Δ⊆Σ∪S, we let pos_Δ(t) = {w ∈pos(t)∣t(w)∈Δ}. A tree t ∈ T _Σ(S) is linear (resp. nondeleting) in a subset Q⊆S of states if |pos_q(t)|≤1 (resp. |pos_q(t)|≥1) for every q ∈ Q. Moreover,

$$\text{states}(t) = \{s \in S \mid \text{pos}_{s}(t) \neq \emptyset\} $$

is the set of states that occur in t. For every selection W⊆pos_S(t) of leaves and mapping $\theta : W \to \mathcal P(T_{\Sigma }(S))$ assigning a tree language to each selected leaf, we define the tree language

$$\begin{array}{@{}rcl@{}} &&\quad t\left[w \leftarrow \theta(w) \mid w \in W \right] \\ &&= \left\{ t[w_{1} \leftarrow u_{1}] \dotsm [w_{n} \leftarrow u_{n}] \mid u_{1} \in \theta(w_{1}), \dotsc, u_{n} \in\theta(w_{n}) \right\} \subseteq T_{\Sigma}(S)~, \end{array} $$

where W = {w ₁,…, w _n}. Similarly, given a selection Q⊆S of states and a mapping $\theta : Q \to \mathcal P(T_{\Sigma }(S))$ assigning a tree language to each selected state, we define the tree language

$$t\left[q \leftarrow \theta(q) \mid q \in Q\right] = t \left[w \leftarrow \theta^{\prime}(w) \mid w \in \text{pos}_{Q}(t) \right]~, $$

where $\theta ^{\prime } : \text {pos}_{Q}(t) \to \mathcal P(T_{\Sigma }(S))$ is given by θ ^′(w) = θ(t(w)) for all w ∈pos_Q(t). The latter operation is also called OI-substitution [10] of θ in t. To simplify the notation, we fix the set X = {x ₁, x ₂, x ₃,…} of variables, which we assume to be disjoint with all ranked alphabets considered in the paper. For every $k \in \mathbb {N}$, we let X _k = {x _i∣1 ≤ i ≤ k}. Given t ∈ T _Σ(X) and θ:X _k→T _Σ(X), we simply write t[θ(x ₁),…, θ(x _k)] for t[x←θ(x)∣x ∈ X _k].
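The state-based substitution replaces every selected state occurrence independently, which is what distinguishes OI-substitution. The following sketch makes this visible for finite tree languages; the tuple encoding of trees (states as strings, σ(t₁,…,t_k) as `(symbol, t1, ..., tk)`) and all function names are assumptions of this illustration.

```python
from itertools import product


def pos_of(t, labels, prefix=()):
    """pos_Delta(t): the positions of t whose label belongs to `labels`."""
    if isinstance(t, str):
        return [prefix] if t in labels else []
    found = [prefix] if t[0] in labels else []
    for i, child in enumerate(t[1:], start=1):
        found += pos_of(child, labels, prefix + (i,))
    return found


def states(t, S):
    """The states of S that occur in t."""
    return {s for s in S if pos_of(t, {s})}


def is_linear(t, Q):
    """Every state of Q occurs at most once in t."""
    return all(len(pos_of(t, {q})) <= 1 for q in Q)


def is_nondeleting(t, Q):
    """Every state of Q occurs at least once in t."""
    return all(len(pos_of(t, {q})) >= 1 for q in Q)


def substitute(t, theta):
    """t[q <- theta(q) | q in Q] for a dict theta of finite tree languages.

    Each occurrence of a state picks its replacement independently, and
    inserted trees are not processed again.
    """
    if isinstance(t, str):
        return set(theta[t]) if t in theta else {t}
    child_langs = [substitute(child, theta) for child in t[1:]]
    return {(t[0],) + combo for combo in product(*child_langs)}


# Hypothetical example: the state q occurs twice and may become a or b.
t = ("sigma", "q", "q")
theta = {"q": {("a",), ("b",)}}
language = substitute(t, theta)
```

With two occurrences of `q` and two candidate trees, `language` contains four trees, including the mixed ones such as σ(a, b).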

A tree homomorphism from Σ to Δ is a mapping φ:Σ→T _Δ(X) such that φ(σ)∈ T _Δ(X _k) for every $k \in \mathbb {N}$ and σ ∈Σ_k. It is

linear (resp. nondeleting) if for every $k \in \mathbb {N}$ and σ ∈Σ_k the tree φ(σ) is linear (resp. nondeleting) in X _k, and
strict (resp. delabeling) if φ(σ)∉X (resp. φ(σ)∈ X∪Δ(X)) for every σ ∈Σ.

We abbreviate the above restrictions by ‘l’, ‘n’, ‘s’, and ‘d’, respectively. The tree homomorphism φ induces a mapping φ ^∗:T _Σ→T _Δ defined inductively by φ ^∗(σ(t ₁,…, t _k)) = φ(σ)[φ ^∗(t ₁),…, φ ^∗(t _k)] for all $k \in \mathbb {N}$, σ ∈Σ_k, and t ₁,…, t _k∈ T _Σ. As usual, we will from now on denote the induced mapping φ ^∗ by φ, and we will also call it a tree homomorphism. We denote by H the class of all tree homomorphisms, and for any combination w of ‘l’, ‘n’, ‘s’, and ‘d’ we denote by w-H the class of all tree homomorphisms of type w. For instance, snl-H is the class of all strict, nondeleting and linear tree homomorphisms.
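The induced mapping φ^∗ and the restrictions ‘l’, ‘n’, and ‘s’ can be sketched in a few lines. In this illustration, which is not taken from the text, a tree homomorphism is a Python dict from symbols to trees, variables are the strings `"x1"`, `"x2"`, …, and the ranks are supplied in a separate dict; the concrete homomorphism `phi` is a made-up example.

```python
def instantiate(pattern, args):
    """pattern[args[0], ..., args[k-1]]: replace variable x_i by args[i-1]."""
    if isinstance(pattern, str):                 # variables are "x1", "x2", ...
        return args[int(pattern[1:]) - 1]
    return (pattern[0],) + tuple(instantiate(c, args) for c in pattern[1:])


def apply_hom(phi, t):
    """phi*(sigma(t1,...,tk)) = phi(sigma)[phi*(t1), ..., phi*(tk)]."""
    images = [apply_hom(phi, child) for child in t[1:]]
    return instantiate(phi[t[0]], images)


def occurrences(pattern, x):
    """Number of occurrences of the variable x in pattern."""
    if isinstance(pattern, str):
        return 1 if pattern == x else 0
    return sum(occurrences(c, x) for c in pattern[1:])


def is_linear(phi, rank):
    return all(occurrences(phi[s], f"x{i}") <= 1
               for s in phi for i in range(1, rank[s] + 1))


def is_nondeleting(phi, rank):
    return all(occurrences(phi[s], f"x{i}") >= 1
               for s in phi for i in range(1, rank[s] + 1))


def is_strict(phi):
    """No symbol is mapped to a bare variable."""
    return all(not isinstance(phi[s], str) for s in phi)


# Hypothetical homomorphism: swap the children of sigma and erase gamma.
rank = {"sigma": 2, "gamma": 1, "alpha": 0, "beta": 0}
phi = {"sigma": ("delta", "x2", "x1"), "gamma": "x1",
       "alpha": ("a",), "beta": ("b",)}
image = apply_hom(phi, ("sigma", ("gamma", ("alpha",)), ("beta",)))
```

The example homomorphism is linear and nondeleting but not strict, because φ(γ) = x₁ is a variable.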

In the following, we need the class of regular tree languages [15, 16] and some basic properties of that class. The set Reg(Σ) contains all regular tree languages T⊆T _Σ over the ranked alphabet Σ. A well-known folklore result states that t[s ← θ(s) ∣ s ∈ S] ∈Reg(Σ) for every finite S, tree t ∈ T _Σ(S), and θ : S→Reg(Σ).

A bimorphism is a triple B = (ψ, T, φ) consisting of a regular tree language T ∈Reg(Γ), an input tree homomorphism ψ:T _Γ→T _Σ, and an output tree homomorphism φ:T _Γ→T _Δ. The tree transformation τ(B)⊆T _Σ×T _Δ computed by the bimorphism B is the relation τ(B) = {(ψ(t), φ(t))∣t ∈ T}, which will also be called a bimorphism. Given two combinations v and w of restrictions for tree homomorphisms, we let $\mathcal {B}(v, w)$ denote the class of all tree transformations computed by bimorphisms B = (ψ, T, φ) such that ψ and φ are tree homomorphisms of type v and w, respectively.
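To make the definition of τ(B) concrete, the sketch below evaluates a bimorphism on a finite sample of its center language. Several things here are assumptions of the illustration: the encoding of trees as tuples, the variable names `"x1"`, `"x2"`, …, the use of a finite set in place of a regular tree language, and the particular homomorphisms `psi` and `phi`.

```python
def instantiate(pattern, args):
    """Replace variable x_i in pattern by args[i-1]."""
    if isinstance(pattern, str):
        return args[int(pattern[1:]) - 1]
    return (pattern[0],) + tuple(instantiate(c, args) for c in pattern[1:])


def apply_hom(hom, t):
    """Apply the tree homomorphism given by the dict `hom` to the tree t."""
    return instantiate(hom[t[0]], [apply_hom(hom, c) for c in t[1:]])


def bimorphism(psi, center, phi):
    """tau(B) = {(psi(t), phi(t)) | t in center} for a finite sample `center`."""
    return {(apply_hom(psi, t), apply_hom(phi, t)) for t in center}


# Hypothetical center trees c^k(e) for k = 0, 1, 2 over {c^(1), e^(0)}.
E = ("e",)
center = {E, ("c", E), ("c", ("c", E))}

psi = {"c": ("gamma", "x1"), "e": ("alpha",)}   # strict: keeps every c as gamma
phi = {"c": "x1", "e": ("alpha",)}              # not strict: erases every c

tau = bimorphism(psi, center, phi)
```

The resulting relation maps each tree γ^k(α) to α, so the input homomorphism is strict while the output homomorphism is not.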

3 Linear Extended Top-down Tree Transducers

Our main model is the linear extended top-down tree transducer [1, 2, 18, 19] with regular look-ahead (l- xt^R), which is based on the classical linear top-down tree transducer without [23, 24] and with regular look-ahead [7]. We will present it as a synchronous grammar [4] because we will use an auxiliary structure, called the links, in later proofs. In synchronous grammars, occurrences of equal states in the left- and right-hand side of a rule (representing the input and output side, respectively) are (implicitly) linked and these links are made explicit in a derivation. Each derivation step replaces such a pair of linked state occurrences (at the same time) by the left- and right-hand side of a rule for that state. In a rule of an l- xt^R, the (implicit) links form an injection from the state occurrences in the right-hand side to the state occurrences in the left-hand side. Thus, some states might exclusively occur in the left-hand side. Such states can be used to implement regular look-ahead, which restricts the subtrees that are acceptable at these occurrences. It should be clear (see [17, Theorem 4.4]) that there is no need to have regular look-ahead for the other states in the left-hand side, as that can be incorporated into the (nondeterministic) state behavior of the transducer.

Definition 1 ([17, Section 2.2])

A linear extended top-down tree transducer with regular look-ahead (l-xt^R) is a tuple M = (Q,Σ,Δ, Q ₀, R, c), where

Q is a finite set of states and Q ₀ ⊆ Q is a set of initial states,
Σ and Δ are ranked alphabets of input and output symbols that are both disjoint with Q,
R⊆T _Σ(Q)×Q×T _Δ(Q) is a finite set of rules such that for every (ℓ, q, r)∈ R
- states(r)⊆states(ℓ), i.e., all states that occur in r must occur in ℓ, and
- ℓ and r are linear in states(r),
c:Q ^la→Reg(Σ) is a mapping that assigns regular look-ahead to each (potentially) deleted state, where $Q^{\text {la}} = \bigcup _{(\ell , q, r) \in R} \left ( \text {states}(\ell ) \setminus \text {states}(r) \right )$. Formally, the set Q ^la depends on R (or M), but we prefer the simpler notation and hope that it does not lead to confusion.

For a rule (ℓ, q, r)∈ R we say that ℓ and r are its left- and right-hand side. In contrast to other definitions [13, 17], we do not allow the same state to occur several times in the right-hand side. However, with the help of a simple renaming, each traditional linear extended top-down tree transducer can be written in our slightly more restrictive format. Next, we recall some important syntactic properties of our model. To this end, let M = (Q,Σ,Δ, Q ₀, R, c) be an l-xt^R in the following. It is

a linear extended top-down tree transducer (without look-ahead) [l-xt], if c(q) = T _Σ for every q ∈ Q ^la,
a linear top-down tree transducer with regular look-ahead [l-t^R] if ℓ ∈Σ(Q) for every (ℓ, q, r)∈ R,
a linear top-down tree transducer (without look-ahead) [l-t] if it is both an l-xt and an l-t^R,
ε-free (resp. strict) if ℓ∉Q (resp. r∉Q) for every (ℓ, q, r)∈ R,
delabeling if ℓ ∈Σ(Q) and r ∈ Q∪Δ(Q) for every (ℓ, q, r)∈ R,
nondeleting if states(r) = states(ℓ) for every (ℓ, q, r)∈ R (i.e., Q ^la = ∅), and
a finite-state relabeling [qr] if every rule of R is of the form
$$\left( \sigma({q_{1}, \dotsc, q_{k}} ), q, \delta({q_{1}, \dotsc, q_{k}}) \right) $$
with $k \in \mathbb {N}$, σ ∈Σ_k, δ ∈Δ_k, and q, q ₁,…, q _k∈ Q.

Since the look-ahead component c is trivial for all l-xt, we simply omit it from their representation. We note that every nondeleting l-xt^R is an l-xt. Moreover, all l-t^R are automatically ε-free. Note also that every qr [finite-state relabeling] is a strict nondeleting delabeling l-t. For clarity, we sometimes write rules as $\ell \overset {q}{\longrightarrow } r$ instead of (ℓ, q, r) and, to simplify the notation in examples and illustrations, we write $\ell \overset {q_{1}, \dotsc , q_{k}}{\longrightarrow } r$ as a shorthand for the k rules $\ell \overset {q_{1}}{\longrightarrow } r, \dotsc , \ell \overset {q_{k}}{\longrightarrow } r$. Note that for every $(\ell \overset {q}{\longrightarrow } r) \in R$ the trees ℓ and r are linear in states(r). Hence for every state p ∈states(r) the sets pos_p(ℓ) and pos_p(r) are singletons that we identify with their unique element.

Example 2

Let us consider the l-xt^R M ₁=(Q,Σ,Δ, Q ₀, R, c) given by

Q = {⋆, p, q, q ^la,id,id^′} and Q ₀={⋆},
${\Sigma } = \{ \sigma ^{(2)}, \gamma _{1}^{(1)}, \gamma _{2}^{(1)} \} \cup {\Delta }$ and ${\Delta } = \{ \sigma _{1}^{(2)}, \sigma _{2}^{(2)}, \gamma ^{(1)}, \alpha ^{(0)}\}$,
R consists of the following rules
$$\begin{array}{lll} \sigma_{1}(p, q) \overset{\star, p}{\longrightarrow} \sigma_{1}(p, q) & \sigma_{2}(\text{id}, \text{id}^{\prime}) \overset{p, q}{\longrightarrow} \sigma_{2}(\text{id}, \text{id}^{\prime}) & \gamma_{1}(p) \overset{p}{\longrightarrow} p \\ \sigma(q, q^{\text{la}}) \overset{q}{\longrightarrow} q & \sigma(q^{\text{la}}, q) \overset{q}{\longrightarrow} q & \gamma_{2}(q) \overset{q}{\longrightarrow} q \\ \gamma(\text{id}) \overset{\text{id}, \text{id}^{\prime}}{\longrightarrow} \gamma(\text{id}) & \alpha \overset{\text{id}, \text{id}^{\prime}}{\longrightarrow} \alpha & \end{array} $$
c:Q ^la→Reg(Σ) is given by $c(q^{\text {la}}) = T_{\{\gamma _{2}, \alpha \}}$ because Q ^la = {q ^la}.

Obviously, c(q ^la) is a regular tree language. Additionally, we note that the state id^′ is essentially just a renaming of the state id (and both realize the identity on T _{{γ, α}}). The l-xt^R M ₁ is an ε-free, delabeling, linear top-down tree transducer with regular look-ahead. It is neither strict (the right-hand side of the rule γ ₁(p) → p is a state) nor nondeleting (the state q ^la occurs only in left-hand sides).
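The syntactic properties claimed for M ₁ can be checked mechanically from its rule set. The sketch below encodes the rules of Example 2 under assumptions of this illustration only: states are strings (`"star"` for ⋆, `"qla"` for q^la, `"id'"` for id′), trees are tuples `(symbol, child, ...)` with ASCII symbol names, and the helper `expand` unfolds the multi-state shorthand into individual rules.

```python
def states_of(t):
    """The set of states occurring in the tree t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(states_of(child) for child in t[1:]))


def in_sigma_of_q(t):
    """t is a single symbol whose direct subtrees are all states."""
    return isinstance(t, tuple) and all(isinstance(c, str) for c in t[1:])


def expand(lhs, qs, rhs):
    """Unfold the shorthand with several states into one rule per state."""
    return [(lhs, q, rhs) for q in qs]


RULES = (
    expand(("sigma1", "p", "q"), ["star", "p"], ("sigma1", "p", "q"))
    + expand(("sigma2", "id", "id'"), ["p", "q"], ("sigma2", "id", "id'"))
    + expand(("gamma1", "p"), ["p"], "p")
    + expand(("sigma", "q", "qla"), ["q"], "q")
    + expand(("sigma", "qla", "q"), ["q"], "q")
    + expand(("gamma2", "q"), ["q"], "q")
    + expand(("gamma", "id"), ["id", "id'"], ("gamma", "id"))
    + expand(("alpha",), ["id", "id'"], ("alpha",))
)

# Well-formedness from Definition 1: states(r) is a subset of states(l).
well_formed = all(states_of(r) <= states_of(l) for l, _, r in RULES)

eps_free = all(not isinstance(l, str) for l, _, _ in RULES)
strict = all(not isinstance(r, str) for _, _, r in RULES)
nondeleting = all(states_of(l) == states_of(r) for l, _, r in RULES)
delabeling = all(in_sigma_of_q(l) and (isinstance(r, str) or in_sigma_of_q(r))
                 for l, _, r in RULES)

# Q^la: the states that some rule has on its left-hand side only.
q_la = set().union(*(states_of(l) - states_of(r) for l, _, r in RULES))
```

The computed values agree with the properties stated for M ₁, and `q_la` contains exactly the look-ahead state.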

Next, we recall the semantics of the l-xt^R M = (Q,Σ,Δ, Q ₀, R, c), which is (mostly) given by synchronous substitution. Formally, a link is just an element $(v, w) \in \mathbb {N}^{\ast } \times \mathbb {N}^{\ast }$. While the links in a rule are implicit and established due to occurrences of equal states, we need an explicit representation of the links in the sentential forms computed by M. These links together with the trees into which they point will form a dependency that is used in proofs later on. Our derivation relation is thus defined over structures consisting of an input tree, an output tree, and a set of links relating positions of those trees. Let us formalize this notion, which we call form.

Definition 3 ([12, Section 3])

A form (over Q, Σ, and Δ) is a triple 〈ξ, L, ζ〉 consisting of an input tree ξ ∈ T _Σ(Q), an output tree ζ ∈ T _Δ(Q), and a set L⊆pos(ξ)×pos(ζ) of links relating positions in the two trees.

Next, we formalize the links in a rule ρ ∈ R. These links are added to the links of a form whenever the rule ρ is applied in the derivation process. Since these links are relative to the positions at which the rule is applied, two parameters $v, w \in \mathbb {N}^{\ast }$ indicate those two positions.

Definition 4

Let $(\ell \overset {q}{\longrightarrow } r) \in R$ and $v, w \in \mathbb {N}^{\ast }$. The set of links of $\ell \overset {q}{\longrightarrow } r$ for the positions v and w is

$$\text{links}_{v, w}(\ell \overset{q}{\longrightarrow} r) = \left\{\left( v.\text{pos}_{p}(\ell),~ w.\text{pos}_{p}(r) \right) \mid p \in \text{states}(r) \right\}~. $$

Example 5

Let us compute two such sets of links. Whenever it is clear that the relevant positions are in {1,…,9}^∗, we write positions without separating dots; e.g., 211 stands for the position 2.1.1 of length 3.

$$\begin{array}{@{}rcl@{}} \text{links}_{1, 21} \left( \sigma_{1}(p, q) \overset{\star}{\longrightarrow} \sigma_{1}(p, q) \right) &=& \left\{(11, 211),~(12, 212)\right\}\\ \text{links}_{1, 21} \left( \sigma(q^{\text{la}}, q) \overset{q}{\longrightarrow} q \right) &=& \left\{(12, 21) \right\} \end{array} $$

We use grayed splines to indicate links in illustrations. The rules ρ ₁ and ρ ₂ above and their links, which are those of links_{ε, ε}(ρ ₁)={(1,1),(2,2)} and links_{ε, ε}(ρ ₂)={(2, ε)}, are displayed in Fig. 1.
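The link sets of Example 5 can be recomputed with a small function. As before, the encoding is an assumption of this sketch: states are strings, trees are tuples `(symbol, child, ...)`, and positions are tuples of 1-based indices, so that the position 21 becomes `(2, 1)`.

```python
def state_positions(t, prefix=()):
    """Map every state occurring in t to the list of its positions."""
    if isinstance(t, str):
        return {t: [prefix]}
    result = {}
    for i, child in enumerate(t[1:], start=1):
        for state, positions in state_positions(child, prefix + (i,)).items():
            result.setdefault(state, []).extend(positions)
    return result


def links(v, w, lhs, rhs):
    """links_{v,w} of the rule with sides lhs and rhs.

    Only the states of rhs contribute; both sides are linear in these states,
    so taking the first (and only) position of each is safe.
    """
    left, right = state_positions(lhs), state_positions(rhs)
    return {(v + left[p][0], w + right[p][0]) for p in right}


rho1 = (("sigma1", "p", "q"), ("sigma1", "p", "q"))
rho2 = (("sigma", "qla", "q"), "q")

links_rho1 = links((1,), (2, 1), *rho1)
links_rho2 = links((1,), (2, 1), *rho2)
```

Calling `links((), (), *rho1)` and `links((), (), *rho2)` yields the relative links {(1,1),(2,2)} and {(2, ε)} mentioned for Fig. 1.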

The derivation process is started with a simple form 〈q ₀,{(ε, ε)}, q ₀〉 consisting of an initial state q ₀∈ Q ₀ as input and output tree and the trivial link relating both occurrences of q ₀ (i.e., the roots of the trees). The current form can evolve in two ways. Either (i) we apply a rule (ℓ, q, r)∈ R to a pair (v, w) of linked occurrences of the state q or (ii) we apply the look-ahead. In the former case, such a rule application replaces the linked occurrences of q in the input and output tree by the left- and right-hand side of the rule to obtain the new input and output trees, respectively. The links of the derived form are obtained by adding the links of the rule (ℓ, q, r) for v and w to the current links. Since we are interested in the links used during the derivation, we preserve all links [in particular also the link (v, w) just used] and never remove a link. In the latter case, in which we want to apply the look-ahead, we require an occurrence of a state q at position v of the input tree that does not take part in any link with an occurrence of q in the output tree. It turns out that such a state q must be in Q ^la, and we can replace that occurrence of q by any tree of the regular look-ahead tree language c(q). Note that such replacements are independent, so different occurrences of q can be replaced by different look-ahead trees of c(q). We can (potentially) continue these replacements until the form is an element of $T_{\Sigma } \times \mathcal P(\mathbb {N}^{\ast } \times \mathbb {N}^{\ast }) \times T_{\Delta }$.

Definition 6 ([12, Section 3])

Given two forms 〈ξ, L, ζ〉 and 〈ξ ^′, L ^′, ζ ^′〉 over Q, Σ, and Δ, we write 〈ξ, L, ζ〉⇒_M〈ξ ^′, L ^′, ζ ^′〉 if one of the following two conditions holds:

there exist a rule $(\ell \overset {q}{\longrightarrow } r) \in R$ and a link (v, w)∈ L∩(pos_q(ξ)×pos_q(ζ)) such that
$$\xi^{\prime} = \xi[v \gets \ell] \qquad \zeta^{\prime} = \zeta[w \gets r] \quad \text{and} \quad L^{\prime} = L \cup \text{links}_{v, w}(\ell\overset{q}{\longrightarrow} r), $$
there exist a state q ∈ Q ^la, a position v ∈pos_q(ξ) with w∉pos_q(ζ) for all links (v, w)∈ L, and a tree t ∈ c(q) such that
$$\xi^{\prime} = \xi[v \gets t]\qquad \zeta^{\prime} = \zeta \quad \text{and} \quad L^{\prime} = L ~. $$

A form 〈ξ, L, ζ〉 is a sentential form (of M) if $\left \langle q_{0}, \{(\varepsilon , \varepsilon )\}, q_{0} \right \rangle \Rightarrow ^{\ast }_{M} \langle \xi , L, \zeta \rangle $ holds for some q ₀∈ Q ₀. The set of all sentential forms is denoted by $\mathcal {S}\mathcal {F}(M)$.

A few derivation steps using the l-xt^R M ₁ of Example 2 are illustrated in Fig. 2. Next, we define the tree transformation computed by an l-xt^R.

Definition 7

The l-xt^R M computes the set $\mathcal {D}(M)$ of dependencies, which are the sentential forms with state-free input and output trees. Hence

$$\mathcal{D}(M) = \left\{ \langle t, L, u \rangle \in \mathcal{S}\mathcal{F}(M) \mid t \in T_{\Sigma},~ u \in T_{\Delta} \right\} ~. $$

Moreover, it computes the tree transformation τ(M), which is given by

$$\tau(M) = \{(t, u) \mid \langle t, L, u\rangle \in \mathcal D(M) \text{ for some } L \subseteq \mathbb{N}^{\ast} \times \mathbb{N}^{\ast}\}~. $$

Two l-xt^R M ₁ and M ₂ are equivalent if τ(M ₁) = τ(M ₂).

Example 8

Let M ₁ be the l-xt^R of Example 2. Then

$$\left\langle \sigma_{1} \left( \gamma_{1}(\sigma_{2}(\alpha, \alpha)), \gamma_{2}(\sigma(\gamma_{2}(\alpha), \sigma_{2}(\alpha, \alpha))) \right),~L,~\sigma_{1} \left( \sigma_{2}(\alpha, \alpha), \sigma_{2}(\alpha, \alpha) \right) \right\rangle \in \mathcal{D}(M_{1}) $$

where

$$\begin{array}{@{}rcl@{}} L &=& \{(\varepsilon, \varepsilon),~ (1, 1),~ (11, 1),~(111, 11),~ (112, 12)\} \cup\\ && \{(2, 2),\, (21, 2),\, (212, 2),\, (2121, 21),\, (2122, 22)\}~, \end{array} $$

which corresponds to the final sentential form of the derivation displayed in Fig. 2. To describe the tree transformation computed by M ₁ in general, we first need some terminology. A tree t ∈ T _Σ is “special” if there exist a tree $c \in T_{\{\sigma , \gamma _{2}, \alpha \}}(X_{1})$ and two trees t ₁, t ₂∈ T _{{γ, α}} such that (i) t = c[σ ₂(t ₁, t ₂)], (ii) c is linear and nondeleting in X ₁, and (iii) for all w ∈pos(c) we have c(w) = σ only if $w \prec \text {pos}_{x_{1}}(c)$. For such a special tree, the subtree σ ₂(t ₁, t ₂) is the “anchor” of t. Furthermore, the “left spine” of a tree t ∈ T _Σ is the set pos(t)∩{1}^∗ of positions. For every i ∈{1,2} and position v on the left spine, if t(v) = σ _i, then the subtree t|_v2 is a “ σ _i-rib” of t.

The domain of τ(M ₁) consists of all trees t ∈ T _Σ such that (i) the sequence of labels of (the positions on) the left spine of t (from root to leaf) is in σ ₁{σ ₁, γ ₁}^∗ σ ₂ γ ^∗ α, (ii) each σ ₁-rib of t is special, and (iii) the unique σ ₂-rib of t is in T _{{γ, α}}. Such a tree t is only related to u in the transformation τ(M ₁), where u is obtained from t by (i) removing all γ ₁-symbols on the left spine and (ii) replacing each σ ₁-rib by its anchor. Consequently, τ(M ₁) is actually a partial function.
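The dependency of Example 8 can be rebuilt step by step from the derivation relation of Definition 6. The sketch below is one possible order of rule applications and not necessarily the order shown in Fig. 2. The encoding (states as strings, trees as tuples, positions as index tuples), the function names, and the ASCII state names `"star"`, `"qla"`, `"id'"` are assumptions of this illustration, and membership of the look-ahead tree in c(q^la) is left to the caller.

```python
def label(t, w):
    if not w:
        return t if isinstance(t, str) else t[0]
    return label(t[w[0]], w[1:])


def replace(t, w, u):
    if not w:
        return u
    i = w[0]
    return t[:i] + (replace(t[i], w[1:], u),) + t[i + 1:]


def state_pos(t, prefix=()):
    """Map each state of t to its position (unique for linked states)."""
    if isinstance(t, str):
        return {t: prefix}
    found = {}
    for i, child in enumerate(t[1:], start=1):
        found.update(state_pos(child, prefix + (i,)))
    return found


def apply_rule(form, lhs, q, rhs, v, w):
    """First case of Definition 6: expand the linked occurrences of q at (v, w)."""
    xi, links, zeta = form
    assert (v, w) in links and label(xi, v) == q and label(zeta, w) == q
    left, right = state_pos(lhs), state_pos(rhs)
    new_links = {(v + left[p], w + right[p]) for p in right}
    return (replace(xi, v, lhs), links | new_links, replace(zeta, w, rhs))


def apply_lookahead(form, q, v, tree):
    """Second case: replace an unlinked occurrence of q at v by `tree`."""
    xi, links, zeta = form
    assert label(xi, v) == q
    assert all(label(zeta, w) != q for (v2, w) in links if v2 == v)
    return (replace(xi, v, tree), links, zeta)


A = ("alpha",)
S2 = ("sigma2", "id", "id'")
form = ("star", {((), ())}, "star")
steps = [
    (("sigma1", "p", "q"), "star", ("sigma1", "p", "q"), (), ()),
    (("gamma1", "p"), "p", "p", (1,), (1,)),
    (S2, "p", S2, (1, 1), (1,)),
    (A, "id", A, (1, 1, 1), (1, 1)),
    (A, "id'", A, (1, 1, 2), (1, 2)),
    (("gamma2", "q"), "q", "q", (2,), (2,)),
    (("sigma", "qla", "q"), "q", "q", (2, 1), (2,)),
    (S2, "q", S2, (2, 1, 2), (2,)),
    (A, "id", A, (2, 1, 2, 1), (2, 1)),
    (A, "id'", A, (2, 1, 2, 2), (2, 2)),
]
for lhs, q, rhs, v, w in steps:
    form = apply_rule(form, lhs, q, rhs, v, w)
form = apply_lookahead(form, "qla", (2, 1, 1), ("gamma2", A))
xi, links, zeta = form
```

The final form consists of the input tree, the ten links of L, and the output tree given in Example 8; no link is ever removed along the way.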

Since every pair (t, u)∈ τ(M) is ultimately created by (at least) one successful derivation, leading to a dependency 〈t, L, u〉, we can inspect the links in L, which associate subtrees of t with subtrees of u. Roughly speaking, the links establish which parts of the output tree u were generated due to a particular part of the input tree t. Variants of this correspondence are called contribution in [9] and origin in [20]. Occasionally, we are not interested in the links. In those cases we also write $q \Rightarrow _{M}^{\ast } (t, u)$ provided that $\langle q,\{(\varepsilon , \varepsilon )\}, q \rangle \Rightarrow _{M}^{\ast } \langle t, L, u\rangle $ for some $L \subseteq \mathbb {N}^{\ast } \times \mathbb {N}^{\ast }$. The next, basic lemma expresses the fact that the replacements in the derivations of an l- xt^R are context-free.

Lemma 9 (context-freeness)

For every state q ∈ Q, input tree t ∈ T _Σ and output tree u∈T _Δ , we have $q \Rightarrow _{M}^{\ast } (t, u)$ if and only if there exists a rule (ℓ,q,r) ∈ R with pos(ℓ) ⊆ pos(t) and pos(r) ⊆ pos(u) such that

t(v) = ℓ(v) for all v ∈ pos _Σ(ℓ) and u(w) = r(w) for all w ∈ pos _Δ(r),
$\ell (v) \Rightarrow _{M}^{\ast } (t|_{v}, u|_{w})$ for every $(v, w) \in \text {links}_{\varepsilon , \varepsilon }(\ell \overset {q}{\longrightarrow } r)$ , and
t|_v ∈ c(ℓ(v)) for all v ∈ pos(ℓ) with ℓ(v) ∈ states(ℓ)∖states(r).

This lemma can be used in proofs by induction on the length of a derivation because the derivations $\ell (v) \Rightarrow _{M}^{\ast } (t|_{v}, u|_{w})$ are shorter than the derivation $q \Rightarrow _{M}^{\ast } (t, u)$.

Notation 10

To allow concise statements, we introduce the following shorthands, which mirror those already defined for tree homomorphisms:

We use these abbreviations in conjunction with l- xt^R to restrict to transducers with the indicated properties. For example, snl-xt stands for “strict and nondeleting linear extended top-down tree transducer” (without look-ahead). We use the same abbreviations with the stem (i.e., the material behind the hyphen) in capital letters for the corresponding classes of computed tree transformations. For instance, snl-XT stands for the class of all tree transformations computable by snl-xt, and QR denotes the class of all tree transformations computable by qr. We already remarked that every nondeleting l- xt^R is an l-xt, so we have nl-XT^R = nl-XT and similarly for the non-extended case and for all defined subclasses. To write such statements concisely, we also use sets of restrictions containing ‘ ’, ‘s’, ‘n’, and ‘d’ in front of the (potentially already restricted) stems. For instance, for every , we denote by the class of all tree transformations computed by l- xt^R that obey all restrictions in . In particular, ∅l-XT^R = l-XT^R. In this manner we can simply state for all .

We observe that for every ; i.e., every linear tree homomorphism is a linear top-down tree transformation (with the same properties: ‘s’, ‘n’, ‘d’). In fact, if φ:T _Σ → T _Δ is a linear tree homomorphism, then an equivalent l-t M _φ has the set Q = X _m∪{⋆} of states, where m is the maximal rank of an element of Σ, the initial state ⋆, and all rules (σ(x ₁,…, x _k), q, φ(σ)) with σ ∈ Σ_k, $k\in \mathbb {N}$, and q ∈ Q. It should be clear that τ(M _φ) = φ.

Next, we recall some results that relate l-xt^R to bimorphisms. In [2] the class $\mathcal {B}(\text {snl}, \text {snl})$ is denoted by BI, and in [21] the class $\mathcal {B}(\text {snl}, \text {nl})$ is denoted by B(LCE,LC).

Proposition 11 ([2] and [21, Theorems 17 and 4])

Thus, every tree transformation in l- XT^R is the composition of an inverse tree homomorphism, the identity on a regular tree language, and a tree homomorphism. We will need a similar, but simpler result that tells us how to emulate an l- xt^R by the composition of an inverse homomorphism and an l- t^R.

Proposition 12 ([13, Lemma 4.1 and Corollary 4.1])

For every

Proof

We prove both inclusions starting with ( ⊇). Let φ be a nondeleting and linear tree homomorphism from Γ to Σ, and let M = (Q,Γ,Δ, Q ₀, R, c) be an l-t^R. We construct the l- xt^R M ^′ = (Q,Σ,Δ, Q ₀, R ^′, c ^′) such that for each rule (γ(q ₁,…, q _k), q, r) in R (with $k \in \mathbb {N}$, γ ∈Γ_k, and q ₁,…, q _k∈ Q) the rule (φ(γ)[q ₁,…, q _k], q, r) is in R ^′. No further rules are in R ^′. Note that since φ is nondeleting, we have states(φ(γ)[q ₁,…, q _k]) = {q ₁,…, q _k} and thus Q ^la is the same for M ^′ and M. For every q ∈ Q ^la we set c ^′(q) = φ(c(q)), which is regular because, as is well known, the class of regular tree languages is closed under linear tree homomorphisms. Using Lemma 9, it is straightforward to show that $q \Rightarrow ^{\ast }_{M^{\prime }} (t,u)$ if and only if there exists s ∈ T _Γ with t = φ(s) and $q \Rightarrow ^{\ast }_{M} (s, u)$. Thus τ(M ^′)={(φ(s), u)∣(s, u)∈ τ(M)} = φ ⁻¹ ;τ(M).

For the remaining inclusion ( ⊆), let M = (Q,Σ,Δ, Q ₀, R, c) be an l-xt^R. We turn R into a ranked alphabet such that $\textnormal {rk}(\ell \overset {q}{\longrightarrow } r) = \left |{\text {pos}_{Q}(\ell )}\right |$ for every $(\ell \overset {q}{\longrightarrow } r) \in R$. Using this ranked alphabet R we now construct the l-t^R M ^′ = (Q, R∪Σ,Δ, Q ₀, R ^′, c) and the nondeleting and linear tree homomorphism φ from R∪Σ to Σ as follows. For every $k \in \mathbb {N}$ and rule ρ = (ℓ, q, r) in R _k with pos_Q(ℓ) = {v ₁,…, v _k} the rule $\rho \left (\ell (v_{1}), \dotsc , \ell (v_{k}) \right ) \overset {q}{\longrightarrow } r$ is in R ^′ and φ(ρ) = ℓ[v _i←x _i∣1 ≤ i ≤ k]. No further rules are in R ^′. Additionally, φ(σ) = σ(x ₁,…, x _k) for every $k \in \mathbb {N}$ and σ ∈Σ_k, which yields that φ(t) = t for every t ∈ T _Σ. This latter part is needed for the look-ahead c. Clearly, if we apply the construction in the above proof of the first inclusion to φ and M ^′, then we reobtain M. Hence φ ⁻¹ ;τ(M ^′) = τ(M). □
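The rule-splitting step of the second inclusion can be sketched as follows. The tuple encoding, the left-to-right ordering of the state positions v ₁,…, v _k, and the sample left-hand side are assumptions made for illustration only.

```python
# Illustrative encoding (an assumption, not the paper's notation):
# a tree is a tuple (label, child_1, ..., child_k), a state occurrence is a
# bare string, and the variable x_i is the integer i.

def split_rule(name, lhs):
    """For a rule rho = (lhs, q, r), return the pair
    (rho(lhs(v_1), ..., lhs(v_k)), phi(rho)) with phi(rho) = lhs[v_i <- x_i].
    The state positions v_1, ..., v_k are taken in left-to-right order."""
    states = []

    def abstract(t):
        if isinstance(t, str):               # state occurrence at some v_i
            states.append(t)
            return len(states)               # it becomes the variable x_i
        return (t[0],) + tuple(abstract(c) for c in t[1:])

    image = abstract(lhs)
    return (name,) + tuple(states), image

def substitute(pattern, args):
    """pattern[args_1, ..., args_k]: replace each variable x_i by args[i-1]."""
    if isinstance(pattern, int):
        return args[pattern - 1]
    if isinstance(pattern, str):
        return pattern
    return (pattern[0],) + tuple(substitute(c, args) for c in pattern[1:])

# Hypothetical extended left-hand side sigma(gamma(q1), sigma(alpha, q2)).
lhs = ("sigma", ("gamma", "q1"), ("sigma", ("alpha",), "q2"))
flat_lhs, image = split_rule("rho", lhs)

print(flat_lhs)                            # left-hand side of the l-t^R rule
print(image)                               # phi(rho)
print(substitute(image, flat_lhs[1:]))     # the original left-hand side again
```

The last line mirrors the remark that applying the construction of the first inclusion to φ and M ^′ gives back M: substituting the states into φ(ρ) recovers ℓ.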

We use this proposition to establish our first composition result, which extends the classical composition result of [7] for linear top-down tree transducers with regular look-ahead. The only difference is that our first transducer has extended left-hand sides (i.e., it is an l- xt^R instead of just an l- t^R).

Lemma 13 (composition on the right)

For every

Proof

Immediate from Proposition 12 and the composition closure result (l-T^R)² ⊆ l-T^R for linear top-down tree transducers with regular look-ahead; it is straightforward to check that the proof of this result in [7, Theorem 2.11] preserves the properties ‘s’ and ‘n’. □

We conclude this section by discussing two results on regular look-ahead. First, we recall that when deletion is allowed, regular look-ahead adds expressive power.

Proposition 14 ([17, Lemma 4.3])

Proof

The counter-example presented in the proof of [17, Lemma 4.3], which shows l-T^R ⊈ l-XT, is in sl-T^R. □

Second, we recall from [7, Theorem 2.6] that an l- t^R (with look-ahead) can be decomposed into two l-t (without look-ahead), of which the first is a finite-state relabeling. This result can easily be generalized to extended top-down tree transducers and their compositions.

Lemma 15 (look-ahead decomposition)

for every n≥1 and .

Proof

The second inclusion is immediate because . We prove the first inclusion by induction on n. For n = 1 an obvious generalization of the construction in the proof of [7, Theorem 2.6], which preserves ‘s’ and ‘n’, can be used. For n ≥ 1, we have

where the case n = 1 is used in the first step, Lemma 13 in the second step, and the induction hypothesis in the last step. □

Lemma 15 implies that

for every n ≥ 1 and , so the classes and have the same composition closure. However, this closure is potentially achieved at different powers.

4 Four Classes that are Closed at a Finite Power

In this section, we show that the four classes , , , and are closed under composition at a finite power. We first recall a central result of [3], which shows that none of them is closed under composition.

Proposition 16 ([3, Section 3.4])

We note that [3] states the even stronger result that the class $\mathcal {B}(\text {snl}, \text {snl})^{2}$ is not contained in the class of all bimorphisms, which implies the above result by Proposition 11. In [3] the class $\mathcal {B}(\text {snl}, \text {snl})$ is denoted by $\mathcal {B}(\text {s}, \text {c})$. The proof of Theorem 31 in Section 5 implies Proposition 16 for instead of l-XT^R, which is all we need; for the implication see non-inclusion (ii) in the proof of Theorem 47.

Next we recall another central result of [3]: the (very restricted) class is not closed under composition (by the previous proposition), but is closed under composition at power 2.

Proposition 17 ([3, Theorem 6.2])

for every n≥3.

As we will show now, the (strict) classes and are also closed under composition already at the second power. We start with a lemma that decomposes an into two transducers of which one is an , for which we have the composition closure result of Proposition 17. For the benefit of Section 6, we make ε-freeness optional in the next two lemmas.

Lemma 18 (decomposition on the right)

sdl-H for every .

Proof

This is proved for strict tree homomorphisms in [5, Section I-2-1-3-5]. The proof can be generalized to as follows. Let M = (Q,Σ,Δ, Q ₀, R) be an sl-xt. Clearly, we may assume a separation of the states into deleted and non-deleted states. More precisely, we assume m ≥ max{rk(σ)∣σ ∈Σ} such that Q = Q ₁∪{1,…, m} with $Q_{1} \cap \mathbb {N} = \emptyset $ and for every rule (ℓ, q, r)∈ R the following three conditions hold: (i) q ∈ Q ₁ and states(r)⊆Q ₁, (ii) ℓ is linear in Q, and (iii) states(ℓ)∖states(r)⊆{1,…, m}. Let Δ^′ be the ranked alphabet {δ _n∣δ ∈Δ, 0 ≤ n ≤ m} with rk(δ _n)=rk(δ) + n. In addition, let $\overline {\Sigma } = \{\overline \sigma \mid \sigma \in {\Sigma }\}$ be the ranked alphabet with $\textnormal {rk}(\overline \sigma ) = \textnormal {rk}(\sigma )$. We suppose that Δ, Δ^′, and $\overline {\Sigma }$ are pairwise disjoint. As intermediate alphabet we take ${\Gamma } = {\Delta } \cup {\Delta }^{\prime } \cup \overline {\Sigma }$, and let α ∈Δ₀ be an arbitrary nullary output symbol. Now we first construct the strict delabeling tree homomorphism φ from Γ to Δ such that (i) φ(δ _n) = φ(δ) = δ(x ₁,…, x _k) for every δ ∈Δ_k and 0 ≤ n ≤ m and (ii) $\varphi (\overline \sigma ) = \alpha $ for every σ ∈Σ. Thus, φ turns every δ _n into δ and deletes its last n arguments.

For every rule (ℓ, q, r)∈ R there exist $k \in \mathbb {N}$, δ ∈Δ, and r ₁,…, r _k∈ T _Δ(Q) such that r = δ(r ₁,…, r _k) because M is strict. We construct the nondeleting sl-xt M ^′=(Q,Σ,Γ, Q ₀, R ^′) such that R ^′ contains the rule

$$\left( \ell, q, \delta_{n}({r_{1}, \dotsc, r_{k}}, {i_{1}, \dotsc, i_{n}} )\right) $$

for every rule (ℓ, q, δ(r ₁,…, r _k))∈ R, where states(ℓ)∖states(r) = {i ₁,…, i _n} with i ₁<⋯ < i _n. This is a proper rule because ℓ is linear in Q. Moreover, R ^′ contains the rules $\left (\sigma (1, \dotsc , k), i, \overline \sigma (1, \dotsc , k)\right )$ for all $k \in \mathbb {N}$, σ ∈Σ_k, and i ∈{1,…, m}. The set R ^′ contains no further rules. Thus, M ^′ simulates M but attaches the subtrees that are deleted by M to the root of the right-hand side of each rule. It is straightforward to show that τ(M ^′) ;φ = τ(M). Clearly, if M is ε-free, then so is M ^′. □
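A minimal sketch of the rule transformation and of the delabeling φ may help. It assumes that states of Q ₁ are encoded as strings, the deleted states 1,…, m as integers, the symbol δ _n as the label ("idx", δ, n), and a barred symbol as ("bar", σ); all of these encodings and the sample rule are hypothetical.

```python
# Illustrative encoding (an assumption): trees are tuples (label, children...),
# states of Q_1 are strings, the deleted states 1..m are integers.

def states_of(t):
    """Set of states occurring in t (states are the non-tuple leaves)."""
    if not isinstance(t, tuple):
        return {t}
    return set().union(*(states_of(c) for c in t[1:]))

def make_nondeleting(rule):
    """Turn (l, q, delta(r_1..r_k)) into (l, q, delta_n(r_1..r_k, i_1..i_n))."""
    lhs, q, rhs = rule
    deleted = sorted(states_of(lhs) - states_of(rhs))    # i_1 < ... < i_n
    new_label = ("idx", rhs[0], len(deleted))            # stands for delta_n
    return (lhs, q, (new_label,) + rhs[1:] + tuple(deleted))

def delabel(t, alpha="alpha"):
    """The delabeling phi: delta_n becomes delta and loses its last n
    arguments; every barred symbol is replaced by the leaf alpha."""
    if not isinstance(t, tuple):
        return t
    label, kids = t[0], t[1:]
    if isinstance(label, tuple) and label[0] == "bar":
        return (alpha,)
    if isinstance(label, tuple) and label[0] == "idx":
        _, delta, n = label
        label, kids = delta, kids[:len(kids) - n]
    return (label,) + tuple(delabel(c, alpha) for c in kids)

# Hypothetical strict rule that deletes the states 1 and 2.
rule = (("sigma", "p", 1, ("sigma", 2, "p2")), "q", ("delta", "p2", ("gamma", "p")))
new_rule = make_nondeleting(rule)

print(new_rule[2])
print(delabel(new_rule[2]))
```

Applying `delabel` to the new right-hand side gives back the original one. This is only a sanity check on a single rule and not a proof of τ(M ^′) ;φ = τ(M).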

The next lemma is our second composition result. It is more restricted than the first one (Lemma 13), but sufficiently powerful in combination with Lemma 18.

Lemma 19 (composition on the left)

sdl-H; for every .

Proof

Let φ:T _Σ→T _Γ be a strict delabeling linear tree homomorphism, and let M = (Q,Γ,Δ, Q ₀, R) be a strict l-xt. Moreover, let Q ^′ = Q∪{⋆} for a new state ⋆∉Q. We extend φ to a tree transformation φ ^′:T _Σ(Q ^′)→T _Γ(Q ^′) such that φ ^′(q ^′) = q ^′ for every q ^′∈ Q ^′ and

$$\varphi^{\prime} \left( \sigma({t_{1}, \dotsc, t_{k}}) \right) = \varphi(\sigma) \left[\varphi^{\prime}(t_{1}), \dotsc, \varphi^{\prime}(t_{k}) \right] $$

for every $k \in \mathbb {N}$, σ ∈Σ_k, and t ₁,…, t _k∈ T _Σ(Q ^′). We construct the l-xt M ^′=(Q ^′,Σ,Δ, Q ₀, R ^′) such that for each rule (ℓ, q, r)∈ R we have all rules (ℓ ^′, q, r) in R ^′ for which (i) ℓ ^′∈ T _Σ(Q ^′) is linear in Q, (ii) φ ^′(ℓ ^′) = ℓ, and (iii) |pos_Σ(ℓ ^′)|=|pos_Γ(ℓ)|. No further rules are in R ^′.

Let us quickly consider a small example. Suppose that R contains the rule $\gamma \left (\alpha , \gamma ^{\prime }(q) \right ) \overset {q}{\longrightarrow } \delta (q)$ and we have (i) φ(σ) = γ(x ₃, x ₂), (ii) φ(σ ^′) = γ ^′(x ₁), and (iii) φ(α) = α with {α ⁽⁰⁾,(σ ^′)⁽²⁾, σ ⁽³⁾}⊆Σ. Then R ^′ contains the rule $\sigma \left (\star , \sigma ^{\prime }(q, \star ), \alpha \right ) \overset {q}{\longrightarrow } \delta (q)$.

It should be clear that τ(M ^′) = φ ;τ(M). Finally, we observe that M ^′ is strict because it has the same right-hand sides of rules as M, and it is ε-free if M is ε-free because φ is strict. □
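The small example from the proof can be checked mechanically. The sketch below encodes trees as tuples, states as strings, and the variable x _i as the integer i; these encodings and all identifier names are our own assumptions.

```python
# Illustrative encoding (an assumption): trees are tuples (label, children...),
# states are bare strings, and the variable x_i is the integer i.

def substitute(pattern, args):
    """pattern[args_1, ..., args_k]: replace each variable x_i by args[i-1]."""
    if isinstance(pattern, int):
        return args[pattern - 1]
    return (pattern[0],) + tuple(substitute(c, args) for c in pattern[1:])

def extend_hom(phi, t):
    """The extension phi' of phi to trees with states: states are kept, and
    phi'(sigma(t_1..t_k)) = phi(sigma)[phi'(t_1), ..., phi'(t_k)]."""
    if isinstance(t, str):
        return t
    return substitute(phi[t[0]], [extend_hom(phi, c) for c in t[1:]])

def count_symbols(t):
    """Number of positions labelled by an alphabet symbol (not by a state)."""
    if isinstance(t, str):
        return 0
    return 1 + sum(count_symbols(c) for c in t[1:])

def state_occurrences(t):
    """All state occurrences of t, from left to right."""
    if isinstance(t, str):
        return [t]
    return [s for c in t[1:] for s in state_occurrences(c)]

# phi(sigma) = gamma(x3, x2), phi(sigma') = gamma'(x1), phi(alpha) = alpha
phi = {"sigma": ("gamma", 3, 2), "sigma'": ("gamma'", 1), "alpha": ("alpha",)}

lhs = ("gamma", ("alpha",), ("gamma'", "q"))                   # left-hand side in R
new_lhs = ("sigma", "*", ("sigma'", "q", "*"), ("alpha",))     # candidate in R'

print(extend_hom(phi, new_lhs) == lhs)                         # condition (ii)
print(count_symbols(new_lhs) == count_symbols(lhs))            # condition (iii)
print(state_occurrences(new_lhs).count("q") <= 1)              # condition (i), Q = {q}
```

Here `"*"` plays the role of the new state ⋆, which may occur several times because linearity is only required for the states of Q.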

The previous two lemmas are now used to prove that and are closed under composition at power 2.

Theorem 20

; for every n≥1.

Proof

The second inclusion is trivial because . For the first inclusion, we first prove that ;sdl-H. The idea of this inclusion is that the first splits off a tree homomorphism of type ‘sdl’ on the right (using Lemma 18), which is then absorbed on the left by the second (using Lemma 19). This auxiliary statement is proved by induction. For n = 1, we have to prove ;sdl-H, which is the statement of Lemma 18. For n ≥ 1 we obtain that

where we used Lemma 18 in the second step, Lemma 19 in the third step, and the induction hypothesis in the last step. From Lemma 15 and the above inclusion we now conclude that

Since , this implies that

where the second inclusion is due to Proposition 17. Since sdl-H⊆sl-T we can apply Lemma 13 to obtain that

Applying Lemmas 15 and 13 once more, we obtain

Since we have proved the statement. □

Up to now, we have shown that the (strict) classes and are closed under composition at the second power. In the rest of this section, we will show that the classes and are closed under composition at the third and fourth power, respectively. We start with a normal form for , in which every rule that violates the strictness condition is simulated by a chain of rules for a (non-extended) l-t^R.

Lemma 21 (non-strict normal form)

For every there exists an equivalent such that ℓ ∈ Σ (Q ^′) for every rule (ℓ,q,r) ∈ R ^′ with r∈Q ^′.

Proof

Let M = (Q,Σ,Δ, Q ₀, R, c) and M ^′=(Q ^′,Σ,Δ, Q ₀, R ^′, c ^′). Every (non-strict) rule ρ = (ℓ, q, r) in R with r ∈ Q can be simulated by a finite set $R^{\prime }_{\rho }$ of l-t^R rules as follows. We consider new states of the form 〈ρ, v〉 where v ∈pos(ℓ)∖{ε,pos_r(ℓ)}. Moreover, we let 〈ρ, ε〉 = q and 〈ρ,pos_r(ℓ)〉 = r. For every position v ∈pos(ℓ) such that v≺pos_r(ℓ), we have the following rule in $R^{\prime }_{\rho }$:

$$\ell(v) \left( \langle \rho, v1\rangle, \dotsc, \langle \rho, vk\rangle \right) \overset{\langle \rho, v\rangle}{\longrightarrow} \langle \rho, vi \rangle~, $$

where k = rk(ℓ(v)) and $i \in \mathbb {N}$ is the unique integer such that v i≼pos_r(ℓ). The look-ahead for every new state 〈ρ, v j〉 with j≠i is defined by

$$c^{\prime}(\langle \rho, vj\rangle) = (\ell|_{vj}) \left[q^{\prime} \gets c(q^{\prime})\mid q^{\prime}\in \text{states}(\ell|_{vj}) \right] ~. $$

We note that states(ℓ|_{vj})⊆Q ^la. The tree language c ^′(〈ρ, v j〉) is regular by the folklore result stating that OI-substitution preserves regularity, which we mentioned at the end of Section 2. Recall that in OI-substitution, different occurrences of the same state q ^′ can be replaced by different trees of c(q ^′). The rules in $R^{\prime }_{\rho }$ simulate the rule ρ by consuming the left-hand side ℓ position by position, following the path from the root to the unique occurrence of r. Thus, we define the set Q ^′ of states of M ^′ to consist of Q together with all the mentioned new states. The set R ^′ of rules consists of all strict rules in R together with the rules in $R^{\prime }_{\rho }$ for all non-strict rules ρ in R. The look-ahead c ^′ equals c on the states in Q ^la, and is defined as above for the new states. Then τ(M ^′) = τ(M) and M ^′ satisfies the requirements. □

Example 22

We illustrate the construction on the example rule

$$\rho = \sigma \left( \sigma(p,p), \sigma(\alpha, r) \right)\overset{q}{\longrightarrow} r~, $$

for which p, q, r ∈ Q and the relevant look-ahead is c(p) = T. Corresponding to this rule, M ^′ has the following two rules in $R^{\prime }_{\rho }$:

$$\sigma \left( \langle \rho, 1\rangle, \langle \rho, 2\rangle \right) \overset{q}{\longrightarrow} \langle \rho, 2\rangle \qquad \text{ and } \qquad \sigma\left( \langle \rho, 21\rangle, r \right) \overset{\langle \rho, 2\rangle} {\longrightarrow} r $$

because q = 〈ρ, ε〉 and r = 〈ρ,22〉. Moreover, we have

$$c^{\prime}(\langle \rho, 1\rangle) = \{ \sigma(t_{1}, t_{2}) \mid t_{1}, t_{2} \in T\} $$

and c ^′(〈ρ,21〉) = {α} for the look-ahead c ^′ of M ^′.
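The chain of rules in this example can also be generated mechanically. The following sketch assumes trees encoded as tuples and states as strings, and it writes the new state 〈ρ, v〉 as the string `"<rho,v>"`. For every new look-ahead state it returns the pattern ℓ|_{vj}, whose states still have to be replaced by trees of the respective look-ahead languages. All identifier names are hypothetical.

```python
# Illustrative encoding (an assumption): trees are tuples (label, children...),
# states are bare strings, positions are tuples of 1-based child indices.

def subtree(t, pos):
    """Return the subtree t|_pos."""
    for i in pos:
        t = t[i]
    return t

def find(t, target, pos=()):
    """Position of the first (here: unique) occurrence of the state target."""
    if t == target:
        return pos
    if isinstance(t, str):
        return None
    for i, child in enumerate(t[1:], start=1):
        found = find(child, target, pos + (i,))
        if found is not None:
            return found
    return None

def chain_rules(name, lhs, q, r):
    """Simulate the non-strict rule (lhs, q, r) by non-extended rules."""
    target = find(lhs, r)                        # position of r in lhs

    def state(v):
        if v == ():
            return q
        if v == target:
            return r
        return "<%s,%s>" % (name, "".join(map(str, v)))

    rules, lookahead = [], {}
    for depth in range(len(target)):             # every v strictly above target
        v = target[:depth]
        node = subtree(lhs, v)
        k = len(node) - 1                        # rank of the symbol at v
        i = target[depth]                        # unique i with v i <= target
        rule_lhs = (node[0],) + tuple(state(v + (j,)) for j in range(1, k + 1))
        rules.append((rule_lhs, state(v), state(v + (i,))))
        for j in range(1, k + 1):
            if j != i:                           # deleted subtree: look-ahead
                lookahead[state(v + (j,))] = subtree(lhs, v + (j,))
    return rules, lookahead

lhs = ("sigma", ("sigma", "p", "p"), ("sigma", ("alpha",), "r"))
rules, lookahead = chain_rules("rho", lhs, "q", "r")
for rule in rules:
    print(rule)
print(lookahead)
```

The two generated rules and the two look-ahead patterns σ(p, p) and α correspond to the rules and look-ahead languages listed in the example.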

The next lemma is similar to Lemma 18, in that it demonstrates how to decompose an into a delabeling l-t^R and an , for which we now have the composition closure result of Theorem 20. The proof is, however, more complicated than the one of Lemma 18. Since the delabeling property is not essential in the following, we actually state a weaker variant.

Lemma 23 (decomposition on the left)

Proof

Let M = (Q,Σ,Δ, Q ₀, R, c) be an ε-free l-xt^R such that ℓ ∈Σ(Q) for every rule (ℓ, q, r)∈ R with r ∈ Q. We can assume this normal form without loss of generality by Lemma 21. Additionally, we may assume that |Q|≥m, where m = max{rk(σ)∣σ ∈Σ}. We will construct an l-t^R M ₁ and a strict such that τ(M ₁);τ(M ₂) = τ(M). Intuitively speaking, the transducer M ₁ processes the input by nondeterministically executing a number of non-strict rules of M. Whenever it executes two consecutive non-strict rules, M ₁ simulates the state behavior of M. Moreover, M ₁ marks the positions in the (processed) input where it has applied a sequence of consecutive non-strict rules by indicating the corresponding state transition of M. The transducer M ₂ then uses these markings to execute the missing strict rules of M.

As intermediate ranked alphabet we use Γ=Σ∪(Σ×Q×Q), where each triple 〈σ, q ^′, q〉∈Σ×Q×Q has the same rank as σ. We fix m pairwise different states p ₁,…, p _m∈ Q. We construct the l-t^R M ₁=(Q ₁,Σ,Γ,{p ₁}, R ₁, c ₁) with states Q ₁ = Q∪(Q×Q) and the set R ₁ of rules consists of:

(i)
the rule $\sigma ({p_{1}, \dotsc , p_{k}} ) \overset {p}{\longrightarrow } \sigma ({p_{1}, \dotsc , p_{k}})$ for every $k \in \mathbb {N}$, σ ∈Σ_k, and p ∈ Q,
(ii)
the two rules
for every non-strict rule $\sigma ({q_{1}, \dotsc , q_{k}}) \overset {q}{\longrightarrow } q_{i}$ in R with 1 ≤ i ≤ k and every p, q ^′∈ Q, and
(iii)
the rule
for every $k \in \mathbb {N}$, σ ∈ Σ_k, and q ^′, q ∈ Q.

We note that $Q_{1}^{\text {la}} \subseteq Q^{\text {la}}$. The look-ahead mapping $c_{1} : Q_{1}^{\text {la}} \to \text {Reg}({\Sigma })$ is given by c ₁(q) = c(q) for every $q \in Q_{1}^{\text {la}}$. Actually, M ₁ is delabeling.

Next, we construct the

$$M_{2}=(Q_{2},{\Gamma}, {\Delta}, Q_{0},R^{\prime} \cup R_{2}, c_{2})$$

with the set Q ₂ = Q of states, the set R ^′={(ℓ, q, r)∈ R∣r∉Q} of strict rules of M, and the set

$$R_{2} = \{ \langle \sigma, q^{\prime}, q\rangle(\ell_{1}, \dotsc, \ell_{k}) \overset{q^{\prime}}{\longrightarrow} r \mid q^{\prime} \in Q,\, (\sigma(\ell_{1}, \dotsc, \ell_{k}) \overset{q}{\longrightarrow} r ) \in R^{\prime} \} \enspace. $$

Again, $Q_{2}^{\text {la}} \subseteq Q^{\text {la}}$, where $Q_{2}^{\text {la}}$ contains the look-ahead states of M ₂, so we just set the look-ahead mapping $c_{2} : Q_{2}^{\text {la}} \to \text {Reg}({\Gamma })$ to c ₂(q) = c(q) for every $q \in Q_{2}^{\text {la}}$.

Intuitively, it should be clear that τ(M ₁) ;τ(M ₂) = τ(M). Whenever M ₂ arrives in state q ^′ at an input position with label 〈σ, q ^′, q〉, it knows that M ₁ has applied a sequence of non-strict rules of M that led from state q ^′ to state q, and thus M ₂ can continue acting as if it is already in state q. Formally, it can be proved that, for every state q ∈ Q, input tree t ∈ T _Σ, and output tree u ∈ T _Δ, we have $q \Rightarrow _{M}^{\ast } (t, u)$ if and only if there exists s ∈ T _Γ such that $p_{1} \Rightarrow _{M_{1}}^{\ast } (t, s)$ and $q \Rightarrow _{M_{2}}^{\ast } (s, u)$. The proof is by induction on the length of the derivations using Lemma 9. It uses several elementary properties of the derivations of M ₁ and M ₂ such as (for all p, q, q ^′, q ^″∈ Q):

if $p_{1} \Rightarrow _{M_{1}}^{\ast } (t, s)$, then $p \Rightarrow _{M_{1}}^{\ast } (t, s)$,
if $p_{1} \Rightarrow _{M_{1}}^{\ast } \left (t, \sigma (s_{1}, \dotsc , s_{k}) \right )$, then $\langle q^{\prime }, q\rangle \Rightarrow _{M_{1}}^{\ast } \left (t, \langle \sigma , q^{\prime }, q\rangle (s_{1}, \dotsc , s_{k}) \right )$,
$p_{1} \Rightarrow _{M_{1}}^{\ast } \left (t, \langle \sigma , q^{\prime }, q\rangle (s_{1}, \dotsc , s_{k}) \right )$ if and only if $\langle q^{\prime \prime }, q^{\prime }\rangle \Rightarrow _{M_{1}}^{\ast } \left (t, \langle \sigma , q^{\prime \prime }, q \rangle (s_{1}, \dotsc , s_{k}) \right )$, and
$q \Rightarrow _{M_{2}}^{\ast } \left (\sigma (s_{1}, \dotsc , s_{k}), u\right )$ if and only if $q^{\prime } \Rightarrow _{M_{2}}^{\ast } \left (\langle \sigma , q^{\prime }, q\rangle (s_{1}, \dotsc , s_{k}), u \right )$.

□

Lemmas 23 and 13 now enable us to prove that the class is closed under composition at power 3. The proof is similar to, but easier than, the one of Theorem 20.

Theorem 24

for every n≥1.

Proof

Again, the second inclusion is trivial because l-T^R and are subclasses of . Similar to the proof of Theorem 20, the idea of the first inclusion is that the last splits off an l-t^R on the left (using Lemma 23), which is then absorbed on the right by the penultimate (using Lemma 13). Formally we prove by induction on n that

which suffices by Theorem 20. For n = 1 we obtain , which is stated in Lemma 23. In the induction step for n ≥ 1, we obtain

where we use Lemma 23 in the second step, Lemma 13 in the third step, and the induction hypothesis in the last step. □

It is immediate from Theorem 24 and Lemma 15 that the class is closed under composition at power 4. Thus, in contrast to Theorem 20, look-ahead influences the power of closedness in the non-strict case, as will be proved in the next section.

Corollary 25

for every n≥1.

A summary of our results concerning the powers at which the considered classes are closed under composition is provided in Table 2. In the next section, we will demonstrate that these powers are indeed the least ones with this property.

Table 2 Summary of the results of Section 4


5 Least Power of Closedness

In this section, we will determine the least power at which the composition closure is achieved for the classes , , , and , which are all computed by certain ε-free l-xt^R. For the strict classes the least power is 2, as stated in the next theorem. In the remainder of this section we consider the non-strict classes.

Theorem 26

for every n≥3.

Proof

The first inclusion is trivial and its strictness follows from Proposition 14. The second inclusion is also trivial and its strictness follows from Proposition 16, which shows that the class is not closed under composition. The three equalities are proved in Theorem 20. □

In the following, we will use the computed dependencies in $\mathcal {D}(M)$, for which we recall some important properties from [11]. Let $L \subseteq \mathbb {N}^{\ast } \times \mathbb {N}^{\ast }$ be a set of links [e.g., the set L of links in a dependency $\langle t, L, u\rangle \in \mathcal {D}(M)$]. The elements of dom(L) are also called link origins of L. For the next definition, proposition and lemma, let M = (Q,Σ,Δ, Q ₀, R, c) be the considered ε-free l-xt^R.

Definition 27 ([11, Definitions 4 and 5])

A set $L \subseteq \mathbb {N}^{\ast } \times \mathbb {N}^{\ast }$ of links is

strictly input hierarchical if for all links (v ₁, w ₁),(v ₂, w ₂) ∈ L
- v ₁≺v ₂ implies w ₁≼w ₂ and
- v ₁ = v ₂ implies w ₁≼w ₂ or w ₂≼w ₁,
input link-distance bounded by $b \in \mathbb {N}$ if for all link origins v ₁, v ₂∈dom(L) with v ₁≺v ₂ and |v ₂|−|v ₁|>b there exists a link origin v ∈dom(L) such that v ₁≺v≺v ₂ and |v|−|v ₁| ≤ b.

The set $\mathcal {D}(M)$ of dependencies has those properties if for each dependency $\langle t, L, u\rangle \in \mathcal {D}(M)$ the set L of links has them. We also say that $\mathcal {D}(M)$ is input link-distance bounded if there exists an integer $b \in \mathbb {N}$ such that it is input link-distance bounded by b.
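Both properties can be tested directly on a finite set of links. The sketch below is a brute-force check under the assumption that positions are encoded as tuples of positive integers; the sample link sets are made up for illustration.

```python
# Positions are tuples of positive integers; a link is a pair (v, w).

def is_prefix(v, w):
    """v is a (not necessarily proper) prefix of w."""
    return w[:len(v)] == v

def is_proper_prefix(v, w):
    return len(v) < len(w) and is_prefix(v, w)

def strictly_input_hierarchical(links):
    for v1, w1 in links:
        for v2, w2 in links:
            if is_proper_prefix(v1, v2) and not is_prefix(w1, w2):
                return False
            if v1 == v2 and not (is_prefix(w1, w2) or is_prefix(w2, w1)):
                return False
    return True

def input_link_distance_bounded(links, b):
    origins = {v for v, _ in links}              # dom(L)
    for v1 in origins:
        for v2 in origins:
            if is_proper_prefix(v1, v2) and len(v2) - len(v1) > b:
                if not any(is_proper_prefix(v1, v) and is_proper_prefix(v, v2)
                           and len(v) - len(v1) <= b for v in origins):
                    return False
    return True

def inverse(links):
    """L^{-1}, used for the corresponding output-side properties."""
    return {(w, v) for v, w in links}

nested = {((), ()), ((1,), (1,)), ((1, 1, 1), (1, 2))}    # hypothetical links
crossing = {((1,), (2,)), ((1, 1), (1,))}                 # hypothetical links

print(strictly_input_hierarchical(nested))
print(strictly_input_hierarchical(crossing))
print(input_link_distance_bounded(nested, 2))
print(input_link_distance_bounded(nested, 1))
```

The output-side variants are obtained by calling the same functions on `inverse(links)`.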

We assume that the corresponding properties are defined for the output side, using L ⁻¹ instead of L. For example, L is strictly output hierarchical if L ⁻¹ is strictly input hierarchical. The set $\mathcal {D}(M)$ computed by the ε-free l-xt^R M always has these properties as shown in [11].

Proposition 28 ([11, Corollary 1 and Theorem 2])

The set $\mathcal {D}(M)$ of dependencies is strictly input and output hierarchical, and it is input and output link-distance bounded.

These properties should be intuitively clear. They are discussed in more detail in [11]. Roughly speaking, the set L of links of a sentential form of M is strictly input and output hierarchical because links cannot cross each other. In addition, if b is the maximal height of the left-hand (resp. right-hand) side of a rule of M, then L is obviously input (resp. output) link-distance bounded by b. Next, we observe some simple consequences of Proposition 28, which we will use later. Whenever we mention ‘(in)comparable’ in the following, we refer to the partial prefix order ≼.

Lemma 29

Let $\langle t, L, u \rangle \in \mathcal {D}(M)$ be a dependency, and let $\mathcal {D}(M)$ be input link-distance bounded by b.

(i) For all links (v, w),(v ^′, w ^′) ∈ L, v and v ^′ are incomparable if and only if w and w ^′ are incomparable.
(ii) For all positions v ₁, v ₂ ∈pos(t) and link origins v ₀, v ₃∈dom(L) with v ₀≼v ₁≺v ₂≼v ₃ and |v ₂|−|v ₁|>b, there exists a link origin v ∈dom(L) such that v ₁≺v≺v ₂ and |v|−|v ₁| ≤ b.

Proof

We prove the if-direction of the first item by contraposition. Assume that v and v ^′ are comparable; without loss of generality, suppose that v≼v ^′. Then by the definition of strictly input hierarchical, we know that w and w ^′ are comparable. The other direction follows in the same way from the definition of strictly output hierarchical. We prove the second item by induction on |v ₃|−|v ₀| as follows. Since v ₀≼v ₁≺v ₂≼v ₃ and |v ₃|−|v ₀|≥|v ₂|−|v ₁|>b, there exists a link origin v ∈dom(L) such that v ₀≺v≺v ₃ and |v|−|v ₀| ≤ b. Both v and v ₂ are prefixes of v ₃ and |v|≤|v ₀|+b≤|v ₁|+b<|v ₂|; consequently, v≺v ₂. Now we distinguish two cases: (a) If v ₁≺v, then v ₁≺v≺v ₂ and |v|−|v ₁|≤|v|−|v ₀| ≤ b proving the second item. (b) Otherwise, v≼v ₁ because v and v ₁ are both prefixes of v ₃ and thus comparable, so we have v ₀≺v≼v ₁≺v ₂≼v ₃ with v, v ₃∈dom(L) and |v ₂|−|v ₁|>b. Since |v ₃|−|v|<|v ₃|−|v ₀|, we can apply the induction hypothesis to v≼v ₁≺v ₂≼v ₃ to prove the statement. □

In the proofs of Theorems 31 and 33 we will see applications of these properties and the following linking theorem, which we also recall from [11].

Proposition 30 ([11, Theorem 4])

Let Ω and Ψ be ranked alphabets with Ψ₀ ≠ ∅ and Ψ ₁ ≠ ∅. Let k,n≥1, and let m ₁,…, m _k be such that

$$\left\{ \left( c^{\prime}[t_{1}, \dotsc, t_{n}]~,~c^{\prime\prime}[t_{1}, \dotsc, t_{n}] \right) \mid t_{1}, \dotsc, t_{n} \in T_{\Psi} \right\} \subseteq \tau(M_{1}); \dotsb ; \tau(M_{k}) ~, $$

where c ^′ ,c ^′′ ∈T _Ω (X _n ) are linear and nondeleting in X _n . There exist trees t ₁ ,…,t _n ∈T _Ψ , dependencies

$$\langle u_{0}, L_{1}, u_{1} \rangle \in \mathcal{D}(M_{1})~,~\langle u_{1}, L_{2}, u_{2} \rangle \in \mathcal{D}(M_{2})~, \dotsc,~\langle u_{k-1}, L_{k}, u_{k} \rangle \in \mathcal{D}(M_{k}) $$

with u ₀ =c ^′[t ₁ ,…,t _n] and u _k =c ^′′[t ₁ ,…,t _n], and a family (v _ij ,w _ij )∈L _j of links for 1≤i≤n and 1≤j≤k, such that for all 1≤i≤n:

(i) $\text {pos}_{x_{i}}(c^{\prime \prime }) \preceq w_{ik}$,
(ii) v _i(j+1) ≼w _ij for all 1≤j≤k−1, and
(iii) $\text {pos}_{x_{i}}(c^{\prime }) \preceq v_{i1}$.

Intuitively, the items mean that (i) position w _ik is in the subtree t _i of the output tree u _k = c ^″[t ₁,…, t _n], (ii) position w _ij has prefix v _i(j+1) in the intermediate tree u _j, and (iii) position v _i1 is in the subtree t _i of the input tree u ₀ = c ^′[t ₁,…, t _n].

To show that an integer k > 1 is the least power at which the closure under composition is achieved for a class $\mathcal {C}$, we present a tree transformation $\tau \in \mathcal {C}^{k}$ that is not in $\mathcal {C}^{k-1}$. Roughly speaking, this is achieved by deducing certain links given the tree transformation with the help of Proposition 30. These links are necessary in the dependency for the determined input-output tree pairs. Thus, we obtain a partial specification of several dependencies in the sense that we know some of its links, but not necessarily all of them. Then we consider whether these partial specifications can be implemented by a composition of . It can be seen from Proposition 30 that we will often not be able to identify both positions of a link exactly, but rather determine that one of its positions has a certain other prominent position as prefix. In such cases, we graphically display the link using a spline with an inverted arrow head that points to the subtree rooted at that prominent position (instead of to the actual position). For example, the splines in Fig. 3 indicate that a position of t on the left (resp. u on the right) is linked to position 2 on the right (resp. on the left).

We now prove that 3 is the least power at which the class is closed under composition.

Theorem 31

Proof

The inclusion follows from Lemma 15. To prove the strictness, let $M_{1}^{\prime } = (Q^{\prime }, {\Sigma }, {\Delta }, \{\star \}, R^{\prime })$ be the that is obtained from the of Example 2 by removing the state q ^la and all rules for the input symbol σ; i.e., the rules $\sigma (q, q^{\text {la}}) \overset {q}{\longrightarrow } q$ and $\sigma (q^{\text {la}}, q) \overset {q}{\longrightarrow } q$. Thus, $\tau (M^{\prime }_{1})$ is the restriction of τ(M ₁) to input trees that do not contain any occurrence of σ. In addition, we use the two bimorphisms $B_{2}, B_{3}\in \mathcal {B}(\text {snl}, \text {snl})$ of [5, Section II-2-2-3-1], where strictness is denoted by ‘e’ and nondeletion by ‘c’. These bimorphisms are similar to the two bimorphisms that are used in [3, Section 3.4] to prove Proposition 16. By Proposition 11, , hence B ₂ and B ₃ can also be defined by and M ₃, respectively. For convenience, we present M ₂ and M ₃ explicitly before we show that $\tau = \tau (M_{1}^{\prime }) \ ; \tau (M_{2}) \ ; \tau (M_{3})$ cannot be computed by a composition of two .

Let M ₂=(Q ₂,Δ,Γ,{⋆}, R ₂) be the with Q ₂={⋆,id,id^′}, the ranked alphabet Γ = {σ ⁽²⁾, γ ⁽¹⁾, α ⁽⁰⁾}, and the set R ₂ consisting of the rules

$$\begin{array}{cc} \sigma_{1}(\star, \sigma_{2}(\text{id}, \text{id}^{\prime})) \overset{\star}{\longrightarrow} \sigma(\sigma(\star, \text{id}), \text{id}^{\prime}) &\qquad\qquad \gamma(\text{id})\overset{\text{id}, \text{id}^{\prime}}\longrightarrow \gamma(\text{id}) \\ \sigma_{2}(\text{id}, \text{id}^{\prime}) \overset{\star}{\longrightarrow} \sigma(\text{id}, \text{id}^{\prime}) & \qquad\qquad\alpha \overset{\text{id}, \text{id}^{\prime}}{\longrightarrow} \alpha~. \end{array} $$

Moreover, let M ₃=(Q ₃,Γ,Δ,{⋆}, R ₃) be the with Q ₃={⋆, p,id,id^′} and the set R ₃ consisting of the rules

Note that both τ(M ₂) and τ(M ₃) are partial functions.

We present a proof by contradiction, so we assume that τ = τ(N ₁);τ(N ₂) for some and N ₂. By Proposition 28 there exist a ₁, a ₂, b ₁, b ₂≥1 such that $\mathcal {D}(N_{1})$ and $\mathcal {D}(N_{2})$ are strictly input and output hierarchical, input link-distance bounded by a ₁ and a ₂, respectively, and output link-distance bounded by b ₁ and b ₂, respectively. Let n = 2⋅ max(a ₁, a ₂, b ₁, b ₂)+2. We select the trees

$$\begin{array}{@{}rcl@{}} c &=&{\gamma_{2}^{n}}(x_{1})~, \\ c^{\prime} &=& \sigma_{1} \left( \sigma_{1} \left( \dotsm \sigma_{1}(\sigma_{2}(x_{n}, x_{n-1}), c[\sigma_{2}(x_{n-2}, x_{n-3})]) \dotsm, c[\sigma_{2}(x_{4}, x_{3})] \right),\right. \\ &&\left.c[\sigma_{2}(x_{2}, x_{1})] \right)~,\text{and}\\ c^{\prime\prime} &=& \sigma_{1} \left( \sigma_{1} \left( \dotsm \sigma_{1}(x_{n}, \sigma_{2}(x_{n-1}, x_{n-2})) \dotsm, \sigma_{2}(x_{3}, x_{2}) \right), x_{1} \right)~, \end{array} $$

of which c ^′ and c ^″ are linear and nondeleting in X _n (see Fig. 4). To be completely formal, c ^′ and c ^″ are defined inductively as follows: First, $c^{\prime } = c^{\prime }_{1}$ with $c^{\prime }_{n-1} = \sigma _{2}(x_{n}, x_{n-1})$ and $c^{\prime }_{i-1} = \sigma _{1}(c^{\prime }_{i+1}, c[\sigma _{2}(x_{i}, x_{i-1})])$ for every even integer 2 ≤ i ≤ n−2. Second, we let $c^{\prime \prime } = \sigma _{1}(c^{\prime \prime }_{2}, x_{1})$ with $c^{\prime \prime }_{n} = x_{n}$ and $c^{\prime \prime }_{i-2} = \sigma _{1}(c^{\prime \prime }_{i}, \sigma _{2}(x_{i-1},x_{i-2}))$ for every even 4 ≤ i ≤ n.
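The inductive definitions can be unfolded programmatically for any even n ≥ 4. The sketch below builds both contexts and checks that they are linear and nondeleting in X _n. The tuple encoding (with the variable x _i written as the integer i) and the function names are our own assumptions.

```python
# Illustrative encoding (an assumption): trees are tuples (label, children...),
# and the variable x_i is the integer i. The value n must be even and >= 4.

def gamma2_context(n, inner):
    """c[inner] for the context c = gamma_2^n(x_1)."""
    for _ in range(n):
        inner = ("g2", inner)
    return inner

def build_c_prime(n):
    t = ("s2", n, n - 1)                          # c'_{n-1}
    for i in range(n - 2, 0, -2):                 # even i = n-2, ..., 2
        t = ("s1", t, gamma2_context(n, ("s2", i, i - 1)))   # c'_{i-1}
    return t                                      # c' = c'_1

def build_c_double_prime(n):
    t = n                                         # c''_n = x_n
    for i in range(n, 2, -2):                     # even i = n, ..., 4
        t = ("s1", t, ("s2", i - 1, i - 2))       # c''_{i-2}
    return ("s1", t, 1)                           # c'' = s1(c''_2, x_1)

def variables(t):
    """All variable occurrences of t, from left to right."""
    if isinstance(t, int):
        return [t]
    return [x for c in t[1:] for x in variables(c)]

def position_of(t, x, pos=()):
    """Position of the first occurrence of the variable x in t, or None."""
    if t == x:
        return pos
    if isinstance(t, int):
        return None
    for j, child in enumerate(t[1:], start=1):
        found = position_of(child, x, pos + (j,))
        if found is not None:
            return found
    return None

n = 6
c1, c2 = build_c_prime(n), build_c_double_prime(n)
print(sorted(variables(c1)) == list(range(1, n + 1)))    # linear, nondeleting
print(sorted(variables(c2)) == list(range(1, n + 1)))
print(position_of(c2, n))                                # position of x_n in c''
```

Every variable x ₁,…, x _n occurs exactly once in each context, and x _n sits at position 1^{n/2} in c ^″.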

It is straightforward to check that

$$\left( c^{\prime}[t_{1}, \dotsc, t_{n}], c^{\prime\prime}[t_{1}, \dotsc, t_{n}] \right) \in \tau $$

for all t ₁,…, t _n∈ T _Ψ with Ψ = {γ ⁽¹⁾, α ⁽⁰⁾}. Note that, according to Example 8, every σ ₁-rib $c[\sigma _{2}(t_{i},t_{i-1})] = {\gamma _{2}^{n}} \left (\sigma _{2}(t_{i},t_{i-1}) \right )$ is transformed into σ ₂(t _i, t _i−1) by $\tau (M^{\prime }_{1})$. Consequently, we can apply Proposition 30 to obtain that there exist trees t ₁,…, t _n∈ T _Ψ, dependencies $\langle c^{\prime }[t_{1}, \dotsc , t_{n}], L_{1}, u_{1}\rangle \in \mathcal {D}(N_{1})$ and $\langle u_{1}, L_{2}, c^{\prime \prime }[t_{1}, \dotsc , t_{n}] \rangle \in \mathcal {D}(N_{2})$, and links (v _i1, w _i1) ∈ L ₁ and (v _i2, w _i2) ∈ L ₂ for 1 ≤ i ≤ n such that

$\text {pos}_{x_{i}}(c^{\prime \prime }) \preceq w_{i2}$,
v _i2≼w _i1, and
$\text {pos}_{x_{i}}(c^{\prime }) \preceq v_{i1}$.

The splines with the inverted arrow heads indicate some of those links in Fig. 4.

Now, let us consider the obtained (partial) dependencies, which are depicted in Fig. 4. We clearly have (ε, ε),(v _n2, w _n2) ∈ L ₂ and

$$1^{\frac n2} = \text{pos}_{x_{n}}(c^{\prime\prime}) \preceq w_{n2} ~. $$

Thus $\left |{w_{n2}}\right | \geq \frac {n}{2} > b_{2}$. Since $\mathcal {D}(N_{2})$ is output link-distance bounded by b ₂, there exists a link (v ^′, w ^′) ∈ L ₂ with ε≺w ^′≺w _n2 and |w ^′| ≤ b ₂. Consequently, the position w ^′ has label σ ₁ in u ₂ = c ^″[t ₁,…, t _n] as indicated in Fig. 4. Formally, w ^′=1^m for some $1 \leq m\leq b_{2} \leq \frac n2 - 1$. Let i = 2m, which yields 2 ≤ i ≤ n−2. Then w ^′≺w _(i+1)2 and w ^′≺w _i2 because $w^{\prime } \prec \text {pos}_{x_{i+1}}(c^{\prime \prime }) \preceq w_{(i+1)2}$ and $w^{\prime } \prec \text {pos}_{x_{i}}(c^{\prime \prime }) \preceq w_{i2}$. Since $\mathcal {D}(N_{2})$ is strictly output hierarchical, we can conclude that v ^′≼v _(i+1)2≼w _(i+1)1 and v ^′≼v _i2≼w _i1. Additionally, w ^′ and $\text {pos}_{x_{i-1}}(c^{\prime \prime })$ are incomparable and $\text {pos}_{x_{i-1}}(c^{\prime \prime }) \preceq w_{(i-1)2}$, so also the positions w ^′ and w _(i−1)2 are incomparable (see Lemma 32(i) for a proof of this and similarly straightforward arguments). Consequently, Lemma 29(i) shows that v ^′ and v _(i−1)2 are also incomparable. Using v _(i−1)2≼w _(i−1)1, we obtain that v ^′ and w _(i−1)1 are incomparable, and in particular that v ^′ ≰ w _(i−1)1.
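For reference, the arithmetic behind the bounds used in this paragraph can be spelled out as follows. The abbreviation B is introduced here only for this purpose and is not part of the original notation.

$$\begin{array}{@{}rcl@{}} B &=& \max(a_{1}, a_{2}, b_{1}, b_{2})~, \qquad n = 2B + 2~, \\ \frac n2 &=& B + 1 > B \geq b_{2}~, \\ 1 \leq m \leq b_{2} \leq B = \frac n2 - 1 &\Longrightarrow& 2 \leq i = 2m \leq n - 2~. \end{array} $$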

Next, we inspect the input tree u ₀ = c ^′[t ₁,…, t _n] and the links (ε, ε), (v _i1, w _i1), and (v _(i−1)1, w _(i−1)1) in L ₁. We already know that $\text {pos}_{x_{i}}(c^{\prime }) \preceq v_{i1}$ and $\text {pos}_{x_{i-1}}(c^{\prime }) \preceq v_{(i-1)1}$. Let

$$V = \{v \in 1^{\ast}2 \mathbb{N}^{\ast} \mid v\prec \text{pos}_{x_{i}}(c^{\prime}),\, c^{\prime}(v) \neq \sigma_{2}\} $$

be the set of positions of c ^′ (and hence of u ₀) that are in an occurrence of the tree c and are prefixes of $\text {pos}_{x_{i}}(c^{\prime })$. Since |V| = n > a ₁, it follows from Lemma 29(ii) that there exists a link (v ^″, w ^″) ∈ L ₁ such that v ^″∈ V, which also yields v ^″≺v _i1 and v ^″≺v _(i−1)1. Since $\mathcal {D}(N_{1})$ is strictly input hierarchical, we obtain that w ^″≼w _i1 and w ^″≼w _(i−1)1. Since v ^″ and v _(i+1)1 are incomparable, Lemma 29(i) implies that w ^″ and w _(i+1)1 are incomparable, and in particular that w ^″≰w _(i+1)1.

Summing up, we have

$$ v^{\prime}\preceq w_{(i+1)1} \qquad v^{\prime}\preceq w_{i1} \qquad v^{\prime} \not\preceq w_{(i-1)1} \tag{1} $$

$$ w^{\prime\prime}\not\preceq w_{(i+1)1} \qquad w^{\prime\prime} \preceq w_{i1} \qquad w^{\prime\prime} \preceq w_{(i-1)1}~. \tag{2} $$

Since v ^′≼w _i1 and w ^″≼w _i1, we must have v ^′≼w ^″ or w ^″≼v ^′, because any two prefixes of the same position are comparable. In the former case, we obtain v ^′≼w ^″≼w _(i−1)1, which contradicts the last statement of (1). Similarly, in the latter case, we obtain w ^″≼v ^′≼w _(i+1)1, which contradicts the first statement of (2). Since both cases are contradictory, the assumption that τ can be computed by two such transducers is wrong. □
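The incompatibility of (1) and (2) can also be observed experimentally. The Python sketch below is a finite sanity check, not a replacement for the argument above: it encodes positions as tuples of integers (an assumption of this example), enumerates all positions over {1,2} up to a small length, and searches for five positions that satisfy both (1) and (2). All function names are hypothetical.

```python
from itertools import product


def is_prefix(v, w):
    """True if position v is a (not necessarily proper) prefix of position w."""
    return w[:len(v)] == v


def positions(max_len, alphabet=(1, 2)):
    """All positions over the given alphabet of length at most max_len."""
    return [p for k in range(max_len + 1) for p in product(alphabet, repeat=k)]


def satisfies_1_and_2(v_prime, w_dprime, w_next, w_cur, w_prev):
    """Check statements (1) and (2) for w_next = w_(i+1)1, w_cur = w_i1, w_prev = w_(i-1)1."""
    cond1 = (is_prefix(v_prime, w_next)
             and is_prefix(v_prime, w_cur)
             and not is_prefix(v_prime, w_prev))
    cond2 = (not is_prefix(w_dprime, w_next)
             and is_prefix(w_dprime, w_cur)
             and is_prefix(w_dprime, w_prev))
    return cond1 and cond2


def find_solution(max_len=3):
    """Return some tuple of positions satisfying (1) and (2), or None if there is none."""
    ps = positions(max_len)
    for combo in product(ps, repeat=5):
        if satisfies_1_and_2(*combo):
            return combo
    return None
```

On these finite samples `find_solution` returns `None`, in line with the case distinction in the proof.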

Fortunately, we can reuse the ideas used in the proof of Theorem 31 to conclude that 4 is the least power at which the class is closed under composition. The slightly more elaborate proof first establishes that a deleting rule, which is a rule $\ell \overset {q}{\longrightarrow } r$ such that $\text {states}(r) \subsetneq \text {states}(\ell )$, must be used at a certain position and then employs the classical cut-and-paste technique to establish that this deletion (without look-ahead) enables undesired translations.

We will use some well-known elementary properties of the prefix order, which we state in the next lemma.

Lemma 32

Let $v, v_{1}, v_{2}, v^{\prime }_{1}, v^{\prime }_{2} \in \mathbb {N}^{\ast }$ be positions with $v_{1} \preceq v^{\prime }_{1}$ and $v_{2} \preceq v^{\prime }_{2}$.

(i)
If v ₁ and v ₂ are incomparable, then so are $v^{\prime }_{1}$ and $v^{\prime }_{2}$.
(ii)
If v ₁ and v ₂ are incomparable, $v \preceq v_{1}^{\prime }$ , and $v\preceq v_{2}^{\prime }$ , then v≼v ₁ and v≼v ₂.
(iii)
If v ₁ and $v^{\prime }_{2}$ are incomparable and $v^{\prime }_{1}$ and v ₂ are incomparable, then v ₁ and v ₂ are incomparable.

Proof

If v ₁ and v ₂ are incomparable, then lcp (v ₁, v ₂) is a proper prefix of both v ₁ and v ₂. Hence lcp$(v^{\prime }_{1}, v^{\prime }_{2}) = \textnormal {lcp}(v_{1}, v_{2})$, which implies the first two items. For the third item we note that if v ₁≼v ₂ then $v_{1} \preceq v^{\prime }_{2}$, and symmetrically, if v ₂≼v ₁ then $v_{2} \preceq v^{\prime }_{1}$. □
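The three items of Lemma 32 lend themselves to a brute-force check on short positions. The following Python sketch assumes that positions are encoded as tuples of integers and restricts attention to the alphabet {1,2}; both choices, as well as the function names, are assumptions made for illustration. It also tests the equality of longest common prefixes used in the proof.

```python
from itertools import product


def is_prefix(v, w):
    """True if v is a prefix of w in the prefix order."""
    return w[:len(v)] == v


def incomparable(v, w):
    """True if neither position is a prefix of the other."""
    return not is_prefix(v, w) and not is_prefix(w, v)


def lcp(v, w):
    """Longest common prefix of two positions."""
    k = 0
    while k < min(len(v), len(w)) and v[k] == w[k]:
        k += 1
    return v[:k]


def positions(max_len, alphabet=(1, 2)):
    """All positions over the alphabet of length at most max_len."""
    return [p for k in range(max_len + 1) for p in product(alphabet, repeat=k)]


def check_lemma_32(max_len=3):
    """Exhaustively test items (i)-(iii) for all short positions."""
    ps = positions(max_len)
    for v1, v1p, v2, v2p in product(ps, repeat=4):
        # Hypothesis of the lemma: v1 is a prefix of v1', v2 is a prefix of v2'.
        if not (is_prefix(v1, v1p) and is_prefix(v2, v2p)):
            continue
        if incomparable(v1, v2):
            if not incomparable(v1p, v2p):                      # item (i)
                return False
            if lcp(v1p, v2p) != lcp(v1, v2):                    # step used in the proof
                return False
            for v in ps:                                        # item (ii)
                if is_prefix(v, v1p) and is_prefix(v, v2p):
                    if not (is_prefix(v, v1) and is_prefix(v, v2)):
                        return False
        if incomparable(v1, v2p) and incomparable(v1p, v2):     # item (iii)
            if not incomparable(v1, v2):
                return False
    return True
```

Such a check only covers finitely many positions, so it complements the proof rather than replacing it.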

Theorem 33

Proof

Since the inclusion is trivial, it remains to prove its strictness. Let M ₁ be the of Example 2, and let M ₂ and M ₃ be the transducers defined as in the proof of Theorem 31. We will show that the tree transformation τ = τ(M ₁);τ(M ₂);τ(M ₃) cannot be computed by a composition of three .

We again present a proof by contradiction, hence we assume that

$$\tau = \tau(N_{1}) ; \tau(N_{2}) ; \tau(N_{3}) $$

for some N ₁, N ₂, and N ₃. By Proposition 28 there exist integers a ₁, a ₂, a ₃, b ₁, b ₂, b ₃≥1 such that $\mathcal {D}(N_{1})$, $\mathcal {D}(N_{2})$, and $\mathcal {D}(N_{3})$ are strictly input and output hierarchical, input link-distance bounded by a ₁, a ₂, and a ₃, respectively, and output link-distance bounded by b ₁, b ₂, and b ₃, respectively. As before, let n = 2⋅ max(a ₁, a ₂, a ₃, b ₁, b ₂, b ₃)+2. Moreover, let $m \in \mathbb {N}$ be such that m > ht(ℓ) for all rules (ℓ, p, r)∈ R ₁. This time, we select the trees

$$\begin{array}{@{}rcl@{}} s &=&{\gamma_{2}^{m}}(\alpha) \enspace, \\ c &=& \sigma \left( s, \sigma(s, \dotsm \sigma(s, x_{1})\dotsm) \right) ~\text{ with } n^{2} \text{ occurrences of } \sigma~, \\ c^{\prime} &=& \sigma_{1}\left( \sigma_{1} \left( \dotsm \sigma_{1}(\sigma_{2}(x_{n}, x_{n-1}), c[\sigma_{2}(x_{n-2}, x_{n-3})]) \dotsm, c[\sigma_{2}(x_{4}, x_{3})] \right),\right. \\ &&\left.c[\sigma_{2}(x_{2}, x_{1})] \right)~, \text{and}\\ c^{\prime\prime} &=& \sigma_{1} \left( \sigma_{1} \left( \dotsm \sigma_{1}(x_{n}, \sigma_{2}(x_{n-1}, x_{n-2})) \dotsm, \sigma_{2}(x_{3}, x_{2}) \right), x_{1} \right)~. \end{array} $$

We note that c ^′ and c ^″ are the same as in the proof of Theorem 31 (see Fig. 4), except that we selected a more complicated tree c; thus, c ^′ and c ^″ are again linear and nondeleting in X _n, and can be defined formally as in that proof. Clearly (c ^′[t ₁,…, t _n], c ^″[t ₁,…, t _n])∈ τ for all t ₁,…, t _n∈ T _Ψ with Ψ = {γ ⁽¹⁾, α ⁽⁰⁾}. This time, every σ ₁-rib c[σ ₂(t _i, t _i−1)] is of the form

$$\sigma\left( {\gamma_{2}^{m}}(\alpha), \sigma \left( {\gamma_{2}^{m}}(\alpha), {\cdots} \sigma({\gamma_{2}^{m}}(\alpha), \sigma_{2}(t_{i}, t_{i-1})) {\cdots} \right) \right)~. $$

It is transformed into σ ₂(t _i, t _i−1) by τ(M ₁) as before (see Example 8). So we can apply Proposition 30 once again to obtain that there exist t ₁,…, t _n∈ T _Ψ, dependencies

$$\begin{array}{@{}rcl@{}} &&\langle c^{\prime}[t_{1}, \dotsc, t_{n}], L_{1}, u_{1}\rangle \in \mathcal{D}(N_{1})~, \qquad \langle u_{1}, L_{2}, u_{2}\rangle \in \mathcal{D}(N_{2}) \quad \text{and} \\ &&\langle u_{2}, L_{3}, c^{\prime\prime}[t_{1}, \dotsc, t_{n}] \rangle \in \mathcal{D}(N_{3})~, \end{array} $$

and links (v _i1, w _i1) ∈ L ₁, (v _i2, w _i2) ∈ L ₂, and (v _i3, w _i3) ∈ L ₃ for 1 ≤ i ≤ n such that

$\text {pos}_{x_{i}}(c^{\prime \prime }) \preceq w_{i3}$,
v _i3≼w _i2 and v _i2≼w _i1, and
$\text {pos}_{x_{i}}(c^{\prime }) \preceq v_{i1}$.

We first observe that for every j ∈{1,2,3}, the positions v _1j,…, v _nj are pairwise incomparable (as also shown in the proof of Proposition 30 in [11, Theorem 4]). In fact, since $\text {pos}_{x_{1}}(c^{\prime \prime }), \dotsc , \text {pos}_{x_{n}}(c^{\prime \prime })$ are pairwise incomparable, so are w ₁₃,…, w _n3 by the first item above and Lemma 32(i). Hence the corresponding link origins v ₁₃,…, v _n3 are pairwise incomparable by Lemma 29(i). This implies that w ₁₂,…, w _n2 are pairwise incomparable by the second item above, and hence so are the corresponding link origins v ₁₂,…, v _n2 using again Lemma 29(i). This argument can be repeated once more to show the observation.

We now start the analysis of the given dependencies in the same way as in the proof of Theorem 31 by considering the output tree u ₃ = c ^″[t ₁,…, t _n]. Entirely similar to that proof, we obtain a position v ^′∈pos(u ₂) such that v ^′≼w _(i+1)2, v ^′≼w _i2, and v ^′≰w _(i−1)2.

Next we move to the input tree u ₀ = c ^′[t ₁,…, t _n], where the analysis will be slightly different. As before, we consider the links (ε, ε), (v _i1, w _i1), and (v _(i−1)1, w _(i−1)1) in L ₁, for which we already know that $\text {pos}_{x_{i}}(c^{\prime }) \preceq v_{i1}$ and $\text {pos}_{x_{i-1}}(c^{\prime }) \preceq v_{(i-1)1}$. Let

$$V = \{v \in 1^{\ast}2 \mathbb{N}^{\ast} \mid v \prec \text{pos}_{x_{i}}(c^{\prime}),\, c^{\prime}(v)\neq\sigma_{2} \}~. $$

Clearly, |V| = n ²>n⋅a ₁. Thus, since $\mathcal {D}(N_{1})$ is input link-distance bounded by a ₁, the set $V^{\prime } = \{v \in V \mid \exists w \in \mathbb {N}^{\ast }: (v, w) \in L_{1} \}$ of link origins in V contains at least n elements by Lemma 29(ii). Let W ^′ = {w∣∃v ∈ V ^′:(v, w)∈ L ₁} be the set of corresponding link targets. Since the elements of V ^′ are pairwise comparable, the elements of W ^′ are also pairwise comparable by Lemma 29(i). For every w ∈ W ^′, we have w≼w _i1 and w≼w _(i−1)1 because v≺v _i1 and v≺v _(i−1)1 for every v ∈ V ^′ and $\mathcal {D}(N_{1})$ is strictly input hierarchical. Additionally, for every w ∈ W ^′, the positions w and w _(i+1)1 are incomparable because v and v _(i+1)1 are incomparable for every v ∈ V ^′. Since v _i2 and v _(i−1)2 are incomparable by the above observation, and v _i2≼w _i1 and v _(i−1)2≼w _(i−1)1, we obtain from Lemma 32(ii) that w≼v _i2 and w≼v _(i−1)2 for every w ∈ W ^′. Moreover, for every w ∈ W ^′, since v _(i+1)2≼w _(i+1)1, w≼v _i2, v _(i+1)2 and v _i2 are incomparable, and w and w _(i+1)1 are incomparable, Lemma 32(iii) shows that w and v _(i+1)2 are incomparable.

Now we distinguish two cases. First, let us assume that |W ^′|≥n. In this case, we can continue to derive a contradiction in much the same way as in the proof of Theorem 31. Since the positions in W ^′ are pairwise comparable, there are positions w _min, w _max∈ W ^′ of minimal and maximal length, respectively, with w _min≺w _max. Clearly, |w _max|−|w _min|≥n−1>a ₂. Since (ε, ε),(v _i2, w _i2) ∈ L ₂ and w _min≼v _i2, there must be a link (v ^″, w ^″) ∈ L ₂ such that w _min≺v ^″≺w _max by Lemma 29(ii). This implies that v ^″≺v _i2, v ^″≺v _(i−1)2, and v ^″ and v _(i+1)2 are incomparable. Since $\mathcal {D}(N_{2})$ is strictly input hierarchical, we obtain that w ^″≼w _i2, w ^″≼w _(i−1)2, and from Lemma 29(i) we obtain that w ^″ and w _(i+1)2 are incomparable, which takes us to the situation

$$\begin{array}{rrc} v^{\prime} \preceq w_{(i+1)2} &\qquad v^{\prime} \preceq w_{i2} & \qquad v^{\prime} \not\preceq w_{(i-1)2} \\* w^{\prime\prime} \not\preceq w_{(i+1)2} &\qquad w^{\prime\prime} \preceq w_{i2} & \qquad w^{\prime\prime} \preceq w_{(i-1)2}~, \end{array} $$

which is the analogue of (1) and (2) [in the proof of Theorem 31] and thus contradictory for the same reasons.

In the remaining case, we have |W ^′| < n. Together with |V ^′|≥n, we obtain by the pigeonhole principle that several input positions of V ^′ are linked in L ₁ to the same output position $\overline w$ of W ^′. We choose $(\overline v, \overline w) \in L_{1}$ such that $\overline v\in V^{\prime }$ and $\left |{\overline v}\right |$ is minimal. Consequently, a rule (ℓ, p, r)∈ R ₁ with a state r ∈ P as right-hand side must have been applied at position $\overline v$ of u ₀ = c ^′[t ₁,…, t _n].

Since $\overline {v} \in V$, the subtree $u_{0}|_{\overline v}$ is of the form

$$\sigma\! \left( s,\sigma \left( s, \dotsm\sigma(s, \sigma_{2}(t_{i}, t_{i-1}))\!\dotsm \right) \right), $$

where $s\! = {\gamma _{2}^{m}}(\alpha )$. Hence, since N ₁ is ε-free, the root of the left-hand side ℓ has label σ. Moreover, $\ell |_{1} = {\gamma _{2}^{k}}(p^{\prime })$ for some 0 ≤ k < m and p ^′∈ P. By the choice of $\overline v$, the state r occurs in ℓ|₂ and so the state p ^′ is deleted [i.e., p ^′∉states(r)] in this rule. Therefore, the subtree $u_{0}|_{{\overline v}.1^{k+1}} = \gamma _{2}^{m-k}(\alpha )$ has been created using the second item of Definition 6. Since N ₁ is an , its look-ahead mapping is trivial, and thus any tree can be created instead of $u_{0}|_{{\overline v}.1^{k+1}}$; e.g., the tree σ ₂(α, α). This shows that also $\langle u^{\prime }_{0}, L_{1}, u_{1}\rangle \in \mathcal {D}(N_{1})$, where $u^{\prime }_{0} = u_{0}[\overline v.1^{k+1} \gets \sigma _{2}(\alpha , \alpha )]$, and so $(u^{\prime }_{0}, c^{\prime \prime }[t_{1}, \dotsc , t_{n}]) \in \tau (N_{1}); \tau (N_{2}) ; \tau (N_{3})$. However, since $u^{\prime }_{0}|_{\overline v}$ is of the form

$$\sigma ({\gamma_{2}^{k}}(\sigma_{2}(\alpha, \alpha)), \sigma\left( s, \dotsm \sigma(s, \sigma_{2}(t_{i}, t_{i-1})) \dotsm ) \right)~, $$

the σ ₁-rib $u^{\prime }_{0}|_{1^{h}2}$ of $u^{\prime }_{0}$ with $1^{h}2 \preceq \overline v$ (i.e., $h = \frac {i}{2}-1$) has two occurrences of σ ₂. Hence $u^{\prime }_{0}$ is not in the domain of τ(M ₁) [see Example 8]. This implies that $u^{\prime }_{0}$ is not in the domain of τ = τ(M ₁);τ(M ₂);τ(M ₃), but

$$(u^{\prime}_{0}, c^{\prime\prime}[t_{1}, \dotsc, t_{n}]) \in \tau(N_{1}) ~; \tau(N_{2}); \tau(N_{3}) = \tau~, $$

which is a contradiction.

Since both cases are contradictory, τ cannot be computed by a composition of three . □

Thus, we have shown that the least power, at which the composition closure is achieved for the classes and , is 3 and 4, respectively. This is stated in the next theorem.

Theorem 34

For every n≥4,

Proof

We have for all n ≥ 1 by Lemma 15. The equalities follow from Theorem 24. The fourth and fifth inclusions are strict by Theorems 31 and 33, respectively. The strictness of the second and third inclusion follows from that of the fourth and fifth, respectively. The strictness of the first inclusion is a consequence of Proposition 14; it also follows from that of the third. □

In Table 3 we summarize the main results of this and the previous section, which allow us to present the least power at which the closure of the considered composition hierarchies is achieved. For the sake of completeness, we also present the corresponding results for the classes and $\mathcal {B}(\textnormal {l,l})$ that were obtained in [3, 5]. Recall that $\mathcal {B}(\textnormal {l,l})$ is the class of all tree transformations computable by bimorphisms, in which both tree homomorphisms are linear.

Table 3 Summary of the results of Section 5


6 Infinite Composition Hierarchies

To complete the picture, we will need one further result showing the infiniteness of the composition hierarchy for a large number of classes. In order to obtain a result that is as general as possible, we use bimorphisms [3] instead of l-xt^R in this section; cf. Proposition 11. We conclude several results for various tree transducer classes from the result for bimorphisms.

To handle bimorphisms properly, we need to define links for tree homomorphisms. As observed after Notation 10, every linear tree homomorphism φ:T _Σ→T _Δ can be viewed as a linear top-down tree transducer M _φ. In particular, for every t ∈ T _Σ there is a (unique) set L _φ(t)⊆pos(t)×pos(φ(t)) of links such that $\langle t, L_{\varphi }(t), \varphi (t) \rangle \in \mathcal {D}(M_{\varphi })$. We now generalize this notion to arbitrary tree homomorphisms.

Definition 35

Let φ:T _Σ→T _Δ be a tree homomorphism and t ∈ T _Σ. The set of t-links of φ, denoted by L _φ(t), is the smallest subset of pos(t)×pos(φ(t)) such that

(ε, ε)∈ L _φ(t) and
(v i, w w ^′) ∈ L _φ(t) for all links (v, w)∈ L _φ(t), integers 1 ≤ i≤rk(t(v)), and positions $w^{\prime } \in \text {pos}_{x_{i}} \left (\varphi (t(v)) \right )$.

Intuitively, (v, w)∈ L _φ(t) means that φ translates the subtree of t rooted at v into the subtree of φ(t) rooted at w. Note that for a given position v there can be several such positions w (which are, of course, pairwise incomparable), since φ is not necessarily linear, or there may be no such w, since φ is not necessarily nondeleting. We will need the following elementary properties of L _φ(t).
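Definition 35 can be read as a recursive procedure. The Python sketch below is a minimal illustration under stated assumptions: trees are nested tuples `(label, child_1, ..., child_k)`, a tree homomorphism is a dictionary that maps each input symbol to an output pattern, the variable x _i is written `("x", i)`, and no symbol is named `"x"`. None of these names or encodings come from the paper. The example homomorphism copies its second argument and deletes its first, so it shows both phenomena mentioned above: one input position with several link targets and one input position without any.

```python
def subtree(t, pos):
    """Return the subtree of t at position pos (a tuple of 1-based child indices)."""
    for k in pos:
        t = t[k]
    return t


def var_positions(pattern, i, prefix=()):
    """Positions of the variable x_i in an output pattern."""
    if pattern[0] == "x":
        return [prefix] if pattern[1] == i else []
    out = []
    for k, child in enumerate(pattern[1:], start=1):
        out += var_positions(child, i, prefix + (k,))
    return out


def apply_hom(phi, t):
    """Apply the tree homomorphism phi (a dict: symbol -> pattern) to the tree t."""
    def subst(pattern):
        if pattern[0] == "x":
            return apply_hom(phi, t[pattern[1]])
        return (pattern[0],) + tuple(subst(c) for c in pattern[1:])
    return subst(phi[t[0]])


def links(phi, t):
    """Compute the set of t-links of phi as the least set closed under Definition 35."""
    result = set()

    def go(sub, v, w):
        result.add((v, w))
        for i, child in enumerate(sub[1:], start=1):
            for w_prime in var_positions(phi[sub[0]], i):
                go(child, v + (i,), w + w_prime)

    go(t, (), ())
    return result


# Hypothetical example: phi(f) = h(x2, x2), phi(g) = g(g(x1)), phi(a) = a.
phi = {
    "f": ("h", ("x", 2), ("x", 2)),
    "g": ("g", ("g", ("x", 1))),
    "a": ("a",),
}
t = ("f", ("a",), ("g", ("a",)))
u = apply_hom(phi, t)
L = links(phi, t)
```

In this example the input position `(2,)` is linked to both `(1,)` and `(2,)`, whereas the input position `(1,)` has no link. For every computed link (v, w), applying `phi` to the subtree of `t` at v gives the subtree of `u` at w, which matches the intuition stated above.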

Lemma 36

Let φ:T _Σ →T _Δ be a tree homomorphism, and let t∈T _Σ , u=φ(t), and L=L _φ (t).

(i)
If (v,w)∈L, then φ(t| _v )=u| _w.
(ii)
If (v,w)∈L, then L _φ (t| _v )={(v ^′ ,w ^′ )∣(vv ^′ ,ww ^′ )∈L}.
(iii)
If φ is nondeleting, then for all (v ₁ ,w ₁ )∈L and all v ₁ ≼v∈pos(t) there exists a position w ₁ ≼w such that (v,w)∈L.
(iv)
For all links (v ₁ ,w ₁ ),(v ₂ ,w ₂ )∈L with v ₁ ≼v ₂ and w ₁ ≼w ₂ , and all v ₁ ≼v≼v ₂ there exists a unique position w ₁ ≼w≼w ₂ such that (v,w)∈L.
(v)
For all (v ₁ ,w ₁ )∈L and all w ₁ ≼w∈pos(u) there exist unique positions $v, w^{\prime }, w^{\prime \prime } \in \mathbb {N}^{\ast }$ such that v ₁ ≼v, w ₁ ≼w ^′ , w=w ^′ w ^′′ , (v,w ^′ )∈L, and w ^′′ ∈pos _Δ (φ(t(v))).

Proof

The proofs of statements (i) and (ii) are straightforward, and hence left to the reader. It is also straightforward to prove the following three statements, which are the special cases of statements (iii)–(v) in which we have (v ₁, w ₁)=(ε, ε). We also leave their proofs to the reader.

(iii)′
If φ is nondeleting, then dom(L) = pos(t).
(iv)′
For all (v ₂, w ₂) ∈ L and all v≼v ₂ there exists a unique position w≼w ₂ such that (v, w)∈ L.
(v)′
For every w ∈pos(u) there exist unique positions $v, w^{\prime }, w^{\prime \prime } \in \mathbb {N}^{\ast }$ such that w = w ^′ w ^″, (v, w ^′) ∈ L, and w ^″∈pos_Δ(φ(t(v))).

Each non-primed statement can now easily be obtained from the corresponding primed statement with the help of (i) and (ii). We start with statement (iii). Let (v ₁, w ₁) ∈ L and v ₁≼v ∈pos(t). Since v ₁≼v, let $\widehat v$ be such that $v_{1}\widehat v = v$. Obviously $\widehat {v} \in \text {pos}(t|_{v_{1}})$, and consequently, by statement (iii)^′, there exists $\widehat w$ such that $(\widehat v,\widehat w) \in L_{\varphi }(t|_{v_{1}})$. Together with (v ₁, w ₁) ∈ L and statement (ii) we conclude that $(v_{1}\widehat v, w_{1}\widehat w) \in L$. Thus, (v, w)∈ L where $w_{1} \preceq w = w_{1}\widehat w$.

For statement (iv), let (v ₁, w ₁),(v ₂, w ₂) ∈ L with v ₁≼v ₂ and w ₁≼w ₂, and let v ₁≼v≼v ₂. Since w ₁≼w ₂, let $\widehat w_{2}$ be such that $w_{1}\widehat w_{2} = w_{2}$. Similarly, since v ₁≼v≼v ₂, let $\widehat v \preceq \widehat v_{2}$ such that $v_{1}\widehat v = v$ and $v_{1}\widehat v_{2} = v_{2}$. Since (v ₁, w ₁) ∈ L, statement (ii) implies that $(\widehat v_{2}, \widehat w_{2}) \in L_{\varphi }(t|_{v_{1}})$. Thus, since also $\widehat v \preceq \widehat v_{2}$, statement (iv)^′ implies that there exists $\widehat w \preceq \widehat w_{2}$ such that $(\widehat v, \widehat w) \in L_{\varphi }(t|_{v_{1}})$. Using statement (ii) again, we have $(v_{1}\widehat v, w_{1}\widehat w) \in L$. Hence the requirements are fulfilled by $w = w_{1} \widehat w$; note that $w_{1} \preceq w_{1}\widehat w \preceq w_{1}\widehat w_{2} = w_{2}$. The uniqueness of w follows immediately from the uniqueness condition in statement (iv)^′.

Finally, for statement (v), let (v ₁, w ₁) ∈ L and w ₁≼w ∈pos(u). By statement (i) we have $\varphi (t|_{v_{1}}) = u|_{w_{1}}$. Since w ₁≼w, let $\widehat w$ be such that $w_{1}\widehat w = w$. Obviously, $\widehat w \in \text {pos}(u|_{w_{1}})$. By statement (v)^′ applied to $\widehat w$, there exist $\widehat v, \widehat w^{\prime }, \widehat w^{\prime \prime }$ such that $\widehat w = \widehat w^{\prime } \widehat w^{\prime \prime }$, $(\widehat v, \widehat w^{\prime }) \in L_{\varphi }(t|_{v_{1}})$, and $\widehat w^{\prime \prime }\in \text {pos}_{\Delta } \left (\varphi (t|_{v_{1}}(\widehat v)) \right )$. Since (v ₁, w ₁) ∈ L we can use statement (ii) applied to $(\widehat v, \widehat w^{\prime }) \in L_{\varphi }(t|_{v_{1}})$ to conclude that $(v_{1}\widehat v, w_{1} \widehat w^{\prime }) \in L$. Hence the requirements are fulfilled by $v = v_{1} \widehat v$, $w^{\prime } = w_{1}\widehat w^{\prime }$, and $w^{\prime \prime } = \widehat w^{\prime \prime }$. The uniqueness of v, w ^′, and w ^″ follows immediately from the uniqueness condition in statement (v)^′. □

The unique position v ∈pos(t) corresponding to the position w ∈pos(u) in Lemma 36(v) is informally called the position in t that creates the symbol u(w) at w. Since item (v) holds in particular for (v ₁, w ₁)=(ε, ε), that position does not depend on the link (v ₁, w ₁) ∈ L. Similarly, the unique position w in item (iv) does not depend on the link (v ₁, w ₁).
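The notion of the position that creates an output symbol can be illustrated computationally. The following Python sketch makes the same assumptions as any tuple-based encoding would: trees are nested tuples, a homomorphism is a dictionary from input symbols to output patterns, and `("x", i)` stands for x _i. These encodings and all function names are hypothetical. For every output position w it collects all decompositions (v, w ^′, w ^″) with (v, w ^′) a link and w ^″ a non-variable position of the pattern of t(v); uniqueness then shows up as every list having exactly one entry.

```python
def all_positions(t, prefix=()):
    """Enumerate all positions of a tree."""
    yield prefix
    for k, child in enumerate(t[1:], start=1):
        yield from all_positions(child, prefix + (k,))


def var_positions(pattern, i, prefix=()):
    """Positions of the variable x_i in an output pattern."""
    if pattern[0] == "x":
        return [prefix] if pattern[1] == i else []
    out = []
    for k, child in enumerate(pattern[1:], start=1):
        out += var_positions(child, i, prefix + (k,))
    return out


def symbol_positions(pattern, prefix=()):
    """Positions of a pattern that carry an output symbol rather than a variable."""
    if pattern[0] == "x":
        return []
    out = [prefix]
    for k, child in enumerate(pattern[1:], start=1):
        out += symbol_positions(child, prefix + (k,))
    return out


def apply_hom(phi, t):
    """Apply the tree homomorphism phi (a dict: symbol -> pattern) to t."""
    def subst(pattern):
        if pattern[0] == "x":
            return apply_hom(phi, t[pattern[1]])
        return (pattern[0],) + tuple(subst(c) for c in pattern[1:])
    return subst(phi[t[0]])


def creators(phi, t):
    """Map each output position w to all triples (v, w', w'') with w = w'w''."""
    table = {}

    def go(sub, v, w):
        # (v, w) is a link; every symbol position of phi(t(v)) is created at v.
        for w2 in symbol_positions(phi[sub[0]]):
            table.setdefault(w + w2, []).append((v, w, w2))
        for i, child in enumerate(sub[1:], start=1):
            for w_prime in var_positions(phi[sub[0]], i):
                go(child, v + (i,), w + w_prime)

    go(t, (), ())
    return table


# Hypothetical example: phi(f) = h(x2, x2), phi(g) = g(g(x1)), phi(a) = a.
phi = {"f": ("h", ("x", 2), ("x", 2)), "g": ("g", ("g", ("x", 1))), "a": ("a",)}
t = ("f", ("a",), ("g", ("a",)))
table = creators(phi, t)
```

In this example the input position `(2,)` creates the symbols at the four output positions `(1,)`, `(1, 1)`, `(2,)`, and `(2, 1)`, while the deleted input position `(1,)` creates none.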

We now turn to the proof of the infiniteness of the composition hierarchies. The main auxiliary notion used in that proof is the assignment of levels to positions in a tree. Let t ∈ T _Σ. Since the branching positions of t (i.e., those that are labeled by symbols of rank at least 2) will play an essential role, we define the set of branching positions of t, the set of branching positions of t together with two different successor indices, and the set of branching positions along a given path, as follows:

$$\begin{array}{@{}rcl@{}} \text{br}_{t} &=&\{ v \in \text{pos}(t) \mid t(v) \notin {\Sigma}_{0} \cup {\Sigma}_{1}\} \\ \text{bri}_{t} &=& \left\{ \langle v, i, j\rangle \mid v \in \text{br}_{t},\, 1 \leq i, j \leq \textnormal{rk}(t(v)),\, i \neq j \right\} \end{array} $$

and for every v ₁, v ₂∈pos(t) with v ₁≼v ₂ we let

$$\text{br}_{t}(v_{1}, v_{2}) =\{ v \in \text{br}_{t} \mid v_{1} \preceq v \preceq v_{2} \}~. $$

Let ℓ ≥ 2 be arbitrary (called distance in the sequel). We inductively define the sets $\text {PI}^{\ell }_{n}(t) \subseteq \text {pos}(t) \times \mathbb {N} \times \mathbb {N}$ of special positions of level n and distance ℓ with successor indices and the sets $\text {P}^{\ell }_{n}(t) \subseteq \text {pos}(t)$ for the same special positions without successor indices for every $n \in \mathbb {N}$ as follows:

$$\begin{array}{@{}rcl@{}} \text{PI}^{\ell}_{0}(t) &=& \text{bri}_{t} \\ \text{P}^{\ell}_{0}(t) &=& \text{br}_{t} = \{v \mid \exists i, j : \langle v, i, j\rangle \in \text{PI}^{\ell}_{0}(t) \} \\ \text{PI}^{\ell}_{n+1}(t) &=&\{ \langle v, i, j\rangle \in \text{bri}_{t} \!\mid \!\exists v_{1} \in \text{P}^{\ell}_{n}(t) :\! vi \preceq v_{1}, |{\text{br}_{t}(vi, v_{1}) \cap \text{P}^{\ell}_{n}(t)} | \geq \ell^{n+1} \\ &&\qquad\qquad\qquad\;\;\; \!\exists v_{2} \in \text{P}^{\ell}_{n}(t) :\! vj \preceq v_{2}, |{\text{br}_{t}(vj, v_{2}) \cap \text{P}^{\ell}_{n}(t)}| \geq \ell^{n+1} \} \\ \text{P}^{\ell}_{n+1}(t) &=&\{ v \mid \exists i,j: \langle v, i, j\rangle \in \text{PI}^{\ell}_{n+1}(t) \} \end{array} $$

Intuitively, each branching position is a special position of level 0 (for any distance ℓ) and a branching position v is a special position of level n+1 if there are two paths in different direct subtrees below v that both have at least ℓ ⁿ⁺¹ special positions of level n along the path. Clearly, $\text {PI}^{\ell }_{n+1}(t)\subseteq \text {PI}^{\ell }_{n}(t)$ and $\text {P}^{\ell }_{n+1}(t) \subseteq \text {P}^{\ell }_{n}(t)$ for all $n \in \mathbb {N}$. Note that in the definition of $\text {PI}^{\ell }_{n+1}(t)$, the condition that $v_{1}, v_{2} \in \text {P}^{\ell }_{n}(t)$ is superfluous, but technically convenient.
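The inductive definition of the sets of special positions translates directly into an iterative computation. The Python sketch below is a naive illustration and makes several assumptions of its own: trees are nested tuples `(label, child_1, ..., child_k)`, positions are tuples of 1-based child indices, and the function names are hypothetical. It recomputes the level n+1 sets from the level n sets exactly as in the displayed definition, without any attempt at efficiency.

```python
def subtree(t, pos):
    """Return the subtree of t at position pos."""
    for k in pos:
        t = t[k]
    return t


def all_positions(t, prefix=()):
    """Enumerate all positions of a tree."""
    yield prefix
    for k, child in enumerate(t[1:], start=1):
        yield from all_positions(child, prefix + (k,))


def is_prefix(v, w):
    """True if v is a prefix of w."""
    return w[:len(v)] == v


def special_positions(t, ell, n):
    """Return the pair (PI, P) of special positions of level n and distance ell."""
    # br_t: positions labeled by a symbol of rank at least 2.
    br = {v for v in all_positions(t) if len(subtree(t, v)) - 1 >= 2}
    # bri_t: branching positions with two different successor indices.
    bri = {(v, i, j)
           for v in br
           for i in range(1, len(subtree(t, v)))
           for j in range(1, len(subtree(t, v)))
           if i != j}
    PI, P = set(bri), set(br)
    for level in range(n):
        threshold = ell ** (level + 1)
        prev = P  # special positions of the current level

        def rich(start):
            # Is there v1 in prev below start such that the path from start
            # to v1 contains at least `threshold` positions of prev?
            return any(
                is_prefix(start, v1)
                and sum(1 for p in prev
                        if is_prefix(start, p) and is_prefix(p, v1)) >= threshold
                for v1 in prev
            )

        PI = {(v, i, j) for (v, i, j) in bri if rich(v + (i,)) and rich(v + (j,))}
        P = {v for (v, _, _) in PI}
    return PI, P


def full_binary(depth):
    """Full binary tree over f (rank 2) and a (rank 0); hypothetical test input."""
    if depth == 0:
        return ("a",)
    return ("f", full_binary(depth - 1), full_binary(depth - 1))
```

For the full binary tree of depth 3 and distance 2, all seven inner positions have level 0, only the root has level 1, and no position has level 2.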

Example 37

Let t be the tree depicted in Fig. 5. Then

$$\begin{array}{@{}rcl@{}} {\text{P}^{2}_{0}}(t) &=& \{\varepsilon, 1, 11, 112, 1121, 11211, 12, 121, 2, 21, 211, 2111 \} = \text{br}_{t} \\ {\text{P}^{2}_{1}}(t) &=& \{\varepsilon, 1\} \\ {\text{P}^{2}_{2}}(t) &=& \emptyset~. \end{array} $$

Lemma 38

Let t∈T _Σ and $\ell , n \in \mathbb {N}$ with ℓ≥2. Moreover, let $v, v^{\prime } \in \mathbb {N}^{\ast }$ and $i, j \in \mathbb {N}$.

(i)
$\langle v^{\prime }, i, j\rangle \in \textnormal {PI}^{\ell }_{n}(t|_{v})$ if and only if $\langle vv^{\prime }, i, j\rangle \in \textnormal {PI}^{\ell }_{n}(t)$ , and $v^{\prime } \in \textnormal {P}^{\ell }_{n}(t|_{v})$ if and only if $vv^{\prime } \in \textnormal {P}^{\ell }_{n}(t)$.
(ii)
If $v, viv^{\prime } \in \textnormal {P}^{\ell }_{n}(t)$ , then there exists $m \in \mathbb {N}$ such that $\langle v, i, m\rangle \in \textnormal {PI}^{\ell }_{n}(t)$.

Proof

We prove the items individually. We start with (i), which is obvious because whether or not 〈v, i, j〉 is in $\text {PI}^{\ell }_{n}(t)$ only depends on the positions of which v is a prefix. Statement (ii) is also trivial for n = 0, hence we only prove it for n+1. Let $v, viv^{\prime } \in \text {P}^{\ell }_{n+1}(t)$. Since $v \in \text {P}^{\ell }_{n+1}(t)$ there exist integers i ₁, i ₂ such that $\langle v, i_{1}, i_{2}\rangle \in \text {PI}^{\ell }_{n+1}(t)$. If i ∈{i ₁, i ₂}, then the statement is obviously true. In the remaining case, let i∉{i ₁, i ₂}. There exists a position $v_{2} \in \text {P}^{\ell }_{n}(t)$ such that v i ₂≼v ₂ and $\left |{\text {br}_{t}(vi_{2}, v_{2}) \cap \text {P}^{\ell }_{n}(t)}\right | \geq \ell ^{n+1}$. Since $viv^{\prime } \in \text {P}^{\ell }_{n+1}(t)$, there exist $i^{\prime } \in \mathbb {N}$ and $v_{1} \in \text {P}^{\ell }_{n}(t)$ such that v i v ^′ i ^′≼v ₁ and $\left |{\text {br}_{t}(viv^{\prime }i^{\prime }, v_{1}) \cap \text {P}^{\ell }_{n}(t)}\right | \geq \ell ^{n+1}$. Hence v i≼v ₁ and $\left |{\text {br}_{t}(vi, v_{1}) \cap \text {P}^{\ell }_{n}(t)}\right | \geq \ell ^{n+1}$, which shows that $\langle v, i, i_{2}\rangle \in \text {PI}^{\ell }_{n+1}(t)$. □

We now prove that a nondeleting tree homomorphism preserves the maximal level of the special positions of a tree.

Lemma 39

Let φ:T _Γ →T _Δ be a nondeleting tree homomorphism, and let t=γ(t ₁ ,…,t _k ) for some $k \in \mathbb {N}$ , γ∈Γ _k , and t ₁ ,…,t _k ∈T _Γ . Moreover, let $\ell , n, i, j \in \mathbb {N}$ be such that ℓ≥2 and $\langle \varepsilon , i, j\rangle \in \textnormal {PI}^{\ell }_{n}(t)$ . Then for every $z_{1} \in \text {pos}_{x_{i}}(\varphi (\gamma ))$ and $z_{2} \in \text {pos}_{x_{j}} (\varphi (\gamma ))$ there exists $\langle w, i^{\prime }, j^{\prime }\rangle \in \textnormal {PI}^{\ell }_{n}(\varphi (t))$ such that w∈pos(φ(γ)) and wi ^′ ≼z ₁ and wj ^′ ≼z ₂.

Proof

Let u = φ(t) = φ(γ)[φ(t ₁),…, φ(t _k)] and L = L _φ(t). We prove the statement by induction on n. In the induction base, we have n = 0 and $\langle \varepsilon , i, j\rangle \in \text {PI}^{\ell }_{0}(t) = \text {bri}_{t}$. Consider $z_{1} \in \text {pos}_{x_{i}}(\varphi (\gamma ))$ and $z_{2} \in \text {pos}_{x_{j}}(\varphi (\gamma ))$, which are occurrences of the variables x _i≠x _j in φ(γ). Let w = lcp(z ₁, z ₂) be their longest common prefix. Since x _i≠x _j, we have w≺z ₁ and w≺z ₂, so let $i^{\prime }, j^{\prime } \in \mathbb {N}$ be the unique (and necessarily distinct) integers such that w i ^′≼z ₁ and w j ^′≼z ₂. Clearly, w ∈pos(φ(γ)) and $\langle w, i^{\prime }, j^{\prime }\rangle \in \text {bri}_{u} = \text {PI}^{\ell }_{0}(u)$. This completes the induction base.

In the induction step, let $\langle \varepsilon , i, j\rangle \in \text {PI}^{\ell }_{n+1}(t)$, and suppose that $v_{1}\in \text {P}^{\ell }_{n}(t)$ and $v_{2} \in \text {P}^{\ell }_{n}(t)$ are the required special positions of level n such that i≼v ₁ and j≼v ₂ and

$$\left|{\text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n}(t)}\right| \geq \ell^{n+1} \qquad \text{and} \qquad \left|{\text{br}_{t}(j, v_{2}) \cap \text{P}^{\ell}_{n}(t)}\right| \geq \ell^{n+1}~. $$

Now, we follow an approach similar to that of the induction base. Figure 6 illustrates the positions used and their relations. Consider positions $z_{1} \in \text {pos}_{x_{i}}(\varphi (\gamma ))$ and $z_{2} \in \text {pos}_{x_{j}}(\varphi (\gamma ))$. As before, we let

$$w = \textnormal{lcp}(z_{1}, z_{2})\in \text{pos}(\varphi(\gamma)) $$

be their longest common prefix, and let w i ^′≼z ₁ and w j ^′≼z ₂. Clearly, i ^′≠j ^′ and so 〈w, i ^′, j ^′〉∈bri_u. It remains to show that $\langle w, i^{\prime }, j^{\prime }\rangle \in \text {PI}^{\ell }_{n+1}(u)$.

By Definition 35, we have (ε, ε)∈ L and (i, z ₁) ∈ L. Since φ is nondeleting and (i, z ₁) ∈ L and i≼v ₁, it follows from Lemma 36(iii) that there exists $w^{\prime }_{1}$ such that $z_{1} \preceq w^{\prime }_{1}$ and $(v_{1}, w^{\prime }_{1}) \in L$. Thus, Lemma 36(i) shows that $u|_{w^{\prime }_{1}} = \varphi (t|_{v_{1}})$. By assumption we have $v_{1} \in \text {P}^{\ell }_{n}(t)$, which yields $\varepsilon \in \text {P}^{\ell }_{n}(t|_{v_{1}})$ by Lemma 38(i); i.e., $\langle \varepsilon , i^{\prime \prime }, j^{\prime \prime }\rangle \in \text {PI}^{\ell }_{n}(t|_{v_{1}})$ for some i ^″, j ^″. Since φ is nondeleting, the sets $\text {pos}_{x_{i^{\prime \prime }}}\left (\varphi (t(v_{1}))\right )$ and $\text {pos}_{x_{j^{\prime \prime }}} \left (\varphi (t(v_{1})) \right )$ are nonempty. Consequently, the induction hypothesis implies the existence of $w^{\prime \prime }_{1} \in \text {P}^{\ell }_{n}(\varphi (t|_{v_{1}})) = \text {P}^{\ell }_{n}(u|_{w^{\prime }_{1}})$. Hence $w^{\prime }_{1}w^{\prime \prime }_{1} \in \text {P}^{\ell }_{n}(u)$ by Lemma 38(i). Let $w_{1} = w^{\prime }_{1}w^{\prime \prime }_{1}$, and let w ₂ be determined in an analogous way. We claim that w ₁ and w ₂ are the special positions of level n that are required to show that $\langle w, i^{\prime }, j^{\prime }\rangle \in \text {PI}^{\ell }_{n+1}(u)$. We will only verify the condition

$$ \left|{\text{br}_{u}(wi^{\prime}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| \geq \ell^{n+1} \tag{3} $$

because the proof for w ₂ works analogously. Due to $wi^{\prime } \preceq z_{1} \preceq w^{\prime }_{1} \preceq w_{1}$, we obtain that $w_{1} \in \text {br}_{u}(wi^{\prime }, w_{1}) \cap \text {P}^{\ell }_{n}(u)$.

Let $\bar {v}_{1} \in \text {br}_{t}(i, v_{1}) \cap \text {P}^{\ell }_{n}(t)$ be any position of level n along the path from i to v ₁ such that $\bar {v}_{1} \prec v_{1}$. Hence there exists a unique integer $i_{1} \in \mathbb {N}$ such that $\bar {v}_{1}i_{1} \preceq v_{1}$. Since $(i, z_{1}), (v_{1}, w^{\prime }_{1}) \in L$ together with $i \preceq \bar {v}_{1}i_{1} \preceq v_{1}$ we can use Lemma 36(iv) to conclude that there exists $z_{1} \preceq \widehat {w}^{\prime }_{1} \preceq w^{\prime }_{1}$ such that $(\bar {v}_{1}i_{1}, \widehat {w}^{\prime }_{1}) \in L$. Applied once more to $i \preceq \bar {v}_{1} \preceq \bar {v}_{1}i_{1}$ and the links $(i, z_{1}), (\bar {v}_{1}i_{1}, \widehat {w}^{\prime }_{1}) \in L$, there exists $z_{1} \preceq \bar {w}^{\prime }_{1} \preceq \widehat {w}^{\prime }_{1}$ with $(\bar {v}_{1}, \bar {w}^{\prime }_{1}) \in L$. Let $z \in \mathbb {N}^{\ast }$ be such that $\bar {w}^{\prime }_{1} z = \widehat {w}^{\prime }_{1}$. By Definition 35 we have $z \in \text {pos}_{x_{i_{1}}} \left (\varphi (t(\bar {v}_{1})) \right )$. Since $\bar {v}_{1}, v_{1} \in \text {P}^{\ell }_{n}(t)$, we conclude from Lemma 38(ii) that there exists $j_{1} \in \mathbb {N}$ such that $\langle \bar {v}_{1}, i_{1}, j_{1}\rangle \in \text {PI}^{\ell }_{n}(t)$. Hence $\langle \varepsilon , i_{1}, j_{1}\rangle \in \text {PI}^{\ell }_{n}(t|_{\bar {v}_{1}})$ by Lemma 38(i) and $u|_{\bar {w}^{\prime }_{1}} = \varphi (t|_{\bar {v}_{1}})$ by Lemma 36(i). Now we can apply the induction hypothesis to obtain that there exists $\langle \bar {w}^{\prime \prime }_{1}, i^{\prime }_{1}, j^{\prime }_{1} \rangle \in \text {PI}^{\ell }_{n}(\varphi (t|_{\bar {v}_{1}}))$ such that $\bar {w}^{\prime \prime }_{1}i^{\prime }_{1} \preceq z$. 
Hence $\langle \bar {w}^{\prime \prime }_{1}, i^{\prime }_{1}, j^{\prime }_{1} \rangle \in \text {PI}^{\ell }_{n}(u|_{\bar {w}^{\prime }_{1}})$ and so $\langle \bar {w}^{\prime }_{1}\bar {w}^{\prime \prime }_{1}, i^{\prime }_{1}, j^{\prime }_{1} \rangle \in \text {PI}^{\ell }_{n}(u)$ by Lemma 38(i). Consequently, $\bar {w}_{1} \in \text {P}^{\ell }_{n}(u)$, where $\bar {w}_{1} = \bar {w}^{\prime }_{1}\bar {w}^{\prime \prime }_{1}$. In addition, $wi^{\prime } \preceq \bar {w}_{1} \prec w_{1}$ because

$$wi^{\prime} \preceq z_{1} \preceq \bar{w}^{\prime}_{1} \preceq \bar{w}_{1} \qquad \text{and} \qquad \bar{w}_{1} \prec \bar{w}^{\prime}_{1}\bar{w}^{\prime\prime}_{1}i^{\prime}_{1} \preceq \bar{w}^{\prime}_{1}z = \widehat{w}^{\prime}_{1} \preceq w^{\prime}_{1} \preceq w_{1}~. $$

In other words, we have shown that $\bar {w}_{1} \in \text {br}_{u}(wi^{\prime }, w_{1}) \cap \text {P}^{\ell }_{n}(u)$. Moreover, since $\bar {w}^{\prime \prime }_{1}i^{\prime }_{1} \preceq z$, we have that $\bar {w}^{\prime \prime }_{1} \in \text {pos}_{\Delta } \left (\varphi (t(\bar {v}_{1})) \right )$. Since also $(\bar {v}_{1}, \bar {w}^{\prime }_{1}) \in L$, we can say that $\bar {v}_{1}$ is the position in t that creates the symbol $u(\bar {w}_{1}) = u(\bar {w}^{\prime }_{1} \bar {w}^{\prime \prime }_{1})$ at $\bar {w}_{1}$. Hence, the uniqueness condition in Lemma 36(v) guarantees that for each selection of $\bar {v}_{1}$ we obtain a different position $\bar {w}_{1} \in \text {br}_{u}(wi^{\prime }, w_{1}) \cap \text {P}^{\ell }_{n}(u)$. This verifies (3) because $w_{1} \in \text {br}_{u}(wi^{\prime }, w_{1}) \cap \text {P}^{\ell }_{n}(u)$ and there are at least ℓ ⁿ⁺¹−1 possible selections of $\bar {v}_{1}$ (and each position $\bar {w}_{1}$ differs from w ₁ because $\bar {w}_{1} \prec w_{1}$ as shown above). □

The next lemma shows that an inverse linear tree homomorphism reduces the maximal level of the special positions of a tree by at most 1 (for a sufficiently large distance ℓ).

Lemma 40

Let $\psi : T_{\Gamma} \to T_{\Sigma}$ be a linear tree homomorphism. Moreover, let $t \in T_{\Gamma}$ and $\ell , n \in \mathbb {N}$ be such that $\ell > \text{ht}(\psi(\gamma^{\prime}))$ for all symbols $\gamma^{\prime} \in \Gamma$. If there exists $w \in \textnormal {P}^{\ell }_{n+1}(\psi (t))$ with $w \in \text{pos}_{\Sigma}(\psi(t(\varepsilon)))$, then $\varepsilon \in \textnormal {P}^{\ell }_{n}(t)$.

Proof

The proof is similar to the one of Lemma 39. Let $t = \gamma(t_{1}, \dotsc, t_{k})$ with $k \in \mathbb {N}$, $\gamma \in \Gamma_{k}$, and $t_{1}, \dotsc, t_{k} \in T_{\Gamma}$, and let $u = \psi(t) = \psi(\gamma)[\psi(t_{1}), \dotsc, \psi(t_{k})]$. Moreover, let $\langle w, i^{\prime }, j^{\prime }\rangle \in \text {PI}^{\ell }_{n+1}(u)$ with $w \in \text{pos}_{\Sigma}(\psi(\gamma))$. By the definition of $\text {PI}^{\ell }_{n+1}(u)$, there exist positions $w_{1} \in \text {P}^{\ell }_{n}(u)$ and $w_{2} \in \text {P}^{\ell }_{n}(u)$ such that $wi^{\prime} \preceq w_{1}$ and $wj^{\prime} \preceq w_{2}$ and

$$\left|{\text{br}_{u}(wi^{\prime}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| \geq \ell^{n+1} \qquad \text{and} \qquad \left|{\text{br}_{u}(wj^{\prime}, w_{2}) \cap \text{P}^{\ell}_{n}(u)}\right| \geq \ell^{n+1} ~. $$

The paths in u from w to $w_{1}$ and from w to $w_{2}$ contain strictly more than $\ell^{n+1}$ positions, so they are longer than any path in ψ(γ). Together with $w \in \text{pos}_{\Sigma}(\psi(\gamma))$ we conclude that there must exist 1 ≤ i, j ≤ k and positions $z_{1} \in \text {pos}_{x_{i}}(\psi (\gamma ))$ and $z_{2} \in \text {pos}_{x_{j}}(\psi (\gamma ))$ such that $wi^{\prime} \preceq z_{1} \preceq w_{1}$ and $wj^{\prime} \preceq z_{2} \preceq w_{2}$. Since ψ is linear and $i^{\prime} \neq j^{\prime}$, we have $i \neq j$, which yields $\langle \varepsilon , i, j\rangle \in \text {bri}_{t}$. It remains to prove that $\langle \varepsilon , i, j\rangle \in \text {PI}^{\ell }_{n}(t)$, which we prove by induction on n. In the induction base we have n = 0 and thus $\langle \varepsilon , i, j\rangle \in \text {bri}_{t} = \text {PI}^{\ell }_{0}(t)$.

We proceed with the induction step. Again, Fig. 6 illustrates the used positions and their relations. Clearly, we have (i, z ₁) ∈ L. Let v ₁ be the position of t _i that creates the symbol u(w ₁) at w ₁. More precisely, by Lemma 36(v), there exist unique positions $v_{1}, w^{\prime }_{1}, w^{\prime \prime }_{1}$ such that i≼v ₁, $z_{1} \preceq w^{\prime }_{1}$, $w_{1} = w^{\prime }_{1} w^{\prime \prime }_{1}$, $(v_{1}, w^{\prime }_{1}) \in L$, and $w^{\prime \prime }_{1} \in \text {pos}_{\Sigma } \left (\psi (t(v_{1})) \right )$. Similarly, let v ₂∈ pos(t) be the position that creates the symbol u(w ₂) at w ₂. We claim that the property required to prove that $\langle \varepsilon , i, j\rangle \in \text {PI}^{\ell }_{n}(t)$, and hence $\varepsilon \in \text {P}^{\ell }_{n}(t)$, holds for br_t(i, v ₁) and br_t(j, v ₂), i.e.,

$$\left|{\text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n-1}(t)}\right| \geq \ell^{n} \qquad \text{and} \qquad \left|{\text{br}_{t}(j, v_{2}) \cap \text{P}^{\ell}_{n-1}(t)}\right| \geq \ell^{n} ~. $$

We only prove this property for v ₁ because the proof for v ₂ is analogous. Since $w_{1} = w^{\prime }_{1}w^{\prime \prime }_{1} \in \text {P}^{\ell }_{n}(u)$, it follows from Lemma 38(i) that $w^{\prime \prime }_{1} \in \text {P}^{\ell }_{n}(u|_{w^{\prime }_{1}})$. Moreover, $(v_{1}, w^{\prime }_{1}) \in L$ and Lemma 36(i) yield that $u|_{w^{\prime }_{1}} = \psi (t|_{v_{1}})$ and thus $\text {P}^{\ell }_{n}(u|_{w^{\prime }_{1}}) = \text {P}^{\ell }_{n}(\psi (t|_{v_{1}}))$. Together with $w^{\prime \prime }_{1} \in \text {pos}_{\Sigma } \left (\psi (t(v_{1})) \right )$, we can conclude that $\varepsilon \in \text {P}^{\ell }_{n-1}(t|_{v_{1}})$ from the induction hypothesis, and hence $v_{1} \in \text {P}^{\ell }_{n-1}(t)$ by Lemma 38(i).

Next, we consider any position $\bar {w}_{1} \in \text {br}_{u}(z_{1}, w_{1}) \cap \text {P}^{\ell }_{n}(u)$. We follow the same approach as in the beginning of the induction step. Let $\bar {v}_{1}$ be the position of t _i that creates the symbol $u(\bar {w}_{1})$ at $\bar {w}_{1}$. More precisely, we apply Lemma 36(v) to $\bar {w}_{1}$ to obtain that there exist positions $\bar {v}_{1}, \bar {w}^{\prime }_{1}, \bar {w}^{\prime \prime }_{1}$ such that $i \preceq \bar {v}_{1}$, $z_{1} \preceq \bar {w}^{\prime }_{1}$, $\bar {w}_{1} = \bar {w}^{\prime }_{1}\bar {w}^{\prime \prime }_{1}$, $(\bar {v}_{1}, \bar {w}^{\prime }_{1}) \in L$, and $\bar {w}^{\prime \prime }_{1} \in \text {pos}_{\Sigma } \left (\psi (t(\bar {v}_{1})) \right )$. By the same reasoning as in the previous paragraph, we obtain that $\bar {v}_{1} \in \text {P}^{\ell }_{n-1}(t)$. Also, since $\bar {w}_{1}\preceq w_{1}$, we clearly have that $\bar {w}^{\prime }_{1} \preceq w^{\prime }_{1}$ because $w_{1}^{\prime }$ (resp. $\bar {w}^{\prime }_{1}$) is the first position on the path from w ₁ (resp. $\bar {w}_{1}$) to ε that occurs in a link of L. Now note that L is strictly output hierarchical by Proposition 28 because $\langle t, L, u\rangle \in \mathcal {D}(M_{\psi })$, where M _ψ is the l-t defined after Notation 10. Hence $\bar {v}_{1} \preceq v_{1}$ because either $\bar {w}_{1}^{\prime } \prec w_{1}^{\prime }$, which directly yields $\bar {v}_{1} \preceq v_{1}$, or $\bar {w}_{1}^{\prime } = w_{1}^{\prime }$, which yields $\bar {v}_{1} = v_{1}$ because of the uniqueness of $\bar {v}_{1}$. Thus we have shown that

$$\bar{v}_{1} \in \text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n-1}(t)~. $$

If two different selections of $\bar {w}_{1}$ correspond to the same position $\bar {v}_{1}$, then (since $(\bar {v}_{1}, \bar {w}^{\prime }_{1}), (v_{1}, w^{\prime }_{1}) \in L$ with $\bar {v}_{1} \preceq v_{1}$ and $\bar {w}^{\prime }_{1} \preceq w^{\prime }_{1}$) they also correspond to the same $\bar {w}^{\prime }_{1}$ by the uniqueness condition in Lemma 36(iv), and hence, since $\bar {w}^{\prime \prime }_{1} \in \text {pos}_{\Sigma } \left (\psi (t(\bar {v}_{1})) \right )$, their distance is at most $\text {ht} \left (\psi (t(\bar {v}_{1})) \right ) \leq \ell -1$. In summary, a single position $\bar {v}_{1}$ can create the symbols of at most ℓ positions of br_u(z ₁, w ₁). Since there are at most ℓ−2 positions between w and z ₁ we have

$$\left|{\text{br}_{u}(z_{1}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| \geq \left|{\text{br}_{u}(wi^{\prime}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| - \ell + 2 \geq \ell^{n+1} - \ell + 2~. $$

Consequently, $\left |{\text {br}_{t}(i, v_{1}) \cap \text {P}^{\ell }_{n-1}(t)}\right | \geq \ell ^{n}$ as required since

$$\left|{\text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n-1}(t)}\right| \leq \ell^{n} - 1 $$

would imply $\left |{\text {br}_{u}(z_{1}, w_{1}) \cap \text {P}^{\ell }_{n}(u)}\right | \leq \ell (\ell ^{n}-1) < \ell ^{n+1} - \ell + 2$. This completes the induction step and the proof. □
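The counting behind this last step is a pigeonhole estimate. The following lines merely restate the argument above in one chain and add no new claim. Every position in $\text{br}_{u}(z_{1}, w_{1}) \cap \text{P}^{\ell}_{n}(u)$ is created by some $\bar{v}_{1} \in \text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n-1}(t)$, and a single $\bar{v}_{1}$ creates the symbols of at most ℓ such positions, so

$$ \left|{\text{br}_{u}(z_{1}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| \leq \ell \cdot \left|{\text{br}_{t}(i, v_{1}) \cap \text{P}^{\ell}_{n-1}(t)}\right| ~. $$

If the cardinality on the right were at most $\ell^{n} - 1$, this would give

$$ \left|{\text{br}_{u}(z_{1}, w_{1}) \cap \text{P}^{\ell}_{n}(u)}\right| \leq \ell (\ell^{n} - 1) = \ell^{n+1} - \ell < \ell^{n+1} - \ell + 2 ~, $$

which contradicts the lower bound $\ell^{n+1} - \ell + 2$ established before.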

Next, we combine the previous two lemmas into the main result of this section that will be used to prove the infinity of several composition hierarchies. We show that a bimorphism in $\mathcal {B}$( l, n) can reduce the maximal level of the special positions by at most 1 (for a sufficiently large distance ℓ).

Theorem 41

Let B=(ψ,T,φ) be a bimorphism such that ψ:T _Γ →T _Σ is linear and φ:T _Γ →T _Δ is nondeleting. Moreover, let (s,u)∈τ(B), and let $\ell \in \mathbb {N}$ be such that ℓ>ht(ψ(γ)) for every γ∈Γ. For every $n \in \mathbb {N}$ , if $\textnormal {P}^{\ell }_{n+1}(s) \neq \emptyset $ , then $\textnormal {P}^{\ell }_{n}(u) \neq \emptyset $.

Proof

Since (s, u)∈ τ(B), there exists t ∈ T such that ψ(t) = s and φ(t) = u. By assumption, we have that $\text {P}^{\ell }_{n+1}(\psi (t)) \neq \emptyset $, so let $w \in \text {P}^{\ell }_{n+1}(\psi (t))$. By Lemma 36(v) there exist $v$, $w^{\prime}$, $w^{\prime\prime}$ such that $w = w^{\prime}w^{\prime\prime}$, $(v, w^{\prime}) \in L_{\psi}(t)$, and $w^{\prime\prime} \in \text{pos}_{\Sigma}(\psi(t(v)))$. Moreover, $\psi (t)|_{w^{\prime }} = \psi (t|_{v})$ by Lemma 36(i). Since $w^{\prime }w^{\prime \prime } \in \text {P}^{\ell }_{n+1}(\psi (t))$, Lemma 38(i) implies that

$$w^{\prime\prime} \in \text{P}^{\ell}_{n+1}(\psi(t)|_{w^{\prime}}) = \text{P}^{\ell}_{n+1}(\psi(t|_{v})) ~. $$

Hence, by Lemma 40, $\varepsilon \in \text {P}^{\ell }_{n}(t|_{v})$. Since φ is nondeleting, $\text {pos}_{x_{i}} \left (\varphi (t(v)) \right )$ is nonempty for every 1 ≤ i≤rk(t(v)). Consequently, Lemma 39 implies that $\text {P}^{\ell }_{n}(\varphi (t|_{v})) \neq \emptyset $. By Lemma 36(iii) there exists $\bar {w}$ such that $(v, \bar {w}) \in L_{\varphi }(t)$, and moreover, $\varphi (t|_{v}) = u|_{\bar {w}}$ by Lemma 36(i). Hence $\text {P}^{\ell }_{n}(u|_{\bar {w}}) = \text {P}^{\ell }_{n}(\varphi (t|_{v})) \neq \emptyset $, which proves that $\text {P}^{\ell }_{n}(u) \neq \emptyset $ by Lemma 38(i), as desired. □

Now we can simply chain Theorem 41 to show that an n-fold composition of tree transformations in $\mathcal {B}$( l, n) can decrease the maximal level by at most n (for a suitable distance ℓ).

Corollary 42 (of Theorem 41)

Let n≥1, and for every 1≤i≤n let B _i =(ψ _i ,T _i ,φ _i ) be a bimorphism such that ψ _i is linear and φ _i is nondeleting. Moreover, let $\varphi _{i} : T_{{\Gamma }_{i}} \to T_{{\Delta }_{i}}$ and $\psi _{i+1} : T_{{\Gamma }_{i+1}} \to T_{{\Delta }_{i}}$ for every 1≤i<n. Finally, let $\ell \in \mathbb {N}$ be such that ℓ>ht(ψ _i (γ)) for every 1≤i≤n and γ∈Γ _i , and let (t,u)∈τ(B ₁ );⋯ ;τ(B _n ). If $\textnormal {P}^{\ell }_{n+1}(t) \neq \emptyset $ , then $\textnormal {P}^{\ell }_{1}(u) \neq \emptyset $.
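The chaining can be spelled out as follows. The intermediate trees $t_{0}, \dotsc, t_{n}$ are names introduced here for illustration only and are not part of the original statement. Since $(t, u) \in \tau(B_{1}); \cdots; \tau(B_{n})$, there exist trees $t = t_{0}, t_{1}, \dotsc, t_{n} = u$ with $(t_{i-1}, t_{i}) \in \tau(B_{i})$ for every 1 ≤ i ≤ n. Applying Theorem 41 to $B_{i}$ with level $n+1-i$ in place of n then yields

$$ \text{P}^{\ell}_{n+1}(t_{0}) \neq \emptyset \;\Longrightarrow\; \text{P}^{\ell}_{n}(t_{1}) \neq \emptyset \;\Longrightarrow\; \cdots \;\Longrightarrow\; \text{P}^{\ell}_{2}(t_{n-1}) \neq \emptyset \;\Longrightarrow\; \text{P}^{\ell}_{1}(t_{n}) \neq \emptyset ~. $$

The single distance ℓ works in every step because it exceeds $\text{ht}(\psi_{i}(\gamma))$ for all i and all $\gamma \in \Gamma_{i}$.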

It remains to demonstrate a tree transformation that can be computed by and that reduces the maximal level of special positions from n+1 to 0. Clearly, this tree transformation cannot be computed by an n-fold composition of tree transformations from $\mathcal {B}$(l, n) because the output tree should contain a special position of level 1 by Corollary 42. We make sure that the assumptions of Corollary 42 are satisfied.

Example 43

Let M = (Q, Σ, Σ, {⋆}, R) be the ε-free nondeleting l-xt with Q = {⋆, q} and Σ = {σ ⁽²⁾, α ⁽⁰⁾}, and the set R consisting of the following rules

$$\sigma(\star, \alpha)\overset{\star,q}{\longrightarrow} \star \qquad \quad\sigma(\star, q) \overset{\star,q}{\longrightarrow} \sigma(\star, q) \qquad \quad\alpha \overset{\star}{ \longrightarrow} \alpha ~. $$

It is easy to see that τ(M) is a total function. Intuitively, for an input tree t, it removes all positions v and v2 of t such that t(v) = σ and t(v2) = α. Figure 7 shows the repeated application of τ(M), where one application is indicated by ↦. Assuming that each dashed line contains at least three more positions, it is easy to check that, for distance ℓ = 2, the root of the first tree has level 2 (because positions 1, 11, 111, 1111, 2, 21, 211, and 2111 all have level 1). The penultimate tree, which is obtained from the first tree by the application of τ(M)², only has special positions of level 0.
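The intuitive description of τ(M) can be tried out on small trees. The following is a minimal sketch, not the authors' formalization: it assumes trees over Σ are encoded as nested Python tuples, and the names `ALPHA`, `sigma`, and `tau_M` are hypothetical helpers chosen for this illustration. The function hard-codes the effect of the three rules of M instead of simulating states and derivations.

```python
# Trees over Sigma = {sigma/2, alpha/0} as nested tuples (this encoding is an assumption).
ALPHA = "alpha"


def sigma(left, right):
    """Build the tree sigma(left, right)."""
    return ("sigma", left, right)


def tau_M(t):
    """Apply tau(M) once: drop every sigma whose second child is alpha."""
    if t == ALPHA:
        return ALPHA                           # rule alpha -> alpha
    _, left, right = t
    if right == ALPHA:
        return tau_M(left)                     # rule sigma(*, alpha) -> *
    # rule sigma(*, q) -> sigma(*, q); state q behaves like * on trees other than alpha
    return sigma(tau_M(left), tau_M(right))


if __name__ == "__main__":
    t = sigma(sigma(ALPHA, ALPHA), sigma(ALPHA, ALPHA))
    once = tau_M(t)
    twice = tau_M(once)
    print(once)
    print(twice)
```

In this run the first application yields σ(α, α) and the second yields α, so each application strips one layer of σ-nodes that have α as second child.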

We use the transducer M of Example 43 and show that n transformations from $\mathcal {B}$(l, n) cannot compute the tree transformation τ(M)ⁿ⁺¹.

Lemma 44

for every n≥1.

Proof

Let Σ = {σ ⁽²⁾, α ⁽⁰⁾}. The powers of a tree $c \in T_{\Sigma}(\{x_{1}\})$ are defined by $c^{1} = c$ and $c^{k+1} = c[c^{k}]$ for every $k \geq 1$. Let $T_{-1} = \{\alpha\}$. For every $n \in \mathbb {N}$, we define the tree languages $C_{n} \subseteq T_{\Sigma}(\{x_{1}\})$ and $T_{n} \subseteq T_{\Sigma}$ inductively by $C_{n} = \{\sigma(x_{1}, t)^{k} \mid t \in T_{n-1}, k \geq 1\}$ and $T_{n} = \{c[\alpha] \mid c \in C_{n}\}$.

Let M be the transducer of Example 43. We have already remarked that $\tau(M) : T_{\Sigma} \to T_{\Sigma}$ is a total function. It is easy to see that $\tau(M)(t_{n}) \in T_{n-1}$ for every $n \in \mathbb {N}$ and $t_{n} \in T_{n}$. Consequently, $\tau(M)^{n+1}(t_{n+1}) \in T_{0}$ for every $t_{n+1} \in T_{n+1}$ (see Fig. 7, which shows trees in $T_{2}$, $T_{1}$, $T_{0}$, and $T_{-1}$). Obviously, $\text {P}^{\ell }_{1}(u) = \emptyset $ for every $u \in T_{0}$ and ℓ ≥ 2. Thus, with the help of Corollary 42, we can complete the proof by showing that for every ℓ ≥ 2 there exists $t \in T_{n+1}$ such that $\text {P}^{\ell }_{n+1}(t) \neq \emptyset $.

Let ℓ ≥ 2 be fixed. We now prove that for every $n \in \mathbb {N}$ there exists $t \in T_{n}$ such that $\text {P}^{\ell }_{n}(t) \neq \emptyset $ by induction on n. In fact, we prove the stronger statement that there exist $t \in T_{n}$ and $v \in \text {P}^{\ell }_{n}(t)$ such that $\left |{\text {br}_{t}(\varepsilon , v) \cap \text {P}^{\ell }_{n}(t)}\right | \geq \ell ^{n+1}$. For n = 0, we select the tree $t = c^{\ell}[\alpha] \in T_{0}$, where $c = \sigma(x_{1}, \alpha)$, and the position $v = 1^{\ell-1}$. Since $\text {P}^{\ell }_{0}(t) = \text {br}_{t}(\varepsilon , v)$, this selection of t and v fulfills the requirements. In the induction step, the induction hypothesis provides a tree $t \in T_{n}$ and a position $v \in \text {P}^{\ell }_{n}(t)$ such that $\left |{\text {br}_{t}(\varepsilon , v) \cap \text {P}^{\ell }_{n}(t)}\right | \geq \ell ^{n+1}$. We consider the tree $t^{\prime } = c^{(\ell ^{n+2}+1)}[\alpha ]$ with $c = \sigma(x_{1}, t)$ and the position $v^{\prime } = 1^{\ell ^{n+2}-1}$. Obviously, $t^{\prime} \in T_{n+1}$ and $v^{\prime \prime } \in \text {P}^{\ell }_{n+1}(t^{\prime })$ for every $v^{\prime\prime} \preceq v^{\prime}$ because $\langle v^{\prime \prime },1, 2\rangle \in \text {PI}^{\ell }_{n+1}(t^{\prime })$ via the positions $v_{1} = v^{\prime\prime}12v$ and $v_{2} = v^{\prime\prime}2v$ using Lemma 38(i). This completes our induction and proof. □
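To get a feeling for how quickly the witness trees of this proof grow, the following sketch computes their shapes without building them. It assumes only the construction stated above: the tree for level 0 is $c^{\ell}[\alpha]$ with $c = \sigma(x_{1}, \alpha)$, and the tree for level m ≥ 1 is $c^{(\ell^{m+1}+1)}[\alpha]$ with c = σ(x₁, t) for the previous witness t. The function name `witness_shape` and the returned format are hypothetical and chosen for this illustration.

```python
def witness_shape(ell, n):
    """Return (copies, nodes) for the witness trees of levels 0..n.

    copies[m] is the number k of stacked contexts sigma(x1, s) in the level-m witness,
    nodes[m] is its total number of nodes.
    """
    copies, nodes = [], []
    sub_nodes = 1                          # alpha, the second child used at level 0
    for m in range(n + 1):
        k = ell if m == 0 else ell ** (m + 1) + 1
        # c^k[alpha] with c = sigma(x1, s): k sigma-nodes, k copies of s, one final alpha
        total = k * (1 + sub_nodes) + 1
        copies.append(k)
        nodes.append(total)
        sub_nodes = total                  # the next level hangs this tree below each sigma
    return copies, nodes


if __name__ == "__main__":
    print(witness_shape(2, 2))
```

For ℓ = 2 the witnesses of levels 0, 1, and 2 stack 2, 5, and 9 contexts and have 5, 31, and 289 nodes, respectively.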

Now we are able to prove that the composition hierarchy of and several other classes is infinite.

Theorem 45

For every n≥1 and

Proof

Since all inclusions are trivial, we only need to prove their strictness. By Proposition 11 we have , hence and . Together with Lemma 44 these two statements imply the strictness of the two inclusions on the left. To prove the strictness of the other two inclusions, we prove that snl-XTⁿ⁺¹⫅̸(l-XT^R)ⁿ. Using simple symmetry, we observe that , which together with the symmetric version of Lemma 44 yields $\text {snl-XT}^{n+1} \not \subseteq \mathcal {B}(\text {n}, \text {l})^{n}$. Furthermore, $\text {l-XT}^{\text {R}} =\mathcal {B}(\text {nl}, \text {l})$ by Proposition 11, which yields $\left (\text {l-XT}^{\text R}\right )^{n} \subseteq \mathcal {B}(\text {n}, \text {l})^{n}$. Together with $\text {snl-XT}^{n+1} \not \subseteq \mathcal {B}(\text {n}, \text {l})^{n}$ we obtain snl-XTⁿ⁺¹⫅̸(l-XT^R)ⁿ as desired. □

For the classes and with we can make more precise statements, which are similar to those in Theorems 26 and 34.

Theorem 46

For every n≥2,

$$\begin{array}{@{}rcl@{}} \textnormal{sl-XT} \subsetneq \textnormal{sl-XT}^{\textnormal R} &\subsetneq& \textnormal{sl-XT}^{n} = (\textnormal{sl-XT}^{\textnormal {R}})^{n} \subsetneq \textnormal{sl-XT}^{n+1} \\ \textnormal{l-XT} \subsetneq \textnormal{l-XT}^{\textnormal R} &\subsetneq& \textnormal{l-XT}^{n} \subseteq (\textnormal{l-XT}^{\textnormal R} )^{n} \subsetneq \textnormal{l-XT}^{n+1} \end{array} $$

Proof

The inclusions from left to right are trivial or follow from Lemma 15. The first strict inclusion on each line follows from Proposition 15. The other strict inclusions follow from snl-XT^m+1⫅̸(l-XT^R)^m, which was shown in the proof of Theorem 45 for every m ≥ 1.

It remains to prove that (sl-XT^R)ⁿ ⊆ sl-XTⁿ. Clearly, it suffices to prove this for n = 2. We first observe that QR ;snl-XT⊆snl-XT. In fact, since (as mentioned in the proof of Theorem 45) and, obviously, QR⁻¹=QR, we obtain that

where the inclusion follows from Lemma 13. Thus,

$$\begin{array}{@{}rcl@{}} (\text{sl-XT}^{\text{R}})^{2} &\subseteq& \text{QR}; \text{sl-XT}^{2} \subseteq \text{QR}; \text{snl-XT}; \text{sdl-H}; \text{sl-XT} \\ &\subseteq& \text{QR}; \text{snl-XT}; \text{sl-XT} \subseteq\text{snl-XT}; \text{sl-XT} \enspace, \end{array} $$

where the first step is by Lemma 15, the second step by Lemma 18, the third step by Lemma 19 and the last step by the above observation. □

The authors do not know whether, but guess that $\text {l-XT}^{n} \subsetneq (\text {l-XT}^{\text {R}})^{n}$ for all n ≥ 2. Table 4 summarizes the main results of this section. For the sake of completeness, we mention some additional results from the literature, where T stands for the class of all tree transformations computable by top-down tree transducers [6], and stands for the class of tree transformations computable by ε-free extended top-down tree transducers [17]. The result mentioned in Table 4 can be concluded from [17, Theorem 4.8].

**Table 4 Summary of the results of Section 6, where**

7 Hasse Diagram for the ε-Free Classes

Finally, let us compare the six classes of Theorem 34 with the three classes of Theorem 26 and the two classes of Proposition 17. Additionally, we consider the composition hierarchy for the class for which we established the infiniteness in Theorem 45. Thus, we compare all ε-free classes considered in this paper.

Theorem 47

Figure 8 is the Hasse diagram of the displayed classes of tree transformations for all n≥4.

Proof

The equalities are proved in Theorems 20 and 34, and all the inclusions are trivial or hold by either Lemma 15 or Corollary 25. The strictness of the vertical inclusions is proven in Proposition 17 and Theorems 26, 34, and 45. For the remaining strictness and incomparability results (with respect to ⊆) we have to prove the following six results.

(i)
: This is a consequence of Proposition 14.
(ii)
: This follows from Proposition 16. It is also a consequence of the proof of Theorem 31 as follows. Consider the and the and M ₃ in that proof. If then contradicting the proof of Theorem 31.
(iii)
and M ₃ be as in the proof of Theorem 31. Note that and τ(M ₂), Now suppose that . Then $\tau (M^{\prime }_{1}) ; \tau (M_{2}) ; \tau (M_{3})$ is in
where the first equality is due to Theorem 20. However, this contradicts the proof of Theorem 31.
(iv)
: The translation τ = {(t, α)∣t ∈ T _Σ} with Σ = {σ ⁽²⁾, α ⁽⁰⁾} can obviously be computed by an with the rules $\sigma (q, q^{\prime }) \overset {q_{0}}{\longrightarrow } \alpha $ and $\alpha \overset {q_{0}}{\longrightarrow } \alpha $, but for all k ≥ 1 by Corollary 42 because there exists a tree t ∈ T _Σ such that $\text {P}^{\ell }_{k+1}(t) \neq \emptyset $ as demonstrated in the proof of Lemma 44.
(v)
This follows from the proof of Theorem 31 because the , M ₂, and M ₃ in that proof are nondeleting.
(vi)
: This result follows from the proof of Theorem 33, so let M ₁, M ₂, and M ₃ be the of that proof. We note that τ(M ₂), . It is easy to show that , which can be achieved by the decomposition τ(M ₁) = τ(N ₁);τ(N ₂), where N ₁ is obtained from M ₁ by replacing the two rules involving q ^la by $\sigma (q, q^{\text {la}}) \overset {q}{\longrightarrow } \sigma (q, q^{\text {la}})$ and $\sigma (q^{\text {la}}, q) \overset {q}{\longrightarrow } \sigma (q^{\text {la}}, q)$ and adding the two rules $\gamma _{2}(q^{\text {la}}) \overset {q^{\text {la}}}{\longrightarrow } q^{\text {la}}$ and $\alpha \overset {q^{\text {la}}}{\longrightarrow } \alpha $. Then N ₁ is nondeleting. Similarly, we obtain N ₂ from M ₁ by replacing the two rules involving q ^la by $\sigma (q, \alpha ) \overset {q}{\longrightarrow } q$ and $\sigma (\alpha , q) \overset {q}{\longrightarrow } q$ (and removing the two rules $\gamma _{1}(p)\overset {p}{\longrightarrow } p$ and $\gamma _{2}(q) \overset {q}\longrightarrow q$ because the symbols γ ₁ and γ ₂ have already been removed by N ₁). Note that also N ₂ is nondeleting. The decomposition yields that τ(M ₁);τ(M ₂);τ(M ₃) is in . However, as demonstrated in the proof of Theorem 33, we have that τ(M ₁);τ(M ₂);τ(M ₃) is not in .

□

The authors did not attempt to present a Hasse diagram that contains all the classes (including the non- ε-free classes) discussed in this paper, but consider this a worthwhile effort.

8 Conclusion

Linear extended top-down tree transducers (with or without regular look-ahead) are formal models of syntax-based statistical machine translation. They have several good properties [19]. In particular, most of them can be presented as bimorphisms in the sense of [3], so a result of [3] implies that ε-free, strict, and nondeleting l-xt are not closed under composition and that their composition hierarchy collapses at power 2. We extended this investigation to the composition hierarchies of the classes obtained by dropping some of the restrictions ε-freeness, strictness, and nondeletion. We showed in Theorem 34 that the composition hierarchy of ε-free l-xt^R collapses at power 3 and that of ε-free l-xt collapses at power 4. In fact, the powers 3 and 4 are the least powers with that property. To complete the picture, we showed in Theorem 45 that the composition hierarchies of l-xt, l-xt^R, and ε-free and nondeleting l-xt are infinite. Finally, we presented the Hasse diagram of the powers of the considered ε-free classes in Theorem 47. In the future, the authors would like to investigate the composition hierarchy of weighted linear extended top-down tree transducers.

References

1. Arnold, A., Dauchet, M.: Transductions inversibles de forêts. Thèse 3ème cycle M. Dauchet, Université de Lille (1975)
2. Arnold, A., Dauchet, M.: Bi-transductions de forêts. In: ICALP, pp 74–86. Edinburgh University Press (1976)
3. Arnold, A., Dauchet, M.: Morphismes et bimorphismes d’arbres. Theoret. Comput. Sci. 20(1), 33–93 (1982)
4. Chiang, D.: An introduction to synchronous grammars. In: ACL. ACL. Part of a tutorial given with K. Knight (2006)
5. Dauchet, M.: Transductions de forêts — bimorphismes de magmoïdes. Première thèse, Université de Lille (1977)
6. Engelfriet, J.: Bottom-up and top-down tree transformations — a comparison. Math. Systems Theory 9(3), 198–231 (1975)
7. Engelfriet, J.: Top-down tree transducers with regular look-ahead. Math. Systems Theory 10(1), 289–303 (1977)
8. Engelfriet, J.: Three hierarchies of transducers. Math. Systems Theory 15(2), 95–125 (1982)
9. Engelfriet, J., Maneth, S.: Macro tree translations of linear size increase are MSO definable. SIAM J. Comput. 32(4), 950–1006 (2003)
10. Engelfriet, J., Schmidt, E.M.: IO and OI I. J. Comput. System Sci. 15(3), 328–353 (1977)
11. Fülöp, Z., Maletti, A.: Linking theorems for tree transducers. Submitted manuscript; available at: http://www.inf.u-szeged.hu/fulop/publ/linking.pdf (2014)
12. Fülöp, Z., Maletti, A., Vogler, H.: Preservation of recognizability for synchronous tree substitution grammars. In: ATANLP. ACL, pp 1–9 (2010)
13. Fülöp, Z., Maletti, A., Vogler, H.: Weighted extended tree transducers. Fundam. Inform. 111(2), 163–202 (2011)
14. Fülöp, Z., Vogler, H.: Syntax-Directed Semantics—Formal Models Based on Tree Transducers. EATCS Monographs on Theoret. Comput. Sci. Springer (1998)
15. Gécseg, F., Steinby, M.: Tree Automata. Akadémiai Kiadó, Budapest (1984). 2nd edition available at arXiv:1509.06233
16. Gécseg, F., Steinby, M.: Tree languages. In: Rozenberg, G., Salomaa, A. (eds.) Handbook of Formal Languages. chap. 1, vol. 3, pp 1–68. Springer (1997)
17. Graehl, J., Hopkins, M., Knight, K., Maletti, A.: The power of extended top-down tree transducers. SIAM J. Comput. 39(2), 410–430 (2009)
18. Graehl, J., Knight, K., May, J.: Training tree transducers. Comput. Linguist. 34(3), 391–427 (2008)
19. Knight, K., Graehl, J.: An overview of probabilistic tree transducers for natural language processing. In: CICLing, LNCS, vol. 3406, pp 1–24, Springer (2005)
20. Lemay, A., Maneth, S., Niehren, J.: A learning algorithm for top-down XML transformations. In: PODS. ACM, pp 285–296 (2010)
21. Maletti, A.: Compositions of extended top-down tree transducers. Inf. Comput. 206(9–10), 1187–1196 (2008)
22. May, J., Knight, K., Vogler, H.: Efficient inference through cascades of weighted tree transducers. In: ACL, pp 1058–1066 (2010)
23. Rounds, W.C.: Mappings and grammars on trees. Math. Systems Theory 4(3), 257–287 (1970)
24. Thatcher, J.W.: Generalized² sequential machine maps. J. Comput. System Sci. 4(4), 339–367 (1970)

Author information

Authors and Affiliations

Leiden Institute of Advanced Computer Science, Leiden University, P.O. Box 9512, 2300 RA, Leiden, The Netherlands
Joost Engelfriet
Department of Foundations of Computer Science, University of Szeged, Árpád tér 2, H-6720, Szeged, Hungary
Zoltán Fülöp
Institute of Computer Science, Universität Leipzig, Augustusplatz 10–11, 04109, Leipzig, Germany
Andreas Maletti

Corresponding author

Correspondence to Andreas Maletti.

Additional information

This is a revised and extended version of [Z. Fülöp and A. Maletti: Composition closure of ε -free linear extended top-down tree transducers. In Proc. 17th DLT, volume 7907 of LNCS, pages 239–251. Springer-Verlag, 2013].

This work was partially supported by the exchange project 55 657 of the German Academic Exchange Service (DAAD) and Hungarian Scholarship Board Office (MÖB). Z. Fülöp was partially supported by the NKFI grant K 108 448, and A. Maletti was partially supported by the German Research Foundation (DFG) grant MA / 4959 / 1-1.

About this article

Cite this article

Engelfriet, J., Fülöp, Z. & Maletti, A. Composition Closure of Linear Extended Top-down Tree Transducers. Theory Comput Syst 60, 129–171 (2017). https://doi.org/10.1007/s00224-015-9660-2

Published: 29 December 2015
Issue Date: February 2017
DOI: https://doi.org/10.1007/s00224-015-9660-2
