Key words

1 Introduction

The first result of pure finite Ramsey theory and a prototype of the many later results of this area (see [5]) is the theorem proved by Ramsey in 1930. We recall it now to remind the reader of the flavor of pure finite Ramsey theory. We will also refer to this statement later on. For a natural number n, let \([n] =\{ 1,\ldots, n\}\); in particular, [0] = . The classical Ramsey theorem says that given natural numbers d, l, and m, there exists a natural number n such that for each d-coloring, that is, a coloring with d colors, of all l element subsets of [n], there exists an m element subset z of [n] such that all l element subsets of z have the same color.

In Sect. 2, we give an exposition of the abstract approach to pure finite Ramsey theory developed in [7]; the main theorem, saying that a general pigeonhole principle implies a general Ramsey property, is stated as Theorem 9 (see also Appendix 1). Most pure finite Ramsey theoretic results can be viewed as instances of the machinery presented here. In the exposition, we make an effort to motivate the main abstract notions and we also illustrate them with examples.

In Sects. 345, using arguments consisting mostly of formulating appropriate definitions, we show that certain Ramsey-theoretic results for finite trees, one of which is new, are particular instances of the general Theorem 9. These applications of Theorem 9 to concrete situations are similar to each other, with the main differences lying in the derivations used (more on it in the next paragraph). Therefore, in the first two applications (the illustrations in Sect. 2 and Illustration 9), we explicitly check all the details and provide pictures; in the third application (a generalization of Deuber’s and Jasiński’s theorems, Sect. 4), we give all the definitions, but carefully check the pigeonhole principle only; in the last application (Milliken’s theorem, Sect. 5), we state all the definitions, but leave checking the details to the reader in exercises. Recall that in [7], it is shown how, for example, the classical Ramsey theorem, the Graham–Rothschild theorem, and a new self-dual Ramsey theorem can be obtained as instances of Theorem 9.

In each of the many concrete Ramsey theorems (considered here and in [7]), the same underlying algebraic structure turns out to be present, the structure of a normed background given by Definition 1 (see also Appendix 1). A crucial element of such structures is a truncation operator, which forms a basis for inductive arguments. In the concrete situations involving trees and considered in the present paper, there is a close connection between truncation operators and derivations on trees. Roughly speaking there are two natural derivations on trees: cutting off the rightmost branch and cutting off the highest leaves. These two derivations give rise to two types of truncation operators, which lead to two types of normed backgrounds, which in turn lead to two Ramsey theorems. Namely, the branch cutting derivation gives a generalization of Deuber’s and Jasiński’s theorems, while the leaf cutting derivation gives Milliken’s theorem.

For convenience, we adopt the following modification to the notation for the operation of subtracting 1 among natural numbers: we set 0 − 1 to be equal to 0; for k > 0, k − 1 retains its usual meaning.

2 Abstract Approach with Illustrations

2.1 Abstract Ramsey Theory

A typical Ramsey-type theorem has the following form. We start with two families \(\mathcal{F}\) and \(\mathcal{P}\). (Elements of \(\mathcal{F}\) and \(\mathcal{P}\) are usually finite sets of functions, most frequently some type of morphisms.) A set P from \(\mathcal{P}\) and a number of colors d are given. The conclusion of the theorem then asserts that there is a set F from \(\mathcal{F}\) with a given mapping (usually a type of composition) defined on F ×P,

$$F \times P \ni (f,x) \rightarrow f\,.\,x,$$

such that for each d-coloring of the image {f . x: f ∈ F, x ∈ P} of F ×P under the mapping there exists f 0 ∈ F with {f 0 . x: x ∈ P} monochromatic.

Below in the paper, we formalize this vague idea and we also give several concrete examples that should convince the reader that Ramsey-type theorems do indeed have this form. Here, as an illustration, we only phrase the classical Ramsey theorem in a way that is compatible with the general framework above. It may be useful for the reader to recall here the Ramsey theorem from the first paragraph of the introduction. In the restatement of the Ramsey theorem to which we now proceed, for natural numbers p and q, we identify p element subsets of [q] with increasing injections from [p] to [q] so that a subset z is identified with the unique increasing injection whose range is equal to z. One can take \(\mathcal{F} = \mathcal{P}\) to be the family of all sets produced as follows: fix natural numbers p and q and form the set of all increasing injections from [p] to [q]. Fix natural numbers d, l, and m, and let \(P \in \mathcal{P}\) be the set of all increasing injections from [l] to [m]. Then the classical Ramsey theorem says that there is an n with the following property. For the set \(F \in \mathcal{F}\) of all increasing injections from [m] to [n], if we d-color the set

$$\{f \circ x: f \in F,\,x \in P\} = \mbox{ all increasing injections from}[l]to[n],$$

then there exists f 0 ∈ F such that {f 0 ∘ x: x ∈ P} is monochromatic.

Now we start the description of the abstract approach. Let A and X be sets. Assume we are given a partial function from A ×X to X:

$$(a,x) \rightarrow a\,.\,x.$$

Such a function . will be called an action (of A on X). No properties of the function . are assumed to hold at this point. For F ⊆ A and P ⊆ X, we say that F . Pis defined if a . x is defined for all a ∈ F and x ∈ P, and we let

$$F\,.\,P =\{ a\,.\,x: a \in F,\,x \in P\}.$$

We also write a . P for {a} . P.

We will give a sequence of illustrations that contain the most rudimentary examples of the general notions being introduced. The illustrations depend on each other and lead to the classical Ramsey theorem.

Illustration 1.

Let A = X be the set of all (strictly) increasing functions from \([k] =\{ 1,\ldots, k\}\) to \(\mathbb{N} \setminus \{ 0\}\), where k ranges over \(\mathbb{N}\). Given a, x ∈ A = X with \(a: [l] \rightarrow \mathbb{N} \setminus \{ 0\}\) and \(x: [k] \rightarrow \mathbb{N} \setminus \{ 0\}\), let a . x be defined precisely when [l] contains the image of x and put

$$a\,.\,x = a \circ x.$$

Going back to the general situation, let \(\mathcal{F}\) and \(\mathcal{P}\) be families of non-empty subsets of A and X, respectively. Assume we have a partial function from \(\mathcal{F}\times \mathcal{P}\) to \(\mathcal{P}\):

$$(F,P) \rightarrow F \bullet P$$

such that if F ∙ P is defined, then it is given by the point-wise action of F on P, that is, F . P is defined and

$$F \bullet P = F\,.\,P.$$

In such a situation, we say that \((\mathcal{F},\mathcal{P},\bullet )\) is a pair of families over (A,X, . ). Introducing a restriction ∙ of the point operation of sets in \(\mathcal{F}\) on sets in \(\mathcal{P}\) makes the Ramsey condition (R) below more flexible, while a careful calibration of the resteriction makes it possible to satisfy condition ( ∗ ) of the next subsection. In concrete situations, definitions of restrictions ∙ are very natural.

Illustration 2.

For \(k,l \in \mathbb{N}\) with 0 < k ≤ l, let \(\left ({ l \atop k} \right )\) stand for the set of all (strictly) increasing functions from [k] to [l]. Let also \(\left ({ 0 \atop 0} \right )\) consist of one element—the empty function. Since an increasing function from [k] to [l] is determined by its range, \(\left ({ l \atop k} \right )\) can be identified with the set of all k element subsets of [l]. Let \(\mathcal{F} = \mathcal{P}\) be the set of all \(\left ({ l \atop k} \right )\) with 0 < k ≤ l or \(k = l = 0\). Declare \(\left ({ n \atop m} \right ) \bullet \left ({ l \atop k} \right )\) to be defined precisely when m = l, and let

$$\left ({ n \atop l} \right ) \bullet \left ({ l \atop k} \right ) = \left ({ n \atop k} \right ).$$

It is clear that \(\left ({ n \atop l} \right ) \bullet \left ({ l \atop k} \right ) = \left ({ n \atop l} \right )\,.\,\left ({ l \atop k} \right )\). Note, however, that \(\left ({ n \atop m} \right )\,.\,\left ({ l \atop k} \right )\) is defined if we assume only m ≥ l.

The following condition is our Ramsey statement, which is just a formalization of the statement from the beginning of this subsection:

(R) :

given d > 0, for each \(P \in \mathcal{P}\), there is an \(F \in \mathcal{F}\) such that F ∙ P is defined, and for every d-coloring of F ∙ P there is an f ∈ F such that f . P is monochromatic.

Illustration 3.

In the special case of Illustrations 1 and 2, condition (R) says in particular that given d > 0 and 0 < k ≤ l there exists m ≥ l such that for each d-coloring of \(\left ({ m \atop l} \right ) \bullet \left ({ l \atop k} \right ) = \left ({ m \atop k} \right )\) there exists \(a \in \left ({ m \atop l} \right )\) such that the set

$$\left \{a \circ x: x \in \left ({ l \atop k} \right )\right \}$$

is monochromatic. This is the classical Ramsey theorem.

2.2 Abstract Pigeonhole Principle

We introduce here our pigeonhole principle. The name is purely conventional as the principle is not a simple abstraction of the well known pigeonhole principle of Dirichlet. Rather it is a condition that is easy to check in concrete situations and that implies, through inductive arguments encoded in Theorem 9, the Ramsey condition (R).

We will need an important additional piece of structure. Let A, X, and an action . be as above. Let : X → X be a function such that for a ∈ A and x ∈ X, if a . x is defined, then a . ∂x is defined and

$$\partial (a\,.\,x) = a\,.\,\partial x.$$
(1)

Such a function is called a truncation. For P ⊆ X, we write

$$\partial P =\{ \partial x: x \in P\}.$$
(2)

Introduction of the operator equips X with an additional structure and Eq. (1) states that the action of A on X is implemented by homomorphism of this structure. In applications to concrete Ramsey theorems, is always a form of derivation leading from an object in X to another, less complex object in X. In this fashion, in proofs, provides a foothold for inductive arguments.

Illustration 4.

We continue the pervious illustrations, in particular, our notation is as in Illustration 1. For x ∈ X with \(x: [k] \rightarrow \mathbb{N} \setminus \{ 0\}\), define

$$\partial x = x \upharpoonright [k - 1].$$

(Recall here the convention for the notation k − 1 adopted in the introduction.) It is easy to check that condition (1) is satisfied. Note also that, by Eq. (2), \(\partial \left ({ l \atop k} \right ) = \left ({ l-1 \atop k-1} \right )\), if k > 1, and \(\partial \left ({ l \atop k} \right ) = \left ({ 0 \atop 0} \right )\), if k ≤ 1.

Let \((\mathcal{F},\mathcal{P},\bullet )\) be a pair of families over (A, X, . ) equipped with a truncation . We are ready to formulate our pigeonhole principle. For P ⊆ X and y ∈ X, put

$$P_{y} =\{ x \in P : \partial x = y\}.$$
(3)

So P y is the set consisting of those elements of P that truncate to the same simpler object y. Given a, b ∈ A, we say that b extends a if for each x with a . x defined, we have that b . x is defined and that it is equal to a . x. For \(F \in \mathcal{F}\) and a ∈ A, let

$$F_{a} =\{ f \in F : f\mbox{ extends }a\}.$$
(4)

The Ramsey statement (R) above requires, upon coloring of F ∙ P, stabilizing the coloring on a copy f . P of P obtained by acting on P by some element f of F. Pigeonhole principle (P) below asks us to perform the following much easier task. We fix an object y ∈ X, which can be assumed to be simpler than objects in P. We consider the elements of P that truncate to this fixed y, that is, we consider P y , and require stabilizing the coloring only on a copy f . P y of P y obtained by acting on P y by an element f from F. The price to pay is that f has to act on y in a way prescribed by an element a ∈ A chosen in advance, that is, f is actually taken from F a for some a for which a . y is defined.

It is surprising that various concrete pigeonhole principles occurring in the finite pure Ramsey theory have this form. We illustrate it below by the classical pigeonhole principle used to prove the classical Ramsey theorem. In the following sections, we will give more complex examples involving trees. Paper [7] contains a number of further examples.

The following criterion on \((\mathcal{F},\mathcal{P},\bullet )\) is our pigeonhole principle:

(P) :

given d > 0, for all \(P \in \mathcal{P}\) and y ∈ ∂P, there are \(F \in \mathcal{F}\) and a ∈ A such that F ∙ P is defined, a . y is defined, and for every d-coloring of F a  . P y there is an f ∈ F a such that f . P y is monochromatic.

Note that in the condition above F a  . P y is defined since F ∙ P is assumed to be defined and F a  ⊆ F and P y  ⊆ P. Also, of course, the condition would not have changed if we required the coloring to be defined on F . P y or even on F . P. It is, however, crucial that f be found in F a .

Illustration 5.

In our special case from the earlier illustrations, a moment of thought and a picture convince one that condition (P) boils down to the classical pigeonhole principle. For the sake of practice, however, let us look at it carefully in detail. We will be helped by Fig. 1.

Fig. 1
figure 1

Condition (P) in Illustration 5

For notational simplicity, in the argument below, we assume that k > 1 and leave checking that the same argument works for k ≤ 1 to the reader. We state condition (P) in our special case:

let d > 0, 1 < k ≤ l and \(y \in \left ({ l-1 \atop k-1} \right )\) be given; let l′ be the maximum of the range of y; there exists m ≥ l and an increasing function \(a: [l^{\prime}] \rightarrow \mathbb{N} \setminus \{ 0\}\) such that for each d-coloring of

$$\left \{f \circ x: f \in \left ({ m \atop l} \right ),\,f \upharpoonright [l^{\prime}] = a,\,x \in \left ({ l \atop k} \right ),\,x \upharpoonright [k - 1] = y\right \}$$

there exists \(f \in \left ({ m \atop l} \right )\) with f ↾ [l′] = a and such that

$$\left \{f \circ x: x \in \left ({ l \atop k} \right ),\,x \upharpoonright [k - 1] = y\right \}$$

is monochromatic.

We claim that the condition above holds with a being the identity function from [l′] to itself. Indeed, with this choice of a, the conclusion of the condition reads:

there exists m ≥ l such that for each d-coloring of

$$\left \{f \circ x: f \in \left ({ m \atop l} \right ),\,f(i) = i\mbox{ for }i \in [l^{\prime}],\,x \in \left ({ l \atop k} \right ),\,x \upharpoonright [k - 1] = y\right \}$$

there exists \(f \in \left ({ m \atop l} \right )\) with f(i) = i, for i ∈ [l′], and with

$$\{f \circ x: x \in \left ({ l \atop k} \right ),\,x \upharpoonright [k - 1] = y\}$$

monochromatic.

The elements of the set

$$\left \{f \circ x: f \in \left ({ m \atop l} \right ),\,f(i) = i\mbox{ for }i \in [l^{\prime}],\,x \in \left ({ l \atop k} \right ),\,x \upharpoonright [k - 1] = y\right \}$$

differ only in the single value f(x(k)) and this value comes from the set [m] ∖ [l′]. Also x(k) is an arbitrary element of [l] ∖ [l′]. So, in essence, we are d-coloring [m] ∖ [l′] and are looking for an increasing function from [l] ∖ [l′] to [m] ∖ [l′] whose values take the same color. This is just the classical pigeonhole condition, and we can take m to be any number strictly bigger than \(l^{\prime} + d \cdot (l - l^{\prime} - 1)\).

Our goal is to state a theorem that condition (P) implies condition (R). Achieving this goal, in Theorem 9, will require introducing more structure on (A, X, . , ) and imposing additional conditions on \((\mathcal{F},\mathcal{P},\bullet )\).

2.3 Additional Structure and Additional Conditions

Let A, X, an action . , and a truncation be as above.

Let again \((\mathcal{F},\mathcal{P},\bullet )\) be a pair of families over (A, X, . ). Recall the notion of extension for elements of A defined in the discussion preceding (4). We first state two conditions on \((\mathcal{F},\mathcal{P},\bullet )\) that do not require introducing any additional structure:

(A) :

if \(P \in \mathcal{P}\), then \(\partial P \in \mathcal{P}\);

(B) :

if \(F \in \mathcal{F}\), \(P \in \mathcal{P}\), and F ∙ ∂P is defined, then there is \(G \in \mathcal{F}\) such that G ∙ P is defined and for each f ∈ F there is g ∈ G extending f.

Strictly speaking conditions (A) and (B) are not needed to prove Theorem 9; one can dispense with them at the expense of strengthening condition (P) slightly. (We elaborate on it in Appendix 1.) However, in some situations, for example, in all the situations in this note, conditions (A) and (B) hold, and whenever they hold they do so in an obvious way (and they make strengthening of (P) unnecessary). Condition (A) simply requires closure of \(\mathcal{P}\) under truncation. As for condition (B), note that if F . P is defined, then F . ∂P is defined. The reverse implication is false in general. Condition (B) gives a substitute for this reverse implication: assuming something stronger, namely that F ∙ ∂P is defined, we can infer that G ∙ P is defined for a G that can simulate the action of every element of F.

Illustration 6.

Recall that we have

$$\mathcal{F} = \mathcal{P} = \left \{\left ({ n \atop m} \right ): 0 < m \leq n\mbox{ or }m = n = 0\right \}.$$

We check conditions (A) and (B). It follows from the remark in Illustration 4 that \(\mathcal{P}\) is closed under , so (A) holds. To check (B), let \(F = \left ({ n \atop m} \right )\) and \(P = \left ({ l \atop k} \right )\). We assume k > 1 and leave the trivial case k ≤ 1 to the reader. We have

$$F \bullet \partial P = \left ({ n \atop m} \right ) \bullet \left ({ l - 1 \atop k - 1} \right )$$

is defined precisely when \(m = l - 1\), and we can take \(G = \left ({ n+1 \atop l} \right )\) to witness the conclusion of (B) since \(\left ({ n+1 \atop l} \right ) \bullet \left ({ l \atop k} \right )\) is defined and each element of \(\left ({ n \atop l-1} \right )\) is extended by an element of \(\left ({ n+1 \atop l} \right )\). We elaborate on this last point. Note that for each \(f \in \left ({ n \atop l-1} \right )\), that is, for each increasing f : [l − 1] → [n], there is increasing g: [l] → [n + 1] with \(g \upharpoonright [l - 1] = f\). In this situation, for each x ∈ X (recall that X is the set of all increasing functions from some [k] to \(\mathbb{N}\)), if f . x is defined, then the image of x is included in [l − 1], and so g . x is defined and obviously

$$g\,.\,x = g \circ x = f \circ x = f\,.\,x.$$

So, in our example, g extending f as an increasing function is equivalent to g extending f as an element of A. A similar coincidence will be present also in the subsequent illustrations.

To make the partial function . from A ×X to X into an honest action, we assume that we also have a partial function from A ×A to A:

$$(a,b) \rightarrow a \cdot b,$$

such that for a, b ∈ A and x ∈ X if a . (b . x) and (ab) . x are both defined, then

$$a\,.\,(b\,.\,x) = (a \cdot b)\,.\,x.$$
(5)

The operation ⋅as above will be called multiplication. Equation (5) is the usual equation defining, say, a group action on a set. As before, for F, G ⊆ A, we say that FGis defined if ab is defined for all a ∈ F and b ∈ G and we let

$$F \cdot G =\{ a \cdot b: a \in F,b \in G\}.$$

Now, again as before, in addition to the partial function ∙ from \(\mathcal{F}\times \mathcal{P}\) to \(\mathcal{P}\), assume that we have a partial function ∙ from \(\mathcal{F}\times \mathcal{F}\) to \(\mathcal{F}\) with the property that if G ∙ F is defined, then it is given point-wise, that is, GF is defined and

$$G \bullet F = G \cdot F.$$

We now call \((\mathcal{F},\mathcal{P},\bullet, \bullet )\) a pair of families over (A, X, . ,  ⋅).

We can now state our final condition on \(\mathcal{F}\), \(\mathcal{P}\), ∙ and ∙ :

(∗) :

if \(F,G \in \mathcal{F}\), \(P \in \mathcal{P}\), and F ∙ (G ∙ P) is defined, then so is (F ∙ G) ∙ P.

This condition is crucial. It says that F ∙ (G ∙ P) is never defined “by chance;” if it is defined, then the product F ∙ G is defined, as is its action on P. In concrete situations, this condition is guaranteed by a natural calibration of the domains of the operations ∙ and ∙ . Note that under the assumptions of ( ∗ ), from Eq. (5), we have

$$F \bullet (G \bullet P) = (F \bullet G) \bullet P.$$

In [7], a pair of families \((\mathcal{F},\mathcal{P},\bullet, \bullet )\) over (A, X, . ,  ⋅) fufilling condition ( ∗ ) is called an actoid of sets.

Illustration 7.

Recall again that

$$\mathcal{F} = \mathcal{P} = \left \{\left ({ n \atop m} \right ): 0 < m \leq n\mbox{ or }m = n = 0\right \}.$$

Declare \(\left ({ n \atop m} \right ) \bullet \left ({ l \atop k} \right )\) on \(\mathcal{F}\) to be defined precisely when m = l and let

$$\left ({ n \atop l} \right ) \bullet \left ({ l \atop k} \right ) = \left ({ n \atop k} \right ).$$

So ∙ is equal to ∙ defined earlier in Illustration 2. It follows that ∙ is given pointwise.

To check ( ∗ ), note that if

$$\left ({ q \atop p} \right ) \bullet \left (\left ({ n \atop m} \right ) \bullet \left ({ l \atop k} \right )\right )$$

is defined, then m = l and p = n, but in this situation

$$\left (\left ({ q \atop p} \right ) \bullet \left ({ n \atop m} \right )\right ) \bullet \left ({ l \atop k} \right )$$

is defined.

We require one more piece of structure that, roughly speaking, measures complexity of objects in X. A function |  ⋅ | : X → D, where (D, ≤ ) is a linear order, is called a norm if for x, y ∈ X, | x | ≤ | y | implies that for all a ∈ A

$$a\,.\,y\mbox{ defined} \Rightarrow (a\,.\,x\mbox{ defined and }\vert a\,.\,x\vert \leq \vert a\,.\,y\vert ).$$
(6)

Illustration 8.

In our special case, X is the set of all increasing injections \(x: [k] \rightarrow \mathbb{N} \setminus \{ 0\}\) for \(k \in \mathbb{N}\). Define \(\vert \cdot \vert : X \rightarrow \mathbb{N}\), where \(\mathbb{N}\) is taken with its natural linear order, to be

$$\vert x\vert = \left \{\begin{array}{@{}l@{\quad }l@{}} \max \mathrm{image}(x) = x(k),\quad &\text{ if}k > 0;\\ 0, \quad &\text{ if} k = 0. \end{array} \right.$$

We check that this definition gives a norm. Let a ∈ A, \(a: [l] \rightarrow \mathbb{N} \setminus \{ 0\}\). Note that, for x ∈ X, a . x is defined precisely when | x | ≤ l and | a . x |  = a( | x | ), if | x |  > 0, and | a . x |  = 0, if | x |  = 0. So given x 1, x 2 ∈ X with | x 1 | ≤ | x 2 | , it is clear that if a . x 2 is defined, then so is a . x 1 and

$$\vert a\,.\,x_{1}\vert = a(\vert x_{1}\vert ) \leq a(\vert x_{2}\vert ) = \vert a\,.\,x_{2}\vert, \;\mbox{ if }\vert x_{1}\vert > 0,$$

or

$$\vert a\,.\,x_{1}\vert = 0 \leq \vert a\,.\,x_{2}\vert, \;\mbox{ if }\vert x_{1}\vert = 0.$$

The additional conditions required to prove our theorem were stated as (A), (B), and ( ∗ ). The additional structure introduced above is consolidated in the following notion.

Definition 1.

A normed background is a pair of sets A, X equipped with a multiplication ⋅and an action . fulfilling (5), with a truncation fulfilling (1), and with a norm |  ⋅ | fulfilling (6).

With some abuse of notation, a normed background as above will be denoted by (A, X).

2.4 The Theorem

Now we can phrase our theorem. To see how it follows from the somewhat more general results of [7], the reader should consult Appendix 1. We write t P, \(t \in \mathbb{N}\), for the result of applying truncation to Pt times.

Theorem 9.

Let \((\mathcal{F},\mathcal{P},\bullet, \bullet )\) be a pair of families over a normed background fulfilling conditions (A), (B), and (∗). Assume that each \(P \in \mathcal{P}\) is finite and for each \(P \in \mathcal{P}\) there is \(t \in \mathbb{N}\) such that ∂ t P consist of one element. If \((\mathcal{F},\mathcal{P})\) fulfills (P), then it fulfills (R).

Note that the theorem above gives the classical Ramsey theorem on the basis of Illustrations 1–8. In them, we checked all the assumptions of Theorem 9 except: for \(P \in \mathcal{P}\), P is finite and t P has one element for some \(t \in \mathbb{N}\). Finiteness of P is clear. Note that \({\partial }^{k}\left ({ l \atop k} \right ) = \left ({ 0 \atop 0} \right )\), so this last assumption is also fulfilled.

3 Trees and Another Illustration

Trees and Embeddings We state here basic definitions concerning trees. By a tree we understand a finite, possibly empty, partial order such that each two elements have a common predecessor and the set of predecessors of each element is linearly ordered. So trees for us are finite trees. If a tree is non-empty, it has a smallest element, which we call the root. Maximal elements of a tree are called leaves. By convention, we regard every node of a tree as one of its own predecessors and as one of its own successors.

Each tree T carries a binary function ∧  T that assigns to each v, w ∈ T the largest element v ∧  T w of T that is a predecessor of both v and w. After Deuber [2], we say that a function f : S → T, for trees S and T, is a morphism if for all v, w ∈ S,

$$f(v \wedge _{S}w) = f(v) \wedge _{T}f(w).$$

So strictly speaking f is a morphism from the functional structure (S, ∧  S ) to the functional structure (T, ∧  T ).

For a tree T and v ∈ T, let im T (v) be the set of all immediate successors of v, and we do not regard v as one of them. Let T(v) be the tree whose elements are all the successors of v (with v among them, of course). Let ht T (v) be the cardinality of the set of all predecessors of v (including v), and let

$$\mathrm{ht}(T) =\max \{ \mathrm{ht}_{T}(v): v \in T\}.$$

For a non-empty tree T, let br(T) be the maximum of cardinalities of im T (v) for v ∈ T.

We will occasionally suppress the subscripts from various pieces of notation introduced above if we deem them clear from the context.

A tree T is called ordered if for each v ∈ T there is a fixed linear order of im(v). Such an assignment allows us to define the lexicographic linear order ≤  T on all the nodes of T by stipulating that v ≤  T w if v is a predecessor of w and, in case v is not a predecessor of w and w is not a predecessor of v, that v ≤  T w if the predecessor of v in im(v ∧ w) is less than or equal to the predecessor of w in im(v ∧ w) in the given order on im(v ∧ w).

The simplest ordered trees are [n] for \(n \in \mathbb{N}\) with their natural successor relation and the unique ordering of the immediate successors of each vertex.

An embedding f from an ordered tree S to an ordered tree T is an injective tree morphism such that

  1. (i)

    It is order preserving between ≤  S and ≤  T ;

  2. (ii)

    For each v ∈ S, the set {w ∈ im T (f(v)): w is a predecessor of f(v′) for some v′ ∈ im S (v)} forms an initial segment with respect to ≤  T of im T (f(v)).

Note that preservation of order by f is equivalent to saying that for every v ∈ S and all w 1, w 2 ∈ im S (v) with w 1 ≤  S w 2 if w 1 , w 2 in im T (f(v)) are predecessors of f(w 1) and f(w 2), respectively, then w 1  ≤  T w 2 . An embedding is leaf preserving if each leaf of the domain is mapped to a leaf of the range. An embedding f : S → T is called strong if for v, w ∈ S with ht(v) = ht(w) we have that ht(f(v)) = ht(f(w)). Note that each embedding from [n], \(n \in \mathbb{N}\), to an ordered tree is a strong embedding.

Derivations on Trees There are two natural ways to trim an ordered tree. Let an ordered tree T be given. Put

$${T}^{{\ast}} =\{ v \in T : \mathrm{ht}(v) < \mathrm{ht}(T)\},$$
(7)

that is, T  ∗  is obtained from T by removing all of its highest leaves. Note that T  ∗  with ≤  T restricted to it is an ordered tree, and that the inclusion from T  ∗  to T is a strong embedding.

Let x be the rightmost with respect to ≤  T leaf of T, that is, x is the ≤  T -largest element of T, and let

$$T^{\prime} =\{ v \in T : T(v)\mbox{ has a leaf different from }x\},$$
(8)

that is, T′ is obtained from T by removing from it a final segment of its rightmost branch. The tree T′ with ≤  T restricted to it forms an ordered tree and the inclusion T′ ⊆ T is a leaf preserving embedding. If T ≠ , then the set T ∖ T′ with the inherited tree structure can be identified with [p] for some \(p \in \mathbb{N}\), p > 0, with its natural tree order. If T′ ≠ , there is a unique node v 0 ∈ T′ that has an immediate successor in T ∖ T′. We call v 0 the splitting node of T.

Examples of Trees and Embeddings We fix some notation concerning trees. After Deuber [2], a non-empty tree T is called regular if for each v ∈ T that is not a leaf, | im(v) |  = br(T) and for each leaf x ∈ T, ht(x) = ht(T). Of course, each such tree is fully determined by the value of two parameters: br(T) and ht(T). For \(k,n \in \mathbb{N}\), k > 0, n > 1, let T k, n be the regular tree of height n and with branching number k. By convention, for \(k \in \mathbb{N}\), let T k, 1 have exactly one node and T k, 0 be equal to the empty tree, and for \(n \in \mathbb{N}\), n > 1, let T 0, n have exactly one node. We consider T k, n to be an ordered tree with some linear order \(\leq _{{T}^{k,n}}\). (All possible orders making T k, n into an ordered tree lead to isomorphic ordered trees.) The tree T 1, n can be identified with [n] as an ordered tree.

We fix two natural ways of embedding T k, n into T k, n + 1. First, there is a unique embedding ι ∗  of T k, n into T k, n + 1 with

$${\iota }^{{\ast}}(\mathrm{im}_{{ T}^{k,n}}(v)) \subseteq \mathrm{im}_{{T}^{k,n+1}}({\iota }^{{\ast}}(v)),$$

for v ∈ T k, n, and with ι ∗  mapping the root of T k, n to the root of T k, n + 1, if T k, n is non-empty. Note that \(\mathrm{ht}_{{T}^{k,n}}(v) = \mathrm{ht}_{{T}^{k,n+1}}({\iota }^{{\ast}}(v))\) for v ∈ T k, n. We write

$${T}^{k,n} {\subseteq }^{{\ast}}{T}^{k,n+1}$$
(9)

to indicate that we consider T k, n identified with its image under ι ∗ . There is also a unique embedding ι of T k, n into T k, n + 1 with

$$\iota ^{\prime}(\mathrm{im}_{{T}^{k,n}}(v)) \subseteq \mathrm{im}_{{T}^{k,n+1}}(\iota ^{\prime}(v)),$$

for v ∈ T k, n, and with ι mapping the \(\leq _{{T}^{k,n}}\)-smallest leaf of T k, n to the \(\leq _{{T}^{k,n+1}}\)-smallest leaf of T k, n + 1, if T k, n is non-empty. This embedding comes from the isomorphism between T k, n and T k, n + 1(v 0), where v 0 is the \(\leq _{{T}^{k,n+1}}\)-smallest immediate successor of the root of T k, n + 1. Note that the image of the set of all leaves of T k, n under ι is an initial segment with respect to \(\leq _{{T}^{k,n+1}}\) of the set of leaves of T k, n + 1. We write

$${T}^{k,n} \subseteq ^{\prime}{T}^{k,n+1}$$
(10)

to indicate that we consider T k, n identified with its image under ι.

We give one more illustration. Its conclusion will be used in the sequel.

Illustration 10.

We prove the following, possibly folklore, generalization of the classical Ramsey theorem.

Given d > 0, \(s \in \mathbb{N}\), and a non-empty ordered tree S, there is a non-empty ordered tree T with br(T) = br(S) such that for each d-coloring of all leaf preserving embeddings of [s] to T there exists a leaf preserving embedding g 0: S → T such that

$$\{g_{0} \circ f : f : [s] \rightarrow S\mbox{ a leaf preserving embedding}\}$$

is monochromatic.

The proof below consists essentially of stating definitions. All the checking that needs to be done is routine and would be probably best left to the reader. However, since this is the first example involving trees, we will perform all the verifications carefully and explicitly.

Let k = br(S). Since there is a leaf preserving embedding from S to T k, m with m = ht(S), we can assume that S = T k, m for some m. For \(n \in \mathbb{N}\), set

$${T}^{n} = {T}^{k,n}.$$

Normed Background Let X be the set of all (not necessarily leaf preserving) embeddings from some [m] to some T n. Let A consist of all strong embeddings from some T m to some T n. (Strong embeddings were defined earlier in this section. We will need this more restrictive notion of embedding for the normed background we are defining to work.) For f ∈ X and g, g 1, g 2 ∈ A declare that g . f is defined if the image of f is included with respect to ⊆  ∗  (as defined in Eq. (9)) in the domain of g, and similarly declare that g 2g 1 is defined if the image of g 1 is included with respect to ⊆  ∗  in the domain of g 2, and let

$$g\,.\,f = g \circ f\;\mbox{ and }\;g_{2} \cdot g_{1} = g_{2} \circ g_{1}.$$

Define  ∗  on X be letting for f : [m] → T n,

$${\partial }^{{\ast}}f = f \upharpoonright [m - 1].$$

Note, after recalling the derivation (7), that  ∗  f = f ↾ [m] ∗ . Let

$$\vert f\vert = \left \{\begin{array}{@{}l@{\quad }l@{}} \mathrm{ht}(f(m)),\quad &\text{ if}m > 0;\\ 0, \quad &\text{ if} m = 0. \end{array} \right.$$
(11)

It is checked without any difficulty that (A, X) with the operations defined above is a normed background. The requirement that embeddings in A be strong is used in checking that |  ⋅ | is a norm.

A Pair of Families Over (A, X) Let S, T be ordered trees. Let

$${ \left ({ T \atop S} \right )}^{s}\;\mbox{ and }\;{\left [{ T \atop S} \right ]}^{s}$$
(12)

stand for the set of all strong embeddings and strong, leaf preserving embeddings, respectively, from S to T. Since all embeddings from [m], \(m \in \mathbb{N}\), to an ordered tree are strong, we simplify our notation by setting

$$\left ({ T \atop m} \right ) ={ \left ({ T \atop [m]} \right )}^{s}\;\mbox{ and }\;\left [{ T \atop m} \right ] ={ \left [{ T \atop [m]} \right ]}^{s},$$
(13)

that is, \(\left ({ T \atop m} \right )\) and \(\left [{ T \atop m} \right ]\) stand for the set of all embeddings from [m] to T and for the set of all leaf preserving embeddings from [m] to T, respectively.

Let \(\mathcal{F}\) consist of sets of the form \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s}\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s}\) with 0 < m ≤ n or \(m = n = 0\). Declare ∙ to be defined precisely in the following situations: \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\left ({ {T}^{m} \atop {T}^{l}} \right )}^{s}\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet {\left [{ {T}^{m} \atop {T}^{l}} \right ]}^{s}\), and define them to be

$${\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\left ({ {T}^{m} \atop {T}^{l}} \right )}^{s} ={ \left ({ {T}^{n} \atop {T}^{l}} \right )}^{s}\;\mbox{ and }\;{\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet {\left [{ {T}^{m} \atop {T}^{l}} \right ]}^{s} ={ \left [{ {T}^{n} \atop {T}^{l}} \right ]}^{s}.$$

Let \(\mathcal{P}\) consist of sets of the form \(\left ({ {T}^{n} \atop m} \right )\) and \(\left [{ {T}^{n} \atop m} \right ]\) with 0 < m ≤ n or \(m = n = 0\). Declare ∙ to be defined precisely in the following situations: \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet \left ({ {T}^{m} \atop l} \right )\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet \left [{ {T}^{m} \atop l} \right ]\), and let

$${\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet \left ({ {T}^{m} \atop l} \right ) = \left ({ {T}^{n} \atop l} \right )\;\mbox{ and }\;{\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet \left [{ {T}^{m} \atop l} \right ] = \left [{ {T}^{n} \atop l} \right ].$$

It is easy to see that these ∙ and ∙ when defined are given point-wise. This checking boils down to showing that each strong embedding from T l to T n factors through T m, and the same for strong, leaf preserving embeddings. Such factorizations are easy to produce. Arguing by induction, we see that it is suffices to show their existence for \(l < m = l + 1 \leq n\). Since l < n, given a strong (leaf preserving, respectively) embedding g: T l → T n, there is 1 ≤ j ≤ n such that ht(g(v)) ≠ j, for each v ∈ T l. Fix the largest 1 ≤ i ≤ l such that ht(g(v)) < j for all v ∈ T l with ht(v) = i, or let i = 0 if no such i exists. Let g 1: T l → T l + 1 be an arbitrary strong (leaf preserving, respectively) embedding such that for v ∈ T l

$$\mathrm{ht}(g_{1}(v)) = \left \{\begin{array}{@{}l@{\quad }l@{}} \mathrm{ht}(v), \quad &\text{ if}\mathrm{ht}(v) \leq i;\\ \mathrm{ht } (v) + 1,\quad &\text{ if} \mathrm{ht } (v) \geq i + 1. \end{array} \right.$$

So there is no element of T l that gets mapped to a w ∈ T l + 1 with \(\mathrm{ht}(w) = i + 1\). Now it is easy to find a strong (leaf preserving, respectively) embedding g 2: T l + 1 → T n such that g = g 2 ∘ g 1. (We do it so that ht(g 2(w)) = j for all w ∈ T l + 1 with \(\mathrm{ht}(w) = i + 1\).)

Note that

$${ \partial }^{{\ast}}\left ({ {T}^{n} \atop m} \right ) = {\partial }^{{\ast}}\left [{ {T}^{n} \atop m} \right ] = \left \{\begin{array}{@{}l@{\quad }l@{}} \left ({ {T}^{n-1} \atop m-1} \right ),\quad &\text{ if}m > 1; \\ \left ({ {T}^{0} \atop 0} \right ), \quad &\text{ if}m \leq 1. \end{array} \right.$$
(14)

Using Eq. (14), we verify that \(\mathcal{F}\) and \(\mathcal{P}\) is a pair of families over (A, X) fulfilling conditions (A) and (B). Condition (A) is clear from Eq. (14). We verify condition (B) for k > 1 in the calculation below, and leave the trivial case k ≤ 1 to the reader. Note that by Eq. (14), if

$${\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\partial }^{{\ast}}\left ({ {T}^{l} \atop k} \right ) ={ \left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet \left ({ {T}^{l-1} \atop k - 1} \right )$$

is defined, then \(l = m + 1\), so \({\left ({ {T}^{n+1} \atop {T}^{m+1}} \right )}^{s} \bullet \left ({ {T}^{l} \atop k} \right )\) is defined and each \(g \in {\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s}\) is extended by some \(h \in {\left ({ {T}^{n+1} \atop {T}^{m+1}} \right )}^{s}\) (that is, for each f ∈ X if g . f is defined, then so is h . f and h . f = g . f); simply view T m as included in T m + 1 via ⊆  ∗  and take \(h: {T}^{m+1} \rightarrow {T}^{n+1}\) to be any strong embedding with h ↾ T m = g, that is, h extends g: T m → T n as an embedding. We handle the situation when

$${\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\partial }^{{\ast}}\left [{ {T}^{l} \atop k} \right ] ={ \left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet \left ({ {T}^{l-1} \atop k - 1} \right )$$

is defined in the same way, except that in this case \({\left [{ {T}^{n+1} \atop {T}^{m+1}} \right ]}^{s}\) witnesses that (B) holds.

To see condition ( ∗ ), note that if

$${\left ({ {T}^{q} \atop {T}^{p}} \right )}^{s} \bullet \left ({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet \left ({ {T}^{l} \atop k} \right )\right )$$

is defined, then m = l and p = n, so

$$\left ({\left ({ {T}^{q} \atop {T}^{p}} \right )}^{s} \bullet {\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s}\right ) \bullet \left ({ {T}^{l} \atop k} \right )$$

is defined, as required. We handle the situation when

$${\left [{ {T}^{q} \atop {T}^{p}} \right ]}^{s} \bullet \left ({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet \left [{ {T}^{l} \atop k} \right ]\right )$$

is defined in the same way.

Condition (R) for the above defined pair of families clearly gives the statement from the beginning of this illustration. Since for \(\left [{ {T}^{n} \atop m} \right ] \in \mathcal{P}\) we have \({({\partial }^{{\ast}})}^{n}\left [{ {T}^{n} \atop m} \right ] = \left ({ {T}^{0} \atop 0} \right )\) and \(\left ({ {T}^{0} \atop 0} \right )\) has exactly one element, by Theorem 9, it suffices to see condition (P).

Condition (P) We carefully check condition (P). Fix an element of \(\mathcal{P}\), which must be of the form \(\left ({ {T}^{q} \atop p} \right )\) or \(\left [{ {T}^{q} \atop p} \right ]\). We consider the first case first. We assume p > 1 and leave p ≤ 1 to the reader. To check (P), recall the pieces of notation set up in Eqs. (3) and (4). Fix \(f_{0} \in {\partial }^{{\ast}}\left ({ {T}^{q} \atop p} \right ) = \left ({ {T}^{q-1} \atop p-1} \right )\). We need to find an element \({\left ({ {T}^{r} \atop {T}^{q}} \right )}^{s}\) of \(\mathcal{F}\) (it suffices, of course, to find r) and g 0 ∈ A so that for each d-coloring of \(\left ({ {T}^{r} \atop {T}^{q}} \right )_{g_{0}}^{s}\,.\,\left ({ {T}^{q} \atop p} \right )_{f_{0}}\) there is \(g \in \left ({ {T}^{r} \atop {T}^{q}} \right )_{g_{0}}^{s}\) such that \(g\,.\,\left ({ {T}^{q} \atop p} \right )_{f_{0}}\) is monochromatic. Note that \({\left ({ {T}^{r} \atop {T}^{q}} \right )}^{s} \bullet \left ({ {T}^{q} \atop p} \right )\) is automatically defined.

We claim that g 0 ∈ A equal the identity function \({T}^{\vert f_{0}\vert +1} \rightarrow {T}^{\vert f_{0}\vert +1}\) does the job, where | f 0 | is defined by Eq. (11). Checking (P) boils down to stating precisely what elements the sets \(\left ({ {T}^{q} \atop p} \right )_{f_{0}}\), \(\left ({ {T}^{r} \atop {T}^{q}} \right )_{g_{0}}^{s}\), and \(g\,.\,\left ({ {T}^{q} \atop p} \right )_{f_{0}}\), for \(g \in \left ({ {T}^{r} \atop {T}^{q}} \right )_{g_{0}}^{s}\), consist of. Let v 0 be the smallest with respect to \(\leq _{{T}^{q}}\) element of the set \(\mathrm{im}_{{T}^{q}}(f_{0}(p - 1))\), and keep in mind that we are looking for r.

The set \(\left ({ {T}^{q} \atop p} \right )_{f_{0}}\) consists of all \(f \in \left ({ {T}^{q} \atop p} \right )\) with  ∗  f = f 0. This last condition is equivalent to saying that \(f \upharpoonright [p - 1] = f_{0}\) and

$$f(p) \in {T}^{q}(v_{ 0}),$$
(15)

where Eq. (15) is a consequence of point (ii) in the definition of embedding between ordered trees. Each such embedding f is completely determined by the value of f(p). Fix r ≥ q, arbitrary for the moment. Let g be a strong embedding in \(\left ({ {T}^{r} \atop {T}^{q}} \right )_{g_{0}}^{s}\). It is equal to the identity on \({T}^{\vert f_{0}\vert +1}\) and it is determined by strong embeddings g v from T q(v) to T r(v), where v varies over the nodes of T q with \(\mathrm{ht}(v) = \vert f_{0}\vert + 1\). Now, elements of \(g\,.\,\left ({ {T}^{q} \atop p} \right )_{f_{0}}\) are embeddings g ∘ f : [p] → T r with f for which Eq. (15) holds. Each such embedding is completely determined by the value

$$(g \circ f)(p) = g_{v_{0}}(f(p)) \in {T}^{r}(v_{ 0}).$$

Therefore, solving the problem of fixing the color on \(g\,.\,\left ({ {T}^{q} \atop p} \right )_{f_{0}}\) amounts to the following: d-color T r(v 0) (this is where the values of (g ∘ f)(p) are coming from), then find a strong embedding (this is \(g_{v_{0}}\)) of T q(v 0) (this is where the values f(p) are located) to T r(v 0) so that the image of T q(v 0) is monochromatic. This can be arranged using a form of the Halpern–Läuchli theorem ((HL2) with t = 1 from Appendix 2) by taking r large enough since T q(v 0) and T r(v 0) are isomorphic to T m and T n, where \(m = q - (\vert f_{0}\vert + 1)\) and \(n = r - (\vert f_{0}\vert + 1)\), respectively. For v ≠ v 0, after identifying T q(v) with T q(v 0) and T r(v) with T r(v 0) via the unique isomorphisms, we let g v be equal to \(g_{v_{0}}\). Note that so defined g is strong (Fig. 2).

Fig. 2
figure 2

Condition (P) in Illustration 9

The case \(P = \left [{ {T}^{q} \atop p} \right ]\) is handled analogously with the exception that for F one takes \({\left [{ {T}^{r} \atop {T}^{q}} \right ]}^{s}\) for large enough r and one uses another form of the Halpern–Läuchli theorem ((HL1) from Appendix 2). We leave it to the reader to re-check the details.

4 A Ramsey Theorem for Finite Trees

We prove the following theorem that extends the results of Deuber [2] and of Jasiński [3]. Our proof differs from the arguments of these two papers.

Proposition 1.

For non-empty ordered trees S,T and d > 0, there exists a non-empty ordered tree V with br (V ) = br (T) such that for each d-coloring of all leaf preserving embeddings from S to V there is a leaf preserving embedding g 0 : T → V such that

$$\{g_{0} \circ f : f : S \rightarrow T\mbox{ a leaf preserving embedding}\}$$

is monochromatic.

As a direct consequence of the above result, one gets its version for embeddings that are not necessarily leaf preserving by the following argument. Given ordered trees S, T, let \(S_{+},T_{+}\) be the trees obtained from S and T by adding one node on top of each leaf of S and T, respectively. Apply now the above statement to \(S_{+},T_{+}\) obtaining V. Let V  −  be gotten from V by deleting from it all of its leaves. It is easy to check that V  −  works by using the obvious observation that embeddings from S to T, from S to V  − , and from T to V  −  are precisely restrictions of leaf preserving embeddings from S  +  to T  + , from S  +  to V, and from T  +  to V, respectively.

Deuber’s theorem [2] is the above result for embeddings that are not necessarily leaf preserving and under the additional assumptions that br(S) = br(T) and that S is regular as defined in Sect. 3. Jasiński’s theorem is originally [3] stated for boron structures as defined in [1], but can easily be rephrased in terms of trees, and then it becomes equivalent to the above result with the additional assumptions that br(S) = br(T) and that for each v ∈ S that is not a leaf | im(v) |  = 2.

We show now how to derive Proposition 1 from Theorem 9.

Let k = br(T) and set T n = T k, n. Note that it is enough to prove the theorem for T equal to some T n since every ordered tree T with br(T) = k embeds leaf-preservingly into some T n.

We define an analogue of the set of natural numbers for the present Ramsey situation. We view T n as an ordered subtree T n + 1 via the inclusion ⊆  defined by Eq. (10). This convention gives an increasing sequence \(({T}^{n})_{n\in \mathbb{N}}\) of ordered trees. Let the direct limit (that is, the union, if T n is identified with its image in T n + 1) of this sequence be denoted by T . Observe that T carries a linear order induced from the linear orders \(\leq _{{T}^{n}}\) on the T n-s. We denote this linear order by ≤ . Each element v of T belongs to some T n. We call v a leaf if v is a leaf in some, or equivalently all, T n to which it belongs. For an ordered tree S, each function f : S → T has its range included in some T n. We call f a leaf preserving embedding if f is a leaf preserving embedding to some, or equivalently all, T n in which the image of f is included. Further, g: D → T for a subset D of T is called a leaf preserving embedding if the restriction of g to each D ∩ T n, \(n \in \mathbb{N}\), is a leaf preserving embedding, where D ∩ T n is taken with the tree order inherited from T n. For a leaf x ∈ T , let

$$T_{x} =\{ v \in {T}^{\infty }: v {\leq }^{\infty }x\}.$$

Note that T x is an infinite set.

Normed Background Let Y consist of all leaf preserving embeddings f : S → T , where S is an ordered tree. Let B consist of the empty function and of all leaf preserving embeddings g: T x  → T , where x is a leaf of T . It is easy to see that for such a g: T x  → T , we have g(T x ) ⊆ T g(x). As always, for f ∈ Y and g ∈ B, let g . f to be defined precisely when the image of f is included in the domain of g and let

$$g\,.\,f = g \circ f.$$

Similarly for g 1, g 2 ∈ B, define g 2g 1 to be defined precisely when the image of g 1 is contained in the domain of g 2 and let

$$g_{2} \cdot g_{1} = g_{2} \circ g_{1}.$$

We define a truncation using the branch cutting derivation on trees given by Eq. (8). For f ∈ Y with f : S → T define

$$\partial ^{\prime}f = f \upharpoonright S^{\prime},$$

where S′ is given by Eq. (8). We define a norm |  ⋅ | : Y → T ∪{ − }, where T is considered as a linear order with ≤  and −  is an element that is less than all the elements of T , by letting for f ∈ Y with f : S → T

$$\vert f\vert = \left \{\begin{array}{@{}l@{\quad }l@{}} \max \mathrm{image}(f),\quad &\text{ if}S\not =\varnothing ;\\ -\infty, \quad &\text{ if} S = \varnothing. \end{array} \right.$$

Observe that if S ≠ , then | f | is the ≤ -minimal leaf x ∈ T such that image(f) ⊆ T x . It is easy to check that with so defined operations, (B, Y ) becomes a normed background.

A Pair of Families Over (B, Y ) For \(n \in \mathbb{N}\), let x n  ∈ T be the rightmost leaf of T n and let v n  ∈ T be the root of T n. Note that

$$T_{x_{n}} = {T}^{n} \cup \{ v_{ n+k}: k \in \mathbb{N},\,k > 0\}.$$

Define for 0 < m ≤ n

$$\begin{array}{cccc} \begin{array}{rlrlrl} {\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty } =\{ g \in B: &g: T_{ x_{m}} \rightarrow T_{x_{n}},\,g({T}^{m}) \subseteq {T}^{n},\,\mbox{ and } \\ &g(v_{m+k}) = v_{n+k}\mbox{ for all }k \in \mathbb{N},k > 0\}. \end{array} & \end{array}$$

Additionally, let \({\left [{ {T}^{0} \atop {T}^{0}} \right ]}^{\infty }\) consist of the empty function. Observe that the function

$${\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\ni g \rightarrow g \upharpoonright {T}^{m}$$

is a bijection from \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\) to all leaf preserving embeddings from T m to T n.

Let \(\mathcal{G}\) consist be the family of all subsets of B of the form \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\), where \(n,m \in \mathbb{N}\) and 0 < m ≤ n or \(m = n = 0\). Let \(\mathcal{Q}\) be the family of all non-empty finite sets Q ⊆ Y of the following form: there is an ordered tree S such that Q consists of some leaf preserving embeddings from S to T . In such a situation, we say that Q is based on S. As usual, declare \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\bullet {\left [{ {T}^{l} \atop {T}^{k}} \right ]}^{\infty }\) to be defined precisely when m = l and let

$$\displaystyle\begin{array}{rcl}{ \left [{ {T}^{n} \atop {T}^{l}} \right ]}^{\infty }\bullet {\left [{ {T}^{l} \atop {T}^{k}} \right ]}^{\infty } ={ \left [{ {T}^{n} \atop {T}^{k}} \right ]}^{\infty }.& & \\ \end{array}$$

Declare \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\bullet Q\) to be defined precisely when m is the smallest natural number with the property that the images of all elements of Q are included in T m, and let

$${\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\bullet Q ={ \left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\,.\,Q.$$

We leave to the reader the easy check that \((\mathcal{F},\mathcal{Q},\bullet, \bullet )\) is a pair of families over (B, Y ), that is, the operations ∙ and ∙ are given pointwise. The pair of families fulfills conditions (A), (B), and ( ∗ ). Condition (A) is clear. To see condition (B), assume that \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\bullet \partial Q\) is defined, that is, m is smallest such that the image of all elements of ∂Q is included in T m. Let \(l \in \mathbb{N}\) be smallest such that the image of each element of Q is included in T m + l. Then \({\left [{ {T}^{n+l} \atop {T}^{m+l}} \right ]}^{\infty }\) witnesses that (B) holds since \({\left [{ {T}^{n+l} \atop {T}^{m+l}} \right ]}^{\infty }\bullet Q\) is defined and, as is easy to check, each leaf preserving embedding from \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\) extends (as a function) to a leaf preserving embedding from \({\left [{ \{{T}^{n+l} \atop {T}^{m+l}} \right ]}^{\infty }\). Condition ( ∗ ) follows immediately from an easy observation that if m is the smallest natural number such that the image of each function in Q is included in T m, then n is the smallest natural number with each function in \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{\infty }\,.\,Q\) having its image included in T n.

Note that condition (R) in this case is the theorem we are proving. Observe also that if \(Q \in \mathcal{Q}\) is based on S, then ∂Q is based on S′, and S′ has one leaf fewer than S if S ≠ . Thus, t Q has exactly one element (the empty function) for t equal to the number of leaves in S. It follows that to get (R) it remains to check condition (P).

Condition (P) Let \(Q \in \mathcal{Q}\) be based on S and let \(q \in \mathbb{N}\) be smallest such that all elements of Q have ranges included in T q. The set ∂Q is based on S′. We assume S′ is not the empty tree. (The case S′ =  is easier, and we ask the reader to handle it after reading the current argument.) Let u 0 ∈ S′ be the splitting node of S, and identify S ∖ S′ with [p] for some non-zero \(p \in \mathbb{N}\). (Recall here the discussion following (8).) Fix f 0 ∈ ∂′Q. Then f 0: S′ → T q, f 0 ∈ Y. Let

$$v_{0} = f_{0}(u_{0}) \in {T}^{q}.$$
(16)

To check (P), we need to find \(r \in \mathbb{N}\) and g 0 ∈ B such that for each d-coloring of \(\left [{ {T}^{r} \atop {T}^{q}} \right ]_{g_{0}}^{\infty }\,.\,Q_{f_{0}}\) there is \(g \in \left [{ {T}^{r} \atop {T}^{q}} \right ]_{g_{0}}^{\infty }\) with \(g\,.\,Q_{f_{0}}\) monochromatic. We will show that large enough r works. Fix r ≥ q. Now, we define g 0. Find the ≤ -smallest leaf x in T q such that the image of f 0 is included in T x . Note that v 0 is a predecessor of x.

First we define g 0: T x  → T . Note that

$$T_{x} = (T_{x} \cap {T}^{q}) \cup \{ v_{ q+k}: k \in \mathbb{N},\,k > 0\}.$$

For the moment, we view T q as a subset of T r in the sense T q ⊆  ∗  T r, as defined by Eq. (9), and we let g 0 be the identity on the elements of T x  ∩ T q that are not leaves. Let g 0 map leaves of T x  ∩ T q to leaves of T r in such a way that g 0 on T x  ∩ T q is a leaf preserving embedding to T r. Let \(g_{0}(v_{q+k}) = v_{r+k}\) for \(k \in \mathbb{N}\), k > 0. It is clear that g 0 ∈ B.

Consider the set E of all w ∈ T q such that w is an immediate successor of a predecessor of x and x <  w. The set T q ∖ T x q is partitioned into trees T q(w) with w ∈ E. Therefore, each \(g \in \left [{ {T}^{r} \atop {T}^{q}} \right ]_{g_{0}}^{\infty }\) is equal to g 0 in T x and is completely determined by leaf preserving embeddings

$$g_{w} = g \upharpoonright {T}^{q}(w): {T}^{q}(w) \rightarrow {T}^{r}(w),\;w \in E.$$

Note that v 0 given by Eq. (16) has an immediate successor in E. Let w 0 be the ≤ -smallest among them. For each \(f \in Q_{f_{0}}\), f ↾ S′ is equal to f 0 whose image is included in T x , while the image of f ↾ (S ∖ S′) is included in T q(w 0). So each element of \(g\,.\,Q_{f_{0}}\) being of the form g ∘ f : S → T r is completely determined by

$$g_{w_{0}} \circ (f \upharpoonright (S \setminus S^{\prime})): S \setminus S^{\prime} \rightarrow {T}^{r}(w_{ 0}).$$

Note that the identification of S ∖ S′ with [p] makes f ↾ (S ∖ S′) into a leaf preserving embedding from [p] to T q(w 0). Thus, fixing the color on \(g\,.\,Q_{f_{0}}\) amounts to the following (with notation as in Eq. (13)): d-color \(\left [{ {T}^{r}(w_{ 0}) \atop p} \right ]\), find a leaf preserving embedding \(g_{w_{0}}: {T}^{q}(w_{0}) \rightarrow {T}^{r}(w_{0})\) so that \(g_{w_{0}}\,.\,\left [{ {T}^{q}(w_{ 0}) \atop p} \right ]\) is monochromatic. This can be achieved from Illustration 9 by taking r large enough as T r(w 0) and T q(w 0) are isomorphic to T n and T m, where \(n = r -\mathrm{ht}(w_{0})\) and \(m = q -\mathrm{ht}(w_{0})\). We can let g w : T q(w) → T r(w) be arbitrary leaf preserving embeddings for w ∈ E, w ≠ w 0.

5 Milliken’s Theorem in Exercises

We prove in this section the following result due to Milliken [4]. The reader may consult [6] for another purely finitary proof of Milliken’s theorem.

Proposition 2.

Let S and T be ordered trees. Assume that all leaves in T have the same height. For d > 0, there exists an ordered tree V with br (V ) = br (T) such that for each d-coloring of all strong, leaf preserving embeddings from S to V there is a strong, leaf preserving embedding g 0 : T → V such that

$$\{g_{0} \circ f : f : S \rightarrow T\mbox{ a strong, leaf preserving embedding}\}$$

is monochromatic.

The proof of Proposition 2 that we will give yields also the statement obtained from Proposition 2 by replacing strong, leaf preserving embeddings by strong embeddings in all places. This statement can also be obtained from Proposition 2 by a proof that is identical to the argument following Proposition 1. It suffices to notice that, with the notation as in that argument, strong embeddings from S to T, from S to V  − , and from T to V  −  are precisely restrictions of strong, leaf preserving embeddings from S  +  to T  + , from S  +  to V, and from T  +  to V, respectively.

The proof of Proposition 2 is a somewhat more sophisticated version of the argument in Illustration 9. Let k = br(T). As before set T n = T k, n. View T n as a subtree of T n + 1 via the inclusion ⊆  ∗  defined in Eq. (9). This inclusion is a strong embedding. This way we obtain an increasing sequence \(({T}^{n})_{n\in \mathbb{N}}\) of ordered trees. Let T be the union (direct limit) of this sequence. The range of each function f : S → T on an ordered tree S is included in some T n. We call f a strong embedding if f is a strong embedding as a function from S to T n for some, or equivalently all, T n in which the image of f is included. For v ∈ T , let ht(v) be equal to \(\mathrm{ht}_{{T}^{n}}(v)\) for some, or equivalently, all T n with v ∈ T n.

Normed Background Let Z consist of all strong embeddings f : S → T , where S is an ordered tree. Let C consist of all strong embeddings g: T m → T n, for some m ≤ n. For f ∈ Z and g ∈ C, let g . f be defined precisely when the image of f is included in the domain of g and let

$$g\,.\,f = g \circ f.$$

Similarly for g 1, g 2 ∈ C, let g 2g 1 be defined precisely when the image of g 1 is contained in the domain of g 2, and let

$$g_{2} \cdot g_{1} = g_{2} \circ g_{1}.$$

For f ∈ Z with f : S → T n define

$${\partial }^{{\ast}}f = f \upharpoonright {S}^{{\ast}}.$$

Define a norm \(\vert \cdot \vert : Z \rightarrow \mathbb{N}\), by letting for f ∈ Y with f : S → T

$$\vert f\vert =\max _{v\in S}\mathrm{ht}(f(v)).$$

Exercise.

Check that (C, Z) is a normed background.

A Pair of Families Over (C,Z) The pair of families described below extends the one described in Illustration 9. Recall the sets \({\left ({ T \atop S} \right )}^{s}\) and \({\left [{ T \atop S} \right ]}^{s}\) defined in Eq. (12). Let \(\mathcal{H}\) consist of all \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s}\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s}\) where \(m,n \in \mathbb{N}\) and 0 < m ≤ n or \(m = n = 0\). Let \(\mathcal{R}\) consist of all non-empty sets of the form \({\left ({ {T}^{n} \atop S} \right )}^{s}\) and \({\left [{ {T}^{n} \atop S} \right ]}^{s}\), where S is an ordered tree. Declare \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\left ({ {T}^{l} \atop {T}^{k}} \right )}^{s}\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet {\left [{ {T}^{l} \atop {T}^{k}} \right ]}^{s}\) to be defined precisely when m = l, and let

$${\left ({ {T}^{n} \atop {T}^{l}} \right )}^{s} \bullet {\left ({ {T}^{l} \atop {T}^{k}} \right )}^{s} ={ \left ({ {T}^{n} \atop {T}^{k}} \right )}^{s}\;\mbox{ and }\;{\left [{ {T}^{n} \atop {T}^{l}} \right ]}^{s} \bullet {\left [{ {T}^{l} \atop {T}^{k}} \right ]}^{s} ={ \left [{ {T}^{n} \atop {T}^{k}} \right ]}^{s}.$$

Similarly, declare \({\left ({ {T}^{n} \atop {T}^{m}} \right )}^{s} \bullet {\left ({ {T}^{l} \atop S} \right )}^{s}\) and \({\left [{ {T}^{n} \atop {T}^{m}} \right ]}^{s} \bullet {\left [{ {T}^{l} \atop S} \right ]}^{s}\) to be defined precisely when m = l, and let

$${\left ({ {T}^{n} \atop {T}^{l}} \right )}^{s} \bullet {\left ({ {T}^{l} \atop S} \right )}^{s} ={ \left ({ {T}^{n} \atop S} \right )}^{s}\;\mbox{ and }\;{\left [{ {T}^{n} \atop {T}^{l}} \right ]}^{s} \bullet {\left [{ {T}^{l} \atop S} \right ]}^{s} ={ \left [{ {T}^{n} \atop S} \right ]}^{s}.$$

The operations ∙ and ∙ are undefined in situations not specified above.

Exercise.

Check that \((\mathcal{H},\mathcal{R},\bullet, \bullet )\) is a pair of families over (C, Z) fulfilling conditions (A), (B), and ( ∗ ). (Hint. This is almost identical to the argument in Illustration 9.)

Exercise.

Note that it suffices to prove Proposition 2 for T of the form T n (this is where the assumption that all leaves in T have the same height is used) and check that condition (R) for \((\mathcal{H},\mathcal{R},\bullet, \bullet )\) implies Proposition 2 (as well as the statement obtained from Proposition 2 by replacing strong, leaf preserving embeddings by strong embeddings).

Exercise.

Check condition (P) for \((\mathcal{H},\mathcal{R},\bullet, \bullet )\). (Hint. This follows from the Halpern–Läuchli theorem for strong subtrees (HL1) and (HL2) from Appendix 2 and is similar to the argument for (P) in Illustration 9.)

6 Appendix 1: Conditions (A) and (B) Removed and the Final Word on Normed Backgrounds

1. The following criterion (P + ) is the strengthening of condition (P) allowing us to get rid of conditions (A) and (B). It is obtained from (P) by replacing all occurrences of P, except the one in F ∙ P, by t P for a fixed but arbitrary \(t \in \mathbb{N}\).

(P+) :

given d > 0 and t, for all \(P \in \mathcal{P}\) and x ∈  t + 1 P, there are \(F \in \mathcal{F}\) and a ∈ A such that F ∙ P is defined, a . x is defined, and for every d-coloring of F a  . ( t P) x there is f ∈ F a such that f . ( t P) x is monochromatic.

The following result is [7, Corollary 4.4].

Theorem 11.

Let \((\mathcal{F},\mathcal{S},\bullet, \bullet )\) be a pair of families with (∗) over a normed background. Assume that each \(P \in \mathcal{P}\) is finite and for each \(P \in \mathcal{P}\) there is \(t \in \mathbb{N}\) such that ∂ t P consist of one element. If \((\mathcal{F},\mathcal{P})\) fulfills (P+), then it fulfills (R).

To see that Theorem 9 is a consequence Theorem 11, we note that (P) in the presence of (A) and (B) implies (P + ). To see this implication, we proceed by induction on t. Condition (P + ) for t = 0 is just (P). Assuming that (P + ) holds for t, we prove it for t + 1. Let \(P \in \mathcal{P}\) and x ∈  t + 2 P. By condition (A), \(\partial P \in \mathcal{P}\). So condition (P + ) for t applied to ∂P and x gives \(F \in \mathcal{F}\) and a ∈ A such that F ∙ ∂P is defined, a . x is defined, and for every d-coloring of F a  . ( t + 1 P) x there is f ∈ F a such that f . ( t + 1 P) x is monochromatic. Now condition (B) gives \(G \in \mathcal{F}\) such that G ∙ P is defined and such that each element of F is extended by an element of G. It follows that each element of F a is extended by an element of G a . Now it is clear that G and a witness that (P + ) holds for t + 1.

2. The main algebraic structures in the paper are normed backgrounds. We list below conditions that are more symmetric than those defining normed backgrounds. As indicated by Lemma 1, they give a notion that is in essence equivalent to normed background. All the normed backgrounds in the present paper and in [7] fulfill the conditions below.

Let (A, X,  ⋅, . , , |  ⋅ | ) be such that ⋅is a partial function from A ×A to A, . is a partial function from A ×X to X, is a function from X to X and |  ⋅ | is a function from X to a set with a linear order ≤ . Assume the following axioms hold for all a, b ∈ A and x, y ∈ X:

  1. (i)

    if a . (b . x) and (ab) . x are defined, then a . (b . x) = (ab) . x;

  2. (ii)

    if a . x and a . ∂x are defined, then (a . x) = a . ∂x;

  3. (iii)

     | ∂x | ≤ | x | ;

  4. (iv)

    if | x | ≤ | y | and a . x and a . y are defined, then | a . x | ≤ | a . y | ;

  5. (v)

    if | x | ≤ | y | and a . y is defined, then so is a . x.

The following result is [7, Lemma 4.5].

Lemma 1.

  1. (a)

    Assume (A, X,  ⋅, . , , |  ⋅ | ) fulfills conditions (i)–(v) above, then (A, X) with ⋅, . , ∂and |  ⋅ | is a normed background

  2. (b)

    If (A, X) with ⋅, . , and |  ⋅ | is a normed background, then there is a function |  ⋅ | 1 on Xsuch that (A, X,  ⋅, . , , |  ⋅ | 1) fulfills conditions (i)–(v) above.

7 Appendix 2: The Halpern–Läuchli Theorem for Strong Subtrees as a Restatement of the Hales–Jewett Theorem

We point out here that the Halpern–Läuchli theorem for strong subtrees (there are other, more difficult, versions) and the Hales–Jewett theorem are identical statements phrased in different languages. The importance of this translation for the presentation here comes from the fact that the Hales–Jewett theorem is shown in [7] to be one of the results that follow from the abstract approach to Ramsey theory. So when using the Halpern–Läuchli theorem in the present paper we stay within this approach. Justin Moore remarks that equivalence of these two statements (that is, of the Hales–Jewett and the Halpern–Läuchli theorems) has been known for some time.

We set up a dictionary for translating the Hales–Jewett theorem to the Halpern–Läuchli theorem. Let S and T be ordered trees. Let f : leaves(S) → leaves(T) be strictly increasing with respect the orders ≤  S and ≤  T (restricted to the leaves), and be such that for each v ∈ S there is w ∈ T such that for any two leaves x, y of S with v = x ∧ y we have w = f(x) ∧ f(y). Then there is a unique leaf preserving embedding from S to T whose restriction to leaves(S) is equal to f. We, therefore, refer to such an f itself as a leaf preserving embedding. If in the above definition ht(w) depends only on ht(v), then the induced embedding is strong and again we call f strong. A sequence \(f_{1},\ldots, f_{r}: \mathrm{leaves}(S) \rightarrow \mathrm{leaves}(T)\) of strong embeddings is called a strong sequence if for x, y ∈ leaves(S) and 1 ≤ i, j ≤ r

$$\mathrm{ht}(f_{i}(x) \wedge f_{i}(y)) = \mathrm{ht}(f_{j}(x) \wedge f_{j}(y)).$$

Fix a linearly ordered finite set A that is disjoint from \(\mathbb{N}\). For \(n \in \mathbb{N}\), we consider ordered trees

$${A}^{\leq n} =\{ v: [l] \rightarrow A: l \leq n\},$$

where the tree relation is equal to the extension relation and the order relation is the one coming from the linear order on A. So A  ≤ n is a version of the trees T k, n defined in Sect. 3, where k =  | A | . Note that the set of leaves of this tree is equal to the set A n of all functions from [n] to A.

For any function v: [l] → A, let v′: A ∪[l] → A be equal to the identity function on A and to v on [l]. Assume we have a function w: [n] → A ∪[m] such that

  1. (i)

    [m] is included in the image of w;

  2. (ii)

    w([l]) ∩ [m] is an initial segment of [m], for each l ≤ n.

Such w gives rise to a strong embedding g w : A m → A n (recall that A m and A n are the sets of leaves of A  ≤ m and A  ≤ n, respectively) defined by

$$g_{w}(x) = x^{\prime} \circ w,$$

for x ∈ A m. It is easy to check, using property (ii) of w, that g w preserves the lexicographic order. Property (i) ensures that g w is injective. Further note that for x, y ∈ A m, if x ∧ y = v 0 with ht(v 0) = i 0, then g w (x) ∧ g w (y) = v 1 with ht(v 1) = i 1, where

$$i_{1} =\max \{ i: w([i]) \cap [m] \subseteq [i_{0}]\}\;\mbox{ and }\;v_{1}(i) = v_{0}^{\prime}(w(i)),\mbox{ for }i \in [i_{1}].$$
(17)

Note that i 1 depends only on i 0. Thus, g w is indeed a strong embedding.

Now assume that for \(r \in \mathbb{N}\), we have w: [n] → A r ∪[m] with properties (i) and (ii) above. Such a w gives rise to r functions w i  = π i  ∘ w, where π i : A r ∪[m] → A ∪[m] is the i-th projection on A r and the identity on [m], also fulfilling conditions (i) and (ii). We therefore get a sequence of strong embeddings \(g_{w}^{1},\ldots, g_{w}^{r}: {A}^{m} \rightarrow {A}^{n}\) defined by

$$g_{w}^{i}(x) = x^{\prime} \circ w_{ i},$$

where x ∈ A m. Formulas (17) imply that this is a strong sequence.

The following result is a version of the Halpern–Läuchli theorem (for strong subtrees). Recall the definition of the trees T k, n from Sect. 3. Fix k and let T n = T k, n. Note that we can take T n = A  ≤ n for A with | A |  = k.

(HL1) Given d > 0, t and m there exists n such that for each d-coloring of leaves (T n ) ×⋯ ×leaves (T n ) (t factors) there exists a strong sequence of leaf preserving embeddings g i : T m → T n , for \(i = 1,\ldots, t\) , such that the set

$$g_{1}(\mathrm{leaves}({T}^{m})) \times \cdots \times g_{ t}(\mathrm{leaves}({T}^{m}))$$

is monochromatic.

(HL2) Given d > 0, t and m there exists n such that for each d-coloring of

$$\{(w_{1},\ldots, w_{t}): w_{1},\ldots, w_{r} \in {T}^{n},\,\mathrm{ht}(w_{ i}) = \mathrm{ht}(w_{j}),\mbox{ for }1 \leq i,j \leq t\}$$

there exists a strong sequence of embeddings g i : T m → T n for \(i = 1,\ldots, t\) such that the set

$$\{(g_{1}(v_{1}),\ldots g_{t}(v_{t})): v_{1},\ldots, v_{t} \in {T}^{m},\,\mathrm{ht}(v_{ i}) = \mathrm{ht}(v_{j}),\mbox{ for }1 \leq i,j \leq t\}$$

is monochromatic.

We show that the above statements are re-phrasings of the Hales–Jewett theorem. The Hales–Jewett theorem can be stated as below in points (a) and (b). (It is stated this way in [7, Sect. 7], and it is proved there using the abstract approach to Ramsey theory.)

  1. (a)

    Let B be a finite set not including any natural numbers. Given d > 0 and m there is n such that for each d-coloring of functions from [n] to B there is a function w 0: [n] → B ∪[m] with properties (i) and (ii) such that the set

    $$\{v \circ w_{0}: v: B \cup [m] \rightarrow B,\,v \upharpoonright B = \mathrm{id}_{B}\}$$

    is monochromatic.

  2. (b)

    Let B be a finite set not including any natural numbers. Given d > 0 and m there is n such that for each d-coloring of functions from [q] to B for all q ≤ n there is n 0 ≤ n and a function w 0: [n 0] → B ∪[m] with properties (i) and (ii) such that the set

    $$\{v \circ w_{0}: v: B \cup [p] \rightarrow B,\,p \leq m,\,v \upharpoonright B = \mathrm{id}_{B}\}$$

    is monochromatic.

By the discussion at the beginning of this appendix, it is clear that (HL1) and (HL2) follow from (a) and (b), respectively, by taking B = A t.