Originally, Ramsey theory investigates the behavior of structures with respect to colorings of substructures into a fixed number of classes, typically into two classes. Probably the best-known example is the pigeon hole principle, which says that for every 2-coloring of ω there exists an infinite subset \(F \subseteq \omega\) which is monochromatic. Of course, if we allow colorings with an unbounded number of colors, then the conclusion of the pigeon hole principle does not have to hold. For example, we could take Δ(n) = n for every n < ω. However, in this case we have an infinite set which meets each color in at most one element. It is an easy observation that one of these two possibilities must always occur: for every coloring \(\varDelta:\omega \rightarrow \omega\) there exists an infinite set \(F \subseteq \omega\) such that either \(\varDelta \rceil F\) is monochromatic or \(\varDelta \rceil F\) is one-to-one, i.e., any two elements of F have different colors. This is the most elementary example of a canonical partition theorem, first introduced by Erdős and Rado (1950), who studied unbounded colorings of finite sets. A coloring \(\varDelta: {[\omega ]}^{k} \rightarrow \omega\) of the k-subsets of the nonnegative integers is canonical if there exists a \(J \subseteq k\) such that \(\varDelta (X) =\varDelta (\mathrm{Y})\) if and only if \(X: J = Y: J\) for every pair \(X,Y \in {[\omega ]}^{k}\). The Erdős-Rado canonization Theorem 1.4 then asserts that for every coloring \(\varDelta: {[\omega ]}^{k} \rightarrow \omega\) there exists \(F \in {[\omega ]}^{\omega }\) such that \(\varDelta \rceil {[F]}^{k}\) is canonical.
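A finite analogue of the dichotomy "monochromatic or one-to-one" can be sketched in code. The block below is only an illustration and not part of the original text: the name `mono_or_rainbow` and the representation of a coloring as a dictionary are assumptions made for the example. It relies on the counting fact that a set with more than \((s-1)^{2}\) elements either contains s elements of one color or meets at least s different colors.

```python
from collections import defaultdict

def mono_or_rainbow(coloring, size):
    """Return ('mono', S) or ('rainbow', S) with len(S) == size, or None.

    coloring: dict mapping elements to colors (hypothetical representation).
    A result is guaranteed whenever len(coloring) > (size - 1) ** 2.
    """
    classes = defaultdict(list)
    for x, c in coloring.items():
        classes[c].append(x)
    # Case 1: some color class is large enough.
    for members in classes.values():
        if len(members) >= size:
            return ("mono", members[:size])
    # Case 2: enough distinct colors; pick one representative per color.
    if len(classes) >= size:
        reps = [members[0] for members in classes.values()]
        return ("rainbow", reps[:size])
    return None

# Delta(n) = n is one-to-one, Delta(n) = n % 2 has a large color class.
identity = mono_or_rainbow({n: n for n in range(10)}, 4)
parity = mono_or_rainbow({n: n % 2 for n in range(10)}, 4)
```

With only four points and two colors of two points each, neither a monochromatic nor a one-to-one triple exists, which shows that the bound \((s-1)^{2}\) cannot be lowered in this finite version.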

In this chapter unrestricted colorings of parameter words are investigated and their canonical patterns are determined. As applications we derive a canonizing version of van der Waerden’s theorem from the corresponding result for zero-parameter words and the finite form of the Erdős-Rado canonization theorem from a canonizing version of the Graham-Rothschild theorem. In fact, throughout this chapter we will consider only parameter words over the trivial group.

A final remark concerns our notation. To indicate that we consider unbounded colorings we will always choose ω as their range, although it will quite often happen that only finitely many colors can actually be used.

1 Canonizing Hales-Jewett’s Theorem

In studying unbounded colorings of zero-parameter words we meet completely different patterns of ‘canonical colorings’ than for finite sets. Consider, e.g., the alphabet \(3 =\{ 0,1,2\}\) and the equivalence relation ≈ on 3 having 0 and 1 in the same class, 2 in another one. Define an (unbounded) coloring \(\varDelta _{\approx }: [3]\binom{n}{0} \rightarrow \omega\) by \(\varDelta _{\approx }(g) = g/_{\approx }\), where \(g/_{\approx }\in [\{0,2\}]\binom{n}{0}\) is the ≈ -quotient of g, i.e., \(g/_{\approx }(i) = 0\) if \(g(i) \in \{ 0,1\}\) and \(g/_{\approx }(i) = 2\) otherwise. Observe that \(\varDelta _{\approx }\) obeys a kind of uniform description: any two m-parameter words inherit the same pattern from \(\varDelta _{\approx }\). In the case m = 2, i.e., of planes, this pattern can be visualized as in Fig. 6.1.

Fig. 6.1 The canonical pattern on \({3}^{2}\)

Of course, every equivalence relation on the alphabet \(\{0,1,2\}\) leaves such a hereditary pattern. More generally, let A be any finite alphabet and let ≈ be an equivalence relation on A. Then every coloring \(\varDelta _{\approx }: [A]\binom{n}{0} \rightarrow \omega\) satisfying

$$\displaystyle{ \varDelta _{\approx }(g) =\varDelta _{\approx }(h)\quad \text{ if and only if }\quad g/_{\approx } = h/_{\approx } }$$
(6.1)

is hereditary in the sense that for every m and every \(f \in [A]\binom{n}{m}\) the restriction \(\varDelta _{\approx }\rceil f \cdot {A}^{m}\) again satisfies (6.1). The following theorem shows that these are all ‘canonical colorings’.

Theorem 6.1 (Canonical Hales-Jewett theorem).

Let A be a finite alphabet and m be a positive integer. Then there exists a positive integer \(n = \mathit{CHJ}(\vert A\vert,m)\) such that for every unbounded coloring \(\varDelta: [A]\binom{n}{0} \rightarrow \omega\) there exists \(f \in [A]\binom{n}{m}\) and there exists an equivalence relation \(\approx \) on A such that for all \(g,h \in [A]\binom{m}{0}\) it follows that

$$\displaystyle\begin{array}{rcl} & & \varDelta (f \cdot g) =\varDelta (f \cdot h)\quad \text{if and only if}\quad g/_{\approx }\; =\; h/_{\approx }, {}\\ & & \mathit{i.e}.,\mbox{ $g(i) \approx h(i)$}\mathit{for\ every\ }i < m. {}\\ \end{array}$$

Observe that considering unbounded colorings we are only interested in the pattern of these colorings but not in the actual colors. This is taken into account by considering equivalence relations, thus abstracting from the actual colors.
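For a concrete coloring and a concrete m-parameter word f, the conclusion of Theorem 6.1 can be checked mechanically. The following sketch is an illustration under assumptions that are not in the original text: parameters \(\lambda _{i}\) are encoded as tuples `("lam", i)`, letters are plain integers, and the helper names `compose` and `canonical_relation` are hypothetical. The candidate relation ≈ is read off from the first coordinate and then verified on all pairs.

```python
from itertools import product

def lam(i):
    return ("lam", i)

def compose(f, g):
    """f . g: replace every occurrence of lam(i) in f by g[i]."""
    return tuple(g[x[1]] if isinstance(x, tuple) else x for x in f)

def canonical_relation(delta, f, alphabet, m):
    """Return a map letter -> class representative if delta restricted to
    f . A^m is canonical in the sense of Theorem 6.1, otherwise None."""
    words = list(product(alphabet, repeat=m))
    color = {g: delta(compose(f, g)) for g in words}
    pad = (alphabet[0],) * (m - 1)
    # a ~ b iff the words (a, pad) and (b, pad) receive the same color
    cls = {a: next(b for b in alphabet if color[(b,) + pad] == color[(a,) + pad])
           for a in alphabet}
    ok = all(
        (color[g] == color[h]) == all(cls[x] == cls[y] for x, y in zip(g, h))
        for g in words for h in words
    )
    return cls if ok else None

A = (0, 1, 2)
merge01 = {0: 0, 1: 0, 2: 2}                 # 0 ~ 1, and 2 alone
f = (lam(0), 1, lam(1), lam(0), 2)           # a 2-parameter word of length 5
quotient_coloring = lambda w: tuple(merge01[a] for a in w)
found = canonical_relation(quotient_coloring, f, A, 2)
not_canonical = canonical_relation(sum, f, A, 2)
```

The quotient coloring recovers the relation it was built from, illustrating the hereditary property; the coloring by coordinate sums is not canonical on this particular f, since words such as (0, 2) and (1, 0) get the same color without being coordinatewise equivalent.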

A set \(\mathcal{E}\) of equivalence relations on \([A]\binom{m}{k}\) is called a canonical set of equivalence relations if \(\mathcal{E}\) is minimal (with respect to cardinality) such that there exists n so that for every unbounded coloring \(\varDelta: [A]\binom{n}{k} \rightarrow \omega\) there exists \(f \in [A]\binom{n}{m}\) and an equivalence relation ≈ in \(\mathcal{E}\) satisfying \(\varDelta (f \cdot g) =\varDelta (f \cdot h)\) if and only if \(g \approx h\), i.e., the equivalence relation induced by Δ coincides on f with ≈.

Theorem 6.1 together with the hereditary property of each of these equivalence relations implies that the set of all equivalence relations on \([A]\binom{m}{0}\) which are induced by equivalence relations on A forms a canonical set of equivalence relations on \([A]\binom{m}{0}\). In fact, this is the unique canonical set of equivalence relations on \([A]\binom{m}{0}\). Hence, it is justified to call a coloring \(\varDelta: [A]\binom{m}{0} \rightarrow \omega\) satisfying \(\varDelta (g) =\varDelta (h)\) if and only if \(g/_{\approx } = h/_{\approx }\) for some equivalence relation ≈ on A, a canonical coloring of zero-parameter words.

Proof of Theorem 6.1.

Assume that \(\varDelta: [A]\binom{n}{0} \rightarrow \omega\) is given. Consider the colorings that a line \(g \in [A]\binom{n}{1}\) induces: \(\langle \varDelta (g \cdot a)\mid a \in A\rangle\). In the following we are not interested in the actual coloring of the line, but only in its pattern, i.e., for which a’s in A we get the same color and for which different ones. We can thus describe the pattern of a line by an equivalence relation on the alphabet A. Let \(r_{a}\) denote the number of equivalence relations on A. We just convinced ourselves that every coloring \(\varDelta: [A]\binom{n}{0} \rightarrow \omega\) gives rise to a coloring \({\varDelta }^{{\ast}}: [A]\binom{n}{1} \rightarrow r_{a}\) which assigns to each line the equivalence relation on A that corresponds to the pattern induced by Δ on this line. Observe that the Graham-Rothschild theorem implies that for \(n = GR(\vert A\vert,1,M,r_{a})\), where M is yet to be determined, there exists an \(f \in [A]\binom{n}{M}\) that is monochromatic with respect to \({\varDelta }^{{\ast}}\).
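The passage from Δ to the bounded coloring of lines can be made concrete. The sketch below is illustrative only: the encoding of \(\lambda _{0}\) as `("lam", 0)` and the names `line_pattern` and `bell` are assumptions of the example. It computes the pattern of a line as a partition of A and the number of equivalence relations on A, which is the Bell number of |A|.

```python
def lam(i):
    return ("lam", i)

def compose(f, g):
    """f . g: replace every occurrence of lam(i) in f by g[i]."""
    return tuple(g[x[1]] if isinstance(x, tuple) else x for x in f)

def line_pattern(delta, line, alphabet):
    """Partition of the alphabet induced by delta on the points line . a."""
    blocks = {}
    for a in alphabet:
        blocks.setdefault(delta(compose(line, (a,))), []).append(a)
    return frozenset(frozenset(b) for b in blocks.values())

def bell(n):
    """Number of equivalence relations on an n-element set (Bell triangle)."""
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
    return row[0]

A = (0, 1, 2)
line = (lam(0), 1, lam(0))                   # a line in [3] binom(3, 1)
merge01 = {0: 0, 1: 0, 2: 2}
pattern_sum = line_pattern(sum, line, A)
pattern_quot = line_pattern(lambda w: tuple(merge01[a] for a in w), line, A)
number_of_patterns = bell(len(A))
```

For the three-letter alphabet there are five possible patterns, so the induced coloring of lines uses at most five colors regardless of how many colors Δ itself uses.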

We now repeat the above argument for m-spaces instead of lines. Every m-space \(g \in [A]\binom{M}{m}\) induces a pattern with respect to the colors \(\langle \varDelta ((f \cdot g) \cdot h)\mid h \in [A]\binom{m}{0}\rangle\), and thus an equivalence relation on \([A]\binom{m}{0}\). Let \(\hat{r}_{a}\) denote the number of equivalence relations on \([A]\binom{m}{0}\) and let \({\varDelta }^{{\ast}{\ast}}: [A]\binom{M}{m} \rightarrow \hat{ r}_{a}\) denote the coloring that assigns to every m-space the equivalence relation on \([A]\binom{m}{0}\) that corresponds to the pattern induced by Δ on this m-space. Applying the Graham-Rothschild theorem again implies that for \(M = GR(\vert A\vert,m,m + 1,\hat{r}_{a})\) there exists an \(f^{\prime} \in [A]\binom{M}{m + 1}\) that is monochromatic with respect to \({\varDelta }^{{\ast}{\ast}}\).

Observe that f, f′ induce a coloring \(\hat{\varDelta }: [A]\binom{m + 1}{0} \rightarrow \omega\), defined by

$$\displaystyle{\hat{\varDelta }(\hat{f}) =\varDelta ((f \cdot f^{\prime}) \cdot \hat{ f})\quad \text{for every}\quad \hat{f} \in [A]\binom{m + 1}{0}.}$$

By construction we also have

  1. (1)

    The patterns which \(\hat{\varDelta }\) leaves on lines are all the same, i.e.

    $$\displaystyle{\hat{\varDelta }(\eta \cdot a) =\hat{\varDelta } (\eta \cdot b)\quad \text{if and only if}\quad \hat{\varDelta }(\eta ^{\prime} \cdot a) =\hat{\varDelta } (\eta ^{\prime} \cdot b)}$$

    for all \(\eta,\eta ^{\prime} \in [A]\binom{m + 1}{1}\) and all \(a,b \in A\),

and, additionally,

  1. (2)

    The patterns which \(\hat{\varDelta }\) leaves on m-spaces are all the same, i.e.

    $$\displaystyle{\hat{\varDelta }(\xi \cdot g) =\hat{\varDelta } (\xi \cdot h)\quad \text{if and only if}\quad \hat{\varDelta }(\xi ^{\prime} \cdot g) =\hat{\varDelta } (\xi ^{\prime} \cdot h)}$$

    for all \(\xi,\xi ^{\prime} \in [A]\binom{m + 1}{m}\) and all \(g,h \in [A]\binom{m}{0}\).

(We remark in passing that by repeating the above argument multiple times we could even ensure that the patterns which \(\hat{\varDelta }\) leaves on i-spaces are all the same, for all \(1 \leq i \leq m\). However, in the following we do not need this generalization.)

In the following we use the notation \(\triangleq \) to abbreviate facts (1) and (2). More precisely, for \(a,b \in A\) we write \(a \triangleq b\) if \(\hat{\varDelta }(\eta \cdot a) =\hat{\varDelta } (\eta \cdot b)\) for some (and hence for all) \(\eta \in [A]\binom{m + 1}{1}\). Similarly, for \(g,h \in [A]\binom{m}{0}\) we write \(g \triangleq h\) if \(\hat{\varDelta }(\xi \cdot g) =\hat{\varDelta } (\xi \cdot h)\) for some (and hence again for all) \(\xi \in [A]\binom{m + 1}{m}\).

We define the relation ≈ on A as follows: \(a \approx b\) if and only if \(a \triangleq b\). The idea now is to show that \(g \triangleq h\) if and only if \(g/_{\approx } = h/_{\approx }\). Observe that in this case an m-parameter word \(f \cdot f^{\prime}\cdot \xi \in [A]\binom{n}{m}\), where \(\xi \in [A]\binom{m + 1}{m}\) is an arbitrarily chosen m-parameter word, together with ≈ satisfies the theorem.

First consider \(g,h \in [A]\binom{m}{0}\) such that \(g/_{\approx } = h/_{\approx }\). We prove by induction that

$$\displaystyle{(g_{0},g_{1},\ldots,g_{m-1}) \triangleq (h_{0},h_{1},\ldots,h_{i-1},g_{i},\ldots,g_{m-1})}$$

for all \(i \leq m\). This is trivially satisfied for i = 0. Assume it holds for some i < m, and consider the line \(\eta = (h_{0},h_{1},\ldots,h_{i-1},\lambda _{0},g_{i+1}\ldots,g_{m-1}) \in [A]\binom{m}{1}\) and an arbitrary m-parameter word \(\xi \in [A]\binom{m + 1}{m}\). Observe that \(g_{i} \approx h_{i}\) implies \(\hat{\varDelta }((\xi \cdot \eta ) \cdot g_{i}) =\hat{\varDelta } ((\xi \cdot \eta ) \cdot h_{i})\), and thus \(\eta \cdot g_{i} \triangleq \eta \cdot h_{i}\). As

$$\displaystyle\begin{array}{rcl} \eta \cdot g_{i}& =& (h_{0},h_{1},\ldots,h_{i-1},g_{i},\ldots,g_{m-1})\qquad \text{and} {}\\ \eta \cdot h_{i}& =& (h_{0},h_{1},\ldots,h_{i-1},h_{i},g_{i+1}\ldots,g_{m-1}), {}\\ \end{array}$$

we deduce that the induction hypothesis also holds for i + 1. Note that for i = m we get \(g \triangleq h\), as desired.

Let us now assume that \(g,h \in [A]\binom{m}{0}\) are such that \(g/_{\approx }\neq h/_{\approx }\). Choose \(i < m\) with \(g_{i}\not\approx h_{i}\) and consider \(\eta = (g_{0},\ldots,g_{i-1},g_{i},\lambda _{0},g_{i+1},\ldots,g_{m-1}) \in [A]\binom{m + 1}{1}\). Then \(g_{i}\not\approx h_{i}\) implies that

$$\displaystyle\begin{array}{rcl} & & \hat{\varDelta }(g_{0},\ldots,\ldots,g_{i-1},g_{i},g_{i},g_{i+1},\ldots,g_{m-1}) =\hat{\varDelta } (\eta \cdot g_{i}) \\ & & \qquad \qquad \qquad \neq \hat{\varDelta }(\eta \cdot h_{i}) =\hat{\varDelta } (g_{0},\ldots,\ldots,g_{i-1},g_{i},h_{i},g_{i+1},\ldots,g_{m-1}).{}\end{array}$$
(6.2)

In order to derive a contradiction, assume that \(g \triangleq h\) and consider the m-parameter words

$$\displaystyle\begin{array}{rcl} \xi & =& (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\lambda _{i},\lambda _{i+1},\ldots,\lambda _{m-1}), {}\\ \xi ^{\prime}& =& (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},h_{i},\lambda _{i+1},\ldots,\lambda _{m-1}) \in [A]\binom{m + 1}{m}. {}\\ \end{array}$$

Then \(g \triangleq h\) implies \(\hat{\varDelta }(\xi \cdot g) =\hat{\varDelta } (\xi \cdot h)\) and \(\hat{\varDelta }(\xi ^{\prime} \cdot g) =\hat{\varDelta } (\xi ^{\prime} \cdot h)\). Closer inspection of the words ξ and ξ′ yields that \(\xi \cdot h =\xi ^{\prime} \cdot h\), thus

$$\displaystyle\begin{array}{rcl} & & \hat{\varDelta }(g_{0},\ldots,\ldots,g_{i-1},g_{i},g_{i},g_{i+1},\ldots,g_{m-1}) =\hat{\varDelta } (\xi \cdot g) {}\\ & & \qquad \qquad \qquad \qquad =\hat{\varDelta } (\xi ^{\prime} \cdot g) =\hat{\varDelta } (g_{0},\ldots,\ldots,g_{i-1},g_{i},h_{i},g_{i+1},\ldots,g_{m-1}). {}\\ \end{array}$$

which contradicts (6.2). Hence \(g \not\triangleq h\), as desired. This completes the proof of Theorem 6.1.

Schmerl (1993) applies this result to show that for every countable non-standard model \(\mathcal{M}\) of Peano arithmetic and every positive integer k ≥ 2 there exists a cofinal extension \(\mathcal{N}\) of \(\mathcal{M}\) such that the lattice \(\mathcal{L}(\mathcal{N}/\mathcal{M})\) of intermediate models is isomorphic to Π(k), the lattice of equivalence relations of a k-element set (cf. also Schmerl 1985).

The special case | A | = 2 of Theorem 6.1 admits the following formulation.

Corollary 6.2.

Let m be a positive integer. Then there exists a positive integer \(n = \mathit{CHJ}(2,m)\) such that for every coloring \(\varDelta: \mathcal{B}(n) \rightarrow \omega\) of the points of the n-dimensional Boolean lattice \(\mathcal{B}(n)\) there exists a \(\mathcal{B}(m)\) -sublattice \(\mathcal{L}\subseteq \mathcal{B}(n)\) such that either \(\varDelta \rceil \mathcal{L}\) is constant or \(\varDelta \rceil \mathcal{L}\) is one-to-one. □

Here we have the same kind of result as for the unbounded pigeon hole principle: the substructure we are looking for must either be colored monochromatically or one-to-one. Nešetřil and Rödl (1978b, 1979) call this phenomenon selectivity. We will meet this phenomenon several times in the sequel, e.g., in the next section in connection with van der Waerden’s theorem.

Recall that every finite poset can be embedded (order-preservingly) into some Boolean lattice \(\mathcal{B}(n)\), cf. Sect. 4.2.5. Hence, we immediately get

Corollary 6.3.

Let Q be a finite poset. Then there exists a finite poset P such that for every coloring \(\varDelta: P \rightarrow \omega\) of the points of P there exists a Q-subposet Q′ ⊆ P such that either \(\varDelta \rceil Q^{\prime}\) is monochromatic or \(\varDelta \rceil Q^{\prime}\) is one-to-one. □

2 Canonizing van der Waerden’s Theorem

As indicated in Sect. 4.2, van der Waerden’s theorem on arithmetic progressions is one of the most prominent applications of Hales-Jewett’s theorem. The aim of this section is to show how a canonical version of van der Waerden’s theorem can be obtained using the canonical Hales-Jewett theorem.

Theorem 6.4 (Canonical van der Waerden theorem).

Let t be a positive integer. Then there exists a positive integer n = EG(t) such that for every coloring \(\varDelta: n \rightarrow \omega\) there exists a t-term arithmetic progression X ⊆ n such that either \(\varDelta \rceil X\) is constant or \(\varDelta \rceil X\) is one-to-one.

At first glance it may look somewhat astonishing that the canonical Hales-Jewett theorem, which allows every pattern on the lines, can be used in order to obtain a selectivity result for arithmetic progressions. The original proof of Erdős and Graham (1980) used Szemerédi’s density result for arithmetic progressions. Later, an ‘elementary’ proof was obtained by Prömel and Rödl (1986). The proof given here is based on ideas from (Prömel and Rothschild 1987) which can also be used to prove a slightly stronger result, viz. a restricted version of the canonical van der Waerden theorem.

Proof of Theorem 6.4.

Let \(\ell= {(t - 1)}^{2} + 1\). It is easy to see that the first \(\ell\) nonnegative integers have the following property:

  1. (1)

    Let μ < ν < t be arbitrary and let ≈ be an equivalence relation on \(\ell\) such that every arithmetic progression of length t in \(\ell\) has its μth and its νth term in the same equivalence class. Then there is a t-term arithmetic progression in \(\ell\) which is completely contained in one equivalence class, e.g., the progression \(\mu +(\nu -\mu ) \cdot j,j < t\).
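Property (1) can be verified by a short computation. The sketch below is an illustration with hypothetical helper names (`witness`, `link`, `check`), not part of the original argument. It checks, for small t, that the progression \(\mu +(\nu -\mu ) \cdot j\) lies among the first \({(t - 1)}^{2} + 1\) nonnegative integers, and that its jth and (j + 1)th terms are the μth and νth terms of the progression \((\nu -\mu )j,(\nu -\mu )j + 1,\ldots,(\nu -\mu )j + t - 1\), which also lies in that range. By the hypothesis of (1) consecutive terms are then equivalent, and transitivity puts the whole progression into one class.

```python
def witness(t, mu, nu):
    """The progression mu + (nu - mu) * j, j < t, named in property (1)."""
    return [mu + (nu - mu) * j for j in range(t)]

def link(t, mu, nu, j):
    """A t-term progression (difference 1) whose mu-th and nu-th terms are
    the j-th and (j + 1)-th terms of the witness progression."""
    start = (nu - mu) * j
    return [start + c for c in range(t)]

def check(t):
    ell = (t - 1) ** 2 + 1
    for mu in range(t):
        for nu in range(mu + 1, t):
            w = witness(t, mu, nu)
            if not all(0 <= x < ell for x in w):
                return False
            for j in range(t - 1):
                p = link(t, mu, nu, j)
                if not all(0 <= x < ell for x in p):
                    return False
                if p[mu] != w[j] or p[nu] != w[j + 1]:
                    return False
    return True

all_ok = all(check(t) for t in range(2, 9))
```

The extreme case μ = 0, ν = t − 1 shows why \({(t - 1)}^{2} + 1\) integers are needed for this particular witness: its last term is exactly \({(t - 1)}^{2}\).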

Let \((X_{i})_{i<z}\) be an enumeration of all arithmetic progressions of length t in \(\ell\) and assume \(X_{i} =\{ x_{i,0},\ldots,x_{i,t-1}\}\) for every i < z is in ascending order.

Choose \(n = \mathit{CHJ}(\ell,z)\) according to the canonical Hales-Jewett theorem and let \(\varDelta: (\ell-1)n + 1 \rightarrow \omega\) be a coloring. Consider the coloring \({\varDelta }^{{\ast}}: [\ell]\binom{n}{0} \rightarrow \omega\) which is defined by

$${\displaystyle{\varDelta }^{{\ast}}(g_{ 0},\ldots,g_{n-1}) =\varDelta (\sum _{i<n}g_{i}).}$$

By choice of n there exists \(f \in [\ell]\binom{n}{z}\) and an equivalence relation ≈ on \(\ell\) such that \({\varDelta }^{{\ast}}\rceil f\) is canonical, meaning that for all \(g,h \in [\ell]\binom{z}{0}\) we have:

$${\displaystyle{\varDelta }^{{\ast}}(f \cdot g) {=\varDelta }^{{\ast}}(f \cdot h)\quad \text{if and only if}\quad g_{ i} \approx h_{i}\mbox{ for every $i < z$}.}$$

Let \(F =\sum \{ f_{i}\mid f_{i} \in \ell\}\) and put \({\zeta }^{j} = (x_{0,j},x_{1,j},\ldots,x_{z-1,j})\) for j < t and consider \(\{f {\cdot \zeta }^{j}\mid j < t\}\). Observe that \(\{F +\sum _{i<z}x_{\mathit{ij}}\mid j < t\}\) forms a t-term arithmetic progression.
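The observation that the sums along \(f {\cdot \zeta }^{j}\) form an arithmetic progression can be tried out on a small instance. The following sketch is illustrative; the encoding of parameters as `("lam", i)`, the helper names, and the particular word f (one constant entry followed by each parameter once) are assumptions of the example rather than choices made in the text.

```python
def lam(i):
    return ("lam", i)

def compose(f, g):
    """f . g: replace every occurrence of lam(i) in f by g[i]."""
    return tuple(g[x[1]] if isinstance(x, tuple) else x for x in f)

def progressions(ell, t):
    """All t-term arithmetic progressions inside {0, ..., ell - 1}, ascending."""
    return [[a + d * j for j in range(t)]
            for d in range(1, ell) for a in range(ell)
            if a + d * (t - 1) < ell]

def sums_along(f, X, t):
    """The integers sum(f . zeta^j), j < t, where zeta^j lists the j-th terms."""
    z = len(X)
    zetas = [tuple(X[i][j] for i in range(z)) for j in range(t)]
    return [sum(compose(f, zeta)) for zeta in zetas]

t = 3
ell = (t - 1) ** 2 + 1                       # 5
X = progressions(ell, t)                     # z = 4 progressions
f = (2,) + tuple(lam(i) for i in range(len(X)))  # hypothetical f with F = 2
sums = sums_along(f, X, t)
differences = {b - a for a, b in zip(sums, sums[1:])}
```

Here the common difference of the resulting progression equals the sum of the differences of the progressions \(X_{i}\); if a parameter occurred several times in f, its progression would contribute with the corresponding multiplicity.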

First assume that \({\varDelta }^{{\ast}}\rceil \{f {\cdot \zeta }^{j}\mid j < t\}\) is one-to-one. Then, clearly, \(\varDelta \rceil \{F +\sum _{i<z}x_{\mathit{ij}}\mid j < t\}\) is also one-to-one and we are done.

So assume that there exist μ < ν < t such that

$${\displaystyle{\varDelta }^{{\ast}}(f {\cdot \zeta }^{\mu }) {=\varDelta }^{{\ast}}(f {\cdot \zeta }^{\nu }).}$$

But then \(x_{j,\mu } \approx x_{j,\nu }\) for every j < z. So by (1) there exists an arithmetic progression \(X_{i}\) such that \(x_{i,0} \approx x_{i,1} \approx \ldots \approx x_{i,t-1}\). Let

$${\displaystyle{\xi }^{j} = (\underbrace{\mathop{0,\ldots,0}}\limits _{ z-1},x_{\mathit{ij}}),}$$

for every j < t. Then \({\varDelta }^{{\ast}}\rceil \{f {\cdot \xi }^{j}\mid j < t\}\) is constant and hence, by definition, also \(\varDelta \rceil \{F + x_{\mathit{ij}}\mid j < t\}\). Observing that \(\{F + x_{\mathit{ij}}\mid j < t\}\) forms a t-term arithmetic progression completes the proof of Theorem 6.4. □

Concerning more than one dimension, a canonical version of Gallai-Witt’s theorem was proved by Deuber et al. (1983) for finite subsets of the integer lattice grid and by Spencer (1983) for arbitrary finite subsets of the Euclidean space, both based on Furstenberg-Katznelson’s density version of the Gallai-Witt result. Simplified proofs are given in Prömel and Rödl (1986) and Prömel and Rothschild (1987). Although the method used to prove the canonical van der Waerden theorem can easily be adapted to derive a canonical version of Gallai-Witt’s theorem, there exist additional canonical patterns in this higher dimensional case. We omit the result.

3 Canonizing Graham-Rothschild’s Theorem

Next we consider an extension of the canonizing version of Hales-Jewett’s theorem to higher dimensions. Here, the canonical colorings occurring in the Erdős-Rado canonization theorem and those from the canonical Hales-Jewett theorem come together, finding a kind of common generalization.

Consider the surjective mapping ϕ: \([A]\binom{m}{k} \rightarrow {[m]}^{k}\) given by \(\phi (f) =\{\min {f}^{-1}(\lambda _{i})\mid i < k\}\), cf. Sect. 3.1.2. This mapping shows that every canonical coloring \(\varDelta _{J}: {[m]}^{k} \rightarrow \omega\), where \(J \subseteq k\) and \(\varDelta _{J}(X) = X: J\), gives rise to a canonical coloring

$$\displaystyle{\varDelta: [A]\binom{m}{k} \rightarrow \omega \quad \text{via}\quad \varDelta (f) = (\phi (f)): J.}$$
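A small sketch of this first type of coloring follows. It assumes the usual reading of \(X: J\) as the set of those elements of \(X =\{ x_{0} <\ldots < x_{k-1}\}\) whose index lies in J, and it encodes parameters as `("lam", i)`; both the encoding and the function names are hypothetical.

```python
def lam(i):
    return ("lam", i)

def phi(f, k):
    """Positions of the first occurrences of lam(0), ..., lam(k - 1) in f."""
    return tuple(f.index(lam(i)) for i in range(k))

def restrict(X, J):
    """X : J, the elements of X (in ascending order) at the indices in J."""
    xs = sorted(X)
    return tuple(xs[j] for j in sorted(J))

def color(f, k, J):
    """The coloring f -> phi(f) : J."""
    return restrict(phi(f, k), J)

f = (0, lam(0), 1, lam(1), lam(0))           # a word in [2] binom(5, 2)
g = (1, lam(0), 0, lam(1), lam(1))
h = (lam(0), 0, 1, lam(1), 1)
```

With J = {1} the words f, g and h all receive the same color, since \(\lambda _{1}\) first occurs at position 3 in each of them; with J = {0, 1} the word h is separated from f and g.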

On the other hand, every equivalence relation ≈ on \(A \cup \{\lambda _{0},\ldots,\lambda _{k-1}\}\) allows one to color the k-parameter words in \([A]\binom{m}{k}\) according to their ≈ -quotient.

It turns out that all colorings which are relevant for canonizing the Graham-Rothschild theorem can be produced by combining these two types of colorings appropriately.

Let \(J \subseteq k\) be any subset of k and put \({J}^{+} = J \cup \{ k\}\). For \(i \leq k\) let \(\mathit{pre}(i)\,:=\,\max \{j \in {J}^{+}\mid j < i\}\) (and \(\mathit{pre}(i) = -1\) if no such element exists), and \(\mathit{suc}(i)\,:=\,\min \{j \in {J}^{+}\mid j > i\}\). Consider a family of equivalence relations \(\{\approx _{i}\}_{i\in {J}^{+}}\), where \(\approx _{i}\) is defined on \(A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}\). We associate to the pair \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) an equivalence relation \(\approx _{\varPi }\) on \([A]\binom{n}{k}\) by putting

\(g \approx _{\varPi }h\quad\) if and only if for every \(i \in {J}^{+}\)

  • (1) \(\min {g}^{-1}(\lambda _{i}) =\min {h}^{-1}(\lambda _{i})\),

  • (2) \(g(\nu ) \approx _{i}h(\nu )\) \(\forall \) \(\min {g}^{-1}(\lambda _{\mathit{pre}(i)}) <\nu <\min {g}^{-1}(\lambda _{i})\),

where we tacitly agree that \(\ \min {g}^{-1}(\lambda _{-1}) = -1\) and \(\ \min {g}^{-1}(\lambda _{k}) = m\).
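The definition of \(\approx _{\varPi }\) translates directly into a test on two parameter words. The sketch below is one possible reading and not the authors' formulation: parameters are encoded as `("lam", i)`, each relation \(\approx _{i}\) is given as a dictionary `rel[i]` mapping symbols to class labels, and the function names are hypothetical.

```python
def lam(i):
    return ("lam", i)

def first_pos(g, i, k):
    """min g^{-1}(lam(i)), with the conventions for i = -1 and i = k."""
    if i == -1:
        return -1
    if i == k:
        return len(g)
    return g.index(lam(i))

def equivalent(g, h, k, J, rel):
    """Decide whether g and h are equivalent with respect to (J, rel)."""
    prev = -1                                 # pre(i) for the current i
    for i in sorted(set(J) | {k}):            # i runs through J+
        lo, hi = first_pos(g, prev, k), first_pos(g, i, k)
        if hi != first_pos(h, i, k):          # condition (1)
            return False
        # condition (2): entries strictly between the two first occurrences
        if any(rel[i][g[v]] != rel[i][h[v]] for v in range(lo + 1, hi)):
            return False
        prev = i
    return True

k = 1
# J = {0} with identity relations: words are equivalent only if equal.
rel_eq = {0: {0: 0, 1: 1}, 1: {0: 0, 1: 1, lam(0): 2}}
# J = {} with a single class on A and lam(0): all words are equivalent.
rel_const = {1: {0: 0, 1: 0, lam(0): 0}}
g = (0, lam(0), 1)
h = (0, lam(0), lam(0))
u = (lam(0), 1, 1)
```

The two example pairs sit at the two extremes: the first one yields a one-to-one coloring of the 1-parameter words, the second one a constant coloring.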

Note that the definition of \(\approx _{\varPi }\) does not depend on the dimension of the parameter words on which it is imposed. The pair \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) is called an (A, k)-canonical pair if and only if

  1. (3)

    For every \(j \in J\) we have \(\alpha \approx _{j}\beta\) implies \(\alpha \approx _{\mathit{suc}(j)}\beta\) for all \(\alpha,\beta \in A \cup \{\lambda _{0},\ldots,\lambda _{j-1}\}\), i.e., the family of equivalence relations is getting coarser, and

  2. (4)

    For every \(j \in \{ 0,\ldots,k - 1\}\setminus J\) there exists \(\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{j-1}\}\) such that \(\alpha \approx _{\mathit{suc}(j)}\lambda _{j}\).
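Conditions (3) and (4) are finite checks and can be sketched as follows. As before, this is an illustration under assumptions: parameters are encoded as `("lam", i)`, each relation is a dictionary `rel[i]` from symbols to class labels, and the name `is_canonical_pair` is hypothetical.

```python
def lam(i):
    return ("lam", i)

def is_canonical_pair(alphabet, k, J, rel):
    """Check conditions (3) and (4) for the pair (J, rel)."""
    j_plus = sorted(set(J) | {k})

    def suc(j):
        return min(i for i in j_plus if i > j)

    def domain(j):
        return list(alphabet) + [lam(i) for i in range(j)]

    # (3): the relations get coarser along J+.
    for j in J:
        s = suc(j)
        for a in domain(j):
            for b in domain(j):
                if rel[j][a] == rel[j][b] and rel[s][a] != rel[s][b]:
                    return False
    # (4): a parameter outside J is identified with an earlier symbol.
    for j in range(k):
        if j not in J:
            s = suc(j)
            if not any(rel[s][a] == rel[s][lam(j)] for a in domain(j)):
                return False
    return True

A = (0, 1)
identity_pair = ({0}, {0: {0: 0, 1: 1}, 1: {0: 0, 1: 1, lam(0): 2}})
constant_pair = (set(), {1: {0: 0, 1: 0, lam(0): 0}})
bad_pair = (set(), {1: {0: 0, 1: 1, lam(0): 2}})   # violates (4)
```

The third pair fails because \(\lambda _{0}\) is not recorded by J and yet is not identified with any letter, so its position would be neither fixed nor forgotten.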

Observe that condition (3) assures that the associated equivalence relation \(\approx _{\varPi }\) is hereditary, meaning that for every \(f \in [A]\binom{n}{m}\) the restriction of \(\approx _{\varPi }\) to f yields the same equivalence relation, i.e., \(f \cdot g \approx _{\varPi }f \cdot h\) if and only if \(g \approx _{\varPi }h\). We prove now that any two equivalence relations which are associated to distinct canonical pairs are essentially different and then we show that the set of equivalence relations which come from (A, k)-canonical pairs indeed forms a canonical set of equivalence relations on \([A]\binom{n}{k}\).

Proposition 6.5.

Let \(\varPi _{0} = (J_{0},(\approx _{i}^{0})_{i\in J_{0}^{+}})\) and \(\varPi _{1} = (J_{1},(\approx _{i}^{1})_{i\in J_{1}^{+}})\) be distinct (A,k)-canonical pairs. Then for every \(f \in [A]\binom{n}{m}\) the restrictions of \(\approx _{\varPi _{0}}\) and \(\approx _{\varPi _{1}}\) to f are distinct.

Proof.

Fix some \(f \in [A]\binom{n}{m}\). First assume that \(J_{0}\neq J_{1}\). Without loss of generality we can assume that there exists \(j \in J_{0}\) such that \(j\not\in J_{1}\). By (4) we know that there exists \(\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{j-1}\}\) so that \(\alpha \approx _{i}^{1}\lambda _{j}\) (where i > j is minimal so that \(i \in J_{1}^{+}\)). Consider

$$\displaystyle{g = (\lambda _{0},\ldots,\lambda _{j-1},\lambda _{j},\lambda _{j},\lambda _{j+1},\ldots,\lambda _{k-1},\lambda _{0},\ldots,\lambda _{0}) \in [A]\binom{m}{k}}$$

and

$$\displaystyle{h = (\lambda _{0},\ldots,\lambda _{j-1},\alpha,\lambda _{j},\lambda _{j+1},\ldots,\lambda _{k-1},\lambda _{0},\ldots,\lambda _{0}) \in [A]\binom{m}{k}.}$$

Then,

$$\displaystyle\begin{array}{rcl} f \cdot g\not\approx _{\varPi _{0}}f \cdot h,& & \mbox{ as $\min {(f \cdot g)}^{-1}(\lambda _{j})\not =\min {(f \cdot h)}^{-1}(\lambda _{j})$, but} {}\\ f \cdot g \approx _{\varPi _{1}}f \cdot h,& & \mbox{ as $\alpha \approx _{i}^{1}\lambda _{j}$ implies by (3) that $\alpha \approx _{\ell}^{1}\lambda _{j}$ for every $i \leq \ell\leq k$.} {}\\ \end{array}$$

Now assume that \(J_{0} = J_{1}\), but there exist \(i \in J_{0}^{+}\) and \(\alpha,\beta \in A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}\) so that \(\alpha \not\approx _{i}^{0}\beta\), but \(\alpha \approx _{i}^{1}\beta\). Put

$$\displaystyle{g = (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\lambda _{i},\lambda _{i+1},\ldots,\lambda _{k-1},\lambda _{0},\ldots,\lambda _{0}) \in [A]\binom{m}{k}}$$

and

$$\displaystyle{h = (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\lambda _{i+1},\ldots,\lambda _{k-1},\lambda _{0},\ldots,\lambda _{0}) \in [A]\binom{m}{k}.}$$

Then, obviously, \(f \cdot g\not\approx _{\varPi _{0}}f \cdot h\), but \(f \cdot g \approx _{\varPi _{1}}f \cdot h\), as above, completing the proof of Proposition 6.5. □

Theorem 6.6 (Canonical Graham-Rothschild theorem).

Let A be a finite alphabet and k, m be positive integers. Then there exists \(n = \mathit{PV}(\vert A\vert,k,m)\) such that for every coloring \(\varDelta: [A]\binom{n}{k} \rightarrow \omega\) there exists \(f \in [A]\binom{n}{m}\) and there exists an (A,k)-canonical pair \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) such that for all \(g,h \in [A]\binom{m}{k}\) we have

$$\displaystyle{\varDelta (f \cdot g) =\varDelta (f \cdot h)\quad \text{if and only if}\quad g \approx _{\varPi }h.}$$

Proof.

Proceeding as in the proof of the canonical Hales-Jewett theorem we observe that by using the (classical) Graham-Rothschild theorem twice we may assume that there exists \(\hat{f} \in [A]\binom{n}{m + 1}\) such that \(\hat{\varDelta }: [A]\binom{m + 1}{k} \rightarrow \omega\),

$$\displaystyle{\hat{\varDelta }(g)\,:=\,\varDelta (\hat{f} \cdot g)\quad \text{ for }\quad g \in [A]\binom{m + 1}{k},}$$

satisfies:

  1. (1a)

    The pattern which \(\hat{\varDelta }\) leaves to the (k + 1)-parameter subwords are all the same, i.e.,

    $$\displaystyle{\hat{\varDelta }(\eta \cdot a) =\hat{\varDelta } (\eta \cdot b)\quad \text{if and only if}\quad \hat{\varDelta }(\eta ^{\prime} \cdot a) =\hat{\varDelta } (\eta ^{\prime} \cdot b)}$$

    for all \(\eta,\eta ^{\prime} \in [A]\binom{m + 1}{k + 1}\) and all \(a,b \in [A]\binom{k + 1}{k}\), and additionally,

  2. (1b)

    The pattern which \(\hat{\varDelta }\) leaves to the m-parameter subwords are all the same, i.e.,

    $$\displaystyle{\hat{\varDelta }(\xi \cdot a) =\hat{\varDelta } (\xi \cdot b)\quad \text{if and only if}\quad \hat{\varDelta }(\xi ^{\prime} \cdot a) =\hat{\varDelta } (\xi ^{\prime} \cdot b)}$$

    for all \(\xi,\xi ^{\prime} \in [A]\binom{m + 1}{m}\) and all \(a,b \in [A]\binom{m}{k}\).

We define the relation \(\triangleq \) similarly as in the proof of the canonical Hales-Jewett theorem: for \(t \in \{ k + 1,m\}\) and \(a,b \in [A]\binom{t}{k}\), we write \(a \triangleq b\) if \(\hat{\varDelta }(f^{\prime} \cdot a) =\hat{\varDelta } (f^{\prime} \cdot b)\) for some (and hence for all) \(f^{\prime} \in [A]\binom{m + 1}{t}\). We also extend this notation to other values \(t \in \{ k + 1,\ldots,m + 1\}\) as follows: for \(a,b \in [A]\binom{t}{k}\), we write \(a \triangleq b\) if \(\hat{\varDelta }(f^{\prime} \cdot a) =\hat{\varDelta } (f^{\prime} \cdot b)\) for all \(f^{\prime} \in [A]\binom{m + 1}{t}\). We will repeatedly make use of the following simple fact that shows that the relation ≜ can be extended upwards:

  1. (1c)

    If \(a \triangleq b\) for some \(a,b \in [A]\binom{t}{k}\), then \(f^{\prime\prime} \cdot a \triangleq f^{\prime\prime} \cdot b\) for every \(f^{\prime\prime} \in [A]\binom{t^{\prime}}{t}\), \(t^{\prime} \in \{ t,\ldots,m + 1\}\).

To see this fix some \(f^{\prime\prime} \in [A]\binom{t^{\prime}}{t}\) and consider an arbitrary \(f^{\prime\prime\prime} \in [A]\binom{m + 1}{t^{\prime}}\); then \(f^{\prime\prime\prime} \cdot f^{\prime\prime} \in [A]\binom{m + 1}{t}\) and \(a \triangleq b\) thus implies that \(\hat{\varDelta }(f^{\prime\prime\prime} \cdot f^{\prime\prime} \cdot a) =\hat{\varDelta } (f^{\prime\prime\prime} \cdot f^{\prime\prime} \cdot b)\).

It remains to find an (A, k)-canonical pair Π such that

$$\displaystyle{g \triangleq h\quad \text{if and only if}\quad g \approx _{\varPi }h,}$$

for every pair \(g,h \in [A]\binom{m}{k}\). Note that then Π together with an m-parameter word \(\hat{f} \cdot f\), where \(f \in [A]\binom{m + 1}{m}\) is chosen arbitrarily, satisfies the theorem.

First we define equivalence relations \(\approx _{i}^{{\ast}}\) for all \(i \leq k\). These equivalence relations will later be used to obtain a set \(J \subseteq k\) and a family of equivalence relations \(\approx _{i}\), \(i \in {J}^{+}\), which form an (A, k)-canonical pair. Let \(\approx _{i}^{{\ast}}\) be defined on \(A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\) by \(\alpha \approx _{i}^{{\ast}}\beta\) if and only if \({\lambda }^{i}(\alpha ){ \triangleq \lambda }^{i}(\beta )\), where

$${\displaystyle{\lambda }^{i}(x) = (\lambda _{ 0},\ldots,\lambda _{i-1},x,\lambda _{i},\ldots,\lambda _{k-1}).}$$

In order to later define the desired set J, we first exhibit three properties of the relations \(\approx _{i}^{{\ast}}\):

  1. (2a)

    \(\alpha \approx _{i}^{{\ast}}\beta\) implies \(\alpha \approx _{i+1}^{{\ast}}\beta\), thus \(\approx _{i+1}^{{\ast}}\) is coarser than \(\approx _{i}^{{\ast}}\), for every i < k.

  2. (2b)

    Let \(\alpha \approx _{i}^{{\ast}}\lambda _{i}\) for some \(\alpha \in A \cup \{\lambda _{0},...,\lambda _{i-1}\}\). Then \(\approx _{i+1}^{{\ast}}\rceil A \cup \{\lambda _{0},\ldots,\lambda _{i}\} =\;\approx _{i}^{{\ast}}\).

Every parameter word \(g \in [A]\binom{m}{k}\) is naturally divided into k + 1 (possibly empty) pieces between the minimal occurrences of its k parameters. We denote by \(p(g,i) \subseteq m\) the positions of the ith of these k + 1 pieces. More formally,

$$\displaystyle{p(g,i) =\{ j < m\mid \min {g}^{-1}(\lambda _{ i-1}) < j <\min {g}^{-1}(\lambda _{ i})\}}$$

for \(i \leq k\), where we assume that \(\min {g}^{-1}(\lambda _{-1}) = -1\) and \(\min {g}^{-1}(\lambda _{k}) = m\).
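The division of a parameter word into pieces can be sketched in a few lines. The example assumes the encoding of parameters as `("lam", i)` and indexes the k + 1 pieces by i = 0, …, k; the function names are hypothetical.

```python
def lam(i):
    return ("lam", i)

def piece(g, i, k):
    """Positions strictly between the first occurrences of lam(i - 1) and lam(i)."""
    lo = -1 if i == 0 else g.index(lam(i - 1))
    hi = len(g) if i == k else g.index(lam(i))
    return list(range(lo + 1, hi))

def pieces(g, k):
    return [piece(g, i, k) for i in range(k + 1)]

g = (0, lam(0), 1, lam(0), lam(1), 0)        # a word in [2] binom(6, 2)
parts = pieces(g, 2)
```

For this word the pieces are the position sets {0}, {2, 3} and {5}; the positions 1 and 4 of the first occurrences of the parameters belong to no piece, and a repeated parameter such as the second \(\lambda _{0}\) at position 3 lies inside a piece.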

  1. (2c)

    Let \(g \in [A]\binom{m}{k}\) and \(\ell < m\) such that \(\ell \in p(g,i)\) for some \(i \leq k\). Then for any \(\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\) such that \(g_{\ell} \approx _{i}^{{\ast}}\alpha\) and

    $$\displaystyle{g^{\prime} = (g_{0},\ldots,g_{\ell-1},\alpha,g_{\ell+1},\ldots,g_{m-1}) \in [A]\binom{m}{k}}$$

    we have \(g \triangleq g^{\prime}\).

Proof of (2a): Assume that \(\alpha \approx _{i}^{{\ast}}\beta\). Applying (1c) with

$$\displaystyle\begin{array}{rcl} \eta & =& (\lambda _{0},\ldots,\lambda _{i},\lambda _{i+1},\lambda _{i},\lambda _{i+2},\ldots,\lambda _{k}), {}\\ \eta ^{\prime}& =& (\lambda _{0},\ldots,\lambda _{i},\lambda _{i+1},\alpha,\lambda _{i+2},\ldots,\lambda _{k}) \in [A]\binom{k + 2}{k + 1} {}\\ \end{array}$$

on \({\lambda }^{i}(\alpha ){ \triangleq \lambda }^{i}(\beta )\), we get

$$\displaystyle\begin{array}{rcl} \eta {\cdot \lambda }^{i}(\alpha )& =& (\lambda _{ 0},\ldots,\lambda _{i-1},\alpha,\lambda _{i},\alpha,\lambda _{i+1},\ldots,\lambda _{k-1}) {}\\ & \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\beta,\lambda _{i+1},\ldots,\lambda _{k-1})\; =\;\eta {\cdot \lambda }^{i}(\beta ), {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \eta ^{\prime} {\cdot \lambda }^{i}(\alpha )& =& (\lambda _{ 0},\ldots,\lambda _{i-1},\alpha,\lambda _{i},\alpha,\lambda _{i+1},\ldots,\lambda _{k-1}) {}\\ & \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\alpha,\lambda _{i+1},\ldots,\lambda _{k-1})\; =\;\eta ^{\prime} {\cdot \lambda }^{i}(\beta ). {}\\ \end{array}$$

Thus, by transitivity,

$$\displaystyle{(\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\beta,\lambda _{i+1},\ldots,\lambda _{k-1}) \triangleq (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\alpha,\lambda _{i+1},\ldots,\lambda _{k-1}).}$$

Now consider \(\eta ^{\prime\prime} = (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\ldots,\lambda _{k}) \in [A]\binom{k + 2}{k + 1}\) and observe that the equality above implies

$$\displaystyle\begin{array}{rcl} \eta ^{\prime\prime} {\cdot \lambda }^{i+1}(\alpha ) & =& (\lambda _{ 0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\alpha,\lambda _{i+1},\ldots,\lambda _{k-1}) {}\\ & \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\beta,\lambda _{i},\beta,\lambda _{i+1},\ldots,\lambda _{k-1}) {}\\ & =& \eta ^{\prime\prime} {\cdot \lambda }^{i+1}(\beta ). {}\\ \end{array}$$

Therefore, for any \(f \in [A]\binom{m + 1}{k + 2}\) we have \(\hat{\varDelta }((f \cdot \eta ^{\prime\prime}) {\cdot \lambda }^{i+1}(\alpha )) =\hat{\varDelta } ((f \cdot \eta ^{\prime\prime}) {\cdot \lambda }^{i+1}(\beta ))\), hence from (1a) we deduce \({\lambda }^{i+1}(\alpha ){ \triangleq \lambda }^{i+1}(\beta )\) which by definition implies \(\alpha \approx _{i+1}^{{\ast}}\beta\), proving (2a).
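The identities used in the proof of (2a) can be checked mechanically on a small instance. The sketch below assumes the usual composition of parameter words (every \(\lambda _{j}\) in the left factor is replaced by the j-th entry of the right factor) and assumes that \({\lambda }^{i}(\alpha )\) denotes the word \((\lambda _{0},\ldots,\lambda _{i-1},\alpha,\lambda _{i},\ldots,\lambda _{k-1})\), a reading that is consistent with the displayed equations. The encoding and the names `compose` and `insert_word` are hypothetical.

```python
from typing import List, Tuple, Union

# Hypothetical encoding: letters of A are strings, lambda_i is ("lam", i).
Symbol = Union[str, Tuple[str, int]]


def lam(i: int) -> Symbol:
    return ("lam", i)


def compose(f: List[Symbol], g: List[Symbol]) -> List[Symbol]:
    """f . g: replace every lambda_j occurring in f by the j-th entry of g."""
    return [g[s[1]] if isinstance(s, tuple) else s for s in f]


def insert_word(i: int, alpha: Symbol, k: int) -> List[Symbol]:
    """Assumed reading of lambda^i(alpha): alpha inserted at position i."""
    return [lam(j) for j in range(i)] + [alpha] + [lam(j) for j in range(i, k)]


# Instance k = 3, i = 1, alpha = "a", beta = "b".
k, i = 3, 1
eta = [lam(0), lam(1), lam(2), lam(1), lam(3)]
eta1 = [lam(0), lam(1), lam(2), "a", lam(3)]   # eta'
eta2 = [lam(0), "b", lam(1), lam(2), lam(3)]   # eta''

first = compose(eta, insert_word(i, "a", k))
lhs = compose(eta1, insert_word(i, "b", k))
rhs = compose(eta2, insert_word(i + 1, "a", k))
```

Here `first` is \((\lambda _{0},a,\lambda _{1},a,\lambda _{2})\), and `lhs` and `rhs` are both \((\lambda _{0},b,\lambda _{1},a,\lambda _{2})\), which is the word that links the step through \(\eta ^{\prime}\) with the step through \(\eta ^{\prime\prime}\).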

Proof of (2b): Let us assume \(\alpha \approx _{i}^{{\ast}}\lambda _{i}\) for some \(\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}\). From (2a) we already know that \(\approx _{i+1}^{{\ast}}\) is coarser than \(\approx _{i}^{{\ast}}\). We need to show that \(\approx _{i+1}^{{\ast}}\) restricted to \(A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\) is not strictly coarser than \(\approx _{i}^{{\ast}}\). In other words, we need to show that for \(\beta,\gamma \in A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\) with \(\beta \approx _{i+1}^{{\ast}}\gamma\) we also have \(\beta \approx _{i}^{{\ast}}\gamma\). Observe that the assumption \(\alpha \approx _{i}^{{\ast}}\lambda _{i}\) implies that \(\alpha \approx _{i+1}^{{\ast}}\lambda _{i}\) (as \(\approx _{i+1}^{{\ast}}\) is coarser). That is, without loss of generality we may assume that neither β nor γ is equal to \(\lambda _{i}\).

We proceed similarly as in the proof of (2a). Applying (1c) with

$$\displaystyle\begin{array}{rcl} \eta & =& (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\beta,\lambda _{i+1},\ldots,\lambda _{k}), {}\\ \eta ^{\prime}& =& (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\gamma,\lambda _{i+1},\ldots,\lambda _{k}) \in [A]\binom{k + 2}{k + 1} {}\\ \end{array}$$

on \({\lambda }^{i}(\alpha ){ \triangleq \lambda }^{i}(\lambda _{i})\), we get

$$\displaystyle\begin{array}{rcl} (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\beta,\lambda _{i},\ldots,\lambda _{k-1})& \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\beta,\lambda _{i},\ldots,\lambda _{k-1}), {}\\ (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\gamma,\lambda _{i},\ldots,\lambda _{k-1})& \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\gamma,\lambda _{i},\ldots,\lambda _{k-1}). {}\\ \end{array}$$

Similarly, applying (1c) with

$$\displaystyle{\eta ^{\prime\prime} = (\lambda _{0},\ldots,\lambda _{i-1},\lambda _{i},\lambda _{i+1},\lambda _{i},\lambda _{i+2},\ldots,\lambda _{k}) \in [A]\binom{k + 2}{k + 1}}$$

on \({\lambda }^{i+1}(\beta ){ \triangleq \lambda }^{i+1}(\gamma )\), which follows from \(\beta \approx _{i+1}^{{\ast}}\gamma\), implies

$$\displaystyle\begin{array}{rcl} (\lambda _{0},\ldots,\lambda _{i},\beta,\lambda _{i},\lambda _{i+1},\ldots,\lambda _{k-1}) \triangleq (\lambda _{0},\ldots,\lambda _{i},\gamma,\lambda _{i},\lambda _{i+1},\ldots,\lambda _{k-1}).& & {}\\ \end{array}$$

Therefore, by transitivity we have

$$\displaystyle{(\lambda _{0},\ldots,\lambda _{i-1},\alpha,\beta,\lambda _{i},\ldots,\lambda _{k-1}) \triangleq (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\gamma,\lambda _{i},\ldots,\lambda _{k-1}).}$$

Thus, applying \(\eta ^{\prime\prime\prime} = (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\lambda _{i},\ldots,\lambda _{k}) \in [A]\binom{k + 2}{k + 1}\) on the previous equality, we have

$$\displaystyle\begin{array}{rcl} \eta ^{\prime\prime\prime} {\cdot \lambda }^{i}(\beta )& =& (\lambda _{ 0},\ldots,\lambda _{i-1},\alpha,\beta,\lambda _{i},\ldots,\lambda _{k-1}) {}\\ & \triangleq & (\lambda _{0},\ldots,\lambda _{i-1},\alpha,\gamma,\lambda _{i},\ldots,\lambda _{k-1}) {}\\ & =& \eta ^{\prime\prime\prime} {\cdot \lambda }^{i}(\gamma ). {}\\ \end{array}$$

By the same argument as in the proof of (2a) we deduce that \(\beta \approx _{i}^{{\ast}}\gamma\), thus proving (2b).

Proof of (2c): Let \(g \in [A]\binom{m}{k}\) and \(\ell \in p(g,i)\) for some \(i < k + 1\), and consider any \(\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\) such that \(\alpha \approx _{i}^{{\ast}}g_{\ell}\). Then by applying (1c) with

$$\displaystyle{\eta = (g_{0},\ldots,g_{\ell-1},\lambda _{i},g_{\ell+1}^{{\ast}},\ldots,g_{ m-1}^{{\ast}}) \in [A]\binom{m}{k + 1},}$$

where

$$\displaystyle{g_{\nu }^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} g_{\nu } \quad &\mbox{ if $g_{\nu } \in A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}$} \\ \lambda _{\mu +1}\quad &\mbox{ if $g_{\nu } =\lambda _{\mu }$ for $\mu \geq i$,} \end{array} \right.}$$

on \({\lambda }^{i}(\alpha ){ \triangleq \lambda }^{i}(g_{\ell})\), we get

$$\displaystyle{g =\eta {\cdot \lambda }^{i}(g_{\ell}) \triangleq \eta {\cdot \lambda }^{i}(\alpha ) = g^{\prime},}$$

which completes the proof of (2c).

Completing the proof: With properties (2a), (2b) and (2c) at hand we complete the proof of the theorem as follows. Let

$$\displaystyle\begin{array}{rcl} J& \ \,:=\,\ & \{i < k\mid \alpha \not\approx _{i}^{{\ast}}\lambda _{ i}\mbox{ for every $\alpha \in A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}\}$}\qquad \text{ and } {}\\ \approx _{i}& \ \,:=\,\ & \approx _{i}^{{\ast}}\rceil A \cup \{\lambda _{ 0},\ldots,\lambda _{i-1}\}\mbox{ for every $i \in {J}^{+}$.} {}\\ \end{array}$$

We now show that these J ⊆ k and \(\approx _{i}\), \(i \in {J}^{+}\), are such that \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) is as required in the theorem.

By (2a) and the definition of J it is obvious that \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) is an (A, k)-canonical pair. In the remainder of the proof we verify that \(g \triangleq h\) if and only if \(g \approx _{\varPi }h\), for all \(g,h \in [A]\binom{m}{k}\). In doing so we will repeatedly use the following observation which immediately follows from the definition of J and (2b):

  • (∗) If i < k, j ∈ J are such that pre(j) < i ≤ j, then \(\pi _{i}^{{\ast}} =\pi _{j}\rceil A \cup \{\lambda _{0},\ldots,\lambda _{i}\}\).

First assume that \(g \approx _{\varPi }h\). We show, by induction, that there exist k-parameter words \({g}^{0},\ldots,{g}^{k},{h}^{0},\ldots,{h}^{k} \in [A]\binom{m}{k}\), such that for each t < k + 1 the following holds:

  1. (3a)

    \(\min {({g}^{t})}^{-1}(\lambda _{i}) =\min {({h}^{t})}^{-1}(\lambda _{i})\) for i < t, i.e., the first occurrences of each of the first t parameters are identical in \({g}^{t}\) and \({h}^{t}\),

  2. (3b)

    \({g}^{t} \approx _{\varPi }{h}^{t}\), and

  3. (3c)

    \({g}^{t-1} \triangleq {g}^{t}\) and \({h}^{t-1} \triangleq {h}^{t}\),

where \({g}^{-1} = g\) and \({h}^{-1} = h\). For t = 0, all three properties are trivially satisfied for \({g}^{0} = g\) and \({h}^{0} = h\). Assume now that the claim holds for some t < k. If \(\min {({g}^{t})}^{-1}(\lambda _{t}) =\min {({h}^{t})}^{-1}(\lambda _{t})\), then \({g}^{t+1} = {g}^{t}\) and \({h}^{t+1} = {h}^{t}\) satisfy the claim for t + 1. Otherwise, without loss of generality we assume that \(\min {({g}^{t})}^{-1}(\lambda _{t}) >\min {({h}^{t})}^{-1}(\lambda _{t})\). Note that this implies t ∉ J (as \({g}^{t} \approx _{\varPi }{h}^{t}\)) and \(\ell=\min {({h}^{t})}^{-1}(\lambda _{t}) \in p({g}^{t},t)\). From (∗) and \({g}^{t} \approx _{\varPi }{h}^{t}\) it thus follows that \(g_{\ell}^{t} \approx _{t}^{{\ast}}h_{\ell}^{t}\). Thus by applying (2c) with \(\alpha = h_{\ell}^{t} =\lambda _{t}\) to \({g}^{t}\) we get \({g}^{t+1} \in [A]\binom{m}{k}\),

$$\displaystyle{{g}^{t+1} = (g_{ 0}^{t},\ldots,g_{\ell -1}^{t},\lambda _{ t},g_{\ell+1}^{t},\ldots,g_{ m-1}^{t}),}$$

such that \({g}^{t} \triangleq {g}^{t+1}\). It is now easy to see that \({g}^{t+1}\), together with \({h}^{t+1} = {h}^{t}\), satisfies all three properties of the claim. For t = k, (3a) implies that \({g}^{k}\) and \({h}^{k}\) agree on all first occurrences of parameters. Thus for each \(\ell < m\) such that \(g_{\ell}^{k}\neq h_{\ell}^{k}\) we have \(\ell \in p({g}^{k},i)\) for some i < k + 1. Since \({g}^{k} \approx _{\varPi }{h}^{k}\), we can apply (2c) together with (∗) for \(\alpha = h_{\ell}^{k}\) to \({g}^{k}\), hence completely matching \({g}^{k}\) and \({h}^{k}\). Therefore \({g}^{k} \triangleq {h}^{k}\), and from (3c) we conclude \(g \triangleq h\).

Let us now assume that \(g\not\approx _{\varPi }h\). First we show that we may assume without loss of generality that g and h are such that there exist a position \(\ell\) and an index i < k such that the following three properties are satisfied:

  1. (4a)

    \(g_{\ell}\not\approx _{i}^{{\ast}}h_{\ell}\),

  2. (4b)

    For all i′ < i we have \(\min {g}^{-1}(\lambda _{i^{\prime}}) =\min {h}^{-1}(\lambda _{i^{\prime}}) <\ell\),

  3. (4c)

    \(\ell\leq \min {g}^{-1}(\lambda _{i}) \leq \min {h}^{-1}(\lambda _{i})\).

If the first occurrences of the parameters \(\lambda _{j}\) for j < k are all identical, then \(g\not\approx _{\varPi }h\) together with (∗) easily implies that there exist indices i and \(\ell\) that satisfy (4a)–(4c). Otherwise choose i < k minimal such that \(\min {g}^{-1}(\lambda _{i})\neq \min {h}^{-1}(\lambda _{i})\). We may assume without loss of generality (rename g and h if necessary) that \(\ell\,:=\,\min {g}^{-1}(\lambda _{i}) <\min {h}^{-1}(\lambda _{i})\). If \(g_{\ell}\not\approx _{i}^{{\ast}}h_{\ell}\) then we have found \(\ell\) and i that satisfy (4a)–(4c). So assume that \(g_{\ell} \approx _{i}^{{\ast}}h_{\ell}\). Apply (2c) to deduce that \(h^{\prime} = (h_{0},\ldots,h_{\ell-1},g_{\ell},h_{\ell+1},\ldots,h_{m-1})\) satisfies \(h \triangleq h^{\prime}\). Note that (∗) implies that we also have \(h^{\prime} \approx _{\varPi }h\). We may thus assume without loss of generality that h = h′. Repeating this process we see that we either find the desired \(\ell\) and i or we end up with g and h such that for all i < k we have \(\min {g}^{-1}(\lambda _{i}) =\min {h}^{-1}(\lambda _{i})\), which is the case that we already handled.

So assume now that \(\ell\) and i are such that (4a)–(4c) hold. Consider the (k + 1)-parameter word \(\eta \in [A]\binom{m + 1}{k + 1}\),

$$\displaystyle{\eta = (g_{0},\ldots,g_{\ell-1},\lambda _{i},g_{\ell}^{{\ast}},\ldots,g_{ m-1}^{{\ast}}),}$$

where

$$\displaystyle{g_{\nu }^{{\ast}} = \left \{\begin{array}{@{}l@{\quad }l@{}} g_{\nu } \quad &\mbox{ if $g_{\nu } \in A \cup \{\lambda _{0},\ldots,\lambda _{i-1}\}$} \\ \lambda _{j+1}\quad &\mbox{ if $g_{\nu } =\lambda _{j}$ for $j \geq i$.} \end{array} \right.}$$

Note that \(g_{\ell}\not\approx _{i}^{{\ast}}h_{\ell}\) implies, by definition, \({\lambda }^{i}(g_{\ell}){{ \bigtriangleup \atop \neq } \lambda }^{i}(h_{\ell})\). Then from (1a) we also have \(\eta {\cdot \lambda }^{i}(g_{\ell}){ \bigtriangleup \atop \neq } \eta {\cdot \lambda }^{i}(h_{\ell})\), thus

$$\displaystyle\begin{array}{rcl} \eta {\cdot \lambda }^{i}(g_{\ell})& =& (g_{ 0},\ldots,g_{\ell-1},g_{\ell},g_{\ell},\ldots,g_{m-1}) \\ & { \bigtriangleup \atop \neq } & (g_{0},\ldots,g_{\ell-1},h_{\ell},g_{\ell},\ldots,g_{m-1}) =\eta {\cdot \lambda }^{i}(h_{\ell}).{}\end{array}\tag{6.3}$$

For a contradiction, let us assume \(g \triangleq h\). Then applying (1c) with

$$\displaystyle\begin{array}{rcl} \xi & =& (\lambda _{0},\ldots,\lambda _{\ell-1},\lambda _{\ell},\lambda _{\ell},\lambda _{\ell+1},\ldots,\lambda _{m-1}), {}\\ \xi ^{\prime}& =& (\lambda _{0},\ldots,\lambda _{\ell-1},{h}^{{\ast}},\lambda _{\ell},\lambda _{\ell +1},\ldots,\lambda _{m-1}) \in [A]\binom{m + 1}{m}, {}\\ \end{array}$$

where \({h}^{{\ast}} = h_{\ell}\) if \(h_{\ell} \in A\) and \({h}^{{\ast}} =\lambda _{\min {h}^{-1}(\lambda _{j})}\) if \(h_{\ell} =\lambda _{j}\), we get

$$\displaystyle\begin{array}{rcl} \xi \cdot g& =& (g_{0},\ldots,g_{\ell-1},g_{\ell},g_{\ell},\ldots,g_{m-1}) {}\\ & \triangleq & (h_{0},\ldots,h_{\ell-1},h_{\ell},h_{\ell},\ldots,h_{m-1}) =\xi \cdot h, {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \xi ^{\prime} \cdot g& =& (g_{0},\ldots,g_{\ell-1},h_{\ell},g_{\ell},\ldots,g_{m-1}) {}\\ & \triangleq & (h_{0},\ldots,h_{\ell-1},h_{\ell},h_{\ell},\ldots,h_{m-1}) =\xi ^{\prime} \cdot h. {}\\ \end{array}$$

Note that \(\xi ^{\prime} \cdot g = (g_{0},\ldots,g_{\ell-1},h_{\ell},g_{\ell},\ldots,g_{m-1})\) comes from the fact that \(\min {g}^{-1}(\lambda _{j}) =\min {h}^{-1}(\lambda _{j})\), in case \(h_{\ell} =\lambda _{j}\). Therefore, by transitivity we have

$$\displaystyle{(g_{0},\ldots,g_{\ell-1},g_{\ell},g_{\ell},\ldots,g_{m-1}) \triangleq (g_{0},\ldots,g_{\ell-1},h_{\ell},g_{\ell},\ldots,g_{m-1}),}$$

which contradicts (6.3). Hence \(g{ \bigtriangleup \atop \neq } h\), which completes the proof of Theorem 6.6. □

This result was proved in Prömel and Voigt (1983), cf. also Prömel and Voigt (1986).

4 Applications

Every result which can be proved using the Graham-Rothschild theorem admits some kind of canonization using the canonizing version of Graham-Rothschild’s theorem instead. Here we will only discuss three examples where applying the canonizing Graham-Rothschild theorem easily gives a canonical set of equivalence relations.

4.1 Finite Unions and Finite Sums

The first application of the canonical Graham-Rothschild theorem is a canonizing version of the finite union theorem (cf. Sect. 5.2.4). Recall that every nonempty subset of n can be interpreted as an element of \([1]\binom{n}{1}\). Observing that there are precisely three \((\{0\},1)\)-canonical pairs, viz. \((\varnothing,(\{0,\lambda \})_{\approx _{1}}),(\{0\},(\{0\},\{\lambda \})_{\approx _{1}})\) and \((\{0\},(\{0,\lambda \})_{\approx _{1}})\), we obtain

Theorem 6.7.

Let m be a positive integer. Then there exists n = n(m) such that for every coloring \(\varDelta: \mathcal{B}(n) \rightarrow \omega\) there exist m mutually disjoint and nonempty subsets \(X_{0},\ldots,X_{m-1} \in \mathcal{B}(n)\) such that one of the following three cases is valid for all nonempty I,J ⊆ m:

  1. (1)

    \(\varDelta (\bigcup _{i\in I}X_{i}) =\varDelta (\bigcup _{j\in J}X_{j})\)

  2. (2)

    \(\varDelta (\bigcup _{i\in I}X_{i}) =\varDelta (\bigcup _{j\in J}X_{j})\quad\) if and only if I = J

  3. (3)

    \(\varDelta (\bigcup _{i\in I}X_{i}) =\varDelta (\bigcup _{j\in J}X_{j})\quad\) if and only if min I = min J.□
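For given disjoint blocks and a given coloring, the three cases of Theorem 6.7 can be tested by brute force. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the coloring is any Python callable on frozensets, and the code checks a candidate family of blocks rather than finding one, so it says nothing about the size of n(m).

```python
from itertools import chain, combinations
from typing import Callable, Hashable, List, Optional, Set


def nonempty_subsets(m: int):
    """All nonempty index sets I of {0, ..., m-1}, as sorted tuples."""
    return chain.from_iterable(combinations(range(m), r) for r in range(1, m + 1))


def union_pattern(delta: Callable[[frozenset], Hashable],
                  blocks: List[Set[int]]) -> Optional[int]:
    """Return 1, 2 or 3 if delta shows that case of Theorem 6.7 on the
    unions of the given disjoint blocks, and None otherwise."""
    index_sets = list(nonempty_subsets(len(blocks)))
    color = {I: delta(frozenset().union(*(blocks[i] for i in I)))
             for I in index_sets}
    cases = {
        1: lambda I, J: True,                 # monochromatic
        2: lambda I, J: I == J,               # one-to-one
        3: lambda I, J: min(I) == min(J),     # determined by min I
    }
    for case, same in cases.items():
        if all((color[I] == color[J]) == same(I, J)
               for I in index_sets for J in index_sets):
            return case
    return None


blocks = [{0, 1}, {2}, {3, 5}]
results = [union_pattern(d, blocks) for d in (lambda B: 0, lambda B: B, min, len)]
```

The constant coloring gives case (1), the identity coloring gives case (2), and the coloring by the minimum element gives case (3) because the blocks are listed in increasing order of their minima. The coloring by cardinality is not canonical on these particular blocks, so the function returns `None` for it.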

Using again the bijection κ: \(\mathcal{B}(n) \rightarrow {2}^{n}\) given by \(\kappa (B) =\sum _{i\in B}{2}^{i}\) for every B ⊆ n we obtain a canonical Rado-Folkman-Sanders theorem, viz.

Theorem 6.8.

Let m be a positive integer. Then there exists n = n(m) such that for every coloring \(\varDelta: n \rightarrow \omega\) there exist mutually distinct positive integers \(a_{0},\ldots,a_{m-1}\) such that one of the following three cases is valid for all nonempty I,J ⊆ m:

  1. (1)

    \(\varDelta (\sum _{i\in I}a_{i}) =\varDelta (\sum _{j\in J}a_{j})\)

  2. (2)

    \(\varDelta (\sum _{i\in I}a_{i}) =\varDelta (\sum _{j\in J}a_{j})\quad\) if and only if I = J

  3. (3)

    \(\varDelta (\sum _{i\in I}a_{i}) =\varDelta (\sum _{j\in J}a_{j})\quad\) if and only if \(\quad \min I =\min J\)
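The passage from unions to sums rests on the fact that κ turns disjoint unions into sums. A minimal sketch of this correspondence follows; the function names are hypothetical and the example blocks are chosen arbitrarily.

```python
from itertools import combinations


def kappa(B):
    """kappa(B) = sum of 2^i over i in B (binary encoding of a finite set)."""
    return sum(2 ** i for i in B)


def kappa_inverse(x):
    """The set of positions of the 1-bits of the nonnegative integer x."""
    return {i for i in range(x.bit_length()) if (x >> i) & 1}


# Mutually disjoint nonempty blocks and the integers they encode.
X = [{0, 3}, {1}, {2, 4}]
a = [kappa(B) for B in X]

# For disjoint blocks, kappa of a union equals the sum of the encodings.
additive = all(
    kappa(set().union(*(X[i] for i in I))) == sum(a[i] for i in I)
    for r in range(1, len(X) + 1)
    for I in combinations(range(len(X)), r)
)
```

In this example the blocks encode the integers 9, 2 and 20, and every finite sum of them encodes the corresponding union, which is how a coloring of integers is translated into a coloring of sets.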

It is interesting to note that if we color the finite subsets of ω instead of the subsets of some finite n (respectively, ω instead of n) and ask for the canonical patterns on finite unions (respectively, finite sums), then three patterns are no longer sufficient (Taylor 1976).

4.2 Boolean Lattices

From the canonical Hales-Jewett theorem we obtained that coloring the points (i.e., \(\mathcal{B}(0)\)-sublattices) of a sufficiently large Boolean lattice always yields a \(\mathcal{B}(m)\)-sublattice which is either colored monochromatically or one-to-one (Corollary 6.2). Clearly, these two patterns no longer suffice if we color \(\mathcal{B}(1)\)-sublattices, i.e., 2-element chains.

Every 2-element chain in a Boolean lattice is given by a pair \((X_{0},X_{0} \cup X_{1})\), where \(X_{1}\neq \varnothing \) and \(X_{0} \cap X_{1} = \varnothing \). On the other hand every such chain can be interpreted as a one-parameter word over the alphabet \(\{0,1\}\). Using this interpretation, the canonizing Graham-Rothschild theorem gives a canonical set of equivalence relations as follows: on the left hand side as (2, 1)-canonical pairs, on the right hand side in terms of 2-element chains saying that \((X_{0},X_{0} \cup X_{1})\) is equivalent to \((Y _{0},Y _{0} \cup Y _{1})\) if and only if the equation(s) in the second column is (are) fulfilled:

\(J = \varnothing \) and

$$\displaystyle{\begin{array}{ll@{\qquad }l} & (\{0,\lambda \},\{ 1\})_{\approx _{1}}\qquad &X_{0} = Y _{0} \\ & (\{0\},\{ 1,\lambda \} )_{\approx _{1}}\qquad &X_{0} \cup X_{1} = Y _{0} \cup Y _{1} \\ & (\{0, 1,\lambda \} )_{\approx _{1}}\qquad &\mathrm{always}\\ \end{array} }$$

\(J =\{ 0\}\) and

$$\displaystyle{\begin{array}{lll@{\qquad }l} & (\{0\},\{ 1\})_{\approx _{0}},&(\{0\},\{ 1\},\{\lambda \} )_{\approx _{1}}\qquad &X_{0} = Y _{0}\quad \mathrm{and}\quad X_{1} = Y _{1} \\ & (\{0\},\{ 1\})_{\approx _{0}},&(\{0, 1\},\{\lambda \} )_{\approx _{1}}\qquad &\{x \in X_{0}\mid x <\min X_{1}\} =\{ y \in Y _{0}\mid y <\min Y _{1}\} \\ & & \qquad &\mathrm{and}\quad X_{1} = Y _{1} \\ & (\{0\},\{ 1\})_{\approx _{0}},&(\{0,\lambda \},\{ 1\})_{\approx _{1}}\qquad &X_{0} = Y _{0}\quad \mathrm{and}\quad \min X_{1} =\min Y _{1} \\ & (\{0\},\{ 1\})_{\approx _{0}},&(\{0\},\{ 1,\lambda \} )_{\approx _{1}}\qquad &X_{0} \cup X_{1} = Y _{0} \cup Y _{1}\quad \mathrm{and}\quad \min X_{1} =\min Y _{1} \\ & (\{0\},\{ 1\})_{\approx _{0}},&(\{0, 1,\lambda \} )_{\approx _{1}}\qquad &\{x \in X_{0}\mid x <\min X_{1}\} =\{ y \in Y _{0}\mid y <\min Y _{1}\} \\ & & \qquad &\mathrm{and}\quad \min X_{1} =\min Y _{1} \\ & (\{0, 1\})_{\approx _{0}},&(\{0, 1\},\{\lambda \} )_{\approx _{1}}\qquad &X_{1} = Y _{1} \\ & (\{0, 1\})_{\approx _{0}},&(\{0, 1,\lambda \} )_{\approx _{1}}\qquad &\min X_{1} =\min Y _{1}\\ \end{array} }$$
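The interpretation of 2-element chains as one-parameter words can be made concrete as follows. This is a minimal sketch under assumptions: positions in \(X_{0}\) carry the letter 1, positions in \(X_{1}\) carry the parameter λ and all other positions carry 0 (an encoding that agrees with the rows for \(J = \varnothing \)), and for \(J = \varnothing \) two words are taken to be equivalent when they agree position by position up to \(\approx _{1}\). The function names and the string symbols `"0"`, `"1"`, `"lam"` are hypothetical.

```python
def chain_to_word(X0, X1, n):
    """Encode the chain (X0, X0 | X1) in B(n) as a word over {0, 1, lam}."""
    if not X1 or X0 & X1:
        raise ValueError("X1 must be nonempty and disjoint from X0")
    return ["lam" if x in X1 else "1" if x in X0 else "0" for x in range(n)]


def word_to_chain(word):
    """Recover the chain (X0, X0 | X1) from its word."""
    X0 = {x for x, s in enumerate(word) if s == "1"}
    X1 = {x for x, s in enumerate(word) if s == "lam"}
    return X0, X0 | X1


def equivalent_for_empty_J(w1, w2, classes):
    """Assumed reading for J = {}: positionwise equality up to the classes."""
    cls = {s: idx for idx, c in enumerate(classes) for s in c}
    return all(cls[a] == cls[b] for a, b in zip(w1, w2))


u = chain_to_word({1, 4}, {0}, 6)
v = chain_to_word({1, 4}, {2, 5}, 6)

same_bottom = equivalent_for_empty_J(u, v, [{"0", "lam"}, {"1"}])
same_top = equivalent_for_empty_J(u, v, [{"0"}, {"1", "lam"}])
```

The two chains share the bottom element \(X_{0} =\{ 1,4\}\) but have different top elements, so they are equivalent under the first pattern of the table and inequivalent under the second.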

In general, coloring \(\mathcal{B}(k)\)-lattices one obtains a canonizing version in the same way interpreting the (2, k)-canonical pairs in terms of sets. For sublattices of Boolean lattices, i.e., for arbitrary distributive lattices, the situation gets slightly more complicated. The interested reader will find a discussion of this in Prömel and Voigt (1982).

4.3 Finite Sets

The last application of the canonical Graham-Rothschild theorem we mention in this section is another proof of a finite version of the Erdős-Rado canonization theorem.

Theorem 6.9.

Let k and m be positive integers. Then there exists a positive integer n = ER(k,m) such that for every coloring \(\varDelta: {[n]}^{k} \rightarrow \omega\) there exist an m-subset \(M \in {[n]}^{m}\) and a (possibly empty) set J ⊆ k such that

$$\displaystyle{\varDelta (X) =\varDelta (Y )\quad \text{if and only if}\quad X: J\; =\; Y: J}$$

for every pair \(X,Y \in {[M]}^{k}\).
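The canonical patterns of Theorem 6.9 can be explored by brute force on small sets. The sketch below assumes the usual reading of X : J as the subset of X consisting of its j-th smallest elements for j ∈ J; the function names are hypothetical, and the search only tests a given set M instead of producing one.

```python
from itertools import combinations


def restrict(X, J):
    """X : J, assumed to be the j-th smallest elements of X for j in J."""
    xs = sorted(X)
    return tuple(xs[j] for j in sorted(J))


def canonical_index_sets(delta, M, k):
    """All J (subsets of {0..k-1}) such that, on the k-subsets of M,
    delta(X) == delta(Y) holds exactly when X : J == Y : J."""
    subsets = list(combinations(sorted(M), k))
    found = []
    for r in range(k + 1):
        for J in combinations(range(k), r):
            if all((delta(X) == delta(Y)) == (restrict(X, J) == restrict(Y, J))
                   for X in subsets for Y in subsets):
                found.append(set(J))
    return found


M = range(6)
by_min = canonical_index_sets(min, M, 2)
by_max = canonical_index_sets(max, M, 2)
constant = canonical_index_sets(lambda X: 0, M, 2)
injective = canonical_index_sets(lambda X: X, M, 2)
by_sum = canonical_index_sets(sum, M, 2)
```

On pairs from {0, ..., 5} the colorings by minimum, by maximum, by a constant and by the pair itself are canonical with J = {0}, {1}, ∅ and {0, 1} respectively. The coloring by the sum is not canonical on this M, which is consistent with the theorem: it only guarantees a canonical m-subset inside a sufficiently large n.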

Proof.

Let n be according to Theorem 6.6 with respect to \(A =\{ 0\}\), k and m. Let \(\varDelta: {[n]}^{k} \rightarrow \omega\) be a coloring. Define \(\varDelta ^{\prime}: [\{0\}]\binom{n}{k} \rightarrow \omega\) by \(\varDelta ^{\prime}(g) =\varDelta (\phi \cdot g)\). Then there exist a \((\{0\},k)\)-canonical pair \(\varPi = (J,(\approx _{i})_{i\in {J}^{+}})\) and an \(f \in [\{0\}]\binom{n}{m}\) satisfying Theorem 6.6. Observe that by definition of Δ′, every \(\approx _{i}\) can only have one equivalence class. But this implies immediately that \(M =\{ {f}^{-1}(\lambda _{i})\mid i < m\} \in {[n]}^{m}\) and J ⊆ k satisfy Theorem 6.9. □