1 Introduction

Regarded as “one of the most elegant pieces of mathematics ever produced” (de Bruijn 1978), van der Waerden’s theorem is, together with Hilbert’s theorem, Schur’s theorem, and Ramsey’s theorem, one of the cornerstones of Ramsey theory, a branch of combinatorics.

Bartel Leendert van der Waerden was a Dutch mathematician and historian of mathematics and science. He was born in 1903, the same year as Frank Ramsey, the man after whom Ramsey theory was named, and died in 1996, the same year as Paul Erdős, who is considered to be the father of Ramsey theory (Graham et al. 1980).

Van der Waerden’s theorem was proven in 1926 and published in 1927 (van der Waerden 1927). Many years later, van der Waerden told a story about how the proof was found. Here are a few quotes from the beginning of the English version of van der Waerden’s essay (1971) that provide an insight into how the proof was created and reflect on the process of mathematical discovery.

Once in 1926, while lunching with Emil Artin and Otto Schreier, I told them about the conjecture of the Dutch mathematician Baudet:

If the sequence of integers \(1,2,3,\ldots \) is divided into two classes, at least one of the classes contains an arithmetic progression of l terms: \(a,a+b,a+2b, \ldots , a+(l-1)b\), no matter how large the length l is.

After lunch we went into Artin’s office in the Mathematics Department of the University of Hamburg, and tried to find a proof.

(\(\ldots \)) One of the main difficulties in the psychology of invention is that most mathematicians publish their results with condensed proofs, but do not tell us how they found them. In many cases they do not even remember their original ideas. Moreover, it is difficult to explain our vague ideas and tentative attempts in such a way that others can understand them.

(\(\ldots \)) In the case of our discussion of Baudet’s conjecture the situation was much more favourable for a psychological analysis. All ideas we formed in our minds were at once put into words and explained by little drawings on the blackboard. We represented the integers \(1,2,3,\ldots \) in two classes by means of vertical strokes on two parallel lines. Whatever one makes explicit and draws is much easier to remember and to reproduce than mere thoughts.

Although combinatorics was “a field that he never seriously worked in” (van Lint 1982), van der Waerden’s contribution to combinatorics is indispensable. Various generalizations of van der Waerden’s theorem have marked the development of Ramsey theory over the last several decades. One example is the polynomial van der Waerden theorem (Bergelson and Leibman 1996; Walters 2000). Another is the long-standing 2-Large Conjecture (Brown et al. 1999; Robertson, to appear).

2 van der Waerden’s Theorem

Theorem 1

(van der Waerden’s Theorem) Let l and k be positive integers. Any k-colouring of the positive integers contains a monochromatic l-term arithmetic progression. Moreover, there is a positive integer \(N=N(l,k)\) such that any k-colouring of the segment of positive integers [1, N] contains a monochromatic l-term arithmetic progression.

Here a k-colouring of a set A means that the set A is split into k mutually disjoint subsets. We think of the k subsets as “k colours.” Equivalently, a k-colouring of a set A is any function \(c:A\rightarrow B\), where B is a set with exactly k elements (“colours”). A subset of the set A is monochromatic (with respect to the given colouring c) if all of its elements are of the same colour.

An l-term arithmetic progression is a set of the form \(\{a, a+d,\ldots ,a+(l-1)d\}\). In this note a and d will always be positive integers. For example, \(\{ 2, 5, 8\}\) is a 3-term arithmetic progression where \(a=2\) and \(d=3\).
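The two definitions above can be made concrete with a small search routine. The following is a minimal sketch, not part of the original note: the function names and the representation of a colouring as a Python dictionary from integers to colour labels are assumptions made for illustration.

```python
def arithmetic_progressions(n, length):
    """Yield every `length`-term arithmetic progression inside [1, n].

    Both the first term a and the common difference d are positive
    integers, as in the text.
    """
    for a in range(1, n + 1):
        if length == 1:
            # A 1-term progression is a single point; d plays no role.
            yield [a]
            continue
        d = 1
        while a + (length - 1) * d <= n:
            yield [a + j * d for j in range(length)]
            d += 1


def find_monochromatic_ap(colouring, length):
    """Return the first monochromatic `length`-term progression, or None.

    `colouring` is assumed to map every integer in [1, n] to a colour.
    """
    n = len(colouring)
    for ap in arithmetic_progressions(n, length):
        if len({colouring[x] for x in ap}) == 1:
            return ap
    return None


# Illustrative 2-colouring of [1, 5] (hypothetical, not from the text).
example = {1: "R", 2: "B", 3: "R", 4: "B", 5: "R"}
print(find_monochromatic_ap(example, 3))
```

In this illustrative colouring the odd numbers share a colour, so the search reports the progression with \(a=1\) and \(d=2\).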

As an exercise, we invite the reader to find a 2-colouring of the segment of positive integers \([1,8]=\{1,2,\ldots ,8\}\) with no monochromatic 3-term arithmetic progression. In other words, the reader should find two disjoint sets A and B such that \(A\cup B=[1,8]\) and neither A nor B contains a 3-term arithmetic progression. This should be followed by showing that no such colouring exists for the set [1, 9].

The smallest N guaranteed by the theorem is often denoted by \(W(l,k)\) and called a van der Waerden number. Those readers who completed the above exercise have established that \(W(3,2)=9\).
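For very small parameters the van der Waerden number can also be confirmed by exhaustive search. The sketch below is an illustration rather than part of the original argument: the function names and the `limit` parameter are assumptions, and a colouring of \([1,n]\) is represented as a tuple whose entry at index \(i\) is the colour of \(i+1\).

```python
from itertools import product


def has_monochromatic_ap(colours, length):
    """True if the colouring contains a monochromatic `length`-term AP.

    colours[i] is the colour of the integer i + 1.
    """
    n = len(colours)
    if length == 1:
        return n >= 1  # every single point is a 1-term progression
    for a in range(n):
        for d in range(1, n):
            if a + (length - 1) * d >= n:
                break  # larger differences leave the interval as well
            if all(colours[a + j * d] == colours[a] for j in range(1, length)):
                return True
    return False


def van_der_waerden_number(length, k, limit=12):
    """Smallest n <= limit such that every k-colouring of [1, n] contains
    a monochromatic `length`-term AP; None if no such n is found."""
    for n in range(1, limit + 1):
        if all(has_monochromatic_ap(c, length)
               for c in product(range(k), repeat=n)):
            return n
    return None


print(van_der_waerden_number(3, 2))
```

The search examines all \(k^n\) colourings for each \(n\), so it is practical only for the smallest cases.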

3 Proof

This “proof in nine figures” follows van der Waerden’s original idea to establish the existence of \(W(l,k)\) by using double induction. It also contains ideas and terminology from Leader (2000) and Tao (2007). As N.G. de Bruijn (1978) put it:

van der Waerden’s argument is so nice that one might secretly hope that a simpler proof does not exist!

3.1 Main Tools: Colour-Focused Arithmetic Progressions and Spokes

Let c be a finite colouring of an interval of positive integers \([\alpha ,\beta ]\) and let l and r be positive integers. We say that l-term arithmetic progressions \(A_1,A_2,\ldots ,A_r\), where

$$\begin{aligned} A_i=\{ a_i+jd_i:j\in [0,l-1]\}, i\in [1,r], \end{aligned}$$

are colour-focused at a positive integer f if:

  1. \(A_i\) is a subset of \([\alpha ,\beta ]\) for each \(i\in [1,r]\).

  2. Each \(A_i\) is monochromatic.

  3. If \(i\not = j\) then \(A_i\) and \(A_j\) are not of the same colour.

  4. \(a_1+ld_1=a_2+ld_2=\cdots =a_r+ld_r=f\).

The \((l+1)\)-term arithmetic progression \(A_i\cup \{f\}\), \(i\in [1,r]\), is called a spoke. See Fig. 1.

Fig. 1. Two 2-term arithmetic progressions colour-focused at 7; each of the corresponding 3-term arithmetic progressions is a spoke
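The four conditions in the definition of colour-focused progressions translate directly into a check. The following is a minimal sketch under stated assumptions: the function name, the encoding of each progression as a pair `(a, d)`, and the sample colouring of \([1,7]\) are all hypothetical and are not taken from Fig. 1.

```python
def are_colour_focused(colouring, progressions, length, focus):
    """Check conditions 1-4 for `length`-term progressions given as (a, d).

    `colouring` maps each integer of the coloured interval to a colour.
    """
    seen_colours = set()
    for a, d in progressions:
        terms = [a + j * d for j in range(length)]
        # Condition 1: the progression lies inside the coloured interval.
        if any(x not in colouring for x in terms):
            return False
        # Condition 2: the progression is monochromatic.
        colours = {colouring[x] for x in terms}
        if len(colours) != 1:
            return False
        # Condition 3: different progressions have different colours.
        colour = colours.pop()
        if colour in seen_colours:
            return False
        seen_colours.add(colour)
        # Condition 4: the next term after the progression is the focus.
        if a + length * d != focus:
            return False
    return True


# Hypothetical 2-colouring of [1, 7], chosen only for illustration.
c = {1: "R", 2: "B", 3: "B", 4: "R", 5: "B", 6: "B", 7: "R"}
print(are_colour_focused(c, [(1, 3), (5, 1)], 2, 7))
```

In this sample colouring the progressions \(\{1,4\}\) and \(\{5,6\}\) have different colours and both continue to 7, so the check succeeds; replacing \(\{1,4\}\) by \(\{3,5\}\) fails condition 3 because both progressions then have the same colour.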

3.2 The Base Case

Note that, for any positive integer k, \(W(1,k)=1\) and \(W(2,k)=k+1\).
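Both values follow from short counting arguments, sketched here with notation introduced only for this illustration (the colouring \(c_0\) and the integers \(a<b\)). A single integer is a 1-term arithmetic progression, so \(W(1,k)=1\). For the second value, the colouring \(c_0(i)=i\) of \([1,k]\) uses each colour exactly once, so no two integers share a colour and \(W(2,k)>k\). Conversely, by the pigeonhole principle any k-colouring of \([1,k+1]\) gives the same colour to some pair \(a<b\), and

$$\begin{aligned} \{a,b\}=\{a,a+d\},\quad d=b-a\ge 1, \end{aligned}$$

is a monochromatic 2-term arithmetic progression. Hence \(W(2,k)=k+1\).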

3.3 The Inductive Step

Suppose that \(l\ge 3\) is such that \(W(l-1,k)\) exists for any (finite) number of colours k. We fix \(k\ge 2\).

We start the proof of the inductive step by using mathematical induction to prove the Claim below. In fact, most of the proof of van der Waerden’s theorem is the proof of the Claim.

Claim

For all \(r\le k\) there is an M such that any k-colouring of [1, M] contains a monochromatic l-term arithmetic progression or r colour-focused \((l-1)\)-term arithmetic progressions together with their focus.

Proof of Claim

For the base case when \(r=1\) set \(M=2W(l-1,k)\). See Fig. 2. \(\square \)

Fig. 2. Any k-colouring of the set [1, M] contains a monochromatic l-term arithmetic progression or one colour-focused \((l-1)\)-term arithmetic progression
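The choice \(M=2W(l-1,k)\) can be checked with a short estimate; the abbreviation \(W=W(l-1,k)\) is introduced here only for brevity. Any k-colouring of \([1,M]\) restricted to \([1,W]\) contains a monochromatic \((l-1)\)-term arithmetic progression \(\{a,a+d,\ldots ,a+(l-2)d\}\subseteq [1,W]\). Since \(l\ge 3\) and \(a\ge 1\), the inequality \(a+(l-2)d\le W\) forces \(d\le W-1\), and therefore the focus \(f\) satisfies

$$\begin{aligned} f=a+(l-1)d=\bigl (a+(l-2)d\bigr )+d\le W+(W-1)=2W-1<M. \end{aligned}$$

So the focus lies in \([1,M]\). If \(f\) has the colour of the progression, the spoke is a monochromatic l-term arithmetic progression; otherwise the progression together with \(f\) is the required single colour-focused progression with its focus.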

Here is the summary of the proof of van der Waerden’s theorem so far:

figure c

For the inductive step see Fig. 3. This image is inspired by the term “a polychromatic fan” used by Tao (2007).

Fig. 3. Suppose that \(r\in [2,k]\) is such that there is an M such that any k-colouring of [1, M] contains either a monochromatic l-term arithmetic progression or at least \(r-1\) colour-focused \((l-1)\)-term arithmetic progressions focused at some \(f\in [1,M]\). Notice that this implies that any set of M consecutive positive integers has this property

Next we consider the interval of positive integers \([1, M\cdot W(l-1, k^{M})]\). See Fig. 4.

Fig. 4. The interval \([1, M\cdot W(l-1, k^{M})]\) is divided into \(W(l-1, k^{M})\) consecutive blocks \(B_i\), \(1\le i\le W(l-1, k^{M})\), of length M

Suppose that c is a k-colouring of \([1, M\cdot W(l-1, k^{M})]\) that does not contain a monochromatic l-term arithmetic progression. See Fig. 5.

Fig. 5. The colouring c colours each block \(B_i\) with k colours in one of the possible \(k^{M}\) ways and hence induces a \(k^{M}\)-colouring of \([1,W(l-1,k^{M})]\)
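The passage from a k-colouring of the long interval to a \(k^{M}\)-colouring of the block indices can be illustrated in a few lines. This is a minimal sketch with hypothetical names: colours are assumed to be the integers \(0,\ldots ,k-1\), the colouring is a list whose entry at index \(i\) is the colour of \(i+1\), and each block pattern is encoded as a base-k number, which is one of several possible ways to label the \(k^{M}\) patterns.

```python
def block_patterns(colours, block_length):
    """Split a colouring into consecutive blocks of length `block_length`."""
    if len(colours) % block_length != 0:
        raise ValueError("the interval must consist of whole blocks")
    return [tuple(colours[i:i + block_length])
            for i in range(0, len(colours), block_length)]


def block_code(pattern, k):
    """Encode a block pattern as an integer in [0, k**M - 1] (base k)."""
    code = 0
    for colour in pattern:
        code = code * k + colour
    return code


def induced_colouring(colours, block_length, k):
    """Colour the block index i by the code of the pattern on block B_i."""
    return [block_code(p, k) for p in block_patterns(colours, block_length)]


# Illustrative data: k = 2 colours, blocks of length M = 3, four blocks.
c = [0, 1, 1,  1, 0, 0,  0, 1, 1,  0, 0, 0]
print(induced_colouring(c, 3, 2))
```

Two block indices receive the same induced colour exactly when c colours the two blocks identically; in the illustrative data this happens for the first and third blocks.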

Any \(k^{M}\)-colouring of \([1,W(l-1,k^{M})]\) contains a monochromatic \((l-1)\)-term arithmetic progression. See Fig. 6.

Fig. 6. The \(k^{M}\)-colouring of \([1,W(l-1,k^{M})]\) induced by the colouring c contains a monochromatic \((l-1)\)-term arithmetic progression. This implies that there are \(l-1\) blocks \(B_{i_j}\), \(1\le j\le l-1\), that are identically coloured by c and equally spaced

Fig. 7. All foci form a spoke. There is a new spoke in each of the previously used colours. Hence there are r spokes!

Fig. 8. Closer look: \(l-1\) spokes in each of \(r-1\) colours produce another spoke in the same colour with a new focus that coincides with the \(l\)th term of the arithmetic progression that contains all of the \(l-1\) original foci

The set of r colour-focused \((l-1)\)-term arithmetic progressions appears! See Figs. 7 and 8.
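The construction behind Figs. 7 and 8 can be written out explicitly. This is a sketch under stated assumptions: the symbols \(s\), \(A_i'\) and \(f'\) are notation introduced here, and the sketch assumes that the point \(f'\) lies inside the coloured interval. Let the indices of the identically coloured blocks \(B_{i_1},\ldots ,B_{i_{l-1}}\) have common difference \(s\), so that corresponding points of consecutive blocks differ by \(sM\). Since c contains no monochromatic l-term arithmetic progression, the block \(B_{i_1}\) contains \(r-1\) colour-focused \((l-1)\)-term arithmetic progressions \(A_i=\{a_i+jd_i:j\in [0,l-2]\}\), \(i\in [1,r-1]\), together with their focus \(f=a_i+(l-1)d_i\). Put

$$\begin{aligned} A_i'&=\{a_i+j(d_i+sM):j\in [0,l-2]\},\quad i\in [1,r-1],\\ A_r'&=\{f+jsM:j\in [0,l-2]\}. \end{aligned}$$

The term of \(A_i'\) with index \(j\) is the copy, in the block \(B_{i_{j+1}}\), of the term \(a_i+jd_i\) of \(A_i\), so \(A_i'\) is monochromatic with the colour of \(A_i\). The set \(A_r'\) consists of the copies of \(f\), so it is monochromatic, and its colour differs from the colour of every \(A_i\) because otherwise the spoke \(A_i\cup \{f\}\) would be a monochromatic l-term arithmetic progression. All r progressions have the same focus:

$$\begin{aligned} a_i+(l-1)(d_i+sM)=f+(l-1)sM=f'. \end{aligned}$$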

Where are we?

figure d

If \(r=k\) see Fig. 9:

Fig. 9

Done!