There are two labyrinths of the human mind: one concerns the composition of the continuum, and the other the nature of freedom, and both spring from the same source – the infinite.

Baron von Leibniz

During World War II, when von Neumann was working on the design of nuclear weapons, he came to the conclusion that analytical methods were inadequate to the task, and that the only way to deal with equations of continuum mechanics is to discretize them. …It is to this task that von Neumann devoted his energies after the war.

Peter Lax

1 Overture

Baron Bourgain, the IBM von Neumann Professor in the School of Mathematics at the Institute for Advanced Study (IAS), is one of the most original, penetrating, and versatile analytical minds of our troubled times, justly celebratedFootnote 1 and revered without reservations.

While he rejected outright the suggestion of a sixtieth birthday conference, a proposal to have a gathering occasioned by the publication of his 500th paper was not immediately dismissed; the conference Analysis and Beyond: Celebrating Jean Bourgain’s Work and Impact took place at the IAS in Princeton on May 21–24, 2016. The conference talks (all of which were videotaped) are a tribute to the depth and breadth of Bourgain’s work and its singular and transcendent impact on the whole of our discipline. The beauty and power of the first result highlighted by Jean’s hand (Fig. 1) on the conference poster, \(\|e^{it\Delta}\phi\|_p \ll N^{\varepsilon}\|\phi\|_q\), is apparent from reading the splendid paper by Andrea Nahmod in the Bulletin of the American Mathematical Society (BAMS), [71]. The brief of this paper is to explicate the origins, nature, and development of the second result, the discretized sum-product inequality

$$\displaystyle \begin{aligned} \mathcal{N}(A+A, \delta) +\mathcal{N}(A \cdot A, \delta) > \mathcal{N}(A, \delta)^{1+ \tau}, \end{aligned} $$
(1)

in analysis and beyond.

Fig. 1. Two of Jean Bourgain’s signature results

***

The three great branches of mathematics are, in historical order, Geometry, Algebra and Analysis. Geometry we owe essentially to Greek civilization, Algebra is of Indo-Arab origin and Analysis (or Calculus) was the creation of Newton and Leibniz, ushering in the modern era.

Sir Michael Atiyah [1]

Von Zahlen und Figuren—“On Numbers and Shapes”Footnote 2 is the title of one of the most successful expositions of mathematics aimed at a broad audience, reflecting a common perception of our discipline as a marriage between Algebra and Geometry. This happy marriage, notwithstanding Count Tolstoy’s contention (“All happy marriages are alike; each unhappy marriage is unhappy in its own way.”), is not without tensions (as, perhaps, each happy marriage—including, possibly, bicameral mind—is in its own way). “In these days the angel of topology and the devil of abstract algebra fight for the soul of each individual mathematical domain” is the way Hermann WeylFootnote 3 put it; three score and seven years later, in conversation at Google with the company’s CEO, a somewhat divergent sentiment was expressed: “When you form your ideas on the basis of words, you build from concepts, which to be meaningful depend on relation to other concepts. When you form your ideas on the basis of pictures, you form your views on the basis of impressions and of moods, that cannot even be recreated very easily, so you cannot look back and check what it was that impressed you so much.”Footnote 4

This tension is embodied in the system of real numbers, the soil in which the functions of Analysis grow, resembling Janus’s head facing in two directions: on the one hand, it is the field closed under the operations of addition and multiplication; on the other hand, it is a continuous manifold the parts of which are so connected as to defy exact isolation from each other. The one is algebraic; the other is the geometric face of real numbers. Continued fractions are much more intrinsic and geometric forms of discretizing the continuum; the lack of a practical algorithm for their addition and multiplication leads to the regnancy of the discretization based on the ordinary (digital or decimal, i.e., base 10) fractions.

Whereas Newton, in his development of Calculus, was primarily motivated by “dynamics” (force, acceleration), as exemplified by the falling of the apple on his head, Leibniz, it appears, was more intrigued by what would now be described by the appellation “fractal geometry of nature.” “Imagine a circle; inscribe within it three other circles congruent to each other and of maximum radius; proceed similarly within each of these circles and within each interval between them, and imagine that the process continues ad infinitum,” wrote Leibniz referencing configuration akin to the four mutually tangent circles appearing on Baron Bourgain’s coat of arms. Leibniz’s definition of the straight line as a ‘curve, any part of which is similar to the whole, and it alone has this property, not only among curves but among sets’ is a reflection of the fractal nature of the continuum: the Cantor set would satisfy Leibniz’s definition.Footnote 5

Dynamics, broadly conceived, is perceived as a study of change, which in its primordial (physical) context takes place within time. The Cantor set (and \({\mathbb R}\)) is, so to speak, timeless, i.e., static in time, but there is “a condition of possibility” of (almost) “equiprimordial” change “in the eye of the beholder,” taking form in changing the degree of magnification scale and “zooming in.” This is reflected in the “multi-scale” nature of Bourgain’s proof(s) of (1).

To bring this opening section to a close, let us in passing note that both results chosen by Jean are not equalities (inequalities, rather), commenting thus:

If Algebra is generally perceived as the study of equations, what perhaps lies at the heart of Analysis are inequalities, or estimates, which compare the size of two quantities or expressions. Einstein’s discovery that nothing travels faster than light is an example of an inequality. The inequality “\(2^X\) is considerably larger than \(X\)” arguably neatly encapsulates both the P vs NP problem (properly stated for finite X) and Cantor’s continuum problem (when X is the first infinite ordinal). An elementary inequality, taught in the middle school, asserts that the arithmetic mean of two positive numbers is never less than their geometric mean. In between these two extremes there is a vast range of estimates of great variety and importance. Such estimates, reflecting and quantifying some subtle aspect of the underlying problem, are often exceedingly difficult to prove. It will be seen that for the inequality (1), with which we are about to get intimate, the underlying issue lies at the heart of the tension between the algebraic and (fractal)-geometric nature of the continuum. Fractal derives from Latin fractus, meaning broken apart; algebra derives from the Arabic al-jabr, meaning the reunion of broken parts.

2 Origins: Kakeya-Besicovitch Problem+ 

It is difficult and often impossible to judge the value of a problem correctly in advance; for the final award depends upon the gain which science obtains from the problem. Nevertheless we can ask whether there are general criteria which mark a good mathematical problem. An old French mathematician said: ‘A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street.’ This clearness and ease of comprehension, here insisted on for a mathematical theory, I should still more demand for a mathematical problem if it is to be perfect; for what is clear and easily comprehended attracts, the complicated repels us.

David Hilbert, Problems of Mathematics, 1900

Had Hilbert’sFootnote 6 democratic dictum been followed by Sōichi Kakeya (writing his paper in an island nation in 1917, at the height of the Great War), the explanation of the problem now bearing his name to almost any person on just about any street in Eastern Eurasia might have run as follows: Entrusted with defending an island possessing a huge hill, cragged and steep, your task is to purchase, at the least cost to the nation’s treasury, a plot of land on the flat hilltop with the following property: a cannon of length one must be capable of pointing in any direction.

Kakeya improved by a factor of one-half the obvious solution (a circle of diameter one, having area \(\frac {\pi }{4}\)); his proposed shape (three-cusped hypocycloid inscribed in the circle of radius 1) is alluded to in the rendering of A in the conference poster (Fig. 2). In the same year, working in Perm,Footnote 7 while the October/November Russian/Soviet Revolution was unfolding, A. S. Besicovitch reduced the minimal necessary sum to virtuallyFootnote 8 nothing.

Fig. 2. Analysis and beyond

In fact, Besicovitch was working on the following question: if f is a Riemann integrable function defined on the plane, is it always possible to find a pair of orthogonal coordinate axes with respect to which \(\int f(x,y) d x\) exists as a Riemann integral for all y, and with the resulting function of y also Riemann integrable? Besicovitch noticed that if he could construct a compact set F of plane Lebesgue measure zero containing a line segment in every direction, this would lead to a counterexample as follows. Assume (by translating F if necessary) that F contains no segment parallel to and of rational distance from either of a fixed pair of axes. Let f be the characteristic function of the set \(F_r\) consisting of those points of F with at least one rational coordinate. As F contains a segment in every direction on which both \(F_r\) and its complement are dense, there is a segment in each direction on which f is not Riemann integrable. On the other hand, the set of points of discontinuity of f is of plane measure zero, so f is Riemann integrable over the plane by the well-known criterion of Lebesgue.

The basic idea underlying the original construction of Besicovitch [5] is to form a figure obtained by splitting an equilateral triangle of unit height into many smaller triangles of the same height by dividing up the base and then sliding these elementary triangles varying distances along the base line. In 1964 Besicovitch developed a completely different approach [6], using the projection theorem due to Marstrand.

2.1 Some Fundamental Properties of Plane Sets of Fractional Dimension

In this 1954 paper [67], which was essentially the work for his doctoral thesis at Oxford and was heavily influenced by Besicovitch, John Marstrand proved the following fundamental result.

Theorem 1 (Marstrand’s Projection Theorem)

Denote the projection in the direction θ by \(\pi_{\theta}\). If \(X \subset {\mathbb R}^2\) is a Borel subset of Hausdorff dimension s, then \(\dim _H(\pi _{\theta } X) = \min (s, 1)\) for almost every θ.

Concerning finer information about the set of exceptional θ in Theorem 1, Kaufman proved [51] that if \(\dim X \geq t\) and \(B \subset S^1\) with \(\dim B > t\), then there exists θ ∈ B such that \(\dim (\pi _{\theta } (X)) \geq t\). Using (1) in a crucial way, Bourgain established the following sharper result in The Discretized Sum-product and Projection Theorems [14]:

Theorem 2

Given 0 < α < 2 and κ > 0, there is \(\eta > \frac {\alpha }{2}\) such that if \(X \subset {\mathbb R}^2\) is of Hausdorff dimension greater than α, then \(\dim_H(\pi_{\theta}(X)) \geq \eta\) for all \(\theta \in S^1\) except in an exceptional set E satisfying \(\dim_H(E) \leq \kappa\).

2.2 Besicovitch Type Maximal Operators and Applications to Fourier Analysis

We must admit with humility that, while number is purely a product of our mind, space has a reality outside of our mind, so that we cannot prescribe its laws a priori.

Gauss, Letter to Bessel, 1830

The Kakeya problem in \({\mathbb R}^n\) is to estimate the fractal dimension of the Besicovitch set \(E \subset {\mathbb R}^n\), i.e., a set containing line segments of length one in all directions.

Conjecture 1

Let E be a Besicovitch set in \({\mathbb R}^n\) . Then \(\beta (n) =\dim (E) =n\).

There are several relevant notions of “fractal dimension,” the simplest being the Minkowski dimension, defined as follows. Let A be a closed subset of a metric space X. Fix some radius δ. Let \(\mathcal {N}(A, \delta )\) be the least number of balls of radius δ needed to cover A. If A is a rectifiable curve in \({\mathbb R}^n\), it is easy to see that \(\mathcal {N}(A, \delta )\) is of order \(\delta^{-1}\). If A is a surface, \(\mathcal {N}(A, \delta )\) is approximately \(\delta^{-2}\). This suggests the idea of defining the dimension of an arbitrary set as the number d for which \(\mathcal {N}(A, \delta ) \thicksim \delta ^{-d}\). The limit

$$\displaystyle \begin{aligned} \begin{array}{rcl}\lim_{\delta \to 0}\frac {\log \mathcal{N}(A, \delta)}{\log (\delta^{-1})},\end{array} \end{aligned} $$

if it exists, is called the Minkowski dimension, \(\dim_M(A)\).
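The definition can be tried out on the middle-thirds Cantor set. The following is a minimal sketch rather than a definitive computation: it counts grid intervals of length \(3^{-k}\) instead of balls of radius δ (the two counts are comparable up to a bounded factor), and the function names `cantor_left_endpoints` and `box_count` are illustrative choices, not taken from the text.

```python
import math

def cantor_left_endpoints(n):
    """Left endpoints of the 2**n level-n Cantor intervals, as integers scaled by 3**n."""
    points = [0]
    for _ in range(n):
        # Append a ternary digit 0 or 2 to every point of the previous level.
        points = [3 * p + d for p in points for d in (0, 2)]
    return points

def box_count(points, n, k):
    """Number of grid intervals of length 3**-k containing a point (requires k <= n)."""
    cell = 3 ** (n - k)
    return len({p // cell for p in points})

n = 10
pts = cantor_left_endpoints(n)
for k in (2, 5, 8):
    count = box_count(pts, n, k)
    # Ratio log N(A, delta) / log(1 / delta) at the scale delta = 3**-k.
    estimate = math.log(count) / math.log(3 ** k)
    print(k, count, round(estimate, 4))
```

At every scale the count is \(2^k\), so the ratio equals \(\log 2 / \log 3 \approx 0.6309\), the Minkowski dimension of the Cantor set.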

The basic result proved by Davies [27] in 1971 is that β(2) = 2. The same year C. Fefferman [37] discovered the intimate connection between the Kakeya problem and the multiplier problem for the ball, proving that for d ≥ 2 the map \(f \to \int _{|\xi |\leq 1} \hat {f}(\xi ) e^{ix \xi } d \xi \) defines a bounded operator on \(L^p({\mathbb R}^d)\) only for p = 2. This seminal result made apparent the fundamental connection between Kakeya-type questions and higher-dimensional Fourier analysis, in particular the theory of oscillatory integral operators.Footnote 9

In the 1980s, Drury [31] showed that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \beta(n) \geq \frac{n+1}{2} \end{array} \end{aligned} $$
(2)

(see also Christ et al. [25]). The argument consists of intersecting the line segments \(L_{\xi} \subset E\), with \(L_{\xi}\) parallel to \(\xi \in S^{d-1}\), by a pair of parallel hyperplanes \(H_1, H_2\) in \({\mathbb R}^d\) and observing that for all δ > 0

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left(\frac{1}{\delta}\right)^{d-1} \lesssim \mathcal{N}(H_1\cap E, \delta) \mathcal{N}(H_2\cap E, \delta) .\end{array} \end{aligned} $$
(3)

The estimate (2) was first improved by Bourgain in 1991, in the paper eponymous with the title of this subsection [8], to \(\frac {n+1}{2} +\varepsilon _n\), with \(\varepsilon_n\) given by a recursive argument (for n = 3, this yields the bound \(\frac {7}{3}\)), by using a “bush” argument. A more efficient geometric argument, using “hairbrushes,” was given several years later by T. Wolff, leading to

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \dim_{H}(E) \geq \frac{n}{2} +1. \end{array} \end{aligned} $$
(4)

The space constraints prevent me from going into the details of these arguments; referring the reader to beautiful surveys by Izabella Łaba [55], Terence Tao [89], and Thomas Wolff [95], I will restrict myself to two remarks.

The first remark is that these developments made apparent the connection between Kakeya-type problems and results in combinatorial geometry, such as the Szemerédi-Trotter Theorem [88], which will be briefly discussed in Sect. 3.2.

The second remark is that Bourgain’s interest in the Kakeya problem was stimulated by his discovery [9] that it is implied by the following version of Montgomery’s conjectureFootnote 10 for Dirichlet polynomials:

Conjecture 2

Let \(S(s)= \sum _{n=1}^N a_n n^s\) with \(|a_n| \leq 1\), and let \(\mathcal {F}\) be a set of 1-separated reals in the interval [0, T], T > N. Then

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{t \in \mathcal{F}} |S(it)|{}^2 \ll T^{\varepsilon}(N+ |\mathcal{F}|) N ( \max_{1\leq n\leq N} |a_n|{}^2). \end{array} \end{aligned} $$
(6)

Regrettably skipping thus over many important and pertinent developments that took place in the last decade of the past century, let us note, looking forward, that in its closing year (1999), Bourgain unveiled the connection between the Kakeya problem and one of the most consequential and far-reaching results in arithmetic combinatorics, obtained by Gowers in his groundbreaking A New Proof of Szemerédi’s Theorem for Arithmetic Progressions of Length Four [40]. This result, the Balog-Szemerédi-Gowers Lemma, will play a crucial role in many a subsequent development, of which some are discussed in this essay.

2.3 Balog-Szemerédi-Gowers Lemma

Either this universe is a mere confused mass, and an intricate context of things, which shall in time be scattered and dispersed again; or it is a union consisting of order and administered by Providence.

Marcus Aurelius “Meditations” 6, VIII

Complete disorder is impossible.

T.S. Motzkin

The Balog-Szemerédi-Gowers lemma is ostensibly a statement about group structure, but the main tool in its proof is a remarkable (and remarkably useful) graph-theoretic result best viewed in the context of Ramsey theory. Ramsey theory is a systematic study of the following general phenomenon. Surprisingly often, a large structure of a certain kind has to contain a fairly large highly organized substructure, even if the structure itself is completely arbitrary and apparently chaotic. It can be viewed as a vast generalization of the pigeonhole principle, which states that if a set X of n objects is colored with s colors, then there must be a subset of X of size at least \(\frac {n}{s}\) that uses just one color. Such a subset is called monochromatic. The situation becomes more interesting if the set X has some additional structure. It then becomes natural to ask for a monochromatic subset that keeps some of the structure of X. However, it also becomes much less obvious whether such a subset exists. Frank Plumpton Ramsey in 1930 [76] took as his set X the set of all the edges in a complete graph, and the monochromatic subset he obtained consisted of all the edges of some complete subgraph. One version of his theorem is as follows. For every positive integer k, there is a positive integer N such that if the edges of the complete graph on N vertices are all colored either red or blue, then there must be k vertices such that all edges joining them have the same color. That is, a sufficiently large complete graph colored with two colors contains a complete subgraph of size k which is monochromatic. The least integer N that works is known as R(k), and it is known that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} 2^{\frac{k}{2}} \leq R(k) \leq 2^{2k}. \end{array} \end{aligned} $$
(7)
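For k = 3 the statement can be checked exhaustively. The sketch below is an illustration only (the helper names are hypothetical): it enumerates all red/blue colorings of the edges of the complete graphs on 5 and on 6 vertices and confirms the classical value R(3) = 6, which indeed lies between \(2^{3/2}\) and \(2^{6}\).

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each edge (i, j) with i < j of K_n to the color 0 or 1."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_coloring_has_mono_triangle(n):
    """Brute force over all 2-colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    for colors in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, colors))):
            return False
    return True

print(every_coloring_has_mono_triangle(5))  # False: the pentagon/pentagram coloring avoids it
print(every_coloring_has_mono_triangle(6))  # True: hence R(3) = 6
```

The search over the 32768 colorings of the 15 edges of the complete graph on 6 vertices is instantaneous; the same brute force is already hopeless for k = 5, which is one reason the bounds (7) matter.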

There were several results in Ramsey theory predating Ramsey’s theorem; in particular, van der Waerden [93] proved that if you color the integers with some finite number r of colors, there must be some color that contains arithmetic progressions of every length. In 1935 Erdös and Turán conjectured that this holds for “the most popular” color class. More precisely, they conjectured that for any positive integer k and any real number ε > 0, there is a positive integer \(n_0\) such that if \(n > n_0\), any set of at least εn positive integers between 1 and n contains a k-term arithmetic progression. This conjecture was proved by Szemerédi in 1975 using, among other things, his celebrated regularity lemma [87], which can be very roughly described as a statement that even the most “chaotic” systems can be decomposed into a “relatively” small number of “approximately regular” subsystems.

Using the Szemerédi regularity lemma, the following result was established by Balog and Szemerédi in 1994 [2], with a tower-like exponential-type dependence on K (cf. (7)). Gowers’s achievement of the polynomial bounds \(K^{O(1)}\) in the statement below is crucial in the ensuing applications.

Theorem 3 (Balog-Szemerédi-Gowers Lemma)

Let \(\mathcal {G}(A, B, E)\) be a finite bipartite graph, that is, a graph whose vertices can be partitioned into two disjoint sets A and B with every edge joining A to B, with \(|E| \geq \frac {|A| |B|}{K}\). Then there exist subsets \(A' \subset A\) and \(B' \subset B\) with \(|A'| \gg K^{-O(1)}|A|\) and \(|B'| \gg K^{-O(1)}|B|\) such that every \(a \in A'\) and \(b \in B'\) are joined by \(\gg K^{-O(1)}|A||B|\) paths of length three.

The fact that the following corollary is valid for non-commutative groups was established by Tao [90].

Corollary 4

Let A, B be finite nonempty subsets of a group G and suppose

$$\displaystyle \begin{aligned} \begin{array}{rcl} \| 1_{A} \star 1_B\|{}_{l^2(G)} \ge \frac{|A|{}^{\frac{3}{4}} |B|{}^{\frac{3}{4}}}{K} \end{array} \end{aligned} $$
(8)

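for some Footnote 11 K ≥ 1. Then there exist subsets \(A' \subset A\) and \(B' \subset B\) with \(|A'| \gg K^{-O(1)}|A|\) and \(|B'| \gg K^{-O(1)}|B|\) such that \(|A' \cdot B'| \ll K^{O(1)}|A|^{\frac{1}{2}}|B|^{\frac{1}{2}}\) and \(|A' \cdot (A')^{-1}| \ll K^{O(1)}|A|\).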

The square of the quantity \(\| 1_{A} \star 1_B\|{ }_{l^2(G)}\) counts the number of solutions to the equation \(a_1 \cdot b_1 = a_2 \cdot b_2\) with \(a_1, a_2 \in A\) and \(b_1, b_2 \in B\) (multiplicative or additive quadruples) and is also known as the multiplicative energy of A and B.
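The identity between the squared \(l^2\) norm of the convolution and the count of quadruples can be checked directly in the additive group of integers. The sketch below is an illustration under that assumption (the group operation is taken to be addition, and the function names are hypothetical); it computes the energy both ways for an arithmetic and for a geometric progression.

```python
from collections import Counter
from itertools import product

def convolution(A, B):
    """r[x] = number of pairs (a, b) in A x B with a + b == x, i.e. (1_A * 1_B)(x)."""
    return Counter(a + b for a, b in product(A, B))

def energy_by_convolution(A, B):
    """Squared l^2 norm of the convolution 1_A * 1_B."""
    return sum(r * r for r in convolution(A, B).values())

def energy_by_quadruples(A, B):
    """Number of quadruples (a1, b1, a2, b2) with a1 + b1 == a2 + b2."""
    return sum(1 for a1, b1, a2, b2 in product(A, B, A, B) if a1 + b1 == a2 + b2)

AP = list(range(8))               # arithmetic progression: many additive coincidences
GP = [2 ** i for i in range(8)]   # geometric progression: almost none

for S in (AP, GP):
    print(energy_by_convolution(S, S), energy_by_quadruples(S, S))
```

For these two sets of size 8 the additive energies are 344 and 120; the second value equals \(2 \cdot 8^2 - 8\), the smallest additive energy a set of 8 integers can have, which reflects how little additive structure a geometric progression carries.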

2.4 On the Dimension of Kakeya Sets and Related Maximal Inequalities

The main result in this 1999 paper of Bourgain [10] is the following improvement of (4) for large n

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \dim_{H}(E) \geq \frac{1}{25}(13 n + 12).\end{array} \end{aligned} $$
(9)

The heart of the argument consists in applying the Balog-Szemerédi-Gowers lemma to show that a Kakeya set E satisfies \(\mathcal {N}_{\delta }(E) \geq \delta ^{-\alpha (n-1)}\) with \(\alpha > \frac {1}{2}\), as follows. Let L be the lattice \(\delta {\mathbb Z}^n \subset {\mathbb R}^n\), and for each of the segments \(\{x+ t e \, : \, |t| \leq \frac {1}{2} \}\) with \(e \in S^{n-1}\) in the definition of a Kakeya set, let \(x^{+}\) and \(x^{-}\) be the elements of L closest to \(x+\frac {1}{2} e\) and \(x-\frac {1}{2} e\), respectively. Let A be the set whose elements are the various \(x^{+}\) and \(x^{-}\), and define \(\mathcal {G} \subset A \times A\) to be the set of pairs \((x^{+}, x^{-})\); then let S be the set of sums \(x^{+} + x^{-}\). Clearly \(|A| \lesssim \mathcal {N}_{\delta }(E)\), and in addition \(|S| \lesssim \mathcal {N}_{\delta }(E)\), since the midpoint \(\frac {1}{2}(x^{+} +x^{-})\) is within \(O(\delta)\) of x ∈ E. But it is equally clear that every point of \(\mathbb {P}^{n-1}\) is within \(O(\delta)\) of some difference \(x^{+} - x^{-}\), so there are at least \(\delta^{-(n-1)}\) differences along \(\mathcal {G}\). Since the sums along \(\mathcal {G}\) are few, the Balog-Szemerédi-Gowers lemma limits the number of such differences, and thus \(\delta ^{-(n-1)} \lesssim \mathcal {N}_{\delta }(E)^{2-\varepsilon }\), as claimed.

This paper marked the first application in Harmonic Analysis of Additive Combinatorics.Footnote 12

Fig. 3. Jean Bourgain and Ben Green

3 Sum-Product Phenomena and the Labyrinth of the Continuum

Additive combinatorics grew out of classical additive number theory. Though a few isolated results existed before, the turning point was Schnirelmann’s approach [80] to Goldbach’s conjecture asserting that any integer greater than three can be expressed as a sum of two or three primes, depending on parity. Schnirelmann proved the weaker result that there is a bound k so that every integer is a sum of at most k primes, or, in other words, the primes form an additive basis. Schnirelmann’s approach, notwithstanding its being soon superseded for Goldbach’s problem by Vinogradov’s method of exponential sums, kindled the interest in addition of general sets; a result of fundamental and lasting importance in this subject is due to Gregory Abelevich Freiman [38], a student of Gelfond, who was a close friend and collaborator of Schnirelmann.Footnote 13

3.1 Freiman’s Theorem and Ruzsa’s Calculus

Freiman’s Theorem gives a characterization of sets with small doubling in terms of generalized arithmetic progressions. A d-dimensional generalized arithmetic progression (GAP) is a set P of the form

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \{a+x_1 q_1 + \dots +x_d q_d\, :\, 0 \leq x_i \leq l_i\}, \end{array} \end{aligned} $$
(10)

where \(l_1, \dots, l_d\) are positive integers. We call d the dimension of P; by the size of P, we mean \(\| P\| = \prod _{i=1}^{d} (l_i +1)\), which is the same as the number of elements if all sums in (10) are distinct (in which case we say that P is proper). Note that

$$\displaystyle \begin{aligned} \begin{array}{rcl} |P+P| < 2^d |P| \leq 2^d \|P\|. \end{array} \end{aligned} $$
(11)
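A small proper GAP makes the doubling bound concrete. The sketch below is illustrative only: the base point, the steps (1, 100), and the side lengths (4, 3) are arbitrary choices, and `gap` and `sumset` are hypothetical helper names.

```python
from itertools import product

def gap(a, steps, lengths):
    """All elements a + x_1 q_1 + ... + x_d q_d with 0 <= x_i <= l_i."""
    ranges = [range(l + 1) for l in lengths]
    return {a + sum(x * q for x, q in zip(xs, steps)) for xs in product(*ranges)}

def sumset(X, Y):
    return {x + y for x in X for y in Y}

steps, lengths = (1, 100), (4, 3)   # a 2-dimensional GAP
P = gap(0, steps, lengths)

size = 1
for l in lengths:
    size *= l + 1                   # the size ||P|| = prod (l_i + 1)

d = len(steps)
print(len(P), size, len(sumset(P, P)), 2 ** d * size)
```

Here P is proper (20 elements, equal to its size), and P + P is again a GAP with side lengths (8, 6), so it has 63 elements, below the bound \(2^d \|P\| = 80\).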

Theorem 5 (Freiman’s Theorem)

If \(A \subset {\mathbb Z}\), |A| = n, |A + A|≤ αn, then A is contained in a generalized arithmetic progression of dimension at most d(α) and size at most s(α)n.

The quantitative bound in Freiman’s theorem, used by Bourgain in his first proof of (1), is due to Mei-Chu Chang (Fig. 4) [24]: d < α (the best possibleFootnote 14) and \(s \leq e^{\alpha ^c}\).

Fig. 4. Jean Bourgain and Mei-Chu Chang

Freiman’s proof was considerably simplified by Ruzsa [77] (building on the earlier work of Plünnecke [74]). One of the fundamental notions introduced by Ruzsa is that of the Ruzsa distance between two sets X and Y in a group, \(\rho (X, Y) = \log \frac {|X-Y|}{\sqrt {|X||Y|}}\), allowing us to rewrite an elementary inequality for finite sets A, Y, Z in a group (which, as observed by Tao, is not necessarily commutative), \(|A||Y - Z| \leq |A - Y||A - Z|\), as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \rho(Y, Z) \leq \rho(Y, A) +\rho(A, Z),\end{array} \end{aligned} $$
(12)

a triangle inequality-like property; ρ is also symmetric (but ρ(X, X) is typically positive). The following result of Plünnecke and Ruzsa was used in Bourgain’s 2 +  proof in place of Freiman’s theorem.

Theorem 6

Let A, B be finite sets in a group and write |A| = m, |A + B| = αm. For arbitrary nonnegative integers k, l we have

$$\displaystyle \begin{aligned} \begin{array}{rcl}|k B -l B| \leq \alpha^{k+l} m.\end{array} \end{aligned} $$
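Both the elementary inequality behind (12) and the bound of Theorem 6 can be spot-checked on random sets of integers. This is a numerical sanity check under stated assumptions (the group is the integers under addition; the seed, the set sizes, and the helper names are arbitrary), not a proof.

```python
import random
from fractions import Fraction

def sumset(X, Y):
    return {x + y for x in X for y in Y}

def diffset(X, Y):
    return {x - y for x in X for y in Y}

def iterated_sumset(B, k):
    """kB = B + ... + B (k summands); 0B is the set {0}."""
    result = {0}
    for _ in range(k):
        result = sumset(result, B)
    return result

def ruzsa_triangle_holds(A, Y, Z):
    # |A| |Y - Z| <= |A - Y| |A - Z|
    return len(A) * len(diffset(Y, Z)) <= len(diffset(A, Y)) * len(diffset(A, Z))

def plunnecke_holds(A, B, k, l):
    # |kB - lB| <= alpha^(k + l) |A|, where |A + B| = alpha |A|
    alpha = Fraction(len(sumset(A, B)), len(A))
    lhs = len(diffset(iterated_sumset(B, k), iterated_sumset(B, l)))
    return lhs <= alpha ** (k + l) * len(A)

random.seed(0)
A, Y, Z = (set(random.sample(range(60), 12)) for _ in range(3))
B = set(random.sample(range(30), 6))
print(ruzsa_triangle_holds(A, Y, Z))
print(all(plunnecke_holds(A, B, k, l) for k in range(4) for l in range(4)))
```

Exact rational arithmetic (`Fraction`) is used for α so that the comparison is not affected by floating-point rounding.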

3.2 Sum-Product Phenomena and Incidence Geometry

Freiman’s theorem is an example of an “inverse” result: knowing that the set has small doubling, we can characterize its structure in terms of GAPs. One of the basic “direct” results, applicable to arbitrary sets, is the “sum-product phenomenon,” whose elementary and elemental nature might be described as follows. When studying addition and multiplication tables for numbers from one to nine, one might notice that there are many more numbers in the multiplication table. This basically has to do with the fact that the numbers from one to nine form an arithmetic progression. If you take a set forming an arithmetic progression (or a subset of it) and add it to itself, it will not grow much; if you take a set forming a geometric progression (or a subset of it) and multiply it by itself, it will also not grow much. However a subset of integers cannot be both an arithmetic and a geometric progression, and so it will grow either when multiplied or added with itself.
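The observation about the addition and multiplication tables is easy to reproduce. The sketch below (with hypothetical helper names) counts the distinct entries of both tables for the numbers from one to nine and, for contrast, for a geometric progression of the same length.

```python
def sumset(A):
    return {a + b for a in A for b in A}

def productset(A):
    return {a * b for a in A for b in A}

AP = list(range(1, 10))            # 1, 2, ..., 9
GP = [2 ** i for i in range(9)]    # 1, 2, 4, ..., 256

for name, S in (("arithmetic", AP), ("geometric", GP)):
    print(name, len(S), len(sumset(S)), len(productset(S)))
```

The arithmetic progression has 17 distinct sums but 36 distinct products, while the geometric progression has 45 distinct sums and only 17 distinct products: each set is small under one operation and large under the other.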

In 1983 Erdös and Szemerédi proved [35] that for any finite set of integers A

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A+A|+|A\cdot A| \geq C |A|{}^{1+\varepsilon}\end{array} \end{aligned} $$
(13)

for absolute constants C, ε and conjectured that in fact for any ε > 0 there is C ε such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A+A|+|A\cdot A| \geq C_{\varepsilon} |A|{}^{2-\varepsilon}. \end{array} \end{aligned} $$
(14)

We will give a beautiful proof (due to Elekes [33] and Székely [86]) of (14) with \(\varepsilon =\frac {3}{4}\) using the Szemerédi-Trotter theorem in incidence geometry, mentioned in Sect. 2.2, which in turn will follow from the crossing number inequality obtained, ultimately, from a purely topological result: Euler’s formula.

3.2.1 Crossing Number Inequality

During World War II, Turán worked as forced labor, moving wagons filled with bricks from kilns to storage places. According to his recollections, it was not a very tough job, except that they had to push much harder at the crossings. This led him to consider the following problem: for a non-planar graph \(\mathcal {G}\), find a drawing for which the number of crossings is minimal. The minimal number of crossings in a drawing is called the crossing number of the graph, \(\mathrm {Cr}(\mathcal {G})\). Another practical application of this problem appeared in the early 1980s, when it turned out that the chip area required for the realization of an electrical circuit (VLSI layout) is closely related to the crossing number of the underlying graph. The basic result, due to Leighton [57], is as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \mathrm{Cr}(\mathcal{G}) \geq \frac{1}{64} \frac{|E|{}^3}{|V|{}^2} - |V|. \end{array} \end{aligned} $$
(15)

Here |V| and |E| denote, respectively, the number of vertices and edges in the graph. The proof starts by observing that Euler’s formula implies that if \(\mathrm {Cr}(\mathcal {G}) =0\), then |V|−|E| + |F| = 2. This readily implies that the crossing number of any graph satisfies

$$\displaystyle \begin{aligned}\begin{array}{rcl} \mathrm{Cr}(\mathcal{G}) \geq |E| -3 |V| +6.\end{array} \end{aligned} $$

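The proof is concluded by considering a drawing of \(\mathcal {G}\) in the plane with the least number of crossings and keeping each vertex of \(\mathcal {G}\) independently at random with probability p. A vertex survives with probability p, an edge with probability \(p^2\), and a crossing with probability \(p^4\), so applying the previous inequality to the random subgraph and taking expectations gives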

$$\displaystyle \begin{aligned} \begin{array}{rcl} p^4 \mathrm{Cr}(\mathcal{G}) \geq p^2 |E| -3 p |V| +6; \end{array} \end{aligned} $$

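letting \(p=\frac {4 |V|}{|E|}\) (which is at most 1 when |E| ≥ 4|V|) yields \(\mathrm {Cr}(\mathcal {G}) \geq \frac {1}{64} \frac {|E|^3}{|V|^2}\), and hence the desired inequality (15); when |E| < 4|V|, the right-hand side of (15) is negative and there is nothing to prove.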
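The arithmetic of the last step can be verified exactly. The sketch below is only a check of the algebra, with hypothetical function names: it evaluates the Euler-type bound on the complete graphs on 5 and 6 vertices (whose crossing numbers are known to be 1 and 3), and confirms that the sampling probability \(p = 4|V|/|E|\), with the constant 6 dropped, produces \(|E|^3/(64|V|^2)\).

```python
from fractions import Fraction

def euler_bound(v, e):
    """Cr(G) >= |E| - 3|V| + 6."""
    return e - 3 * v + 6

def sampled_bound(v, e, p):
    """Lower bound on Cr(G) obtained from p^4 Cr >= p^2 |E| - 3 p |V| (the +6 is dropped)."""
    return (p ** 2 * e - 3 * p * v) / p ** 4

def leighton_bound(v, e):
    """Right-hand side of (15)."""
    return Fraction(e ** 3, 64 * v ** 2) - v

print(euler_bound(5, 10), euler_bound(6, 15))   # complete graphs on 5 and 6 vertices

v, e = 10, 45                                   # complete graph on 10 vertices, |E| >= 4|V|
p = Fraction(4 * v, e)
print(sampled_bound(v, e, p), Fraction(e ** 3, 64 * v ** 2), leighton_bound(v, e))
```

On the two small complete graphs the Euler-type bound is already sharp, whereas the cubic bound only takes over once the number of edges is large compared with the number of vertices.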

3.2.2 Szemerédi-Trotter Theorem

This is the assertion that, given m points and n lines in the plane, the number of incidences satisfies

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} I(m, n) \ll m^{\frac{2}{3}} n^{\frac{2}{3}} + m+n, \end{array} \end{aligned} $$
(16)

(and this is sharp). Consider a set P of m points and a set L of n lines in the plane, realizing the maximal number of incidences I(m, n). Define a drawing of a graph \(\mathcal {G}(V, E)\) in the plane: each point p ∈ P becomes a vertex of \(\mathcal {G}\), and two points p, q ∈ P are connected by an edge if they lie on a common line l ∈ L next to one another. If a line l ∈ L contains k ≥ 1 points of P, then it contributes k − 1 edges to \(\mathcal {G}\), and hence I(m, n) = |E| + n. Since the edges are parts of the lines, at most \(\binom {n}{2}\) pairs may cross: \(\mathrm {Cr}(\mathcal {G}) \leq \binom {n}{2}\). By the crossing number inequality (15) with |V| = m, \(\mathrm {Cr}(\mathcal {G}) \geq \frac {1}{64} \frac {|E|^3}{m^2} - m\), so \(\frac {1}{64} \frac {|E|^3}{m^2} - m \leq \mathrm {Cr}(\mathcal {G}) \leq \binom {n}{2}\), and a calculation gives \(|E| = O(m^{\frac {2}{3}} n^{\frac {2}{3}} + m)\), proving (16).
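The sharpness can be seen on a standard grid configuration. The sketch below is an illustration under explicit assumptions: the points form a k by 2k² integer grid and the lines are y = ax + b with 1 ≤ a ≤ k and 1 ≤ b ≤ k²; this particular example and the name `grid_example` are not taken from the text.

```python
def grid_example(k):
    """Return (number of points, number of lines, number of incidences)."""
    points = {(x, y) for x in range(1, k + 1) for y in range(1, 2 * k * k + 1)}
    lines = [(a, b) for a in range(1, k + 1) for b in range(1, k * k + 1)]  # y = a*x + b
    incidences = sum(
        1 for a, b in lines for x in range(1, k + 1) if (x, a * x + b) in points
    )
    return len(points), len(lines), incidences

for k in (2, 3, 4):
    n_points, n_lines, inc = grid_example(k)
    main_term = (n_points * n_lines) ** (2 / 3)
    print(k, n_points, n_lines, inc, round(inc / main_term, 3))
```

Every line meets the grid in exactly k points, so there are k⁴ incidences among 2k³ points and k³ lines, a fixed fraction (about 0.63) of the product of the two-thirds powers of the number of points and of the number of lines, which shows that the exponents in (16) cannot be lowered.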

3.2.3 Proof of Sum-Product Inequality

We are ready to prove (13) with \(\varepsilon =\frac {1}{4}\). Let P = {(a, b) | a ∈ A + A, b ∈ A ⋅ A}; P is a subset of the plane and has cardinality |A + A||A ⋅ A|. Let L be the set of lines of the form {(x, y)  :  y = a(x − b)}, where a, b are elements of A. Clearly L has \(|A|^2\) elements. Moreover, each such line contains at least |A| points of P, namely, the points (b + c, ac) with c ∈ A. Thus \(I(P, L) \geq |A|^3\). Applying the Szemerédi-Trotter theorem and elementary algebra, we conclude

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A+A| +|A \cdot A| = \Omega (|A|{}^{\frac{5}{4}}). \end{array} \end{aligned} $$
(17)
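The point-line configuration used in this proof can be built explicitly for a small set. The sketch below is illustrative (the set A = {1, ..., 6} and the function name are arbitrary choices, and A is assumed not to contain 0 so that the lines are distinct): it counts the incidences by brute force and confirms that there are at least |A|³ of them.

```python
def elekes_configuration(A):
    """Return (|P|, |L|, incidences) for P = (A+A) x (A.A) and the lines y = a(x - b)."""
    A = set(A)
    sums = {a + b for a in A for b in A}
    prods = {a * b for a in A for b in A}
    P = {(s, t) for s in sums for t in prods}
    L = [(a, b) for a in A for b in A]          # the pair (a, b) encodes y = a * (x - b)
    incidences = sum(1 for a, b in L for x, y in P if y == a * (x - b))
    return len(P), len(L), incidences

A = range(1, 7)
n_points, n_lines, inc = elekes_configuration(A)
print(n_points, n_lines, inc, len(A) ** 3)
```

For this choice P has 11 · 18 = 198 points and L has 36 lines; since each line contains the |A| points (b + c, ac), the incidence count is at least 216.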

Before turning to the discussion of Erdös-Volkmann and Katz-Tao discretized ring conjectures, let us note that if the set A is δ-separated, by carefully adapting the preceding proofs, we obtain an inequality of the form

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \mathcal{N}(A+A, \delta^2) +\mathcal{N}(A \cdot A, \delta^2) > \mathcal{N}(A, \delta)^{1+ \tau}, \end{array} \end{aligned} $$
(18)

to be contrasted with Bourgain’s result (1).

3.3 On the Erdös-Volkmann and Katz-Tao Discretized Ring Conjectures

3.3.1 Erdös-Volkmann Problem

With Volkmann we proved that for every 0 ≤ α ≤ 1 there is a group of real numbers of dimH = α. All our efforts so far failed in proving the existence of ring or field of Hausdorff dimension α.

P. Erdös,Footnote 15 1979

Fig. 5. From a letter from P. Erdös to K. Falconer dated 18 June 1983

In 1966 Erdös and Volkmann proved [34] that for each α in (0, 1), there is an additive Borel subgroup of the reals with Hausdorff dimension α. Several proofs of this fact have now been given, all involving sets of numbers which are well approximated by rationals. It is a well-known result that there exist infinitely many rational approximations \(\frac {m}{n}\) to any real number r with an error less than \(n^{-2}\). If α > 2, let E be the set of real numbers r that can be “well approximated” by rational numbers in the sense that there are infinitely many rational numbers \(\frac {m}{n}\) with \(|r- \frac {m}{n}| < \frac {1}{n^{\alpha }}\). Jarník provedFootnote 16 in 1931 that \(\dim _{H}(E) =\frac {2}{\alpha }\). Falconer’s constructionFootnote 17 of an additive Borel subgroup with Hausdorff dimension α builds on Jarník’s Theorem: take \(n_k\) a sequence of positive integers which increases sufficiently rapidly, for example, \(n_{k+1}> n_k^k\). Define the set \(G_{\alpha}\) to consist of those real numbers x for which there exists M such that for any k there is an integer p with \( |x-\frac {p}{n_k}| < M n_k^{-\frac {1}{\alpha }}\). Clearly \(G_{\alpha}\) is an additive subgroup, and it is not difficult to show, using Jarník’s theorem, that its Hausdorff dimension is equal to α.

3.3.2 Katz-Tao Discretized Ring Conjecture

It was shown by Falconer [36] that a Borel subring R of \({\mathbb R}\) cannot have Hausdorff dimension strictly between \(\frac {1}{2}\) and 1 (by considerations of the distance set \(\{|a-b|; a, b \in R\times R\} \subset \sqrt {R}\)).

In the 2001 paper “Some connections between Falconer’s distance set conjecture and sets of Furstenberg type” [50], motivated, in part, by connections with the Kakeya problem, Nets Katz and Terence Tao formulated a quantitative version of the Erdös-Volkmann problem (the discretized ring conjecture). A bounded subset A of \({\mathbb R}\) is called a \((\delta , \sigma )_1\) set provided A is a union of δ-intervals and satisfies

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A \cap I| < (\frac{r}{\delta})^{1-\sigma} \delta^{1-\varepsilon} \end{array} \end{aligned} $$
(19)

whenever \(I \subset {\mathbb R}\) is an arbitrary interval of size δ ≤ r ≤ 1 (0 < ε ≪ 1 in (19) is a small parameter).

Katz and Tao conjectured that if A is a \((\delta , \frac {1}{2})_1\) set satisfying \(|A|>\delta ^{\frac {1}{2} +\varepsilon }\), then necessarily \(|A+A| +|A \cdot A| > \delta ^{\frac {1}{2}-c}\), with c > 0 an absolute constant. This was proved by Bourgain in the paper eponymous with the title of this section. More generally, he proved the following result (which is the precise formulation of (1)).

Theorem 7

If A is a \((\delta , \sigma )_1\) set, 0 < σ < 1, satisfying \(|A| > \delta ^{\sigma +\varepsilon }\), then necessarily \(|A + A| + |A \cdot A| > \delta ^{\sigma -c}\), with an absolute constant c = c(σ) > 0.

3.3.3 Labyrinth of the Continuum

The title of this subsection is described by Bourgain in the introduction to his paper [11] in the sentence underlined below.

The statement in Theorem 7 is thus a purely combinatorial fact. We proceed by contradiction, assuming

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A+A| +|A \cdot A| < \delta^{\sigma-c}.\end{array} \end{aligned} $$
(20)

The initial stages of the argument use only the additive information, thus \(|A + A| < \delta ^{\sigma -c}\). It is processed through multi-scale construction, based on Ruzsa’s sumset estimates, and, most importantly, quantitative versions of Freiman’s famous theorem on finite sets of reals with small doubling set. …The final product is a subset C of A with a tree structure which exhibits a “multi-scale porosity property.” At this point, we start using multiplicative structure and prove the existence of elements \(x_1, x_2 \in A - A\) such that \(|x_1 C + x_2 C| > \delta ^{\sigma -\kappa }\).

The key difficulty comes from the fact that Freiman’s theorem describes the structure of sets of small doubling |A + A| < C|A| with a fixed constant C, whereas the assumption (20) deals with the situation where the constant C grows with A, as A itself increases in size: the heart of Bourgain’s argument is the structure theorem characterizing sets satisfying (20). The additive subgroups \(G_{\alpha }\) described in Sect. 3.3.1 satisfy this assumption; let us look at their structure more closely, concentrating for concreteness on the case \(\alpha =\frac {1}{2}\) and giving an alternative description of it as a subset of the binary tree representing the continuum (Fig. 6).

Fig. 6 Labyrinth of the continuum

Let \(P_n = \{0, \dots , n - 1\}\), and let

$$\displaystyle \begin{aligned} \begin{array}{rcl}A_n=\sum_{i=1}^n \frac{1}{2^{i^2}} P_{2^i} = \left\{\sum_{i=1}^{n} a_i 2^{-i^2} \, : \, 0 \leq a_i < 2^i \right\}.\end{array} \end{aligned} $$

It is easy to see that the distance between distinct points \(x, x' \in A_n\) is at least \(\frac {1}{2^{n^2}}\), so that each \(x \in A_n\) has a unique representation as a sum \(\sum _{i=1}^{n} a_i 2^{-i^2}\) with \(0 \leq a_i < 2^i\). Each term of the sum \(\sum _{i=1}^{n} a_i 2^{-i^2}\) determines a distinct block of binary digits; the set \(A_n\) is seen to be a GAP (defined in Sect. 3.1) as the image of \(P_2 \times P_4 \times \dots \times P_{2^n} \to A_n\) given by \((x_1, \dots , x_n) \to \sum _{i=1}^{n} x_i 2^{-i^2}\). The rank of this GAP is n, so \(|A_n + A_n| \leq 2^n |A_n|\) and \(|A_n|= \prod _{i=1}^n |P_{2^i}| = 2^{\frac {n(n+1)}{2}}\). So we have \(|A_n + A_n| = |A_n|^{1+o(1)}\).
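The counts for \(A_n\) can be confirmed by brute force for small n. The sketch below is only an illustration under stated assumptions: the digits \(a_i\) are taken in \(P_{2^i} = \{0, \dots , 2^i - 1\}\), the set is rescaled by \(2^{n^2}\) so that all elements are integers, and the function names `build_A` and `sumset` are hypothetical.

```python
from itertools import product


def build_A(n):
    """Return A_n scaled by 2**(n*n), so that every element is an integer."""
    scale = n * n
    digit_ranges = [range(2 ** i) for i in range(1, n + 1)]  # a_i in P_{2^i}
    return {
        sum(a << (scale - i * i) for i, a in enumerate(digits, start=1))
        for digits in product(*digit_ranges)
    }


def sumset(A):
    """All pairwise sums x + y with x, y in A."""
    return {x + y for x in A for y in A}


n = 4
A = build_A(n)
S = sumset(A)
print(len(A), len(S), 2 ** n * len(A))
```

For n = 4 the set has \(2^{10}\) elements, and the sumset stays below the bound \(2^n |A_n|\), in line with the small doubling of a GAP of rank n.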

Now we pass to the limit, akin to the way used in constructing the Cantor set: at stage n, we have a collection of \(2^{\frac {n(n+1)}{2}}\) intervals of length \(2^{-n^2}\); from each of these intervals, we keep \(2^{n+1}\) subintervals of length \(2^{-(n+1)^2}\) separated by gaps of length \(2^{-n^2-(n+1)}\). It is easy to see that the resulting fractal set coincides with \(G_{\frac {1}{2}}\).
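As a heuristic consistency check (a box count at the scale \(2^{-n^2}\), not a proof of the Hausdorff dimension), the stage-n covering in this construction yields the exponent \(\frac {1}{2}\):

$$\displaystyle \begin{aligned} \frac{\log_2 \left(2^{\frac{n(n+1)}{2}}\right)}{\log_2 \left(2^{n^2}\right)} = \frac{n(n+1)}{2n^2} = \frac{1}{2} + \frac{1}{2n} \longrightarrow \frac{1}{2} \quad (n \to \infty). \end{aligned} $$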

A full binary tree of height h can be identified with a set of 0, 1 valued sequences of length ≤ h. Let us say that the tree T has full branching for m generations at the vertex σ if σ has all \(2^m\) possible descendants m generations below it, that is, ση ∈ T for all \(\eta \in \{0, 1\}^m\). The tree is fully concentrated for m generations at σ if σ has a single descendant m generations down, that is, there is a unique \(\eta \in \{0, 1\}^m\) with ση ∈ T. The sets \(A_n\) are represented by trees \(T_n\) of height \(n^2\). For every i < n, every node at level \(i^2\) has full branching for i generations and every node at level \(i^2 + i\) is fully concentrated for i + 1 generations. Consequently, for every \(j \in [i^2, i^2 + i)\), every node at level j has full branching for one generation; for \(j \in [i^2 + i, (i + 1)^2)\), every node at level j is fully concentrated for one generation. Moreover, it is not difficult to see that for every m, we can partition the levels \(0, 1, \dots , n^2\) into three sets U, V, W such that:

  a. For every i ∈ U, every level i node has full branching for m generations.

  b. For every j ∈ V, every level j node is fully concentrated for m generations.

  c. The set W constitutes a negligible fraction of the levels: \(\frac {|W|}{n^2}=o(1)\) as n → ∞ (with m fixed).

In the above description, \(U =\bigcup _{i>m}[i^2, i^2 + i - m)\), \(V =\bigcup _{i>m}[i^2 + i, (i + 1)^2 - m)\), and W is the set of remaining levels.

Bourgain’s structure theorem for sets satisfying (20) can now be informally stated as follows. Suppose \(|A + A| \sim |A|^{1+\tau }\). If b ≥ 2 is a base (say b = 2), we can identify A with a subset of the full b-ary tree of height m: the vertices at distance j from the root are the intervals \([k b^{m-j}, (k + 1) b^{m-j})\) which intersect A. Given ε > 0 there are τ > 0 and b ≥ 2 (which can be taken arbitrarily large) such that the following holds if m is large enough. Suppose \(A \subset \{0, 1, \dots , b^m - 1\}\) and \(|A + A| \leq b^{\tau m}|A|\) (which is the case if \(|A + A| \sim |A|^{1+\tau }\)). Then there is a subset A′ of A satisfying the following properties:

  1. \(|A'| \geq b^{-\varepsilon m}|A|\), that is to say A′ is a fairly dense subset of A.

  2. The b-ary tree associated with A′ is regularized in the sense that any vertex at level j has the same number \(N_j\) of children.

  3. Either \(N_j = 1\) or \(N_j \geq b^{1-\varepsilon }\), so at each level the tree has either no branching or close to full branching uniformly over all the vertices at that level.
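The regularity properties listed above can be observed directly on the model sets \(A_n\), which are already regularized without passing to a subset. The following sketch assumes base b = 2, digits \(a_i \in \{0, \dots , 2^i - 1\}\), and the rescaling of \(A_n\) by \(2^{n^2}\); the names `build_A` and `branching_profile` are hypothetical helpers, and this is an illustration of the definitions rather than of Bourgain's argument.

```python
from itertools import product


def build_A(n):
    """A_n scaled by 2**(n*n): a set of integers below 2**(n*n)."""
    scale = n * n
    ranges = [range(2 ** i) for i in range(1, n + 1)]
    return {
        sum(a << (scale - i * i) for i, a in enumerate(digits, start=1))
        for digits in product(*ranges)
    }


def branching_profile(A, height):
    """For each level j < height of the binary tree of A, return the set of
    child counts found among the level-j vertices (binary prefixes of length j)."""
    profile = []
    for j in range(height):
        children = {}
        for x in A:
            child = x >> (height - j - 1)  # binary prefix of length j + 1
            children.setdefault(child >> 1, set()).add(child)
        profile.append({len(c) for c in children.values()})
    return profile


n = 4
profile = branching_profile(build_A(n), n * n)
regular = all(len(counts) == 1 for counts in profile)
branching_levels = sum(1 for counts in profile if counts == {2})
```

Each level has a single child count, equal to 1 or 2, and the number of levels with full branching equals \(\log _2 |A_n| = \frac {n(n+1)}{2}\).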

From Theorem 7 Bourgain deduced that the answer to the Erdös-Volkmann problem is negative. This was proved independently at about the same time by Edgar and Miller [32], who gave a simple and elegant proof using crucially Marstrand’s projection Theorem 1. The essential idea of their argument served as the starting point and inspiration for the celebrated paper by Bourgain, Katz, and Tao establishing the sum-product theorem in \({\mathbb F_p}\).

3.4 A Sum-Product Estimate in Finite Fields and Applications

The main result of this paper [19] is the following:

Theorem 8

Let A be a subset of \({\mathbb F_p}\) such that for some δ > 0

$$\displaystyle \begin{aligned} \begin{array}{rcl} p^{\delta}< |A|< p^{1-\delta}. \end{array} \end{aligned} $$
(21)

Then

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} |A+A| +|A \cdot A| \geq c(\delta) |A|{}^{1+\varepsilon} \end{array} \end{aligned} $$
(22)

for some ε = ε(δ) > 0.
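To see the dichotomy behind (22) on a small scale, one can compare an arithmetic and a geometric progression in \({\mathbb F_p}\): each has one small set (sums or products) and one large one. The sketch below is an illustration only; the prime p = 1009, the length k = 30, and the helper names are assumptions made for the example and do not come from [19].

```python
def sumset(A, p):
    """The set A + A in F_p."""
    return {(a + b) % p for a in A for b in A}


def productset(A, p):
    """The set A . A in F_p."""
    return {(a * b) % p for a in A for b in A}


def primitive_root(p):
    """Smallest generator of the multiplicative group of F_p (brute force)."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


p, k = 1009, 30  # illustrative choices, not from the text
g = primitive_root(p)
arithmetic = set(range(1, k + 1))             # 1, 2, ..., k
geometric = {pow(g, j, p) for j in range(k)}  # 1, g, ..., g^(k-1)

for name, A in (("arithmetic", arithmetic), ("geometric", geometric)):
    print(name, len(A), len(sumset(A, p)), len(productset(A, p)))
```

The arithmetic progression has exactly 2k − 1 sums and the geometric progression exactly 2k − 1 products; Theorem 8 asserts that no set in the range (21) can be small in both senses at once.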

Here is Terence Tao’s (Fig. 7) recollection:

Regarding the prehistory of my paper with Jean Bourgain and Nets Katz, it all started with a question of Tom Wolff back in 2000, shortly before his unfortunate death. Tom had formulated the finite field version of the Kakeya conjecture (now solved by Dvir), and had observed that there appeared to be a connection between that conjecture (at least in the 3D case) and what is now the sum-product theorem. (Roughly speaking, if the sum-product phenomenon failed, then one could construct ‘Heisenberg group-like’ examples that almost behaved like Kakeya sets.) So he posed the question to me (as a private communication) as to whether the sum-product phenomenon was true. Nets and I chewed on this problem for a while, and found connections to some other problems (the Falconer distance problem, and the Szemeredi-Trotter theorem, over finite fields), but couldn’t settle things one way or another. We then turned to Euclidean analogues, and formulated the discretized ring conjecture and showed that this was equivalent to a non-trivial improvement on the Falconer distance conjecture and on a conjecture of Wolff relating to some sets studied by Furstenberg.

After chasing some dead ends on both the finite field sum-product problem and the discretized ring problem, we gave both problems to Jean, noting that the sum-product problem would likely have applications to various finite field incidence geometry questions, including Kakeya in \({\mathbb F_p}^3\). Jean managed to solve the discretized ring problem using some multi-scale methods, as well as some advanced Freiman theorem type technology based on earlier work of Jean and Mei-Chu Chang. About the same time, Edgar and Miller solved the qualitative version of the discretized ring problem (i.e. the Erdos ring conjecture).

This left the finite field sum-product problem. All the methods in our collective toolboxes were insensitive to the presence of subfields (except perhaps for Freiman’s theorem, but the bounds were (and still are) too weak to get the polynomial expansion; the multi-scale amplification trick that worked in the discretized ring conjecture was unavailable here) and so were insufficient to solve the problem. We knew that it would suffice to show that some polynomial combination of A with itself exhibited expansion, but we were all stuck on how to do this for about a year, until Jean realized that the Edgar-Miller argument (based on the linear algebra dichotomy between having a maximally large span, and having a collision between generators) could be adapted for this purpose. (I still remember vividly the two-page fax from Jean conveying this point.) After this breakthrough the paper got finished up quite rapidly. Of course nowadays there are many simple proofs and strengthenings of this theorem, but it was certainly a very psychologically imposing problem for us before we found the solution.

Fig. 7 Jean Bourgain and Terence Tao

In 2006 Bourgain, Glibichuk, and Konyagin [18] proved (22) under the weaker assumption that \(|A| < p^{1-\delta }\) and, combining this result with the Balog-Szemerédi-Gowers lemma, made remarkable progress towards the Montgomery-Vaughan-Wooley conjecture. This asserts that multiplicative subgroups H of \({\mathbb F_p}^{*}\) have “negligible additive structure” as soon as \(\frac {|H|}{\log p} \to \infty \). This was established for H satisfying \(|H| \geq p^{\frac {1}{4}+\delta }\) by Konyagin in 2002; Bourgain, Glibichuk, and Konyagin proved that the result holds as soon as \(|H| > p^{\varepsilon }\) for any ε > 0. Subsequently, Bourgain refined and extended this approach [12] to obtain hitherto untouchable estimates for exponential sums pertaining to Diffie-Hellman key exchange [13], a result of fundamental significance in cryptographic applications.

4 Discrete and Continuous Variations on the Expanding Theme

4.1 Bemerkung über den Inhalt von Punktmengen

The types of creatures on the earth are countless, and on an individual level their self-preservation instinct as well as longing for procreation is always unlimited; however the space on which this entire life process plays itself out is limited. It is the surface area of a precisely measured sphere.

Hitlers Zweites Buch, 1928

It is a pity the demented housepainter was not briefed about the Hausdorff-Banach-Tarski constructive solution of Lebensraum problem.Footnote 18 Building on Hausdorff’s 1914 construction [44], detailed below, Banach and Tarski, in 1924, proved [4] that there is a way of decomposing a three-dimensional ball (“precisely measured sphere”) into a finite number of disjoint pieces and then reassembling the pieces to form two balls of the same radius, where “reassembling” means that the pieces are translated and rotated and that they end up still disjoint.

The construction, perhaps one of the most strikingly paradoxical in Mathematics (Fig. 8), has its origins in the question posed by Lebesgue in 1904, in the first textbook on integration bearing his name [56]. One of the properties of his integral is the monotone convergence theorem (MCT); is this property really fundamental, or does it follow from more familiar integral axioms? Now MCT is essentially equivalent to countable additivity, so the question is concerned with the existence of a positive, finitely (but not countably) additive measure on the reals assigning measure one to the unit interval.

Fig. 8 Banach-Tarski hedgefund

In more detail, the problem is to assign a non-negative real number f(A) to each bounded subset \(A \subset {\mathbb R}^n\) in such a way that:

  (1) f(E) = 1 if E is the closed unit cube in \({\mathbb R}^n\)

  (2) f(A) = f(B) if A and B are congruent

  (3) f(A ∪ B) = f(A) + f(B) if A and B are disjoint

  (4) \(f(A_1 \cup A_2 \cup \dots ) = f(A_1) + f(A_2) + \dots \) if \(A_1, A_2, \dots \) is any denumerable sequence of mutually disjoint sets whose union is bounded

The congruence condition in 4.1 is as follows: A and B are congruent if there exists an element g in the Euclidean group of distance preserving transformations in \({\mathbb R}^n\) such that g(A) = B. The problem of existence of such an f is the σ-additive measure problem; the problem of existence of f verifying only the first three properties is the finitely additive measure problem.

Lebesgue had left the countably additive measure problem in \({\mathbb R}^n\) unresolved; his construction had proved the existence of f(A) for Lebesgue-measurable bounded subsets and had left the existence of non-measurable subsets as an open question. This was settled by Vitali in 1905 [92], whose construction is a forerunner of the Hausdorff-Banach-Tarski construction. Let \(l_{\theta }\) be a line segment in \({\mathbb R}^2\) given by \(l_{\theta } = \{(r, \theta ) : 0 < r \leq 1\}\) in polar coordinates. Consider \(\bigcup _{\theta } l_{\theta } = D'\), the unit disc with the origin removed. The line segments \(l_{\theta }\) and \(l_{\phi }\) belong to the same equivalence class if θ − ϕ is a rational multiple of π. Consider a set E that is a union of segments \(l_{\theta }\) containing exactly one representative from each equivalence class. The rationals are countable: \({\mathbb Q} \cap [0, 1) = \{x_1, x_2, \dots \}\). Write \(E_n=\{l_{\theta + 2 \pi x_n} \, :\, l_{\theta } \in E\}\). Then each \(E_n\) is obtained from E by rotation around the origin (by angle \(2 \pi x_n\)); the sets \(E_n\) are disjoint (since E contains exactly one representative from each equivalence class), and \(\bigcup _n E_n = D'\). Now take D′ and split it into the set F consisting of the union of the sets \(E_{2n}\) and the set G consisting of the union of the sets \(E_{2n+1}\). Each \(E_{2n}\) can be rotated to \(E_n\), and the union of the \(E_n\) gives us D′. Similarly, each \(E_{2n+1}\) can be rotated to \(E_n\), and the union of the \(E_n\) gives us D′ again. Thus the punctured unit disc can be split into a countable set of disjoint pieces (all obtained by rotation of one particular set), which can be rotated and translated to form disjoint sets whose union is two copies of D′.Footnote 19

Hausdorff begins his 1914 paper Bemerkung über den Inhalt von Punktmengen [44] by using the subgroup \(G_{\delta }=\{ n \delta \, , \, n \in {\mathbb Z}\}\) (where δ is a fixed irrational number) to show that the σ-additive problem in \({\mathbb R}^n\) has no solution for any n ≥ 1. Both Vitali and Hausdorff use a denumerable dense subgroup of the additive group (in Hausdorff’s case the dense group is \(G=G_{\delta } + {\mathbb Z}\)).

He then proceeds to show that the finitely additive measure problem in \({\mathbb R}^n\) has no solution if n ≥ 3 by reducing the problem to the unit sphere K = S2 in \({\mathbb R} ^3\) and then producing the so-called Hausdorff paradoxical decomposition

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} K=A\cup B\cup C \cup Q \end{array} \end{aligned} $$
(23)

where A, B, C, Q are four disjoint subsets of K, Q being denumerable and A ∼ B ∼ C ∼ B ∪ C, the congruence here being under the group of rotations SO(3).

A decomposition (23) excludes the possibility of having an SO(3) invariant finitely additive positive measure set function defined for all subsets of K with f(K) > 0: indeed for such an f, f(Q) must be zero and f(A) = f(B) = f(C) = f(B ∪ C) = f(B) + f(C), whence all of these numbers are zero, which is impossible since 0 < f(K) = f(A) + f(B) + f(C).

The decomposition (23) is obtained by the consideration of a denumerable subgroup G = G(θ, ϕ) of SO(3) generated by two rotations θ, ϕ such that \(\theta ^2 = 1\), \(\phi ^3 = 1\), 1 being the identity map, and such that θ, ϕ satisfy no other nontrivial relations. As observed by von Neumann,Footnote 20 the group G(θ, ϕ) is isomorphic to the free product of \({\mathbb Z}_2\) and \({\mathbb Z}_3\) and must necessarily contain \(F_2\), the free group on two generators.

This left open the finitely additive problem in \({\mathbb R}^1\) and \({\mathbb R}^2\); Banach begins his 1923 paperFootnote 21 (giving the title to the next subsection) by showing that in these spaces the finitely additive measure problem does have infinitely many solutions.

4.2 Sur le problème de la mesure

Banach was not a mathematician of finesse, he was a mathematician of power. Inside he combined a spark of genius with that amazing inner imperative, which incessantly whispered to him, as in Verlaine’s verse, ‘Il n’y a que la gloire ardente du métier’ [There is only one thing: that intense glory of the craft] – and mathematicians know well that their craft depends on the same mystery as the craft of poets.

Hugo SteinhausFootnote 22

In this seminal paper [3], Banach considers three questions pertaining to the invariance of finitely additive measures. First, he constructs a finitely additive, positive, translation-invariant measure μ on the family of bounded subsets of \({\mathbb R}\) such that:

  (1) μ(A) < ∞ for every bounded subset A of \({\mathbb R}\) (so that μ gives rise in an obvious way to an element \(\mu _A\) of \(l^{\infty }(A)^{*}\)).

  (2) \(\mu _{[a, b]} (f) = \int _a^b f(x) d x \) for every Riemann integrable function f on an interval [a, b].

  (3) There exists a Lebesgue integrable function g on an interval [c, d] s.t. \(\mu _{[c, d]} (g) \neq \int _c^d g(x) d x \).

The second result, which Banach calls “le problème large de la mesure,” is to show that unlike the case of n ≥ 3, studied by Hausdorff, the finitely additive measure problem in \(\mathbb {R}^n\) for n = 1, 2 does have infinitely many solutions.

The third question, posed by Ruziewicz in 1921, is whether Lebesgue measure on the n-sphere is the unique finitely additive rotation invariant measure defined on Lebesgue measurable subsets. Using the Hahn-Banach theorem and, in an essential way, the commutativity of SO(2), Banach showed that for n = 1 the answer is negative. He left the case of n ≥ 2 open.

For n > 3, the affirmative answer was obtained in 1980/1981 by Margulis [65] and Sullivan [85] who used Kazhdan’s property T [53].

In 1984 Drinfeld established [30] the affirmative answer in the most difficult case of n = 2 by proving existence of an element in the group ring of SU(2) which has a spectral gap. As proved by Sarnak (Fig. 9) [78], the affirmative answer for n = 2 implies, via inductive construction, an affirmative answer for n ≥ 2.

Fig. 9 Jean Bourgain and Peter Sarnak

Drinfeld’s method used some sophisticated machinery from the theory of automorphic representations, in particular Deligne’s solution of the Ramanujan conjecture [29]. In 1986 the explicit and optimal construction, appealing to the abovementioned tools, was obtained by Lubotzky, Phillips, and Sarnak [59, 60], in tandem with their celebrated construction (independently given by Margulis [66]) of Ramanujan graphs [61].

4.3 Ramanujan-Selberg Conjecture

In 1916 Ramanujan [75] made two deep conjectures about the coefficients of

$$\displaystyle \begin{aligned} \begin{array}{rcl} q \prod_{n=1}^{\infty}(1-q^n)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n.\end{array} \end{aligned} $$
(24)

The first was the multiplicativity of the coefficients: if (m, n) = 1

$$\displaystyle \begin{aligned} \begin{array}{rcl} \tau(mn)=\tau(m)\tau(n); \end{array} \end{aligned} $$
(25)

the second was an estimate

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\tau(n)| \leq d(n) n^{\frac{11}{2}}\end{array} \end{aligned} $$
(26)

where d(n) is the number of divisors of n. In particular,

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\tau(p)| \leq 2 p^{\frac{11}{2}}\end{array} \end{aligned} $$
(27)

for primes p.

The first was proved by Mordell in 1917 [70] and marked the beginning of Hecke’s theory of Hecke operators. The second was proved by Deligne in 1974 [29] and is one of the crowning achievements of twentieth-century mathematics.Footnote 23
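Both statements are easy to test numerically for small n. The following sketch expands the product in (24) with exact integer arithmetic and then checks (25) and the bound (26) at primes, where d(p) = 2; the cutoff N = 30 and the function name `ramanujan_tau` are illustrative assumptions, and such a finite check is of course no substitute for the theorems of Mordell and Deligne.

```python
from math import gcd


def ramanujan_tau(limit):
    """Return [tau(1), ..., tau(limit)] by expanding q * prod (1 - q^n)^24."""
    coeffs = [0] * limit  # coeffs[k] = coefficient of q^k in the product
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):  # multiply by (1 - q^n), truncated at q^(limit-1)
            for k in range(limit - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    return coeffs  # the leading factor q shifts the indices by one


N = 30
tau = dict(enumerate(ramanujan_tau(N), start=1))

multiplicative = all(
    tau[m * n] == tau[m] * tau[n]
    for m in range(1, N + 1) for n in range(1, N + 1)
    if m * n <= N and gcd(m, n) == 1
)
# |tau(p)| <= 2 p^(11/2) is checked exactly in the form tau(p)^2 <= 4 p^11
primes = [p for p in range(2, N + 1) if all(p % d for d in range(2, p))]
bound_holds = all(tau[p] ** 2 <= 4 * p ** 11 for p in primes)
```

The first values produced are τ(1) = 1, τ(2) = −24, τ(3) = 252, and τ(6) = −6048 = τ(2)τ(3).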

In his seminal 1965 paper On the estimation of Fourier coefficients of modular forms, Selberg [81] formulated an analogue of Ramanujan conjecture for non-holomorphic or Maaß forms and showed that it is equivalent to the following statement about the first positive eigenvalue of the Laplacian (Selberg’s eigenvalue conjecture Footnote 24)

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \lambda_1(X(p)) \geq \frac{1}{4} ,\end{array} \end{aligned} $$
(28)

where \(X(p) = \Gamma (p) \backslash {\mathbb H}\), the quotient of the hyperbolic plane by the congruence subgroup

$$\displaystyle \begin{aligned} \begin{array}{rcl}\Gamma(p) = \{ \gamma \in \mathrm{SL}_2({\mathbb Z}) \, : \, \gamma \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \mod \, p\}.\end{array} \end{aligned} $$

By the variational characterization of the first eigenvalue, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \lambda_1(X(p)) = \inf _{\int _{X(p)} f d \mu =0} \frac{\int _{X(p)} |\nabla f|{}^2 d \mu}{\int _{X(p)} f^2 d \mu}.\end{array} \end{aligned} $$
(29)

Using Weil’s bound for Kloosterman sums (obtained as a consequence of his proof of the Riemann hypothesis for curves), Selberg proved the following celebrated result:

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \lambda_1(X(p)) \geq \frac{3}{16}.\end{array} \end{aligned} $$
(30)

This result can be viewed as (implicitly) giving rise to the first family of expander graphs.

4.4 Expanders

Expanders are highly connected sparse graphs widely used in computer science. Clearly high connectivity is desirable in any communication network. The necessity of sparsity is perhaps best seen in the case of the network of neurons in the brain: since the axons have finite thickness, their total length cannot exceed the quotient of the average volume of one’s head and the area of axon’s cross section. In fact, this is the context in which expander graphs first implicitly appeared in the work of Barzdin and Kolmogorov in 1967 [54].

There are several ways of making the intuitive notions of connectivity and sparsity precise; the simplest and most widely used is the following.

Given a subset of vertices, its boundary is the set of edges connecting the set to its complement. The expansion of a subset is the ratio of the size of its boundary to the size of the subset. The expansion coefficient of a graph is the minimum of the expansions of its nonempty subsets containing at most half of the vertices. Note that the expansion coefficient is strictly positive if and only if the graph is connected.

The expansion coefficient captures the notion of being highly connected; the bigger the expansion coefficient, the more highly connected is the graph. Of course one can simply connect all the vertices, but in this case, the number of edges grows as a square of the number of vertices. The problem of constructing expanders is nontrivial because we impose a second constraint: the graphs are to be sparse, i.e., the number of edges should grow linearly with the number of vertices. The simplest way to accomplish this is to demand that the graphs be regular, that is, each vertex has the same number of neighbors (say 3).
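For very small graphs the expansion coefficient can be computed by exhaustive search. The sketch below assumes the usual convention that the minimum is taken over nonempty subsets containing at most half of the vertices; the function names and the two test graphs (an 8-cycle and the 3-regular Petersen graph) are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations


def expansion(vertices, edges):
    """Brute-force expansion coefficient: the minimum of |boundary(S)| / |S|
    over nonempty vertex subsets S with at most half of the vertices."""
    best = None
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            inside = set(subset)
            boundary = sum(1 for u, v in edges if (u in inside) != (v in inside))
            ratio = Fraction(boundary, size)
            if best is None or ratio < best:
                best = ratio
    return best


def cycle(n):
    """Vertices and edges of the n-cycle."""
    return list(range(n)), [(i, (i + 1) % n) for i in range(n)]


def petersen():
    """Vertices and edges of the Petersen graph (3-regular, 10 vertices)."""
    outer = [(i, (i + 1) % 5) for i in range(5)]
    spokes = [(i, i + 5) for i in range(5)]
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]
    return list(range(10)), outer + spokes + inner


h_cycle = expansion(*cycle(8))
h_petersen = expansion(*petersen())
```

The search visits every subset, so its cost grows exponentially with the number of vertices; it is meant only to make the definition concrete.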

A family of k-regular graphs \(\mathcal {G}_{n,k}\) forms a family of expanders if there is a fixed positive constant c, such that

$$\displaystyle \begin{aligned} \begin{array}{rcl} {} \liminf_{n \rightarrow \infty} c(\mathcal{G}_{n,k}) \geq c>0. \end{array} \end{aligned} $$
(31)

The expansion coefficient is a notion which is very easy to grasp, but it is difficult to compute numerically or to estimate analytically, as the number of subsets grows exponentially with the number of vertices. The starting point of most current work on expanders is that the expansion coefficient has a spectral interpretation:Footnote 25 to put it sonorously, if you hit a graph with a hammer, you can determine how highly connected it is by listening to the bass note. In more technical terms, high connectivity is equivalent to establishing a spectral gap for an averaging (or Laplace) operator on the graph so that condition (31) has the following alternative expression:

$$\displaystyle \begin{aligned}\begin{array}{rcl} {} \liminf_{n \rightarrow \infty} \lambda_{1}(\Delta(\mathcal{G}_{n,k})) \geq \mu > 0, \end{array} \end{aligned} $$
(32)

making apparent the connection with Selberg’s celebrated \(\frac {3}{16}\) Theorem (30).
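The “bass note” can be computed without enumerating subsets. The sketch below estimates the first positive eigenvalue of the combinatorial Laplacian L = kI − A of a k-regular graph by power iteration; this normalization of the Laplace operator, the function name `laplacian_gap`, and the choice of the Petersen graph as a test case are assumptions made for the illustration.

```python
from math import sqrt


def laplacian_gap(n, edges, degree, iterations=200):
    """Estimate lambda_1 of L = degree*I - A by power iteration on
    M = 2*degree*I - L = degree*I + A, restricted to mean-zero vectors."""
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    vec = [float(i) for i in range(n)]  # arbitrary deterministic start vector
    mu = 0.0
    for _ in range(iterations):
        mean = sum(vec) / n
        vec = [x - mean for x in vec]  # project off the constant functions
        norm = sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        image = [degree * vec[i] + sum(vec[j] for j in neighbours[i])
                 for i in range(n)]
        mu = sum(a * b for a, b in zip(vec, image))  # Rayleigh quotient
        vec = image
    return 2 * degree - mu


# Petersen graph: outer 5-cycle, spokes, inner pentagram
edges = ([(i, (i + 1) % 5) for i in range(5)]
         + [(i, i + 5) for i in range(5)]
         + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
gap = laplacian_gap(10, edges, degree=3)
```

For the Petersen graph the Laplacian eigenvalues are 0, 2, and 5, so the computed gap is 2. With this normalization the expansion coefficient h of a k-regular graph satisfies \(\frac {\lambda _1}{2} \leq h \leq \sqrt {2 k \lambda _1}\), which is the spectral interpretation referred to above.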

In 1973 Pinsker [73] observed that random regular graphs are expanders. In the same year, Margulis [64] gave the first explicit construction of expanders as Cayley graphsFootnote 26 of \(\mathrm {SL}_3({\mathbb F_p})\) using Kazhdan’s property T [53].

4.5 Superstrong Approximation

The strong approximation for \(\mathrm {SL}_n({\mathbb Z})\), asserting that the reduction \(\pi _q\) modulo q is onto, is a consequence of the Chinese remainder theorem; its extension to arithmetic groups is far less elementary but well understood. If S is a finite symmetric generating set of \(\mathrm {SL}_n({\mathbb Z})\), strong approximation is equivalent to the assertion that the Cayley graphs \(\mathcal {G}(\mathrm {SL}_n({\mathbb Z}/q {\mathbb Z}), \pi _q(S))\) are connected. The quantification of this statement, asserting that they are in fact highly connected, that is to say form a family of expanders, is what we mean by superstrong approximation. The proof of the expansion property for \(\mathrm {SL}_2({\mathbb Z})\) has its roots in Selberg’s celebrated lower bound (30). The generalization of the expansion property to \(G({\mathbb Z})\) where G is a semi-simple matrix group defined over \({\mathbb Q}\) is also known thanks to developments towards the general Ramanujan conjectures that have been established; this expansion property is also referred to as property τ for congruence subgroups.

Let Γ be a finitely generated subgroup of \(\mathrm {GL}_n({\mathbb Z})\) and let G = Zcl(Γ). The discussion of the previous paragraph applies if Γ is of finite index in \(G({\mathbb Z})\). However, if Γ is thin, that is to say, of infinite index in \(G({\mathbb Z})\), then \({\mbox{vol}}(\Gamma \backslash G({\mathbb R})) = \infty \), and the techniques used to prove both of these properties do not apply. It is remarkable that under suitable natural hypotheses, strong approximation continues to hold in this thin context, as proved by Matthews, Vaserstein, and Weisfeiler in 1984 [68, 94]. That the expansion property might continue to hold for thin groups was first suggested by Lubotzky and Weiss in 1993 [62]; for \(\mathrm {SL}_2({\mathbb Z})\), the issue is neatly encapsulated in the following 1-2-3 question of Lubotzky [58]. For a prime p ≥ 5 and i = 1, 2, 3, let us define \(S_{p}^{i} =\left \{ \left ( \begin {smallmatrix} 1 & i\\ 0 & 1 \end {smallmatrix} \right ) \, , \left ( \begin {smallmatrix} 1 & 0\\ i & 1 \end {smallmatrix} \right ) \right \} \). Let \(\mathcal {G}_{p}^{i} = \mathcal {G} \left ( \mathrm {SL}_2({\mathbb Z}/ p {\mathbb Z}) \, ,S_p^i \right )\), a Cayley graph of \(\mathrm {SL}_2({\mathbb Z}/ p {\mathbb Z})\) with respect to \(S_p^i\). By Selberg’s theorem, \(\mathcal {G}_{p}^1\) and \(\mathcal {G}_{p}^{2}\) are families of expander graphs. However, the group \(\langle \left ( \begin {smallmatrix} 1 & 3\\ 0 & 1 \end {smallmatrix} \right ) \, , \left ( \begin {smallmatrix} 1 & 0\\ 3 & 1 \end {smallmatrix} \right ) \rangle \) has infinite index in \(\mathrm {SL}_2({\mathbb Z})\) and thus does not come under the purview of Selberg’s theorem.
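The qualitative part of the 1-2-3 question, namely that the graphs \(\mathcal {G}_{p}^{i}\) are connected for all three values of i, can be verified by direct search for small primes. The sketch below is a minimal illustration: it adds the inverses of the generators so that the graph is undirected, uses the hypothetical helper names `mat_mul` and `cayley_component`, and says nothing about the expansion property itself.

```python
from collections import deque


def mat_mul(a, b, p):
    """Multiply 2x2 matrices stored as tuples (a11, a12, a21, a22) mod p."""
    return (
        (a[0] * b[0] + a[1] * b[2]) % p, (a[0] * b[1] + a[1] * b[3]) % p,
        (a[2] * b[0] + a[3] * b[2]) % p, (a[2] * b[1] + a[3] * b[3]) % p,
    )


def cayley_component(p, i):
    """Vertices reachable from the identity in the Cayley graph of
    SL_2(Z/pZ) generated by S_p^i together with the inverses."""
    gens = [(1, i % p, 0, 1), (1, 0, i % p, 1),
            (1, -i % p, 0, 1), (1, 0, -i % p, 1)]
    identity = (1, 0, 0, 1)
    seen = {identity}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        for s in gens:
            h = mat_mul(g, s, p)
            if h not in seen:
                seen.add(h)
                queue.append(h)
    return seen


sizes = {(p, i): len(cayley_component(p, i)) for p in (5, 7) for i in (1, 2, 3)}
```

In every case the component of the identity has \(p(p^2 - 1)\) elements, the order of \(\mathrm {SL}_2({\mathbb Z}/ p {\mathbb Z})\), so the Cayley graph is connected.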

Following the groundbreaking work of Helfgott [45] (which builds crucially on the sum-product estimate in \({\mathbb F_p}\) discussed in Sect. 3.4), Bourgain and Gamburd [16] gave a complete answer to Lubotzky’s question. The method introduced in “Uniform expansion bounds for Cayley graphs of \(\mathrm {SL}_2({\mathbb Z}/ p {\mathbb Z})\)” and developed in a series of papers became known as the “Bourgain-Gamburd expansion machine”; thanks to a number of major developments by many people, the general superstrong approximation for thin groups is now known. The state of the art is summarized in Thin Groups and Superstrong Approximation [21], which contains an expanded version of most of the invited lectures from the eponymous MSRI ‘Hot Topics’ workshop, in the surveys by Breuillard [20] and Helfgott [46], and in the book by Tao, Expansion in Finite Simple Groups of Lie Type [91].

4.6 On the Spectral Gap for Finitely Generated Subgroups of SU(d)

There is an Archimedean analogue of the expansion property, intimately related to the Banach-Ruziewicz problem discussed in Sect. 4.2, defined as follows.

For k ≥ 2, let \(g_1, \dots , g_k\) be a finite set of elements in G = SU(d) (d ≥ 2). We associate with them an averaging (or Hecke) operator \(z_{g_1, \dots , g_k}\), taking \(L^2(\mathrm {SU}(d))\) into \(L^2(\mathrm {SU}(d))\):

$$\displaystyle \begin{aligned} \begin{array}{rcl} z_{g_1, \dots, g_k} f(x) = \sum_{j=1}^{k}\left(f(g_j x) +f(g_j^{-1} x)\right).\end{array} \end{aligned} $$

We denote by supp(z) the set \(\{g_1, \dots , g_k, g_1^{-1}, \dots , g_k^{-1}\}\) and by \(\Gamma _z\) the group generated by supp(z). It is clear that \(z_{g_1, \dots , g_k}\) is self-adjoint and that the constant function is an eigenfunction of z with eigenvalue \(\lambda _0(z) = 2k\). Let \(\lambda _{1}(z_{g_1, \dots , g_k})\) denote the supremum of the eigenvalues of z on the orthogonal complement of the constant functions in \(L^2(\mathrm {SU}(d))\). We say that z has a spectral gap if \(\lambda _{1}(z_{g_1, \dots , g_k})<2k.\) It is also common to refer to this situation by saying that the spectral gap property holds for \(\Gamma _z\).

It is easy to see that an affirmative solution of the Banach-Ruziewicz problem follows from the existence of z in SU(2) having a spectral gap. In their 1986 paper, referenced at the end of Sect. 4.2, Lubotzky, Phillips, and Sarnak posed the question of whether a generic (in measure) z in SU(2) has a spectral gap.

In 2008 Bourgain and Gamburd [17] proved (Theorem 9 below) the spectral gap property for z in SU(2) satisfying the non-commutative diophantine property (NDP)—in particular for free subgroups generated by elements with algebraic entries.

The definition of the non-commutative diophantine propertyFootnote 27 introduced in the paper “Spectra of elements in the group ring of SU(2)” by Gamburd, Jakobson, and Sarnak [41] is as follows. We say that \(z_{g_1, \dots , g_k}\) satisfies NDP if there is \(D = D(g_1, \dots , g_k) > 0\) (the diophantine constant of z) such that for any m ≥ 1 and any word \(W_m\) in \(g_1, \dots , g_k\) of length m with \(W_m \neq \pm e\) (where e denotes the identity in SU(2)) we have \(\|W_m \pm e\| \geq D^{-m}\).

Theorem 9

Let \(g_1, \dots , g_k\) be a set of elements in SU(2) generating a free group and satisfying NDP (in particular, elements with algebraic entriesFootnote 28). Then \(z_{g_1, \dots , g_k}\) has a spectral gap.

Regarding the proof, let me just note that in the adaption of the “expansion machine” to this Archimedean setting, the crucial role is played by the following strengthening of Theorem 7.

Theorem 10

Given \(0 < \sigma < 1\) and κ > 0, there exist \(\varepsilon_0 > 0\) and \(\varepsilon_1 > 0\) such that if δ > 0 is sufficiently small and A ⊂ [1, 2] is a discrete set consisting of δ-separated points, satisfying \(|A| = \delta^{-\sigma}\) and

$$\displaystyle \begin{aligned} \begin{array}{rcl} |A\cap I| < \rho^{\kappa} |A| \end{array} \end{aligned} $$
(33)

whenever I is a size ρ interval with \(\delta < \rho < \delta ^{\varepsilon _0}\) , then

$$\displaystyle \begin{aligned} \begin{array}{rcl} N(A+A, \delta) +N(A\cdot A, \delta) > \delta^{-\varepsilon_1} |A|. \end{array} \end{aligned} $$
(34)
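
The quantities in (34) can be explored numerically. The following is a toy sketch, not part of the proof: it approximates N(·, δ) by counting the cells of the grid δℤ that a set meets (comparable to the covering number up to a small constant factor) and compares an additively structured set with a multiplicatively structured one. The function names, the choice δ = 10⁻⁴, and the two sample sets are assumptions made for illustration; no claim is made that they satisfy (33) for a specific κ.

```python
import numpy as np

def covering_number(points, delta):
    """Number of cells of the grid delta*Z met by `points`
    (a stand-in for N(., delta), accurate up to a small constant factor)."""
    cells = np.floor(np.asarray(points, dtype=float) / delta).astype(np.int64)
    return int(np.unique(cells).size)

def sum_product_counts(A, delta):
    """Return grid counts for A, A+A and A.A at scale delta."""
    A = np.asarray(A, dtype=float)
    sums = np.add.outer(A, A).ravel()
    prods = np.multiply.outer(A, A).ravel()
    return (covering_number(A, delta),
            covering_number(sums, delta),
            covering_number(prods, delta))

delta, n = 1e-4, 100  # |A| = delta**(-1/2), i.e. sigma = 1/2
arithmetic = 1.003025 + np.arange(n) / n        # additively structured subset of [1, 2]
geometric = 2.0 ** ((np.arange(n) + 0.3) / n)   # multiplicatively structured subset of [1, 2]
for name, A in [("arithmetic", arithmetic), ("geometric", geometric)]:
    nA, nS, nP = sum_product_counts(A, delta)
    print(name, nA, nS, nP, (nS + nP) / nA)
```

In each example one of the two sets A + A, A · A is small at scale δ, while the other is much larger than A; the theorem asserts that, under the non-concentration hypothesis, both cannot be small at once.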

Theorem 9 is of importance in quantum computing [28, 43]. In the context of quantum computation, elements of a three-dimensional rotation group are viewed as “quantum gates,” and a set of elements generating a dense subgroup is called “computationally universal” (since any element of the rotation group can be approximated by some word in the generating set to an arbitrary precision). A set of elements is called “efficiently universal” if any element can be approximated by a word of length which is logarithmic with respect to the inverse of the chosen precision (this is the best possible). A consequence of Theorem 9 is that computationally universal sets with algebraic entries are efficiently universal.
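
That logarithmic length is the best possible follows from a standard volume count, sketched here under the following assumptions: the generating set has k elements, precision ε is measured in a bi-invariant metric, Haar measure is normalised to total mass 1, and every ε-ball has measure at most \(C\varepsilon^{3}\) (the group being three-dimensional). The symbols ℓ and C are notation introduced only for this sketch. If the ε-balls centred at the words of length at most ℓ cover the group, then

$$\displaystyle \begin{aligned} \#\{\text{words of length} \le \ell\} \le (2k)^{\ell+1}, \qquad (2k)^{\ell+1}\, C \varepsilon^{3} \ge 1 \;\Longrightarrow\; \ell \ge \frac{3\log(1/\varepsilon) - \log C}{\log (2k)} - 1, \end{aligned} $$

so no generating set can do better than length of order log(1∕ε).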

Another application is related to the theory of quasicrystals. Generalizing Penrose’s two-dimensional aperiodic tiling, John Conway and Charles Radin [26] constructed a self-similar (hierarchical) tiling of a three-dimensional space with a single prototile, such that the tiles occur in an infinite number of different orientations in the tiling. The tile is a prism, which when scaled up by two is subdivided into eight copies of itself (“daughter tiles”). If one iterates this same subdivision procedure over and over, one creates in the limit the desired tiling of three-dimensional space by prisms. Conway and Radin showed that the orientations of tiles in the tiling are uniformly distributed and posed the question of how fast this convergence to uniform distribution takes place. This question reduces to the study of the spectral gap for the averaging operator associated with eight rotations giving orientations of daughter tiles. A consequence of Theorem 9 is that this convergence takes place exponentially fast.

5 Coda

The essence of mathematics lies precisely in its freedom.

Georg Cantor

Already history has in a sense ceased to exist, i.e. there is no such thing as a history of our own times which could be universally accepted, and the exact sciences are endangered as soon as military necessity ceases to keep people up to the mark. Hitler can say that the Jews started the war, and if he survives, that will become official history. He can’t say that two and two are five, because for the purposes of, say, ballistics they have to make four.

George Orwell, letter to N. Willmett, 18 May 1944

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.

George Orwell, Nineteen Eighty-Four, 1949

The difficulties of explaining Bourgain’s work to a broad mathematical audience turned out to be quite substantial;Footnote 29 omitting “mathematical” from the appellation renders them nearly insurmountable.

Ian Stewart begins his admirable book The Problems of Mathematics (Oxford University Press, 1987) with an interview with a mathematician conducted by Seamus Android on behalf of the proverbial man in the streetFootnote 30 invoked in Hilbert’s celebrated 1900 address Problems of Mathematics, referenced at the beginning of Sect. 2.

Mathematician: It’s one of the most important discoveries of the last decade!

Android: Can you explain it in words ordinary mortals can understand?

Mathematician: Look, buster, if ordinary mortals could understand it, you would not need mathematicians to do the job for you, right? You can’t get a feeling for what’s going on without understanding the technical details. How can I talk about manifolds without mentioning that the theorem only works if the manifolds are finite dimensional paracompact Hausdorff with empty boundary?

Android: Lie a bit.

Mathematician: Oh, but I could not do that!

Android: Why not? Everybody else does.

Perhaps the most troubling omen of our times is an assault on the very basic notions of logic and truth, in their most elemental Aristotelian sense, including, in particular, the law of the excluded middle. Our discipline stands as a mighty fortress against this assault, and I, for one, believe we should not be overly defensive about our reluctance to lie a bit just because everybody else does.

***

Of all escapes from reality, mathematics is the most successful ever. It is a fantasy that becomes all the more addictive because it works back to improve the same reality we are trying to evade. All other escapes – sex, drugs, hobbies, whatever – are ephemeral by comparison. The mathematician’s feeling of triumph, as he forces the world to obey the laws his imagination has freely created, feeds on its own success. The world is permanently changed by the workings of his mind, and the certainty that his creations will endure renews his confidence as no other pursuit.

Gian-Carlo Rota, ‘The Lost Cafe’, 1987

The one who writes a poem writes it above all because verse writing is an extraordinary accelerator of conscience, of thinking, of comprehending the universe. Having experienced this acceleration once, one is no longer capable of abandoning the chance to repeat this experience; one falls into dependency on this process, the way others fall into dependency on drugs or on alcohol. One who finds himself in this sort of dependency on language is, I guess, what they call a poet.

Joseph Brodsky, ‘Nobel Lecture’, 1987

To paraphrase W. H. Auden (writing In Memory of W. B. Yeats), [Mathematics]Footnote 31 makes nothing happen: it survives

In the valley of its making, where executives

Would never want to tamper.

In attempting to explain the significance of Bourgain’s remarkable and remarkably useful results to a proverbial human-on-line, one may invoke their applications in mathematical physics, computer science, and cryptography, which are of immense practical importance in contemporary life, making, in particular, online communication possible. Their subtlety, beauty, and depth appear to be much harder to convey in “plain English.” Here and now, perhaps, we must remind ourselves that the human-on-line, while attached to a digital device (built by von Neumann), is still human and sound bite/tweet thus: while dealing with entities seemingly fake/unreal (e.g., the real line), Bourgain’s singular adventures in the labyrinth of the continuum represent a magnificent and transcendent achievement of the human spirit.

***

I met Jean in September 2005, 6 months after my daughter (who drew the pictures for this essay) was born, while visiting IAS for the program “Lie Groups, Representations and Discrete Mathematics” led by Alex Lubotzky. I do not remember the precise date but do remember the hour: it was between 2 and 3 am. After changing my daughter’s diapers, I could not sleep, went to Simonyi Hall, and ran into Jean walking to the Library. It was in this discombobulated state that I was free of fear to speak to him. By dawn, the problem which had been resisting my protracted attack for a decade was vanquished in Jean’s office.Footnote 32

During this happiest year of my life, in 2005–2006, I stayed on the Lane named after Hermann Weyl who was of the view that “Mathematics is not the rigid and uninspiring schematism which the layman is so apt to see in it; on the contrary, we stand in mathematics precisely at that point of limitation and freedom which is the essence of man himself.”

During my second visit to IAS, in 2007–2008, as von Neumann Fellow participating in the “Arithmetic Combinatorics” Program led by Jean Bourgain and Van Vu, I stayed on the Lane named after Erwin Panofsky. His magnificent essay The History of Art as a Humanistic Discipline, based on The Spencer Trask Princeton University Lectures for 1937–38, commences thus:

Nine days before his death Immanuel Kant was visited by his physician. Old, ill, and nearly blind, he rose from his chair and stood trembling with weakness and muttering unintelligible words. Finally his faithful companion realized that he would not sit down until the visitor has taken a seat. This he did, and Kant then permitted himself to be helped to his chair, and, after regaining some of his strength, said, ‘Das Gefühl für Humanität hat mich noch nicht verlassen’ – ‘The sense of humanity has not yet left me’. The two men were moved almost to tears. For, though the word Humanität had come, in the eighteenth century, to mean little more than politeness or civility, it had, for Kant, a much deeper significance, which the circumstances of the moment served to emphasize: man’s proud and tragic consciousness of self-approved and self-imposed principles, contrasting with his utter subjection to illness, decay and all that is implied in the word ‘mortality’.

Towards the end of the essay, Panofsky thus (pre-)echoes Orwell: “If the anthropocratic civilization of the Renaissance is headed, as it seems to be, for a Middle Ages in reverse – a satanocracy as opposed to the mediaeval theocracy – not only the humanities but also natural sciences, as we know them, will disappear, and nothing will be left but what serves the dictates of the sub-human.”

During my third, short visit (Fig. 10), I stayed on von Neumann Drive (the only other “Drive” at IAS is named after Einstein). The similarities between von Neumann and Baron Bourgain are subtle and striking.Footnote 33 In his article The Legend of John von Neumann [42], Paul Halmos has the following to say: “The heroes of humanity are of two kinds: the ones who are just like all of us, but very much more so, and the ones who, apparently, have an extra-human spark. We can all run, and some of us can run the mile in less than 4 minutes; but there is nothing that most of us can do that compares with the creation of the Great G-minor Fugue. Von Neumann’s greatness was the human kind. We can all think clearly, more or less, some of the time, but von Neumann’s clarity of thought was orders of magnitude greater than that of most of us, all the time. Both Norbert Wiener and John von Neumann were great men, and their names will live after them, but for different reasons. Wiener saw things deeply but intuitively; von Neumann saw things clearly and logically.” One may agree or disagree with Halmos’s assessment; it is my belief that Bourgain’s greatness combined these two kinds.

Fig. 10. Jean Bourgain, Peter Sarnak, Alex Gamburd

***

The IAS (where Jean did most of the work described in this essay) official seal (Fig. 11) is imprinted on the Analysis and Beyond conference poster. In a circular format, the quietly elegant and classical Art Deco composition depicts two graceful young ladies, one clothed and one otherwise, standing on opposite sides of a leafy tree that appears to bear abundant fruit. Their poses are complementary, one looking out towards the spectator and the other looking down, avoiding eye contact. The figures are named in large sans serif letters, TRUTH to the left and BEAUTY on the right. Truth holds a mirror that overlaps the circular frame to reflect reality.

Fig. 11. The IAS Seal

Underlying the design of the seal is the evident allusion to the famous final couplet of “Ode on a Grecian Urn” by John Keats: “‘Beauty is truth, truth beauty,’ – that is all / Ye know on earth, and all ye need to know.” Keats was of the view that “the excellence of every art is its intensity, capable of making all disagreeables evaporate from their being in close relationship with Beauty and Truth.”

Having attempted in this essay a snapshot of the excellence of Bourgain’s art, let me conclude by giving a glimpse of his intensity by quoting from the interview upon receiving the 2017 Breakthrough Prize in Mathematical Sciences (Fig. 12):

If you have a question which is generally perceived as unapproachable, it is often that you do not even quite know where you have to look to get a solution. From that point of view, we are rather like Fourier,Footnote 34 stranded in the desert, hopelessly lost. At the moment you get this insight, all of a sudden you escape the desert and things open up for you. Then we feel very excited. These are the best moments. They make up for all the suffering with absolutely no progress worth it.

Fig. 12. Richard Taylor, Jean Bourgain, Terence Tao