
10.1 Integral Calculus

In a concise description of mathematical methods, Henri Lebesgue underlined the importance of definitions and axioms (see [47]).

When a mathematician foresees, more or less clearly, a proposition, instead of having recourse to experiment like the physicist, he seeks a logical proof. For him, logical verification replaces experimental verification. In short, he does not seek to discover new materials but tries to become aware of the richness that he already unconsciously possesses, which is built into the definitions and axioms. Herein lies the supreme importance of these definitions and axioms, which are indeed subjected logically only to the condition that they be compatible, but which could lead only to a purely formal science, void of meaning, if they had no relationship to reality.

Leibniz conceived integration as the reciprocal of differentiation:

$$\displaystyle{\int dx = d\int x = x.}$$

The computation of the integral of f is reduced to the search for its primitive, solution of the differential equation

$$\displaystyle{F^{\prime} = f.}$$

The textbooks by Cauchy, in particular the Analyse algébrique (1821) (see [7]) and the Résumé des leçons données à l’Ecole Royale Polytechnique sur le calcul infinitésimal (1823), opened a new era in analysis. Cauchy was the first to consider the problem of existence of primitives:

In integral calculus, it seemed necessary to me to demonstrate in general the existence of integrals or primitive functions before giving their various properties. In order to reach this, it was necessary to establish the notion of integral between two given limits or definite integral.

Cauchy defines and proves the existence of the integral of continuous functions:

According to the preceding lecture, if one divides X − x 0 into infinitesimal elements \(x_{1} - x_{0}\), \(x_{2} - x_{1}\), …, \(X - x_{n-1}\), the sum

$$\displaystyle{S = (x_{1} - x_{0})f(x_{0}) + (x_{2} - x_{1})f(x_{1}) + \cdots + (X - x_{n-1})f(x_{n-1})}$$

will converge to a limit given by the definite integral

$$\displaystyle{\int _{x_{0}}^{X}f(x)dx.}$$

So Cauchy proved the existence of primitives of continuous functions using integral calculus.
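As a simple check of the definition (not taken from Cauchy’s Résumé), consider f(x) = x on [0, 1] with the uniform subdivision \(x_{j} = j/n\); the sum becomes

$$\displaystyle{S =\sum _{j=0}^{n-1}(x_{j+1} - x_{j})f(x_{j}) =\sum _{j=0}^{n-1}\frac{1}{n}\cdot \frac{j}{n} = \frac{n - 1}{2n}\longrightarrow \frac{1}{2} =\int _{0}^{1}x\,dx,\quad n \rightarrow \infty.}$$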

Though every continuous function has a primitive, Weierstrass proved in 1872 the existence of continuous nowhere differentiable functions. In a short note [44], Lebesgue proved the existence of primitives of continuous functions without using integral calculus. His proof is clearly functional-analytic.

In 1881 ([37]), Camille Jordan defined the functional space of functions of bounded variation, which he called functions of limited oscillation. His goal was to linearize Dirichlet’s condition for the convergence of Fourier series:

Let \(x_{1},\ldots,x_{n}\) be a series of values of x between 0 and ɛ, and \(y_{1},\ldots,y_{n}\) the corresponding values of f(x). The points \(x_{1},y_{1};\ \ldots;\ x_{n},y_{n}\) will form a polygon.

Consider the differences

$$\displaystyle{y_{2} - y_{1},y_{3} - y_{2},\ldots,y_{n} - y_{n-1}.}$$

We will call the sum of the positive terms of this sequence the positive oscillation of the polygon; the negative oscillation is the sum of the negative terms; the total oscillation is the sum of those two partial oscillations in absolute value.

Let us vary the polygon; two cases may occur:

  1∘ The polygon may be chosen so that its oscillations exceed every limit.

  2∘ For every chosen polygon, its positive and negative oscillations will be less than some fixed limits \(P_{\varepsilon }\) and \(N_{\varepsilon }\). We will say in that case that F(x) is a function of limited oscillation in the interval from 0 to ɛ; \(P_{\varepsilon }\) will be its positive oscillation; \(N_{\varepsilon }\) its negative oscillation; \(P_{\varepsilon } + N_{\varepsilon }\) its total oscillation.

This case will necessarily occur if F(x) is the difference of two finite functions f(x) − φ(x), because it is clear that the positive oscillation of the polygon will be \(\leqq f(\varepsilon ) - f(0)\), and its negative oscillation \(\leqq \varphi (\varepsilon ) -\varphi (0)\).

The converse is easy to prove. Indeed, it is easy to verify that

  1∘ The oscillation of a function from 0 to ɛ is equal to the sum of its oscillations from 0 to x and from x to ɛ, x being any quantity between 0 and ɛ.

  2∘ We have that \(F(x) = F(0) + P_{x} - N_{x}\), \(P_{x}\) and \(N_{x}\) denoting the positive and the negative oscillations from 0 to x. But \(F(0) + P_{x}\) and \(N_{x}\) are finite functions nondecreasing from 0 to ɛ.

Hence Dirichlet’s proof is applicable, without modification, to every function of bounded oscillation from x = 0 to x = ɛ, ɛ being any finite quantity.

The functions of limited oscillation constitute a well-defined class, whose study could be of some interest.
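A standard illustration, not taken from Jordan’s note, shows that continuity alone does not imply limited oscillation. For

$$\displaystyle{u(x) = x\sin (1/x)\ (0 < x \leq 1),\qquad u(0) = 0,}$$

the points \(x_{k} = 2/((2k + 1)\pi )\) satisfy \(\vert u(x_{k})\vert = x_{k}\) with alternating signs, so the oscillation of any polygon through them dominates the divergent series \(\sum _{k}2/((2k + 1)\pi )\), and u is not of bounded variation. By contrast, \(v(x) = {x}^{2}\sin (1/x)\), v(0) = 0, has a derivative bounded by 3 on ]0, 1], so v is Lipschitz and its total variation on [0, 1] is at most 3.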

Functions of bounded variation will play a fundamental role in the following domains:

  (a) Convergence of Fourier series;

  (b) Rectification of curves;

  (c) Integration;

  (d) Duality.

Let \(u : [0,1] \rightarrow \mathbb{R}\) be a continuous function. The length of the graph of u is defined by

$$\displaystyle{L(u) =\sup \left\{\sum _{j=0}^{k}\Bigl[(a_{j+1} - a_{j})^{2} + \bigl(u(a_{j+1}) - u(a_{j})\bigr)^{2}\Bigr]^{1/2} : k \in \mathbb{N},\ 0 = a_{0} < a_{1} < \ldots < a_{k+1} = 1\right\}.}$$

In 1887, in Volume III of the first edition of his Cours d’Analyse at the École Polytechnique, Jordan proved that L(u) is finite if and only if u is of bounded variation. The case of surfaces is much more delicate (see Sect. 10.3).
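As a purely numerical illustration of Jordan’s definition (the code below is a sketch; the function u(x) = x² and the uniform subdivisions are arbitrary choices), the lengths of inscribed polygons increase under refinement and approach the value \(\int _{0}^{1}\sqrt{1 + (u^{\prime}(x))^{2}}\,dx \approx 1.4789\) given by the classical formula recalled in Sect. 10.3.

```python
import math

def polygonal_length(u, k):
    """Length of the polygon inscribed in the graph of u over the uniform
    subdivision 0 = a_0 < a_1 < ... < a_k = 1 of [0, 1]."""
    nodes = [j / k for j in range(k + 1)]
    return sum(math.hypot(nodes[j + 1] - nodes[j], u(nodes[j + 1]) - u(nodes[j]))
               for j in range(k))

u = lambda x: x * x  # smooth, hence of bounded variation
for k in (4, 16, 64, 256):
    print(k, polygonal_length(u, k))  # increases toward about 1.4789
```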

In 1894 ([80]), Stieltjes defined a deep generalization of the integral associated to an increasing function φ:

More generally, let us consider the sum

$$\displaystyle{f(\xi _{1}){\bigl [\varphi (x_{1}) -\varphi (x_{0})\bigr ]} + f(\xi _{2}){\bigl [\varphi (x_{2}) -\varphi (x_{1})\bigr ]} +\ldots +f(\xi _{n}){\bigl [\varphi (x_{n}) -\varphi (x_{n-1})\bigr ]}.}$$

It will still have a limit, which we shall denote by

$$\displaystyle{\int _{a}^{b}f(u)d\varphi (u).}$$

We shall have to consider only some very simple cases like \(f(u) = {u}^{k}\), \(f(u) = \frac{1} {z+u}\), and there is no interest in giving the function f(u) its full generality. Thus it will suffice, as an example, to suppose the function f(u) continuous, and then the proof presents no difficulty, and we have no need to develop it, since it is done as in the ordinary case of a definite integral.

It is easy to extend Stieltjes’s definition to every function φ of bounded variation. Stieltjes breaks the reciprocity between integral and derivative.
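Two elementary computations, added here only to illustrate the definition, show what the Stieltjes integral contains. If φ is continuously differentiable, the mean value theorem applied to each difference \(\varphi (x_{i}) -\varphi (x_{i-1})\) gives, for continuous f,

$$\displaystyle{\int _{a}^{b}f(u)d\varphi (u) =\int _{a}^{b}f(u)\varphi ^{\prime}(u)du,}$$

so the ordinary integral is recovered. If instead φ is the step function equal to 0 for u < c and to 1 for u ≥ c, with a < c < b, every sum reduces to the single term coming from the subinterval containing c, and the limit is f(c): the integral becomes a point evaluation, which no ordinary integral represents.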

In 1903 ([32]), J. Hadamard characterized the continuous linear functionals on \(\mathcal{C}([a,b])\):

It is easy to reach this, following Weierstrass and Kirchhoff, and introducing a function F(x), with a finite number of maxima and minima and such that

$$\displaystyle{\int _{-\infty }^{+\infty }F(x)dx = 1;}$$

e.g., \(F(x) = \frac{1} {\sqrt{\pi }}{e}^{-{x}^{2} }\).

Starting then from the well-known identity

$$\displaystyle{\lim _{\mu =\pm \infty }\mu \int _{a}^{b}f(x)F[\mu (x - x_{0})]dx = f(x_{0}),\quad a < x_{0} < b,}$$

and assuming (as the authors quoted before) the operation U to be continuous (in the sense of Bourlet), it will suffice to define

$$\displaystyle{U[\mu F(\mu (x - x_{0}))] = \Phi (x_{0},\mu )}$$

to show that our operation could be represented as

$$\displaystyle{U[f(x)] =\lim _{\mu =\pm \infty }\int _{a}^{b}f(x)\Phi (x,\mu )dx.}$$

In 1909 ([61]), F. Riesz discovered a representation depending on only one function:

In the present note, we shall develop a new analytic expression of the linear operation, containing only one generating function.

Given the linear operation A[f(x)], we can determine a function of bounded variation α(x) such that for every continuous function f(x), we have

$$\displaystyle{A[f(x)] =\int _{ 0}^{1}f(x)d\alpha (x).}$$

Riesz’s theorem asserts that every continuous linear functional on \(\mathcal{C}([0,1])\) is representable by a Stieltjes integral.
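For instance (a standard example, not in Riesz’s note), the evaluation functional A[f(x)] = f(1∕2), which is linear and continuous on \(\mathcal{C}([0,1])\), is represented by the function of bounded variation α equal to 0 on [0, 1∕2[ and to 1 on [1∕2, 1], by the computation on step functions recalled above:

$$\displaystyle{f(1/2) =\int _{0}^{1}f(x)d\alpha (x).}$$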

10.2 Measure and Integral

The notions introduced are required by the solution of a problem, and, by their mere presence among the earlier notions, they in turn pose new problems.

Jean Cavaillès

In 1898, Emile Borel defined the measure of sets in his Leçons sur la théorie des fonctions:

The procedure that we have employed actually amounts to this: we have recognized that a definition of measure could be useful only if it had certain fundamental properties: we have stated these properties a priori, and we have used them to define the class of sets that we consider measurable.

Those essential properties that we summarize here, since we shall use them, are the following: The measure of a sum of a denumerable infinity of sets is equal to the sum of their measures; the measure of the difference of two sets is equal to the difference of their measures; the measure is never negative; every set with a nonzero measure is not denumerable. It is mainly this last property that we shall use. Besides, it is explicitly understood that we speak of measures only for those sets that we called measurable.

Of course, when we speak of the sum of several sets, we assume that no two of them have common points, and when we speak of their difference, we assume that one set contains all the points of the other.

Following Lebesgue,

The descriptive definition of measure stated by M. Borel is without doubt the first clear example of the use of actual infinity in mathematics.

However, Borel does not prove the existence of the measure!

The Lebesgue integral first appeared on 29 April 1901. In the note [42], Lebesgue proved the existence of the Borel measure as a restriction of the Lebesgue measure.

In the introduction of his thesis [43], Lebesgue stated his program:

In this work, I try to give definitions as general and precise as possible of some of the numbers considered in Analysis: definite integral, length of a curve, area of a surface.

He formulated the problem of the measure of sets:

We intend to assign to every bounded set a positive or zero number called its measure and satisfying the following conditions:

  1. There exist sets with nonzero measure.

  2. Two equal sets have equal measures.

  3. The measure of the sum of a finite number or of a countable infinity of sets, without common points, is the sum of the measures of those sets.

We will solve this problem of measure only for the sets that we will call measurable.

In his Leçons sur l’intégration et la recherche des fonctions primitives of 1904, see [45], Lebesgue formulated the problem of integration.

We intend to assign to every bounded function f(x) defined on a finite interval (a, b), positive, negative, or zero, a finite number \(\int _{a}^{b}f(x)dx\), which we call the integral of f(x) in (a, b) and which satisfies the following conditions:

  1. For every a, b, h, we have

    $$\displaystyle{\int _{a}^{b}f(x)dx =\int _{ a+h}^{b+h}f(x - h)dx.}$$

  2. For every a, b, c, we have

    $$\displaystyle{\int _{a}^{b}f(x)dx +\int _{ b}^{c}f(x)dx +\int _{ c}^{a}f(x)dx = 0.}$$

  3. $$\displaystyle{\int _{a}^{b}[f(x) +\varphi (x)]dx =\int _{ a}^{b}f(x)dx +\int _{ a}^{b}\varphi (x)dx.}$$

  4. If we have f ≧ 0 and b > a, we also have

    $$\displaystyle{\int _{a}^{b}f(x)dx \geqq 0.}$$

  5. We have

    $$\displaystyle{\int _{0}^{1}1 \times dx = 1.}$$

  6. If \(f_{n}(x)\) increases and converges to f(x), then the integral of \(f_{n}(x)\) converges to the integral of f(x).

Formulating the six conditions of the integration problem, we define the integral. This definition belongs to the class of those that could be called descriptive; in those definitions, we state the characteristic properties of the object we want to define. In the constructive definitions, we state which operations are to be done in order to obtain the object we want to define. Constructive definitions are more often used in Analysis; however, we sometimes use descriptive definitions; the definition of the integral, following Riemann, is constructive; the definition of primitive functions is descriptive.
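As an illustration of the descriptive method (a sketch, not Lebesgue’s own wording), the six conditions already determine the integral of the constant function 1. Condition 2 gives \(\int _{a}^{a}1\,dx = 0\) and \(\int _{b}^{a}1\,dx = -\int _{a}^{b}1\,dx\), hence \(\int _{a}^{b}1\,dx +\int _{b}^{c}1\,dx =\int _{a}^{c}1\,dx\); condition 1 shows that \(\int _{a}^{b}1\,dx\) depends only on b − a, say \(\phi (b - a)\). The function ϕ is therefore additive, nondecreasing by condition 4, and ϕ(1) = 1 by condition 5, so that

$$\displaystyle{\int _{a}^{b}1\,dx = b - a\qquad (a < b).}$$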

In 1906, in his thesis [23], Maurice Fréchet tried to extend the fundamental notions of analysis to abstract sets.

In this Mémoire we will use an absolutely general point of view that encompasses these different cases.

To this end, we shall say that a functional operation U is defined on a set E of elements of every kind (numbers, curves, points, etc.) when to every element A of E there corresponds a determined numerical value of U : U(A). The search for properties of those operations constitutes the object of the Functional Calculus.

Fréchet defined the notion of distance, which he called, in French, écart:

We can associate to every pair of elements A, B a number (A, B) ≥ 0, which we will call the distance of the two elements and which satisfies the following properties: (a) The distance (A, B) is zero only if A and B are identical. (b) If A, B, C are three arbitrary elements, we always have (A, B) ≤ (A, C) + (C, B).

In [24], Fréchet defined additive families of sets and additive functions of sets:

An additive family of sets is a collection of sets such that:

  1. If \(E_{1}\), \(E_{2}\) are two sets of this family, the set \(E_{1} - E_{2}\) of the elements of \(E_{1}\), if any, that are not in \(E_{2}\) also belongs to the family.

  2. If \(E_{1},E_{2},\ldots \) is a denumerable sequence of sets of this family, their sum, i.e., the set \(E_{1} + E_{2} + \cdots \) of elements belonging to at least one set of the sequence, also belongs to the family.

A set function f(E) defined on an additive family of sets is additive on that family if, \(E_{1},E_{2},\ldots \) being a denumerable sequence of disjoint sets of the family, i.e., without pairwise common elements, we have

$$\displaystyle{f(E_{1} + E_{2}+\ldots ) = f(E_{1}) + f(E_{2}) + \cdots \,.}$$

When the sequence is infinite, the second member obviously has to converge regardless of the order of the terms. Hence the series in the second member has to converge absolutely.

Fréchet defined the integral without using topology. Additive functions of sets will be called measures.

In [12], Daniell chose a different method. He introduced a space \(\mathcal{L}\) of elementary functions and an elementary integral

$$\displaystyle{\mathcal{L}\rightarrow \mathbb{R} : u\mapsto \int u\ d\mu }$$

satisfying the axioms of linearity, positivity, and monotone convergence.

The two axiomatics are equivalent if to Daniell’s axioms we add Stone’s axiom (1948):

$$\displaystyle{\mbox{ for every }u \in \mathcal{L},\min (u,1) \in \mathcal{L},}$$

or the axiom

$$\displaystyle{\mbox{ for every }u,v \in \mathcal{L},uv \in \mathcal{L}.}$$
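A typical example satisfying all of these axioms (recalled here only for orientation) is the space \(\mathcal{L} = \mathcal{C}_{c}(\mathbb{R})\) of continuous functions with compact support together with the Riemann integral: linearity and positivity are clear, monotone convergence follows from Dini’s theorem (a sequence \(u_{n} \downarrow 0\) in \(\mathcal{C}_{c}(\mathbb{R})\) converges uniformly on the support of \(u_{1}\)), and both min(u, 1) and uv again belong to \(\mathcal{L}\).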

The choice of primitive notions and axioms is rather arbitrary. There are no absolutely undefinable notions or unprovable propositions.

The axiomatization of integration by Fréchet opened the way to the axiomatization of probability by Kolmogorov in 1933. The unification of measure, integral, and probability was one of the greatest scientific achievements of the twentieth century.

In his thesis [5], Banach defined the complete normed spaces:

There exists an operation, called norm (we shall denote it by the symbol \(\vert \vert X\vert \vert\)), defined in the field E, having as an image the set of real numbers and satisfying the following conditions:

$$ \begin{array}{l} \vert \vert X\vert \vert \geq 0, \\ \vert \vert X\vert \vert = 0\ \mbox{ if and only if }\ X = \theta, \\ \vert \vert a \cdot X\vert \vert = \vert a\vert \cdot \vert \vert X\vert \vert, \\ \vert \vert X + Y \vert \vert \leq \vert \vert X\vert \vert + \vert \vert Y \vert \vert. \end{array} $$

If 1) \(\{X_{n}\}\) is a sequence of elements of E, 2) \( \displaystyle{\lim _{r,p\rightarrow \infty }\vert \vert X_{r} - X_{p}\vert \vert = 0} \), then there exists an element X such that

$$\displaystyle{\lim _{n\rightarrow \infty }\vert \vert X - X_{n}\vert \vert = 0.}$$
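The space \(\mathcal{C}([0,1])\) of Sect. 10.1 is the basic example (recalled here for orientation): with the norm \(\vert \vert u\vert \vert =\max \{ \vert u(x)\vert : 0 \leq x \leq 1\}\), the four conditions are immediate, and the completeness condition holds because a uniformly Cauchy sequence of continuous functions converges uniformly to a continuous limit.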

Banach emphasized the efficiency of the axiomatic method:

The present work intends to prove theorems valid for different functional fields, which I will specify in the sequel. However, in order not to be forced to prove them individually for every particular field, a tedious task, I chose a different way: I consider in some general way sets of elements with some axiomatic properties, I deduce theorems, and I prove afterward that the axioms are valid for every specific functional field.

The fundamental book of Banach ([6]), Théorie des opérations linéaires, was published in 1932. Banach deduces Riesz’s representation theorem from the Hahn–Banach theorem.

The original proof of the Hahn–Banach theorem holds in every real vector space. Let \(F : X \rightarrow \mathbb{R}\) be a positively homogeneous convex function and let \(f : Z \rightarrow \mathbb{R}\) be a linear function such that f ≤ F on the subspace Z of X. By the well-ordering theorem, the set X ∖ Z can be so ordered that each nonempty subset has a least element. It follows then, from Lemma 4.1.3, by transfinite induction, that there exists \(g : X \rightarrow \mathbb{R}\) such that g ≤ F on X and \(g\big\vert _{Z} = f\).

Let us recall the principle of transfinite induction (see [72]). Let \(\mathcal{B}\) be a subset of a well-ordered set \(\mathcal{A}\) such that

$$\displaystyle{\{y \in \mathcal{A} : y < x\} \subset \mathcal{B}\Rightarrow x \in \mathcal{B}.}$$

Then \(\mathcal{B} = \mathcal{A}\).

In set theory, the well-ordering theorem is equivalent to the axiom of choice and to Zorn’s lemma. In 1905, Vitali proved the existence of a subset of the real line that is not Lebesgue measurable. His proof depends on the axiom of choice.
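A brief sketch of Vitali’s construction: declare x ∼ y when \(x - y \in \mathbb{Q}\), and use the axiom of choice to select one representative of each equivalence class in [0, 1]; let V be the set of representatives. For distinct rationals q, the translates V + q are pairwise disjoint, and

$$\displaystyle{[0,1] \subset \bigcup _{q\in \mathbb{Q}\cap [-1,1]}(V + q) \subset [-1,2].}$$

If V were Lebesgue measurable, countable additivity and translation invariance would force the measure of the union to be either 0 or + ∞, contradicting the fact that it lies between 1 and 3.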

10.3 Differential Calculus

The activity of mathematicians is an experimental activity.

Jean Cavaillès

Whereas the integral calculus transforms itself into an axiomatic theory, the differential calculus fits into the general theory of distributions.

The fundamental notions are

  • Weak solutions;

  • Weak derivatives;

  • Functions of bounded variation;

  • Distributions.

In [60], Poincaré defined the notion of weak solution of a boundary value problem:

Let u be a function satisfying the following conditions:

$$\displaystyle{\frac{du} {dn} + h\ u =\varphi,}$$
$$\displaystyle{\Delta u + f = 0.}$$

Now let v be an arbitrary function, which I assume only to be continuous together with its first-order derivatives. We shall have

$$\displaystyle{\int \left (v\frac{du} {dn} - u\frac{dv} {dn}\right )d\omega =\int (v\Delta \ u - u\Delta \ v)d\tau,}$$

so that

$$\displaystyle{\int v\ f\ d\tau +\int u\Delta \ v\ d\tau +\int v\varphi \ d\omega =\int u\left (h\ v + \frac{dv} {dn}\right )d\omega.}$$

Condition (5) is thus a consequence of condition (3).

Conversely, if condition (5) is satisfied for every function v, condition (3) will also be satisfied, provided that u and \(\frac{du} {dn}\) are finite, well-defined, and continuous functions.

But it can happen that in some cases, we are unaware that \(\frac{du} {dn}\) is a well-defined and continuous function; we cannot assert then that condition (5) entails condition (3), and it is even possible that condition (3) is meaningless.

Poincaré named condition (5) a modified condition and asserted (p. 121),

It is obviously equivalent to condition (3) from the physical point of view.

This Mémoire of Poincaré contains (p. 70) the first example of an integral inequality between a function and its derivatives:

Let V be an arbitrary function of x, y, z; define:

$$\displaystyle{A =\int {V }^{2}d\tau,\quad B =\int \left [{\left (\frac{dV } {dx} \right )}^{2} +{ \left (\frac{dV } {dy} \right )}^{2} +{ \left (\frac{dV } {dz} \right )}^{2}\right ]d\tau.}$$

For brevity, I will write:

$$\displaystyle{B =\int \sum { \left (\frac{dV } {dx} \right )}^{2}d\tau.}$$

I assume first that V satisfies the condition:

$$\displaystyle{\int V \ d\tau = 0}$$

and I intend to estimate the lower limit of the quotient \(\frac{B} {A}\).
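A one-dimensional illustration of such an estimate (for smooth 2π-periodic functions, which is not Poincaré’s setting but makes the mechanism transparent): if \(\int _{0}^{2\pi }V \,dx = 0\), expanding V in a Fourier series \(V =\sum _{n\neq 0}c_{n}{e}^{inx}\) and using Parseval’s identity gives

$$\displaystyle{B =\int _{0}^{2\pi }{(V ^{\prime})}^{2}dx = 2\pi \sum _{n\neq 0}{n}^{2}\vert c_{n}{\vert }^{2} \geq 2\pi \sum _{n\neq 0}\vert c_{n}{\vert }^{2} =\int _{0}^{2\pi }{V }^{2}dx = A,}$$

so that in this case the quotient B∕A is bounded below by 1.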

The maximum principle is stated on p. 92. Poincaré’s principle appears in [59] for the formal construction of the eigenvalues and eigenfunctions of the Laplacian. In [60], Poincaré proved the existence of eigenvalues (for Dirichlet boundary conditions) using the theory of meromorphic functions (see [50]).

Let us recall that we denote by L(u) the length of the graph of the continuous function \(u : [0,1] \rightarrow \mathbb{R}\). Following Jordan, \(L(u) < \infty \) if and only if u is of bounded variation. It follows then from a theorem due to Lebesgue that u is almost everywhere differentiable on [0, 1]. In [82], Tonelli proved a theorem equivalent to

$$\displaystyle{L(u) =\int _{ 0}^{1}\sqrt{1 + {(u^{\prime}(x))}^{2}}\ dx\Longleftrightarrow u \in {W}^{1,1}(]0,1[).}$$

A counterexample due to Schwarz, published in 1882 in the Cours d’Analyse of Hermite, shows that it is not possible to extend the definition of length due to Jordan to surfaces. Let z = u(x, y) be a nonparametric surface, with u continuous on [0, 1] ×[0, 1]. Let Ω = ]0, 1[ ×]0, 1[ and define, on \(X = \mathcal{C}(\overline{\Omega })\), the distance

$$\displaystyle{d(u,v) =\max \{ \vert u(x,y) - v(x,y)\vert : (x,y) \in \overline{\Omega }\}.}$$

The space of quasilinear functions on \(\overline{\Omega }\) is defined by

$$\displaystyle{\begin{array}{ll} Y & =\{ u \in X : \mbox{ there exists a triangulation $\tau $ of $\Omega $}\\ &\quad \quad \quad \mbox{ such that, for every $T \in \tau $, $u\big\vert _{ T}$ is affine}\}. \end{array} }$$

The graph of u ∈ Y consists of triangles. The sum of the areas of those triangles is called the elementary area of the graph of u and is denoted by B(u).

The Lebesgue area of the graph of u is defined by

$$\displaystyle{A(u) =\inf \left \{\liminf _{n\rightarrow \infty }B(u_{n}) : (u_{n}) \subset Y \mbox{ and }d(u_{n},u) \rightarrow 0,\quad n \rightarrow \infty \right \}.\qquad (\ast)}$$

In [83] (see also [53]), Tonelli stated two theorems equivalent to

$$\displaystyle{\begin{array}{ll} &A(u) < \infty \Longleftrightarrow \vert \vert Du\vert \vert _{\Omega } < \infty,\\ \\ &A(u) = \int _{\Omega }\sqrt{1 +{ \left (\frac{\partial u} {\partial x}\right )}^{2} +{ \left (\frac{\partial u} {\partial y}\right )}^{2}}\ dx\ dy\Longleftrightarrow u \in {W}^{1,1}(\Omega ). \end{array} }$$

Lebesgue area is a lower semicontinuous function on X. It extends the elementary area: for every u ∈ Y, A(u) = B(u).

In [25], Fréchet observed that Lebesgue’s definition allows one to extend lower semicontinuous functions. Let Y be a dense subset of a metric space X and let \(B : Y \rightarrow [0,+\infty ]\) be an l.s.c. function. The function A defined by (∗) is an l.s.c. extension of B on X such that for every l.s.c. extension C of B on X and for every u ∈ X, C(u) ≤ A(u).

In [48], Leray defined the weak derivatives of \(L^{2}\) functions, and called them quasi-dérivées.

In [75], announced in [74] and translated in [78], Sobolev defined the distributions of finite order on \(\mathbb{R}^{N}\), which he called fonctionnelles. (A distribution f on \(\mathbb{R}^{N}\) is of order k if for every sequence \((u_{n}) \subset \mathcal{D}({\mathbb{R}}^{N})\) such that the supports of the \(u_{n}\) are contained in some compact set and such that \( \displaystyle{\sup _{\vert \alpha \vert \leq k}\vert \vert {\partial }^{\alpha }u_{n}\vert \vert _{\infty }\rightarrow 0}\), \(n \rightarrow \infty \), we have \(\langle f,u_{n}\rangle \rightarrow 0\), \(n \rightarrow \infty \).) Sobolev defined the derivative of a fonctionnelle by duality and associated a fonctionnelle to every locally integrable function on \(\mathbb{R}^{N}\).

Without reference to his theory of fonctionnelles, Sobolev defined in [77] the weak derivatives of integrable functions. Regularization by convolution is due to Leray for \(L^{2}\) functions (see [48]) and to Sobolev for \(L^{p}\) functions (see [77]).
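A small numerical sketch of regularization by convolution (purely illustrative: the bump kernel, the grid, and the test function are arbitrary choices, not those of Leray or Sobolev). Convolving u(x) = |x|, whose weak derivative is sign(x), with a smooth bump of width ε produces smooth functions whose derivatives agree with sign(x) away from the corner at 0.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 4001)
dx = x[1] - x[0]
u = np.abs(x)                          # weak derivative: sign(x)

def mollifier(eps):
    """Smooth bump supported in [-eps, eps], sampled on the grid, with unit mass."""
    inside = np.abs(x) < eps
    rho = np.zeros_like(x)
    rho[inside] = np.exp(-1.0 / (1.0 - (x[inside] / eps) ** 2))
    return rho / (rho.sum() * dx)

for eps in (0.5, 0.2, 0.05):
    u_eps = np.convolve(u, mollifier(eps), mode="same") * dx   # smooth approximation of u
    du_eps = np.gradient(u_eps, x)                             # its classical derivative
    mask = (np.abs(x) > 2 * eps) & (np.abs(x) < 1.4)           # away from the corner and the boundary
    print(eps, np.max(np.abs(du_eps[mask] - np.sign(x[mask]))))  # small: du_eps is close to sign(x) there
```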

In [69], Laurent Schwartz defined general distributions. In [70], he defined the tempered distributions and their Fourier transform. The treatise [71] is a masterful exposition of distribution theory.

Let \(g : \mathbb{R} \rightarrow \mathbb{R}\) be a function of bounded variation on every bounded interval. The formula of integration by parts shows that for every \(u \in \mathcal{D}(\mathbb{R})\),

$$\displaystyle{\int _{\mathbb{R}}u\ d\ g = -\int _{\mathbb{R}}u^{\prime}g\ dx.}$$

The Stieltjes integral with respect to g is nothing but the derivative of g in the sense of distributions! Riesz’s representation theorem asserts that every continuous linear functional on \(\mathcal{C}([0,1])\) is the derivative in the sense of distributions of a function of bounded variation.
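The simplest example (standard, and given here only as an illustration): take g equal to 0 on ]−∞, 0[ and to 1 on [0, +∞[. For every \(u \in \mathcal{D}(\mathbb{R})\),

$$\displaystyle{-\int _{\mathbb{R}}u^{\prime}g\,dx = -\int _{0}^{+\infty }u^{\prime}(x)dx = u(0),}$$

so the derivative of g in the sense of distributions is the Dirac measure at 0, in agreement with the value u(0) of the Stieltjes integral \(\int _{\mathbb{R}}u\,dg\).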

10.4 Comments

Some general historical references are [15, 19, 29]. We recommend also [46] on Jordan, [52] on Hadamard, [81] on Fréchet, and [38] on Banach.