A Modern Theory?

Gottfried Wilhelm Leibniz invented a theory and the language of this theory is still used in modern mathematics. Today, just like him, we write differentials as “dx”, “dy” and integrals as “\(\int y\,{\mathit {\,dx}\,}\)”.

However, today these signifiers do not have the same meaning as they had for their daring inventor. (Let’s also remember that the name “integral” was not at all invented by Leibniz but by his younger colleague Johann Bernoulli.) Whereas Leibniz thought of areas and lines as purely geometrical quantities, these are seen today as much more general concepts. Nowadays, the differential “dx” is often only a symbol of calculation without any material significance, whereas Leibniz thought it, as we have seen, to be a “variable (geometrical) quantity”, “decreasing indefinitely”, “below any given quantity”, and “eventually vanishing”.

The transition from Descartes to Leibniz showed us a complete semantic change of the “x” in the formulae! Descartes thought the “x” to be an unknown “number” which was to be calculated—but essentially for him it signified a definite length of a line segment as he held arithmetic at bottom to be the same as geometry (although in some new version, e.g. a “unit” had to be defined). To sum up, Descartes thought of the “x” as a “number” or as a “line segment”—whereas Leibniz made “x” to be a “variable quantity”, or in short: a “variable”.

In the following chapter we shall learn in which way Leibniz’ concept of differential was developed and changed by Johann Bernoulli. We shall come to this later and note for now:

So one has to be careful not to superimpose our modern understanding on an old text. At least, we should try not to, as it is all too easy to fall into ingrained habits. Let’s see!

Leibniz Knew His Theory Was Descended from an Old Tradition

Leibniz formulated the foundations of his new theory during the years 1674–76 in Paris. Naturally, he continued working on them. His first publication on the topic, in 1684, was hard to understand (we mentioned this on p. 32), yet it alerted two very capable mathematicians: the brothers Jacob and Johann Bernoulli. Subsequently, this triumvirate developed Leibniz’ fundamentals to a great extent.

As late as 1692, Leibniz called his system “our analysis of indivisibles”. Thus, nearly twenty years after his invention he still stuck to his original methodology: the “method of indivisibles”. What this is about will be indicated in this chapter. Only indicated, for the details are too intricate to be explained here. But the basic ideas are of great importance: we need to realize that Leibniz did not create his theory from scratch. The studies of former scholars opened up the path toward Leibniz’ creation. Nevertheless it was he who developed these studies further and in another direction, which turned out to be so very successful. In a similar way, the same holds for Newton.

Before we turn toward the concept of “indivisible”, we have to focus on an issue which is even older, and which has the name “continuum”. Today we still work with the notion of “continuum” but do not use “indivisible”.

The Continuum and Why It Does Not Consist of Points

What Is the Continuum?

The continuum is cohesive, unbroken, connected. Prototypes are the line, the area, the volume as well as the course of time. Thus far, all is well and easy.

But notice, although the continuum is cohesive, it can be divided.

The course of time is divided by the present into past and future. The line is divided by the point into “left of” and “right of” it. The area is divided by a whole line. This, too, is self-evident and obvious.

For the future we register:

And it is also evident that what divides the continuum lies inside the continuum. The present is a moment in time. The point, the area are limits within the continuum. In short, what divides the continuum belongs to it.

How Do Continuum and Point Interact?

Now the question: What is the proper relation between point and continuum?

Clearly, the point belongs to the continuum. The moment belongs to the course of time, etc. But the other way round? Do only points make the continuum?

Maybe you will answer this question with “Obviously!”. What else should exist within the continuum?

But it is not that simple! Caution: point and continuum differ from each other in an essential aspect: the continuum can be divided, as we have just seen, but the point cannot be divided.

This is the definition of “point”. It is already in Euclid. For him it is the first definition, the very first sentence of the Elements: “A point is that which has no part.”

But what has no parts, clearly cannot be divided.

Therefore, continuum and point are essentially different: the continuum can be divided; the point cannot.

What follows from this?

The Continuum Does Not Consist of Points

The pair of concepts “part/whole” was already the subject of the earliest and most elementary aspects of philosophical thought in occidental culture. Early philosophers used these notions to speculate about the metaphysical nature of the world. One of the bedrocks of these ideas is the following truism:

This principle remained unchallenged in Western philosophy up to the beginning of the twentieth century, or until the development of set-theory.

As long as we accept this foundational axiom of philosophical thought (at least for Western culture), we arrive at the result above and have to conclude the theorem named in the heading: the continuum does not consist of points.

Let us recapitulate the proof! It is made up of four steps:

Fact 1: The continuum can be divided.

Fact 2: The point cannot be divided.

Fact 3: The continuum has a quality which the point does not have: divisibility. This quality is an essential one because for the continuum it is essential that it can be divided. Everything which is extended in space or persisting in time can be divided.

Fact 4: Consequently, the point cannot be “part” of the continuum. End of proof!

We have shown conclusively that: “The continuum does not consist of points (or nows)”. It was not terribly difficult to prove our initial statement! Or was it?

For thousands of years this way of thought has been accepted within occidental culture until, some 150 years ago, it became outdated. It was held no longer suitable for the new times and thrown onto the rubbish heap. We shall return to this later, starting from Chap. 12. However, one fact can already be mentioned here: the later rise of set-theory quickly did away with this age-old way of thinking.

But we are not done yet! There are still some further developments of analysis and, most importantly, a short retrospective regarding the late Middle Ages, to which we will turn now.

The Indivisible

We remember: Leibniz, as late as 1692, spoke of his theory as “our analysis of indivisibles”. This Latin-based name was taken over into the English language; in German it sounds a little arcane today. As long as Latin remained the language of scholars (and theologians), such notions were widely accepted.

Thomas Aquinas

Thomas Aquinas (c. 1225–74) was a philosopher and theologian and one of the most famous and influential scholastics of the late Middle Ages. Aquinas used the notion “indivisible” when he spoke of a point or an instant of the spatial or temporal continuum: the indivisible is the point on the line or the present moment in time.

Nicholas of Cusa

There were other philosophers for whom the concept of the indivisible played an important role. One of the most illustrious is Nicholas of Cusa (1401–64) whose Latin name is Nicolaus Cusanus.

Although Leibniz lived about 250 years later than Cusanus, there are many similarities between the two men. Both were scholars of jurisprudence as well as diplomats and both were keen travellers. Nicholas of Cusa was a modern scholar of his times. Just as Leibniz much later, Cusanus was a pioneer of new thought. One of his revolutionary mottoes was: “Man is the measure of all things!” In reawakening this doctrine (usually attributed to Protagoras) he opposed the pious tradition of his contemporaries.

He formulated a principle of thought which is called “Doctrine of Coincidence”. Somewhat shortened, it says: reason is the wholeness of those opposites (including contradictions!), which are incompatible to our understanding. At once we are reminded of Leibniz’ Law of Continuity (p. 34).

Those mathematicians who think that a wholeness of contradictions is an evil trick may be referred to the quotation of a contemporary philosopher (who is a great authority on Cusanus), Kurt Flasch:

Who once got acquainted with the contradiction that our thinking is, will grasp that the Law of Noncontradiction cannot be a philosophical criterion of truthfulness.

Because, says Flasch:

Thinking is rest as well as motion; both are its qualities; who tries to make distinctions between them, in order to get rid of the contradiction, damages the elementariness of thinking.

(Caution: This “elementariness” does not mean “simplicity” but is to be understood as the Leibnizian designation: “elementariness” means “without parts”; see p. 21.)

Let me present at least one sentence from Nicholas of Cusa on the indivisibles, to be found in his Conjectures written ca. 1440/44:

However, reason is of such a lucid nature that it grasps, so to speak, the whole sphere in its indivisible center.

The “indivisible center” is, in the Latin original, the “puncto centrali indivisibili”.

Buonaventura Cavalieri

The (younger) contemporary of Galilei, Buonaventura Cavalieri (1598?–1647), made the “indivisible” a principal notion of his mathematical theory of the calculation of areas. This theory is explained in two books. Cavalieri came too early and could not make use of the language of formulae which Descartes was to publish in 1637, and consequently, his writings are not easily understood by us.

However, a historian of mathematics took on the challenge of thoroughly deciphering Cavalieri’s writings. Her representation will be my source in what follows.

  1. In a letter to Galilei from 2nd October 1634, Cavalieri wrote clearly: “I absolutely do not declare to compose the continuum from indivisibles”.

  2. It is Cavalieri’s principal idea to compare areas and to draw conclusions from these comparisons regarding the magnitudes of those areas.

He traverses a “ruler” through two areas and compares what happens. While doing so he is interested in a concept invented by him and called “all the lines”. (In case of an area it is the line which is the decisive “indivisible”.)

The line IK above is the ruler. The above plane moves downwards. Then the two hatched rectangles signify “all the lines”, the rectangle KM in the case of the “straight traverse” and the rectangle KO in case of the “oblique traverse”. And these two collections “all the lines” were considered by Cavalieri to be equal (Fig. 4.1):

Fig. 4.1 Cavalieri: straight and oblique traverse (Exercitationes 1647, p. 15)

$$\displaystyle \begin{aligned}\mathcal O_{KM}(l)_{\text{ straight traverse}}= \mathcal O_{KO}(l)_{\text{ oblique traverse}} \end{aligned} $$

Of course, this does not mean that the areas of KM and KO are equal. The reason is that the “traverse” of the “ruler” differs in both cases: in case of KM it is “straight”, but in case of KO it is “oblique”. It is eminently important to compare both “traverses” with each other, that is to say, their ratio.

Nowadays we describe this ratio with the help of the sine of the angle of inclination of the rectangle KO. Cavalieri does not do this.

It is crucial that Cavalieri does not say that the collection “all the lines” makes up the area. This would be nonsense. (We will prove this below!) Instead, Cavalieri takes the ratio of two of those collections:

$$\displaystyle \begin{aligned} \mathcal O_{KM}(l)_{\text{ }}: \mathcal O_{KO}(l)_{\text{ }} \qquad \text{or more generally}\qquad \mathcal O_{F_1}(l)_{\text{ }}: \mathcal O_{F_2}(l)_{\text{ }}\,. \end{aligned} $$

He compares only this ratio to the ratio of the considered areas and that only in case both areas belong to the same plane. Consequently, the case of the “oblique traverse” is excluded. Then he has:

$$\displaystyle \begin{aligned}\ \mathcal O_{F_1}(l)_{\text{ }} : \mathcal O_{F_2}(l)_{\text{ }}\ =\ F_1 : F_2\ \,.\end{aligned} $$

It follows that the ratio of the collections “all the lines” of the two surfaces F1 and F2 is the same as the ratio of their areas.

Cavalieri proves this last equality in all detail, i.e. according to Euclid’s standards.

We will not go into these details here and just accept the result of this research: by this means Cavalieri succeeded, with all mathematical rigour, in calculating some intricately formed areas. (We quietly pass over the fact that he had to make up his mind anew when faced with differently shaped areas. Leibniz enabled us to deal with this much better.)

Our only aim was to present the principle invented by Cavalieri. The picture shows the uninteresting case where the areas of two rectangles are to be determined. In this case we obviously do not need a new method, for we know: this area is length times width. Cavalieri’s method is only of interest if more complicated areas are to be found.
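For readers who like to experiment, here is a minimal sketch of the ratio statement for a slightly less trivial pair of figures in the same plane. It rests on assumptions that are ours, not Cavalieri's: the collection “all the lines” is replaced by a finite sum of chords cut at equally spaced positions of the ruler, and the names `all_the_lines`, `rectangle_chord` and `triangle_chord` are purely illustrative. Cavalieri himself did not sum lines; he only compared collections.

```python
from fractions import Fraction


def all_the_lines(chord, height, n):
    """Finite stand-in for a collection "all the lines".

    The ruler is moved through the figure in n equal steps parallel to the
    base; at the midpoint of each step we record the chord it cuts out.
    """
    return sum(
        chord(Fraction(2 * i - 1, 2 * n) * height) for i in range(1, n + 1)
    )


def rectangle_chord(b):
    """Chord of a rectangle with base b: the same at every height."""
    return lambda y: Fraction(b)


def triangle_chord(b, h):
    """Chord of a triangle with base b and height h: shrinks linearly to 0."""
    return lambda y: Fraction(b) * (1 - y / Fraction(h))


if __name__ == "__main__":
    b, h, n = 4, 3, 50
    tri = all_the_lines(triangle_chord(b, h), h, n)
    rect = all_the_lines(rectangle_chord(b), h, n)
    # Ratio of the two collections versus ratio of the two areas.
    print(tri / rect, Fraction(b * h, 2) / (b * h))
```

Only the ratio of the two sums is compared with the ratio of the areas. Neither sum on its own is claimed to be an area, which is exactly the distinction Cavalieri insisted on.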

Evangelista Torricelli

Another contemporary of Galilei and Cavalieri is Evangelista Torricelli. At first, Torricelli refused to accept Cavalieri’s method; but later on he was thrilled by it and thus subscribed to it.

However, Torricelli misunderstood Cavalieri’s method. For, contrary to him, he asserted that Cavalieri’s strange collections “all the lines” were thought to be identical to the area. In other words, Torricelli used just that equation which Cavalieri painstakingly shunned, namely

$$\displaystyle \begin{aligned} F=\mathcal O_{F}(l)_{\text{ }} \end{aligned}$$
(banned equation!)

And as Torricelli intensely promulgated this “adopted” method as Cavalieri’s (be it out of ignorance or on purpose), he brought Cavalieri into discredit.

Why Are “All the Lines” Not the Area?

That the last equation is nonsense (and, therefore, was rightfully avoided by Cavalieri) can easily be proved.

We take a rectangle ABCD with its diagonal and conclude:

  1. Each line EF corresponds to a line FG.

  2. Each line EF and each corresponding line FG have the ratio EF : FG, i.e. the ratio AB : BC.

  3. But if we have \(\mathcal O_{F}(l)_{\text{ }}=F,\) it follows that the areas of the two large triangles ADC and ABC must have the ratio AB : BC.

  4. But they are obviously equal and thus we reach a contradiction!

As the first two statements are established facts, the error must arise in the third step. Consequently, the equation \(\mathcal O_{F}(l)_{\text{ }}=F\) must be wrong.

Torricelli’s method is of no use! But the reason is not his handling of “indivisibles”. (For Cavalieri relied on indivisibles and got a working method.) Instead, Torricelli made the wrong usage of indivisibles. Wrong means: he identified “all indivisibles” with the “area”. He simply ignored the fact that the continuum does not consist of indivisibles!

Torricelli’s mistake disappears if one does not compose the area from lines but from (very small) pieces of an area instead. The piece EFGG′F′E′ consists of two parts with equal areas, divided by FF′. (It is the difference of the larger rectangle AG′F′E′ and the smaller one AGFE, both halved by a diagonal.) Now, if the sides of these parts, EE′ and GG′, are infinitely small, i.e. if they are taken as indivisibles, then the equality of the two large triangles does follow!
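The contrast between “lines” and “very small pieces of an area” can be checked with a short calculation. The following is only a sketch under assumptions of our own: the rectangle is placed with A = (0,0), B = (w,0), C = (w,h), D = (0,h); F runs along the diagonal AC; E lies on AD and G on AB; and the infinitely many lines are replaced by n of them. The function names are hypothetical.

```python
from fractions import Fraction


def line_sums(w, h, n):
    """Add up n lines EF (filling triangle ADC) and n lines FG (filling ABC).

    For F = (t*w, t*h) on the diagonal, EF has length t*w and FG has length t*h.
    """
    ts = [Fraction(i, n) for i in range(1, n + 1)]
    ef = sum(t * w for t in ts)
    fg = sum(t * h for t in ts)
    return ef, fg


def strip_sums(w, h, n):
    """Replace every line by a thin strip: its length times its own thickness.

    The strips along EF have thickness h/n, those along FG have thickness w/n.
    """
    mids = [Fraction(2 * i - 1, 2 * n) for i in range(1, n + 1)]
    adc = sum((t * w) * (Fraction(h) / n) for t in mids)
    abc = sum((t * h) * (Fraction(w) / n) for t in mids)
    return adc, abc


if __name__ == "__main__":
    w, h, n = 3, 2, 100
    ef, fg = line_sums(w, h, n)
    adc, abc = strip_sums(w, h, n)
    print(ef / fg)   # ratio of the bare line sums: AB : BC
    print(adc, abc)  # the strip sums agree: each is half the rectangle
```

The bare lines carry no thickness, so their sums stand in the ratio AB : BC however many are taken. Once each line is given its proper thickness, the two triangles come out equal, as they must.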

By the way: Torricelli knew about this problem!

Newton’s Method of Fluxions

It would be completely unjustified to give an overview of Leibniz’ calculus and not to say a word about Isaac Newton (1643–1727), especially as Newton invented his method about ten years prior to Leibniz.

However, Newton’s formulation of his method was far less clear than Leibniz’. Besides, it is much more specific than the calculus. Therefore, we will only deal very briefly with Newton’s method.

Newton’s Method

Newton, too, worked in his manuscripts with “indivisibles”. Usually he called them “infinitely small lines” and used them to get new results, as a method of invention.

But when proving his results in his publications, he carefully tried to avoid these notions.

An Example

Since 1981, all the working papers of Newton have been published, including those which he did not wish to be published. Therefore, everybody has the chance today to witness how he was working: The Mathematical Papers of Isaac Newton, volumes 1 to 8.

In the following I present an example from his originally unpublished papers that illustrates how Newton truly worked. To make things easier, I simplify Newton’s example; but his method is preserved. Newton starts with an equation like

$$\displaystyle \begin{aligned} x^2-ax+a^2=0\,.\end{aligned} $$

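Then he continues: let x be a “fluent” quantity with velocity m. During the infinitely small interval of time o, x will become x + mo. (Because length is velocity times duration.) In the equation, x + mo may be substituted in place of x:

$$\displaystyle \begin{aligned} (x+mo)^2-a\cdot (x+mo)+a^2 = x^2+2\cdot x\cdot mo+(mo)^2-ax-a\cdot mo+a^2=0\,.\end{aligned} $$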

When the terms of the first equation are erased, we get:

$$\displaystyle \begin{aligned} 2\cdot x\cdot mo+(mo)^2-a\cdot mo =0\,.\end{aligned} $$

Newton divides everything by o and gets

$$\displaystyle \begin{aligned} 2\cdot x\cdot m+m^2o-a\cdot m=0\,.\end{aligned} $$

And then he writes boldly:

Since o is supposed to be infinitely small, terms which have it as a factor will be equivalent to nothing in respect to the others. I, therefore, cast them out and there remains

$$\displaystyle \begin{aligned} 2\cdot x\cdot m -a\cdot m=0\,\qquad \text{or more simply:}\qquad 2x-a=0\,.\end{aligned} $$

Every physicist knows: Newton’s result is correct. But in regard to mathematical or philosophical standards, Newton’s method leaves much to be desired. First he divides by the quantity o (and therefore implicitly assumes it to be ≠ 0, because a division by 0 is forbidden), and thereafter he acts as if o = 0!

According to mathematics or logic, such reasoning cannot be justified; it is acceptable only by its success.
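The two incompatible roles of o can be made visible in a few lines of code. This is a minimal sketch for the simplified example above, not a reconstruction of Newton's own working; the function names are hypothetical, and exact fractions are an assumption chosen so that no rounding hides the leftover term.

```python
from fractions import Fraction


def f(x, a):
    """Left-hand side of the simplified equation x^2 - a*x + a^2 = 0."""
    return x * x - a * x + a * a


def increment_quotient(x, a, m, o):
    """Substitute x + m*o for x, erase the original terms, divide by o.

    The division is only allowed for o != 0, which is exactly the
    assumption Newton makes implicitly.
    """
    if o == 0:
        raise ZeroDivisionError("o must be nonzero before dividing by it")
    return (f(x + m * o, a) - f(x, a)) / o


def cast_out(x, a, m):
    """What remains once every term still containing o is thrown away."""
    return 2 * x * m - a * m


if __name__ == "__main__":
    x, a, m = Fraction(3), Fraction(5), Fraction(2)
    for o in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
        q = increment_quotient(x, a, m, o)
        # The leftover is exactly m^2 * o: small, but never zero.
        print(o, q, q - cast_out(x, a, m))
```

For every nonzero o the quotient differs from 2xm − am by exactly m²o. The result Newton keeps is therefore never reached for any admissible o, which is the logical gap described above.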

Nearly one hundred years were to pass, till mathematics succeeded in rendering this crooked reasoning logically sound. Today it is known that Leibniz faced the same problem—and that he solved it unobjectionably with the help of an intricate idea (pp. 32f).

Fluxions and Fluents

Newton calls his variable quantities “fluents” and their velocities “fluxions”. The velocity of the variable x is thus its “fluxion”, which he sometimes wrote as “\(\dot x\,\)”.

Generally speaking, Newton’s first problem is to determine the velocity of a known variable quantity. Newton’s second problem is the reversal of this: to determine the distance covered when the velocity of the moving object is known.

It is easy to express this in Leibniz’ own language:

  1. If y is given, determine \(\frac {{\mathit {\,dy}\,}}{{\mathit {\,dt}\,}}\).

  2. If \(\frac {{\mathit {\,dy}\,}}{{\mathit {\,dt}\,}}\) is given, determine y.

This is due to the fact that Leibniz writes down all the involved quantities. (Just as Descartes then demanded!) Clearly, “velocity” comes along with “time”, and, therefore, Leibniz explicitly wrote it down as t. But Newton did not!

At best, time is hidden deep in the notation “\(\dot x\,\)”. Each physicist today can immediately translate this into Leibniz’ language:

$$\displaystyle \begin{aligned} \dot x = \frac{{\mathit{\,dx}\,}}{{\mathit{\,dt}\,}}\,.\end{aligned} $$

However, Newton has no knowledge of this language and instead writes “m” (as we have just seen) or “\(\dot x\,\)”.

This reveals the second shortcoming of Newton’s method: it is conceptually too restricted.

Newton’s method demands that every variable depends on time. But not every problem in the world is of this kind. Sometimes a variable does not depend on time alone but, in addition, on some other quantity: temperature, pressure, height, etc.

It happens that one variable depends on two quantities. For Leibniz there is no difficulty: he simply writes all the variables of the problem down (as was demanded by Descartes in his particular problems).

Without this approach, Newton gets into severe difficulties: in the case of two variables he has to trace both of them back to time, then solve the problem, and finally eliminate time again.

This is possible. But it is not simple. It is of little surprise then that British mathematics fell behind in the further development of calculus: Leibniz’ notations were more general and thus more applicable.