
We appear to see a three-dimensional world, but this vision is based on two-dimensional images. In this paper, I consider the historically most important theories of how visual perception is made spatial in the cognitive processing of the sensory input to the eye. In all of them, active engagement of the mind is necessary in order to make visual perception genuinely spatial. The three main models are the following:

  1. The brain receives separated three-dimensional visual images, which the mind takes to represent a consistently organized and unified three-dimensional world (Alhacen).

  2. The brain receives a unified two-dimensional visual image, which the mind innately takes to represent a three-dimensional world (Descartes).

  3. The brain receives two-dimensional visual images, which the mind associates to the experience of a three-dimensional world given by the proprioceptive senses (Berkeley).

The first model can be found in medieval authors. My example is the Persian author Alhacen (965–1039), whose major work in optics, De aspectibus, had a very strong influence on later developments.Footnote 1 The second model was put forward by René Descartes (1596–1650). He worked soon after Johannes Kepler (1571–1630) had associated the neuronal registration of visual information with the retina. Descartes undertakes in his Optics (1637),Footnote 2 among other things, the task of explaining how the two-dimensional image on the retina can give rise to the visual perception of a three-dimensional world. Descartes’s solution became widely accepted but it was also rejected by some, most famously by George Berkeley, who published An Essay Towards a New Theory of Vision in 1709.Footnote 3 Ville Paukkonen discusses Berkeley’s theory in detail below, and thus I consider it only briefly and in relation to Descartes’s theory.

Before Kepler, it was generally thought that visual images are three-dimensional perhaps already in the eye. The soul was taken to receive them in the crystalline lens, which was a space rather than a surface. It was, nevertheless, widely recognized already before Kepler that the world given by sight is not a straightforward Euclidian three-dimensional space but a space construed on the basis of a projection on the eye. Also, of objects we see only the surfaces facing us. Furthermore, problems related to the apparent sizes of objects varying with their distance were a regular part of all serious theories of vision at least since Galen (c. 130–c. 210 AD). That stereoscopic vision was needed for perceiving the real size of an object was a received view already in the Middle Ages. Another issue still is that of seeing how far away an object is. Kepler does not really tackle the problem of why we in the first place connect visual perception to a three-dimensional rather than a flat or, for instance, cone-shaped world.

Perspectival painting was one cultural development that obviously influenced Descartes strongly in his theory of visual perception. It sprang from considerations of how a flat two-dimensional surface can directly represent spatial, three-dimensional reality, and by the seventeenth century it already had a history several centuries old. Representations were of course important for medieval paintings too, but while medieval paintings can be said to represent objects, or even Aristotelian substances like people, and their characteristics, perspectival painting represents space and things located in space in a much more obvious sense.

The crucial problem encountered by Descartes and other scholars writing after Kepler is to explain why two-dimensional visual imagery at the bottom of the eye produces a perception of a three-dimensional world. Even when looking at something obviously flat like a perspectival painting, our mind goes to thinking about a three-dimensional landscape in a manner very different from what happens when we look at text, for example. Why don’t we just see a surface? In a sense, this is a case where we are active in our perception, but we are not quite free to choose between seeing a surface or seeing a space.

Descartes thought that we have an innate idea of spatiality, and our intellectual preconception with all visual perception is that everything we see must be located in a three-dimensional space. George Berkeley rejected this model and argued that we learn spatiality with what he calls the sense of touch, which includes proprioceptive senses and even experiences like walking through distances. As we learn to connect visual perception to the objects that we touch, or to objects we reach by walking to them, we learn to connect visual perception to the space we experience through touch. In this theory, spatiality is not innately known nor is it literally seen, but through experience we learn how to connect visual experiences with experiences of our own bodily movements.

10.1 Nature of Light

One of the classical problems in explaining vision is the fact that there appears to be no contact between the eye and the seen object. The senses of touch and taste are obviously contact senses. But even in the cases of smell and hearing, it is relatively easy to show how something physically real reaches the sense organ from the external thing to which the sense quality is attributed. A rose emits its scent to the whole room, and the scent has a quite clear physical presence in the room. Similarly, the vibrations heard as sound can easily be seen to be a real physical phenomenon. In sight, however, the seen object is remote, and no physical change appears to take place between the eye and the object. Indeed, visual rays carrying opposite colours can intersect without any causal interaction. Smells get mixed, but having walls of different colours does not produce any such effect in the room.

In the first discourse of his Optics, Descartes rejects out of hand the Aristotelian theory of how visible things transmit their visible species to the eye through an illuminated transparent medium. Descartes’s formulation of the theory is that there are “little images flying through the air, called ‘intentional species’” (AT VI, 85.) In more serious terms, Aristotelian theories gave light a role in vision that differs fundamentally from how it is understood nowadays. Instead of speaking of rays of light influencing the eye, Aristotle thought that light is a quality of the medium that enables the visible species (e.g. colour) of the object to influence the eye. To find ancient authors really speaking about little images flying in the air, one would have to turn to the Epicureans. It is, however, not wrong to say that the Aristotelian theories of the medieval scholastics claimed that the visible species is in vision transmitted from the object to the eye.Footnote 4 A notable exception to this kind of theory is William Ockham, who denied the species at the cost of admitting that vision takes place through remote causation without any physical or material contact between the seen object and the eye.Footnote 5 This view never gained popularity.

It was well noted by the late medieval scholastics that the “intentional species”, which Descartes mentions, is ontologically a very special kind of entity both in the medium (air) and in the eye. As Aquinas carefully formulates, when the intentional species appears in the eye, there need not be any change at the level of the four basic elements (fire, air, earth, and water) nor any local motion of any matter.Footnote 6 As I have argued elsewhere,Footnote 7 it seems appropriate to put this in modern terms as claiming that there is no physical change in the eye. Visual images do not supervene on the physical, but come and go without any change at the physical level. The intentional species has, however, a very exact location both in the eye and in the air as it is transmitted from the seen object to the eye. Medieval scholastics even disagreed whether its movement takes time. In this sense, it had a very intimate connection to the materiality of the world. The intentional species clearly was not an abstract entity, nor was it a universal belonging to the level of the intellect. Its material place was in the medium and in the organ.

Descartes chose to call the intentional species “little image” (petite image) (AT VI, 85). This choice implies that Aristotelian scholastics would have thought that the actual shape, perhaps even three-dimensional shape of the seen object is transmitted to the eye through a similarity being formed and travelling through the air. This would, however, be a misunderstanding of the more serious medieval optical tradition. If we look at Alhacen’s De aspectibus, for example, we find a clear theory of how the image is formed in the visual organs. What travels through the air is light and colour. The image is built in an active manner by the sense of sight by careful use of the organ. Indeed, perception of shape requires active contribution by the soul and it does not belong to the brute sensation. The core property of light that vision relies on is the directness of a ray of light. As Alhacen explains vision, rays of light carry the relevant colours from the seen object to the surface of the eye in an orderly manner. Because of the exactness of the order, the various colours of the seen object reorganize themselves into a similarity or an image of the seen object in the lens of the eye. In Alhacen’s theory, there is no “image” in the air between the object and the eye. In this respect, his theory does not differ from Descartes’s account or from the most recent optical theories of our time.Footnote 8

The more problematic issue is of course what light is, or how we should ontologically think of the direct visual ray. What is it that flows in a ray of light, and in which direction? From the ancient discussions, medieval scholars inherited the question whether the visual ray comes from the object to the eye or from the eye to the object. Does the soul reach out for the object, or does the light from the object reach into the eye? Ptolemy and Galen had presented theories of sight that were active even in the sense of reaching out, while Aristotle clearly posited an inflow of the species and in that sense a theory of passivity in vision. Alhacen both agrees and disagrees with all the three mentioned ancient authors with his theory based on intromission of perceivable light.Footnote 9 Although Alhacen’s De aspectibus was very influential, the topic was far from settled and the scholastics continued to think that at least the eyes of some animals have light in themselves so that they can see in the dark with their own visual rays. Indeed, even Descartes hesitates on the issue. To see how, we must, however, consider his theory of light in a little more detail.

Descartes puts forward dual theories of light. On the one hand, he treats light as an influx of particles, especially when explaining reflection of light on a mirror or other such surface. The leading metaphor in this respect is the tennis ball bouncing on a hard surface (introduced at AT VI, 93), but he also mentions the unhappy example of artillery pieces bouncing off the surface of a river to hit people on the other side, when the idea was to aim at the bottom of the river (AT VI, 99). In Descartes’s view, however, light cannot be treated only as particles. It must also be approached as a kind of field of pressure. The leading metaphors in this respect are the wine-vat full of half-pressed wine and the stick of the blind man. Thus, if the surface of the half-pressed wine in the vat is pressed, there is pressure at the holes at the bottom of the vat. (AT VI, 86–87) Similarly, the blind man moves and presses his stick slightly against different surfaces and thereby gets a feel of what the surfaces are like. Even the slightest pressure at one end of the stick is immediately transmitted to the other end. Interestingly enough, Descartes shows with obvious satisfaction that the stick metaphor shows how light can travel instantaneously, but still he continues to use the particle metaphor, where instantaneous travel is not that easily conceivable. (AT VI, 85–86.)

As Descartes explains with reference to the stick, “in the bodies called luminous, light is just a certain very rapid and lively movement, which is passed to our eyes through the mediation of the air and other transparent bodies in the same way as the movement or resistance of the bodies met by this blind man is passed to his hand by the mediation of his stick”. (AT VI, 84.) That is, air is transparent because it allows the particular kind of movement that we call light to be transferred by direct rays in all directions, including towards our eyes, where this movement is registered to produce a visual perception.

The blind man’s stick can be moved from both ends. Curiously enough, Descartes notes that although human vision works only so that the eye passively registers light from the seen object, cats have extramissive vision. They have light in their eyes enabling them to see in the dark. Apparently Descartes’s idea is that even if the object has no movement to be called light, the activity in the cat’s eyes works in an even more literal way like the blind man’s stick, allowing the cat to swipe through external surfaces with its eyes so that it sees them by physical activity that is to be called light. (AT VI, 86.)

Descartes seems to be very satisfied with his ability to explain light without reference to anything but mechanical movement in the transparent medium. Light reduces to mere matter in motion in this explanation. Even more, no intentional species is needed. Even the species of colour is explained away: Descartes suggests that colours are “just different ways in which bodies receive and send light to our eyes”. (AT VI, 85.) He even speculates on the idea that colours are registered on the ray of light as a twist in different directions and different strengths (AT VI, 95). Thinking of the tennis ball or even the blind man’s stick metaphor of the nature of light, the idea of a twist seems to make sense, but in terms of the half-pressed wine, it seems rather hard to understand what such a twist could be. It seems understandable if some of his readers thought that the scholastic theory of the ray of light being coloured in an unperceivable way is more sensible.

10.2 Passivity of the Eye

There is a very clear domain of activity of the eye in Descartes’s theory. The person or the brain can make the eye move in order to look towards the object and to accommodate so that the image becomes sharp. But in the actual image formation, the eye is a passive object in which the light coming from outside draws an image on the retina. The formation of the image in an eye that is looking at the same immobile object is completely passive. Indeed, Descartes spends a considerable part of his Optics explaining the comparison of the eye to a camera obscura in which an image of the outside is passively produced on the inside.

More exactly, Descartes advises his reader to build a special camera obscura. One must take an eye of a recently deceased person or, if such cannot be found, of a cow. The rear side of the carefully loosened eye is to be removed so that none of the liquid escapes, and then the back of the eye is covered with an eggshell. Now, if this prepared eye is mounted in the middle of the otherwise covered only window of a room, and the eggshell is observed from the darkness behind the eye, an image of the outside in full colour can be seen to form on the eggshell. As Descartes argues, such an image forms on a healthy eye of a living person, and the image is very relevant for visual perception. Figure 10.1 reproduces an image from the first edition of the Latin translation of Descartes’s Optics (1644). It shows how a person is observing an image being geometrically formed at the eggshell replacing the bottom of the eye. (AT VI, 115–116.)

Fig. 10.1 A person observing an image at the bottom of the eye. (© The National Library of Finland and Gaudeamus)

Descartes is in effect explaining an experimental setting that shows Kepler to be right. The eye has such a material structure that it can passively produce a unified two-dimensional image of the outside world on the retina. The scholastic theories, like that by Alhacen, discussed the visual image formation on the basis of projection of the visible scene at the front of the eye.Footnote 10 This is of course problematic since there is no way to make such an image visible in itself, except in the very faint sense in which it is possible to see reflections on the eyes of another person. According to Alhacen’s theory, the image is in the crystalline lens (glacialis) and not really on the surface of the eye.Footnote 11 As Alhacen explains, this image is then transmitted with optical regularity to the rear of the eye and to the optical nerve, which was taken to be hollow, and then to the brain where the final sensation takes place.Footnote 12 While this explanation leaves considerable gaps in relation to the formation of the image, it seems reasonable in the sense that the image gets formed in the living substance so that it has a route through which it can be transferred to the brain, where actual processing of visual imagery was taken to happen. It is also quite clear that Alhacen’s explanation concerns the visual image of a single object. In this respect, Alhacen’s approach differs from Descartes’s story, where the whole landscape is formed as a single image.

Alhacen’s theory thus posits visual imagery being produced in the lens—which is a space and not a surface. This makes the images three-dimensional entities. It is hard to find from Alhacen’s De aspectibus any account of how clearly he thought the visual images convey spatiality of the seen objects. Seeing a cube, we do not form a square image, but is the visual image a cube, or more like a cube projected on paper? Perhaps he did not put the question to himself in an exact manner, but as his reader I get the impression that perhaps the image in the eye is more like a lamina while the brain actively produces something like a three-dimensional model of the visually observed object to be processed in the brain chambers.

On the one hand, Alhacen makes it very clear that the three-dimensionality of an object is perceivable by vision only in a mediated way. When we see an opaque surface, we do not see through it, and thus we cannot directly see whether it is the surface of a three-dimensional thing or just a convex or concave surface.Footnote 13 This would make one think that the visual image is just a two-dimensional lamina. On the other hand, the visual image travels from the eye to the brain through the hollow optical nerve. The nerve is not straight, and the image is not assumed to travel like light travels but more like things travel.Footnote 14 This gives a feel to the theory that the image is to be understood as a separable three-dimensional object, even if it is more like a lamina than a scale model.

The experiment of the prepared eye described by Descartes makes it clear that he did not think of the visual images in the eye as separable objects that could move by themselves. Instead, the visual image on the retina is ontologically more like a shadow.

10.3 Imagery in the Brain

Descartes follows the scholastic theorizing in believing that cognitive processing of the brain is located in the brain chambers that house animal spirits. His main argument for giving the pineal gland a central role is that it is suitably located in the middle of the chambers and that it is single while most other parts of the brain come in pairs. Descartes even uses the Aristotelian term “common sense” (sens commun) for the functions performed at the pineal gland. (AT VI, 129; letter to Meyssonnier 29.1.1640; AT III, 18–21.) The theory presented in Optics is that the nerves transmit visual images to the inner walls of the brain chambers. In that location, they have an influence on the movements of the animal spirits in the chambers and are thereby moved to the surface of the pineal gland.

Contrary to his practice in the Meditations or other later works, in Optics Descartes uses the word “idea” (idea) so that ideas are located in the imagination or common sense—in the brain and not in the mind. Occasionally he even uses the expression “corporeal idea” (e.g. AT VI, 55; cf. also AT X, 443). It seems that when composing the Discourse on Method and Optics he thought that there is something very corporeal to serve as the basis of the contents treated in the imagination and that the word “idea” suits well to refer to such imagery. At the beginning of the sixth Meditation, he seems to be still committed to the view that imagination works by attention being guided by corporeal imagery, but in that work he has reserved the word “idea” for a different usage that he describes in the “Preface to the reader”. (AT VII, 8.)

It may be difficult to see how Descartes would have accounted for the visual imagery moving from the wall of the brain chamber to the pineal gland. It is clear, however, that he accepted the idea that these images can somehow become independent entities that float around in the system. He claims in Optics that visual imagery can be the origin of birthmarks (AT VI, 129). In a letter, he even claims that this kind of event is the explanation why there are little images of dogs in the urine of people who have caught rabies (letter to Meyssonnier 29.1.1640; AT III, 21). From our view, the most interesting images are nevertheless on the inner surface of the brain. They are marked as “789” in Fig. 10.2, which is reproduced from the first Latin edition of Optics (1644).

Fig. 10.2 The configuration YXV transmitted to the brain as the double configuration 789. (© The National Library of Finland and Gaudeamus)

According to the Cartesian explanation of how nerves function, there is a wire-like filament in the middle of the nerve surrounded by animal spirits that allow the wire to move easily. In the eye, light moves the end of the wire, and the structure of the nerve allows this movement to be transmitted to the other end of the nerve. Since there is a plurality of nerve endings at the bottom of the eye, and all these nerves have their other end in a systematic order at the inner surface of the brain, the image at the bottom of the eye gets transmitted to the brain in a systematic manner, though not necessarily retaining all shapes as such. (AT VI, 106–108.)

In a very interesting passage of the fourth discourse in the Optics, Descartes distances his theory from that of his predecessors. He rejects almost out of hand the idea that sensory perception could take place at the skin or other surfaces of the body. As he sees it, the little filaments in the nerves transmit mechanical movements to the brain, where perception really happens. Then he rejects in a more argumentative manner the view that in perception the soul observes imagery located in the brain. “At least the nature of those images must be conceived in a manner very different from the usual one”, as he says. (AT VI, 112.)

Descartes continues that, “in order to keep as close to received views as possible”, it can be admitted that there are images in the brain, but it must be noticed that these images need not be similar to the things they represent. Having just noted that not only similarities but also signs (signes) and words (paroles) make us think about certain things, he goes on to describe how a printed line-engraving (taille-douce), for example, is very different from the landscape represented. Some ink at certain parts of the paper manages to represent to us forests, towns, people, and even battles and storms. There is some similarity in the form, but if you compare the form of the ink on the paper to the mountain or the battle, you may notice that perspectival regularities are more important than actual similarity. (AT VI, 112–113.)

Engraving is an interesting metaphor given that Descartes thought the image in the brain to be formed by the neural filaments transferring movements caused by light at the bottom of the eye in a systematic manner to the brain. The word that appears to have caught readers’ attention at the time seems, however, to be “painting” (peinture). In a letter, Marin Mersenne asked what Descartes actually means by the word. Descartes’s answer is rather disappointing: “just the configuration 789 as the nerves produce it on the inside of the brain”. (AT II, 591.) Here “789” seems to refer to the formation on the inside of the brain as depicted and marked 789 in Fig. 10.2 above.

A careful reader of Descartes’s Optics cannot escape the importance of perspectival paintings for the work. Descartes clearly admits that there are images in the brain, but tries to understand their role with reference to perspectival paintings. He is impressed by the fact that we automatically decode two-dimensional perspectival paintings into thoughts concerning a three-dimensional world. He takes something like this to happen also when we actively produce spatial vision from two-dimensional imagery in the eyes and brain. We do not have actual three-dimensional miniature models of the seen objects in the brain, nor any exact similarity, but such a formation on the inner surface of the brain that it has semiotic markers for all the relevant three-dimensional features that we actually see in the landscape we look at.

10.4 Visual Perception in the Mind

The sixth discourse in Descartes’s Optics begins with the warning that we should not think that we would be looking at the imagery inside our brains with a second pair of eyes. Instead, there is just a natural connection between the imagery and the sensations (sentiments) that the soul has. He goes on to explain that different sense qualities are all caused by movements of the nerve endings. The idea seems to be that while the neural movements are rather similar in the case of, say, sound and taste, the perceptual qualities associated with hearing and taste are very different. Yet the difference seems to be due only to which nerves move, not to how they move.

At the beginning of the sixth discourse of the Optics, Descartes gives a list of the primary things perceived visually: light, colour, position, distance, size and figure. In a very traditional way adopted also by Alhacen, light and colour are the primary sensibles on which perception of the others is based. Descartes’s idea seems to be that if there is movement at a nerve ending at the inside of the brain connected to the light receptors at the bottom of the eye, the soul perceives light. Depending on how the ending moves, the mind perceives colours. Alhacen also gives a theory in which everything we see is based on seeing light and colour and then performing mental operations on what is seen with them.Footnote 15 Alhacen distinguishes altogether 22 visible properties to which everything that can be seen can be reduced. The list starts with light, colour, distance, spatial disposition, corporeity, shape and size, which are rather close to what Descartes gives as primary visible things.Footnote 16

In the following I will take a careful look at the explanation of how we see distance, or at the visual systems allowing us to see how far away a seen object is. Alhacen spends considerable energy showing how we know that there is a distance between the eye and the object before turning to the discussion of how we see the magnitude of the distance. Basically, Alhacen’s argument is to show that what we see is spatially external to us because we recognize the difference between what we appear to see with eyes closed and what we actually see with eyes open.Footnote 17

In their theories of vision, both Alhacen and Descartes take for granted that we live in a three-dimensional world. Alhacen makes this view explicit in his discussion of how the corporeity of an object is seen. As he writes, “Still, according to human judgment, it is an absolute given that only a body can be perceived by sense, and so, when someone perceives a visible object, he will immediately realize that it is a body, even though he may not perceive its extension according to three dimensions”.Footnote 18 The challenge the active mind faces in his theory of vision is thus that one knows that the seen object is three-dimensional, but one only sees its surface. As Alhacen notes, the surface can either be convex or concave, and understandably he points out that seeing the corporeity, or the depth behind the surface, is easier in the case of convex surfaces. In his discussion, the problem of seeing corporeity comes after the discussion of seeing distance, and thus he takes as already achieved that we can see the convexity or concavity of the surface.

As is well known, Descartes did not take it for granted in his metaphysics that there is a corporeal world. Nevertheless, he seems to share Alhacen’s assumption that in vision, we take it for granted that whatever we see is a three-dimensional body. To understand how he comes to share this view, it is necessary to recognize certain issues in how he proves that there indeed are bodies. In all three main general treatments of his metaphysics he makes the point that the idea of three-dimensional extension as the object studied in geometry is a very clearly given innate idea. We need not construct three-dimensionality on the basis of two-dimensionality, but we have an innate understanding of three-dimensional space as a possibly existing object of pure mathematics. This idea is then applied to the material world.

In part four of the Discourse on Method he conceives the object studied in geometry “as a continuous body, or a space indefinitely extended in length, breadth, and height or depth”. (AT VI, 36.) As he notes, at this stage there is doubt whether any such body exists. A little later, however, he reaches the rule that clear and distinct ideas are true, which in effect also means that there is something about which geometry is true, i.e., the material world, and that the material world thus exists.

This implication is developed in detail in Meditation 6, where we find a careful proof of the existence of the material world. It begins with the statement that a spatial, three-dimensional world can at least exist as the object of pure mathematics. Meditation 5 has indeed discussed the exemplary certainty of geometry, and the thing remaining to be proved is that there really exist objects of the type studied in geometry. Meditation 6 starts with the consideration that “imagination seems to be just a kind of application of the cognitive faculty to a body that is internally present to it” (AT VII, 71). As is well known, the meditation continues to a full-fledged proof of the existence of the material world.

The Principles of Philosophy tells essentially the same story. Descartes takes quite directly the view that geometry as he knows it, i.e. Euclidian geometry of the three-dimensional space, is true about the material things. (AT VIII, 41.) This result concerns directly the systematic order of science: in all study of the material world we can assume that geometry is true of it. It seems that in his theory of vision Descartes accepts an even stronger application of the result. In looking at things, even in ordinary life, we innately know that the real objects we see accord with the Euclidian geometry of three-dimensional space. Although our visual imagery is two-dimensional, we know that the world it represents is three-dimensional. To gain three-dimensional visual experience, we have to combine our visual images with the innate knowledge of three-dimensionality.

Descartes’s claim that spatiality is a dimension added to the visual experiences by the mind has led many scholars to think that its addition is a matter of mathematical exercise. In my view, Descartes did not mean quite this. Rather, his point is that there is nothing in the brain that would build three-dimensional visual images. Visual images are in the brain on surfaces just like in the eye. The extent to which such an image is three-dimensional in the mind is a product of active engagement by the mind based on the innate knowledge that the world is three-dimensional (and not flat).

The challenge the active mind faces in Descartes’s theory of vision is thus just the same as in Alhacen’s theory. We know that the world we are looking at is a Euclidian three-dimensional space, and our task is to understand what the object that we see is really like and how far away it is. Although its image in the eye and in the brain is, for Descartes, on a surface and thus lacks a third dimension completely, we must judge depth and distance. However, it is important to note that it is not solely or even typically by trigonometric calculations, but through much more straightforward and, so to speak, automatic systems that we judge the shapes and distances of the objects we see.

Descartes distinguishes altogether four ways in which we perceive distance. In the following, I discuss them one by one and compare them with earlier theories.

  1.

    Accommodation. The first and principal means for perceiving distance in Descartes’s theory is the feel of the shape of the eye. He thought the eye accommodates to distance by actively changing the distance between the lens and the retina. He did not think that the lens itself would accommodate. We do not normally pay attention to the accommodation even when we do pay attention to how far our eyes are gazing. Descartes compares this to taking an object in one’s hand to estimate its weight. One does not pay attention to the movements of the hand, but only to the weight of the object. As Descartes saw it, the neural impulses that produce the accommodation of the eye “are instituted by Nature to make our soul to perceive the distance”. (AT VI, 137.) It is noteworthy that Descartes thought that this way of perceiving distance is available only for objects that are rather close. There is no considerable accommodation needed for objects lying at a distance of more than four or five feet.

    In visual theories before Kepler there was nothing to correspond to the accommodation of the eye. In those theories, however, where vision was thought to be based on actively stretching a visual ray from the eye to the object, there was occasionally recognition of some kind of basic feel of how far the visual ray is being stretched. Such theories bear an interesting similarity to Descartes’s view.

  2.

    Natural geometry of the two visual axes. Second, there is something “like a natural geometry” (geometrie naturelle) (AT VI, 137) that allows us to estimate the distance of the object. Descartes’s Optics has several images of the blind man with two sticks. Even when the blind man does not know the lengths of his sticks, he can still feel the distance between his hands and the angle at which he holds the sticks in his hands. By those means, he can estimate the distance of an object that is simultaneously felt by both sticks. Correspondingly, there is “a natural geometry” of the eyes so that if an object is very near, the eyes must turn more towards each other than when the object is far. Descartes does not clearly take a stance on what exactly we are aware of in such “natural geometry”, but everything he says about the natural geometry is consistent with the interpretation that we do not have awareness of the angles but only of the distance of the seen object.

    Already Galen discussed estimation of distance by the angle of the two visual axes. The theme has stayed in the tradition since then. For this means of seeing distance, Descartes does not give quite as clear a limit, but he does point out that the changes in the angle are very small if you look at anything more remote. It seems that we are again limited to perceiving distances less than four or five feet.

  3.

    Relocating the one eye. Descartes provides an elaborate explanation of how you do not need two eyes to estimate a distance by a method comparable to the natural geometry if you can move the one eye you use. In this case, Descartes notes that we need “an act of thought which is nothing but a simple imagination”. (AT VI, 138.) As Descartes makes clear, this mental move differs from the genuinely geometrical calculations used by surveyors using vantage points for measurements, although it uses the same sort of reasoning. It is quite noteworthy that for the first two methods Descartes does not attribute an act of thought, but in the rather active and purposeful use of one eye he does think that what is at issue is active albeit simple thought. In this case, it seems clear that he thought we have awareness of the angle at which we are directing the eye.

  4.

    Comparative sharpness of the image together with strength of the light. Descartes notes that given a certain accommodation of the eye, objects farther or closer than the one at which the accommodation aims are fuzzier. As he explains it, more light arrives from the closer object and less from the more remote object. This gives some comparative ordering of objects in terms of their distance. According to Descartes, in this we do not “really see the distance, but imagine it”. Again he seems to be speaking of an act of thought rather than passive perception. (AT VI, 138–139.)

    In this method, we are very vulnerable to an error that must be corrected by ratiocination. In Descartes’s example, if a mountain has better light than a forest at its feet, we are tempted to judge that the mountain seems to be closer, but we can correct our judgment by taking into account the fact that we see the forest exactly at the feet of the mountain. It seems that for very long distances we have to rely on reason. Indeed, in the case of astronomical objects the imagination and the sensory systems of the brain lead us astray. Descartes estimates that our imagination or common sense (sens commun) cannot receive an idea of a distance of more than a hundred or two hundred feet. Rationally, we can of course conceive the sun or the moon as being much farther away. But this does not change the illusion of the moon appearing much larger close to the horizon even if the angle under which it is seen (its size on the retina) is just the same, as Descartes recognizes. We have no sensory type of image of its distance and therefore estimate its size with a wrong kind of sensory image. (AT VI, 144–145.)
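
The geometry underlying the second cue can be illustrated with a short modern sketch. It is not Descartes's own calculation, and on the reading defended here the perceiver performs no such calculation at all; the sketch only shows why the angle between the visual axes carries usable information at near distances and very little at far ones. It assumes symmetric fixation on a point straight ahead and an interocular distance of 6.5 cm, which is an illustrative value and not taken from the Optics. The function names are hypothetical.

```python
import math

# Assumption for illustration only: interocular distance of about 6.5 cm.
BASELINE_M = 0.065


def vergence_angle_deg(distance_m, baseline_m=BASELINE_M):
    """Angle in degrees between the two visual axes when both eyes
    fixate a point straight ahead at the given distance (metres)."""
    return math.degrees(2.0 * math.atan((baseline_m / 2.0) / distance_m))


def distance_from_vergence(angle_deg, baseline_m=BASELINE_M):
    """Inverse relation: distance (metres) of the fixated point,
    given the angle in degrees between the two visual axes."""
    return (baseline_m / 2.0) / math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # Print the angle for a range of distances to show how quickly it flattens.
    for d in (0.25, 0.5, 1.0, 1.5, 3.0, 10.0, 50.0):
        print(f"{d:6.2f} m -> {vergence_angle_deg(d):7.3f} degrees")
```

Under these assumptions the angle changes by several degrees between a quarter and half a metre, but by little more than one degree between 1.5 and 3 metres. This is in line with Descartes's remark that the changes in the angle are very small for anything more remote.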

10.5 Berkeley’s Criticism of Descartes

George Berkeley quite famously attacked the “geometrical theories of vision” in his An Essay towards a New Theory of Vision. In the appendix to the second edition of the book, Berkeley explains what the exact target of his criticism is. He quotes a paragraph from Descartes explaining the “natural geometry” he thinks we use for estimating distance by method 2 above. As Berkeley claims, he could “amass together citations from several authors to the same purpose”. According to the appendix, the point he wanted to make in his essay was that “we neither see distance immediately, nor yet perceive it by the mediation of anything that hath … necessary connexion with it”. That is, we do not see distance, nor is there a necessary logical connection between ideas of sight and the proprioceptive senses by which we have the idea of distance.Footnote 19

Berkeley very famously came in his later works to the strong view that there is no material substance. Also, he did not accept innate ideas. From this viewpoint, it is quite understandable that his approach to the theory of vision differed from that of Alhacen and Descartes, who thought that our estimation of the distance of the seen object is based on a basic acceptance that everything we see is located in a Euclidian three-dimensional world. Berkeley does not look at visual experiences against such a background.

It seems, however, that he overstates his disagreement with Descartes within the theory of vision itself. In fact, he does not deny that we rely on the four methods discussed above for estimating the distance of the seen object. What Berkeley is really attacking is exactly what he says in the appendix: theories which claim that we see distance either immediately or by perceiving something that has “necessary connexion” to it. By “necessary connexion” he means something that could be translated as a logical or conceptual connection, or in Humean terms as an agreement of ideas. In this way, Berkeley is opening an interesting new line in the discussion of the theory of vision as regards the reason why, for instance, we connect the feel of the eyes turning towards each other to the idea that we move our gaze closer. Descartes thought that the connection is innate, but Berkeley claims that this is not a “necessary connexion” but that we learn it through experience.

In § 28 Berkeley summarizes that he has gone through the three “sensations or ideas that seem to be the constant and general occasions for introducing into the mind the different ideas of near distance” (Berkeley 1948, p. 177). The three on his list are 1., 2. and 4. of the Cartesian list given above. Concerning them, Berkeley makes the clear statement that there is no “necessary connexion” involved. Instead we rely on them because “by experience they have been found to be connected with them” (Berkeley 1948, p. 177; § 28). Berkeley’s third item is straightforwardly similar to the first item on the Cartesian list, accommodation of the eye, and needs no discussion, but the first two are more interesting.

The first connection Berkeley discusses corresponds to 2. on the Cartesian list. As Berkeley puts it, “Not that there is any natural or necessary connexion between the sensation we perceive by the turn of the eyes and greater and lesser distance”, but rather that through constant experience “there has grown an habitual or customary connexion”.Footnote 20 The difference from Descartes’s theory is that while Descartes claimed the connection between the idea of turning the eyes and the idea of distance is innate, Berkeley thinks it is learnt.

According to Berkeley, it is a “received opinion” that we perceive the changes in the angle of the visual axes and that the perception of distance is based on this perception. He himself claims not to be conscious of any such perception.Footnote 21 Descartes’s text is not very clear about such a question, but it is clear that “received opinion” refers to Berkeley’s contemporaries and not Descartes. My reading of Descartes would suggest that awareness of the angle could be included in cue 3. on the Cartesian list, when distance is estimated through purposeful movement of one eye. In this context, Descartes refers to “an act of thought”. He apparently implies that in 2., or the simple case of two eyes looking at an object, there is no such “act of thought” needed. Rather, a direct and immediate idea of near distance is caused by the sensory recognition of the movement of the eyes. He seems to mean that without intervention of the mind the brain causes an idea that we are looking at something close to us when the eyes are turned towards each other, without consciousness of the turning of the eyes. The case would be comparable to how less pressure on the nerves at the bottom of the stomach makes the mind feel hunger, although there is no logical connection between less pressure on certain nerves and hunger. It is just a God-instituted natural connection. (Cf. Principles of Philosophy IV; 190; AT VIII, 316–317.) We are perhaps not really conscious of the stomach being empty but of the hunger. Similarly, the “natural geometry” would not require one to be conscious of any angles, as Descartes saw it. The work would be done by innate structures of the brain so that the mind is directly aware of distance.

The second ground for estimating distance Berkeley discusses is confused appearance in vision, or lack of sharpness when the object is very near the eye, or at a distance with sensible relation to the size of the pupil. Again, Berkeley makes it clear that the connection between distance and confusion should not be thought to be necessary, but customary. He admits that as the object is brought close to the eye, we have the experience of the sharpness of the image being lost, and thus that there is some connection between loss of sharpness and nearness of the object, but he rejects the geometrical explanation.

According to Berkeley, the “most approved writers of optics” explain that the rays from a single radiating point to the eye diverge more when the point is close than when it is farther off. However, as Berkeley shows in a very detailed manner, this divergence cannot be perceived: neither the retina nor the pupil perceives from which direction the rays come. Berkeley discusses at length a counterexample provided by Isaac Barrow. The Barrowian counterexample is an optical arrangement in which the perceived distance in fact grows as the rays falling on the pupil diverge more according to the standard geometrical optics of the time. The wrongness of this putative manner of seeing distance is clearly a main target of criticism in Berkeley’s New Theory of Vision.

Descartes does not discuss this kind of divergence of the rays forming the image at all. Therefore I think that the core of Berkeley’s attack on geometrical theories of seeing distance does not really hit Descartes’s theory. This does not of course mean that Berkeley would have accepted Descartes’s theory of vision. He rejected Descartes’s innatism and worked hard to build a theory of vision that is not based on assuming a Euclidian three-dimensional world to be seen. He built a theory of vision based on associating vision with human experiences deriving from the proprioceptive senses and the sense of touch, as described below by Ville Paukkonen. But these issues provide a different theoretical foundation for the account of how we see distance rather than changing the details of the real life account.

As we have seen, Berkeley rejects innate geometry because he rejects the innate intellectual knowledge that the visual world is three-dimensional. It is not because he denies that we see distance by the cues discussed by Descartes, for he does not deny this. Rather, these cues are an important part of Berkeley’s theory of how we see distances. He agrees that we perceive near distances by recognition of how we turn the eyes towards each other and through recognition of the accommodation phenomenon by which we sharpen the visual image. But for Berkeley, these cues are learnt, not innate.

Furthermore, Berkeley rejects one single geometrical explanation of how information about distance could be carried to the mind, because he correctly notes that the eyes are incapable of receiving that information. Descartes did not put forward this particular explanation and thus did not make this error. In this respect Descartes is not among “the most approved writers of optics” criticized by Berkeley. Already in the appendix to the second edition of his book, Berkeley emphasizes that he should not be taken to reject all use of “lines and angles”, or geometry, in the theory of vision.Footnote 22

Berkeley made it a very strong point that vision is not literally spatial at all. As it is, this fact was accepted to a large extent also in earlier theories, but neither Alhacen nor Descartes thought that it implies much, because we place our visual experiences against the background of genuine knowledge that the world is three-dimensional. In recent discussions the received view has been that this implies that Descartes thought that we calculate the visual distances and that there is no immediateness in the spatiality of vision. This view is almost exclusively based on Berkeley’s criticism of geometrical optics, and on one word in the sixth set of replies in the Meditations.Footnote 23 As we have seen, Descartes was not really vulnerable to Berkeley’s criticism in certain issues and in a very central issue he was not even the target. But the passage of the Meditations still needs consideration.

In the sixth objections, Descartes is challenged with the traditional sceptical example of a stick half-immersed in water and thus visually appearing bent. The objector claims that the intellect is not able to correct the visual mistake but that it is the sense of touch that can do it: one must put one’s hand into the water to check how the stick is (AT VII, 418). In his reply to the objection (AT VII, 436–439) Descartes goes back to his Optics. He starts by rejecting the scholastic theory of vision, and continues by distinguishing three grades (gradus) of sensation. First, there are the movements in the brain. Second, there are the sensory perceptions of colour and light coming from the stick that are immediately effected in the mind by the brain. And third, there are intellectual judgments made on the basis of the senses.

What Descartes is clearly aiming at is that we should distinguish between the visual perception of the apparently bent stick and the intellectual judgment that the stick is bent. In this context, he asks us to assume him to “judge (ratiocinor) on the basis of the extension, boundaries and location of the colour in relation to the parts of the brain something about the size, shape and distance of the stick”. And a couple of lines later he says that he has “demonstrated in the Optics that size, distance and shape can be perceived only by reasoning (ratiocinatio) in relation to each other”. (AT VII, 437–438.) Robert Stoothoff has translated the crucial verb ratiocinor as “make a rational calculation”. (Descartes 1984, p. 295.) However, any careful reader of the Optics notices that this is not quite what is at issue in Descartes’s theory of perceiving distance. Indeed, the French translation in the AT-edition uses as translations for ratiocinor in this context the verb juger on the first occasion, and raisonner on the second occasion, where it is used as parallel to the Latin judico translated as juger. I have followed this translation. (AT IX, 237.) The crucial content of the verb ratiocinor is not to refer to geometrical calculations but to an intellectual operation. The point is that the mind actively takes a stance on the real size, distance and shape of the stick, and it is in this judgment that one is in error if one takes the stick to be bent. At the level of a visual image, there appears a collage of colours having an angular form, but this image needs an evaluative act of the mind before it can be taken to represent a stick either bent or straight.

Furthermore, Descartes says that we learn very early in life to make these judgments without noticing any ratiocinative steps. This surely does not support the idea that there would be steps of calculation involved. Rather, he must mean how a child looking at a person on a hillside might first think that the person seems very small and then, after a ratiocinative step, perhaps understand that the person must be far away. In such a situation, we would as adults still say that the person seems small just like the stick in the objector’s example seems bent. Descartes’s point is that if we say that the person seems to be far away, we really mean the mental judgment that a person seeming so small must be far away. In the case of the stick the ratiocinative step is even more obvious for an adult. We would not say that the stick seems straight despite the apparent bend; we just say that the apparent angle at the surface of the water does not genuinely make it seem bent to an experienced person because the mind corrects the misleading appearance. (AT VII, 438.)

Comparing this to the list of visual cues for perceiving distance discussed above, it seems clear that the case of the bent stick does not belong to the first two classes, where Descartes allows some limited immediate perception of distance through accommodation of the eye or through turning the eyes towards each other (cues 1. and 2.). The case is more complex, and clearly belongs to a sphere where we have to make an intellectual judgment of the stick on the assumption that it is a bodily object in Euclidian three-dimensional space divided by the surface of the water. The visual appearance is such that it would make a child judge that the stick is broken, and only the intellect is capable of encouraging one to feel the stick to check whether it really is bent. At some age, we learn that sticks do not get bent merely by being put half way into water. There is no reference to geometrical calculations, either conscious or unconscious. Furthermore, geometrical calculations would be so complex in the case of an apparent bend in a stick half-immersed in water that making our everyday judgment dependent on them would not be credible at all.

Descartes’s view is quite similar to what Alhacen would say. The visual appearance of the stick calls for apprehending a three-dimensional object. Descartes’s specific point is that only the intellect can provide the route to a correct judgment. And right or wrong, the judgment about the shape of the three-dimensional object is thus made on the level of the mind, not of the brain. But there is no hint that in this reply to his objector he would be thinking of geometrical calculations of the sort rejected by Berkeley.

10.6 Conclusion

My discussion carves out three different approaches to explaining the fact that we appear to see a three-dimensional world. In the pre-Keplerian theories, low-level cognitive processes developed a sphere of three-dimensional imagery in the brain. According to the Cartesian theory, three-dimensionality was added intellectually to low-level cognitive processes of the brain operating with something like two-dimensional imagery. To a very limited extent three-dimensionality was, so to speak, hard-wired into the brain, because we are aware of some differences in small distances because of the accommodation of the eye. But the brain did not host any three-dimensional images in Descartes’s theory. Visual objects were in his theory intellectually and actively located in the three-dimensional space we innately know to be around us. In his new theory of vision, Berkeley rejects this innateness and points out that nothing in the visual imagery itself would give the idea of a three-dimensional space: he seems to think of the visual field as two-dimensional. The experiential three-dimensionality in vision is due to associating visual ideas with ideas of other senses, among which the proprioceptive senses were the most important.

We can thus distinguish three basic approaches to the spatiality of vision. Spatiality is, strictly speaking, not due to visual experience itself according to any of these three models. It can be, first, a precondition built into the very basic mental imagery and thus given in an unsystematic way with the visual experience itself (Alhacen). Or secondly, we innately and strongly know that the real world represented in vision is three-dimensional, and thus we always judge things we see to be in space (Descartes). Or thirdly, we learn to customarily associate visual images with the proprioceptive perceptual modalities, which are spatial, so that we feel the spatiality even in connection with vision (Berkeley).