The freedom and usefulness of imagination

Visualization is a key construct in a variety of research programs within cognitive science and psychology. But, what is visualization, cognitively speaking? At a first pass, we can say that it is any cognition that involves visual mental imagery; to visualize is to visually imagine. Visualizations are visual instances of what philosophers sometimes call “perceptual” or “sensory” imagination. More substantive explanations normally proceed through comparisons to visual perception: the phenomenology of visual imagining is similar to that of visual perception; the two share many underlying neural mechanisms,Footnote 1 and psychophysical experiments suggest important similarities between the two.Footnote 2

Then there are the obvious differences between visualization and visual perception. Visualization is under control of the will in a way perception is not; it is often triggered by memory searches and action-planning routines and not by the state of the world before one’s eyes. So, the two have very different kinds of causes. As for their effects, a visual perceptual experience of an apple typically causes a belief in the presence of an apple; a visual imagining of an apple does not. Visually perceiving a tornado 50 yards away normally causes an intense emotional response; visualizing one normally doesn’t (though this isn’t to say it results in no affective response). Supposing we were to type mental states by their typical causes and effects, visualizations would come out relatively unlike visual perceptions.

The functional dissimilarities between visualization and visual perception—what we can think of as diachronic dissimilarities, revealed in the causes and effects of the states across time—are often obscured by the intense focus researchers have had on the potential format similarities between the two.Footnote 3 Here, I want to focus on the diachronic dissimilarities between visualization and perception, without prejudice to whatever format similarities they might or might not have. For I think this focus reveals an easily overlooked puzzle about visualization—one which lies at the heart of understanding visualization’s role in the broader cognitive economy.

The puzzle derives from the simultaneous freedom and usefulness of visual imagination (I will use “visual imagination” equivalently with “visualization”). As remarked, what we imagine is more or less up to us; intuitively, it is “free” in a way that both perception and belief are not. On the other hand, we rely upon visualization to guide us in a wide variety of practical reasoning tasks, whether it is in spatial reasoning (Barsalou 1999; Cornoldi et al. 1996; De Vega et al. 1996; Kosslyn et al. 2006), action-planning,Footnote 4 the prediction of others’ behavior (Currie 1995; Goldman 2006), the training of motor routines (Feltz and Landers 2007), or the development of novel technologies (Arp 2008). But, how can the deliverances of visual imagination be relied upon to guide our actions if what we imagine is subject to our whims?

Consider: other mental states we rely upon to guide our actions—beliefs and perceptions—are not subject to our whims, and one might think this is precisely why they can be relied upon. Perceptual representations are reliably caused, and thereby constrained, by the world around us. Perceptual beliefs, we can assume, are directly and reliably caused by these perceptual states, inheriting their reliability from the perceptual states that cause them. Other beliefs are formed through processes of reasoning and inference, and are constrained by the principles and norms governing those processes. However difficult it is to characterize precisely these principles (be they probabilistic, deductive, inductive, or something else), it is clear that rational inference is not arbitrary or “free”. However fallible we may be as reasoners, we cannot infer whatever we wish from p.

Why, then, is visualizing a reliable way to make plans, anticipate responses, solve problems, and so on, if it is not constrained either by what is before one’s eyes, or by rules of rational inference? How can we conceptualize the kinds of constraints (if any) that apply to visualization in a way that is compatible with its freedom and creativity? These queries can be sharpened into the following two questions:

  1. “Compatible with the usefulness and freedom of imagination, what determines (or what principles characterize) the pattern by which representations unfold during visualization; that is, what governs the relation among successive states of visualization?” and,

  2. “Compatible with the usefulness and freedom of imagination, what determines (or what principles characterize) how those representations interact with other kinds of mental states—most importantly, with belief?”

I submit that, without satisfying answers to (1) and (2), we remain more or less in the dark about the role of visualization in cognition generally. In what follows, I will consider a variety of strategies for answering (1) and (2), building on some proposals in the literature. The section on “Grush’s emulation theory of visual imagery” outlines and expands on Grush’s (2004) theory of visual imagery, using it as a scaffold for building a general approach to answering (1) and (2). The section immediately following develops three specific proposals consistent with Grush’s approach. There, I cast doubt on the view that visualization should be thought of as inherently misrepresentational, as a kind of “off-line” perception. The freedom of imagination is the freedom to reason about topics of one’s own choosing, not a freedom to represent whatever one wishes as present before one. The final section “Illusion and encapsulation” assesses the two remaining options. I tentatively conclude that the most promising account assimilates visualizations (and visual imaginings) to occurrent beliefs in sensorimotor generalizations; they are beliefs in the way certain kinds of sensorimotor scenarios unfold. Making this argument requires looking closely at the phenomenon of informational encapsulation as it pertains to visualization, and at our putative ability to “imagine illusions.”

But, before coming to the three main approaches to be considered, I will briefly describe and reject two additional ways of responding to (1) and (2). This should help clarify the questions at hand and set the stage for the more nuanced accounts to come.

Memory and imagination

The inner DVR approach

We can call one overly simple approach to answering (1) and (2) the “inner DVR” view (“DVR” is for “digital video recorder”). The inner DVR view posits an especially close relation between visualization and past perceptual experience. It proposes that, when we visualize, we are simply re-playing (with some informational degradation, perhaps) past perceptual experiences that were “recorded” at the time; specific diachronic neural patterns triggered during a past act of perception are “reactivated.” If this were a generally viable approach, it would explain (1); we could say that the representations in visualization unfold in more or less the sequence their perceptual ancestors did, and inherit their reliability (and usefulness) straightforwardly from the original perceptual acts.

The obvious problem with the proposal is that it has nothing to say about the creative side of visual imagination, its ability to represent novel scenarios and thereby allow for planning and innovation through the contemplation of new possibilities. If visualization is completely tied to the past, it isn’t very free at all. Visualizing would amount to choosing where to start the orderly replay of previously recorded footage. Moreover, such a view is deeply at odds with current research on episodic memory, which sees both it and imaginative visualization as primarily “constructive” processes, not “literal reproduction[s] of the past” (Schacter and Addis 2007, p. 773).Footnote 5

As wise as it may be to sever both memory and imagination from the overly simplistic inner DVR view, we should not overlook the explanatory gulf that opens. For now, it is far from clear how to approach answering (1) and (2), both with respect to visualization in its paradigmatically “imaginative” instances (what memory researchers call “imagining the future”) and with respect to episodic memory. Despite the fact that both share underlying mechanisms with visual perception, the crucial question of what governs the transition from one state to the next cannot have the same answer for imaginative and memory states as it does for perceptual ones. We can appeal neither to a current environmental stimulus nor to a straightforward “re-playing” of states in the order they were caused during a past perceptual episode. A very different sort of story must be told to explain the usefulness and reliability of visualization.

A lesson from theories of propositional imagination

Given the problems of the inner DVR approach in answering (1), it is worth having a closer look at what comparisons can be drawn between visualization and the following of rational rules of inference of the kind that govern belief. Here, it will be useful to look at some leading accounts of the cognitive architecture underlying “propositional” imagination. Propositional imagining occurs when someone imagines that thus and such, e.g., imagines that the roof is on fire. This is typically contrasted to non-propositional (or “objectual”) imagining, where a that-clause is not used in the description of the act (e.g., imagining the Eiffel Tower). While I do not think anything obvious concerning the role of mental imagery in each kind of mental act falls out of the way the acts are described (i.e., with or without a that-clause), I will assume here with othersFootnote 6 that propositional imagining does not involve mental imagery and so is a distinct cognitive phenomenon from visualization and visual imagination (philosophers typically contrast the latter as a form of sensory imagination).

In an influential paper and book, Nichols and Stich (2000, 2003) argue that propositional imagination involves a distinct cognitive attitude, similar to belief in its inferential patterns (Currie and Ravenscroft (2002, Ch. 1–2) espouse a similar view). For instance, when told to imagine that Bob was in New York yesterday and London today, one will (typically) imaginatively infer that he traveled to London by plane, just as one would infer that he traveled by plane if one simply believed he was in New York yesterday and London today and hadn’t been told how he made the trip. More generally, we tend to “fill in” the details of imagined scenarios with what we would infer to be true if the imagined scenario were believed. The inferential “mirroring” is thought by Nichols and Stich to result from the same “inference mechanism” working on representations in the “Belief Box” and in the “Possible Worlds Box,” the latter being the cognitive “box” that houses the representations involved in propositional imagination (talk of cognitive “boxes” here is offered as a way of grouping representations that have important functional similarities). The core idea is simply that whatever mechanism or principles guide processes of inference among beliefs also guide (and thereby constrain) inferences among one’s imaginings—where one’s imaginings are thought of as collections of propositions quarantined in their own cognitive “box”. Imagination is still “free” on these accounts, to the extent that we can insert into the Possible Worlds Box whatever we wish as the initial “premise” from which other inferences are drawn “in imagination”. Once this premise is inserted, however, inferences will be drawn in imagination in more or less the same manner they would if the premise were believed.

While there are important details of their account that this brief sketch leaves out (including, e.g., their explanation of the many instances where imaginative inference does not mirror the inferential patterns of belief), we can see that Nichols and Stich have at least the beginnings of a proposal for explaining (1) with respect to propositional imagining.Footnote 7 Given an initial imagined proposition, the sequence of propositions subsequently imagined is constrained by the same inference “mechanism” (or, more neutrally, rules of inference) as govern belief. If we know how to explain transitions among beliefs in broadly mechanistic terms, we know how to explain the analogous transitions among states of propositional imagination.
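The structural point of this proposal, that a single inference mechanism operates over two quarantined collections of propositions, can be put in a small sketch. It is only a toy and makes no claim to capture Nichols and Stich’s actual architecture: the rule list `RULES`, the function `infer`, and the representation of propositions as plain strings are all assumptions made for illustration. Only the names of the two boxes come from their account.

```python
RULES = [
    # (premises, conclusion): a toy stand-in for whatever governs inference.
    ({"Bob was in New York yesterday", "Bob is in London today"},
     "Bob traveled to London by plane"),
]


def infer(box):
    """One shared inference mechanism: close a set of propositions under RULES.

    The input set is copied, so the box passed in is never modified.
    """
    closed = set(box)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= closed and conclusion not in closed:
                closed.add(conclusion)
                changed = True
    return closed


# What the subject actually believes (no belief about yesterday's location).
belief_box = {"Bob is in London today"}

# The freely chosen initial "premise" of the imagining.
possible_worlds_box = {"Bob was in New York yesterday", "Bob is in London today"}

imagined = infer(possible_worlds_box)   # the plane inference is drawn here
believed = infer(belief_box)            # but not here: the boxes stay separate
```

On this sketch the freedom lies entirely in what is placed in `possible_worlds_box`; once the premise is chosen, `infer` proceeds exactly as it would over beliefs.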

Can this approach to answering (1) be extended to explain visualization? Defenders of such views think not, and I am inclined to agree.Footnote 8 If we accept the governing assumption of these views that beliefs are all “propositional” or sentence-like in nature, together with the mainstream view that mental images are not—at least, not entirely (De Vega et al. 1996; Fodor 1975; Kosslyn et al. 2006; Tye 1991)—then this strongly suggests that whatever “inference mechanism” (or principles of inference) governs inferences among beliefs cannot be the same one that guides inferences among states of visual imagination. At the very least, pursuing such an approach would require a more novel account both of belief itself and of the principles of inference governing belief than these theorists have proposed.

Nevertheless, Nichols and Stich’s proposal should alert us to another possibility worth considering, which is that visualizations are governed by rules of inference of a kind—just not those thought to govern belief. If there were such rules, then Nichols and Stich’s approach to freedom could be adapted as well: it may consist in one’s ability to choose which premises to “feed into” these rules. In thinking about what such rules might be like, it will help to have in hand an account of visual imagery that is broadly compatible with the idea that visualizing is constrained by principles of inference of a kind. Here, we may look to Rick Grush’s (2004) “emulation” theory.

Grush’s emulation theory of visual imagery

Grush’s (2004) emulation theory of visual imagery can serve as a foundation upon which to build towards answering our questions (1) and (2). Grush’s account serves this purpose well for three main reasons: first, it is compatible (as far as it goes) with a variety of views of visualization, including most of those that see it as a kind of “simulation” of visual perception; second, it is consistent with relevant neurological and psychophysical data concerning visualization (Grush 2004, p. 367); and, third, it provides details needed for answering (1) and (2) that are left out of many other accounts—specifically, it says something about the “rules of inference” governing visualizations. That said, Grush’s account provides only a foundation; to adequately address our puzzle, his basic picture will need to be significantly expanded in ways described below.

Grush’s view draws on well-known posits from control theory (e.g., “forward models”, “efference copies”) and signal processing theory (e.g., “Kalman filters”). To provide just a little of that background, it is commonly held that fast error correction for limb movements requires a process of prediction and comparison carried out subconsciously by the motor system (Blakemore et al. 2002; Miall et al. 1993; Wolpert et al. 1995). According to this view, when a motor command is issued (e.g., “open fist”), an “efference copy” of the command is sent to a neurally realized “forward model,” capable of generating a prediction of the sensory consequences of the act. If the prediction issued by the forward model “matches” the actual “reafferent” sensory input (as judged by a “comparator” mechanism), the reafferent signal is significantly “canceled” out or dampened. If, on the other hand, there is a mismatch between the prediction and input state, the sensory cancelation does not occur, and the organism is thereby alerted to change its approach. Such predictive mechanisms have been thought necessary to explain corrective arm adjustments made in grasping tasks that occur too quickly (200–300 ms) to result from the visual or proprioceptive monitoring of sensory feedback (Miall et al. 1993, p. 205; Wolpert et al. 1995).
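The predict-and-compare cycle just described can be sketched in a few lines. This is an illustrative toy, not a model taken from the cited literature: the function names (`forward_model`, `comparator`, `issue_command`), the one-dimensional “hand aperture” state, the effect sizes, and the tolerance value are all assumptions made for the example.

```python
def forward_model(aperture, command):
    """Predict the sensory consequence of a motor command (hypothetical toy model).

    The state is a single number: hand aperture in centimetres.
    """
    effects = {"open fist": +4.0, "close fist": -4.0}  # assumed effect sizes
    return aperture + effects[command]


def comparator(predicted, reafferent, tolerance=0.5):
    """Return the residual signal left after sensory cancellation.

    A match (within tolerance) cancels the reafferent signal, so the
    residual is 0.0; a mismatch leaves the prediction error intact.
    """
    error = reafferent - predicted
    return 0.0 if abs(error) <= tolerance else error


def issue_command(aperture, command, actual_outcome):
    """Run one predict-and-compare cycle for an efference copy of `command`."""
    predicted = forward_model(aperture, command)      # prediction from the efference copy
    residual = comparator(predicted, actual_outcome)  # compare with reafferent input
    needs_correction = residual != 0.0                # mismatch: change approach
    return predicted, residual, needs_correction
```

For instance, `issue_command(2.0, "open fist", 6.1)` yields a cancelled signal and no correction, whereas `issue_command(2.0, "open fist", 3.0)` leaves a residual error and flags the need to adjust.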

Grush seeks to extend these ideas to visual imagery, conceiving of visual images as “prediction” states that can be generated in the absence of incoming sensory information with which they might be compared—in his terms, humans generate imagery “by operating emulators of the motor-visual loop” (p. 386).Footnote 9 Classical work by Sperry (1950) and von Holst (1950/1973) posited similar mechanisms in the visual systems of simple organisms (e.g., fish, flies), but did not couch the idea in terms of “forward models” or “comparator” mechanisms. Grush’s idea is that visual imagining is the capacity to exploit the prediction mechanisms normally at play in visual perception, when perception of the relevant kind is not occurring. While the comparison between prediction state and reafferent input occurs below the level of consciousness during perception, visual imagery becomes conscious (i.e., is made globally available) during visualization precisely because it does not “match” the current input.

In filling in the details of his account, Grush compares human visualization to the visuospatial reasoning of robots created by Mel (1986, 1988). These robots are trained to navigate past obstructions to reach a goal object, using visual input gained through attached video cameras. In a first stage, the robot learns (through trial and error investigation of the setup) facts about how the environment relates to its possible movements, such as “if the visual input at t1 is x1, and motor command m1 is issued, the next visual input, at t2, will be x2” (Grush 2004, p. 386). When enough of these mappings have been learned, the robot “is able to solve the problems off-line using visual imagery…it moves the image of its arm around by means of the same motor commands that would usually move its arm around, seeing what sequences of movement impact upon objects” (ibid.). By learning (and storing) correlations between possible movements and visual inputs, these robots are “able to engage in visual imagery in which [they] can mentally rotate, zoom, and pan images” (p. 386).
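The two stages attributed to these robots, first learning mappings of the form (x1, m1) → x2 and then solving problems off-line from those mappings alone, can be illustrated with a minimal sketch. It is not a reconstruction of Mel’s systems: the grid world, the reduction of “visual input” to the arm’s position, the exhaustive exploration used in place of genuine trial and error, and the breadth-first search are all simplifying assumptions, as are the names `real_world_step`, `learn_mappings`, and `plan_offline`.

```python
from collections import deque

MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def real_world_step(position, command, walls, size):
    """The environment itself: where the arm actually ends up (toy 2D grid)."""
    dx, dy = MOVES[command]
    nxt = (position[0] + dx, position[1] + dy)
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    return nxt if inside and nxt not in walls else position


def learn_mappings(walls, size):
    """Stage 1: try every command from every free cell and record the result.

    Each entry has the form (x1, m1) -> x2, as in the quoted rule.
    """
    table = {}
    for x in range(size):
        for y in range(size):
            if (x, y) in walls:
                continue
            for command in MOVES:
                table[((x, y), command)] = real_world_step((x, y), command, walls, size)
    return table


def plan_offline(table, start, goal):
    """Stage 2: search over the learned table only; the world is never consulted."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, commands = frontier.popleft()
        if state == goal:
            return commands
        for command in MOVES:
            predicted = table[(state, command)]
            if predicted not in seen:
                seen.add(predicted)
                frontier.append((predicted, commands + [command]))
    return None  # no sequence of learned transitions reaches the goal
```

Note that `plan_offline` can return a route the robot never actually traversed during learning, since it recombines individually learned transitions; this is the respect in which the robots go beyond mere replay.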

It is not hard to see how these ideas might be extended to human beings and visual imagery. Through interaction with the world, our visual systems are trained to “expect” certain kinds of inputs provided particular combinations of a present input and motor command, and are able (by exploiting a “forward model”) to generate mental states constitutive of these predictions. Visualization may consist in generating such predictions, where the initial input is gained not through perception but through a kind of visual stipulation. After summoning an initial visual image, the subject chooses subsequent motor commands to send to the forward model, which continually generates predictions of subsequent visual states. So, supposing v1 stands for an initial visual image, and m1 a particular motor command (e.g., “turn head to the right”), the forward model computes a function taking v1 and m1 as input and giving v2 (a second visual image) as output, where v2 is the “prediction” of the sensory consequences of v1 and m1. v2 is then looped back as input to the forward model along with a second motor command m2, continuing the predictive processing so long as motor commands continue to be sent to the forward model.
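The looping structure described here, in which each predicted image is fed back as input together with the next motor command, can be made concrete with a short sketch. Everything specific in it is an assumption for illustration: a “visual image” is reduced to a single heading in degrees, the forward model is a fixed lookup rather than anything learned, and the names `forward_model` and `visualize` are hypothetical.

```python
def forward_model(view, command):
    """Hypothetical forward model: map (v_n, m_n) to the predicted v_(n+1).

    A "view" here is just the heading, in degrees, of an imagined observer.
    """
    turns = {"turn head right": 90, "turn head left": -90, "hold still": 0}
    return (view + turns[command]) % 360


def visualize(initial_view, commands):
    """Loop each prediction back in as the next input, one command at a time."""
    views = [initial_view]       # v1 is stipulated, not perceived
    for command in commands:     # m1, m2, ...
        views.append(forward_model(views[-1], command))
    return views
```

Calling `visualize(0, ["turn head right", "turn head right", "turn head left"])` returns the sequence `[0, 90, 180, 90]`: the first element is the stipulated starting image, and each later element is a prediction generated from its predecessor and a motor command.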

However, as an account of human visualization, Grush’s proposal remains incomplete. Granted, Mel’s robots take an important step beyond the inner DVR view, as their visualization does not consist merely in the orderly replay of pre-recorded footage; they can visualize navigating the maze along a path they haven’t taken before. Nevertheless, they remain entirely tied to visualizing one specific maze; they are not able to “break out” of their learned correlations to visualize a new environment, one characterized by different correlations between visual inputs, motor commands, and subsequent visual states. So, something more must be said about freedom.

We can potentially add in this freedom by allowing the robot to intervene “at will” in the unfolding of expectations, not just with a novel motor command but with a novel visual state—one which does not follow from the rules implicit in the forward model that has developed around navigating its “home” environment. It can then go on drawing out consequences in accordance with the forward model’s algorithms from that (stipulation) state. Of course, this new stipulation state must itself be a state from which the system is prepared (given its history) to make predictions. Provided this last condition is met, the robot could be said to visualize an endless variety of novel scenarios, simply using its rules for navigating its “home” environment plus the ability to intervene in its own processing with a novel visual stipulation when desired—essentially “moving the walls” of the maze when it does so.

Another limitation of Grush’s appeal to Mel’s robots worth mentioning is that the robots (as described) only visualize navigating static environments. Yet, in addition to static environments, humans also visualize situations where they remain still, and objects move in relation to them. Visualization must then involve algorithms characterizing not only how our awareness of the world changes relative to our motor commands, but also characterizing the way in which moving objects typically continue to move relative to us, and the bearing of such movements on our visual awareness of them as we potentially move as well. Here, we are perhaps beyond what we can expect a simple forward model to accomplish (recall, such models were initially posited only with respect to limb movements (Miall et al. 1993; Wolpert et al. 1995)). Nevertheless, the important point for our purposes is that we have in hand a collection of algorithms and stored representations that could plausibly serve to constrain sequences of visual imaginings in a way that is compatible with the freedom and creativity of visual imagination.

Notably, the algorithms in question only involve what we can think of as visuomotor regularities and concern simple properties like shape, location, and relative motion. Yet, visualization is thought to influence reasoning across a wide variety of contexts—from theorizing about other minds to the development of complex scientific hypotheses. It is reasonable to ask how our thinking about “higher level” properties connects with the more basic visuomotor relations captured by the sort of algorithms so far described. I will not have space here for much speculation on the matter. My aim is less ambitious, in that I hope to develop just the beginnings of an account of visualization responsive to (1) and (2). I take it that the right starting-point in thinking about visualization is with relatively simple “low-level” properties; given a plausible account of how we usefully visualize these, we can go on to ask about more complex kinds of properties.

For now, I hope it can be agreed that we have at least the outlines of an answer to (1). We can thus turn our attention back to question (2). How are we to conceive of the functional relationships (the typical causes and effects) that hold between such “imaginative” states and the subject’s beliefs, perceptions, and other mental states? It seems clear that, on Grush’s sort of approach, visualization constitutes a kind of visuospatial reasoning, even if it is reasoning about environments never before encountered. But, do the imaginative states cause judgments about the way various perceptual scenarios would unfold? Are they the judgments themselves? What are the options here? In the next section, I will describe three main approaches to filling in these details.

Of course, there might well be approaches to (1) that reject the core tenets of the Grush-inspired view so far described; I do not mean to deny their possibility. Rather, I am proposing to move forward with a relatively clear proposal for answering (1) to see where we end up when we try to account for (2) as well.

Three commitment views

To offer an appropriately broad range of options for answering (2), I will need to introduce the notion of a commitment. A commitment is a truth-evaluable representation concerning the way the world is, as held by a thinking subject (I do not assume commitments must be occurrent or consciously held). Beliefs are paradigm cases of commitments, though they arguably form only a subset of one’s commitments. Perceptual representations are also plausibly commitments; they represent the world to be a certain way, though (I will assume) they are not to be identified with belief (here I follow philosophical convention (Siegel 2008); see Sec. 5 for more on the relation between perception and belief). Notably, beliefs in generalizations (or ceteris paribus beliefs) are commitments, whatever their peculiarities. To believe that birds fly and trees have leaves is to believe truths about the way our world is, even if many birds don’t fly and some trees lack leaves.

Using the notion of a commitment, we can describe three options for approaching (2), each of which is consistent with the view outlined in the section on “Grush’s emulation theory of visual imagery”. On the first two views, visualizations constitute occurrent commitments about the way certain kinds of scenarios unfold; we can think of them as commitments concerning sensorimotor generalizations (details to come). The difference between these two options consists in whether the commitments are considered beliefs, or some other kind of commitment (“quasi-perceptual” commitments, perhaps). I will call the former approach the Belief View (“BV”) of visualization; to my knowledge, no one has previously defended such a view. The latter I will call the “Impinging Generalization” (“IG”) view of visualization. It is an “impinging” view to the extent that it sees visualizations, like perceptual representations, as belief-distinct commitments that impinge on one’s web of belief from without, typically causing beliefs with related contents.

A second kind of impinging view (and our third option) holds that visualizing strictly speaking involves misrepresenting one’s actual environment, for it involves representing as present a variety of things that are typically not present. I will call this the Misrepresentational Impinging View (“MRI”). This is a commitment view because it holds that visualizations represent various objects as present before one; however, because these commitments are not appropriately tied to action-guiding systems (they are triggered “off-line”), they do not lead to wildly mistaken behavior. Instead, such willful misrepresentations tend to cause beliefs about how certain scenarios generally unfold (or would unfold); hence its status with IG as an “impinging” view, influencing one’s web of belief from without.

MRI is motivated by the thought that visualization involves much the same kind of mental state as visual perception, and that perceptual states represent their objects as present. The difference between MRI and IG lies in the correctness conditions each attributes to visualization: IG holds that visualizations are veridical to the extent that certain kinds of scenarios unfold in a certain way; MRI holds that visualizations are veridical to the extent that the world right before one is a certain way (and therefore are almost always non-veridical).

A fourth option would be to deny that visualizations are commitments at all. One might insist that visualizations are simply not truth-evaluable. I set this option aside for the time being, as it is not compatible with the Grush-inspired approach to answering (1) developed in the previous section. For if it is true that sequences of visualization are governed by a set of predictive algorithms, then such sequences are plainly calculations of a kind and, as such, aim to calculate something correctly. Holding that visualization is, in general, a sequence of non-truth-evaluable representations simply opens up again the question of how we are to answer (1). I will, however, have more to say about the non-truth-evaluable proposal below (the section on “A last word about freedom”). For now, I turn to further explicating the three approaches just described, beginning with BV.Footnote 10

The Belief View of visualization (“BV”)

According to BV, visualizing consists in making occurrent judgments concerning visuomotor generalizations, where these judgments are beliefs. It sees visualization as on a par with mathematical reasoning. In line with the approach to (1) described in the section on “Grush’s emulation theory of visual imagery”, an initial visual image is taken together with an efference-copy motor command as input to a complex function, governed by sensorimotor algorithms. The output of the function is a subsequent visual image, itself combined with a further motor command and fed back into the function, and so on continually. It is the diachronic sequence of this reasoning that is truth-evaluable, and that constitutes the relevant belief—not the individual states. By analogy to mathematical reasoning, one does not form truth-evaluable representations as a means to accomplishing the reasoning in question. When I add 245 to 342, I do not need to represent that 245 is added to 342 as a kind of truth-evaluable hypothetical supposition entertained along the way to arriving at the answer. I simply carry out the operation “in my head” to the best of my abilities, using whatever algorithms or heuristics I have at my disposal, and conclude that the answer is 587. The reasoning process as a whole is assessable for accuracy and, plausibly, constitutes my occurrent judgment that 245 plus 342 equals 587. Of course, if I add these two numbers often enough, I might also acquire a stored belief that 245 plus 342 equals 587, one that could be recalled without my needing to do any addition in my head. But that is not the normal case. For most such problems, I have to go through with the calculation in order to render an occurrent judgment; I simply don’t have any commitments about the answers in question independent of my ability to go through with the calculations. For me to have “implicit” beliefs about such mathematical propositions just is for me to be disposed (given my algorithms and heuristics, etc.) to arrive at such occurrent judgments.

Similarly, visualizing scoring a goal in a soccer match can be seen as “working out” the problem of how things would go in such a circumstance, using visual as opposed to arithmetic representations and algorithms.Footnote 11 Suppose, as above, that v1 stands for the initial visual image, m1 the initial motor command, and v2 the subsequent visual image that is generated in accord with the relevant algorithms. According to BV, neither v1 nor v2 (nor their sequence) represents anything as present in the visualizer’s immediate environment; v1 and v2 considered singly are not truth-evaluable at all. Rather, the dynamic sequence of v1 and m1 generating v2 is the truth-evaluable entity, for it constitutes the organism’s commitment that a certain kind of visuomotor scenario would unfold in a particular way—its commitment that v1 plus m1 equals v2.

For clarity, it may help to translate (very roughly) the proposed representational contents into natural language, as follows: v1 = “a foot pulled back at angle t to strike a ball”; m1 = “swing foot forward at angle r”; v2 = “a foot striking a ball at angle s.” v1 and v2 do not purport to represent a foot and ball as present before one; they just represent a foot and a ball in different relationships to each other. Considering the processing sequence as a whole, however, we can think of it as akin to a belief in a generalization of the form: “A foot pulled back at angle t and a motor command to swing the foot forward at angle r results in a foot striking a ball at angle s.” To be clear, the suggestion is not that the belief itself has propositional or sentential structure, only that the processing sequence has comparable truth conditions to such a belief. Intuitively, one is just reasoning that a foot moving toward a ball at a certain angle results in its hitting the ball at a certain angle; this is a commitment about the way a certain kind of visuomotor situation would unfold. That one is disposed to reason in this way (given one’s algorithms) amounts to saying that one implicitly believes the generalization in question. Going through with the reasoning amounts to making that implicit belief explicit or “occurrent.”

Obviously, the belief can be either true or false. But, either way, BV grants that in going through with this reasoning one has “successfully” imagined scoring a goal, since imagining scoring a goal amounts (on the present view) to reasoning about scoring a goal, whether the reasoning is good or bad.Footnote 12
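
The structure of this proposal can be made concrete with a toy sketch. It is only an illustration under stated assumptions: the names Image, MotorCommand, learned_model, biased_model, and world_dynamics are hypothetical, and reducing an image to a single angle is a deliberate oversimplification. The sketch is meant to show just one thing: single images carry no truth value, while the sequence of v1 and m1 generating v2 can be assessed against how that kind of scenario would in fact unfold.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Image:
    """A visual image. Taken singly it has content but no truth value."""
    description: str
    angle: float  # toy stand-in for the image's spatial content, in degrees


@dataclass(frozen=True)
class MotorCommand:
    swing_angle: float  # toy stand-in for "swing foot forward at angle r"


# Hypothetical type: any rule taking an image and a motor command to a next image.
ForwardModel = Callable[[Image, MotorCommand], Image]


def world_dynamics(v1: Image, m1: MotorCommand) -> Image:
    """Assumed ground truth: how this kind of scenario would actually unfold."""
    return Image("foot striking ball", v1.angle + m1.swing_angle)


def learned_model(v1: Image, m1: MotorCommand) -> Image:
    """The visualizer's algorithms when they happen to be accurate."""
    return Image("foot striking ball", v1.angle + m1.swing_angle)


def biased_model(v1: Image, m1: MotorCommand) -> Image:
    """The visualizer's algorithms when they are systematically off."""
    return Image("foot striking ball", v1.angle + m1.swing_angle + 10.0)


@dataclass(frozen=True)
class Visualization:
    """The sequence (v1, m1, v2): the only truth-evaluable item in the sketch."""
    v1: Image
    m1: MotorCommand
    v2: Image

    def is_veridical(self, dynamics: ForwardModel, tolerance: float = 1.0) -> bool:
        # The commitment is that v1 plus m1 yields v2; it is true when the
        # scenario would really unfold that way (within the given tolerance).
        actual = dynamics(self.v1, self.m1)
        return abs(self.v2.angle - actual.angle) <= tolerance


def visualize(v1: Image, m1: MotorCommand, model: ForwardModel) -> Visualization:
    """Run the reasoning: generate v2 from v1 and m1 with the given algorithms."""
    return Visualization(v1, m1, model(v1, m1))


start = Image("foot pulled back", 30.0)
kick = MotorCommand(15.0)
good = visualize(start, kick, learned_model)
bad = visualize(start, kick, biased_model)
```

In this sketch both `good` and `bad` count as imagining the scenario, since each is an instance of the reasoning; only `good` constitutes a true commitment when checked against `world_dynamics`.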

BV requires that one can have beliefs with mental images as constituents. I will discuss some reasons one might resist that idea below. In the meantime, we can see roughly what the answer to (2) would be on this approach: these states of visualization just are beliefs, so they interact with beliefs, desires, and action-guiding systems as would any other beliefs in generalizations (modulo the involvement of visual imagery).

The freedom of imagination, on BV, is explained as indicated in the section on “Grush’s emulation theory of visual imagery”. At any point in the reasoning processes, the subject can intervene to insert a visual image that does not follow from the algorithms that generally govern visualization. When this happens, the visualization’s truth conditions should be assessed beginning from the point at which the intervention occurred, and ending before any further intervention. For at the point of the intervention, one has stopped trying to reason about how the initial kind of visuomotor scenario would go; one is “starting over” with a new (if connected) reasoning project. Of course, there may be an important cognitive point to the larger imaginative project, where the larger project contains the various interventions within it—interventions which allow one to represent an overall situation not previously perceived. The point is simply that in assessing the correctness conditions of the imaginative project, we must sometimes look at individual “pieces” of it—those that occur in-between stipulative interventions—for only these aspects are subject to the kinds of constraints that are suitable to give rise to correctness conditions in the first place.
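
The rule for assessing an imaginative project piece by piece can be sketched as follows. This is a minimal illustration, not a model of any actual mechanism: the names Step and assessable_segments are hypothetical, and images are reduced to text labels. Each segment begins at a stipulated image (or at the initial image) and ends just before the next intervention, matching the idea that correctness conditions attach to the stretches between stipulations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    image: str        # label standing in for a visual image
    stipulated: bool  # True if inserted at will, False if generated by the algorithms


def assessable_segments(steps: List[Step]) -> List[List[str]]:
    """Split an imaginative project at each stipulative intervention.

    Every returned segment starts with a stipulated (or initial) image and
    contains the images that followed from it before the next intervention.
    """
    segments: List[List[str]] = []
    for step in steps:
        # An intervention, or the very first image, opens a new segment.
        if step.stipulated or not segments:
            segments.append([])
        segments[-1].append(step.image)
    return segments


project = [
    Step("ball released from hand", True),
    Step("ball falling", False),
    Step("ball shooting upward", True),   # the subject intervenes here
    Step("ball shrinking into the sky", False),
    Step("ball out of sight", False),
]
segments = assessable_segments(project)
```

Here the project divides into two segments, one for the released ball and one beginning with the stipulated upward flight, and each would be assessed for accuracy on its own.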

The Impinging Generalization view (“IG”)

IG holds that visualizations have the same representational contents and correctness conditions as according to BV. There is no presumption that such states are misrepresentational, since, as with BV, they are veridical to the extent that they accurately reflect the way a certain kind of situation generally unfolds (and not to the extent that they represent how the world actually is before one). Again they constitute commitments because they are generated out of predictive, probabilistic algorithms whose nature it is to try to get these generalizations right. Also, as with BV, we can say that a person whose visualization misrepresents the way a particular kind of visuomotor scenario would unfold nevertheless imagines that scenario, since imagining is simply visual reasoning (good or bad). And both IG and BV account for the freedom of imagination in the same way.

The key difference with IG is that it denies visualizations are beliefs (I will say more about why one might deny they are beliefs in the section on “Illusion and encapsulation”). This difference calls for a difference in our answer to (2). Visualizations obviously need to interact with our beliefs in order to play the role they do in practical reasoning; if visualizations are not themselves occurrent beliefs, then they need to somehow transfer their contents to belief. We might conceptualize the relation by analogy to the relation between perception and belief: just as perceiving that there is a tree to the right typically causes a belief that there is a tree to the right, visualizing my foot striking the soccer ball (that is, the diachronic sequence of such) perhaps causes a belief in a conditional of the form “if I swung my foot forward just so, thus and such would happen,” or in a comparable generalization. I will come back in the section on “Illusion and encapsulation” to discuss this aspect of IG in relation to BV, considering how the “imagistic” portion of visualization might transfer its contents to belief.

The remainder of this section is devoted to assessing MRI and its relation to IG.

The Misrepresentational Impinging view (“MRI”)

The second impinging view, MRI, sees visualization as always involving mental states that are strictly speaking non-veridical. They are non-veridical because they are straightforward indicative representations about how the world is before one, akin to one’s perceptual representations; since the world before one’s eyes is rarely if ever the way one’s visualizations represent it to be, one’s visualizations are typically non-veridical. Nevertheless, it is in virtue of more or less the same set of rules or algorithms as posited for BV and IG that an initial visual image is followed by an ordered sequence of others, making it consistent with the architecture described in the section on “Grush’s emulation theory of visual imagery”. Freedom is also accounted for in the same way as with BV and IG—the subject can intervene at will to insert a state that does not “follow” from visualization’s predictive algorithms. An important addendum to MRI is that the states it posits are cut off from action-guiding systems—they are in this sense “off-line”. If they were not, they would presumably lead to hallucinatory behavior; one would act as though everything one was imagining were happening before one.Footnote 13 Instead (and in partial answer to (2)), their main output is (typically) to cause a belief that a certain scenario would unfold in a particular way. So, it would seem they have the same typical effects on belief as those proposed for IG.

So much for a quick summary of the three approaches. My sense is that most would initially favor both MRI and IG over BV, and then MRI over IG. I have not attributed the views to specific theorists, however, since each view goes into more detail concerning the precise representational characteristics of visualization than is specified in other accounts. However, Currie (1995) seems clearly to defend an MRI-style view when, having argued that visualizing amounts to simulating visual perception, he concludes that visual imagery “should also have a content that is potentially the content of a visual [perceptual] experience…the simulationist will say that the content of visual imagery is always of the form, ‘That I am seeing such-and-such’” (pp. 36–37). And, to the extent that Gordon (1986, 1992) and others see visualization as occurring “off-line”, they are implicitly committed to MRI and its view that visualization is inherently misrepresentational (why else would it need to occur off-line—that is, quarantined from action-guiding systems?). Rollins’s (1989, Ch. 5) account of “pictorial attitudes” may however represent a relatively rare case of an IG-style view. And while some philosophers have explicitly denied MRI’s claim that visual images represent their targets as present (McGinn 2004; Sartre 1966) and to this extent sympathize with IG and BV, they have not gone on to develop a positive account of what visualization’s representational properties are in a way that would allow for an answer to questions (1) and (2). So, it is difficult to assess what they would make of our three options.

In any event, I turn now to assessing IG against MRI.

IG vs. MRI

The most important difference between IG and MRI lies in how they view the correctness conditions of the collective states of visualization; I will argue here that IG’s approach is preferable.

However tempting it may be at first glance, MRI’s core commitment that visualizing involves (dimly?) misrepresenting one’s actual environment is problematic. The question of whether a mental state represents its target as present is plausibly a matter of its functional role, and visualizations do not under any normal circumstances lead to behavior appropriate to the visualized objects being present before one. Sure, one could insist that the “real” functional role of such states is disguised by the fact that they occur “off-line”; but the question is why we should think visualization is misrepresentational in the first place. Why posit these epicycles?

Some might think Perky’s (1910) famous experiment, where subjects reported that objects dimly (and unexpectedly) projected on a screen were their own imaginings (the so-called “Perky effect”), constitutes evidence that visual imaginings, like visual perception, represent their objects as present. But, if this is evidence that imaginings and perceivings both represent their objects as present, then we have far more evidence that imaginings do nothing of the sort, since the confusion hardly ever occurs; every non-confused imagining is evidence against the hypothesis. In any case, the instances of confusion can be attributed to shared representational characteristics, without going so far as to conceive of imagining as inherently misrepresentational. For instance, both visualizing and visual perception may synchronically represent colors and shapes in fixed three-dimensional coordinates, say, and do so at least partially in a ‘depictive’ or iconic format; and it may be a feature of both that, for any two parts of an object represented, the parts are represented as being in some determinate spatial relationship to each other. Such representational similarities are sufficient to account for the “phenomenal” similarity between the two. Further, it bears emphasis that visual imagery may re-use the visual system (Anderson 2011), without completely duplicating the representational characteristics of visual perception.

Note also that, notwithstanding claims of philosophers to the contrary, Perky’s results have proven difficult to replicate. The closest cases are described by Segal (1971), who notes that she was forced to give subjects a placebo, which they were told was a “relaxant”, before some would claim that projected images were their own mental images. Segal adds that some subjects apparently viewed the placebo as a kind of hallucinogen, as they would continue to claim the projected images were mental images even once the intensity of the stimulus was raised well above threshold (1971, p. 77). Moreover, the question of whether the Perky effect is a genuine phenomenon is further confounded by the fact that some psychologists use ‘the Perky effect’ to refer to the interference of visual imagery tasks on concurrent visual perception tasks (Craver-Lemley and Reeves 1987, 1992), for which there is ample evidence, while philosophers typically use ‘the Perky effect’ to name the supposed tendency of subjects to mistake an imagining for a perceiving.

A more subtle argument, however, might be offered in favor of MRI, to the effect that the forward model-based conception of imagery developed in the section on “Grush’s emulation theory of visual imagery” requires visual images (conceived of as “predictions” of sensory input) to have the same representational properties as the reafferent perceptual states they predict, if they are to appropriately match those states when assessed by the comparator. If perceptual representations represent their objects as present, so too (one might argue) should visual images.

However, according to theories invoking forward models and comparator mechanisms, the relevant predictions and comparisons happen below the level of consciousness; indeed, they shape and partly determine the nature of downstream conscious perceptual experience. At the pre-conscious level at which these representations are “compared,” it is reasonable to think that neither form of representation (proto-perceptual or imagistic) yet represents its object as present; rather, both may have generalized, non-truth-evaluable contents of the kind considered above, e.g., a ball moving left; or, a table receding to the right. If conscious perceptual representations represent their objects as present, this representational feature may be acquired downstream from such comparisons.

Another reason one might favor MRI over IG derives from the idea that, in order to find out what would happen if p, we must represent that p is the case “off-line” and see what we come to infer. This view of hypothetical reasoning is typically advanced as a part of theories of propositional imagination (Currie and Ravenscroft 2002, Ch. 1–2; Nichols and Stich 2000; Gordon 1992, p. 92). On these views, hypothetical reasoning involves representing (“off-line”) the hypothetical situation as actually occurring, and then seeing “what emerges as reasonable” (Currie and Ravenscroft 2002). So, for instance, in order to determine what will happen if it rains on the parade, say, one represents “it is raining on the parade” off-line from action-guiding systems and, similarly off-line, draws a number of inferences, such as: people are getting wet, the floats are flooding, the drum-major is wearing a poncho, etc. Based on these misrepresentationalFootnote 14 off-line indicative inferences, one then forms the conditional belief: if it rains on the parade, then people will get wet, the floats will flood, the drum-major will wear a poncho, etc.

Suppose we grant that propositional hypothetical reasoning—and even visualization—could conceivably proceed in this manner. Is the view inevitable? I think not. I argue elsewhere (Langland-Hassan 2011) that a simpler account, involving only ordinary belief and desire, is possible. Nor, I should add, do its proponents in the case of propositional imagination argue it is the only way that hypothetical reasoning could take place (see, e.g., Nichols et al. (1996)). Rather, they advance their views of imagination on other grounds (e.g., to explain childhood pretense (Nichols and Stich 2000)), noting that the posited architecture could also be deployed in (propositional) hypothetical reasoning.

Consider again an analogy to mathematical reasoning: to determine the answer to 245 plus 346, do I need to represent 245 plus 346 “off-line” and see what I come to infer? Surely not—I can just carry out the calculation “on-line”, using whatever heuristics and algorithms I have for accomplishing addition. If visualization is comparable in the ways already suggested, then there is no reason to treat it differently (however one wishes to treat propositional hypothetical reasoning). Imagining someone hitting a baseball with a bat involves an initial visual image v1 (of a bat hitting the ball at a certain angle) that is fed into algorithms governing how a ball would likely move as a result, together with algorithms governing how subsequent motor commands would affect what is seen, and so on, resulting in an output that then becomes the input to further “reasoning”. This hypothetical reasoning process about the way certain kinds of visuomotor scenarios unfold is on-line and assessable for accuracy; we should not think of it as misrepresentational and “off-line” any more than mathematical reasoning is misrepresentational and “off-line”.

It is worth emphasizing as a further virtue of both IG and BV over MRI that on both IG and BV we can hold that an instance of visualizing is veridical or not to the extent that it constitutes good (visual) reasoning. This ties the veridicality conditions of visualization straightforwardly to the impact of those states in successfully guiding behavior. Those attracted to teleological accounts of content—holding that the content of a state is best determined by what it helps us (or helped our ancestors) do—should find this a considerable advantage. MRI is unnecessarily awkward in this regard, holding that all visualizations are themselves non-veridical, yet often enough result in true beliefs about how certain kinds of scenarios would go. It draws a stronger parallel between perceptual representation and imagination than is warranted by any obvious consideration.

A last word about freedom

A last reason one might favor MRI over IG and BV (or at any rate mistrust IG and BV) is tied to the freedom of imagination. According to IG and BV, all visualization consists in making and/or rendering occurrent commitments—be they beliefs or commitments of another kind. This follows from the view of visualization as a kind of visual reasoning. Philosophers especially may chafe at this part of the account. Is it really at all plausible to think that all visual imagining amounts to making judgments about how a particular kind of situation would go? Can’t we imagine things going all sorts of ways we think they wouldn’t go? Isn’t that precisely what distinguishes visual imagination from visual hypothetical reasoning (supposing we allow for the latter)?

Of course, IG and BV have things to say about freedom—and these are essentially the same things MRI says. The freedom of imagination consists in our ability to interrupt the pattern of visual inferences that follow from the above-described algorithms to insert a new visual image (or short sequence of images) as a premise for further reasoning. For instance, suppose I visualize letting go of a baseball and, instead of imagining it falling to the ground, I imagine it shooting up into the sky. This is not what I judge would happen if I let go of a baseball. But, IG and BV respond, it is what I judge would happen if I let go of a ball and it began to go upwards at a great rate. In imagining the ball flying upwards, I have intervened in the default mode of imagining dropping a ball to carry out inferences about how a different sort of scenario would unfold—one where the ball flew upwards (on MRI one interrupts to carry out inferences about how a different scenario is unfolding before one). To the extent that the subsequent development of images is constrained by algorithms governing motor commands and their relation to visual input, the dynamics of moving objects, and so on, the sequence remains assessable for accuracy. As remarked above, we “start over” in assessing the veridicality of the visualization at each point where there is an intervention, as these in effect mark the boundaries of appropriately constrained reasoning processes.

Now, one may not care in certain cases of visualization whether one is “getting it right”—that is, whether one’s imaginings veridically represent the way the imagined scenario would unfold. And the more one intervenes with visual “stipulations” that replace the default algorithms’ output, the more likely one is not to care. In some cases, the larger point of the visualization may merely be self-entertainment, not problem solving. But, the visualizations may have correctness conditions and constitute commitments all the same; you are visually reasoning even when you don’t especially care whether you get the right result.

In addition, we must bear in mind that commitments come in many degrees of strength. Visualizations that extend well beyond what we have previously perceived may constitute only tentative commitments, just as guesses about the future can in general be tentative. But again, we can still consider them commitments; by doing so, we retain an answer to (1) and (2).

According to IG, BV, and MRI, the only way to completely avoid making (or activating) commitments during visualization is for the visualization to, at each step, consist in a visual stipulation—no image “following” from another via relevant algorithms. It seems clear that this is not how visualization normally proceeds, though the possibility of such episodes should not be dismissed. BV and IG would hold that such visualizations are not truth-evaluable, for here we would simply have a succession of images with non-truth-evaluable contents such as “a white ball…a green rake…a blue moon.” For it is only in combination with algorithms whose nature it is to predict subsequent images from prior ones that diachronic patterns of visualizations come to constitute calculations of a kind—calculations that by their very nature aim to get something right. Where a purely stipulative imagining is not truth-evaluable, it will not constitute a commitment; we will then expect it to have negligible cognitive effects (i.e., it will not be useful). This does not threaten our response to (1) and (2), so long as most visualization is not of this nature. MRI, by contrast, would hold that such purely “stipulative” imaginings are, like all imaginings, misrepresentational; that MRI does not capture in its account of correctness conditions the difference between such an imagining and a useful one is a weakness of the approach.

But, suppose that one maintained that most (or all) visualizations lack truth values completely. This might seem an attractive way of preserving the intuition that visualizing does not involve forming commitments of any kind. I don’t suppose many psychologists would espouse such a view, but it may well capture the sympathies of some philosophers (see, e.g., McGinn 2004, p. 21; Searle 1983). The obvious problem here is that in adopting a non-commitment view of visualization as our general approach, we lose our accounts of both (1) and (2). If each step of visualization is determined by a kind of willful stipulation, then there are no general principles governing how any particular image will develop across time in visualization. Nor have we any idea how the procession of such images could profitably influence belief in the many instances of practical reasoning where visualization is implicated, for it would be constrained by nothing other than one’s stipulative wishes. Holding that visualization is entirely (or even predominantly) stipulative is, from the perspective of cognitive science, a non-starter.

Let me try to put this point in another way, in response to a comment from a referee. A tempting view of visualization sees sequences of visual images as not themselves constituting commitments, but rather as offering candidate beliefs, or fodder for belief fixation. The idea is that in visualization we can “try out” a variety of scenarios without these rehearsals reflecting or constituting any kind of commitment. When one of these “candidates” seems especially plausible or compelling, we may form a corresponding belief, and only then arrive at any sort of commitment. Now, such an approach has likely given up on answering (1), which should be enough to stop us in our tracks. Setting that aside, we have the difficult question of what determines which candidates inspire belief, and which will not (this is question (2)). The only answer would seem to be that a visualization will inspire belief when it coheres with or relevantly “matches” a preexisting belief. But, this renders the visualization itself otiose, for we already knew what the visualization is telling us. The usefulness of visualization is unexplained. However, lacking any such belief to match with candidate visualizations, it remains entirely unclear how and why some visualizations inspire belief while others do not.

Nevertheless, an important concession can be made in response to those convinced that visualization is too “free” to constitute commitments of any kind. Often visualization is driven not so much by a conscious intention to solve a particular “problem”, but rather by what we might call “associative” principles that vary depending on individual psychologies. For instance, the way that daydream fantasies play out typically has less to do with solving a visuospatial problem, more to do with exploring possibilities that one finds pleasant or otherwise of interest. Let us call the two different kinds of causes of visualization “intentions to determine” in the first case and “desires to explore” in the latter. The different kinds of cause will impact the kinds of contents that are visualized, desires to explore often resulting in contents that depart more radically from the everyday. The more fanciful the subject matter of the visualizing, the more we may be tempted to think it does not constitute a commitment of any kind. But, we should resist that temptation, for the different kinds of causes and different characteristic contents need have no bearing on whether the processing is a kind of reasoning, constitutive of one’s commitments. Reasoning about fantastical scenarios and distant possibilities is reasoning all the same.Footnote 15 By resisting the temptation to deny that lighthearted reasoning about fantastical scenarios is a kind of reasoning, we preserve our answers to (1) and (2) and so have the beginnings of an account of visualization’s functional role within a broader cognitive economy.

MRI dismissed

As should be clear by now, MRI is no better suited than IG and BV to account for the kind of radical freedom—freedom from constituting any kind of commitment—that some might wish to attribute to imaginings. After all, MRI accounts for freedom in the same way as IG and BV, by allowing stipulations to occasionally intervene in the processing governed by various algorithms. MRI maintains that visualization is otherwise constrained by the same algorithms as IG and BV, and therefore constitutes a kind of visual reasoning. The difference is that its outputs are all strictly speaking misrepresentational and are generated “off-line”. One does not gain any extra measure of freedom by conceiving of the sequence of representations in this way—they are commitments all the same, albeit ones (like known illusions) that are believed to be false. On such a view, one is still reasoning about how a certain scenario would unfold—rather, is unfolding—it’s just that one is reasoning that the scenario is occurring in front of one, while “resisting” the mild illusion.

Once it is clear that MRI is no better suited than IG or BV to satisfy the intuition that visual imagination flies free of our actual commitments and that the similarity of visualization to visual perception need not entail that both represent their objects as present, much of MRI’s appeal relative to IG falls away. MRI unnecessarily divorces the correctness conditions of visualization from its role in guiding behavior, while explaining no features of visualization that IG cannot also accommodate. While more could undoubtedly be said in debating the merits of each, I propose to move forward to consider IG in relation to BV.

Illusion and encapsulation

Many will balk at assimilating visualization to a kind of occurrent belief, in accord with BV. While I won’t be able to touch on all the possible reasons for this resistance, I want to show that BV is not as implausible as it might seem. I want to leave BV a viable contender among other options.

Most of the reasons one would prefer IG to BV are, I think, essentially the same reasons one would resist assimilating perception to belief. I do not aim to deny the traditional distinction between perception and belief. Rather, in making a limited case for BV, I will argue that the most obvious reasons for distinguishing between perception and belief do not extend to the case of visualization and belief.

First, the matter of representational format might seem to pose an acute problem for BV. It is often held that the representations underlying human thought must be compositional and systematic. While there are many ways of understanding these notions, the rough idea is that human thoughts are composed of constituent concepts, and that for each thought there is a “canonical decomposition”—one way of divvying it up that reveals its constituent structure (i.e., its basic parts). This compositionality is held to explain the apparent systematicity of human thought (i.e., the (purported) fact that anyone who can think that a is F and b is G can also think that a is G and b is F) and its productivity (i.e., the ability of humans to entertain an unlimited variety of thoughts, while having a necessarily limited representational store). By contrast, it is often held that perceptual representations (and perhaps mental images) are ‘iconic’, where this means they lack a canonical decomposition; there is no single way of “cutting up” an iconic representation into parts and, by extension, no clear explanation of systematicity (see, e.g., Fodor (2000, Ch. 2)). If the representations underlying beliefs all have canonical decompositions, and visual images do not (let us suppose they are iconic), then we have an obvious problem assimilating visualizations to belief.

It is beyond the scope of this paper to weigh in on the (empirical) question of whether mental images fail tests of compositionality and systematicity. However, without making any assumptions one way or another about their underlying format, it is not obvious that they do. Plausibly enough, anyone who can visualize a red hen and a green rake can also visualize a green hen and a red rake; and anyone who can visualize John hugging Mary can visualize Mary hugging John, and so on. One reason some might think visualizations fail compositionality is that, like perceptual experience, they have nonconceptual content. In a way, this just restates the claim that visualizations lack compositional structure, since (in line with Evans’ (1982, Ch. 4) generality constraintFootnote 16), the mark of conceptual content is often held to be that someone who conceptually thinks that a is F and that b is G must be able also to think that b is F and a is G. However, the standard arguments for nonconceptual content seem not to extend to visualizations. Debates about nonconceptual content typically center on the observation that we can perceive and discriminate far more colors than we have concepts for (and, plausibly, beliefs about), and that therefore the content of such perceptions must be at least partly nonconceptual. On the face of it, visualization doesn’t represent properties as finely grained as perception; it doesn’t seem that we can form distinct visual images for every shade of color we can perceptually discriminate. Moreover, supposing evidence arose that we are able to visualize fine-grained differences in color—reliably visualizing red38 and red40 with representationally distinct visual images, say, as a means of solving reasoning tasks concerning the two colors—this would simply be evidence that we have the concepts of those colors after all. So, typical considerations linking fineness of grain and nonconceptual content do not transfer over to visualization.

Furthermore, in recent work on animal cognition, Rescorla (2009) shows how iconic “cognitive maps” capable of receiving probabilistic weightings can be used in deductive reasoning tasks that have traditionally been thought to require a systematic, combinatorial mentalese, and Carruthers (2009) argues that humans share “quick and dirty” “system-1” reasoning processes with animals and that these processes both satisfy a plausible construal of the generality constraint and constitute our genuine beliefs and desires (by contrast, he treats distinctively human “system-2” thought as faux-thought). If mental images are relevantly like cognitive maps and can both play a role in deductive inference and satisfy defensible interpretations of the generality constraint, then whatever differences they may have from natural language representations may have little bearing on whether they are suitable constituents of belief.

But, let us, for the sake of argument, assume that mental images lack canonical decompositions, fail tests of systematicity, and have nonconceptual content. The key question then becomes whether all beliefs must have canonical decompositions and purely conceptual content. The fact that we have beliefs that require a compositional, systematic “mentalese” does not require that the representations underlying all of our beliefs are compositional or systematic. Thus, arguments for the compositional and systematic nature of human thought are not arguments that a subset of one’s beliefs cannot involve iconic representations as constituents, or have nonconceptual contents.

Granted, admitting iconic representations into the realm of belief raises large questions concerning the principles by which these representations combine and interact with representations that do have canonical decompositions. Traditionally, one of the motivations for holding that beliefs have compositional structure has been that it allows for a tidy picture where rational inference consists in operations over uninterpreted symbols mimicking the inference rules of formal logic (where the expressions of formal logic have analogous constituent structure). Allowing into belief representations without canonical decompositions throws a wrench into these works.

In response, most theorists accept that the wrench is already there, to the extent that much human reasoning is clearly not guided by rules of inference analogous to those of formal logic (more on this below). And most leading theories of visual imagery already hold that visualization involves iconic and descriptive elements working in tandem, which helps to explain how visualization has a more determinate content than would be possible if it featured iconic representations exclusively.Footnote 17 Given the very close relation envisioned here between the iconic and the discursive, the idea that belief can contain both kinds of representation under its umbrella seems relatively conservative.

Finally, everyone concerned to explain the usefulness of visualization has to account for the inferential interaction between visualizations and the beliefs that do satisfy tests of compositionality and systematicity (and indeed between iconic perceptual representations and belief). This is not a burden peculiar to BV. Of course, one may insist that this particular problem should be enshrined in our terminology, by using ‘belief’ for the commitments that are relevantly compositional and systematic, and some other term for the commitments that are not. At this point, the dispute is merely terminological. I note, however, that the terminological decision to use ‘belief’ in this way does not fall straightforwardly out of common sense or folk psychology. (And again, this is all assuming, for the sake of argument, that mental images do fail relevant tests of compositionality.)

The best case I can see against mixing visualization and belief traces to the phenomenon of informational encapsulation; and I think, for many, this gets to the core of the intuitive resistance to BV. In Fodor’s (1983, 2000) terms, perceptual representations are “encapsulated” with respect to one’s beliefs, in the sense that informational exchanges between the two are asymmetrical (2000, pp. 62–63).Footnote 18 One’s beliefs are subject to influence by what one perceptually represents, but not the other way around. The asymmetry is easily illustrated by appeal to visual illusions. Take the well-known Müller-Lyer illusion, where two equal-length lines appear to be different lengths, due to the different orientation of arrows at the ends of each. Someone can know that the two lines presented are the same length and can be a model of rationality, even while perceptually representing the lines as being different lengths; for perceptual representations are immune to change or “revision” in response to belief contents. Perceptual representations, then, cannot be beliefs, for a fundamental feature of beliefs is that they are sensitive to influence and revision in the light of contrasting beliefs.Footnote 19 If visual imagination is quasi-perceptual in the specific sense that its contents are encapsulated from belief, then, arguably, it too cannot be a kind of belief.

Now, if MRI were the correct view of visualization, there would be a strong parallel between visualizing and knowingly viewing an illusion. For all visualizations would plausibly constitute cases of visually representing the environment in front of one as being a way it is believed not to be; the very possibility of unconfused visualization would seem to require its encapsulation from belief. But, as we saw in the section on “Three commitment views”, there is no reason to think that visualization is inherently misrepresentational; there is no reason to favor MRI over IG. The temptation to call visualization misrepresentational is grounded in the mistaken view that the freedom of imagination is dependent upon an ability to represent as present whatever we wish. But, as we saw, MRI affords no greater freedom for imagination than IG or BV. And if there is no reason to think that visualizations are inherently misrepresentational, then there is no reason to assume they will conflict with one’s beliefs in the way that perceptual representations of visual illusions often do. And that means there is no obvious reason to think visualizations must be informationally encapsulated from belief.

With this in mind, let us look more closely at the phenomenon of visual illusions as they pertain to imagination. For there are many reports in the psychological literature of visual illusions being “mirrored” or “replicated” in visual imagination (Finke 1989; Pressey and Wilson 1974; Wallace 1984).Footnote 20 But what does it mean to say that an illusion is mirrored or replicated in imagination? What is it to imagine an illusion? Illusions occur when a person misperceives an object or property as being some way that it is not—when there is a conflict between what the perceptual representation “says” about the stimulus, and the way the stimulus really is. If imagining does not involve an external stimulus, in what sense can we be said to imagine illusions? Where is the needed conflict between representation and reality? Note that this is a fair question even to the defender of MRI. Suppose someone imagines “as present” a set of lines that look just like the Müller-Lyer lines. Who’s to say he isn’t imagining two lines that really are of different lengths?

The short answer is that these studies all partly involve perception of a physical stimulus and that this provides the needed contrast. In all such studies of which I am aware, subjects are first shown a physical stimulus and are then asked to imagine something in addition to that stimulus—essentially “adding on” to it in some way. For instance, Bernbaum and Chung (1981) showed subjects a straight line and asked them to alternately imagine outward facing or inward facing arrows at its ends (this was called “imagining the Müller-Lyer illusion”).Footnote 21 Imagining the different ways of adding arrows resulted in different judgments about the length of the line, just as seeing arrows at the ends of each line typically leads to different judgments about their relative lengths. But here, the “illusory” contrast is between the various ways the line is perceived to be and the line itself. And, of course, we already accepted that perceptual representations can conflict with one’s beliefs—the question is whether imaginative ones can. Such examples involve no contrast between the contribution of imagination itself (e.g., the imagined arrow-ends) and belief. And so, they offer no reason to conclude that visual imagination as such is encapsulated from belief.

What the experiments do provide evidence for is that there are distinctively visual ways of reasoning and that under specific circumstances, these ways of reasoning can lead us astray about how things actually are (in some cases, this is all the studies’ authors are trying to show). If we are using a ruler to determine the length of a line, adding arrows at its ends obviously will not affect our judgment of its length, but if we are trying to judge the length just by looking, characteristics of the visual system make it such that adding arrows of different kinds at the ends can “throw off” the judgment. Because certain representational characteristics of visual perception are mirrored by visualization, our vision-based judgments can be thrown off by visualizing additions to figures in the same way they would be thrown off if we saw the figures with those additions in place. But, this does not show that visual imagination is encapsulated from belief—only that there are distinctively visual methods of problem solving.

It is well known that humans use a wide variety of inference rules and heuristics in practical reasoning (Gigerenzer et al. 1999). It is not as though all (or even very much of) human reasoning proceeds via the inference rules of formal logic (Johnson-Laird and Byrne 2002). Much everyday inference is inductive or probabilistic in nature. Consider, for example, the processes through which one chooses between competing explanations that are logically compatible with one’s evidence. Moreover, it is well known that many of the heuristics actually deployed by humans often lead to reasoning errors, and that different heuristics can return contradictory results (Nisbett and Wilson 1977). For instance, in work on preference reversals, whether a subject deems one option better than another often changes depending on how the options are described, even when the different descriptions don’t entail real differences in the options described (Tversky and Thaler 1990). In such cases, one arguably uses different heuristics to reason about the cases, depending on how they are described. Such heuristics return contradictory results, even though the outcomes being reasoned about remain the same.

Similar results have been found concerning the calculation of comparative probabilities. A famous example comes from Tversky and Kahneman (1983), where subjects were given a description of a woman “Linda” who is “outspoken” and “concerned with issues of discrimination and social justice.” Asked whether it was more likely that Linda was a feminist bank teller, versus merely a bank teller, subjects typically replied that the former was more likely, even though a straightforward principle of probabilistic reasoning holds that A&B is never more likely than A by itself. Subsequent research has shown that how the problem is approached (and whether one succumbs to the faulty pattern of reasoning) depends much upon how the question is posed (Gigerenzer 1991, 1996). The important point here is that humans attack problems in a variety of ways. The fact that these algorithms and heuristics sometimes return differing answers does not entail that they are operating on different kinds of mental states (Samuels 1998). So, the fact that visual reasoning sometimes generates different judgments than non-visual reasoning does not by itself show that the mental states involved in visualization are not themselves beliefs (or components of beliefs). Of course, it does not show that visualizations are beliefs; it shows only that we don’t yet have a clear reason for thinking they are not.

But, do these considerations threaten the distinction between perception and belief as well? Is perception not just one more reasoning heuristic? The key difference perception retains with both belief and visualization is its stimulus-dependence. All things equal, the only way to change a visual perceptual representation of two lines, so that one is represented as longer than the other, is to change the lines themselves. By contrast, we can through an act of will reason about whatever length lines we wish; commitments featured in chains of reasoning are (crucially) not stimulus-dependent. Visualization has the key characteristic of stimulus independence in spades, notwithstanding the phenomenon of “imagining illusions.” This renders it more suitable to be seen as a kind of reasoning process than as a quasi-perceptual one. Of course, while stimulus independence guarantees a measure of flexibility and “freedom” uncharacteristic of perceptual representations, it does not guarantee independence from broadly rational constraints. But, that just restates one of our main theses: the freedom of imagination is a freedom to reason about topics of one’s own choice, not a freedom to reason (or perceive) however one wishes. By contrast, perceptual representations are not suitable to be governed by rational constraints of any kind, precisely because their stimulus-dependence places them outside of one’s cognitive control.

If BV is not obviously false, should we prefer it to IG? I think so, if only tentatively. BV has simplicity on its side; there is no interaction between belief and a different kind of mental representation that needs to be explained; visualizing just is a way of updating (or rendering occurrent) one’s beliefs about how certain scenarios would unfold. IG is oddly redundant in this regard; it holds that in visualizing we generate a representation of how a certain kind of scenario unfolds, just so that it may then cause a belief with the same (or nearly the same) content. As remarked above, one might advocate IG on the grounds that visualizations (or parts of visualizations) fail tests of compositionality and systematicity, supposing we had good grounds for thinking that they do. But, this would simply amount to marking with terminology the problem of how reasoning processes without the relevant compositional structure interact with those that have it. While this is an important and difficult question, there seems little reason to mark it by using ‘belief’ for one kind of commitment and something else for the other.

In sum, once we have concluded that visualization is a useful, stimulus-independent, and not inherently misrepresentational means of generating (or reactivating) commitments, it is hard to see why IG should be preferred to BV. A more definitive conclusion, however, would require defending a more precise set of principles for individuating mental states, and a deeper investigation into the metaphysics of belief than I have had space for here. What I hope to have established is that the matter of whether visualization is a kind of belief hangs on very different kinds of questions, and far more subtle ones, than most have usually thought. If BV is false, it is not obviously false. Should it turn out to be false nonetheless, IG is a worthy alternative.

Conclusion

I have proposed a new place for visualization within a broader cognitive architecture. The freedom of visual imagination is not a freedom to misrepresent the world; it is not a freedom to entertain representations we believe to be false. Rather, it is a freedom to engage in visuospatial reasoning about (sometimes fantastical) topics of our own choosing. Understanding imagination’s freedom in this way allows us to simultaneously explain its usefulness to practical reasoning.

Once we understand diachronic patterns of visualization as constituting visuospatial commitments, the question arises as to the relation between these commitments and one’s beliefs. I have argued that we should broaden somewhat the traditional conception of belief, to allow visualizations in as a particular species. Despite their sensory character, visual imaginings are not informationally encapsulated from belief in the way of visual perceptual representations. And while assimilating visualizations to belief leaves open the question of how “imagistic” representations inferentially interact with purely “propositional” ones, this question needs answering even if we do not assimilate visualization to belief. Allowing the assimilation assures that we see past the dogma that visualizing involves misrepresenting the world before us, while encouraging an appreciation of its proper place among our rational faculties.