Deep Learning: How the Mind Overrides Experience is not only breathtaking in scope and intellectual range, but also beautifully written and completely engaging. Written from the perspective of a mainstream cognitive scientist who wants to discharge all “homunculi” in his theory building, Ohlsson tackles a topic that is by no means mainstream in the current field of learning—how the mind can override past experience. Standard learning theories focus on how we use past experience to extract regularities that guide future action—what he calls “monotonic” forms of learning, because they directly build on or add to what one already knows without calling for revision. Ohlsson’s fundamental premise is that because we live in an ever-changing rather than clockwork world, it is important to have learning mechanisms that allow us to transcend our previous experience. Further, he argues that three quite different forms of cognitive change—the production of new creative insights, the adaptation of skills to changing circumstances, and the revision of beliefs in conversion—all share common characteristics as three instances of “non-monotonic learning,” that is, forms of learning that involve the fundamental modification and revision of prior knowledge.

1 Overview of Ohlsson’s Arguments and Contributions

His book, divided into five main sections, can be read in a variety of ways and makes a number of distinctive contributions. In the first section (Introduction) he argues why non-monotonic learning is important and provides an explicit articulation of a meta-theory for adequate theory building about these issues in cognitive psychology. The next three sections (Creativity, Adaptation, Conversion) each focus on a specific form of non-monotonic learning. For each form, he begins by outlining the critical questions an adequate theory of that domain must answer and by critically reviewing prior research in the domain. These critical reviews call attention both to important insights from prior research and to key limitations in answering those questions. He then clearly articulates his assumptions about routine processing and monotonic learning in each domain, which he argues provide a starting point for a clear theory of non-monotonic learning in that domain. Although he presents these as syntheses of prevailing views rather than as novel contributions, he provides an unusually clear and succinct formulation of these theories. He then presents what he calls his distinctive “sub-theory” for each domain, which makes a novel contribution and provides answers to his key questions. Finally, in the fifth and last section (Conclusions) he provides an explicit articulation of ten general principles that may collectively constitute the necessary and sufficient properties of a cognitive system capable of non-monotonic learning. In this review, let me highlight what I think is most important and provocative about these contributions, along with my thoughts about some of the important questions that remain unaddressed in his theories.

2 Articulation of the Problem and the “Meta-theory” (Chapters 1, 2)

The first section—the Introduction—sets the stage for the volume by outlining the need for the mind to override experience (Chapter 1) and the criteria that must be met to have a satisfactory explanation for this type of learning (Chapter 2). He argues that because we live in a turbulent rather than a clockwork world and because our species depends especially heavily on learning as a form of adaptation, we need to have developed learning mechanisms that allow us to override past experience. As part of this initial argument, he beautifully reviews how 20th century advances in the sciences, with the emergence of complex systems theory, have fundamentally changed our view of material reality from one in which systems with clockwork predictability are the regular case and those with unpredictability are the exception, to a view in which turbulent and unpredictable systems are the norm. He then lays out some of the key properties of complex systems.

In Chapter 2 Ohlsson tackles the difficult question of what is required for a satisfactory theory of cognitive change. By laying out his assumptions explicitly, he not only makes them open for critical inspection and debate, but also challenges cognitive psychologists to develop more explanatorily adequate theories. A frequent criticism of theories in psychology by those who, like Ohlsson, engage in computer modeling and simulations is that our theories are woefully “under-specified”—that is, they are not sufficiently articulated to allow a computer to perform the computations involved.

More specifically, his meta-theory assumes that such a theory needs to focus on the mind and its changing representations, and it includes seven core principles about mind as well as seven criteria for an explanatorily adequate theory of cognitive change. The first five criteria focus on what is needed for a theory to be completely articulated:

1. A description of the explanatory target […]

2. A background theory of the relevant aspect or aspects of the cognitive architecture […]

3. A repertoire of learning mechanisms […] The micro-theories proposed in this book distinguish mechanisms for monotonic learning from mechanisms for non-monotonic learning.

4. A specification of the triggering conditions under which each learning mechanism tends to occur.

5. An articulation of the mechanisms and triggering conditions vis-à-vis the explanatory target […] (Ohlsson 2011, pp. 48–49)

In addition, two criteria deal with issues of scale-up and implications for practice. As a systems theorist, he is well aware of the complex relations that can occur across different scales, but argues that a theory is better if some properties “punch through” to different scales, while recognizing that there will also be other emergent processes and properties. He also regards an explanation as more adequate if there is evidence that it provides insight about how to support more successful practice.

3 Creativity (Chapters 3, 4, 5)

A good sense of his overall method and approach to each form of cognitive change is evident in the first section on Creativity. In Chapter 3, Ohlsson identifies four key questions to be answered by a successful theory of creativity:

First, how are novel ideas possible? […]

Second, what are the key features that distinguish creative processes and justify calling them creative? […]

Third, what gives direction to the creative process? […]

Fourth, what are the limiting factors? Why is it difficult to create? […]

(Ohlsson 2011, pp. 116–117)

He argues that past theories have proposed a number of different ideas about how novelty is possible that involve partial insights and good principles; the problem is that these theories have not been sufficiently articulated within a background theory of (non-creative) analytical problem solving to offer answers to the other creativity questions. Thus, he criticizes past theories more for their incompleteness than for being false.

His new theory draws on multiple prior insights about what makes novelty possible (which he catalogues as novelty through combination, accumulation, and restructuring), but reassembles, integrates, and embeds them in a more elaborated background theory in order to answer the four creativity questions. His key “explanatory target” is accounting for the alterations in “mode and tempo” in the key phases of “the insight sequence,” which he describes as involving “search, impasse, insight, and aftermath.” More specifically, he wants to explain what leads to an “unwarranted impasse” in solving small-scale insight problems such as the ones used by Gestalt psychologists (an impasse when one actually has the conceptual knowledge needed to solve the problem) and how that impasse is overcome.

His redistribution theory of creative insight (outlined in Chapter 4) contains: (a) an articulated background theory of analytical (non-creative) problem solving, which he claims is not new, but a synthesis of much prior work; and (b) a novel theory of insight that explains what causes an unwarranted impasse, and what leads to its resolution, in terms of the cognitive architecture assumed by the background theory. The background theory proposes that “analytical thinking unfolds through interactions among three processes: problem solving, knowledge retrieval, and heuristic search” (p. 93). He outlines a number of key principles that constrain perceptual processing and hence how one sets up an initial representation of the problem, whether the problem is presented through visual information or language: “combinatorial, layered processing, mutual constraints within each layer due to horizontal excitatory and inhibitory links and context effects implemented via downward feedback links” (p. 98). He also articulates a number of principles about knowledge retrieval, which include ideas about spreading activation in a complex knowledge network, consisting of nodes and links among nodes, and the ways the strengths of links are affected by past experience (e.g., frequency of use, recency of use). Once a representation of a problem is established, it triggers activation of nodes in the network, and this activation determines “the space of solutions that can be reached via analytical thinking” (p. 102).
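To make these retrieval principles concrete, here is a minimal sketch, in Python, of spreading activation over a node-and-link network. Everything below (the class name, decay factor, threshold, and update rule) is my own illustrative assumption; Ohlsson states the principles, not an implementation.

```python
# Minimal sketch of spreading activation (illustrative assumptions only).
from collections import defaultdict

class KnowledgeNetwork:
    def __init__(self):
        self.links = defaultdict(dict)        # node -> {neighbor: strength}
        self.activation = defaultdict(float)  # node -> current activation

    def add_link(self, a, b, strength):
        # Link strengths stand in for the effects of past experience
        # (e.g., frequency and recency of use).
        self.links[a][b] = strength
        self.links[b][a] = strength

    def spread(self, sources, steps=3, decay=0.5, threshold=0.25):
        # Seed activation from the nodes in the problem representation,
        # then let it propagate, attenuated by link strength and decay.
        for node in sources:
            self.activation[node] = 1.0
        for _ in range(steps):
            incoming = defaultdict(float)
            for node, act in list(self.activation.items()):
                for neighbor, strength in self.links[node].items():
                    incoming[neighbor] += act * strength * decay
            for node, extra in incoming.items():
                self.activation[node] = min(1.0, self.activation[node] + extra)
        # Nodes above threshold delimit the reachable solution space.
        return {n for n, a in self.activation.items() if a > threshold}
```

On this picture, the initial problem representation fixes the seed nodes, and the structure and strengths of the links then fix “the space of solutions that can be reached via analytical thinking.”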

His theory assumes that the prior structure of our knowledge network determines how the problem is initially defined, and hence the initial space of solutions that will be searched. Classic insight problems are difficult to solve because they are designed so that the way we initially structure the problem on the basis of past experience is likely to be “unhelpful.” That is, the desired solution is not within the initial space of solutions considered. He then goes on to assume that “persistent unsuccessful solution attempts cause negative feedback to be passed back down the layers of processing units. The experience of failure—more generally, a negative evaluation of the outcome of a problem solving step—causes activation to be subtracted from the processing units that were instrumental in producing it” (pp. 107–108). His theory includes six principles governing how activation is distributed and redistributed in the network, so that negative feedback can ultimately lead to a qualitative change in problem representation. His theory thus proposes a critical role for negative feedback in the resolution of impasses.
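The redistribution mechanism can be sketched in the same spirit: repeated failure subtracts activation from the units that produced the failed step, until a previously suppressed way of structuring the problem wins the competition. The penalty value and the two named framings below are hypothetical.

```python
# Illustrative sketch of redistribution via negative feedback
# (penalty size and option names are assumed, not Ohlsson's).
def negative_feedback(options, failed_option, penalty=0.3):
    """Subtract activation from the units instrumental in a failed
    solution attempt; competing representations may then dominate."""
    options = dict(options)
    options[failed_option] = max(0.0, options[failed_option] - penalty)
    return options

# Two ways of structuring an insight problem; the "unhelpful" framing
# initially dominates because of prior experience.
framings = {"familiar_framing": 0.9, "novel_framing": 0.4}
while max(framings, key=framings.get) == "familiar_framing":
    # Persistent unsuccessful attempts feed back down the network ...
    framings = negative_feedback(framings, "familiar_framing")
# ... until the balance tips and the problem is re-represented (insight).
print(framings)  # novel_framing now dominates
```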

In concluding this chapter, Ohlsson carefully considers how his proposed theory of the insight sequence relates to other theories, noting that it is rare in cognitive psychology for different proposed explanations to be truly rivals; more typically they address somewhat different questions, or propose mechanisms that differ in level of generality or are complementary rather than mutually exclusive. In that spirit, he argues that the earlier explanations by Gestalt psychologists of the cause of impasses in terms of the role of “functional fixedness” or “prior problem solving set” are simply two special cases of his more general principle of “unhelpful prior knowledge.”

In addition, he argues that his account of how impasses are resolved in terms of the propagation effects of negative feedback tipping the balance among competing options offers a principle (and mechanism) that is genuinely new and complementary to other mechanisms proposed in the prior literature. Much of the prior literature had focused on how “turning away” from the problem after a period of intense preparation may help, either because it allows some “differential forgetting” of the current approach or allows for some new external events to intervene that may result in some “fortuitous reminding” that changes the problem representation. He thinks it is likely that all three mechanisms play a role in overcoming impasses, although the prevalence and importance of each mechanism is unknown.

In the final chapter in the Creativity section, he talks about issues of scale-up. What happens if the process envisioned in the micro-theory for simple problems is carried on over a longer time period and for more complex problems? He concludes: “The important conclusion is that creativity scales across complexity through the accumulation of multiple insights, not via some different or unknown cognitive mechanism or mysterious ‘creative ability’” (p. 141). At the same time, he also notes that there are new emergent processes that can occur at different scales that are not well explained by his micro-theory. For example, he argues that problem finding and serendipity are not well explained by it. In addition, he thinks the role of externalities (which is not part of the micro-theory) may become increasingly important in scale-up.

Thinking about his proposed theory raised a number of questions for me: Does his principle of negative feedback have enough direction? How “dumb” and automatic is the learning mechanism he has in mind? What role, if any, does he see for a reflective meta-cognitive response? One can imagine different individual responses to negative feedback, some of them more productive than others. The only response he considers is simple persistence in the face of negative feedback (rather than giving up), perhaps because he wants the effects of negative feedback to be automatic and involuntary. But what about one’s thoughtful consideration and analysis of potential reasons for the impasse, where one asks questions like: Why am I having difficulty? Am I having trouble because of a faulty technique or a simple error? Do I need to regroup, change direction, and learn about something new? It would seem that such analysis would provide additional sources of direction in the face of negative feedback. Further, adding meta-cognitive reflection could have important practical implications: to improve one’s chances for breakthroughs and success, one can teach people to ask good meta-cognitive questions of themselves and to put themselves in a position where they get good critical feedback from others. It would be interesting to explore whether creative individuals are actually better at asking themselves such questions and seeking critique, not simply more likely to persist.

Another set of questions for me concerned the extent to which explanations of unwarranted impasses should be foundational for explaining large-scale creative insights (as in Ohlsson’s theory), or whether there might be important positive discovery heuristics (in addition to the effects of negative feedback) that actively aid in changing problem representations. Remember: unwarranted impasses are cases where one already has the cognitive representations needed to solve the problem but doesn’t initially think to use them because of the way the problem is initially construed. But in the case of large-scale creative problem solving, such as occurs in novel theory construction in science, the cognitive representations needed to solve the problem don’t already exist in the mind of the thinker. Indeed, as Nancy Nersessian (2008) and John Clement (2008a) have so beautifully described in their detailed case studies of innovations in scientific problem solving, these representations need to be actively constructed through effortful cycles of model-based reasoning.

The work of Nersessian and Clement calls attention to another important pattern observed within studies of scientific creativity and innovation, besides the insight sequence pattern that Ohlsson focuses on—namely that “the use of analogies, thought experiments, and imagistic representations figures prominently in creative problem solving across the sciences” (Nersessian 2008, p. 131) and that these episodes of conceptual innovation typically involve combining partial insights from multiple domains into a workable model (not just directly importing a solution from a retrieved analogy). Nersessian notes that each of these three processes may be useful in creative problem solving precisely because each involves ways of changing representations, and concludes that model-based reasoning is common in episodes of innovation because it is a particularly effective means of selectively abstracting and integrating constraints from multiple sources in the service of solving problems.

These further questions suggest ways that Ohlsson’s redistribution theory of creative insights may not be wrong, so much as incomplete. As Ohlsson himself argues, one would expect theories in cognitive psychology to involve multiple mechanisms working in interaction with each other. If Nersessian and Clement are right, however, some of these mechanisms may be considerably more complex than those currently envisaged by Ohlsson, and include the need to consider the coupling of internal and external representations in the service of problem solving and the role of embodied knowledge (a challenge to current Artificial Intelligence approaches).

4 Adaptation (Chapters 6, 7, 8)

The second form of non-monotonic learning Ohlsson examines concerns how we modify our existing skills to adapt to changing situations and circumstances. Again he takes a three-part, three-chapter approach: in the first chapter he lays out the key questions that a satisfactory theory needs to answer, and reviews prior research to set the stage for his new theory about a specific learning mechanism; in the second chapter he extends prior work with the presentation of his novel micro-theory of learning via specialization; and in the third chapter he considers how the proposed new mechanism would scale across time, complexity, and system level. Although the general approach is the same, all the specific details (and learning mechanisms) are different from those considered in the case of Creativity, in keeping with his general idea that each form of cognitive change calls for its own theory.

In the case of skill acquisition, he identifies six key questions that a satisfactory theory needs to address:

1. Mechanism: How, by what cognitive processes, does practice exert its effects? What is the structure of a single, local mutation in a cognitive skill?

2. Sufficiency of practice: How can practice be sufficient to produce improvement?

3. Necessity of practice: Why is practice necessary to acquire a skill?

4. Gradual improvement: Why is improvement gradual? Why does it exhibit a negatively accelerated rate of change?

5. Transfer effects: How are acquired skills applied and re-used—transferred—to novel or altered task environments?

6. Efficacy of instruction: Why is instruction possible? How does it work? What are the factors that determine its effectiveness? (Ohlsson 2011, p. 177)
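Question 4 above alludes to the classic shape of practice curves. One standard way to express a negatively accelerated rate of change, drawn from the general learning-curve literature rather than from Ohlsson’s own notation, is the power function:

```latex
% Time T to perform a task on practice trial N (standard power law of
% practice; notation from the general literature, not Ohlsson's):
%   A      -- asymptotic (best attainable) performance
%   B      -- the performance range to be eliminated by practice
%   E      -- prior experience, in trial-equivalents
%   \alpha -- the learning rate
T(N) = A + B\,(N + E)^{-\alpha}
```

Because the curve’s slope shrinks in magnitude as N grows, improvement is rapid early in practice and increasingly gradual thereafter, which is the pattern any candidate learning mechanism must reproduce.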

He then articulates a stable background theory of the cognitive architecture underlying skilled performance with its production system rules based on Goal, Situation, Action triads (i.e., what Actions are triggered when you have a certain Goal and are in a certain Situation). Within this architecture and its associated processes for executing rules, he then turns to consider mechanisms by which skills can be acquired and transferred.
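A minimal sketch of how such rules might be encoded and executed may be helpful; the representation below (a dataclass with a predicate for the Situation) is my own illustrative assumption, not Ohlsson’s formalism.

```python
# Illustrative Goal-Situation-Action production rules (assumed encoding).
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProductionRule:
    goal: str                          # the Goal the rule serves
    situation: Callable[[dict], bool]  # test on the current Situation
    action: str                        # Action triggered when both match

def match_and_fire(rules, current_goal, state):
    """One execution cycle: collect the Actions of every rule whose Goal
    matches the current goal and whose Situation test is satisfied."""
    return [r.action for r in rules
            if r.goal == current_goal and r.situation(state)]

rules = [
    ProductionRule("make_tea", lambda s: s["kettle"] == "cold", "turn_on_kettle"),
    ProductionRule("make_tea", lambda s: s["kettle"] == "boiling", "pour_water"),
]
print(match_and_fire(rules, "make_tea", {"kettle": "cold"}))  # ['turn_on_kettle']
```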

His review of prior work on skill acquisition is a real tour de force—part historical review, part theory articulation, and part conceptual synthesis. Rather than criticizing prior theories as inadequate to answer these questions, he argues this work is one of cognitive psychology’s real success stories: over the last century we have made remarkable progress in developing an understanding of the learning mechanisms involved. His historical review starts with the seminal work of Thorndike on how positive and negative consequences can change the strength of a behavior (the Law of Effect), traces the changes to these principles with the emergence of Wiener’s cybernetic theory and the idea of positive and negative feedback (which act on plans and other representations, not directly on behaviors), chronicles the multi-mechanism theories of learning proposed by Gagné, which sought to articulate the triggering conditions for different modes of learning, describes the pioneering work of Fitts on stages in skill acquisition, and then goes on to consider the vast proliferation of theories of specific learning mechanisms that has subsequently occurred within cognitive psychology and artificial intelligence.

How to bring order to this tangle of riches? Here Ohlsson has a powerful insight: Why not organize the learning mechanisms by the kind of information that they utilize? This leads to his exceptionally clear articulation of the “Nine-Modes” theory of skill acquisition, in which he proposes that there are nine types of information that can be drawn on during skill learning and that different sources of information may be particularly important at different phases of the process. In the first phase, which involves figuring out how to get started, five types of information and their corresponding learning mechanisms are especially important—translating verbal instructions into action sequences, reasoning from prior declarative knowledge about the problem, making analogies with previously mastered strategies, using information from demonstrations, and trial and error. Two additional sources of information are particularly important in the second phase, in which one practices to achieve mastery (generalizing from positive outcomes, learning from error), and two more during the final optimization stage (detecting shortcuts, optimizing performance based on statistical regularities).
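For reference, the nine modes and their phases, as just described, can be collected in a small structure (the one-line labels are my compressed paraphrases of Ohlsson’s descriptions, not his exact terms):

```python
# The Nine-Modes theory's information sources, grouped by the phase of
# skill acquisition in which each matters most (paraphrased labels).
NINE_MODES = {
    "getting started": [
        "verbal instructions translated into action sequences",
        "reasoning from prior declarative knowledge",
        "analogies with previously mastered strategies",
        "information from demonstrations",
        "trial and error",
    ],
    "practicing to mastery": [
        "generalizing from positive outcomes",
        "learning from error",
    ],
    "optimization": [
        "detecting shortcuts",
        "optimizing on statistical regularities of the environment",
    ],
}
assert sum(len(modes) for modes in NINE_MODES.values()) == 9
```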

Although Ohlsson argues we “already possess a first-approximation multi-mechanism theory of skill learning” (p. 201) that can provide answers to the four initial questions about skill learning mechanisms, he cautions that this does not mean it is the best possible theory. Much work will still be needed to develop and refine the theory. Theory evaluation will necessarily be complex, as learning may reflect the interaction of multiple mechanisms. One approach to theory evaluation involves running simulations that include one or more of the proposed learning mechanisms to see whether adding mechanisms improves the fit with data. To date no simulations have involved all nine mechanisms.

In his second chapter on skill acquisition, Ohlsson presents a novel theory of learning from error, one of the key learning mechanisms in the Nine-Modes theory that he feels is not yet well understood. His specialization theory addresses questions of how error is detected in the first place (he argues there is a dissociation between knowledge and action, that declarative knowledge of the task situation is represented as constraints, and that error signals are detected as “constraint violations”) and how one changes one’s internal representations in response to error (he argues that one’s errors typically result from an overly general representation of rules, and that errors are corrected by adding more specialized “conditions of application” to the rule). True to form, he articulates his full specialization theory in rigorous detail, being highly specific about how rules are represented and modified in this process (errors are unlearned by incorporating violated constraints into applicability conditions; when constraints are violated, one adds two further specialization conditions while maintaining a more general version of the rule in memory, thus storing multi-level trees or “root genealogies” in memory).
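A schematic sketch of the specialization move may make this concrete. The encoding below is my own illustrative assumption (and it adds a single specialized condition, whereas Ohlsson’s formal theory generates two specialized versions per violated constraint); what it preserves is the core idea that the general rule is retained as the root of a multi-level tree.

```python
# Illustrative constraint-based rule specialization (assumed encoding).
class SkillRule:
    def __init__(self, action, conditions, parent=None):
        self.action = action
        self.conditions = set(conditions)  # conditions of application
        self.parent = parent               # more general ancestor rule
        self.children = []                 # more specialized versions

    def specialize(self, condition_from_violated_constraint):
        """Unlearn an error by folding the violated constraint into a
        narrower condition of application, while keeping this more
        general rule in memory (a 'root genealogy')."""
        child = SkillRule(
            self.action,
            self.conditions | {condition_from_violated_constraint},
            parent=self,
        )
        self.children.append(child)
        return child

# Overly general rule: pour the water whenever the goal is making tea.
general = SkillRule("pour_water", {"goal=make_tea"})
# An error, detected as a violation of the constraint "the water must
# be boiling", yields a specialized rule with a narrower condition.
special = general.specialize("kettle=boiling")
print(special.conditions)         # {'goal=make_tea', 'kettle=boiling'}
print(special.parent is general)  # True: the genealogy is preserved
```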

In the final chapter on scaling across time, complexity, and collectives, Ohlsson reports findings from computer simulation studies showing that models that include two learning mechanisms (generalizing from positive outcomes, and learning from error as articulated in his specialization theory) generate a better fit with learning curve data than models that include only one, consistent with his assumptions that multiple mechanisms are important in learning and some phenomena reflect the interaction of mechanisms. He also shows how his root genealogies representation helps explain how experts maintain flexibility in the face of increasing sophistication and specialization. Finally, he ranges widely over different and fascinating literatures on errors in the decision making of collectives (hospitals, airlines, businesses, nations), considers how under-specified decision rules may have contributed to faulty decision making in various disasters, and how collectives might profitably improve their decision making rules, and capacity for maintaining safety, through increasing rule specialization.

Overall, I found the chapters on skill learning brimming with interesting insights. Although Ohlsson humbly claims the Nine-Modes theory is not his theory but the collective accomplishment of the field, pulling it together and systematizing it as he did in these chapters is a major intellectual accomplishment. Another valuable contribution was his specialization theory of learning from error, including his proposals for how declarative knowledge functions more pragmatically as “constraints” that guide skill learning and his specific proposal about how one revises overly general rules in light of constraint violations. I liked how this proposed learning mechanism provided clear guidance for learning from error—an interesting contrast with his mechanism for benefiting from “impasses” in his theory of creativity, in which the “direction” that emerged was massively contingent on the initial structure of the network. Further, I could see how his specialization theory might have interesting practical applications, including informing ideas about how “collectives” might implement systems that would allow them to learn better from error and hence improve safety.

One interesting set of issues not explicitly discussed by Ohlsson was how his Nine-Modes theory relates to Ericsson and Charness’s (1994) seminal work on the role of “deliberate” practice in developing expertise. Ericsson and Charness distinguish deliberate practice from other forms of practice and argue the former has a distinctive role in developing world-class experts. Important questions raised are: How is deliberate practice different from other forms of practice? How might it affect different modes of learning? Are there some modes of learning for which it is most essential? Understanding which improvements from practice are automatic and which aspects call for more effort and metacognitive reflection would have important practical applications for improving teaching and learning.

5 Conversion (Chapters 9, 10)

The last form of non-monotonic cognitive change considered is conversion—cases where someone needs to abandon or revise prior beliefs, not just add new ones. Unlike the preceding two sections, this section includes only two chapters, not three, omitting the final chapter on scale-up. Nonetheless, the scope and sweep of this section are still very impressive.

Again Ohlsson begins by outlining the central questions that he thinks a theory of conversion needs to address, and by critically reviewing prior theories to highlight both their good ideas, and how they have failed to adequately address all critical questions. In the case of conversion, he argues there are just two central questions: “First, how, by what processes, is change resisted? […] Second, how, by what processes, and when, under which conditions, is resistance overcome and beliefs revised?” (p. 297).

Through an integrative multi-disciplinary review spanning work in philosophy, psycholinguistics, psychology, and social science, he argues that the field already has an impressive first-order theory of resistance to change. As always, Ohlsson does a first-rate job articulating the outlines of such a theory in terms of three basic principles: (a) that new data are always assimilated in light of previous knowledge; (b) that belief systems have a center-periphery structure such that a few general core beliefs subsume (rather than logically imply) many specific peripheral beliefs; and (c) that belief/data conflicts are resolved by a handful of dissonance reduction mechanisms that add new beliefs rather than change the truth value of existing beliefs. In so doing, he reminds science educators of the very relevant work on resistance to change that was done over a half century ago in social psychology and the extensive data that support such a theory.

However, this theory of resistance to change poses serious problems for any account of conversion that depends upon theory-data conflict as the triggering mechanism for change—a view Ohlsson maintains is widespread. For example, he argues it lies at the heart not only of Popper’s falsification theories, but also of Kuhn’s more pragmatic discussion of how the accumulation of anomalies in the course of normal puzzle solving can trigger periods of revolutionary science, and of Strike and Posner’s accounts of how conceptual change in science education starts by presenting data to students that arouse some initial dissatisfaction with their beliefs. The problem is that such theories of conversion fail to explain why such theory-data conflicts don’t simply produce resistance to change instead.

As a novel way around this impasse, Ohlsson proposes his resubsumption theory of conversion, in which the triggering conditions for conversion may be the detection of theory–theory conflicts rather than theory-data conflicts. Such conflicts only arise when several prior conditions are met: (a) one possesses two distinct theories, each of which is already believed to be true when applied to its own domain; (b) one then discovers that the two theories can be applied to the same domain; and (c) one has background beliefs that render the two theories incompatible (i.e., they can’t both be true) when applied to that domain. When these three conditions are met, a process of competitive evaluation of the two theories is triggered: an automatic and protracted process in which the evaluated utility of each theory, in combination with other factors, affects one’s confidence in that theory, which in turn ultimately affects its truth value.
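The trigger and the evaluation cascade can be sketched as follows; the numeric scales, update rule, and names are my own assumptions, since Ohlsson specifies the principles rather than an algorithm.

```python
# Illustrative sketch of resubsumption: triggering conditions plus the
# utility -> confidence -> truth-value cascade (assumed update rules).
from dataclasses import dataclass

@dataclass
class Theory:
    name: str
    domains: set              # domains the theory is applied to
    utility: float            # how well it serves its cognitive work
    confidence: float = 0.5   # subjective confidence, driven by utility
    believed_true: bool = True

def conflict_triggered(resident, contender, domain, incompatible):
    # (a) two theories, each already believed true; (b) both now applied
    # to the same domain; (c) background beliefs render them incompatible.
    return (resident.believed_true and contender.believed_true
            and domain in resident.domains and domain in contender.domains
            and incompatible)

def competitive_evaluation(resident, contender, rounds=20, rate=0.1):
    """Protracted, automatic evaluation: utility drives confidence, and
    confidence ultimately settles each theory's truth value."""
    for _ in range(rounds):
        for t in (resident, contender):
            t.confidence += rate * (t.utility - t.confidence)
    loser = min((resident, contender), key=lambda t: t.confidence)
    loser.believed_true = False
    return contender if loser is resident else resident

resident = Theory("resident", {"domain_X"}, utility=0.4)
contender = Theory("contender", {"domain_Y", "domain_X"}, utility=0.8)
if conflict_triggered(resident, contender, "domain_X", incompatible=True):
    print(competitive_evaluation(resident, contender).name)  # contender
```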

Note that Ohlsson’s provocative theory turns several things on their head: it makes having an alternative theory that one already believes to be true a precondition rather than a result of conversion. It also proposes that the way one develops the content of both theories in the first place is through normal monotonic learning processes. Ohlsson goes through an interesting and elaborate explanation of how one can possess two incompatible theories without knowing it, via his principle that belief systems are only locally rather than globally coherent. He provides several brief examples of cases where this mechanism may apply and discusses why an idea originally formed to explain phenomena in one domain (the contender theory) might provide a better account of phenomena in another domain than the account one had initially developed for that domain (the resident theory). This is because initial accounts often depend upon surface features that don’t get at the deep structure of a domain.

My initial reaction to Ohlsson’s criticisms of the prior literature was that they were not quite fair. Both the work of Kuhn (1962) and that of conceptual change theorists, such as Posner et al. (1982), were done in reaction to more “empiricist” accounts of learning, and both clearly realized that learning was always guided by one’s initial conceptual framework and ideas. Thus they never proposed that change could be accounted for simply in terms of theory-data conflicts; they always realized that there had to be an alternative to convert to and that, once there was, there needed to be a process of comparative evaluation of two or more theories (one’s initial view and the view that one was going to convert to) in which one considered the plausibility and fruitfulness of the two views. Further, as Posner et al. argued, to get the process of theory comparison going, one needed to find the alternative minimally intelligible—something that frequently was overlooked in science education, where the presented ideas of scientists were often not understandable to students. In the case of theory change in the history of science, Kuhn also recognized that resistance to change among individual theorists often trumped conversion; that is, proponents of the older view often “died out” rather than converted.

In these respects, Posner and colleagues did propose a beginning account of the “conditions” that should be met for conversion to occur—including that the alternative needed to be “intelligible,” “plausible,” and “fruitful.” Admittedly, these ideas would need further unpacking in order to be formulated precisely enough to be implemented in a computer simulation; they may also be incomplete (see Limon 2001). Nonetheless, the distinctions among these conditions are understandable to many teachers and students, and have been shown to have heuristic value in guiding practical work in the classroom. For example, Sister Gertrude’s elementary school students were taught to use this vocabulary in the process of monitoring the status of their own ideas, which helped her as a teacher decide what were the appropriate next steps in learning (Hennessey and Beeth 1993; Smith et al. 2000). The fact that studies have shown that cognitive conflict alone is not effective in promoting restructuring is entirely consistent with Posner et al.’s multi-condition theory. In a recent review, Clement (2008b) argues that curricula that co-ordinate multiple strategies (discrepant events, analogies, model building) can be highly effective.

In addition, I found it unrealistic and simplistic to assume (as Ohlsson claims in his resubsumption theory) that the two competing theories (contender and resident) would already exist “preformed” in a knowledge network, ready for competitive evaluation, or that they would initially be learned entirely independently by normal monotonic learning processes, with the only problem being to arrange conditions where the learner thinks of both ideas simultaneously. The picture that emerges from detailed studies of conceptual change in science and the science classroom is more complex. It would seem more likely that what pre-existed would be a range of ideas (from different domains) that would have to be slowly reassembled in novel ways in forming a new model for the domain using multiple analogies. [See: Millman and Smith (1997) on Darwin’s use of multiple analogies and dis-analogies in fashioning his concept of natural selection; Nersessian (2008) for a description of Maxwell’s construction of the field equations for electromagnetic phenomena; Clement (2008b) for work on using anchoring and bridging analogies in the construction of explanatory models in science education; Minstrell and Kraus (2005) for descriptions of multiple strategies used in developing middle schoolers’ understanding of gravity, including discrepant events, argument, analogies, and dis-analogies. See also Carey (2009) for a detailed proposal about how building these new representations may involve Quinean bootstrapping processes, a different and more complex learning mechanism than the one considered by Ohlsson.] In my experience, what teachers need help with is knowing how to orchestrate and sequence this creative model building, so that students’ understanding builds productively and in sustained fashion over long spans of time—an issue left totally unexplored in Ohlsson’s work, but one that is centrally investigated in current learning progressions work in science (e.g., Mosher 2011; National Research Council 2007; Smith et al. 2006).

I also thought he was too dismissive of the role of theory-data conflicts in theory change, arguing that they call for greater meta-cognitive sophistication than would be expected of a lay person. Work by many science educators (e.g., Clement 2008b; Minstrell and Kraus 2005) shows that theory-data conflicts can be effective in combination with other strategies. Further, much current work in science education (including current learning progressions work) argues for the coordinated development of science content and practice: part and parcel of developing more sophisticated science concepts may be developing an understanding of what is distinctive about scientific argumentation, including the “authority” of “Nature” and “evidence” (Ford 2008). Therefore, rather than assuming the evaluation functions are themselves static, it would seem more appropriate to see them as dynamic and discipline-specific. Moreover, changes in these evaluation functions would change what ideas were deemed learnable.

Finally, Ohlsson’s resubsumption theory of conversion does not distinguish belief revision from conceptual change, nor does he consider the role of social dynamics in issues of conversion—all issues of central importance in effective science education. Readers hoping to learn more about Ohlsson’s take on these issues will be disappointed to find them ignored. Indeed, as Ohlsson concludes, his theory of conversion is not really a theory of concept, belief, or theory revision at all, but an account of how a pre-existing belief comes to have a wider domain of application.

Although Ohlsson’s resubsumption theory did not satisfactorily answer questions that are important to me as a learning progressions researcher concerned with conceptual change, he does highlight many important ideas.

First, I think Ohlsson is right in stressing the role of theory–theory conflicts in conversion, although as previously mentioned I think consideration of theory-data conflicts is also part of the process of resolving theory–theory conflicts, and the methods of handling and resolving these conflicts change over time as students gain greater epistemological understanding.

Second, I think the ideas that belief systems have a center-periphery structure and that coherence is maintained locally rather than globally are important; they help shed light on how fundamental revision of students’ knowledge networks is possible and how it can best be brought about. For example, many learning progression researchers seek to identify central (but unstated) assumptions that need to be challenged and to identify the portion of the network to work on first (because it is more accessible to revision) in order to promote change in other parts of the network (Wiser et al. in press).

Third, his idea that beliefs have multiple parameters (as well as propositional content), including their utility, confidence, affective value, and truth value, is also of interest, along with his specific ideas about which factors are considered and how they are related in competitive evaluation: first utility, which affects confidence, which ultimately affects truth value. Exactly which parameters and variables are important to students in their comparative evaluation of theories could be a productive area for further research.

Fourth, his idea that beliefs have components and are part of nested structures is also exceedingly important. In this respect, I only wish he had gone further and considered how concepts are distinct from beliefs and related to these nested structures, as well as how the content of concepts is determined. Of course such accounts won’t be simple. At one level, concepts are the constituents of beliefs. At the same time, concepts are broader than beliefs, as they organize vast collections of beliefs, although not all beliefs involving a concept are central to its meaning. In addition, as Carey (2009, p. 5) argues, there may be multiple mechanisms for determining the content of a concept, including “causal mechanisms that connect a mental representation to the entities in the world in its extension […] and computational processes internal to the mind that determine how the representation functions in thought.” One current problem for cognitive theory is that we don’t yet have a unified theory of concepts and their structure that explains how concepts are individuated and represented in ways that allow them to discharge their multiple functions (Margolis and Laurence 1999).

6 Conclusions (Chapters 11, 12)

Ohlsson’s masterful Deep Learning should help put non-monotonic learning on the radar screen of cognitive psychologists as a central topic for further investigation and theory building. Not only does he provide detailed theories of three non-monotonic forms of cognitive change, he also invites the reader to consider fundamental similarities across three types of change normally not recognized as similar. He argues these similarities lie not so much in their specific learning mechanisms as in the general characteristics of the cognitive architecture that they require, and concludes by positing ten general properties of a cognitive system that (collectively) may be necessary and sufficient for deep learning to occur. As my previous comments suggest, I think we may be much closer to an adequate theory of skill learning at this point than to one of creativity or conversion, and Ohlsson doesn’t really propose a theory of conceptual change. (Readers interested in a theory of conceptual change should read Carey’s new book The Origin of Concepts instead.) But that doesn’t detract from his ambition, vision, or accomplishment. Deep Learning is a book organized around one central hypothesis, but it is not a one-note book. The wealth of detail, the specific examples, and the precision in formulating his many ideas make this book an incredibly rich read.