1 Introduction

Research on the development of “metacognition” was initiated in the early 1970s by Ann Brown, John Flavell and their colleagues (for reviews, see Brown et al. 1983; Flavell et al. 2002; Goswami 2008; Schneider and Pressley 1997). Although various definitions of the term “metacognition” have been used in the literature on cognitive development, the concept has usually been broadly and rather loosely defined as any knowledge or cognitive activity that takes as its object, or regulates, any aspect of any cognitive enterprise (cf. Flavell et al. 2002). Obviously, this conceptualization refers to people’s knowledge of their own information processing skills, as well as knowledge about the nature of cognitive tasks, and about strategies for coping with such tasks. Moreover, it also includes executive skills related to monitoring and self-regulation of one’s own cognitive activities. In a seminal paper, Flavell (1979) described three major facets of metacognition, namely metacognitive knowledge, metacognitive experiences, and metacognitive skills, that is, strategies controlling cognition. According to Flavell et al. (2002), declarative metacognitive knowledge refers to the segment of world knowledge that concerns the human mind and its doings. For instance, metacognitive knowledge about memory includes explicit, conscious, and factual knowledge about the importance of person, task, and strategy variables for memorizing and recalling information. A person is said to possess “conditional” metacognitive knowledge whenever he or she is able to justify or explain the impact of person, task, and strategy variables on memory performance (see Paris and Oka 1986). Metacognitive experiences refer to a person’s awareness and feelings elicited in a problem-solving situation (e.g., feelings of knowing), and metacognitive skills are believed to play a role in many types of cognitive activity such as oral communication of information, reading comprehension, attention, and memory. 
These facets of metacognition refer to a person’s procedural knowledge, which Brown and colleagues (1983) referred to as “knowing how”, and which can be further subdivided into monitoring and self-regulatory functions (see below). For an excellent discussion of more subtle distinctions among various aspects of metacognition, see Kuhn (1999, 2000).

This theoretical framework of metacognition was subsequently extended by Pressley, Borkowski, and their colleagues (e.g., Pressley et al. 1989), who proposed an elaborate model of metacognition, the Good Information Processing Model, that not only considered aspects of procedural and declarative metacognitive knowledge but also linked these concepts to other features of successful information processing. According to this model, sophisticated metacognition is closely related to the learner’s strategy use, domain knowledge, motivational orientation, general knowledge about the world, and automated use of efficient learning procedures. All of these components are assumed to interact. For instance, specific strategy knowledge influences the adequate application of metacognitive strategies, which in turn affects knowledge. As the strategies are carried out, they are monitored and evaluated, which leads to expansion and refinement of specific strategy knowledge.

It should be noted that conceptualizations of metacognition developed in the fields of general psychology, social psychology, and the psychology of aging typically differ from this taxonomy. Popular conceptualizations of metacognition in the field of cognitive psychology exclusively elaborate on the procedural component, focusing on the interplay between monitoring and self-control (see Nelson 1996). On the other hand, when issues of declarative metacognitive knowledge are analyzed in the fields of social psychology and gerontology, the focus is on a person’s beliefs about cognitive phenomena and not on veridical knowledge.

More recent conceptualizations of metacognition added components such as self-regulation skills (e.g., Efklides 2001; Schunk and Zimmerman 1998). While the concept of metacognition was first developed in the context of developmental research, it is now widely used in different areas of psychology, including motivation research, clinical psychology, and educational psychology. Recent developments also include cognitive neuroscience models of metacognition (cf. Shimamura 2000). The popularity of the concept is mainly due to the fact that metacognition is crucial for concepts of everyday reasoning and those assessing scientific thinking as well as social interactions.

2 Classic Research on the Development of Metamemory

2.1 Declarative Metamemory

From the very beginning, research on the development of metacognitive knowledge has focused on the domain of memory. Flavell and Wellman (1977) coined the term “metamemory” to refer to children’s knowledge about what memory is, how it works, and which factors influence its functioning. Using sensitive methods that minimize demands on the child, it has been possible to demonstrate some rudimentary knowledge about memory functioning in preschoolers. Knowledge of facts about memory develops impressively during the course of elementary school and beyond, reaching its peak in late adolescence and young adulthood (cf. Schneider and Lockl 2002). It seems important to note that even though metacognitive knowledge increases substantially between young childhood and young adulthood, there is also evidence that many adolescents (including college students) demonstrate little knowledge of powerful and important memory strategies when the task is to read, comprehend, and memorize complex text materials (cf. Brown et al. 1983; Garner 1987; Pressley and Afflerbach 1995). Also, knowledge about possible interactions among memory variables (e.g., task demands and strategies) seems to develop rather late and continues to improve after adolescence (Schneider and Pressley 1997).

Taken together, the empirical evidence demonstrates that some declarative knowledge is already available in preschoolers and kindergarten children, and that this component of metamemory develops steadily over the elementary school years and beyond. Nonetheless, declarative metamemory is not complete by the end of childhood.

2.2 Procedural Metamemory

Several early metamemory studies explored how children use their metacognitive knowledge to monitor and self-regulate their mnemonic activities. While self-monitoring involves knowing where you are with regard to your goal of understanding and memorizing task materials, self-regulation includes planning, directing, and evaluating one’s mnemonic activities (cf. Flavell et al. 2002). Early research focusing on monitoring showed that even young children seem to possess the relevant skills, particularly when the memory tasks were not very difficult (see the review by Schneider and Pressley 1997). However, the evidence regarding developmental trends was not consistent, with some studies showing better performance in younger than in older children, and others illustrating age-correlated improvement.

2.3 Metamemory-Memory Relations

From a developmental and educational perspective, the metamemory concept seems well-suited to explain children’s “production deficiencies” on a broad variety of memory tasks. Early empirical research on metamemory was stimulated by the belief that young children do not spontaneously use memory strategies because they are not familiar with memory tasks and are unable to judge the advantages of memory strategies such as rehearsal or categorization. Metamemory researchers assumed that this situation should change after children enter school and are confronted with numerous memory tasks. Experience with such tasks should improve strategy knowledge, which in turn should exert a positive influence on memory behavior (e.g., strategy use). Thus, a major motivation behind studying metamemory and its development was the assumption that although links between metamemory and memory may be weak in early childhood, they should become much stronger with increasing age.

Overall, the empirical findings do not indicate a very strong relationship, even though the numbers show reliable associations. For instance, a statistical meta-analysis of 60 studies (with more than 7,000 participants) produced an average correlation of 0.41 (Schneider and Pressley 1997, p. 220). The size of the correlation seems to depend on factors such as type of task, age of children, task difficulty, and timing of metamemory assessment (before or after the memory task). The causal relation between metamemory and memory is also complex in that metamemory sometimes has an indirect effect on recall, as when knowledge about categorization strategies leads to semantic grouping during the study period, which in turn produces better recall. Moreover, the influence seems to be bidirectional (cf. Flavell et al. 2002; Hasselhorn 1990). That is, metamemory can influence memory behavior, which in turn leads to enhanced metamemory.
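To illustrate how an average correlation across studies can be obtained, the following minimal sketch pools correlations using Fisher’s z transformation with sample-size weights. This is one common approach and not necessarily the procedure used in the meta-analysis cited above; the function name `mean_correlation` and the (r, n) values are hypothetical and were invented for illustration only.

```python
import math


def mean_correlation(studies):
    """Sample-size-weighted mean correlation via Fisher's z transformation.

    `studies` is a list of (r, n) pairs: a correlation coefficient and the
    sample size of the study that reported it.
    """
    weighted_z = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z transformation of r
        weight = n - 3      # inverse of the sampling variance of z
        weighted_z += weight * z
        total_weight += weight
    # Back-transform the weighted mean z to the correlation metric.
    return math.tanh(weighted_z / total_weight)


# Hypothetical (r, n) pairs; these are NOT data from the cited studies.
example_studies = [(0.25, 80), (0.45, 120), (0.55, 60)]
print(round(mean_correlation(example_studies), 2))
```

Averaging in the z metric rather than averaging the raw correlations reduces the bias that arises because the sampling distribution of r is skewed, and weighting by n - 3 gives larger studies more influence on the pooled estimate.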

3 Development of Metacognitive Knowledge and “Theory of Mind”

Given that this chapter focuses on the development of metacognitive knowledge, it seems important to elaborate on the differences between the classic older metamemory research paradigm and more recent theory-of-mind research (see also Flavell 2000; Kuhn 1999, 2000). While metacognitive development has been studied more in terms of the important mechanisms operating within individual minds, exploring children’s awareness of their own cognition, theory-of-mind (ToM) research is concerned with what children know about somebody else’s mind (cf. Goswami 2008; Kuhn 1999; Schneider and Lockl 2002). Another distinction between the two research paradigms concerns the age groups under study. Because ToM researchers are mainly interested in the origins of knowledge about mental states, they predominantly study infants and young children. On the other hand, metacognitive researchers investigate knowledge components and skills that already require some understanding of mental states, and thus mainly test older children and adolescents. Despite this difference in focus, these two research paradigms are connected in important ways. One of the first to detect this relationship was Wellman (1985), who suggested that metacognition consists of a “large, multi-faceted theory of mind” (p. 29).

An influential recent research paradigm has aimed at understanding metacognitive processes in their developmental dimension, trying to link young children’s “theory of mind” with their subsequent metacognitive developments. The most important aspects of this work will be summarized next.

3.1 Assessment of Children’s “Theory-of-Mind”

From the early 1980s on, numerous studies have focused on young children’s knowledge about the mental world, dealing with very young children’s understanding of mental life and age-related changes in this understanding, for instance, their knowledge that mental representations of events need not correspond to reality (cf. Perner 1991; Wellman 1985). One of the major and consistent outcomes of these ToM studies has been that significant changes in children’s ability to take the perspective of other people occur between 3 and 4 years of age. Explanations of this rapid change in children’s ToM were linked to developmental changes in functions of the prefrontal cortex, in particular, inhibitory functions and those concerned with the regulation of behavior.

3.2 Links Between Theory of Mind and Metacognitive Knowledge

Several years ago, a longitudinal study was started in our lab with 174 children (who were about 3 years of age at the beginning) that investigated the relationship between early theory of mind and subsequent metamemory development, while simultaneously taking into account the possible mediating role of language development (for more details, see Lockl and Schneider 2006, 2007). Children were tested at four measurement points, separated by a testing interval of approximately half a year. While the main goal was to combine aspects of research on ToM and metamemory within a longitudinal framework, a second goal was to examine the role of language abilities in the emergence of theory-of-mind and metacognitive competencies.

There were several interesting findings. First of all, we demonstrated rapid improvements in both language competencies and children’s theory of mind over the age period under study, thus confirming previous longitudinal research on this issue (see Astington and Jenkins 1999; Schneider et al. 1999). Secondly, we were able to show that the stability of the theory of mind construct was only moderate at the beginning but increased subsequently, reaching levels of stability similar to those found for the language tests. This finding clearly points to a continuity in ToM development.

Furthermore, several outcomes addressing the impact of language on ToM and metamemory development seem notable. Findings demonstrated a strong relationship between language and theory of mind, thus confirming results of previous studies (e.g., Ruffman et al. 2002). Moreover, significant relationships between language and metamemory could be shown. That is, language abilities assessed at the ages of 3 and 4 years made significant contributions to the prediction of metamemory scores at the age of 5. Finally, it was shown that theory of mind obviously facilitated the acquisition of metacognitive knowledge. While the amount of variance in metamemory scores at the age of 5 explained by ToM at the age of 3 was relatively small, this proportion increased considerably when ToM scores assessed at age 4 were used as predictors. Early ToM competencies also affected the acquisition of metacognitive vocabulary (e.g., knowledge about mental words such as guessing or knowing), which in turn had an impact on developmental changes in metacognitive knowledge. Obviously, advanced ToM development is characterized by a growing insight into inferential and interpretive mental processes (Sodian 2005). Overall, we demonstrated that children who acquired a theory of mind early also showed better metamemory performance assessed about 2 years later. These findings support the hypothesis that early ToM competencies can be considered as a precursor of subsequent metamemory.

4 New Evidence Concerning Metacognitive Development in Childhood and Adolescence

As already noted above, children’s declarative metamemory increases with age and is correlated with age-related improvements in memory behavior (see Schneider and Lockl 2002; Schneider and Pressley 1997, for reviews). We know from various interview studies that knowledge about memory-relevant person, task, and strategy variables develops significantly from the early elementary school period on and does not reach its peak before young adulthood (cf. Schneider and Pressley 1997). For instance, factual knowledge about the importance of task characteristics and memory strategies develops rapidly once children enter school. Knowledge about the usefulness of memory strategies was tapped in several studies that focused on organizational strategies (e.g., Justice 1985; Schneider 1986; Sodian et al. 1986). As a main result, these studies reported a major shift in strategy knowledge between kindergarten and Grade 6, a finding replicated in numerous recent studies (e.g., Schneider et al. in press).

Taken together, recent studies on declarative metacognitive knowledge more or less confirmed outcomes of previous research. In comparison, more recent investigations on procedural metacognitive knowledge and its development produced several new insights concerning developmental trends and will be discussed in some detail below.

4.1 The Development of Self-Monitoring and Self-Control

According to Nelson and Narens (1990, 1994), self-monitoring and self-regulation correspond to two different levels of metacognitive processing that interact very closely. Self-monitoring refers to keeping track of where you are with your goal of understanding and remembering (a bottom-up process). In comparison, self-regulation or control refers to central executive activities and includes planning, directing, and evaluating your behavior (a top-down process).

4.2 Monitoring Skills in Children

The most studied type of procedural metamemory is that of self-monitoring, evaluating how well one is progressing (cf. Borkowski et al. 1988; Brown et al. 1983; Schneider and Lockl 2002). The developmental literature has focused on monitoring components such as ease-of-learning (EOL) judgments, judgments of learning (JOL), and feeling-of-knowing (FOK) judgments. What are the major developmental trends? In short, the findings suggest that even young children possess monitoring skills, and that developmental trends are not entirely clear, varying as a function of the paradigm under study. While young kindergarten children tend to overestimate their performance when EOL judgments are considered, EOL judgments can already be accurate in young elementary school children. In most of the relevant studies, subtle improvements over the elementary school years were found (cf. Schneider and Lockl 2002, in press).

Given that only a few developmental studies focused on judgments of learning (JOLs) occurring during or soon after the acquisition of memory materials, the situation is not yet clear. Overall, findings support the assumption that children’s ability to judge their own memory performance after study of test materials seems to increase over the elementary school years. However, even young children are able to monitor their performance quite accurately when judgments are not given immediately after study but are somewhat delayed.

A number of studies have explored children’s feeling-of-knowing (FOK) judgments and their accuracy (e.g., Cultice et al. 1983; DeLoache and Brown 1984). FOK judgments occur either during or after a learning procedure and are judgments about whether a currently unrecallable item will be remembered at a subsequent retention test. Typically, children are shown a series of items and asked to name them. When children cannot recall the name of an object given its picture, they are asked to indicate whether the name could be recognized if the experimenter provided it. These FOK ratings are then related to subsequent performance on the recognition test. Overall, most of the available evidence on FOK judgments suggests that FOK accuracy improves continuously across childhood and adolescence (e.g., Wellman 1977; Zabrucky and Ratner 1986).
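How FOK ratings can be related to recognition performance may be sketched as follows. The text above does not name a specific statistic; the sketch assumes the Goodman-Kruskal gamma correlation, a measure widely used for this purpose in metamemory research. The function name `goodman_kruskal_gamma` and the example ratings are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(fok_ratings, recognized):
    """Goodman-Kruskal gamma between FOK ratings and recognition outcomes.

    `fok_ratings` holds one judgment per unrecalled item (e.g., 1 = "would
    recognize it", 0 = "would not"); `recognized` holds the matching
    recognition-test outcome (1 = recognized, 0 = not recognized).
    Returns None when every item pair is tied, because gamma is then undefined.
    """
    concordant = 0
    discordant = 0
    for (f1, r1), (f2, r2) in combinations(zip(fok_ratings, recognized), 2):
        product = (f1 - f2) * (r1 - r2)
        if product > 0:
            concordant += 1   # the higher-rated item was also the recognized one
        elif product < 0:
            discordant += 1   # the higher-rated item was the one not recognized
        # pairs tied on either variable are ignored
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data for one child: five unrecalled items.
fok = [1, 1, 0, 0, 1]
hit = [1, 1, 0, 1, 0]
print(goodman_kruskal_gamma(fok, hit))
```

Gamma ranges from -1 to +1, with values near +1 indicating that items given higher FOK ratings were also the ones more likely to be recognized. Because it ignores tied pairs, it can be computed for yes/no judgments as well as for graded rating scales.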

A more recent study by Lockl and Schneider (2002) using the same experimental paradigm was in accord with the classic findings described above. One of the aims of this study was to explore the basis of FOK judgments by comparing the traditional “trace-based” view with the “trace accessibility model” developed by Koriat (1993). While the former assumes a two-stage process of monitoring and retrieval, the latter proposes that FOK judgments are based on retrieval attempts and determined by the amount of information that can be spontaneously generated, regardless of its correctness. One important prediction derived from this model is that FOK judgments for correctly recalled and incorrect answers (commission errors) should be comparably high, and also higher than FOK judgments for omission errors. This is what Lockl and Schneider (2002) actually found: While the magnitude of FOK judgments given after commission errors did not differ much from that of judgments provided after correct recall and was significantly higher than that given after omission errors, recognition performance was comparable in the case of commission and omission errors. Thus the assumption that feeling of knowing can be dissociated from knowing was empirically confirmed.

Taken together, recent research assessing monitoring abilities in JOL or FOK tasks demonstrates rather small developmental progression in children’s monitoring skills (see also Roebers et al. 2007).

4.3 The Relation Between Monitoring and Control Processes in Children

An important reason to study metacognitive monitoring processes is that monitoring is supposed to play a central role in directing how people study. Numerous studies with adult participants showed that individuals use memory monitoring, especially judgments of learning (JOLs), to decide which items to study and how long to spend on them (e.g., Metcalfe 2002; Nelson and Narens 1990). However, little is known about how children use monitoring to regulate their study time. A classic paradigm suited to exploring this issue is the allocation of study time. Research on study time allocation observes how learners deploy their attention and effort. As already noted by Brown et al. (1983), the ability to attend selectively to relevant aspects of a problem-solving task is a traditional index of learners’ understanding of the task. Developmental studies on the allocation of study time examined whether schoolchildren and adults were more likely to spend more time on less well-learned material (e.g., Masur et al. 1973; Dufresne and Kobasigawa 1989; Lockl and Schneider 2004). All of these studies reported an age-related improvement in the efficient allocation of study time. That is, older children (from age 10 on) spent more time studying hard items than they spent studying easy items, despite the fact that even many young children were able to distinguish between hard and easy pairs.

Thus, developmental differences were not so much observed in the metacognitive knowledge itself but in its efficient application to self-regulation strategies.

5 The Importance of Metacognition for Education

During the last three decades, several attempts have been made to apply metacognitive theory to educational settings (cf. Paris and Oka 1986; Moely et al. 1995; Palincsar 1986; Pressley 1995).

One interesting and effective approach to teaching knowledge about strategies was developed by Palincsar and Brown (1984). The “reciprocal teaching” procedure requires that teachers and students take turns executing the reading strategies that are being taught, with instruction occurring in true dialog. Strategic processes are made very overt, with plenty of exposure to modeling of strategies and opportunities to practice these techniques over the course of a number of lessons. The goal is that children discover the utility of reading strategies, and that teachers convey strategy-utility information as well as information about when and where to use particular strategies. Teachers using reciprocal instruction assume more responsibility for strategy implementation early in instruction, gradually transferring control over to the student (see Palincsar 1986, for an extensive description of the implementation of reciprocal instruction; see Rosenshine and Meister 1994, for a realistic appraisal of its benefits).

During the 1980s and 1990s, numerous studies explored the efficiency of strategy training approaches in school (for a review, see Schneider and Pressley 1997). The basic assumption was that although children in most cases do not efficiently monitor the effectiveness of strategies they are using, they can be trained to do so. For instance, in a training program carried out by Ghatala and colleagues (e.g., Ghatala et al. 1986) elementary school children were presented with paired-associate learning tasks. Before studying these lists, some children received a three-component training. They were taught (a) to assess their performance with different types of strategies, (b) to attribute differences in performance to use of different strategies, and (c) to use information gained from assessment and attribution to guide selection of the best strategy for a task. As a major result, it was shown that even children 7-8 years of age can be taught to monitor the relative efficacy of strategies that they are using and to use utility information gained from monitoring in making future strategy selections.

Another, larger-scale approach concerns the implementation of comprehensive evaluation programs that aim at assessing the systematic instruction of metacognitive knowledge in schools. As emphasized by Joyner and Kurtz-Costes (1997), both Moely and Pressley, with their colleagues, have conducted very ambitious programs of evaluating effective instruction in public school systems. For instance, Pressley and colleagues found that effective teachers regularly incorporated strategy instruction and metacognitive information about effective strategy selection and modification as a part of daily instruction. It seems important to note that strategy instruction was not carried out in isolation but integrated in the curriculum and taught as part of language arts, mathematics, science, and social studies. In accord with the assumption of the Good Information Processing Model outlined above (cf. Pressley et al. 1989), effective teachers did not emphasize the use of single strategies but taught the flexible use of a range of procedures that corresponded to subject matter, time constraints, and other task demands. On most occasions, strategy instruction occurred in groups, with the teachers modeling appropriate strategy use. By comparison, the work by Moely and colleagues (e.g., Moely et al. 1995) illustrated that the effective teaching process described by Pressley and coworkers does not necessarily constitute the rule, and that effective teachers may represent a minority group in elementary school classrooms. Taken together, the careful documentation of instructional procedures carried out by Pressley, Moely, and their research groups has shown that there is a lot of potential for metacognitively guided instructional processes in children’s everyday learning.

More recent research explores the utility of the metacognition concept in research with older children and adolescents, assessing the predictive potential of metacognitive knowledge and skillfulness in reading and math (e.g., Artelt et al. 2002; Veenman et al. 2005; see also the contributions in Desoete and Veenman 2006). Overall, these studies confirm the view that metacognitive knowledge and self-regulated, insightful use of learning strategies predict math performance and reading comprehension in secondary school settings even after differences in intellectual abilities have been taken into account. They also provide evidence that metacognitive knowledge relevant for school-related domains can still be effectively trained in late childhood and early adolescence.