Introduction

Since Flavell (1979) introduced the concept of “metacognition,” many studies have addressed the issue of the influence of metacognition on learning performance. Based on a metareview of studies, Wang et al. (1990) concluded that metacognition is the most important predictor of learning performance. The present study focuses on the development of metacognitive skills in relation to intellectual ability. Additionally, generality vs. domain specificity of metacognitive skills is investigated.

Metacognitive skills

A generally accepted distinction is the one between metacognitive knowledge and metacognitive skills. Metacognitive knowledge refers to the declarative knowledge one has about the interplay between personal characteristics, task characteristics, and available strategies in a learning situation (Flavell 1979). Having metacognitive knowledge at one’s disposal, however, appears to be no guarantee for using this knowledge whenever it is needed (Alexander et al. 1995; Winne 1996; Barnett 2000; Pressley et al. 1997).

Metacognitive skills concern the procedural knowledge that is required for the actual regulation of and control over one’s learning activities (Brown and DeLoache 1978; Veenman 2011). Task analysis, goal setting, planning, monitoring, checking, and recapitulation are manifestations of such skills. Metacognitive skills can be inferred from overt behavior or utterances by the student, that is, from concrete metacognitive activities (Veenman et al. 2006). These activities can be divided into behavior that occurs at the onset of task performance (orientation), during task performance (planning, monitoring, evaluation), and after task performance (reflection and elaboration). Examples of metacognitive activities are given in Table 1. Note that some of the behavior described in Table 1 may be considered cognitive, but the purposeful application of such cognitive behavior at the appropriate moment results from metacognitive skillfulness.

Table 1 Examples of domain-specific and general metacognitive activities

Several researchers (Bowen et al. 1992; Brown 1980; Christoph 2006; Markman 1977, 1979; Mevarech and Fridkin 2006; Pressley and Afflerbach 1995; Schoenfeld 1992; Shore and Lazar 1996) have investigated the use of metacognitive skills while performing different tasks (e.g., reading comprehension or problem solving) often focusing on a separate component of metacognitive skills (e.g., planning or monitoring activities). The present study, however, includes a broad range of metacognitive skills referring to orientation, planning, monitoring, as well as reflection skills in both problem-solving and text-studying tasks.

While assessing metacognitive skills over a developmental trajectory, two different perspectives can be taken: the quantity and the quality of metacognitive skills. The quantity concerns the frequency of these skills being applied, whereas the quality concerns the depth or the extent to which they are applied. An example of the latter is that making a sketch of the problem in order to represent the problem is considered as a deeper orientation than just reading (a part of) the problem statement. Using metacognitive skills more frequently does not automatically imply that these metacognitive skills have a higher level of quality. Therefore, this study focuses on the development of both quantity and quality of metacognitive skillfulness.

Metacognitive skills from a developmental perspective

Roughly speaking, research into the development of metacognition has focused on two issues: (1) Where does metacognition come from? and (2) When does it first emerge and how does it develop from there? Although the first question is of great interest, it goes beyond the scope of this study and will therefore be discussed only briefly here. Some evidence has been found that “theory of mind” (ToM) can be considered a precursor of metacognitive knowledge (Lockl and Schneider 2006), while metacognitive knowledge can be considered a necessary precursor of one’s metacognitive skills (Annevirta and Vauras 2006). Alongside ToM, that is, the understanding of one’s own and other people’s state of mind (Wellman 1990), young preschoolers already start to develop some metacognitive awareness (Blöte et al. 2004; Demetriou and Efklides 1990; Kuhn 1999). Larkin (2006) found a relation between ToM, metacognitive knowledge, and strategy use in two 5- to 6-year olds.

In the present study, the focus will be on the second issue, the development of metacognitive skills in particular. According to Piaget (Inhelder and Piaget 1958), children younger than 7 years are not able to keep a record of their own problem solving. Piaget claimed that the egocentrism of young children prevents them from treating their own thinking as an object (Flavell 1992). He assumed that this egocentric perspective would change at the age of 11–12 years “when the child moves into thinking characterized by formal operations or hypothetico-deductive reasoning.” Evidence has been found, however, against Piaget’s assumption. In a longitudinal study between 2 and 20 years, Schneider and Pressley (1997) found evidence that memory strategy development begins before elementary school and continues into adulthood. Furthermore, young children tend not to apply strategies spontaneously in contexts where they would be useful. Improvements in this respect occur during the middle childhood years, but even by the end of childhood, performance is far from infallible (Brown et al. 1983; Schneider 1985). Whitebread et al. (2009) observed 3- to 5-year olds while interacting in playful problem-solving situations. The children revealed elementary forms of metacognitive skills. Markman (1979) investigated elementary school children’s awareness of their own text comprehension failure. Results showed that third through sixth graders do not spontaneously carry out the monitoring processes that they are capable of. Only modest improvements could be observed through the school years. Veenman et al. (2006) argue that it is most likely that metacognitive skills develop alongside metacognitive knowledge during preschool and early school years at a basic level and that these skills become more sophisticated and academically oriented when needed in formal educational settings.

In a cross-sectional study, Veenman et al. (2004) investigated the metacognitive skillfulness of students aged 9 to 22 years. When performing four inductive learning tasks in different domains, students’ metacognitive skillfulness was assessed with logfile analysis and thinking aloud protocols. A linear increase of metacognitive skillfulness with age was found. In the same vein, Veenman and Spaans (2005) assessed the metacognitive skillfulness of 13- and 15-year olds performing a problem-solving task in math and an inductive learning task in biology. It was found that the older students showed more metacognitive activities than the younger students did. A growth in both frequency and quality of metacognitive skills was found in two studies with participants aged 13 to 15 and 12 to 14 years, respectively (Van der Stel et al. 2010; Van der Stel and Veenman 2010). In line with these results, metacognitive skills are expected to increase in quantity (frequency) as well as in quality (depth) over the years (hypothesis 1).

From the studies cited above, it becomes clear that metacognition develops gradually, that is, it does not appear from one moment to the next. As Kuhn (2000, p. 178) formulated: “metacognition emerges early in life, …and follows an extended developmental course during which it becomes more explicit, more powerful, and hence more effective, as it comes to operate increasingly under the individual’s conscious control.”

Kuhn (1999) said that not a great deal is known about the development of metastrategic (memory) skills in school contexts. Therefore, the focus in this study is on the development of spontaneous use of metacognitive skills, without any training or intervention, during the performance of ecologically valid school tasks.

Relation between intellectual ability, metacognitive skills, and learning performance from a developmental perspective

A related research issue is whether the development of metacognitive skills is intelligence related or relatively intelligence independent. Several researchers (Alexander et al. 1995; Borkowski and Peck 1986; Schneider and Pressley 1997; Van der Stel and Veenman 2008; Van der Stel et al. 2010; Veenman et al. 2004; Veenman and Spaans 2005) investigated metacognitive ability in relation to intellectual ability.

An interesting question is whether metacognitive skills are part of intelligence, that is, “whether metacognition can be reduced to cognition” (Slife et al. 1985). Veenman (1993), Veenman and Beishuizen (2004), Veenman and Verheij (2003), and Veenman et al. (1997) described three mutually exclusive models concerning the relation between intellectual ability and metacognitive skills as predictors of learning performance. The intelligence model regards metacognitive skills as an integral part of intellectual ability. According to this model, metacognitive skills do not have a predictive value for learning performance independent of intellectual ability. Sternberg (1990), for instance, advocates such an inclusive position of “metacomponents” in his triarchic theory of intelligence. The second contrasting model is the independency model, in which intellectual ability and metacognitive skills are regarded as entirely independent predictors of learning performance. Finally, in the mixed model, intellectual ability and metacognitive skills are correlated to some extent, but metacognitive skills have a surplus value on top of intellectual ability for the prediction of learning performance.

Over the last decades, support has been found for each of these models (for an overview, see Veenman and Spaans 2005; Veenman et al. 2004). Many studies, however, are difficult to compare, due to dissimilarities in assessing metacognitive skills (thinking aloud, observation, questionnaires), in participants (age, educational background), and in tasks and domains. Moreover, the focus of some studies is restricted to the relation between intellectual ability and metacognitive skills, thereby excluding the relation of both predictors with learning performance. The evidence found so far seems to be in favor of the mixed model, albeit many of those studies concerned the metacognitive skills of older secondary school or university students in cross-sectional designs. From the perspective of the development of metacognitive skills, it remains to be ascertained more thoroughly in a longitudinal design whether the mixed model can be generalized to younger students performing different tasks in different domains. More specifically, the role of intellectual ability in the development of metacognitive skills will be addressed in this study. Alexander et al. (1995) formulated three developmental hypotheses regarding the relation between intelligence and the development of metacognition. According to the ceiling hypothesis, initial effects of intelligence on the development of metacognition diminish over time. The acceleration hypothesis, on the other hand, assumes that the impact of intelligence on the development of metacognition increases with age. The monotonic development hypothesis, finally, assumes that both intelligence and metacognition show a monotonic growth over age, independent of each other. In their literature overview, Alexander et al. (1995) found support for the monotonic development of metacognitive knowledge. Gifted children showed a general superiority in their declarative metacognitive knowledge, relative to non-gifted children at all ages. 
With regard to metacognitive skills, however, results were inconclusive. In a cross-sectional study, Veenman and Spaans (2005) obtained evidence in favor of a monotonic development of metacognitive skills. They obtained support for a monotonic maturation effect of both intellectual ability and metacognitive skills. In the present study, it is hypothesized that metacognitive skills develop alongside, but not fully dependent on intellectual ability, regardless of tasks and domains. A monotonic development of both metacognitive skillfulness and intellectual ability is expected (hypothesis 2).

Metacognitive skills across domains

Another objective of this study was to establish whether metacognitive skills are general or domain specific. From a developmental as well as from an instructional perspective, it is relevant to know not only whether metacognitive skills develop, but also how they develop: whether they develop from being general into becoming domain specific, or the other way around. Earlier studies yielded contradictory results. Despite differences in assessment, age groups, tasks, and domains, some researchers found evidence for general metacognitive skills (Schraw et al. 1995; Schraw and Nietfeld 1998; Veenman and Beishuizen 2004; Veenman and Verheij 2003; Veenman et al. 1997, 2004).

On the other hand, De Jong (1992) found that the quality and quantity of metacognitive activities of secondary school students varied substantially across tasks. Glaser et al. (1992) showed that metacognitive activities of university students varied across different discovery learning tasks, although improvement between subsequent tasks did not rule out the existence of general strategies. Kelemen et al. (2000) concluded that individual differences in metamemory accuracy were not stable across consecutive sessions and tasks, which they interpreted as evidence against a general metacognitive ability. Veenman et al. (2004) found support for the generality of metacognitive skills. This support could not be corroborated by a study of Veenman and Spaans (2005). In their study, metacognitive skills of the younger students appeared to be rather domain specific, whereas those of the older ones turned out to be general by nature. Thorpe and Satterly (1990) studied metacognition in children aged 7 to 11 years. Factor analysis failed to provide evidence for a general factor. Metacognition appeared to be task specific.

Some researchers (Schraw et al. 1995; Veenman and Spaans 2005) assumed that metacognitive skills might be initially acquired within separate tasks and domains and then progressively become a generalized repertoire across tasks and domains. Participants in the present study are rather young adolescents and inexperienced in applying metacognitive skills. Spear (2000) characterized adolescence as a transitional developmental period. “Adolescence is the gradual period from childhood to adulthood …adolescence is a period of transitions rather than a moment of attainment” (p. 417). Therefore, metacognitive skills of young adolescents are expected to be in a transitory phase of development, which implies that general and domain-specific metacognitive skills occur alongside one another. It is hypothesized that participants will initially use general as well as domain-specific metacognitive skills. It is also hypothesized that the initially acquired domain-specific metacognitive skills tend to generalize during further development. Therefore, students are expected to resort less to domain-specific metacognitive skills with increasing age (hypothesis 3).

Method

Participants

This 3-year longitudinal project started with 32 first year secondary school students. They were recruited from 85 students of three different tracks of an urban school in The Netherlands. This school is known for its large diversity of children, thus representing a broad educational level of the students, a broad range of socioeconomic status of parents, and various ethnic backgrounds. Only one school was included in order to avoid confounding variables, such as differences in teachers, pedagogical/didactic philosophy, schoolbooks, etc. Students with learning or conduct disorders were excluded from the study. Participants were selected on their diversity in intellectual ability (see “Intellectual ability” section). Consent was requested from and given by the participants’ parents. In the second year, four students withdrew as participants due to changing residence or school. In the third year, another three students withdrew for the same reason. So, 25 students participated in the third part of this study. The data in the present study refer to the 25 students who participated in all 3 years. After completing the tasks, participants received a small financial reward.

Intellectual ability

Each year, students’ intellectual ability was assessed by a series of ability tests. Three subtests from the Groninger Intelligentietest Voortgezet Onderwijs (GIVO), a standardized Dutch intelligence test (Van Dijk and Tellegen 1994), were selected: number series, verbal analogies, and unfolding figures. With these subtests, a number of the primary intelligence factors (Carroll 1993) are assessed: verbal and numerical inductive and deductive reasoning abilities and visuospatial ability. The GIVO, however, lacks a test for assessing memory, another primary factor in Carroll’s (1993) model highly relevant to the prediction of school performance (Crone et al. 2006). Therefore, a fourth test (names and professions, requiring the memorization of word pairs; see Veenman and Beishuizen 2004) was added. In order to determine the growth in intellectual ability, the raw scores of the aforementioned four subtests were compared. Furthermore, raw scores were transformed into z scores, and for each participant, a mean z score was calculated over the four subtests for each year. This resulted in a total score of the participant’s intellectual ability for each year. These scores were used in the correlational analyses.
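The composite score described above can be sketched as follows. This is a minimal illustration, not the authors’ procedure: the function names and the raw scores are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which standardization was used.

```python
from statistics import mean, stdev


def z_scores(raw):
    """Standardize one subtest; sample SD is an assumption, not stated in the text."""
    m, sd = mean(raw), stdev(raw)
    return [(x - m) / sd for x in raw]


def composite_ability(subtests):
    """Mean z score per participant over all subtests for one year.

    subtests maps a subtest name to a list of raw scores,
    one per participant, in the same participant order.
    """
    z_by_test = [z_scores(scores) for scores in subtests.values()]
    return [mean(zs) for zs in zip(*z_by_test)]


# Hypothetical raw scores for four participants on the four subtests.
year1 = {
    "number_series": [10, 12, 14, 16],
    "verbal_analogies": [20, 22, 24, 26],
    "unfolding_figures": [5, 7, 9, 11],
    "names_and_professions": [8, 10, 12, 14],
}
composite = composite_ability(year1)
```

Because each subtest is standardized before averaging, the composite scores of a given year average to zero by construction, which is why growth over the years is examined on the raw scores.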

To guarantee sufficient variance in intellectual ability, the median of the intellectual ability scores was calculated and participants were classified as high (first quartile), average (second and third quartile), or low (fourth quartile) in intelligence. Participants were equally distributed over the four quartiles.
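A quartile-based classification of this kind could be sketched as below. The cut-point method (Python’s default “exclusive” quantile method), the function name, and the scores are assumptions for illustration only; the text does not specify how the quartile boundaries were determined.

```python
from statistics import quantiles


def classify_ability(scores):
    """Label each score as high (top quartile), low (bottom quartile), or average."""
    lower_cut, _median, upper_cut = quantiles(scores, n=4)
    labels = []
    for s in scores:
        if s > upper_cut:
            labels.append("high")
        elif s < lower_cut:
            labels.append("low")
        else:
            labels.append("average")
    return labels


# Hypothetical composite z scores for eight participants.
ability = [-1.5, -1.0, -0.5, -0.2, 0.2, 0.5, 1.0, 1.5]
labels = classify_ability(ability)
```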

Tasks

To ensure the novelty of tasks, each year, participants were given new tasks with task demands adapted to their age. All tasks were piloted beforehand. Each year, a text-studying task (history) and a problem-solving task (math) were used. In order to allow for domain-specific metacognitive activities, the differences in tasks and domains were maximized.

History task

In an individual session of 50 min, participants were asked to study a history text in the same way as they usually do when preparing for a test. They were also asked to read and think aloud while performing the task. Participants were allowed to study the text for 30 min, and in the remaining 20 min, a posttest was administered (see “Learning performance” section). The history texts were composed of text parts from two of the most frequently used Dutch schoolbooks for history: MeMo (Van Boxtel and Schrover 1998) and Sprekend verleden (Buskop et al. 1998). To make sure that the texts were suitable for text studying and measuring learning performance afterwards, the content of the texts had to be new to the participants. Participants likely had little or no content knowledge about the topic of the text because topics were taken from the curriculum that was 1 year ahead. All learning tasks were piloted as well, and no familiarity with the topics was observed. In order to minimize a possible confounding effect of differences in learning texts, the format of the texts was made as comparable as possible.

Van Hout-Wolters (1986) described various text characteristics that affect learning processes and/or learning results in text studying: type of text, structure, difficulty of language used, length, and didactic help. All these variables were taken into account when composing the history learning tasks. All texts were informative and ecologically valid. Structure, layout, and length of the text were kept almost identical. In order to be suitable for text studying, texts need to be of a certain length. Texts that are too short will only be memorized instead of being studied. In each text, the same didactic help was embedded, that is, three activating questions and/or assignments were included in order to elicit (more) metacognitive activities (e.g., “There are several reasons why the north and the south were at war with each other. Describe in your own words at least two of these reasons” (first year); “The First World War started with the murder of Franz Ferdinand. Do you think that the First World War would have broken out without this murder? Explain your answer” (second year)). From a pilot study (Meijer et al. 2006), it appeared that if a text did not contain such activating questions and assignments, most participants just tended to read linearly. In the first year, a text about slavery and the civil war in the USA was used. In the second year, a text about the First World War and, in the third year, a text about politics and economics in the USA in the 1930s were presented.

Mathematics task

In another individual session of 50 min, participants practiced solving mathematical word problems for 20 min. Five problems were presented in the first year, six in the second year, and five in the third year. Several categories of problems were presented. For math, the content of assignments in the learning-by-doing phase was adapted to age and grade each year. However, the tasks were made as comparable as possible with respect to format. Each year, the tasks were ecologically valid because they were composed of adaptations of math problems from a frequently used schoolbook for math in The Netherlands (Getal en Ruimte; Vuijk et al. 2003). The tasks were piloted in the age group of participants. Although the content of assignments had to be new each year, items with a comparable content, that is, an ascending level of difficulty within the same area of math, were included in the tasks over the years. For example, a geometry assignment was included every year. In the first year, participants had to calculate the circumference of a meadow; in the second year, it concerned the surface area of a triangle; and in the last year, they had to apply Pythagoras’ theorem in order to calculate the horizontal side of a triangle.

Together with the assignments, participants received a sheet containing the answers and a brief stepwise explanation of problem solutions. Participants were free to consult this sheet whenever and as much as they liked. The first 20 min was considered as a learning-by-doing phase. Next, the participants handed in all materials and received another series of parallel problems, which had to be solved in the remaining 30 min without the option to consult an answer sheet. This second part is considered as a posttest assessment of learning performance (see “Learning performance” section). All problems had to be solved while thinking aloud.

Metacognitive skills

Participants had to think aloud during task performance. All transcribed thinking aloud protocols were analyzed on spontaneous use of metacognitive skills according to the procedure of Veenman (Prins et al. 2006; Van der Stel and Veenman 2008, 2010; Veenman and Beishuizen 2004; Veenman et al. 2000). Metacognitive skillfulness was divided into four subscales: orientation (O), planning (P), evaluation (Ev), and elaboration (El). In Table 1, general metacognitive activities across both tasks and domains as well as more specific metacognitive activities for text-studying and problem-solving tasks are described for each subscale of metacognitive skillfulness (cf. Meijer et al. 2006; Pressley 2000; Pressley and Afflerbach 1995).

The scoring method consisted of two steps for each protocol. First, an utterance was coded in the margin if belonging to one of the four subscales (O, P, Ev, or El). This resulted in a quantitative score obtained by counting the frequency of metacognitive activities on each subscale (e.g., if a student evaluated five times, the quantitative evaluation score was 5). Secondly, a score for the quality of metacognitive skills was judged from the protocols. To obtain a reliable score for the quality of metacognitive skills, scoring criteria were formulated for each subscale. This resulted in a method which allowed for assessing general as well as domain-specific metacognitive activities. A five-point scale (ranging from 0 to 4) was used for each subscale. For example, a participant received a higher score for “deeper” elaboration (e.g., drawing a conclusion in one’s own words) than for a superficial one (e.g., summarizing a paragraph almost literally). Each year, six protocols of each task were rated by two judges separately. The interrater reliability was computed on the summed scores over the four subscales of metacognition. Since the interrater reliability was high, the remaining protocols were analyzed and rated by one judge. Cronbach’s alpha interrater reliability ranged from .77 to .93 for the quantitative scores and from .89 to .97 for the qualitative scores.
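The two quantities in this scoring procedure, the frequency score per subscale and Cronbach’s alpha over two judges, can be illustrated with a short sketch. The function names, the coded utterances, and the judges’ summed scores are hypothetical; treating each judge as an “item” in the standard alpha formula is an assumption about how the reported interrater reliability was obtained.

```python
from collections import Counter
from statistics import variance

SUBSCALES = ("O", "P", "Ev", "El")  # orientation, planning, evaluation, elaboration


def frequency_scores(codes):
    """Count how often each subscale was coded in one protocol."""
    counts = Counter(codes)
    return {s: counts.get(s, 0) for s in SUBSCALES}


def cronbach_alpha(ratings):
    """Standard Cronbach's alpha; each inner list holds one judge's scores
    for the same protocols in the same order (judges treated as items)."""
    k = len(ratings)
    totals = [sum(scores) for scores in zip(*ratings)]
    item_variance = sum(variance(r) for r in ratings)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical margin codes for one protocol.
protocol_codes = ["O", "P", "Ev", "Ev", "P", "Ev"]
freq = frequency_scores(protocol_codes)

# Hypothetical summed scores of two judges on six protocols.
judge_a = [8, 11, 5, 14, 9, 12]
judge_b = [9, 10, 6, 13, 9, 13]
alpha = cronbach_alpha([judge_a, judge_b])
```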

Learning performance

Learning performance was assessed by a posttest. Each year, the posttest for history consisted of five questions to assess reproductive knowledge (facts and dates, e.g., “What was the name of the Austrian-Hungarian Crown Prince?”) and six questions to assess overall text comprehension (e.g., “Describe why things went wrong in agriculture and explain what Roosevelt did to restore agriculture economically”). Participants were not allowed to consult the text or their notes while answering the questions. According to a rating system, points were given for the correctness of answers to each question. Correct answers could render one to four points. A total score was calculated and used as a measure of learning performance in history (Cronbach’s alpha was .58 for the first year; .51 for the second year; and .80 for the third year).

For math, learning performance was assessed by another posttest. In each posttest, the items were parallel to the items in the learning phase, that is, the surface structure of the posttest items differed from that of the learning task items, but the deep structure was the same. In the first year, two points per item could be earned if both the procedure and the answer were correct; one point if either one of them was correct; and zero points if neither of them was correct (Cronbach’s alpha = .58). Due to an increasing complexity of problems, another scoring system was chosen in the second and third years: For each of the first five math problems in the second year, 10 points could be earned. Problem 6 consisted of three subproblems that were independent of each other and, therefore, was valued with a maximum of 30 points. So, the maximum obtainable score in the second year was 80 points. A total score was calculated and used as a measure of learning performance in mathematics (Cronbach’s alpha = .69). In the third year, five points per (sub)problem could be earned with a maximum obtainable score of 45 points (Cronbach’s alpha = .77). Because of the differences in the number of obtainable scores per item over the years, the mean proportion of right answers (p value) was calculated for all questions in each year as well as the mean p value per year. These p values were very similar over the years.
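The p values make the posttests comparable despite the different point scales. A minimal sketch, assuming that an item’s p value is the points earned on it divided by the points obtainable (the text does not spell out the computation); the function names and participant scores are hypothetical, while the second-year maxima are taken from the description above.

```python
def item_p_values(scores, max_points):
    """Proportion of obtainable points earned on each item, over all participants.

    scores holds one list of item scores per participant.
    """
    n = len(scores)
    return [
        sum(row[i] for row in scores) / (n * item_max)
        for i, item_max in enumerate(max_points)
    ]


def mean_p_value(scores, max_points):
    """Mean of the item p values for one year."""
    p = item_p_values(scores, max_points)
    return sum(p) / len(p)


# Second-year maxima from the text: five problems of 10 points, problem 6 worth 30.
year2_max = [10, 10, 10, 10, 10, 30]

# Hypothetical scores of two participants.
year2_scores = [
    [10, 5, 0, 10, 5, 15],
    [10, 10, 5, 0, 5, 30],
]
p_items = item_p_values(year2_scores, year2_max)
p_mean = mean_p_value(year2_scores, year2_max)
```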

Procedure

Each year, the intellectual ability tests were administered during a group session. The individual thinking aloud sessions took place during school time. The experiment had a counterbalanced design with respect to task order, meaning that half of the participants started with history and the other half with mathematics. Participants were instructed to think aloud while performing the tasks. The experimenter refrained from helping students in any way. Whenever a student fell silent, the experimenter used standard prompts (e.g., “please, keep on thinking aloud”).

Results

Development of metacognitive skills and intellectual ability

To analyze growth in intellectual ability and metacognitive skills, results of the three consecutive years were compared. First, multivariate analysis of variance (MANOVA) was performed on the raw subscale scores of intellectual ability with age as within-subjects factor. A significant effect of age was found [F(7, 18) = 7.40, p < .001, η² = .78]. Pairwise comparisons (comparing the first to the second year and the second to the third year) showed an incremental change in intellectual ability over the years. Furthermore, separate MANOVAs with repeated measures were performed on quantitative and qualitative subscale scores of metacognition in both tasks with age as within-subjects factor. A significant age effect was found for the frequency of metacognitive skills in math [F(8, 17) = 4.32, p < .01, η² = .67], whereas for history, no significant age effect was found [F(8, 17) = 2.14, p > .05, η² = .50]. MANOVA on the quality of metacognitive skills revealed a significant age effect for both tasks [F(8, 17) = 2.90, p < .05, η² = .58 for math and F(8, 17) = 3.28, p < .02, η² = .61 for history]. Pairwise comparisons were performed in order to look closer into changes on subscale level over the years. These tests revealed different developmental patterns at subscale level (see Figs. 1 and 2).
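The effect sizes of the four metacognition tests can be related to their F values. The sketch below assumes that each η² is a partial eta squared belonging to an exact multivariate F test, in which case η² = F · df1 / (F · df1 + df2); the text does not state how η² was computed, so this reading is an assumption, and the function name is hypothetical.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Partial eta squared implied by an exact F test (assumed relation)."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F values of the four metacognition MANOVAs, all with df = (8, 17).
metacognition_tests = {
    "frequency, math": 4.32,
    "frequency, history": 2.14,
    "quality, math": 2.90,
    "quality, history": 3.28,
}
implied = {
    name: round(eta_squared_from_f(f, 8, 17), 2)
    for name, f in metacognition_tests.items()
}
```

Under this assumption, the implied values agree to two decimals with the η² values reported for these four tests.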

Fig. 1 Qualitative metacognition scores across age (m = math; h = history)

Fig. 2 Quantitative metacognition scores across age (m = math; h = history)

The majority of metacognition scores did not increase continuously over the years. Three main patterns can be observed: (1) growth between the first and the second year, followed by stabilization in scores between the second and the third year (orientation in history; planning in math; evaluation in math, qualitative scores; planning in history, quantitative); (2) growth between the first and the second year followed by regression between the second and third year (evaluation in history, quality; all quantitative scores in math; evaluation in history, quantitative), and (3) incidentally, no growth over the years (elaboration in history; orientation in math, qualitative scores). In conclusion, discontinuity in development was found on subscale level. The various subscales of metacognition do not develop linearly or at the same pace.

Testing the mixed model

To determine whether developmental processes affect the relation between intellectual ability and metacognition as predictors of learning performance, the correlations between these three variables over the three consecutive years were compared (see Table 2).

Table 2 Correlations and semi-partial correlations

For the math task, intellectual ability correlated significantly with both quality of metacognitive skills and learning performance in the three consecutive years. The same applies to the correlation between quality of metacognitive skills and learning performance. The correlation between frequency of metacognitive skills for math and learning performance was significant only in the first year. The correlation between intellectual ability and frequency of metacognitive skills was significant, except for the second year.

Results on the history task differ partly from results on the math task. Only in the first year was a significant correlation found between intellectual ability and learning performance. The correlation between intellectual ability and quality of metacognitive skills was significant in the third year only. The correlation between quality of metacognitive skills and learning performance was significant in the first and the third years. The same applies to frequency of metacognitive skills and learning performance. No significant correlations were found between intellectual ability and frequency of metacognitive skills for history.

To test the mixed model, semi-partial correlations for each age group (Nunnally 1967) were calculated by partialling metacognitive skills from the correlation between intellectual ability and learning performance (i.e., semiIA) and partialling intellectual ability from the correlation between metacognitive skills and learning performance (i.e., semiMeta). These semi-partial correlations (see Table 2) are needed to calculate the unique, independent contributions of metacognitive skills and intellectual ability to learning performance. Using regression-analytic techniques (Pedhazur 1982; Veenman et al. 2004; Veenman and Spaans 2005), the unique and shared proportions of variance in learning performance were distributed to metacognitive skills and intellectual ability (see Table 3).
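The variance partitioning described above can be illustrated with a minimal sketch. It assumes the standard two-predictor formula for a semi-partial correlation and treats the squared semi-partial correlation as the unique proportion of variance. The function names and the three input correlations are hypothetical illustrations, not values from Table 2, and the sketch is not necessarily the exact procedure followed in the cited sources.

```python
import math


def semipartial(r_y1: float, r_y2: float, r_12: float) -> float:
    """Semi-partial correlation of predictor 1 with the criterion y,
    after predictor 2 has been partialled out of predictor 1."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1.0 - r_12 ** 2)


def variance_partition(r_y_ia: float, r_y_meta: float, r_ia_meta: float) -> dict:
    """Split the explained variance in learning performance (y) into the
    unique part of intellectual ability (IA), the unique part of
    metacognitive skills (Meta), and the part they share."""
    semi_ia = semipartial(r_y_ia, r_y_meta, r_ia_meta)
    semi_meta = semipartial(r_y_meta, r_y_ia, r_ia_meta)
    unique_ia = semi_ia ** 2
    unique_meta = semi_meta ** 2
    # Total explained variance: zero-order r^2 of one predictor plus the
    # squared semi-partial correlation of the other.
    r_squared = r_y_ia ** 2 + unique_meta
    shared = r_squared - unique_ia - unique_meta
    return {
        "semiIA": semi_ia,
        "semiMeta": semi_meta,
        "unique_ia": unique_ia,
        "unique_meta": unique_meta,
        "shared": shared,
        "total": r_squared,
    }


if __name__ == "__main__":
    # Hypothetical correlations, chosen only for illustration.
    parts = variance_partition(r_y_ia=0.50, r_y_meta=0.60, r_ia_meta=0.40)
    for name, value in parts.items():
        print(f"{name}: {value:.3f}")
```

Multiplying the three proportions by 100 gives percentages of the kind reported in a variance table. Under this decomposition the shared part can become negative when suppression occurs, so a negative value is not necessarily an error.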

Table 3 Percentage of variance accounted for in learning performance

History results in Table 3 show that, despite the variance shared with intellectual ability, both frequency (QN) and quality (QL) of metacognitive skills substantially added to the prediction of learning performance on top of intellectual ability. Between 13 and 14 years, the unique contribution of metacognition decreased, only to increase again between 14 and 15 years. The unique contribution of intellectual ability to learning performance in history faded out over the years. Math results show an increasing contribution of metacognitive skillfulness to the prediction of learning performance on top of intellectual ability between 13 and 14 years. At 15 years, however, this unique contribution practically disappeared. The unique contribution of intellectual ability to learning performance on top of the quality of metacognitive skills decreased substantially between 13 and 14 years, but reappeared at 15 years. To check whether the contribution of metacognitive skills differed significantly over the years, Fisher-z ratios were calculated for pairs of correlations (Guilford 1965). All Fisher-z ratios were smaller than 1.45, meaning that none of the correlations differed significantly.
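As an illustration of how such a comparison of two correlations works, the sketch below uses the textbook Fisher r-to-z transformation with the standard error for two independent samples. This is an assumption: the text does not state which variant of the test was applied, and the correlations here come from the same participants measured in different years. The function name, the correlations, and the sample sizes are hypothetical. A ratio below 1.96 in absolute value is not significant at the .05 level (two-tailed).

```python
import math


def fisher_z_ratio(r1: float, n1: int, r2: float, n2: int) -> float:
    """Fisher-z ratio for the difference between two correlations,
    using the independent-samples standard error (an assumption)."""
    z1 = math.atanh(r1)  # Fisher r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


if __name__ == "__main__":
    # Hypothetical correlations and sample sizes, for illustration only.
    ratio = fisher_z_ratio(r1=0.60, n1=28, r2=0.30, n2=28)
    print(f"Fisher-z ratio: {ratio:.2f}")
    print("significant at .05 (two-tailed):", abs(ratio) > 1.96)
```

With small samples the standard error is large, so even sizable differences between correlations can fail to reach significance.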

Metacognitive skills across domains

The generality vs. domain specificity of metacognitive skills was investigated by performing a principal component analysis (PCA) on qualitative metacognition scores. For each year separately, a PCA with a two-component solution was performed on the four subscales of metacognitive skills for both tasks (see Table 4).

Table 4 Unrotated component matrix for the quality of metacognitive skills

The unrotated solutions of the PCAs show that all measures of quality of metacognitive skills load substantially on the first component (see Table 4). This component has eigenvalues of 3.53 (13 years), 3.28 (14 years), and 3.87 (15 years), with variance proportions of .44, .41, and .48, respectively. Moreover, in the first 2 years, a second component contrasting the two domains was extracted with eigenvalues of 1.78 (13 years) and 1.40 (14 years) and with variance proportions of .22 and .17, respectively. Loadings on the second component in the third year, with an eigenvalue of 1.11 (15 years) and a variance proportion of .14, did not contrast the two domains (see Table 4). The same analysis was performed on quantitative scores of metacognitive skills. Results were in line with those of the qualitative data.
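The link between the reported eigenvalues and variance proportions can be checked with a short sketch. It assumes that the PCA was run on the correlation matrix of the eight measures (four subscales for each of two tasks), in which case the total variance equals the number of variables and each proportion is the eigenvalue divided by eight. The names below are hypothetical; the eigenvalues and proportions are those given in the text.

```python
# Number of analysed variables: four subscales for each of two tasks.
N_VARIABLES = 8

# (eigenvalue, reported variance proportion) pairs taken from the text.
REPORTED = {
    "13 years": [(3.53, 0.44), (1.78, 0.22)],
    "14 years": [(3.28, 0.41), (1.40, 0.17)],
    "15 years": [(3.87, 0.48), (1.11, 0.14)],
}


def variance_proportion(eigenvalue: float, n_variables: int = N_VARIABLES) -> float:
    """Proportion of total variance for one component, assuming a PCA
    on standardized variables (total variance = number of variables)."""
    return eigenvalue / n_variables


def max_discrepancy() -> float:
    """Largest absolute gap between computed and reported proportions."""
    return max(
        abs(variance_proportion(eigenvalue) - reported)
        for pairs in REPORTED.values()
        for eigenvalue, reported in pairs
    )


if __name__ == "__main__":
    for age, pairs in REPORTED.items():
        for eigenvalue, reported in pairs:
            computed = variance_proportion(eigenvalue)
            print(f"{age}: eigenvalue {eigenvalue:.2f} -> {computed:.4f} (reported {reported:.2f})")
```

Under this assumption, all reported proportions agree with the eigenvalues to within rounding of the two-decimal values.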

Discussion

This study addressed developmental changes in metacognitive skillfulness in young adolescents aged 12 to 15 years. The research aimed to gain insight into (a) whether metacognitive skills grow in frequency and/or in quality during young adolescence; (b) how metacognitive skills relate to intellectual ability as predictors of learning performance during this period in life; and (c) whether metacognitive skills are general or domain specific by nature in young adolescence. It was expected that metacognitive skills would show a continuous increase (hypothesis 1). Furthermore, it was expected that metacognitive skills would make a unique contribution, on top of intellectual ability, to the prediction of learning performance. Moreover, it was expected that intellectual ability and metacognitive skills would develop in a monotonic way as predictors of learning performance (hypothesis 2). Finally, it was predicted that metacognitive skills would tend to generalize across development (hypothesis 3).

Results concerning the growth of metacognitive skills were not quite as expected. Between the first and the second years (13 to 14 years), growth was indeed found in both frequency and quality of metacognitive skills. This growth, however, did not continue after the second year (between 14 and 15 years). In only one of the subscales did metacognition scores increase continuously over the three consecutive years, whereas most of the subscale scores stabilized or regressed in the third year. The first hypothesis is therefore only partly corroborated, as metacognitive skills show discontinuity in growth between 14 and 15 years. In prior research (Veenman and Spaans 2005; Veenman et al. 2004), a continuous growth of metacognitive skillfulness was found. It should be mentioned, however, that these studies were concerned with larger intervals between measurements and did not focus on the development between 14 and 15 years in particular. Moreover, these studies had a cross-sectional design and the same tasks were used over the years. Therefore, it is presumed that growth is only temporarily arrested. According to the dynamic systems theory (Siegler et al. 2010), a theory that focuses on how change occurs over time in complex systems, individual children acquire skills at different ages and at a different pace. Individual development entails regressions as well as progress. Development of metacognitive skills seems to be in line with the dynamic systems theory: During development, both progress and regression occur, and not all components of metacognitive skills develop at the same pace. Vukman and Licardo (2010) also found a decrease in all fields of self-regulation from age 14 to 18 years, followed by an increase to the age of 22 years.

The second hypothesis concerned the relation between metacognitive skills and intellectual ability as predictors of learning performance over age groups. A unique contribution of metacognitive skills to learning performance and a shared contribution of metacognitive skills with intellectual ability to learning performance were found in all three consecutive years, with the exception of the frequency of metacognitive skills in math in the third year. The unique contribution of the quality of metacognitive skills in math in the third year was rather small (1.6 %). In a cross-sectional study with the same tasks and the same age groups, however, a much higher unique contribution of metacognitive skills (42.8 %) in 15-year-olds was found (Van der Stel et al. 2010). In conclusion, the mixed model was found over the years for both history and math, albeit less convincingly than expected for math in the third year.

Another part of hypothesis 2 was the expected monotonic development of metacognitive skills and intellectual ability in line with Alexander’s monotonic development hypothesis. Results of the first 2 years of the present study point in the direction of a monotonic development of metacognitive skills: A continuous growth of metacognitive skills with age was found, alongside intellectual growth. Results of the third year, however, show a continued growth in intellectual ability, but no further growth in metacognitive skills. Despite the discontinuity in metacognitive growth, the results predominantly agree with the monotonic development hypothesis. Apparently, intellectual development does not direct metacognitive development, as would be the case for both the ceiling and the acceleration hypotheses (see “Relation between intellectual ability, metacognitive skills, and learning performance from a developmental perspective”). Moreover, the mixed model was found each year, confirming an independent contribution of metacognition to learning performance. Therefore, the current results are considered to be largely in line with the monotonic development hypothesis. Support for the mixed model and the monotonic development hypothesis implies an autonomous contribution of metacognitive skills as predictors of learning performance, which is relevant to the training of metacognitive skills in education.

The third research question concerned the generality vs. domain specificity of metacognitive skills. In the first 2 years, the solutions of the PCA show a highly similar two-component solution: a first component with rather high component loadings, which may be interpreted as representing general metacognitive skills across domains, and a second weaker component with contrasted component loadings, which may be interpreted as representing domain-specific metacognitive skills. In the third year, however, the solution of the PCA changed: The first component still can be interpreted as representing general metacognitive skills, but the structure of the second component has become much more scattered. It no longer can be interpreted as a domain-specific component. The results support the expectation that metacognitive skills of rather young and inexperienced adolescents represent a general as well as a domain-specific component. Veenman and Spaans (2005) assumed that metacognitive skills initially develop on separate islands of tasks and domains. They also assumed that beyond the age of 12 years, these skills merge into a more general repertoire that is applicable and transferable across tasks and domains. Among young adolescents, a phase of transition could be characterized by applying recently acquired general metacognitive skills, along with a remainder of domain-specific metacognitive skills. In line with hypothesis 3, it was expected that the initially acquired domain-specific metacognitive skills would tend to generalize during development. Although the present results corroborate hypothesis 3, the generalization process was less gradual than expected. Next to drawing on a repertoire of general metacognitive skills, students continue to apply domain-specific metacognitive skills between the ages of 12 and 14 years. This may indicate that these students are still in a transitory phase of metacognitive skill development. A future longitudinal study starting in primary school would more fully test the hypothesis that metacognitive skills start to develop on entirely separate islands and then tend to generalize with increasing age.

From an instructional perspective, it would be advisable to extend the training of domain-specific metacognitive skills in a particular learning context to domain-surpassing instruction in various learning contexts (Veenman et al. 2004). Students will profit from an explicit training in metacognitive skills more effectively if that training surpasses a particular learning context. Metacognitive skills, acquired in separate domains, may gradually be generalized across domains (Schneider and Pressley 1997). This process can be considered as high road transfer (Salomon and Perkins 1989). Teachers should initially encourage students to develop their domain-specific metacognitive skills. As a next step, teachers should pay attention to the generalized applicability of the students’ repertoire of metacognitive skills across domains. If teachers from various disciplines attune their instructions regarding metacognitive skills, transfer of metacognitive skills could be facilitated, thus providing students with tools for performing new tasks in new domains. Nickerson et al. (1985) stated that a prominent point in the teaching of metacognition is its relationship to transfer. According to them, there is the possibility of treating transfer itself as a metacognitive skill, which could be directly trained. By doing so, generalization and transfer are no longer considered as “hoped-for-by-products” (p. 301) of teaching.

It should be acknowledged that there are limitations to the present study. A first limitation concerns generalizability, given the small sample size. The time-consuming method of protocol analysis of individual sessions did not allow for a larger sample. A second limitation is the fact that all participants came from the same school. Both limitations could have affected the generalizability of the results. Another limitation could be the dissimilarity in tasks over the years. Repeatedly measuring learning performance in a longitudinal design, however, makes the use of new tasks each year inevitable. By piloting the tasks and consulting teachers, efforts were made to balance the relative difficulty level of the tasks for each age group. Despite the efforts to make the posttests as comparable as possible in relative level of difficulty, participants could have perceived a difference in level of difficulty other than just a relative difference. Therefore, all posttest results were checked on differences in level of difficulty. Tests did not reveal a significant difference in difficulty between the learning outcomes over the 3 years. Finally, the current study relied on one online method for assessing metacognitive skills. A multi-method design might make it possible to assess metacognitive skills in a more fine-grained way. For example, in text studying, eye tracking could be added to thinking aloud (Kinnunen and Vauras 1995).

In the present study, discontinuity in growth took place in the same period that metacognitive skills became fully general. Maybe due to cognitive overload, growth and transition cannot take place at the same time, but occur alternately. Metacognitive skills are considered as procedural knowledge, that is, a production system of condition–action rules acquired in specific domains for specific tasks (Anderson 1996; Veenman 2011; Winne 2010). The condition part of production rules triggers certain activities (actions) of the learner. The intermittent growth of metacognitive skills could mean a temporary hold on the action part. The generalization process of conditions could be considered as a qualitative change for which the growth of action parts temporarily has to give way. Once students are capable of transferring metacognitive skills that were acquired in one context to another different context, the frequency and quality of their metacognitive activity will continue to increase.

In a recent cross-sectional study by Veenman et al. (in preparation), 119 participants at the ages of 14, 15, and 16 years performed a computerized inductive learning task in the domain of geography. Metacognitive skillfulness was automatically scored from logfiles in which all activities of participants were registered (for details of logfile analysis, see Veenman et al. 2013). Between 14 and 15 years, a similar regression in metacognitive skillfulness was found, which was followed by recovery and strong growth between 15 and 16 years. It remains to be further established, however, whether the concurrence of the two developmental processes of growth and generalization is merely coincidental in the present study or crucial to 14- to 15-year-olds in general. Future research with more participants performing widely varying tasks over a longer period of time could give more insight into whether the two concurrent developmental processes found at the age of 15 years can be replicated as a stable pattern. Knowing that brain maturation goes on until the early 20s at least, it would be very interesting to follow students for an extended period of time across development. A longitudinal design, starting in primary school and ending in late adolescence, should be considered for future research. A more realistic, that is, pragmatic alternative might be an overlapping roof tile construction of cross-sectional and longitudinal research combined in one study. In such a design, one group of participants will be followed for a number of consecutive years (e.g., at the age of 8, 9, and 10 years), another group of participants will be followed at different, partly overlapping ages (e.g., at the age of 10, 11, and 12 years), a third group from 12 to 14 years, and so on. This way, a lengthy period can be covered by a relatively short period of data collection. To monitor development closely, intervals between assessments should be no longer than 1 year. In such a long-term longitudinal design, the focus will be on processes of change, rather than describing steady states at different ages with cross-sectional designs.

During the last decade, neurocognitive developmental research has shown that changes in the adolescent brain are nonlinear, nonsynchronous, and with large individual differences (Casey et al. 2008; Steinberg 2005; Toga et al. 2006). The current study shows that different components of metacognitive skills develop neither at the same pace nor continuously. Knowledge about the developmental trajectory of the various components of metacognitive skills will enable teachers to teach the right things at the right time.