Introduction

In international large-scale assessments such as the Trends in International Mathematics and Science Study (TIMSS), students were found to have difficulties with meaningful learning, as their knowledge was often not organized cumulatively and few connections were made between single facts (Duschl et al. 2007, p.213; Walter et al. 2006, p.91). Meaningful learning (i.e., learning that systematically builds on students’ prior knowledge) has been proposed as a means to improve students’ scientific literacy (Ausubel 1963; Doménech et al. 2007). Several official documents (NRC 2012) and studies (Krajcik et al. 2012) have considered core ideas and crosscutting concepts to promote meaningful learning. Even though empirical evidence for a beneficial effect of core ideas on students’ meaningful (cumulative) learning has not yet been published, Bransford et al. (1999) conducted a literature review and concluded that expert knowledge is closely constructed around core ideas. It may therefore be assumed that core ideas can support the integration of new knowledge into existing understanding and that students can also benefit from using core ideas to structure learning. While core ideas aim at the organization and revision of contents within one subject, crosscutting concepts are of general scientific importance and therefore enable students to organize learning across disciplinary boundaries more coherently (NRC 2012). Energy is a core idea in all science disciplines, but is also considered a crosscutting concept (Constantinou and Papadouris 2012; Liu and McKeough 2005; Nordine et al. 2010). As such, it is thought to enable a deeper understanding through a unified analysis of various contexts (Krajcik et al. 2012; Solbes et al. 2009). To develop meaningful learning materials and curricular standards with respect to core ideas like energy, research on students’ progression in learning these concepts is required.

Most studies have so far focused on students’ understanding of energy in physics contexts. To integrate energy more effectively in view of the concept’s interdisciplinary character, research is also required on how students’ understanding of energy develops in the other science disciplines. This study originated from the observation that students are likely to encounter energy implicitly (informally) in biological contexts prior to the formal teaching of energy in lower secondary physics, which potentially makes energy learning in one subject dependent on learning in another. By implicit (informal) learning opportunities, we mean instances in which the term energy is used in its scientific sense while the concept’s meaning is neither explained nor introduced. At primary level, examples of such learning opportunities include contexts like nutrition, health, growth, movement, plant growth, and energy sources. Complementing the available research on energy learning in biology (e.g., Jin and Anderson 2012), we focus specifically on young learners’ (grades 3–6) understanding of energy and on how students’ later understanding of central energy aspects (e.g., in physics) may be influenced by learning about energy in these contexts in early science and biology classes.

Theoretical Background

Apart from its scientific understanding, energy is associated with various everyday connotations and gets extensive media attention due to socio-economically or environmentally relevant topics such as climate change, (renewable) energy sources, health issues (ozone hole, malnutrition, nuclear energy), or globalization. As a result, students’ understanding of energy is influenced by both everyday and scientific contexts (Boyes and Stanisstreet 1990; Jin and Anderson 2012; Solomon 1983).

The concept’s overarching importance in science has led to a strengthened position of the energy concept in the science standards and curricula of different countries in recent years (e.g., Australia: ACARA 2013; Germany: KMK 2005a, 2005b; Switzerland: EDK 2011; UK: DfE 2013; USA: NRC 2012). As an example, the US Next Generation Science Standards (NGSS Lead States 2013) introduce energy both as a core idea in all science subjects and as a crosscutting concept, thereby underlining its unique role for science education: As a core idea, energy is introduced as a continuous basis for learning and revision of different content and to stimulate the coherence and depth of learning between the various topics within each discipline (NRC 2012, p.2, 11). Through its recurring position in science and engineering, energy as a crosscutting concept is to help students organize knowledge and core ideas from different subjects into a more coherent and scientific understanding of the world (NRC 2012, p.2, 83). Table 1 summarizes the NGSS’ learning expectations for different disciplinary contexts for students from kindergarten to high school.

Table 1 Successively more complex content of understanding energy as proposed by the Next Generation Science Standards (NGSS Lead States 2013)

Apart from national standards, more general expectations for an understanding of energy have been proposed (e.g., Duit 1984) and used for researching students’ progression towards understanding energy (Liu and McKeough 2005; Liu and Ruiz 2008). These studies have usually identified four energy aspects (referred to as such in the following) that are required for a working understanding of the concept: (1) manifestations of energy in different forms and sources, as well as the nature of energy, (2) energy transfers and transformations, (3) energy degradation and dissipation, and (4) energy conservation (Chen et al. 2014).

Research on Students’ Energy Concepts

  1. Earlier qualitative approaches: Most studies on students’ energy concepts focused on physics contexts. Here, learners of all age groups were found to have limited understanding of energy (kindergarten: Van Hook and Huziak-Clark 2008; primary: Liu and Ruiz 2008; lower secondary: Trumper 1993; upper secondary: Duit and Kesidou 1988; Finegold and Trumper 1989). Studies showed that students especially lack insight into energy degradation and conservation (Driver and Warrington 1985; Duit 1984; Duit and Kesidou 1988; Forde 2003; Gayford 1986; Goldring and Osborne 1994; Liu and McKeough 2005). For example, students’ difficulties in recognizing energy export as heat from open systems into the environment (Burger 2001) and students’ failure to apply energy conservation in idealized closed systems (Duit 1984) have been reported. Similarly, elementary teachers (Trumper 1997a, 1997b; Trumper et al. 2000), pre-service biology students, and biology teachers (Chabalengula et al. 2011) had limited understanding of energy. Several publications (Trumper 1990, 1993; Wandersee et al. 1994) conclude that, despite explicit instruction on energy, earlier everyday notions of energy are retained as alternative energy conceptions that differ from the scientific energy concept. Furthermore, students have been shown not to recognize energy as abstract and immaterial, but rather to regard it as a concrete substance (Hirca and Akdeniz 2008; Warren 1982). Further research findings about qualitative facets of understanding energy can be found in review articles by Kurnaz and Sağlam Arslan (2011), Millar (2005), Nordine et al. (2010), and Tatar and Oktay (2007).

  2. Energy in biological contexts: Specifically for biology, the empirical results for learning about energy are similar to those for physics contexts, but only a few studies were conducted in this field. On a general level, the available studies emphasize the problematic link between everyday connotations of energy and the often strongly contextualized scientific nature of biological contexts (Jin and Anderson 2012). Boyes and Stanisstreet (1990) described students’ (age 11–17) understanding of energy sources for animals and plants. Most students thought that energy for plants is received from the sun, but they also named water, soil, or air as sources of energy. Energy sources for animals were often seen in wind, keeping warm, water, and sleeping. A detailed analysis of students’ (grades 5–12) alternative conceptions of energy in biology is available in a study by Burger (2001). Chabalengula et al. (2011) tested first-year university biology students and found that energy conservation may be reproduced as a definition, but that the law was not applied in biological contexts. Furthermore, the authors asked students to explain energy using a biological context. Students, however, did not refer to the biological context, but explained energy using a physical context instead. More than half the sample was found to think of energy in biological systems as a different type of energy than the energy in physical systems, which the authors linked to vitalistic conceptions such as the idea that living beings are constituted from different materials than inanimate objects. Lin and Hu (2003) addressed energy flow and matter cycling in 7th grade students. The authors concluded that even though energy is explicitly introduced 2 years later, energy already has an important role for contexts relevant to students in 7th grade. An emphasis on recalling definitions in textbooks, insufficient teacher knowledge, and students’ difficulties with transferring principles from the non-living world to living beings have been reported as potential reasons for limitations in integrated conceptual learning. Jin and Anderson (2012) proposed a learning progression for energy concepts in carbon-transforming processes. Of four possible achievement levels, only students at the highest level were able to use the energy concept as an analytical tool. In contrast, level 1 students were found to employ force-dynamic accounts and see energy as a cause of life and actions, while constraints in processes were seen in agents or life itself and not in conservation laws. Only 10 % of high school students reached level 4, while the majority were categorized as level 2. These results suggest that energy understanding in biological contexts progresses little from elementary school onwards if no specific learning approaches or cross-disciplinary links for learning about energy are provided.

  3. Energy learning for young students: Despite the reported shortcomings in students’ energy understanding, young learners have been shown to be able to learn about energy sources, transfers, and storage possibilities through inquiry-based, hands-on activities (Van Hook and Huziak-Clark 2008). Shultz and Coddington (1981) showed that even primary students can conceptually grasp the more difficult energy aspects concerning thermodynamics through practical mechanical experiments. Furthermore, the benefits of an early start to energy concept learning have been addressed by Novak (2005), who showed in a long-term longitudinal study that early energy learning (grades 1 and 2) has positive effects on students’ energy concepts in science classes of their later school life. The above-mentioned studies show that, relative to the concept’s relevance in science and education, students know little about energy, even though they could conceptually handle the concept and would also benefit from an early learning start. Dawson-Tunik (2006) showed that energy learning is characterized by both transitional stages (moving from one level of understanding to the next) and obligatory consolidation stages, at which the understanding is elaborated at the same conceptual level before students can progress to the next level. These findings are especially relevant for early learning opportunities or learning about energy in different disciplinary contexts, where teachers (or school books) may consider energy on a much higher level (e.g., including thermodynamic aspects) than the students, who may only use the concept as a term without conceptual reference. The previous sections presented studies on students’ energy understanding, which has often been found to be limited for various age groups, encompassing restricted knowledge transfer to other subjects (e.g., biology) and insufficient inclusion of young learners’ potential. The following section addresses energy learning across disciplinary contexts and is followed by a summary of the literature review.

  4. Towards a learning progression for energy: Lee and Liu (2009) presented one of the few studies to focus on the interdisciplinary nature of energy learning. The authors focused on knowledge integration and found middle school students’ (grades 7–9) energy concepts to depend strongly on the classes taken. The highest levels of knowledge integration were found for students with a physics-class profile. Life and earth science students were considered to require further instructional support for knowledge integration in energy concept learning. Lancor (2012) published an interdisciplinary (ecology, chemistry, mechanics, electronics) study with university students using student-generated analogies. The results showed that the conceptual metaphors used by students differed much more according to the specific context than according to the discipline the students originated from. These findings underline the need for an interdisciplinary teaching approach to energy that is based more prominently on a learning progression across disciplines. With respect to potential context- or discipline-specific differences in energy understanding, a broad empirical basis concerning students’ energy understanding could be helpful for future restructuring of energy education.

    Krajcik et al. (2012) argue that the key to supporting meaningful learning is to study how learners develop core concepts over time and to use these learning progressions as a basis for helping students to make the necessary connections between individual pieces of knowledge. Learning progressions are “descriptions of the successively more sophisticated ways of thinking about a topic that can follow one another as children learn about and investigate a topic over a broad span of time” (Duschl et al. 2007, p.214). Several learning progressions have been published and will be briefly recapitulated. These more recent approaches to students’ energy concepts were often taken via quantitative assessment and usually large samples or re-analyses of large data sets (e.g., Liu and Ruiz 2008). Liu and Tang (2004) showed significant progression in understanding forms of energy among Chinese and Canadian students in grades 4, 8, and 12. In contrast, students showed little knowledge gain for energy transformation in grades 4–8. Even among grade 12 students (Canada), less than 30 % made reference to energy conservation when they were asked to make a list of energy-related terms and clarify these. Liu and McKeough (2005) described a staircase-like progression of students’ understanding from grades 3–12 across the above-mentioned energy aspects. Students developed high proficiency in the basic stage of activity and work until grade 8. Middle school students achieved only intermediate results until grade 8 for energy forms and sources as well as energy transfer. Degradation and conservation were understood insufficiently even by grade 12 students specializing in physics and math. Similarly, Neumann et al. (2013) focused on middle school students’ energy learning. Their data suggest that students simultaneously develop an understanding of energy transfer and energy dissipation. By grade 10, most students mastered an understanding of forms of energy. Energy dissipation and transfer were mastered by above-average students, and energy conservation was only understood by the most able/oldest students. Concerning energy learning at primary level, Lacy et al. (2014) proposed a hypothetical energy learning progression for grades 3–5 based on mostly mechanical problems. In exploratory interviews, the authors found indications that grade 3 and grade 5 students progressed after a 2-week tutorial towards quantitatively assessing the amount of energy in a system, recognising different energy forms, recognising the idea of energy transfer between objects, and associating energy gains by one object with energy losses by another object in the same system.

  5. Summary: Similarities in alternative student conceptions have been shown for the different disciplinary backgrounds. Concerning the learning progressions mentioned earlier, greater insight is required into students’ progression in contexts other than physics. Here, a common frame or model (e.g., energy aspects) can be useful to project energy learning across disciplinary backgrounds in order to approach a more coherent, crosscutting learning approach to energy. The previous section showed that despite the often reported constraints in energy understanding, young learners are able to learn about energy and can later benefit from doing so. Even though several (mostly biological) contexts with relevance for energy are part of early science instruction, little information is available on when and how young students’ energy understanding progresses along these contexts. This study addresses this concern in order to provide further insight for a crosscutting energy learning progression.

Research Questions and Hypotheses

Research Questions

  1. Do primary students’ scores on an energy test for biological contexts increase significantly from grade 3 to grade 6, i.e., prior to the concept’s formal instruction in physics?

  2. Can specific contents and time points be identified at which primary students’ energy understanding progresses towards a more scientific understanding of energy?

Hypotheses

Primary students are confronted with numerous biological contexts that are relevant to the energy concept before the concept’s scientific understanding is formally (explicitly) introduced in school. These contexts have the potential to trigger the progression towards a basic level of energy concept understanding. This progression in understanding should be statistically testable through the application of test items in the grade band of interest (3–6), and changes should be observable on (a) a quantitative scale (progression of student scores on complex multiple-choice items) and (b) a qualitative scale, representing different patterns of understanding that relate to the learning opportunities in school. The pattern changes can point to progression points at which students turn towards a more scientific understanding of energy. With respect to the work of Lacy et al. (2014), we hypothesise, for instance, that more advanced students more often recognise heat loss of living organisms as an energy transfer to the environment, or that older students less frequently uphold a conception of used energy as being “just gone/lost”.

Methods

The research questions were addressed through an explorative, cross-sectional quantitative study in grades 3–6 using multiple-choice questionnaires. A quantitative explorative design was chosen, as research using qualitative data had already been conducted for biological contexts (e.g., Burger 2001; Chabalengula et al. 2011). On the other hand, no data on quantitative learning trends in biology were available that were comparable to those presented for physics contexts (e.g., Liu and McKeough 2005). However, insight into learning trends is required from all disciplinary backgrounds for the formulation of an interdisciplinary learning progression for energy as a crosscutting concept. With respect to research question two and the hypothesis concerning changing patterns in understanding, a more qualitative analysis through distractor and attractor analysis was conducted (section Results - Distractor-Based Analysis). Three smaller studies (a school book analysis, a teacher survey, and an inter-rater agreement on item-model fit) were conducted as part of the instrument development and in order to address aspects of validity. Accordingly, findings from these studies are detailed within the following section and will be discussed again at the end of this paper in the light of validity and implications. Figure 1 provides an overview of the variables employed in the following section and their relations.

Fig. 1 Variables and their proposed relations as employed in this study. Solid arrows mark relations between variables analysed in this data set; dashed arrows and variables mark hypothetical relations or contents not covered in this study

Instrument Development

Since measuring an abstract construct such as energy may strongly depend on the employed instrument, the following section lays out the steps taken in the development process. Complex multiple-choice items were employed as a medium of assessment to enable the analysis of a larger group of students (see also below). Items were developed on the basis of a two-dimensional model of energy concept understanding suggested by Neumann et al. (2013). The first dimension focuses on the four central aspects of energy understanding, which were also identified in previous studies: (1) forms and sources of energy, (2) energy transfer and transformation, (3) energy degradation and dissipation, as well as (4) energy conservation (Footnote 1). Empirical data (Neumann et al. 2013; Liu and McKeough 2005) suggest an overlapping progression in students’ learning of these energy aspects. The second dimension consists of four levels of cognitive complexity ranging from factual knowledge (a single fact, e.g., the energy source for humans) via mappings (combination of two facts) and relations (combination of three facts) to conceptual understanding (complex insight into an energy aspect). Since Neumann et al. (2013) found no clear progression along the levels of cognitive complexity, these levels were used only as a structured guideline for item construction. The instrument for the data sampling was constructed in four steps:

  1. To identify relevant contexts for energy learning in early science/biology classes, an analysis of science/biology school books was conducted along with a survey on teachers’ use of energy in biology class. In the discussion section, findings from these accompanying studies are compared with the progression of students’ energy understanding to examine how the respective findings relate to each other.

    For primary level, two up-to-date school books (or book series) were used for each of the grades 3/4 (Kraft 2009; Meier 2007a, 2007b) and 5/6 (Gottlieb et al. 2010; Jütte and Kähler 2008). The analysis revealed a substantial, explicit (direct) presence of energy in early science learning. Here, the occurrences of the energy concept increase from an average of 14 occurrences per book in grades 3 and 4 to 84 occurrences per book in grades 5 and 6. However, only 10 % of the total number of occurrences of energy concern energy degradation and conservation. Even though these aspects still only make up 16 % of the occurrences of energy in high school books, the aspects of degradation and conservation were further considered for the instrument development, as they represent crucial components of an understanding of energy.

    In the general science school books for elementary school selected for this study (see above), 15 out of 28 total occurrences were concerned with biological contexts. A different distribution was found for implicit mentions of energy, which we defined as the occurrence of a context that was linked to the energy concept in a reference high school biology textbook (Bayrhuber et al. 2010). Implicit occurrences were found to be much more diverse (grades 5/6, 210 contexts per book), thereby underlining the number of opportunities for teachers to come across energy in early science/biology class.

    A second accompanying study was carried out, surveying teachers (N = 16, average teaching experience 14.7 years) with respect to their teaching of energy in primary school science/biology classes. The results indicated that one of the aspects used in this research, forms of energy, is referred to by all teachers in lower secondary biology class, with a focus on heat (12/16), kinetic energy (9/16), and energy in chemical bonds (7/16). Energy transfers were treated by 4/16 teachers, while thermodynamic aspects were considered by only a minority (5/16), showing a similar trend as in the school books. Teachers varied greatly between addressing (only mentioning) and explaining energy in biology class: For “body movement” (10/14), “nutrition/digestion” (13/14), and “respiratory/circulatory system” (12/14), the majority of teachers only mentioned the term energy when teaching these contexts. At the same time, energy was widely applied, with at least some teachers mentioning energy in each of the contexts required by the curricula for teaching in grades 5/6 (except “human sexuality”).

    In summary, all energy aspects from the model (see above) were also identified in the school book analysis and in the biology teacher survey. Additionally, neither of these two smaller studies identified biology-specific energy aspects. We concluded that, until grade 6, all aspects from the model play a role in biology education and can therefore be used for instrument development.

  2. First, context-independent expectations for students’ response behaviour were formulated for all combinations of energy aspects and levels of cognitive complexity in order to allow a parallel construction of items for different contexts. To achieve different levels of cognitive complexity, Neumann et al. (2013) limited the amount of additional information in the stem of items of higher complexity. However, the authors found no clear trend in generating different levels of difficulty through this design feature and concluded that the additional information may have increased cognitive load, as students have to process this information. Consequently, we did not pursue this design criterion for items in this study, but achieved varying difficulty by increasing the amount and complexity of information expected in students’ replies to higher-complexity items. The generalised expectations for each combination of energy aspects and levels of cognitive complexity were then specified for the contexts found in the school book analysis (step 1). Subsequently, these specified expectations were used as authoring guidelines for the items. Only multiple-choice items were constructed, to exclude writing difficulties among the younger students. Distractors were formulated according to common alternative energy conceptions in biology (e.g., Barak et al. 1997; Boyes and Stanisstreet 1990; Burger 2001). Due to the young age of the students, only a few items were feasible for the development of the instrument. Consequently, the guessing probability and the reading effort were limited by distributing the correct response across two options. The students could therefore score half credit (≥ 50 % correct) and full credit (100 % correct) on each item, while items left blank were awarded 0 points (compare discussion of structural validity). Classically, sums of item scores were formed for each energy aspect and across all items (“mean energy score”, compare Fig. 2) and then divided by the number of items. Mean scores across all students of the different grade bands were used for the cross-grade comparison.

  3. Qualitative student feedback was collected for each item from one grade 4 and one grade 5 student using a questionnaire for assessing item quality (AAAS 2007). Items that students had difficulties with (e.g., with respect to wording) were adapted accordingly. Most students struggled with items representing the higher cognitive levels (“relations” and “concepts”; compare model in section 2.1), possibly due to the amount of required content knowledge and reading requirements. Consequently, items from these two levels were disregarded for further quantitative field testing.

  4. The remaining 48 items from the two lower levels of complexity (“facts” and “mappings”, see model) were subjected to quantitative field testing with N = 168 students from grades 3–6. Three criteria were used to select items for the main data sampling: (a) Since the required instrument aimed at investigating progression in understanding from elementary to secondary school, item difficulty in grades 5 and 6 was used as a reference (minimum 10 %; best fit to normal distribution of difficulties; range 11–47 %). Since items were still difficult for students in the field testing, we provided a help-box on each item of the main data sampling, telling the students how many options were correct in each item. Through this measure, mean item difficulty in grade 6 was lowered to 40 % (range 5–80 %) for full credit and to 71 % (range 49–100 %) for half credit. (b) As a second criterion, items that showed a distinctly negative correlation with grade level (r < −0.15) were excluded from the analysis, as they were considered likely to represent content specific to a single grade or learning group and could therefore introduce construct-irrelevant variance to the results. (c) As a third criterion, the goodness of fit of items and model was assessed: Researchers (N = 18) in the field of biology education without any prior knowledge about this study were asked to sort items into the model. These raters were provided with information about expected content in student responses only, while no clues were given for specific item design or formulation features. Item selection for data sampling only considered items that had been sorted into the model completely correctly in terms of energy aspects by at least 2 of the 3 raters (average correct across all items 75 %). Fleiss’ Kappa (Fleiss et al. 1969) furthermore showed substantial inter-rater agreement (κ = 0.61 ± 0.08 SE) for energy aspects. All items and item parameters are provided in the online resources 1 and 2.
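The inter-rater agreement used in criterion (c) can be illustrated with a minimal sketch of Fleiss’ kappa. The function name `fleiss_kappa`, the layout of the count matrix, and the example ratings are assumptions made for illustration only; they are neither the study’s data nor its analysis script.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who sorted item i into category j (here: one of the energy aspects)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    n_categories = len(counts[0])
    proportions = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_chance = sum(p * p for p in proportions)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings: 4 items, 3 raters, 4 energy aspects (columns).
example_ratings = [
    [3, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 2, 1, 0],
    [0, 0, 0, 3],
]
example_kappa = fleiss_kappa(example_ratings)
```

For these invented ratings, three of four items are sorted unanimously, which gives a kappa of 0.76; the standard error reported in the study would require an additional variance estimate that is not sketched here.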

Data Sampling

As students in grades 3 to 6 can be expected to have a short attention span, test time was limited to one lesson (45 min), in which each student received 8 items (6 items in elementary school). In addition, reading ability (short test for reading speed and correctness: Retelsdorf and Becker 2012), cognitive abilities (KFT-N2, graphic analogies: Heller and Perleth 2000), and intrinsic motivation/interest in science/biology class (Deci and Ryan 2012) were tested as control variables (compare Fig. 1). All tests were administered by the first author and three university students while classroom teachers were present. Testing did not stop until all students had finished.

To partially compensate for the limited feasible number of items per student, two parallel test forms (A and B) were deployed, thereby allowing for a replication and comparison of trends found among students. Each test form contained a separate set of eight items representing the four energy aspects mentioned in the theoretical background section. t tests were conducted to compare mean item difficulty in the two test forms for grades 3/4 (6 items) and grades 5/6 (8 items). The test forms did not differ significantly in item difficulty at either level (grades 3/4: t (5) = 0.37, p = 0.72; grades 5/6: t (7) = 1.27, p = 0.25), therefore allowing comparability of the two forms.
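The comparison of the two test forms can be sketched as follows. The reported degrees of freedom (5 and 7) equal the number of items minus one, which would be consistent with a paired comparison of item difficulties across forms; the text does not name the t test variant, so the pairing is an assumption, as are the function name `paired_t` and the difficulty values.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(form_a, form_b):
    """Return (t, df) for a paired t test on two equally long lists of
    item difficulties, matched position by position."""
    if len(form_a) != len(form_b) or len(form_a) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    differences = [a - b for a, b in zip(form_a, form_b)]
    spread = stdev(differences)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("t is undefined when all differences are identical")
    standard_error = spread / sqrt(len(differences))
    return mean(differences) / standard_error, len(differences) - 1


# Hypothetical item difficulties (share of correct answers) for six matched items.
difficulty_form_a = [0.30, 0.45, 0.50, 0.20, 0.60, 0.35]
difficulty_form_b = [0.25, 0.50, 0.45, 0.30, 0.55, 0.40]
t_value, degrees_of_freedom = paired_t(difficulty_form_a, difficulty_form_b)
```

With six matched items the sketch yields 5 degrees of freedom, as in the grades 3/4 comparison; obtaining the p value would additionally require the t distribution, which is left out here.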

Since the items were constructed analogously to those used by Neumann et al. (2013), we assumed a one-dimensional construct of energy. Cronbach’s α was therefore calculated as an approximation across all six items used in grades 3–6. Due to the small number of items and potentially undetected subscales (e.g., for energy aspects) among the items, internal consistency was low (α = 0.55/0.52) in the two test forms, respectively. However, the Spearman-Brown formula (Remmers et al. 1927) predicted α = 0.70/0.82 for an increase in test length from 6 to 12/24 items, which can easily be put into practice in studies with older students.
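The Spearman-Brown prediction can be retraced with a short sketch. The function name `spearman_brown` is an assumption; the formula itself is the standard prophecy formula, in which the length factor is the ratio of the new to the old number of items.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after lengthening a test by length_factor
    (e.g. 2.0 when going from 6 to 12 comparable items)."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Observed alpha per test form (from the text) and the two lengthening scenarios.
alphas = {"A": 0.55, "B": 0.52}
predicted_12_items = {form: spearman_brown(a, 12 / 6) for form, a in alphas.items()}
predicted_24_items = {form: spearman_brown(a, 24 / 6) for form, a in alphas.items()}
```

Under these inputs the sketch gives about 0.71 and 0.68 for 12 items and about 0.83 and 0.81 for 24 items, which is close to the reported 0.70/0.82 if those figures summarise both forms; how the two forms were combined is not stated in the text.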

Participants were sampled from schools in a large metropolitan area in northern Germany. Altogether, 25 classes from 8 schools with N = 540 students (grade 3 N = 110; grade 4 N = 120; grade 5 N = 188; grade 6 N = 122) participated. The sample was balanced with respect to gender (49 % female). To control for variation in social background, only schools with an intermediate social background index ("KESS" = 3–4 on a scale of 1–6) were chosen. In elementary schools, students of all abilities were present. In grades 5 and 6, students from both main secondary school types, “Stadtteilschulen” (mixed abilities) and “Gymnasien” (higher abilities), were represented in the sample. Data were collected within 2 weeks at the end of the school year 2012 to ensure comparability across classes and schools.

Data and Statistical Analysis

Tests on differences in energy understanding were conducted through one-way ANOVA in SPSS. For each procedure, the data were tested (Shapiro-Wilk) and inspected visually for normal distribution. Hartley’s FMax variance ratio (Pearson and Hartley 1954) was computed due to the larger sample size and showed homogeneity of variances to be given (FMax = 1.61 < 1.96 for N > 60). Post-hoc analysis was conducted using the Games-Howell test, which is reported to perform well with unequal sample sizes (Field 2009, p.374f). Correlation coefficients were calculated using Pearson’s r for parametric and Kendall’s τ for non-parametric data. Since several groups were compared, effect sizes were calculated using ω (Field 2009) and the rating of their strength followed benchmarks by Kirk (1996). Age effects on energy scores were analysed by regression discontinuity analysis via Chow’s test (Chow 1960), following procedures recommended by Lee (2008). Due to significant variation of the covariates reading ability and motivation across levels of the independent variable (grade level), ANCOVA was not a valid procedure in this dataset (Field 2009, p.397). Instead, multiple regression analysis was conducted with forced entry of variables, after which the model was restricted to those variables that significantly improved it.
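Hartley's FMax is the ratio of the largest to the smallest group variance. The following sketch illustrates the check with invented scores; the data, the variable names, and the helper function are hypothetical, and only the critical value of 1.96 is taken from the text above.

```python
from statistics import variance


def hartley_fmax(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(scores) for scores in groups]
    return max(variances) / min(variances)


# Hypothetical mean energy scores (0-2 scale) per grade level, for illustration only.
scores_by_grade = {
    3: [0.2, 0.5, 0.4, 0.8, 0.3, 0.6],
    4: [0.3, 0.6, 0.5, 0.9, 0.4, 0.8],
    5: [0.6, 0.9, 0.7, 1.2, 0.8, 1.1],
    6: [0.8, 1.1, 1.0, 1.4, 0.9, 1.3],
}

CRITICAL_VALUE = 1.96  # critical value quoted in the text for N > 60

fmax = hartley_fmax(scores_by_grade.values())
print(f"FMax = {fmax:.2f}, homogeneous: {fmax < CRITICAL_VALUE}")
```

A ratio below the critical value, as with the reported FMax = 1.61, is read as support for homogeneity of variances.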

Results

Progression of Energy Concepts in Biological Contexts in Grades 3–6

The main results to address research question one are presented in Fig. 2a (test form A) and Fig. 2b (test form B). The mean energy scores across all students of grades 3–6 are presented in separate column stacks for each of the four energy aspects (left) and as a mean energy score across these aspects (right). The two test forms showed similar trends in energy understanding, with a significant increase in energy scores from grade 3 to grade 6 in both test forms, F (3, 266) = 59.50, p < 0.001 (A) and F (3, 266) = 17.44, p < 0.001 (B). The columns labeled “Mean energy aspects (all aspects)” suggest a gain from grade 3 to grade 6 by a factor of 2.3/1.7 (A/B), with a large effect size (Kirk 1996; threshold = 0.14) of ω = 0.39/0.15 for the grade level.
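The reported effect sizes can be retraced from the F statistics alone. The sketch below uses the standard one-way ANOVA estimate of ω² expressed through F and the degrees of freedom. Reading the reported values as ω² rather than ω is an assumption on our part; it is suggested by the fact that this estimate yields about 0.39 and 0.15, matching the values given for test forms A and B.

```python
def omega_squared(f_value: float, df_between: int, df_within: int) -> float:
    """Omega squared for a one-way ANOVA, computed from F and its degrees of freedom.

    Uses omega^2 = df_b * (F - 1) / (df_b * (F - 1) + N), with N = df_b + df_w + 1.
    """
    n_total = df_between + df_within + 1
    effect = df_between * (f_value - 1)
    return effect / (effect + n_total)


# F statistics reported for the grade-level effect in the two test forms.
reported = {"A": 59.50, "B": 17.44}

for form, f_value in reported.items():
    print(f"Form {form}: omega squared = {omega_squared(f_value, 3, 266):.2f}")
```

Both values lie above the quoted threshold of 0.14, in line with the classification of the grade-level effect as large.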

Fig. 2 a/b Mean scores across all students in grades 3–6 (see key) for four energy aspects (left) and as a mean score across all aspects (right). a: Test form A. b: Test form B. A score of 1 refers to a performance with ≥50 % of the respective items or choices within the items being solved correctly; a score of 2 refers to all items being solved correctly. Error bars denote 95 % confidence intervals (CI). Different lower case bold letters indicate a significant difference at p < 0.001; italic lower case letters indicate a difference at p < 0.05. Numbers in brackets behind the key indicate sample size and mean age (± standard deviation, SD)

Student scores were generally low, with considerable but proportionally decreasing variation from grade 3 to grade 6. Significant progression of mean energy scores was observed mainly from grade 4 onwards. Even in the higher grades, students’ scores do not progress far beyond a score of one (out of two), indicating that most students are only able to partially solve energy items in biological contexts.

The differences between students’ scores for the aspects of energy were compared across grades 3–6 for each test form. With the exception of the difference between energy forms and energy degradation in test form B, t tests showed significant differences between each combination of energy aspects (p < 0.001 in 9/12 combinations), indicating a different progression of students’ energy concepts with respect to the energy aspects. Table 2 sums up Fig. 2a, b and shows mean scores (±standard deviations, SD) for the energy aspects across grades 3–6 in both test forms, as well as means across the two. On average, students in both tests scored lowest for energy conservation, while they scored highest in the two aspects not linked to an understanding of thermodynamics, namely energy forms and transfer. The average across the two test forms showed that scores for energy forms were highest, while energy transfer and degradation received medium scores and energy conservation reached only 2/3 of the score for energy forms.

Table 2 Mean scores (±SD) across grades 3–6 for the four energy aspects

Distractor-Based Analysis of Energy Concepts in Biological Contexts

To obtain a more detailed insight into the quantitative findings presented above and to address the second research question, the qualitative patterns of students’ selection of attractors (i.e., the correct answering options) and distractors (i.e., the incorrect answering options) were analysed. Fig. 3a–f show progressions in students’ understanding of plant and animal/human energy sources (Fig. 3a, b), energy transfer in plants and animals/humans (Fig. 3c, d), as well as transfer with focus on degradation and conservation (Fig. 3e, f, respectively).

Fig. 3 Distractor analysis on energy items concerning energy sources, transfer, degradation, and conservation in plants and humans/animals. Progression of the share (%) of 3rd–6th grade students selecting distractors or the correct options (see key). Dotted lines indicate assumed, but not measured progressions, since the study design was cross-sectional. Percentages refer to N = 54–96 students per grade level

Students demonstrate limited understanding of energy storage in plants across all grade levels (Fig. 3a). The data indicate that older students increasingly selected the distractor “light stores” and selected the correct ideas of sugars, fats, and proteins least often. When students were asked to rediscover light energy transformed in a tree (Fig. 3e), their understanding that energy is stored in tree material clearly increased from 1/4 of the 3rd/4th graders to 73 % in grade 6. Concerning a beginning understanding of energy degradation, a substantial share of students in all grades (> 44 %) acknowledged heat loss from a tree into the environment. Few students in any group traced energy into tree movement or electrical currents in nerves (< 20 %). The incorrect understanding of energy leaving the tree via water decreased only among 6th graders (19 %), while roughly half of the students in grades 3–5 still held this idea.

For energy in human nutrition (Fig. 3b–d), 6th graders generally understand that humans obtain energy through foods from sugar, fat, and protein (> 78 %, compare Fig. 3c), with less than 17 % selecting the distractors heat, vitamins, and nutrients. Third and 4th graders show no such trend and select all options to similar degrees. Fifth graders selected options similar to those of older students, but still widely disregarded fat (< 40 %) as an energy source in food and more often selected vitamins (43 %) instead, thereby potentially demonstrating a more vitalistic conception of energy (Barak et al. 1997). Figure 3c also focuses on the often neglected role of oxygen with respect to nutrients as energy sources (Warren 1983). Here, the progress between elementary and secondary students’ appreciation of the role of oxygen for nutrient degradation (here: fat) is comparatively small, with 2/3 of secondary students seeing this relation. In both Fig. 3c and d, a third of the elementary students selected water and sunlight as energy sources for human movement, while older students selected these distractors less often. Heat from the environment as an energy source was disregarded in both items by more than 80 % of all students. At the same time, distinctly more students (43 and 34 %, respectively) incorrectly saw the heat in foods (here: a warm pizza) as an energy source (Fig. 3b).

When students were asked to rediscover energy after its transformation in a muscle (Fig. 3f), the results were more consistent in progression than for the parallel item concerning trees (Fig. 3e): More students recognised heat energy leaking into the environment in the botanical item than in the corresponding zoological item. Few students actively rejected energy conservation: In Fig. 3f, less than 20 % of 4th–6th graders and roughly 30 % of the 3rd graders selected the incorrect option that energy is lost after the muscle movement stops. In contrast, older students more often agreed that sugars are produced in a muscle during movement and that the sugars then store the energy of the movement.

Influence of Control Variables on Energy Scores

All results in this section were achieved by contrasting mean energy scores with the control variable (compare Fig. 1) scores for all students simultaneously through multiple regression analysis. Variables that significantly contributed to the improvement of the regression model are presented in Table 3. Regression coefficients (B), their standard errors (SE B), and the standardised regression coefficients (β) are shown for the variables in each model step. The note below Table 3 shows the corresponding model fit values: The model accounted for 34.5 % of variance (R²) in the energy scores. Generalisation is possible to a large extent, with only 0.4 % shrinkage when transferring the results from the sample to the population (adj. R²). Standardised regression coefficients (in brackets) allow a comparison of relevance between the following factors, which contributed to the best fitting model: Grade level (0.37), reading ability (0.27), and cognitive abilities (0.16), with p < 0.001 for all three.
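The reported shrinkage can be approximated from R², the sample size, and the number of predictors. The sketch below uses the common adjusted R² formula; whether this is the exact formula behind the reported value is an assumption, as is the use of all N = 540 students (the text does not state how many cases entered the regression).

```python
def adjusted_r2(r2: float, n_cases: int, n_predictors: int) -> float:
    """Adjusted R squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


R2 = 0.345          # variance explained, as reported
N_CASES = 540       # assumption: all sampled students entered the model
N_PREDICTORS = 3    # grade level, reading ability, cognitive abilities

adj = adjusted_r2(R2, N_CASES, N_PREDICTORS)
shrinkage = R2 - adj
print(f"adjusted R^2 = {adj:.3f}, shrinkage = {shrinkage * 100:.1f} %")
```

Under these assumptions the shrinkage comes out at roughly 0.4 percentage points, which agrees with the value given in the text.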

Table 3 Multiple regression model. Steps show forced entry inclusion of significantly contributing variables to predicting energy scores (see below)

The following variables were excluded from multiple regression analysis due to a non-significant contribution to model improvement and low or non-significant (n.s.) correlations/differences with/in students’ energy scores: student statements on the extent of prior energy instruction (r = 0.27), interest/motivation in science/biology class (r = 0.27), sex (n.s.), and grade received in biology/science class (n.s.). In addition to multiple regression analysis, the influence of age was assessed through two procedures. First, the correlation of age and energy scores in each of the 24 participating classes identified only one class with a significant correlation of τ = −0.58 (p < 0.01), indicating little effect of age besides the effect of grade level (r = 0.53, p < 0.001 across all students). Second, the effect of age was contrasted with the effect of grade through regression discontinuity analysis (Lee 2008; Luyten 2006). The Chow test showed that the interaction effect between age and grade level was described by significant differences between both slopes and intercepts between grade levels, F (3, 539) = 17.34, p < 0.001, indicating that energy scores were primarily related to grade level and only to a minor extent to age.
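The logic of a Chow test can be sketched for the simplest case of two adjacent grade levels. The example below is a generic two-segment version with synthetic data, not the authors' SPSS procedure: all variable names and values are hypothetical, and it does not reproduce the degrees of freedom reported above, which stem from the authors' model across grade levels.

```python
import numpy as np


def rss(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic: one pooled line versus separate lines per group.

    k is the number of parameters per line (intercept and slope).
    """
    rss_pooled = rss(np.concatenate([x1, x2]), np.concatenate([y1, y2]))
    rss_split = rss(x1, y1) + rss(x2, y2)
    df_den = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / df_den)


# Synthetic example: scores rise slightly with age within a grade,
# but jump between grade 4 and grade 5 (hypothetical values).
rng = np.random.default_rng(0)
age_g4 = rng.uniform(9.5, 10.5, 60)
age_g5 = rng.uniform(10.5, 11.5, 60)
score_g4 = 0.5 + 0.05 * (age_g4 - 10) + rng.normal(0, 0.2, 60)
score_g5 = 0.9 + 0.05 * (age_g5 - 10) + rng.normal(0, 0.2, 60)

print(f"Chow F = {chow_f(age_g4, score_g4, age_g5, score_g5):.2f}")
```

A large F indicates that separate regression lines per grade fit the data better than a single line over age, which is the pattern interpreted in the text as an effect of grade level beyond age.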

To control for a possible school effect (higher representation of high-ability students in grades 5 and 6 in comparison to grades 3 and 4), the progression of student scores was calculated again with only students from mixed ability schools (“Stadtteilschulen”) in grades 5 and 6. In this case, the differences between grade levels were slightly smaller, yet still significant, indicating that the school effect was negligible.

Discussion

The following section addresses aspects of validity and reliability. Afterwards, findings from this study are discussed and possible implications are pointed out.

Discussion of Reliability and Validity

In the section Methods - Data Sampling, we introduced limitations of the questionnaire approach in relation to the learners’ age in this study. Accordingly, the maximum length of the instrument was limited, causing low reliability levels in both test forms (α = 0.55/0.52). Measurement accuracy is therefore limited. Even though we could only compare average student performance in different energy aspects on the basis of few items, it needs to be reiterated that our research focus lay in discovering broad learning trends, which we found in both test forms. With respect to the predicted results for the reliability of longer tests (Spearman-Brown, section 2.2), we therefore deem reliability sufficient for our needs.

In the following, we use Messick’s (1995) unified concept of construct validity to discuss our findings.

  1. Content aspect: The findings from the school book analysis supported our hypothesis, suggesting that students are confronted with energy in biological contexts early (and often implicitly), before explicit instruction on energy in physics. We found that the model by Neumann et al. (2013) was well applicable for school book and curricular analysis, as all occurrences could be assigned to energy aspects. A curricular analysis supported the findings from the school book analysis. The results from the teacher questionnaires indicated that teachers implement at least parts of the suggested teaching of energy found in school books.

  2. Substantive aspect: The theoretical basis of this study was generally addressed through the foundation on core concept-based meaningful learning. For the processes required by students for solving the energy tasks in the items, we refer to the studies by Shultz and Coddington (1981), as well as Novak (2005), showing that early learning about an abstract concept such as energy is possible and sustainable. For empirical proof of the occurring processes in the developed items, think-aloud protocols (N = 4–5 per grade level) are currently being evaluated. Results indicate that by grade 6, students are able to apply the intended processes for solving the employed items.

  3. Structural aspect: In many publications the energy concept has been identified to consist of the four energy aspects mentioned above. These aspects were used as a basis for assessing students’ understanding in this study. Additional aspects specific to biological contexts were not identified in the analysis of school books or the teacher survey. The assessment of fit of the model to items through researcher ratings showed substantial agreement of Fleiss’ κ = 0.61 ± 0.08 SE for the four energy aspects. Distractor analysis revealed that some alternative conceptions are readily decreasing while others persist, indicating that a categorical scoring rubric (0–1–2) was content-wise helpful to achieve more representative results, especially among younger learners. This trend was confirmed, as both the number of items that were only partially worked on (grade 3: 12 %, grade 6: 4 %) and the number of items left blank (grade 3: 7 %, grade 6: 4 %) decreased, indicating that learners’ early ideas about energy were included in this scoring system. Less than 2 % (grade 3) and < 1 % (grade 6) of students misunderstood the instructions and ticked more than the correct number of options, showing that the influence of this factor on energy scores across the grade band was negligible. Even though mean student scores for the different energy aspects vary only on an interval scale due to the limited number of items, we see this problem manifested as a reliability issue, which was discussed above.

  4. Generalizability: This study contained two test forms, which showed similar trends. These may also reflect the specificity of the assessed construct. Concerning the latter, it is important to note that the findings are intended as insight for formulating a more general learning progression on energy. The results from multiple regression analysis have furthermore shown that the findings are well generalizable to the population level (shrinkage: 0.4 %).

  5. External aspect: The data showed a significant correlation of energy scores and students’ reading skills (r = 0.48), underlining the limitations of written assessment at elementary school for more abstract/difficult topics like energy. The correlation with cognitive skills (written language independent, fluid intelligence; r = 0.28) was low, indicating, in combination with the limited correlations with further control variables, that the energy scores in this study were not bound to other common learning variables. Taking a closer look, the limited empirical connection between motivation/interest in science and our measured energy scores (r² = 0.27) may reflect that the energy concept, as we tested it, is not among the regular science topics primary students want to find out about. If students are interested in energy, the content they encounter is possibly more likely to include, e.g., the energy transition or global change, and less often energy in the sense of an abstract concept to be used as a tool. The low correlation between test scores and students’ statements on prior coverage of energy in school may relate to the trend of teachers and school books mentioning energy without explaining it (compare results from school book analysis and teacher interviews in the section Methods - Instrument Development). This may also be reflected in the non-significant correlation with students’ science grades. Our finding that energy scores were more related to grade level than to age would suggest that curricular-dependent factors are more relevant for energy scores than extra-curricular ones.

  6. Consequential aspect: The section Discussion - Weaknesses and Perspective below points out limitations of the presented data. In light of this, it is important to note that this study aims only at the identification of learning trends to provide further insight for the formulation of a wider learning progression. The instrument was neither developed for diagnosis at the level of single students nor for giving quantitative statements concerning students’ achievement in relation to e.g., science standards.

Main Findings, Relevance, and Implications

Following the theory of meaningful learning (Ausubel 1963) and results from international student assessment studies, this study was motivated by the long-term goal to pursue more cumulative, meaningful learning in science through the core idea and crosscutting concept energy. We focused on providing further insight for the formulation of a more coherent learning progression for energy. For this, we targeted young students’ learning of energy in biological contexts before explicit introduction of energy in physical contexts (here, typically in grade 8 or 9). Novak (2005, p.36) points out that the commonly “grossly underestimate[d]” capability of young students for conceptual learning was markedly opposed by their findings, showing beneficial effects for later learning through an early introduction of core concepts like energy.

Our analysis of primary school books and teachers’ statements on the inclusion of energy in primary school showed substantial presence of energy in biological contexts at primary level. In combining these findings with Novak’s (2005), a detailed empirical analysis of early energy learning can provide valuable information for the formulation of an energy learning progression that builds on young learners’ potential and includes the respective learning opportunities for energy.

Concerning our research questions, the results showed (1) a significant increase in energy concept understanding in biological contexts prior to the concept’s explicit instruction in physics. (2) By grade 6, the average student scored half the possible points. Still, several students at this level scored almost full credit on our energy test forms. The analysis of distractors and attractors (Fig. 3a–f) showed for specific contents that students’ average understanding of energy progressed at specific grades, thereby approaching, at a basic level, the scientific energy concept. Student proficiency differed significantly between items for different energy aspects. Across test forms, understanding of energy conservation showed the least progression and the lowest scores. On average, the score for energy degradation was similar to that for energy transfer and transformation (mean across all grades and across both test forms = 0.83 for both aspects, Table 2).

We also related the observed student energy understanding to findings from the adjoining sub-studies (school book analysis and teacher interviews, see section Methods - Instrument Development) in order to account for the role of the (intended and enacted) curriculum in students’ progression in understanding the energy concept. The focus on the aspects of energy forms and transfer in both school books and teacher interviews paralleled students’ understanding, which progressed distinctly within the same two energy aspects (compare Fig. 2a, b, Table 2). We therefore argue that the similar scores for energy degradation and energy transfer may have been affected by limited measurement accuracy, as the students’ understanding per energy aspect and per test form was based on so few items. In relation to the findings from the school book analysis, understanding of energy degradation may therefore have been overestimated in our measurement. As the analysis of regression discontinuity showed an effect of grade level over maturation (age), we argue that the differences in students’ understanding of the respective energy aspects may be a reflection of the focus on forms and transfer of energy in biology school books and in science/biology classes (compare teacher survey). These learning opportunities concerning e.g., energy forms may enable 6th grade students to solve items for this aspect more often than younger students. In contrast, the older students may have to make sense of e.g., energy conservation items on our test in the same way as younger students do, since they had a similar number of learning opportunities for this aspect. Since students performed well on items for energy forms, we agree with Jin and Anderson (2012) in suggesting that despite its limitations, the forms of energy approach should be continued, as students may use it as a scaffold for further learning opportunities with energy.

The focus in biology on energy forms/sources and energy transfer/transformation limits the potential of energy as a crosscutting concept and analytical tool, as students cannot apply e.g., the essential idea of energy conservation, for which this analytical concept was developed (Lacy et al. 2014). Our qualitative analysis (Fig. 3e, f) may present an example: by grade 6, relatively few students indicated that energy was “just gone” after an energy transformation, but at the same time relatively few students were able to explain where the energy went, indicating a constraint on their ability to apply energy conservation to a specific context (Chabalengula et al. 2011; Constantinou and Papadouris 2012). For biological contexts, this would require an increased emphasis on energy degradation and conservation in curricula, school books, and the teachers’ awareness when addressing energy in biology. It has already been shown that learning of difficult concepts such as energy conservation is possible in lower secondary school and before (Novak 2005; Shultz and Coddington 1981; Van Hook and Huziak-Clark 2008) if adequate learning arrangements are provided. Our data suggest that, to a certain degree, students learn about energy from implicit learning opportunities. On the basis of the above and our finding that energy in early biology contexts is introduced implicitly much more often than it is addressed explicitly, we argue that an early beginning of a more explicit teaching of energy in biology may have a positive effect on older students’ understanding of energy.

Concerning the interdisciplinary character of energy, our findings suggest that students, apart from an everyday energy understanding, enter explicit energy instruction in physics with early stages of the scientific understanding of energy, which they already start to construct in early learning opportunities in biological contexts. The overlap between disciplinary contexts that students encounter when learning about energy is unlikely to decrease among older students. In the light of meaningful (Ausubel 1963) and interdisciplinary learning, appropriate links between the disciplinary contexts are required as a scaffold for students. With these, an improved application of energy as a tool for analysis of complex contexts at different levels (subatomic/molecular, macro, global; compare theoretical background) and across disciplines could be supported.

For overarching learning goals in biology (such as to understand socio-environmentally relevant topics like climate change or health issues), Holden and Barrow (1984) showed that traditional energy instruction has only a minor effect on the understanding of these topics. For their understanding and evaluation, an appreciation of energy conservation is crucial (Jin and Anderson 2012). If future students are expected to use energy as an analytical tool for deeper analysis of these topics, students’ understanding of energy degradation and conservation in biological contexts is crucial and—in light of the above findings—can still be strengthened.

Reflection of Findings in the Light of Other Studies

The broad trends concerning learning of different energy aspects in biology are similar to those found for physics by Lee and Liu (2009), Liu and McKeough (2005), Liu and Tang (2004), and Neumann et al. (2013) in that our data also demonstrate students’ difficulties in understanding the thermodynamic aspects of energy. Neumann et al. (2013) found that the understanding of energy degradation was not much more difficult than understanding energy transfer, while conservation was understood only by the most able students. Across the two test forms, our results were similar. However, we argue that in biological contexts, difficulties in both energy conservation and degradation may be a reflection of the focus on forms and transfer in biology school books, curricula, and teachers’ employment of energy in biology class (see section Methods - Instrument Development).

The findings from distractor analysis suggest a differentiated progression for alternative conceptions: While some (e.g., Fig. 3c “sunlight”: the idea that humans use energy from sunlight for movement) decline steadily with higher grade levels, others seem not to change until grade 6 (e.g., Fig. 3a “oxygen”: the idea that plants store energy in the form of oxygen inside leaves), while a third set of alternative conceptions declines markedly at a specific age group (e.g., Fig. 3e “discarded water”: the idea that energy transformed by a tree can be rediscovered in discarded water). Concerning the hypothesised pattern change of qualitative understanding, the data indicate that students’ energy understanding can progress in a specific context in several ways: for some ideas, students’ understanding is aligning with the scientific understanding, e.g., the declining idea that energy is simply lost after it was used. Other ideas in the students’ energy understanding represent persistent alternative conceptions that limit the students’ ability to use energy effectively in this context. With respect to the progression of an alternative energy understanding, Burger (2001) demonstrated that 99.8 % of the combinations of students’ correct and incorrect conceptions about energy are logically sound, meaning that students may adopt certain alternative conceptions in order to make them logically fit other conceptions. For the formulation of a learning progression of energy, we therefore think it is important to consider stagnating progression in understanding of energy in a specific context as a possible effect of students’ energy understanding in other disciplinary contexts or other topics.

Boyes and Stanisstreet (1990) and Burger (2001) also analysed grade 5 students with respect to their understanding of energy sources. Similarities across the two studies and our findings with respect to the share of students holding specific conceptions can be identified, e.g., students’ belief that animals take up energy through food was similar in all data sets (Boyes and Stanisstreet 90 %/Burger 67 %/this study 86 %). Based on our data, we agree with Boyes and Stanisstreet (1990) that students may have difficulties with distinguishing between energy requirements and other requirements for plants and animals.

Weaknesses and Perspective

The findings from school book/curriculum analysis and teacher questionnaires have to be interpreted with reservations, as they focus on results from only one region and are not representative. These analyses were conducted only in relation to requirements for this study. Even though the language of the items was revised in qualitative piloting and strictly kept short and simple, the effect of reading ability on energy scores suggests that further research should triangulate the results from this study with interview data. Due to students’ young age and their consequently limited attention span, this study was restricted in testing time, which left little room for additional instruments, e.g., for crystallised intelligence. Further studies currently assess the effect of this possible confounding variable. More importantly, the short testing time markedly limited the number of available items for testing, which decreased reliability, the predictive power of the multiple regression model, and the accuracy of scores for the energy aspects. In this study, we have tried to overcome parts of this problem by replicating our research in two test forms.

We see the benefits of this study in addressing significant contexts for energy learning at an important age, where first ideas concerning energy are constructed and possible connections to everyday conceptions of energy may form. With the design of this study oriented along an understanding of energy as used in physics education research, we try to work towards an interdisciplinary approach to research on the crosscutting facet of the energy concept, which is still too little recognised. In addition to further results for all grade levels and disciplinary contexts, longitudinal research will have to serve as confirmation once an initial interdisciplinary learning progression has been formulated. Implementing these findings in classrooms will first require experimental design studies in order to develop and evaluate useful learning materials.

Conclusion

Young learners have been shown to be able to learn about energy through specially designed instruction (Van Hook and Huziak-Clark 2008). Our results show that students also progress towards basic stages of the scientific understanding of energy in informal, often implicit learning opportunities in regular science/biology classes. Performance on the energy items was linked to students’ reading and cognitive skills. Yet, little or no connection was found with students’ interest in science and their grades in this subject. This led us, in combination with findings from school book analysis and teacher interviews, to conclude that energy as a concept itself is not likely a part of the regularly taught topics in science class, but that students instead formulate their own ideas about the energy concept on the basis of energy-relevant contexts they encounter in implicit energy learning opportunities at school. At primary level, understanding of energy conservation (and potentially degradation) is restricted in the contexts that we analysed, thereby confirming common concerns about early teaching of energy (Warren 1982). Since a basic understanding of conservation has been found conceptually possible at primary level (Shultz and Coddington 1981), we conclude from our findings that energy instruction has to focus more on interdisciplinary learning in order to make up for e.g., a bottleneck of young learners’ energy understanding forming in biological contexts. The progression in understanding the concept in our data is likely to influence later energy learning in physics or chemistry not only through persistent alternative conceptions, but also through valid ideas developed in biological contexts.
With respect to possible benefits of an early start of energy instruction (Novak 2005) and the often described difficulties in students’ later understanding of energy (Herrmann-Abell and DeBoer 2011), we conclude that the numerous implicit opportunities for early energy learning in biological contexts could form the basis for a deeper understanding of the concept in students’ later school life. This may enable learners to use energy more effectively as an interdisciplinary core idea and analytical tool for complex sets of facts.