Introduction

Models and modeling play an important role in the teaching and learning of science, by introducing children to scientific ways of reasoning and linking the worlds of observation and theory (Schwarz et al. 2009). For instance, in their discussion of the teaching of scientific reasoning, Windschitl et al. (2008) argue against the inquiry cycle as the only representation of a scientific method. Instead, based on work by Nersessian (1995), they argue that inquiry entails the construction and evaluation of models. This focus on models and modeling is found among other authors as well, including White and colleagues (Schwarz and White 2005; White and Frederiksen 1998), who explicitly address teaching about modeling in what they call a “meta-modeling” approach. From a philosophical stance, Giere (1999) stresses the model-based nature of scientific knowledge and the importance of visual reasoning with models. Buckley et al. (2004) present the role of modeling in the biology curriculum. Louca and Zacharia (2011), in their review of model-based learning, conclude that this type of learning contributes to science education in cognitive, metacognitive, social, material and epistemological respects. Achér et al. (2007), along with many others, argue for modeling as an integral part of science education. Furthermore, the framework for K-12 science education includes models and modeling as core concepts in science education (National Research Council 2012). Van Borkulo et al. (2011) found that creating models contributes to increased knowledge about the structure of the domain under study.

Constructing models requires the creation and evaluation of model elements and their relations. Modeling is a creative process that requires generating new ideas, often based on analogy and visual re-representation. One possible way to support such re-representation is by using drawings to represent ideas and reasoning processes (Ainsworth et al. 2011). Drawings have the potential to stimulate the generation of new ideas because they are highly expressive and have no syntactical constraints. Drawings have been used in support of scientific reasoning and modeling in a number of studies. Van Meter (2001) successfully used drawings as a means to improve the processing of information from scientific texts. Achér et al. (2007) used drawings to support the construction of models of materials and found that drawing-based models served as a means to translate between forms of “perceived reality,” meaning that the models mediated the (re-)construction of learners’ views of the world. CogSketch (Forbus et al. 2011) also uses drawings, in this case to construct semantic models of domains: drawing elements (“glyphs”) are linked to concepts in a large relational knowledge base, and the spatial configuration of the glyphs supports reasoning about the model.

In work on the solar system, which is also the domain studied in the current paper, Parnafes (2012) studied students’ model construction using drawings, including drawings of situations representing the causes of solar and lunar eclipses. She found that the external, “tangible” aspects of the drawings helped students to retain the record of earlier explanations and to use these to construct deeper explanations and conceptualizations.

During the model-building process, experimentation is not the only form of model evaluation. The focus is often more on the consistency, parsimony and plausibility of the new model. It is also important to assess whether the model will produce results that match expectations. Computer simulation can help in this evaluation, and its results can feed back into improvements of the model.

In the current paper, we take on the challenge of providing relatively young children with a realistic scientific modeling activity. Our aim is to let children create models of a phenomenon of which they have at least partial knowledge. We therefore chose a topic from astronomy as the phenomenon to be modeled: the structure of the solar system and the origin of eclipses. Most children learn at a young age that the Earth orbits the Sun and rotates about its own axis. However, they may know less about the role of the Moon, which orbits the Earth and causes eclipses. The goal of the modeling activity was for the children to express their understanding of the solar system by creating a model and to use the model to extend that understanding.

For the target age group (7–15), creating quantitative models using computer programs or dynamical modeling tools such as Stella (Steed 1992) or Co-Lab (van Joolingen et al. 2005) is not feasible, because children in that age group generally do not have the knowledge and skills needed to use such tools. Alternative tools such as Model-It (Jackson et al. 1994) do not require programming or the specification of quantitative rules, but they are still based on formal mathematical relations.

In the study we present in this article, we introduce a drawing-based modeling tool called SimSketch that allows learners to create models based on drawings. SimSketch is designed to bring modeling within the reach of young learners starting from approximately 8 years old.

The key idea behind SimSketch is to combine drawings with a modeling engine, so that the drawings can not only show static structures in the learners’ models, but can also become animations visualizing dynamic properties of the model. A full description of SimSketch is given elsewhere (Bollen and van Joolingen 2013); here, we summarize its essential features. A SimSketch model starts with a user-created drawing in which the objects that play roles in the model are represented. Users can then split the drawing into separate objects and assign a behavior to each of the objects. A clustering agent supports users in this splitting: it guesses which drawing strokes belong together as individual objects, based on their spatial distribution and the order in which they were drawn. The user-assigned behavior can be an object’s independent motion or an interaction with another object. For example, the GO behavior specifies an object’s independent motion in a specified direction, whereas the CIRCLE behavior specifies that the object moves in a circular orbit around another object. Behaviors can be combined and can interact. For instance, if an object is assigned the CIRCLE behavior and the object it is to circle around is moving, the circling object will move along, too. The effects of behaviors that result in the motion of objects are combined using simple vector addition. Other behaviors include attractive and repulsive relations between objects, as well as reproduction, termination and “killing” of objects based on certain conditions. After specifying behaviors, users can run the model, which creates an animated copy of their drawing in which the objects move according to the specified behaviors. Learners can zoom, speed up or slow down the simulation and can have the simulation draw traces of the moving objects.
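
To make the combination mechanism concrete, the following sketch illustrates, in Python and purely as our own illustration rather than SimSketch’s actual implementation, how behaviors such as GO and CIRCLE can each contribute a displacement per time step, with an object’s net motion being the vector sum of those contributions:

```python
import math

# Illustrative sketch only; not SimSketch's actual code. Each behavior is a
# function returning a displacement vector for one time step; an object's
# net motion is the vector sum of all its behaviors' displacements.

class Body:
    def __init__(self, x, y):
        self.x, self.y = x, y

def go(vx, vy):
    """GO-like behavior: steady motion in a fixed direction."""
    def step(obj, dt):
        return (vx * dt, vy * dt)
    return step

def circle(center, radius, omega):
    """CIRCLE-like behavior: orbit a (possibly moving) parent object."""
    state = {"angle": 0.0}
    def step(obj, dt):
        state["angle"] += omega * dt
        # Aim for the point on the orbit around the parent's *current*
        # position, so the satellite follows a moving parent automatically.
        tx = center.x + radius * math.cos(state["angle"])
        ty = center.y + radius * math.sin(state["angle"])
        return (tx - obj.x, ty - obj.y)
    return step

def simulate(behaviors, dt, steps):
    for _ in range(steps):
        # Two-phase update: compute all displacements from current positions,
        # then apply them, so the update order does not matter.
        moves = {}
        for obj, behs in behaviors.items():
            dx = dy = 0.0
            for b in behs:
                bx, by = b(obj, dt)  # each behavior contributes one vector
                dx += bx
                dy += by
            moves[obj] = (dx, dy)
        for obj, (dx, dy) in moves.items():
            obj.x += dx
            obj.y += dy

earth = Body(100.0, 0.0)
moon = Body(110.0, 0.0)
simulate({
    earth: [go(0.0, 5.0)],                     # Earth drifts along a line
    moon: [circle(earth, 10.0, 2 * math.pi)],  # Moon orbits the moving Earth
}, dt=0.01, steps=200)
print(f"earth=({earth.x:.1f}, {earth.y:.1f})  moon=({moon.x:.1f}, {moon.y:.1f})")
```

Presumably the same addition principle extends to the attraction and repulsion behaviors in the actual tool, insofar as they also resolve to motion of objects.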

The design of SimSketch aims to support essential reasoning processes, such as identifying model components and their properties and behaviors, in an intuitive way. The object-based nature of the models allows learners to specify one object at a time in a relatively simple way, whereas complex behavior may result from the combination of objects and behaviors. SimSketch focuses on domains in which modeling results are best represented through animations displaying qualitative aspects of complex behavior.

The aim of this exploratory study was to investigate the validity of using drawings to create computational models with children in the target age group of 7–15 years. Children created drawing-based models of the solar system, with the intent of explaining solar and lunar eclipses using these models. In the study, we intended to answer the following questions:

  • Are (young) learners able to create models from drawings?

  • Do they value this approach and our SimSketch software?

  • Does their creation of these models result in any knowledge gain?

  • How do age and prior knowledge influence their experience?

The study was performed in a science museum with visitors of the museum as participants.

Method

Our 247 participants (126 girls and 121 boys), aged 7 to 15, were recruited from among the visitors to a science center. Recruiting took place over four weekends. Visitors received a leaflet announcing the study, and museum personnel actively invited visitors in the target age group to participate. Participants filled in a questionnaire on their knowledge of the solar system (a multiple-choice test with eight items). They then created a drawing of the solar system according to their own ideas using SimSketch, the drawing-based modeling program described above (Bollen and van Joolingen 2013). As part of the modeling activity, they were asked to create situations in which solar and lunar eclipses could occur. Finally, a knowledge post-test and questionnaires on their motivation and attitude toward SimSketch were administered.

SimSketch

SimSketch (Bollen and van Joolingen 2013) allows the learner to assign behavior such as rotating or orbiting to elements in the learner’s drawing (see Fig. 1). SimSketch supports the modeling process in two ways: by providing the menu of behaviors, which indicates model elements that are suitable for the domain, and by providing feedback through the animation of the model. The latter assists with evaluation of the model. Once learners finish their drawing, they can run it, meaning that the elements of the drawing start moving according to the behaviors the learner assigned to them. In the instructions, our participants were asked to create an animated view of the solar system and to stop and save the simulation at the moment when a solar or lunar eclipse would occur. The models were phenomenological, meaning that motions were described as they appear: for example, one behavior was “the Moon circles around the Earth,” as opposed to a model in which motion emerges from the underlying physics, such as gravitational attraction. The behavior set available to the students was limited to behaviors related to the relative motion of the objects; behaviors that involved the creation or deletion of objects were removed.

Fig. 1 Screenshot of SimSketch during the creation of a solar system model

The models created were automatically collected on a server and scored for the presence of the necessary elements and the correctness of the behaviors assigned. A maximum of 14 points could be awarded to a model.
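
As an illustration of what such rubric-based scoring amounts to (the item names and point values below are hypothetical; the actual 14-point rubric is not reproduced here), a scorer could be as simple as:

```python
# Hypothetical rubric; the actual items and point values used in the
# study are not reproduced here, only the 14-point structure.
RUBRIC = {
    "sun_present": 1,           # necessary element present in the drawing
    "earth_present": 1,
    "moon_present": 1,
    "earth_circles_sun": 2,     # correct behavior assigned to an element
    "moon_circles_earth": 2,
    # ... further items, to a maximum of 14 points in total
}

def score_model(satisfied: set) -> int:
    """Sum the points of all rubric items a model satisfies."""
    return sum(pts for item, pts in RUBRIC.items() if item in satisfied)

print(score_model({"sun_present", "earth_present", "earth_circles_sun"}))  # 4
```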

Domain Knowledge Test

The domain knowledge test used as pre- and post-test consisted of eight multiple-choice questions. The questions were based on the work of Vosniadou (1992), who composed a list of typical misconceptions children have about the solar system. Her results showed that children conceptualized the Earth as located at the center of the solar system and the day-and-night cycle as caused by the motion of the Sun and the Moon. She also mentioned that children often believed that clouds cover up the Sun at night. The questions and answer alternatives in this study address these potential misconceptions that children may have formed about the solar system. The pre- and post-test versions consisted of the same items, but used a different order for the items and the answer alternatives. Items included questions such as “What type of object is the Sun?”, “Does the Earth rotate around another object?” and “What causes a Solar Eclipse?”.

Questionnaires

Motivational Questions

In this questionnaire, participants answered twelve questions about their post-task competence and about whether they found the task interesting and valued it as useful. These questions address the participants’ motivation, in the form of perceived competence gain and valuing. The questions were answered on a four-point Likert scale.

Questions About Software Attitude

Students’ attitude toward SimSketch was measured with a semantic differential, an instrument that measures the connotative meaning of concepts (Hassenzahl et al. 2002). An overview of the concepts is presented in Table 1. The semantic differential consisted of ten pairs of opposing adjectives, which participants rated on a five-point scale. The ratings were summed to create a single value representing participants’ attitude toward SimSketch.

Table 1 Overview of the concepts checked in the software attitude questionnaire

Finally, there was a space where participants could leave remarks about SimSketch or the study in general.

Procedure

The session took place in a computer lab under the guidance of the experiment leader and an experiment assistant. Eight participants at a time could work in the computer lab. Participants’ parents were informed about the study by letter. Before entering the learning environment, participants received brief instructions about the experiment, specifically about the tests and questionnaires and the SimSketch login procedure.

After these brief instructions, the participants started by completing the pre-test, then worked on their model, and then completed the post-test and questionnaires. Before starting on the modeling assignment, they completed a 10-min SimSketch tutorial that explained how to navigate the SimSketch learning environment and operate the available tools. The steps in this tutorial explained every part of SimSketch that was relevant for the task at hand. The experiment leader and assistant helped participants who did not understand the tutorial.

After this tutorial, participants worked on the modeling assignment. During the assignment, the experiment leader and assistant offered help with understanding the assignment or with language problems. The participants were told not to talk to each other during the experiment, but that they could ask questions by raising their hand at any time. The participants were given no help on the content of the assignment. The modeling assignment took approximately 30 min; the exact time spent was measured and taken into account in the analysis. After the modeling assignment, the participants moved on to the next part of the session, the post-test and final questionnaires. Participation in the study lasted approximately 45 min.

Analysis

To compute the inter-rater reliability of the scores on the modeling assignment, a second coder received the scoring protocol and independently scored twenty models. Cohen’s kappa was 0.7, which is considered good. The reliability of the knowledge test was reasonable to good (pre-test: α = 0.698; post-test: α = 0.721).
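
For reference, these two statistics follow their standard definitions: Cohen’s kappa corrects the observed agreement between the two coders for agreement expected by chance, and Cronbach’s alpha relates the item variances to the variance of the total score:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_{Y_i}}{\sigma^2_X}\right)
```

Here p_o is the observed proportion of agreement between the coders, p_e the proportion expected by chance, k the number of test items, σ²_Yi the variance of item i and σ²_X the variance of the total test score.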

The motivational questionnaire measures several constructs. Reliability analysis showed that two constructs could be measured reliably: valuing and perceived competence. The construct valuing measures the extent to which participants liked the assignment about the solar system and enjoyed working with the computer. The construct perceived competence, competence for short, measures the extent to which participants feel that they understand the solar system better through modeling and simulating. The items belonging to each of the constructs are shown in Table 2. Items 1, 5, 6 and 9 were removed because of insufficient reliability and were not used in the analysis. The reliability of the valuing scale is reasonable (α = 0.604); the reliability of the competence scale is good (α = 0.774).

Table 2 Motivational constructs and the associated items

The scale measuring attitude toward SimSketch consisted of all ten items of the semantic differential and had reasonable reliability (α = 0.650).

The relations between participants’ age, gender, time on (modeling) task, pre-test and post-test scores, model score, SimSketch attitude, valuing and competence were examined in further analysis using structural equation modeling. This analysis used only the data of the 219 participants for whom a complete dataset was obtained. Incomplete datasets were caused by early dropout, for instance, because parents wanted to move on. One additional dataset was removed because observations indicated that the parents had helped the participant with the modeling task.
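
As a minimal sketch of how such an analysis can be set up, assuming Python with the semopy package (not necessarily the software used in the study) and hypothetical variable names, a path model over these variables could be specified and fitted as follows:

```python
import pandas as pd
import semopy

# Hypothetical sketch; semopy was not necessarily the package used in the
# study, and the column names below are illustrative.
data = pd.read_csv("participants.csv")  # one row per participant (hypothetical file)

# Path model in lavaan-style syntax: post-test predicted by pre-test and
# model score; model score predicted by pre-test; pre-test predicted by age.
desc = """
post_test ~ pre_test + model_score
model_score ~ pre_test
pre_test ~ age
"""

model = semopy.Model(desc)
model.fit(data)

print(model.inspect())           # estimated path coefficients
print(semopy.calc_stats(model))  # fit indices, including chi-square, RMSEA, AIC
```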

Results

Table 3 shows the descriptive statistics for the variables that were investigated. The table shows that the average pre-test and post-test scores are about 75 %, which is reasonable for children in the target age group. The solar system model scores are about 50 %. This is also quite reasonable, as some of the required elements in the model, such as heavenly bodies rotating on their own axes, were not relevant for the problem investigated but would be part of a model of the complete system.

Table 3 Means and standard deviations for the variables measured in the study

The motivational scores are above what might be expected on average; in particular, the score for SimSketch attitude shows that children have a positive attitude toward SimSketch. The same positive attitude is apparent in the remarks students made in response to the open question. In total, 141 participants answered this question. Of them, 115 stated that it was fun to work with the program, 30 of whom used a superlative (very, super). Thirteen respondents indicated that the task was difficult; fourteen stated that they had learned something; sixteen used the word “interesting”; and four indicated that the task took too long.

Table 4 shows the correlations between these variables. It is clear that there are a number of highly significant correlations between the variables. Using structural equation modeling, we sought an underlying causal model that could explain the (cor)relations between these variables in greater detail. In particular, we were interested in the influence of the scores for the solar system models and the motivational scales on post-test knowledge. Figure 2 shows an all-encompassing causal model, which includes all eight variables. We fitted this causal model and four others in which one or more of these variables, and the relations involving them, are omitted. This results in the five causal models summarized in Table 5. For each causal model, the exogenous variables and the R² values for the endogenous variables are given (where n/a means that the variable was not part of the model). The root mean square error of approximation (RMSEA) was used as an indication of goodness of fit. This indicator balances the Chi-square value of the causal model against the number of degrees of freedom, penalizing model complexity. An RMSEA value smaller than 0.05 is considered a good fit, whereas a value greater than 0.1 is considered a bad fit. When the Chi-square value is smaller than the number of degrees of freedom, RMSEA is set to 0. We also computed the Akaike Information Criterion (AIC), as this provides a balanced comparative index between structural equation models (Schermelleh-Engel et al. 2003). The AIC estimates how much information is lost when choosing a simpler causal model. The causal model with the lowest AIC is to be preferred.
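
In a common formulation (following, e.g., Schermelleh-Engel et al. 2003), with N the sample size, df the degrees of freedom and q the number of free model parameters, these indices are:

```latex
\mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2 - df,\; 0)}{df\,(N-1)}},
\qquad
\mathrm{AIC} = \chi^2 + 2q
```

The max(·, 0) in the numerator reflects the convention, mentioned above, that RMSEA is set to 0 when the Chi-square value is smaller than the degrees of freedom.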

Table 4 Correlations between the variables measured in the study
Fig. 2 Graphical representation of the most elaborate structural equation model fitted (Model 1 in Table 5)

Table 5 Characteristics and fit parameters for the five models that were fitted to the data

Table 5 shows that three causal models have an RMSEA of 0 and thus an excellent fit. Looking at the AIC, it becomes clear that the simplest causal model (Model 5) is the preferred one, although the differences are small and the RMSEA of Model 5 is slightly above zero. In this causal model, only time on task is left out; all other variables are part of the model. In Fig. 3, the estimated coefficients of this model are assigned to the model relations. Given that this causal model has the best fit, and given the values of the model coefficients and their significance, it becomes clear that the best predictor of post-test score is pre-test score. Looking at the other causal models, participants’ solar system model score may have a small but relevant independent contribution to the post-test knowledge score, although the causal model that leaves out the solar system model score provides an equally good fit. The solar system model score depends on the pre-test knowledge score. Time on task is relevant for neither the post-test knowledge score nor the solar system model score. Also, while the pre-test knowledge score depends on participants’ age, their solar system model score does not.

Fig. 3 The best-fitting structural equation model (Model 5 in Table 5) with weights associated with the relations

Conclusions

The scores for the children’s solar system models show that on average, participants were capable of creating adequate drawing-based models. This is true for all ages; children’s model scores do not depend directly on age. The scores for the solar system models depend on age only indirectly through prior knowledge as measured on the pre-test. As expected, participants’ prior knowledge about the solar system increases with age; older children score higher on the pre-test and also express that knowledge in the models they draw.

The results on the motivational scales and the participants’ responses to the open question indicate the potential of the approach to motivate children to engage in drawing-based modeling. Repeated studies should corroborate this.

The current study did not involve any explicit instruction about the domain, so any knowledge gain between pre- and post-test should be attributed to engaging in the task itself. Therefore, a large knowledge gain was not to be expected. This is clear in the results: the differences from pre- to post-test are small. Looking at the mean knowledge gains, there is no overall effect, and only girls seem to profit from the task. However, the gender effect is not present in the best-fitting causal model.

The relation between the participants’ scores on their solar system models and the knowledge they acquired is interesting. Although some of the causal models we fitted do show a significant contribution of the solar system model score to the post-test score, the best-fitting causal model does not. The design of the study does not allow the conclusion that a drawing that becomes a better model of the solar system causes better post-test results, but this may serve as a hypothesis for future studies. Taking these findings together, we see that drawing-based modeling is a feasible approach to teaching model-based learning, one that is within the reach of even young children.

The results show that perceived competence gain and post-test score are negatively related: a higher post-test score relates to a lower perceived increase in competence. This may seem unexpected, but it could be explained by the assumption that the higher-scoring participants did not see additional benefits from the drawing task, as they were likely to already have relatively high knowledge of the solar system. The causal model indicates that this negative relationship depends mainly on prior knowledge and, to a lesser extent, directly on the participant’s age: older participants see less value in the drawing-based modeling approach for learning. A major question related to this is whether this decreased gain in perceived competence is related to the domain at hand. As students and experts of all ages use drawings as a basis for reasoning, one would hypothesize that the perceived competence gain is strongly associated with the domain studied.

To summarize, the results show that (prior) knowledge about the solar system increases with age and that, independent of age, children in the target age group are capable of creating reasonable, running drawing-based models of the solar system. Overall, motivational scores as well as evaluations of the SimSketch system were on the positive side, showing that children liked the task and on average experienced it as useful for learning.

Young children are eager to learn and prefer to learn in an active way (Holt 1977). It is important that young children are involved early in constructive scientific learning activities and are encouraged to develop scientific thinking. The current study demonstrates the feasibility of this approach. The fact that even young children aged seven and up are capable of constructing models that represent the solar system illustrates that it is possible to begin using a modeling approach as early as primary education. This is in line with the findings of Achér et al. (2007), who used a modeling approach to understanding materials in primary education. Of course, a necessary precondition is that primary teachers’ understanding of and attitudes toward science are at a level that can support children’s learning about modeling and about science. Studies into the conceptualization and actual levels of scientific understanding and attitudes, as well as teacher training programs for primary school teachers, are necessary (van Aalderen-Smeets et al. 2011; Walma van der Molen et al. 2010).

The approach we advocate based on the findings of this study is in line with recent work by Damnik and colleagues (Damnik et al. 2013), who found that self-constructed representations, as compared to given, prepared representations, improved learners’ performance on application tasks, indicating improved higher-order thinking skills. Due to the practical limitations of the current study, it was not possible to look for similar effects here; this is, of course, a subject for further study. Another use of drawing-based models is identified by Harle and Towns (2013), whose learners created drawings of the molecular structure of proteins. They found that the drawings helped in understanding learners’ mental models of the “secondary” structures of the proteins. In a sense, this is similar to the way drawings helped our learners to understand the occurrence of eclipses, which is strongly associated with the spatial structure of the system.

The context of the study at the science center had some advantages and limitations that are worth mentioning. The informal learning setting provided the opportunity to study how educational software functions outside the formal, usually externally motivated learning setting. Drawbacks include the fact that participants were selected from the group that visits science centers and that participation was voluntary. This yields two levels of self-selection, which means that the results should be interpreted with care. Also, the setting at the science center meant that it was not possible to set up experimental and control groups, which would have yielded more information on the effects of using the tool. Therefore, this study should be (and will be) supplemented with studies in more controlled situations. In those studies, SimSketch will also be adapted to collect more process data from the learners engaged in drawing-based modeling activities.

In the longer term, with more investigation in this direction, the presented results can point the way toward a modeling curriculum, giving young learners the opportunity to grasp the idea of modeling in an easy and intuitive way and preparing them for more demanding modeling approaches, such as System Dynamics, when they are old enough.