
1 Introduction

In science inquiry contexts, students require support to engage effectively in inquiry investigations [2,3,4]. This support often takes the form of scaffolds designed to help students reach a level of performance they could not attain working on a task independently [5, 6]. The types of scaffolds students receive within online science environments vary from fixed [7] to faded [8] to adaptive [9]. Fixed scaffolds are supports provided to all students consistently, regardless of student performance [7, 10]. Faded scaffolds, on the other hand, are supports that are gradually removed with increasing use of a particular system [8, 10, 11]. Adaptive scaffolds, in turn, are supports provided to students in real time based on their performance in a system [12, 13]. While fixed [7], faded [8], and adaptive scaffolds [12] have all benefited student learning in science environments to some extent, adaptive scaffolds show the greatest promise for promoting transfer of inquiry practices [13, 14] because they provide students with the information they need when they need it most [15].

In the context of science inquiry, transfer of inquiry practices may be assessed in terms of near transfer (i.e., transfer to similar inquiry tasks presented shortly after the initial inquiry task; [16]) or far transfer (i.e., transfer to inquiry tasks in different contexts and after extended periods of time; [16]). Studies have demonstrated how engagement in computer-supported learning environments can promote transfer of science content understandings [17, 18] and practices such as scientific reasoning [16]. Within the intelligent tutoring system Inq-ITS [9], researchers have demonstrated transfer of multiple scientific practices across topics and over time [14], including hypothesizing [12, 19], collecting data [20, 21], and interpreting data/warranting claims with evidence [22, 23]. Each of these practices can be operationalized into finer-grained sub-practices. Studies have yet to investigate the transfer of inquiry at this sub-practice level over time and across topics. The present study examines whether adaptive scaffolding of inquiry practices in the first three Inq-ITS activities (i.e., driving questions) leads to transfer of inquiry practices across topics, at varying time intervals, at the sub-practice level.

2 Method

2.1 Participants and Materials

The participants in the present study were 108 6th grade students from a middle school in the northeastern United States who completed the following Inq-ITS [9] lab activities: Animal Cell (three driving questions: (1) how can you increase the transfer of protein in an animal cell?, (2) how can you decrease the production of ribosomes?, and (3) how can you reduce the production of protein?), Plant Cell (three driving questions: (1) how can you increase the transfer of protein in a plant cell?, (2) how can you decrease the production of ribosomes?, and (3) how can you reduce the production of protein?), Genetics (three driving questions: how does changing a mother monster's (1) F, (2) L, and (3) H alleles impact the traits of the babies?), and Natural Selection (four driving questions: what is the optimal foliage for (1) the green, long-furred and (2) the red, short-furred monsters?, and what is the optimal temperature for (3) the green, short-furred and (4) the red, long-furred monsters?).

Each of these Inq-ITS activities contained four stages in which students first formed a question/hypothesis, then carried out an investigation/collected data, analyzed and interpreted data, and finally communicated their findings [9, 10]. Currently, adaptive, real-time scaffolding is available within the first three stages of the microworlds [19,20,21,22,23] (scaffolding is being developed for communicating findings [24]) based on automated scoring in Inq-ITS ([25]; see Measures section). The only difference between adaptively scaffolded and unscaffolded Inq-ITS activities is the presence of the pedagogical agent, Rex. For example, in the scaffolded Animal Cell activities in the present study, if a student was evaluated as having difficulty with a particular practice, Rex would pop up on the student's screen with different types of information depending on the student's specific difficulty [26, 27]. Rex would first provide an orienting hint reminding the student of the inquiry practice/sub-practice they were engaging in [28]. If the student continued to have difficulty with the practice, Rex would provide a procedural hint (outlining the steps involved in the practice/sub-practice), followed by a conceptual hint (explaining the concept underlying the practice/sub-practice), and finally an instrumental hint (specifying the exact steps to take).
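To make the hint escalation concrete, the following minimal Python sketch (our illustration only, not the Inq-ITS implementation; all identifiers are hypothetical) shows a policy that steps through the four hint levels and bottoms out at the instrumental hint:

```python
# Illustrative hint-escalation policy mirroring the four hint levels
# described above; this is a hypothetical sketch, not Inq-ITS code.
HINT_LEVELS = ["orienting", "procedural", "conceptual", "instrumental"]

def next_hint(hints_already_given: int) -> str:
    """Return the next hint level, bottoming out at the instrumental hint."""
    level = min(hints_already_given, len(HINT_LEVELS) - 1)
    return HINT_LEVELS[level]

# A student who keeps struggling with the same sub-practice would see:
# orienting -> procedural -> conceptual -> instrumental -> instrumental ...
for attempt in range(5):
    print(next_hint(attempt))
```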

2.2 Measures

In the present study, the dependent variables were four inquiry practices. Each inquiry practice in Inq-ITS is operationalized at a fine-grained level (i.e., broken down into sub-practices/sub-components). The hypothesizing practice was measured by identifying an independent variable (IV) and a dependent variable (DV). The collecting data practice was measured by testing the hypothesis and running targeted and controlled trials. The interpreting data practice was measured by correctly selecting the IV and DV for a claim, correctly interpreting the relationship between the IV and DV, and correctly interpreting the hypothesis/claim relationship. The warranting claims practice was measured by warranting the claim with more than one trial, warranting with controlled trials, correctly warranting the relationship between the IV and DV, and correctly warranting the hypothesis/claim relationship. Each inquiry sub-practice was automatically scored as 0 points if incorrect or 1 point if correct, using knowledge engineering and educational data mining techniques within Inq-ITS that have been validated in prior studies [9].
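As a concrete illustration of this operationalization, the sketch below encodes the sub-practices listed above as binary scores (the identifiers are ours, and the mean aggregation is an assumption about how sub-practice scores might roll up to a practice score, not necessarily how Inq-ITS does so):

```python
# Hypothetical encoding of the binary sub-practice scores described above;
# names are ours, not Inq-ITS internals.
SUB_PRACTICES = {
    "hypothesizing": ["identify_iv", "identify_dv"],
    "collecting_data": ["test_hypothesis", "run_targeted_trials",
                        "run_controlled_trials"],
    "interpreting_data": ["select_iv_dv_for_claim",
                          "interpret_iv_dv_relationship",
                          "interpret_hypothesis_claim_relationship"],
    "warranting_claims": ["warrant_with_multiple_trials",
                          "warrant_with_controlled_trials",
                          "warrant_iv_dv_relationship",
                          "warrant_hypothesis_claim_relationship"],
}

def practice_score(sub_scores: dict) -> float:
    """Aggregate 0/1 sub-practice scores; a simple mean is assumed here."""
    return sum(sub_scores.values()) / len(sub_scores)

print(practice_score({"identify_iv": 1, "identify_dv": 0}))  # 0.5
```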

This study had a time variable with four levels: Time 1 (i.e., Animal Cell in month 0), Time 2 (i.e., Plant Cell in month 1.3), Time 3 (i.e., Genetics in month 2.7), and Time 4 (i.e., Natural Selection in month 5.7). This study also included a variable for the cumulative number of driving questions students completed over time: driving questions 1 to 3 in month 0 (i.e., Animal Cell), 4 to 6 in month 1.3 (i.e., Plant Cell), 7 to 9 in month 2.7 (i.e., Genetics), and 10 to 13 in month 5.7 (i.e., Natural Selection).
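The resulting analysis dataset can be pictured as a long-format table with one row per student per driving question. The sketch below (column names are ours) encodes only the design information stated above; per-student sub-practice scores from the automated scoring would be joined in before modeling:

```python
import pandas as pd

# Long-format layout implied by the design above: one row per
# student x driving question, with elapsed months as the time variable.
design = [
    ("Animal Cell",       [1, 2, 3],        0.0),
    ("Plant Cell",        [4, 5, 6],        1.3),
    ("Genetics",          [7, 8, 9],        2.7),
    ("Natural Selection", [10, 11, 12, 13], 5.7),
]
rows = [{"topic": topic, "driving_question": dq, "months": months}
        for topic, dqs, months in design for dq in dqs]
df = pd.DataFrame(rows)
print(df.head())
```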

3 Results and Discussion

We used linear mixed models (LMMs) to investigate whether there was evidence of transfer by evaluating students' inquiry competencies across driving questions and over time after the adaptive scaffolding was removed. We performed four sets of LMM analyses, one per inquiry practice, focusing on the pattern within each practice.

3.1 Model Selection

For the analysis of the data, we followed the "top-down" modeling strategy and selected the models that best fit the data. We first ran an unconditional, intercepts-only model, and then added each variable independently as well as in combination. For each set of added predictors, we generated three models varying the random-effects structure: random intercepts for subjects only (Intercept), random slopes for the driving question and/or time variable(s) only (Slope), or both random intercepts and random slopes. We compared the models using the −2 Restricted Log Likelihood (−2RLL) [29] and selected the full models in this study because they best fit the data for the majority of practices (namely, hypothesizing, collecting data, and warranting claims).
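The paper's software is not named; as one hedged illustration, the statsmodels sketch below fits the three random-effects structures for a single practice and compares them by −2RLL, assuming a long-format table df with columns score, dq (driving question number), months, and student:

```python
import statsmodels.formula.api as smf

# Candidate random-effects structures for one practice's score
# (df is assumed to hold one row per student x driving question).
candidates = {
    "intercept_only": "~1",                # random student intercepts
    "slope_only":     "~0 + dq + months",  # random slopes only
    "full":           "~dq + months",      # random intercepts and slopes
}
fits = {}
for name, re_formula in candidates.items():
    model = smf.mixedlm("score ~ dq + months", df,
                        groups=df["student"], re_formula=re_formula)
    fits[name] = model.fit(reml=True)  # REML estimation

# Compare models via -2 x restricted log-likelihood (lower = better fit).
for name, res in fits.items():
    print(f"{name}: -2RLL = {-2 * res.llf:.1f}")
```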

3.2 Performance Across Driving Questions and Over Time

We then examined inquiry scores across driving questions and time for each practice. Results showed that the fixed effects for the hypothesizing practice were significant, F(1, 108.25) = 24.39, p < .001 for driving questions and F(1, 107.25) = 11.32, p = .001 for time. Fixed-effects parameters were significant for the hypothesizing (β = 0.03, p < .001 for driving question; β = −0.04, p = .001 for time), collecting data (β = 0.05, p < .001 for driving question; β = −0.06, p < .001 for time), and warranting claims practices (β = 0.04, p < .001 for driving question; β = −0.05, p < .001 for time). These results indicate that students improved their performance on these three inquiry practices with increasing use of Inq-ITS, but that long intervals between uses resulted in a slight decrease in performance. This pattern was not found for the practice of interpreting data, potentially because students started with relatively high performance (Mean = 0.79) or because of interactions with topic complexity [30].
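Continuing the hypothetical statsmodels sketch above, the reported pattern (a positive driving-question slope alongside a negative time slope) would be read off the full model's fixed effects:

```python
# Fixed-effect slopes and Wald p-values from the assumed full model;
# the text's pattern corresponds to dq > 0 and months < 0.
res = fits["full"]
print(res.params[["dq", "months"]])
print(res.pvalues[["dq", "months"]])
```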

The random effects showed a significant intercept variance for the hypothesizing (β = 0.03, Z = 3.17, p < .01), collecting data (β = 0.05, Z = 3.87, p < .001), and warranting claims practices (β = 0.06, Z = 4.18, p < .001). Results also showed a significant driving question random effect for hypothesizing (β = 0.001, Z = 1.97, p < .05) and collecting data (β = 0.002, Z = 2.00, p < .05). Additionally, for hypothesizing, we found a significant covariance between the driving question and time random effects (β = −0.002, Z = −2.03, p < .05) and a significant time random effect (β = 0.004, Z = 2.26, p < .05). We also found a significant covariance between the intercept and the driving question coefficient for collecting data (β = −0.01, Z = −2.01, p < .05). These random effects confirmed a fair amount of student-to-student variation in starting performance for the hypothesizing, collecting data, and warranting claims practices, but varied patterns for the driving question, time, and driving question × time effects. This demonstrates that transfer of learning differed across inquiry practices and across students.
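In the same hypothetical sketch, the random-effect variances and covariances reported above would correspond to the entries of the estimated random-effects covariance matrix:

```python
# Estimated covariance matrix of the student-level random effects:
# intercept/slope variances on the diagonal, covariances off-diagonal.
print(fits["full"].cov_re)
```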

4 Conclusions, Future Directions, and Implications

In this study we investigated the robustness of our scaffolding using students' performances on various inquiry practices across driving questions at different time intervals, thereby addressing both near transfer (across driving questions at each time point) and far transfer (over time). Our results showed, in general, that our scaffolding was robust for the practices of hypothesizing, collecting data, and warranting claims. A limitation of the present study is that there was no control condition, which makes it difficult to rule out the effects of external factors, such as teacher instruction between uses of the system. In the future it will be valuable to examine differences between students in scaffolded and unscaffolded conditions to more fully understand the influence of the adaptive scaffolds in Inq-ITS on students' inquiry performance.

Overall, the findings in the present study inform assessment designers and researchers that, if properly designed, scaffolding aimed at supporting students’ competencies at various inquiry practices can greatly benefit students’ deep learning of, transfer of, and performance on inquiry practices over time.