1 Introduction

Models such as Deep Knowledge Tracing (DKT) [6] encode student knowledge as a latent variable together with its temporal dynamics. As a student works through problems on the tutor, these models update their estimate of the student's knowledge based on the correctness of each response.

On the tutor funtoot, which is used in school and at home, students might not complete a topic in one session. Even within one session, various factors, such as talking to neighbors or friends, might hinder students' engagement with the tutor. Analysis of students' interaction data on funtoot showed that there is a time difference between submitting one problem and generating the next problem. We call this time difference the time-gap on the task. The time-gap between two practice opportunities can be as low as 0–2 s and as high as a week.

Researchers in [3, 9] have extended the Bayesian Knowledge Tracing (BKT) model [1], and the authors of [10] have extended DKT, by adding many features to improve predictions. However, none of these three studies considered the time-gap we are interested in. The authors of [7] modeled time-gaps of a day or more, but in BKT.

In this study, we add the time-gap to DKT so that we can predict delayed performance after a time-gap. We attempt to improve the prediction accuracy of DKT by modeling the time-gap, and we compare the result with the DKT model without the time-gap. Using simulation, we also analyze the predicted knowledge following a time-gap and trace the forgetting curves.

2 Experiments

We trained the DKT model as explained in [5], with Bloom's cognitive level (Bloom's taxonomy learning objective, btlo) as the feature (skill). A problem might involve more than one skill; this can also be encoded in DKT as shown in [4].

The DKT model explained in the above papers considers only the skill of a problem and the correctness of the response. In the proposed variant of DKT, which we call DKT-t, we also consider the time-gap as a feature. We identified 9 time-gaps, which we model as a feature in DKT: Gap#1: \(<2\,\text{s}\); Gap#2: \([2\,\text{s}, 5\,\text{s})\); Gap#3: \([5\,\text{s}, 10\,\text{s})\); Gap#4: \([10\,\text{s}, 30\,\text{s})\); Gap#5: \([30\,\text{s}, 1\,\text{min})\); Gap#6: \([1\,\text{min}, 5\,\text{min})\); Gap#7: \([5\,\text{min}, 1\,\text{h})\); Gap#8: \([1\,\text{h}, 1\,\text{week})\); Gap#9: \(>1\,\text{week}\). Here '[' denotes inclusion and ')' denotes exclusion of the respective endpoint of the interval.
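The mapping from a raw time-gap to one of the nine categories can be sketched as below. This is a minimal illustration, not the authors' implementation: the names `GAP_UPPER_BOUNDS_S`, `gap_index`, and `encode_interaction` are hypothetical, a gap of exactly one week is assumed to fall into Gap#9 (the text leaves that endpoint unassigned), and the one-hot layout combining skill, correctness, and gap is only one plausible way to feed the gap to the network.

```python
from bisect import bisect_right

# Upper bounds (in seconds) of Gap#1..Gap#8; longer gaps fall into Gap#9.
GAP_UPPER_BOUNDS_S = [2, 5, 10, 30, 60, 5 * 60, 60 * 60, 7 * 24 * 60 * 60]
N_GAPS = len(GAP_UPPER_BOUNDS_S) + 1  # 9 categories


def gap_index(gap_seconds: float) -> int:
    """Return the 1-based gap category (1..9) for a time-gap in seconds."""
    if gap_seconds < 0:
        raise ValueError("time-gap must be non-negative")
    # bisect_right sends a value equal to a bound into the next bin, which
    # matches the half-open intervals [lower, upper). Assumption: a gap of
    # exactly one week is therefore placed in Gap#9.
    return bisect_right(GAP_UPPER_BOUNDS_S, gap_seconds) + 1


def encode_interaction(skill_id: int, correct: int, gap: int) -> int:
    """Hypothetical one-hot position for a (skill, correctness, gap) triple.

    skill_id is 0-based, correct is 0 or 1, gap is 1..9. The resulting
    input vector would have n_skills * 2 * 9 entries.
    """
    if correct not in (0, 1) or not 1 <= gap <= N_GAPS:
        raise ValueError("correct must be 0/1 and gap must be in 1..9")
    return (skill_id * 2 + correct) * N_GAPS + (gap - 1)


# Time-gap between submitting one problem and generating the next one.
submitted_at, next_generated_at = 100.0, 142.5  # seconds, made-up values
print(gap_index(next_generated_at - submitted_at))
```

Keeping the bounds in a single sorted list means that changing the bucketing only requires editing that list.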

We test the performance of DKT and DKT-t on three datasets: the funtoot dataset, a Cognitive Tutor dataset, and the Assistments dataset. We used the publicly available Assistments dataset [2], which has the start-time and end-time of each problem attempt. We chose the 12 most used skills from this dataset and considered only answers to the original problems. It contains 897,971 data-points from 7,856 students. The Cognitive Tutor dataset we chose is Algebra I 2005-2006 [8]. We chose 114 units with the prefixes 'CTA1' and 'ES'. It contains 72 skills and 562,103 data-points generated by 560 students. The funtoot dataset contains 17 skills and 466,212 data-points generated by 8,000 students.

To study the effect of the time-gap on the predicted knowledge immediately following it, we perform a simulation using the DKT-t model. We pick the five most used skills from all three datasets. For each chosen skill, we simulate a student solving five problems of that skill correctly and then use DKT-t to predict the response to the next problem for each of the time-gaps.
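The simulation procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `Predictor` is a hypothetical interface to a trained DKT-t model, `toy_predictor` is a made-up stand-in used only so the sketch runs, and `practice_gap` (the gap separating the five correct answers) is an assumption, since the text does not specify it.

```python
from typing import Callable, Dict, List, Tuple

# One interaction: (skill_id, correct, gap_category), gap_category in 1..9.
Interaction = Tuple[int, int, int]
# Hypothetical interface to a trained DKT-t model: given the history so far,
# the skill of the next problem, and the gap before it, return P(correct).
Predictor = Callable[[List[Interaction], int, int], float]


def forgetting_curve(predict_next: Predictor, skill_id: int,
                     n_correct: int = 5, practice_gap: int = 1,
                     n_gaps: int = 9) -> Dict[int, float]:
    """Predict the next response after each time-gap category."""
    # Simulated history: n_correct correct answers on the same skill.
    # practice_gap is an assumption; the source does not state which gap
    # separates the practice problems.
    history = [(skill_id, 1, practice_gap) for _ in range(n_correct)]
    return {gap: predict_next(history, skill_id, gap)
            for gap in range(1, n_gaps + 1)}


def toy_predictor(history: List[Interaction], skill_id: int, gap: int) -> float:
    """Stand-in for a trained model, for demonstration only (not DKT-t)."""
    base = 0.5 + 0.08 * sum(correct for _, correct, _ in history)
    return base - 0.03 * (gap - 1)


curve = forgetting_curve(toy_predictor, skill_id=3)
for gap, p in curve.items():
    print(f"Gap#{gap}: predicted P(correct) = {p:.2f}")
```

Plotting the returned values against the gap category for each skill gives one curve per skill.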

3 Results

The learned models are evaluated and compared by AUC, the square of the Pearson correlation (\(R^{2}\)), and mean error (me). Mean error is the residual error, computed as the mean of the actual performance minus the predicted performance [7].
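The three metrics can be computed as in the sketch below. It is a plain-Python illustration with hypothetical function names and made-up data, not the evaluation code used for Table 1. The AUC is computed as the fraction of (correct, incorrect) response pairs that the model ranks in the right order, with ties counted as one half, which is quadratic in the number of responses and therefore only suitable for small examples.

```python
from typing import Sequence


def mean_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of (actual - predicted); negative values mean over-prediction."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)


def pearson_r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square of the Pearson correlation between actual and predicted."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_a * var_p)


def auc(actual: Sequence[int], predicted: Sequence[float]) -> float:
    """Probability that a correct response is scored above an incorrect one."""
    pos = [p for a, p in zip(actual, predicted) if a == 1]
    neg = [p for a, p in zip(actual, predicted) if a == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both correct and incorrect responses")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up responses (1 = correct) and model predictions.
actual = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.7, 0.1]
print(auc(actual, predicted),
      pearson_r2(actual, predicted),
      mean_error(actual, predicted))
```

Per-gap values would follow by applying the same functions to the subset of responses that fall into each gap category.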

Table 1. Model statistics

We report these three metrics for DKT and DKT-t, both per gap and overall, to analyze the effect of the time-gap parameter. Table 1 shows the results.

The overall AUC of DKT and DKT-t remained almost the same for Assistments and funtoot. There is a minor improvement of \(1.72\%\) in AUC with DKT-t for the Algebra I 2005-2006 dataset. For funtoot, the per-gap AUCs also remain similar for DKT and DKT-t. For the Assistments and Algebra I 2005-2006 datasets, there is a clear decrease in AUC from gap#1 to gap#9 with both DKT and DKT-t.

We also observe a decrease in \(R^{2}\) from gap#1 to gap#9 with both DKT and DKT-t for the Assistments and Algebra I 2005-2006 datasets. For the funtoot dataset, DKT highly over-predicts (negative mean error) for gap#7–gap#9, and DKT-t reduces this to almost half. In the Assistments dataset, DKT heavily over-predicted for gap#3–gap#7, while DKT-t reduced the mean error almost to zero; for gap#8–gap#9, DKT moderately over-predicted, whereas DKT-t moderately under-predicted (positive mean error). For the Algebra I 2005-2006 dataset, DKT under-predicted for gap#2–gap#4 and heavily over-predicted for gap#7–gap#9, while the mean error with DKT-t is close to zero in both ranges.

Forgetting Curve. Since the simulations give us estimates of how students might perform following each time-gap, we can plot a curve of predictions across the time-gaps. If the hypothesis holds that students forget learned material after longer time-gaps, the predictions should decline as the time-gap increases; this declining curve is called the forgetting curve.

Figure 1 shows the forgetting curves for the five most used skills from all the datasets. For the funtoot and Algebra I 2005-2006 datasets, shown in Figs. 1A and 1C, the predictions for gap#3–gap#5 either increase slightly or remain steady. However, for the Assistments dataset, shown in Fig. 1B, there is a steady decline in predictions as the time-gap increases. Across all three datasets, there is a slight increase in prediction following gap#8.

Fig. 1. Forgetting curves for skills from Algebra I 2005-2006 dataset

4 Discussion and Conclusion

This work attempts to incorporate the time-gap into Deep Knowledge Tracing. For DKT, the predictability of student performance, as indicated by AUC and \(R^{2}\), decreases systematically as the time-gap increases, and this trend remains even after modeling the time-gap in DKT-t. However, \(R^{2}\) is comparatively higher with DKT-t than with DKT for larger time-gaps. For the Algebra I 2005-2006 dataset, the predictability with DKT-t improved by 0.14 AUC units.

Since DKT considers only the ordering of the student responses, we observe that it heavily over-predicts the next response performance following larger time-gaps. DKT-t reduces these residuals between the actual and predicted performances.

The forgetting curves across all three datasets demonstrate the decay of knowledge as time progresses. The predicted performance for gap#1 is slightly lower than that for gap#2–gap#5 for some skills. In gap#1, students might be gaming the system or moving on to the next problem too quickly with little introspection. We need to study and validate this hypothesis in future work. Additionally, there is a rise in the predicted performance following gap#8; the reason for this is not yet clear to us.

One of the main contributions of this work is its approach to tracing the forgetting curve using the historical student responses generated on digital tutors. These models have the potential to let researchers simulate various learning scenarios and theories and get a sense of their effects on learning and forgetting.