
1 Introduction

Intelligent tutoring systems have come to the fore in recent years, especially with the rise of affective computing, which explores ways of enabling computers to recognize human affective states [1, 2]. Learning data captured by these tutoring systems has revealed relationships between students' emotions and affective states and their academic performance, college enrollment rates, and choices of whether to major in STEM fields [3,4,5,6].

In this paper, we attempt to enhance sensor-free affect detection by drawing on existing psychological studies. Previous affect detectors have focused on single affective states and used individual clips of learning data captured by the learning system as samples for model training and testing. We instead focus on changes of affective states: we reorganize the dataset to emphasize transitions among affective states and verify whether models of affect changes can achieve better predictive accuracy than prior algorithms.

2 Dataset

This work adopts a dataset drawn from the ASSISTments learning platform to evaluate our proposed approach to detecting affective states. Most previous papers have noted the class imbalance of this dataset and developed resampling methods to address it [7, 8]. However, none of them illustrated the imbalance explicitly or provided much detail about the resampling methods, even though resampling significantly influences the performance of the machine learning algorithms used by previous affect detectors. Transparent detail regarding resampling is therefore needed to increase confidence in the affect detectors.

Since this paper focuses on transitions among the four types of affective states, we also conducted a statistical analysis of these transitions. Figure 1(a) represents the transitions between students' affective states, and Table 1 summarizes the students who did or did not experience affect changes. Among the students who were always confused or always bored, none had more than three clips.
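For concreteness, the transition statistics behind Fig. 1(a) and Table 1 can be tabulated directly from the clip sequences. The sketch below, in Python with pandas, assumes an illustrative layout (columns student_id, clip_index, affect) rather than the actual ASSISTments schema:

```python
# Sketch: count per-student affect transitions from ordered clip labels.
# Column names and the toy data are assumptions for illustration only.
import pandas as pd

clips = pd.DataFrame({
    "student_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "clip_index": [0, 1, 2, 0, 1, 0, 1, 2],
    "affect":     ["CONC", "CONF", "CONC", "CONC", "CONC", "CONC", "BORE", "BORE"],
})

clips = clips.sort_values(["student_id", "clip_index"])
# Pair each clip's label with the next clip's label within the same student.
clips["next_affect"] = clips.groupby("student_id")["affect"].shift(-1)
transitions = (clips.dropna(subset=["next_affect"])
                    .groupby(["affect", "next_affect"])
                    .size())
print(transitions)  # counts for each (from-state, to-state) pair
```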

Fig. 1. (a) The affect-change model. (b) Illustration of the organizing and labeling process of the 3-clip (top) and 2-clip (bottom) datasets (student number: 4).

Table 1. Students who did or did not experience affect changes.

The analysis in Table 1 has two implications. First, we can simplify the models to four types of transitions: (Always Concentrated), (Concentration ↔ Confusion), (Concentration ↔ Boredom), and (Concentration ↔ Frustration), which is equivalent to the models trained in previous work [7]. Second, down-sampling the clips of students who were always concentrated should be an effective solution to the imbalance issue, because clips without affect change should have a reliable feature distribution that is not significantly influenced by down-sampling. Based on these implications, we developed our method to build detectors of affect changes rather than affect states.
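A minimal sketch of this down-sampling idea is given below, assuming samples are already labeled by transition type; the majority-class label name and the target balance ratio are illustrative assumptions, not the paper's exact procedure:

```python
# Sketch: keep all samples that contain an affect change and subsample the
# "always concentrated" majority so the classes are roughly balanced.
import numpy as np

rng = np.random.default_rng(42)

def downsample_majority(samples, labels, majority_label="ALWAYS_CONC", ratio=1.0):
    """Keep all minority samples; keep ~ratio * n_minority majority samples."""
    samples = np.asarray(samples)
    labels = np.asarray(labels)
    minority_idx = np.where(labels != majority_label)[0]
    majority_idx = np.where(labels == majority_label)[0]
    n_keep = min(len(majority_idx), int(ratio * len(minority_idx)))
    kept_majority = rng.choice(majority_idx, size=n_keep, replace=False)
    keep = np.concatenate([minority_idx, kept_majority])
    rng.shuffle(keep)
    return samples[keep], labels[keep]
```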

3 Methodology

According to our affect-change model, we reorganize and relabel the dataset, generating two dataset formats with new labels. We adopt RapidMiner [9] as the model training and testing platform and test six models: Logistic Regression, Decision Tree, Random Forest, SVM, Neural Nets, and AutoMLP. To enable a fair comparison, we keep the model settings the same as in previous work [7].

Based on the affect-change model, we developed two organizational strategies for the dataset, the "3-clip" and "2-clip" data formats, as shown in Fig. 1(b). Notably, we down-sample the clips of students who were always concentrated; this data organization, combined with down-sampling, resolves the imbalance issue of the original dataset.
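One plausible reading of the 2-clip organization is sketched below: the features of two consecutive clips from the same student are concatenated, and the pair is labeled by whether (and how) the affective state changed. The feature layout and label scheme are assumptions for illustration; the 3-clip format extends the same idea to windows of three clips:

```python
# Sketch: build "2-clip" transition samples for one student.
# Feature layout and label scheme are illustrative assumptions.
import numpy as np

def make_2clip_samples(clip_features, clip_labels):
    """clip_features: (n_clips, n_feats) array for one student, in time order.
    clip_labels: affect label for each clip (same length)."""
    X, y = [], []
    for i in range(len(clip_labels) - 1):
        # Concatenate the features of two consecutive clips.
        pair = np.concatenate([clip_features[i], clip_features[i + 1]])
        if clip_labels[i] == clip_labels[i + 1]:
            label = "NO_CHANGE"
        else:
            label = f"{clip_labels[i]}->{clip_labels[i + 1]}"
        X.append(pair)
        y.append(label)
    return np.array(X), np.array(y)
```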

All models are evaluated using 5-fold cross-validation split at the student level, so that performance is measured on students unseen during training. These training and testing strategies match those of previous work [7].
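The paper's pipeline runs in RapidMiner, but the student-level 5-fold split can be illustrated in Python with scikit-learn's GroupKFold, which guarantees that no student appears in both the training and test folds. The data and hyperparameters below are placeholders, not the settings of [7]:

```python
# Sketch: student-level 5-fold cross-validation with two of the listed
# model families. X, y, and students are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # placeholder clip-pair features
y = rng.integers(0, 2, size=200)          # placeholder binary affect-change label
students = rng.integers(0, 40, size=200)  # student id for each sample

cv = GroupKFold(n_splits=5)  # folds never share a student
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=100)):
    scores = cross_val_score(model, X, y, cv=cv, groups=students,
                             scoring="roc_auc")
    print(type(model).__name__, scores.mean())
```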

4 Results

The evaluation measures for each of our models include two statistics, AUC ROC (A') and Cohen's Kappa; each Kappa uses a 0.5 rounding threshold. The best detector for each kind of affect change is identified through a trade-off between AUC and Kappa. The performance of the best-performing models is compared in Tables 2 and 3. Across all detectors, the models trained with SVM outperformed the others on both AUC and Kappa. There is little difference between the raw data and the averaged data; most evaluation measures are close to each other. The only difference is that, for the 3-clip data, AutoMLP performs slightly better than Neural Nets. To reduce computational complexity, we prefer the averaged data, which lowers the data dimensionality.
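A minimal sketch of the two evaluation measures, computing AUC from predicted probabilities and Cohen's Kappa from labels obtained with the 0.5 rounding threshold (toy data for illustration):

```python
# Sketch: AUC from predicted probabilities; Kappa after 0.5 thresholding.
from sklearn.metrics import cohen_kappa_score, roc_auc_score

y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.2, 0.6, 0.8, 0.4, 0.9, 0.1]   # predicted P(affect change)

auc = roc_auc_score(y_true, y_prob)
y_pred = [int(p >= 0.5) for p in y_prob]  # 0.5 rounding threshold
kappa = cohen_kappa_score(y_true, y_pred)
print(f"AUC={auc:.3f}, Kappa={kappa:.3f}")
```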

Table 2. Model performance for each individual affect label using the 2-clip dataset.
Table 3. Model performance for each individual affect label using the 3-clip dataset.

5 Conclusion and Future Work

In this paper, we developed an affect-change model to build a bridge between domain knowledge and a learning dataset previously studied with traditional feature engineering and machine learning algorithms. Our future work includes (1) developing models that integrate semantic context to identify affective states, (2) verifying and validating the trained models in population studies, for example using the Blackboard system, which has collected a large amount of interaction data at the University of Maryland, Baltimore County, and (3) combining sensor-free and sensor-based models to develop more robust and flexible systems.