1 Introduction

Education has entered a new stage of informatization and is shifting from digital education to smart education supported by technologies such as data mining and machine learning [1]. To achieve the goals of smart education, students with different personalities and attitudes should be managed differently, and early warning should be given to students who may fail. To meet these needs, this paper collects MOOC student learning behavior data and compares several machine learning algorithms for analyzing students’ personality and attitude. Students’ learning personality is divided into 3 categories and their learning attitude into 4 categories, and a multiple linear regression model is fitted to predict students’ scores.

2 Related Work

In terms of clustering algorithms, S. M. Mostafa et al. [2] used nine clustering algorithms to cluster eight datasets and compared the algorithms using seven performance measures. H. Cui et al. [3] proposed a K-means++-based clustering method for social e-commerce users and showed that it can accurately classify such users. There are also many research results on the analysis of learner behavior data. For example, José A et al. used learner data provided by a regional MOOC provider in Jordan to explore differences in learners’ behavior and preferences, and found that the region attracted younger learners, women, and learners with lower levels of education [4]. Juan Zambrano et al. [5] put forward six measures of student performance in a course based on data provided by the Massachusetts Institute of Technology for online courses. Based on these measures, the student population was divided into multiple categories, and the resource usage of middle school students in each class was analyzed. Zeng Shufang et al. [6] analyzed MOOC data to extract learners’ learning behavior characteristics, and then used Ward’s method and K-Means clustering to classify learners into three main categories: “active learners”, “passive learners” and “bystanders”. The results show that active learners have a higher completion rate and achieve better final grades. Tang Mingwei analyzed students’ study behaviors through big data and proposed a hidden Markov model that relates classroom behavior to student performance, so that classroom content can be adjusted accordingly [7]. Zhang Xiaoying applied the K-Means algorithm to classify and analyze students’ behaviors at school, established a mathematical model, and applied correlation analysis to explain and predict the behavior of college students [8].
Zhang Liyuan et al. [9] used data analysis and machine learning methods to study student behavior and found that students’ online learning performance is highly correlated with their learning behavior. Deng Tianping et al. [10] performed cluster analysis on student learning data and final exam scores from a MOOC platform and explored the impact of each learning dimension on the learning effect in different class groups. Tian Chunzi et al. [11] used K-Means and DBSCAN to analyze multiple types of data generated by students during school and compared the two algorithms. In terms of score prediction, M. Zaffard et al. [12] proposed a hybrid feature selection framework to predict student performance. Jin Xiuling [13] optimized the SVM model parameters and established the GA-SVM student performance prediction model. Zhao Xiaoyan [14] used multi-source data fusion to combine various data on college students, including sports, consumption and social behavior, and used support vector machines (SVM) and machine learning (ML) to predict college students’ English scores. Tian Yu et al. [15] proposed a novel multi-feature neural network model to predict college entrance examination scores and verified the effectiveness of the algorithm through simulation experiments. Li Longzhen [16] used the C4.5 decision tree to build a student score prediction model with a prediction accuracy of about 88%. Ren Ge et al. [17] used a BP neural network to predict students’ scores in multiple courses, with a prediction accuracy of up to 70%. Yu Tiesuo et al. [18] used SVR (Support Vector Regression) to predict performance and used the prediction results for statistical analysis and early warning.

In this paper, different clustering algorithms are used to analyze personality and attitude from students’ behavior data. After comparing the quality of the clustering methods, the K-Means clustering results are used to analyze students’ learning personality and attitude. Finally, students’ scores are predicted by multiple linear regression.

3 Experimental Process and Result Analysis

This paper focuses on classifying students’ personality and attitude and on predicting student performance for a course on an online learning platform. The process of classifying students’ personality and attitude mainly includes:

  1. Data acquisition, data cleaning and preprocessing.

  2. Classification by K-Means, MiniBatchKMeans, Birch algorithm.

  3. Compare the three algorithms and analyze the clustering results of the algorithms.

The prediction process using multiple linear regression is as follows:

  1. Select the attribute that has a relatively obvious linear relationship between student behavior data and student performance as the independent variable.

  2. Fit the relationship function between independent variables and performance through multiple linear regression.

  3. Analyze the function model.
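The fitting step in the list above can be illustrated with a minimal ordinary least squares sketch. The paper itself fits the model with SPSS; the function name `fit_linear_regression`, the normal-equation solver and the toy data below are assumptions made only for illustration, and the solver assumes the independent variables are not collinear.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, b1, ..., bp]. Assumes X^T X is invertible.
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - tail) / A[i][i]
    return beta


# Toy data generated from y = 1 + 2*x1 + 3*x2 (not the MOOC data).
X_toy = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y_toy = [9, 8, 19, 18, 26]
beta = fit_linear_regression(X_toy, y_toy)
print([round(b, 6) for b in beta])
```

On the toy data the recovered coefficients should match the generating values (intercept 1, slopes 2 and 3) up to floating-point error.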

3.1 Data Processing

The data in this paper are the student behavior data and basic student data of a course on a MOOC platform. We extracted the data related to student behavior, including student ID (Id), name (Name), video views (Video), number of unit tests (Unit), document reading times (Document), discussion times (Discussion), number of postings (Message), number of logins (Login) and final grade (Score). We then cleaned the data, mainly by deleting students with missing or abnormal field values, leaving 1685 student records.
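A minimal sketch of this cleaning step is given below. The paper does not state what counts as an abnormal value, so the rule used here (non-numeric or negative entries) is an assumption, as are the helper names `is_valid` and `clean` and the three sample records.

```python
# Field names follow the attribute abbreviations listed in the text.
BEHAVIOR_FIELDS = ["Video", "Unit", "Document", "Discussion", "Message", "Login", "Score"]


def is_valid(record):
    """Return True when every behavior field is present, numeric and non-negative."""
    for field in BEHAVIOR_FIELDS:
        value = record.get(field)
        if value is None or value == "":
            return False  # missing field
        try:
            number = float(value)
        except (TypeError, ValueError):
            return False  # non-numeric entry
        if number != number or number < 0:
            return False  # NaN or negative count (assumed to be "abnormal")
    return True


def clean(records):
    """Keep only the records that pass is_valid."""
    return [record for record in records if is_valid(record)]


# Hypothetical records, not taken from the MOOC data set.
raw = [
    {"Id": 1, "Video": 35, "Unit": 10, "Document": 22, "Discussion": 4,
     "Message": 2, "Login": 40, "Score": 81},
    {"Id": 2, "Video": 12, "Unit": None, "Document": 5, "Discussion": 0,
     "Message": 0, "Login": 9, "Score": 40},   # missing Unit
    {"Id": 3, "Video": -5, "Unit": 3, "Document": 7, "Discussion": 1,
     "Message": 0, "Login": 12, "Score": 55},  # negative Video count
]
cleaned = clean(raw)
print([record["Id"] for record in cleaned])  # [1]
```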

3.2 Student Personality Analysis

We use K-Means, MiniBatchKMeans and Birch for cluster analysis and use the silhouette coefficient to compare the quality of the algorithms. The silhouette coefficient (the SC index) indicates the degree of aggregation within each cluster and the degree of separation between clusters after clustering. The smaller the distance between samples in the same cluster and the larger the distance between samples in different clusters, the larger the value of \(SC\) and the better the clustering effect [19]. Therefore, \(SC\) is often used as a performance index to evaluate clustering results. Let \({a}_{i}\) denote the average distance between sample \(i\) and the other samples in its own cluster, and let \({b}_{i}\) denote the average distance between sample \(i\) and the samples of the nearest other cluster. The silhouette coefficient \({SC}_{i}\) of sample \(i\) is then calculated with the following formula.

$$SC_i = \frac{b_i - a_i }{{\max \left( {a_i ,b_i } \right)}}$$
(1)
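As a sketch of how Eq. (1) is evaluated, the following pure-Python function computes \(SC_i\) for every sample using Euclidean distance. The function name `silhouette_values`, the convention of assigning 0 to singleton clusters and the four toy points are assumptions for illustration; at least two clusters are required.

```python
import math


def silhouette_values(points, labels):
    """Return SC_i for every sample, following Eq. (1)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # common convention for a singleton cluster
            continue
        # a_i: mean distance to the other samples in the same cluster.
        a_i = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b_i: smallest mean distance to the samples of any other cluster.
        b_i = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a_i, b_i)
        values.append((b_i - a_i) / denom if denom > 0 else 0.0)
    return values


# Two tight, well-separated toy clusters.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
toy_labels = [0, 0, 1, 1]
sc = silhouette_values(toy_points, toy_labels)
print(round(sum(sc) / len(sc), 4))
```

For these points every sample has \(a_i = 1\) and \(b_i \approx 10.02\), so the mean silhouette coefficient is about 0.90, which reflects compact and clearly separated clusters.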

First, we select Document and Discussion from the data and standardize them. The selected behavioral data are then clustered with K-Means, MiniBatchKMeans and Birch to analyze students’ personality. In addition, the silhouette coefficient of each algorithm is plotted against the number of clusters, for cluster numbers 2, 3, 4, 5, 6, 7 and 8. In the graph, blue represents K-Means, orange represents Birch, and green represents MiniBatchKMeans (Fig. 1).

Fig. 1. The curve change of the cluster number and silhouette coefficient of the three algorithms of student personality.
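The selection of the cluster number by sweeping candidate values and keeping the one with the largest mean silhouette coefficient can be sketched as follows. This is not the implementation used in the paper: a plain Lloyd’s k-means with farthest-point initialization stands in for the K-Means, MiniBatchKMeans and Birch implementations, and the function names and the twelve toy points are assumptions.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Mean of SC_i over all samples; singleton clusters contribute 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


# Three well-separated toy groups of four points each (not the MOOC data).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (0, 10), (0, 11), (1, 10), (1, 11),
]
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 9)}
best_k = max(scores, key=scores.get)
print(best_k)  # 3
```

On the toy data the sweep selects three clusters, because merging two groups (k = 2) or splitting one group (k ≥ 4) lowers the mean silhouette coefficient.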

The clustering results show that the K-Means result is optimal when the students are divided into 3 categories, with a silhouette coefficient of 0.7695. The MiniBatchKMeans result is optimal at 3 categories, with a silhouette coefficient of 0.7692, and the Birch result is also optimal at 3 categories, with a silhouette coefficient of 0.7668. All three algorithms therefore achieve their best clustering effect when students are divided into three categories. We number the three personalities 0, 1 and 2 and analyze the three clustering algorithms, as shown in Tables 1, 2 and 3.

3.3 Student Attitude Analysis

Student attitude analysis is carried out in a similar way to personality analysis. First, we select Video, Unit, Document, Message and Login from the data set as the original clustering data, standardize the data, and then use principal component analysis to reduce the data to 2 dimensions. Principal component analysis is a data processing method [20] that converts high-dimensional data containing a large amount of redundant information into low-dimensional data that retains the effective information of the original data. Its basic idea is to find a projection matrix that best represents the main characteristics of the original data under the constraint of minimum mean square error [21]. K-Means, MiniBatchKMeans and Birch are then used to analyze student attitudes based on the selected behavior data. The silhouette coefficient of each algorithm is plotted against the number of clusters, for cluster numbers 2, 3, 4, 5, 6, 7 and 8. In the graph, blue represents K-Means, orange represents Birch, and green represents MiniBatchKMeans (Fig. 2).

Fig. 2. The curve change of the cluster number and silhouette coefficient of the three algorithms of student attitude.
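As a sketch of the preprocessing used for the attitude analysis, the standardization and the two-dimensional PCA projection can be written as follows. The notation is assumed here and does not come from the original analysis: \(x_{ij}\) is attribute \(j\) of student \(i\), \(\mu_j\) and \(\sigma_j\) are the mean and standard deviation of attribute \(j\), and \(Z\) is the standardized data matrix with \(n\) rows.

$$z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}, \qquad C = \frac{1}{n - 1} Z^{T} Z, \qquad Y = Z\left[ w_1 \; w_2 \right]$$

Here \(w_1\) and \(w_2\) are the unit eigenvectors of \(C\) with the two largest eigenvalues, and each row of \(Y\) gives the 2-dimensional coordinates of one student that are passed to the clustering algorithms.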

The clustering results show that the silhouette coefficients of the three algorithms are largest when the students are divided into 4 categories. We number the four attitudes 0, 1, 2 and 3 and analyze the three clustering algorithms, as shown in Tables 4, 5 and 6.

3.4 Student Performance Prediction

Student performance prediction uses multiple linear regression to fit a function and then predict performance. Multiple linear regression analysis establishes a forecasting model through correlation analysis between two or more independent variables and a dependent variable. When the relationship between the independent variables and the dependent variable is linear, it is called multiple linear regression analysis [22]. One significance test for the regression equation uses the multiple correlation coefficient: the closer it is to 1, the better the fit [23]. The multiple correlation coefficient is calculated as:

$$R = \sqrt {\frac{{\Sigma \left( {\hat{y}_i - \overline{y}} \right)^2 }}{{\Sigma \left( {y_i - \overline{y}} \right)^2 }}}$$
(2)
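Eq. (2) can be evaluated directly from the observed and fitted values, as in the following sketch. The function name `multiple_correlation` and the toy scores are assumptions; the ratio is only guaranteed to lie between 0 and 1 when the fitted values come from a least squares fit that includes a constant term.

```python
import math


def multiple_correlation(y_true, y_pred):
    """Multiple correlation coefficient R of Eq. (2)."""
    mean_y = sum(y_true) / len(y_true)
    explained = sum((p - mean_y) ** 2 for p in y_pred)  # regression sum of squares
    total = sum((y - mean_y) ** 2 for y in y_true)      # total sum of squares
    return math.sqrt(explained / total)


# Toy scores: a perfect fit gives R = 1, predicting the mean everywhere gives R = 0.
observed = [60, 70, 80, 90]
r_perfect = multiple_correlation(observed, [60, 70, 80, 90])
r_mean_only = multiple_correlation(observed, [75, 75, 75, 75])
print(r_perfect, r_mean_only)
```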

Through our analysis of each attribute against Score, we found that Video, Unit and Document have relatively strong linear relationships with Score. We calculated the Pearson correlation coefficients between each of them and Score with SPSS, which were 0.894, 0.935 and 0.937, respectively. We therefore choose Video, Unit and Document as the independent variables; the linear relationships between these three variables and the score are shown in Fig. 3.

Fig. 3. Linear relationship diagram of video, unit, document and score.
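The Pearson correlation coefficient used to select the independent variables can be computed as in the sketch below. The paper obtains its values with SPSS; the function name `pearson` and the five toy observations are assumptions and do not reproduce the reported coefficients.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical video-view counts and final scores for five students.
video = [10, 20, 30, 40, 50]
score = [52, 58, 71, 75, 90]
r_video = pearson(video, score)
print(round(r_video, 3))
```

A value close to 1, as in this toy case, indicates a strong positive linear relationship, which is the criterion the text uses to keep an attribute as an independent variable.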

In the second step, we use Video, Unit and Document as the independent variables and Score as the dependent variable, and use SPSS to perform multiple linear regression, obtaining the function model \(y = 0.141{x}_{1} + 1.335{x}_{2} + 0.239{x}_{3} - 2.545\), where \({x}_{1}\), \({x}_{2}\) and \({x}_{3}\) denote Video, Unit and Document, respectively.
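Applying the function model to a single student is a direct substitution, as the following sketch shows. The coefficients are copied from the function model stated above; the function name `predict_score` and the example student are hypothetical.

```python
# Coefficients as stated in the function model above.
COEFFICIENTS = {"Video": 0.141, "Unit": 1.335, "Document": 0.239}
INTERCEPT = -2.545


def predict_score(video, unit, document):
    """Predicted score y = 0.141*x1 + 1.335*x2 + 0.239*x3 - 2.545."""
    return (
        COEFFICIENTS["Video"] * video
        + COEFFICIENTS["Unit"] * unit
        + COEFFICIENTS["Document"] * document
        + INTERCEPT
    )


# Hypothetical student: 100 video views, 20 unit tests, 50 document reads.
predicted = predict_score(video=100, unit=20, document=50)
print(round(predicted, 3))
```

For this hypothetical student the terms are 14.1, 26.7 and 11.95, which sum with the constant to a predicted score of about 50.205.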

3.5 Result Analysis

For student personality analysis, the K-Means, Birch and MiniBatchKMeans algorithms all divide students’ learning personality into three categories when their silhouette coefficients are the largest. All three algorithms cluster personality 0 best, personality 2 second best, and personality 1 worst. Among the three algorithms, the Birch algorithm has a lower clustering accuracy than the other two. We analyze the three personalities through the clustering results of K-Means, as shown in Table 7. The table shows that the behavioral data values of students in category 1 are relatively small, students in category 2 have the largest values, and students in category 3 are in the middle. We divide students into three categories: “active”, “ordinary”, and “dull”. Among them, the active type is more enthusiastic about things, the ordinary type is positive and indifferent to things, and the dull type is introverted and indifferent to things.

Table 1. K-Means cluster analysis of student personality.
Table 2. Birch cluster analysis of student personality.
Table 3. MiniBatchKMeans cluster analysis of student personality.
Table 4. K-Means cluster analysis of student attitude.
Table 5. Birch cluster analysis of student attitude.
Table 6. MiniBatchKMeans cluster analysis of student attitude.
Table 7. Student personality clustering results.
Table 8. Model summary.
Table 9. ANOVA.
Table 10. Coefficients.

For the analysis of student attitudes, students’ learning attitudes are divided into 4 categories when the silhouette coefficients of the three algorithms are the largest. All three algorithms cluster attitude 0 best and attitude 2 worst. For attitude 0, the K-Means algorithm has the lowest accuracy, and for attitudes 1, 2 and 3, MiniBatchKMeans has the lowest accuracy. We draw the result of the K-Means algorithm as a radar chart, as shown in Fig. 4, and draw the following conclusions. The first type of student can be summarized as “negative and lazy”; their attitudes are negative learning, laziness, and even giving up learning. The second type is “perfunctory and active”; their attitudes are perfunctory, positive comments, and easygoing. The third type is “medium-general”, whose performance is “medium grades”, “average enthusiasm” and “sloppy”. The fourth type is “proactive”, which is manifested as “good scores”, “high motivation” and “active learning”. There are 993 “negative and lazy” students, 295 “perfunctory and active” students, 324 “medium-general” students and 73 “proactive” students.

Fig. 4. Radar chart of student attitudes.

According to the function model obtained above, a student’s performance can be predicted once Video, Unit and Document are known. To evaluate the quality of the model, we test it with SPSS. The evaluation results are shown in Tables 8, 9 and 10.

The analysis shows that the coefficients obtained by multiple linear regression for the number of video views, unit tests and document readings are 0.141, 0.1335 and 0.239, respectively, the constant term is −2.545, and the significance of each independent variable is less than 0.001, which shows that each independent variable has a significant influence on the dependent variable. In addition, the multiple correlation coefficient R of the model is 0.974, which is close to 1, and its significance is less than 0.001, indicating a good fit.

4 Summary

In this paper, we clean and preprocess students’ behavior data, then use three clustering algorithms to classify students’ learning personality and attitude and compare the results of the three algorithms. Finally, we select the K-Means algorithm to cluster the data and analyze the model. The personality of students is divided into three categories: “active”, “ordinary”, and “dull”. Students’ attitude is divided into four categories: “negative and lazy”, “perfunctory and active”, “medium-general” and “proactive”. The multiple linear regression algorithm is then used to predict students’ performance, and the model is tested. The study of student behavior data can provide targeted suggestions for future teaching practice and a theoretical basis for the continuous improvement of teachers’ classroom teaching [10].