1 Introduction

In the teaching process, it is crucial to evaluate teaching performance [1]. This evaluation is one of the most complex processes in any university, since many factors and criteria (class planning, schedules, delivery of evaluation evidence, attendance at teaching-improvement courses and teaching styles, among others) must be combined to produce a final assessment of the professor. Teacher evaluation can be performed with an observation guide or a rubric. However, when teacher performance is evaluated by students, varied opinions are collected on the same established criteria. Education is therefore one of the areas that in recent years has shown interest in analyzing student comments, so that teachers can improve their teaching techniques and promote appropriate learning in students. This is possible through Sentiment Analysis [2], an application of natural language processing, text mining and computational linguistics that identifies information in text.

In his research, Binali [3] asserts that students express their emotions in comments, which makes comments a way to learn about various aspects of the student. Wen [4] applies Sentiment Analysis to feedback about teachers from students enrolled in online courses, in order to learn their opinions and determine whether there is a connection between emotions and dropout rates. Student feedback on the quality and standards of learning is considered a strategy for improving the teaching process [4] and can be collected through a variety of social networks, blogs and surveys.

In this paper, we present a model called SocialMining that supports Teacher Performance Assessment by applying Support Vector Machines (SVM). We selected SVM as the classifier because of its high performance in classification applications [5, 6]. Further experiments with other machine learning algorithms will follow.

This paper is organized as follows. Section 2 presents related work. Section 3 shows the SocialMining model architecture. Section 4 describes the data and methods used and the experimental design. Section 5 presents the results. Finally, the conclusions of the work are presented in Sect. 6.

2 Related Work

Table 1 shows an overview of some related work. All of these works have obtained good results with their different combinations of methods and algorithms. The table is not exhaustive.

Table 1. List of features to analyze

From Table 1 we can see that most previous research has focused on particular aspects, such as knowing the student's emotional state, analyzing the terms and phrases in student opinions, detecting students' feelings on certain topics and learning user opinions of E-learning systems. In this work we propose a model to evaluate teacher performance that considers Spanish-language reviews from students and applies machine learning algorithms to classify them as positive, negative or neutral. The results of this work may help to improve the comment classification process and to suggest courses to teachers.

3 SocialMining Model Architecture

The SocialMining model is composed of three phases: extraction and cleaning of comments (feedback from students about their teachers), feature selection, and classification of the comments as positive, negative or neutral by applying SVM. The last phase includes an evaluation of the SVM results across kernels.

Phase 1: Comments Extraction and Cleaning Process.

In this phase we extracted feedback from students about their teachers to generate a corpus of comments. We then labeled the comments as positive or negative using a numeric range. Values from −2 to −0.2 are used for negative comments, with −2 expressing a very negative comment. Values from +0.2 to +2 are used for positive comments, with +2 expressing a very positive comment. Likewise, comments labeled with the number 1 are considered neutral (Fig. 1).

Fig. 1. Numeric range used in the comment tagging process

In the cleaning process, stop words and nouns that appear in most of the comments (e.g. teacher, university, class, subject, school and others) are deleted. In addition, punctuation marks are removed and capitalized words are converted to lowercase. The output of this phase is the corpus of comments.
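As an illustration only, the cleaning step could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the stop-word list and the list of frequent nouns are hypothetical placeholders, and whitespace tokenization is assumed.

```python
import string

# Hypothetical word lists for illustration; the actual lists are not given in the text.
STOP_WORDS = {"el", "la", "los", "las", "de", "es", "muy", "y", "en", "que"}
FREQUENT_NOUNS = {"profesor", "maestro", "universidad", "clase", "materia", "escuela"}


def clean_comment(comment: str) -> list[str]:
    """Lowercase a comment, strip punctuation, and drop stop words and frequent nouns."""
    lowered = comment.lower()
    # Remove ASCII punctuation plus the Spanish opening marks.
    table = str.maketrans("", "", string.punctuation + "¿¡")
    tokens = lowered.translate(table).split()
    return [t for t in tokens if t not in STOP_WORDS and t not in FREQUENT_NOUNS]


print(clean_comment("¡El profesor explica muy bien la clase!"))
```

With these placeholder lists, only the opinion-bearing tokens of the example comment survive the filter.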

Phase 2: Feature Selection.

Once the cleaning process was finished, we performed a feature selection process, which removes repetitive terms and applies functions to select the required terms or features; this process acts as a filter. The input of this phase is the corpus of comments and the output is the set of features.

A feature in Sentiment Analysis is a term or phrase that helps to express a positive or negative opinion. Several methods are used in feature selection: some are based on syntactic word position, some on information gain, some on a variable importance calculated by genetic algorithms [15], and some on trees, such as the variable importance measures for random forests [16]. In this phase it is necessary to know the importance of each feature, given by its weight, so Term Frequency - Inverse Document Frequency (TF-IDF) is applied (Fig. 2).
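To make the weighting concrete, a small TF-IDF computation is sketched below. The text does not state which TF-IDF variant was used, so this sketch assumes one common form: term frequency is the count divided by the comment length, and inverse document frequency is the natural logarithm of N / df. The three tokenized comments are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(corpus: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dictionary per tokenized comment."""
    n_docs = len(corpus)
    # Document frequency: number of comments that contain each term.
    df = Counter(term for doc in corpus for term in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented example comments, already cleaned and tokenized.
corpus = [["explica", "bien"], ["explica", "mal"], ["excelente", "explica"]]
for row in tf_idf(corpus):
    print(row)
```

Under this variant, a term that occurs in every comment ("explica" here) receives weight 0, while terms that distinguish comments receive positive weights.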

Fig. 2. SocialMining model architecture

Phase 3: Comments Classification Process.

In this phase, the matrix of comments and features (matrixCF) is partitioned into two independent datasets. The first dataset is dedicated to the training process (train) and is used in classification to find patterns or relationships among data; the second dataset is reserved for the testing process (test) in order to assess model performance. In this work two-thirds of matrixCF is used as the training dataset and one-third as the test dataset. K-fold cross-validation is then applied to control the tuning and training of SVM. In this method matrixCF is divided into K subsets. One of the subsets is used as test data and the remaining K−1 as training data. The process is repeated for K iterations, each with a different subset as test data, resulting in a confusion matrix with average values. Once the K iterations have been completed, the cross-validation accuracy is obtained. In this research, K is equal to 10.
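The K-fold partition described above can be sketched as follows. This is an illustrative sketch that only produces row indices; the function name `k_fold_indices`, the shuffling and the seed are assumptions rather than details taken from this work.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# 1040 rows and K = 10, as stated in the text.
for train_idx, test_idx in k_fold_indices(1040, k=10):
    print(len(train_idx), len(test_idx))
```

Each row appears in exactly one test fold, so averaging the K confusion matrices uses every comment once for testing.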

The tuning process in SVM adjusts the parameters of each kernel (linear, radial basis and polynomial). A training process is then performed, which identifies whether the parameter values vary or remain constant.

Finally, SVM is run, producing the confusion matrix and accuracy values as well as the Receiver Operating Characteristic (ROC) curve.

4 Materials and Methods

4.1 Data

The dataset used in this work comprises 1040 comments in Spanish from three groups of systems engineering students at Universidad Politécnica de Aguascalientes. They evaluated 21 teachers in the first scholar grade (2016). For this study we considered only comments free from noise or spam (characterized in this study as texts with strange characters, empty spaces, no opinion, or content unrelated to teacher evaluation). In this work we identified a set of 99 features. An extract of the features is listed in Table 2.

Table 2. List of features

4.2 Performance Measures

We used typical performance measures in machine learning such as:

  • Accuracy, the primary measure for evaluating the performance of a predictive model.

  • Balanced accuracy, a better estimate of classifier performance when the class distribution is unequal.

  • Sensitivity, which measures the proportion of actual positives that are correctly identified.

  • Specificity, which measures the proportion of actual negatives that are correctly identified.

  • ROC curve, which represents the performance of a classifier graphically [17, 18].
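For the binary case, the first four measures can be computed directly from the confusion matrix, as sketched below. The counts in the example are hypothetical and are not results from this study.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """Compute standard measures from the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Mean of the two rates, which is not dominated by the larger class.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical counts for an unbalanced test set (100 positives, 50 negatives).
print(binary_metrics(tp=90, fn=10, fp=20, tn=30))
```

In this invented example accuracy is 0.80 while balanced accuracy is 0.75, which shows how an unequal class distribution can make plain accuracy look better than the per-class rates justify.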

4.3 Classifiers

SVM is an algorithm introduced by Vapnik [19] for the classification of both linear and nonlinear data; it is known for its quality in text classification [20]. Several kernels can be used in SVM, such as linear, polynomial, radial basis function (RBF) and sigmoid. Each kernel has particular parameters that must be tuned to achieve the best performance. In this work we selected the first three kernels to classify comments, mainly because of their good performance in text classification [5, 6]. Table 3 shows the parameters of each kernel used in this study.

Table 3. Kernel parameters.

  • C is the parameter of the soft-margin cost function; it determines the trade-off between a wide margin and classification error. A very small value of C yields a wider separating margin and tolerates more errors on the training set, whereas a large value of C narrows the margin and fits the model more tightly to the training data, which may cause overfitting.

  • Sigma determines the width of the Gaussian in the radial basis kernel.

  • Degree controls the flexibility of the resulting classifier in the polynomial kernel.
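The role of sigma and degree can be seen in the kernel functions themselves. The sketch below assumes one common parameterization, in which the RBF kernel is exp(−sigma · ||x − y||²) and the polynomial kernel is (scale · ⟨x, y⟩ + coef)^degree; the exact forms and the names `scale` and `coef` are assumptions, since the formulas are not given in the text.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, scale=1.0, coef=1.0):
    # Higher degrees allow more flexible decision boundaries.
    return (scale * dot(x, y) + coef) ** degree


def rbf_kernel(x, y, sigma=0.5):
    # In this assumed parameterization, a larger sigma gives a narrower Gaussian.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * sq_dist)


x, y = [1.0, 0.0], [1.0, 1.0]
print(linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y))
```

C does not appear in any kernel function: it belongs to the soft-margin optimization problem and is shared by all three kernels.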

4.4 Experimental Design

We created a dataset containing 1040 comments and 99 features associated with teacher performance assessment. We used train-test evaluation, with two-thirds (2/3) for training and one-third (1/3) for testing, and performed 30 runs applying SVM with polynomial, radial basis function (RBF) and linear kernels. Performance measures are computed for each run. In each run we set a different seed to ensure different splits of the training and testing sets; all kernels use the same seed in the same run.
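The run structure described above, with one seed per run shared by all kernels, could be organized as in the following sketch. The seed values 1 to 30, the rounding of the two-thirds cut and the helper name `train_test_split` are assumptions; model fitting is omitted, so the sketch records only the split sizes.

```python
import random


def train_test_split(n_samples: int, seed: int, train_fraction: float = 2 / 3):
    """Shuffle row indices with the given seed and cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


KERNELS = ["linear", "rbf", "polynomial"]
runs = []
for seed in range(1, 31):  # 30 runs, one seed each (seed values assumed)
    # The split is drawn once per run, so every kernel sees the same partition.
    train_idx, test_idx = train_test_split(1040, seed)
    for kernel in KERNELS:
        # A real run would fit and score an SVM here.
        runs.append((seed, kernel, len(train_idx), len(test_idx)))

print(len(runs), runs[0])
```

Sharing the split within a run means that differences between kernels in that run cannot be attributed to a different partition of the data.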

Each kernel requires tuning different parameters (see Table 3). A simple and effective method for tuning SVM parameters, the grid search, has been proposed by Hsu [21]. The C values used for the kernels range from 0.1 to 2, the value of sigma (σ) varies from 0.01 to 2, the degree parameter ranges from 2 to 10, and values between 0 and 1 are assigned to the coef parameter. We performed 30 train-test runs using different seeds and calculated the accuracy and balanced accuracy for each run.
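A grid search over these ranges can be sketched as below. Only the endpoints of each range come from the text: the intermediate grid points, the assignment of coef to the polynomial kernel and the scoring function are assumptions made for illustration.

```python
from itertools import product

# Endpoints follow the ranges in the text; the intermediate points are assumed.
C_VALUES = [0.1, 0.5, 1.0, 2.0]
SIGMA_VALUES = [0.01, 0.1, 1.0, 2.0]
DEGREE_VALUES = [2, 3, 4]  # a subset of the 2 to 10 range, to keep the grid small
COEF_VALUES = [0.0, 0.5, 1.0]

GRIDS = {
    "linear": [{"C": c} for c in C_VALUES],
    "rbf": [{"C": c, "sigma": s} for c, s in product(C_VALUES, SIGMA_VALUES)],
    "polynomial": [
        {"C": c, "degree": d, "coef": k}
        for c, d, k in product(C_VALUES, DEGREE_VALUES, COEF_VALUES)
    ],
}


def grid_search(candidates, score):
    """Return the candidate parameter set with the highest score."""
    return max(candidates, key=score)


def toy_score(params):
    # Stand-in for a cross-validated ROC value; it simply prefers C close to 1.
    return -abs(params["C"] - 1.0)


for kernel, grid in GRIDS.items():
    print(kernel, len(grid), grid_search(grid, toy_score))
```

In practice the scoring function would train the SVM with each candidate and return its cross-validated performance, so the cost grows with the product of the grid sizes.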

5 Discussion and Results

In this section, we present the results with three SVM kernels. The first step is to determine the parameters of each kernel, so we first load the data, partition the corpus of comments into training and testing datasets, and then use a train control in R [22] to set the training method. We use the Hsu [21] methodology to specify the search space for each kernel parameter. ROC is the performance criterion used to select the optimal kernel parameters.

Setting the seed to 1 in the parameter optimization process, we generated paired samples according to Hothorn [23] and compared models using a resampling technique. Table 4 shows the summary of resampling results obtained with R [22]; the performance metrics are ROC, sensitivity and specificity. Fig. 3 plots the summary of resampling results; in this case, the polynomial kernel apparently performs better than the linear and RBF (radial) kernels.

Table 4. Summary of resampling results for parameter optimization

Fig. 3. Summary of resampling results for kernel performance

Once the optimized parameters for each kernel were obtained, each SVM model was executed.

Table 5 shows the average results of each SVM kernel across 30 runs, together with the standard deviation of each metric.

Table 5. Average results across 30 runs in three kernels of SVM

The linear kernel obtained a balanced accuracy above 0.80, which indicates that the classifier is feasible for comment classification. Sensitivity values were much higher than specificity values for all kernels, which indicates that the classifier can detect the negative comments about teachers. The polynomial kernel (SVM Poly) had the lowest performance in all metrics except sensitivity. All three kernels were more sensitive than specific.

6 Conclusions

In computer science, the use of this type of machine learning model is attractive for automating processes, saving time and contributing to decision-making. The SocialMining model supports the analysis of behavior from unstructured data provided by students. Sentiment analysis is based on the analysis of texts, and SocialMining can provide a feasible solution to the problem of analyzing teacher evaluation comments. Further experiments will be conducted in this ongoing research project.

It is important to point out that the number of features needs to be reduced through an in-depth analysis that identifies the features most relevant to teacher performance assessment, in order to improve the results of the comment classification process. We also consider it important to have a balanced corpus of comments (positive and negative comments in equal quantity) for the testing and training processes.

In addition to conducting a deeper analysis for relevant feature selection, we consider it necessary to implement other machine learning algorithms, in order to measure the performance of each algorithm in the classification of comments and select the one with the highest accuracy.

Based on the adequate results obtained by the SocialMining model when applying Naïve Bayes and a corpus of subjectivity [24], we consider it worthwhile to implement other machine learning algorithms that are well known for their good performance in classification.

As for how the SocialMining model supports the improvement of teaching: in the first instance, it allows a quicker analysis of student comments, identifying which teachers have mostly negative comments, which in turn allows interventions that support those teachers through teacher improvement courses. Courses are offered to teachers each school period; however, student comments are not considered among the criteria for recommending a particular course to a teacher. For this reason we believe that the model presented in this work will support the improvement of teaching.