1 Introduction

With the rapid development of science and technology, research interest in artificial intelligence continues to grow, and more and more researchers are working in machine learning and deep learning. Because images are a direct carrier of information, image classification has attracted increasing attention. Interest in image research keeps growing, spanning subjective image quality evaluation, machine-learning quality evaluation with the SVM + SVR system built on the BRISQUE algorithm [1], and deep-learning quality evaluation models such as CNN IQA [2]. Image classification is one of the topics within the field of image quality evaluation, and its results often have a great impact on subsequent image computation. This paper proposes an image classification method that uses histogram of oriented gradient (HOG) features, gray-level co-occurrence matrix (GLCM) features, and a support vector machine (SVM) classifier. The HOG and GLCM features are extracted from each image, and the SVM performs the prediction on them.

2 Algorithm Flow

There are about 300 pictures in this experiment, divided into four classes: car, cat, flower, and fish. The pictures are split into a training set and a test set according to a certain proportion. The test set contains 50 car pictures, 35 cat pictures, 35 flower pictures, and 60 fish pictures (180 in total). Samples from the test set are shown in Fig. 1.

Fig. 1 Samples of the test set

The algorithm flowchart is shown in Fig. 2, and the computation steps are illustrated as follows.

Fig. 2 Algorithm computation flowchart

  • Step 1: The system resizes all pictures in a batch to a common size for the subsequent computation.

  • Step 2: The system reads each picture, extracts its HOG features and GLCM features, and merges the two feature sets.

  • Step 3: The system feeds the feature vectors into the SVM and adopts a one-to-one scheme to train and test it.

  • Step 4: The system displays and analyzes the classification results.
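Steps 1 and 2 can be sketched in a few lines. This is only an illustration: the paper does not state the interpolation method, the target size, or how the two feature sets are merged, so nearest-neighbour resizing and plain concatenation are assumptions, and the function names `resize_nearest` and `merge_features` are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of pixel values with nearest-neighbour sampling.

    Assumption: the paper does not say which interpolation it uses.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def merge_features(hog_vector, glcm_vector):
    """Merge the two descriptors by concatenation (assumed merge rule)."""
    return list(hog_vector) + list(glcm_vector)


# Toy 4x4 "picture" reduced to 2x2, and two toy descriptors merged.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
small = resize_nearest(image, 2, 2)
merged = merge_features([0.1, 0.2, 0.3], [4.0, 5.0])
```

The merged vector is what Step 3 would pass to the classifier.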

2.1 HOG Feature Extraction

The HOG feature is a histogram of gradient directions. It is a typical image feature that is widely used in image research. HOG features are generally obtained through the following steps [3].

  • Step 1: The algorithm converts the original image into a grayscale image. Equation (1) gives the conversion from a color image to a grayscale image:

    $${\text{Gray}} = 0.3 * R + 0.59 * G + 0.11 * B$$
    (1)

    where R, G, and B represent the color components at the corresponding pixel position of the image;

  • Step 2: Gamma correction [4] is used to normalize the image. When the illumination of the image is uneven, Gamma correction can raise or lower its overall brightness. In practice, Gamma normalization can be performed in two ways, the square root method and the logarithmic method. The square root method is the more common choice, and its formula is as follows:

    $$Y(x,y) = I(x,y)^{1/2}$$
    (2)

    where I(x, y) represents the brightness at position (x, y) of the image and Y(x, y) is the normalized value;

  • Step 3: The algorithm calculates the gradient at each pixel of the image;

  • Step 4: The algorithm divides the image into cells and builds a histogram of gradient directions for each cell;

  • Step 5: The algorithm groups several cells into blocks and then concatenates the gradient-direction histograms of all blocks to obtain the HOG features (Fig. 3).

    Fig. 3 Example of HOG features. The original picture is shown on the left and its HOG features on the right
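The five steps can be sketched as follows. Only Eqs. (1) and (2) come from the text. The 8 × 8 cells, 9 unsigned orientation bins, 2 × 2 blocks, centered-difference gradients, and L2 block normalization are common defaults assumed here for illustration, and the paper's MATLAB implementation may differ.

```python
import math


def to_gray(rgb):
    """Step 1, Eq. (1): weighted sum of the R, G, B components per pixel."""
    return [[0.3 * r + 0.59 * g + 0.11 * b for (r, g, b) in row] for row in rgb]


def gamma_sqrt(gray):
    """Step 2, Eq. (2): square-root Gamma normalization."""
    return [[math.sqrt(v) for v in row] for row in gray]


def gradients(img):
    """Step 3: centered differences (assumed); borders reuse the edge pixel."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees (assumed convention).
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


def cell_histogram(mag, ang, y0, x0, cell, bins):
    """Step 4: magnitude-weighted histogram of directions for one cell."""
    hist = [0.0] * bins
    width = 180.0 / bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            hist[int(ang[y][x] // width) % bins] += mag[y][x]
    return hist


def hog(gray, cell=8, block=2, bins=9):
    """Step 5: concatenate L2-normalized block vectors (assumed normalization)."""
    mag, ang = gradients(gamma_sqrt(gray))
    rows, cols = len(gray) // cell, len(gray[0]) // cell
    cells = [[cell_histogram(mag, ang, r * cell, c * cell, cell, bins)
              for c in range(cols)] for r in range(rows)]
    features = []
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            vec = [v for dr in range(block) for dc in range(block)
                   for v in cells[r + dr][c + dc]]
            norm = math.sqrt(sum(v * v for v in vec)) + 1e-6
            features.extend(v / norm for v in vec)
    return features


# Toy 16x16 horizontal ramp: every gradient points along the x axis.
ramp = [[float(x * x) for x in range(16)] for _ in range(16)]
features = hog(ramp)
```

With these assumed settings a 16 × 16 image yields one block of four cells, so the descriptor has 4 × 9 entries, and for the ramp all the energy falls into the first orientation bin of each cell.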

2.2 GLCM Feature Extraction

The GLCM [5] is a robust and adaptable texture description technique that can effectively support the classification and retrieval of images. An entry of the GLCM is the probability that, starting from a pixel with gray level i and moving by the displacement d = (Dx, Dy), one arrives at a pixel with gray level j. Equation (3) gives its definition. In this paper, to obtain the GLCM at different angles, four directions are computed: 0°, 45°, 90°, and 135°. For each direction, the contrast, the inverse difference moment, the entropy, and the autocorrelation are calculated. The mean and the variance of each of these quantities over the four directions are then used as the final extracted features.

$$P_{d} (i,j),\quad i,j = 0,1,2, \ldots ,L - 1$$
(3)

where L is the number of gray levels of the image; i and j are the gray levels of the two pixels; and d specifies the direction and distance between the two pixels.
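A minimal sketch of the GLCM computation and the derived features is given below. Several choices are assumptions not stated in the paper: a distance of one pixel, a non-symmetric matrix, the natural logarithm for the entropy, the population variance across directions, and reading "autocorrelation" as the normalized GLCM correlation. The function names are hypothetical.

```python
import math

# (dy, dx) offsets for distance 1 at 0, 45, 90 and 135 degrees (assumed distance).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, dy, dx, levels):
    """Normalized co-occurrence matrix P_d(i, j) for the offset d = (dx, dy)."""
    h, w = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def glcm_stats(p):
    """Contrast, inverse difference moment, entropy, correlation of one GLCM."""
    idx = range(len(p))
    contrast = sum((i - j) ** 2 * p[i][j] for i in idx for j in idx)
    idm = sum(p[i][j] / (1 + (i - j) ** 2) for i in idx for j in idx)
    entropy = -sum(p[i][j] * math.log(p[i][j])
                   for i in idx for j in idx if p[i][j] > 0)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
    corr = cov / math.sqrt(var_i * var_j) if var_i * var_j > 0 else 0.0
    return [contrast, idm, entropy, corr]


def glcm_features(image, levels):
    """Mean and variance of each statistic over the four directions."""
    per_angle = [glcm_stats(glcm(image, *OFFSETS[a], levels))
                 for a in (0, 45, 90, 135)]
    feats = []
    for k in range(4):
        vals = [stats[k] for stats in per_angle]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        feats += [mean, var]
    return feats


# Toy 4x4 image with L = 4 gray levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p0 = glcm(image, 0, 1, 4)
features = glcm_features(image, levels=4)
```

Under these assumptions each image contributes eight GLCM values (four statistics, each with a mean and a variance), which would then be merged with the HOG vector.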

2.3 SVM

The SVM is a typical binary classifier that is widely used in machine learning. As shown in Fig. 4, the hollow circles and black squares represent two classes of linearly separable training samples. L is a classification line that separates the two classes without error. L1 and L2 are the lines parallel to L that pass through the samples of each class nearest to L. The distance d between L1 and L2 is called the classification interval, and the optimal classification line is the one that maximizes d. When this situation is extended to a high-dimensional space, the optimal classification line becomes the optimal classification surface. A kernel-based SVM maps a problem that is linearly inseparable in the low-dimensional space into a high-dimensional space through a kernel mapping, where an optimal classification surface can likewise be found to solve the classification problem.

Fig. 4 Sketch of the SVM algorithm
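As a reference for Fig. 4, the classification interval can be written in the standard hard-margin notation. The symbols w (normal vector of L), b (offset), and the labelled training samples (x_n, y_n) with y_n ∈ {−1, +1} are introduced here for illustration and do not appear in the original text. Taking L as w^T x + b = 0 and scaling w and b so that L1 and L2 are w^T x + b = ±1, the classification interval is

$$d = \frac{2}{{\left\| {\mathbf{w}} \right\|}}$$

so maximizing d is equivalent to solving

$$\mathop {\min }\limits_{{{\mathbf{w}},b}} \frac{1}{2}\left\| {\mathbf{w}} \right\|^{2} \quad {\text{s.t.}}\quad y_{n} ({\mathbf{w}}^{{\text{T}}} {\mathbf{x}}_{n} + b) \ge 1,\quad n = 1, \ldots ,N$$

where N is the number of training samples.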

Although the SVM is a binary classifier, it can also be used for multi-class classification. There are usually three schemes for building a multi-class SVM: the one-versus-rest scheme, the one-to-one scheme, and the directed acyclic graph scheme. We use the one-to-one scheme in this paper. Assuming that the samples contain K categories in total, the one-to-one scheme trains one classifier for every pair of categories, so K(K − 1)/2 classifiers are needed in total (6 for the four categories used here). Although the number of classifiers is larger, each one is trained on the samples of only two categories, and the total time spent in the training phase is usually less than with the one-versus-rest scheme.
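The one-to-one scheme can be illustrated with a short voting sketch. To keep the example self-contained, the two-class learner below is a nearest-centroid rule used as a placeholder; it is not an SVM, and in the paper's setting each pairwise model would be a trained SVM. The data values and function names are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def centroid(samples):
    """Component-wise mean of a list of feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_binary(samples_a, samples_b):
    """Placeholder two-class learner (nearest centroid), NOT an SVM.

    Returns a function that is True when x is assigned to the first class.
    """
    ca, cb = centroid(samples_a), centroid(samples_b)
    return lambda x: sq_dist(x, ca) <= sq_dist(x, cb)


def train_one_vs_one(data):
    """Train one binary model per pair of classes: K(K - 1)/2 models."""
    return {(a, b): train_binary(data[a], data[b])
            for a, b in combinations(sorted(data), 2)}


def predict(models, x):
    """Each pairwise model casts one vote; the class with most votes wins."""
    votes = Counter(a if decide(x) else b for (a, b), decide in models.items())
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors for the four classes (illustration only).
data = {
    "car": [[0.0, 0.0], [0.2, 0.1]],
    "cat": [[5.0, 0.0], [5.1, 0.3]],
    "flower": [[0.0, 5.0], [0.2, 5.2]],
    "fish": [[5.0, 5.0], [5.3, 4.9]],
}
models = train_one_vs_one(data)
label = predict(models, [4.9, 5.1])
```

With K = 4 classes, `models` holds 4 × 3 / 2 = 6 pairwise classifiers, and a test vector is assigned to the class that wins the most pairwise decisions.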

3 Experimental Results

This experiment uses MATLAB to process the 300 pictures and to extract the HOG features and the GLCM features. The SVM algorithm then classifies the pictures, and the classification results are summarized in a confusion matrix [6]. The accuracy and the recall for each image class are used to judge the performance of the proposed method. The test set contains 180 pictures of four types: 50 car pictures, 35 cat pictures, 35 flower pictures, and 60 fish pictures. The classification results are shown in Table 1.

Table 1 Test set classification results

Table 1 shows that the fish class is classified best and the flower class worst. The accuracy and the recall rate computed from these results are shown in Table 2. According to Table 2, the average correct rate on the test set is about 91.87%, which is a relatively satisfactory classification result.

Table 2 Accuracy and the recall rate of test set
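The way per-class figures are derived from a confusion matrix can be sketched as follows. The counts in `toy_cm` are made up for illustration and are not the values of Table 1. The sketch also assumes that the per-class "accuracy" is the fraction of predictions for a class that are correct (often called precision); the paper does not define it explicitly.

```python
def per_class_metrics(cm):
    """cm[t][p] = number of samples of true class t predicted as class p.

    Returns (overall correct rate, per-class precision, per-class recall).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(k))
    precision, recall = [], []
    for c in range(k):
        predicted_c = sum(cm[t][c] for t in range(k))  # column sum
        actual_c = sum(cm[c])                          # row sum
        precision.append(cm[c][c] / predicted_c if predicted_c else 0.0)
        recall.append(cm[c][c] / actual_c if actual_c else 0.0)
    return correct / total, precision, recall


# Made-up 4-class confusion matrix (rows: true class, columns: predicted).
toy_cm = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 2, 8, 0],
    [0, 0, 0, 10],
]
overall, precision, recall = per_class_metrics(toy_cm)
```

For this toy matrix, 35 of the 40 samples lie on the diagonal, so the overall correct rate is 0.875, while recall and precision differ per class because they divide by row sums and column sums, respectively.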

To justify combining the HOG features and the GLCM features, classification experiments using only the HOG features and only the GLCM features were also carried out. The respective classification results are shown in Tables 3, 4, 5 and 6.

Tables 3, 4, 5 and 6 show that the average correct rate on the test set is about 88.61% when only the HOG features are used and about 83.06% when only the GLCM features are used, whereas it is about 91.87% when both are used. Combining the HOG features and the GLCM features therefore improves the average correct rate by about 3.3 percentage points over the HOG features alone and by about 8.8 percentage points over the GLCM features alone.

Table 3 Test set classification results only using the HOG feature
Table 4 Accuracy and the recall rate of test set only using the HOG feature
Table 5 Test set classification results only using the GLCM feature
Table 6 Accuracy and the recall rate of test set only using the GLCM feature

4 Conclusion

This paper presents an SVM image classification method based on the combination of HOG features and GLCM features and analyzes it experimentally. On the test set, the method reaches an average correct rate of about 91.87%, which is higher than that obtained with either feature alone.