
1 Introduction

During the past several decades, we have seen great advances in machine learning. However, with the expansion of machine learning applications, many new challenges have also emerged. In particular, adversarial machine learning, which studies the potential vulnerabilities of machine learning in adversarial scenarios, has attracted a lot of attention [1,2,3]. Adversarial samples have been widely found in the application fields of machine learning, notably image classification, speech recognition, and malware detection [4,5,6]. Meanwhile, various defensive techniques against adversarial samples have been proposed recently, including adversarial training, defensive distillation, pixel deflection, and local flatness regularization [7,8,9,10].

As a popular machine learning method, support vector machines (SVMs) have been widely used to solve security-related problems, such as image classification, malware detection, spam filtering, and intrusion detection [11,12,13]. As described in [14], adversarial attacks against machine learning can generally be categorized into poisoning attacks and evasion attacks. A poisoning attack happens at training time, where the adversary injects a small number of specifically crafted samples into the training data, which shifts the decision boundary of the model and results in misclassification. With the rise of various poisoning attack methods against SVMs [15,16,17,18,19], countermeasures for protecting SVM classifiers from poisoning attacks have been developed: one is data cleaning technology [20], and the other is to improve the robustness of learning algorithms against malicious training data [21].

In this paper, we focus mainly on evasion attacks on the SVM classifier. An evasion attack evades the trained model by constructing a well-crafted input sample during the test phase. In 2013, Biggio et al. [22] simulated various evasion attack scenarios with different risk levels to enable classifier designers to select models more wisely. Since then, more and more evasion attack methods have emerged. There are two main directions of evasion attacks for generating adversarial examples. The first is gradient-based, which is the most common and most successful family of attacks; the core idea is to use the input image as the starting point and modify it in the direction of the gradient of the loss function, as in the Fast Gradient Sign Method [23], the Basic Iterative Method [24], and the Iterative Gradient Sign Method [25]. The second generates adversarial samples based on the classification hyperplane, such as the DeepFool algorithm [26]. Although the above methods of generating adversarial examples all consider deep neural networks as the target models, in this work we focus on SVMs. Therefore, we first apply the above idea of generating adversarial samples to the SVM classifier and then propose a corresponding defense strategy.

In this work, our main contribution is to propose an effective defense strategy based on kernel optimization in SVM to protect the classifier against an attack method similar to the one proposed in [26]. The experimental results (in Sect. 4) show that our approach has a very significant defensive effect against the gradient-based iterative attack. Moreover, after applying kernel optimization for defense, our classifier becomes more robust. Besides, to the best of our knowledge, this is the first attempt to apply the adversarial attack proposed in [26] to the SVM model to generate adversarial examples, and it achieves good experimental results.

The remainder of this paper is organized as follows: In Sect. 2, we introduce the relevant background on SVMs and the attack approach that we use throughout our work. In Sect. 3, we illustrate our defense method based on kernel optimization in SVM against adversarial examples. Experimental results are presented in Sect. 4, followed by discussion and conclusions in Sect. 5.

2 Preliminary

To better illustrate the proposed procedures, we briefly review the main concepts of the model and the adversarial attack used throughout this paper. We first introduce our notation and summarize the SVM model we use in Sect. 2.1. Then we describe the main method used to generate adversarial samples in Sect. 2.2.

2.1 Support Vector Machine

The SVM model is a prevailing approach for classification between two sets. For illustration, we first describe the main idea of the binary SVM, which is to find a hyperplane that well separates the two classes. In the SVM, such a hyperplane correctly divides positive- and negative-class samples based on the principle of structural risk minimization. The hyperplane is represented as \({\mathbf{w}}^{T} \cdot {\mathbf{x}} + b = 0\), where the normal vector \({\mathbf{w}}\) gives its orientation and b is its intercept displacement.

Assuming a binary classification problem, we denote a training dataset as \(D = \{ ({\mathbf{x}}_{i} ,y_{i} )\} _{{i = 1}}^{N}\), where \({\mathbf{x}}_{i} \in \mathbb{R}^{d}\) is the input feature vector and \(y_{i} \in \{ - 1, + 1\}\) is the output label, N is the number of samples, and d is the dimensionality of the input space. Finding the optimal hyperplane of the SVM model can be expressed as a convex quadratic programming problem with inequality constraints. The Lagrangian multiplier method can be used to obtain its dual problem, and \({\mathbf{\alpha }}\) can then be solved by the SMO algorithm. Finally, we can obtain the discriminant function. In addition, \({\mathbf{w}}\) can be calculated as \(\sum\limits_{{i = 1}}^{N} {\alpha _{i} y_{i} } {\mathbf{x}}_{i}\), and the intercept b can be computed as \(b = \frac{1}{{|S|}}\sum\limits_{{i \in S}} {(y_{i} } - \sum\limits_{{j \in S}} {\alpha _{j} } y_{j} {\mathbf{x}}_{i}^{T} {\mathbf{x}}_{j} )\).

Although the SVM was initially designed to solve linear classification problems, it was extended to nonlinear classification by choosing among different kernel functions [27]. Through the kernel matrix, the training data can be projected into a more complex feature space. Training the SVM amounts to solving the following quadratic optimization problem

$$ \begin{array}{*{20}l} {\mathop {\min }\limits_{\alpha } {\text{ }}\frac{1}{2}\sum\limits_{{i = 1}}^{N} {\sum\limits_{{j = 1}}^{N} {\alpha _{i} } } \alpha _{j} y_{i} y_{j} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) - \sum\limits_{{i = 1}}^{N} {\alpha _{i} } ,} \hfill \\ {s.t.{\text{ }}\sum\limits_{{i = 1}}^{N} {\alpha _{i} } y_{i} = 0,} \hfill \\ {\quad \quad \alpha _{i} \ge 0,{\text{ }}i = 1,2,...,N,} \hfill \\ \end{array} $$
(1)

in which \(\alpha _{i}\) is the Lagrange multiplier corresponding to the training sample \({\mathbf{x}}_{i}\) and \({\mathbf{K}}( \cdot , \cdot )\) is the kernel function. If we define a mapping function \(\Phi :X \to \chi\) that maps the training samples into a higher-dimensional feature space, then \({\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} )\) can be written as \(\Phi ({\mathbf{x}}_{i} )^{T} \Phi ({\mathbf{x}}_{j} )\), so \({\mathbf{w}}\) and b can be written as

$$ {\mathbf{w}} = \sum\limits_{{i = 1}}^{N} {\alpha _{i} y_{i} } \Phi ({\mathbf{x}}_{i} ), $$
(2)
$$ b = \frac{1}{{|S|}}\sum\limits_{{i \in S}} {(y_{i} } - \sum\limits_{{j \in S}} {\alpha _{j} } y_{j} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} )), $$
(3)

where \(S = \{ i|\alpha _{i} > 0,i = 1,2,...,N\}\) is the index set of all the support vectors. Although computing directly in the feature space may be too complicated, one need not know \(\Phi\) explicitly, since only the kernel function is required.
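
To make the notation above concrete, the following minimal sketch (our illustration, not code from the paper) trains an RBF-SVM with scikit-learn and recovers the quantities of Eqs. (2)–(3): the support vectors, the products \(\alpha_i y_i\) (exposed as dual_coef_), and the intercept b. Note that scikit-learn parameterizes the RBF kernel as \(e^{-\gamma ||{\mathbf{x}}_i - {\mathbf{x}}_j||^2}\), so \(\gamma = 1/\sigma^2\) matches the kernel used in this paper; the toy data and the helper names rbf_kernel and decision_function are our own.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(A, B, sigma):
    """K(a, b) = exp(-||a - b||_2^2 / sigma^2), the kernel used in this paper."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / sigma ** 2)

def decision_function(x, svs, dual_coef, b, sigma):
    """Eq. (6): f(x) = sum_{i in S} alpha_i y_i K(x_i, x) + b, with dual_coef = alpha_i * y_i."""
    return rbf_kernel(svs, x[None, :], sigma)[:, 0] @ dual_coef + b

# toy two-dimensional data standing in for the two classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(1.0, 1.0, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

sigma = 2.0
clf = SVC(kernel="rbf", gamma=1.0 / sigma ** 2, C=1.0).fit(X, y)

svs = clf.support_vectors_      # the x_i with alpha_i > 0
dual_coef = clf.dual_coef_[0]   # alpha_i * y_i for each support vector
b = clf.intercept_[0]           # the intercept of Eq. (3)

# sanity check: evaluating the decision value from the dual quantities reproduces sklearn's
x = X[0]
print(decision_function(x, svs, dual_coef, b, sigma), clf.decision_function(x[None])[0])
```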

2.2 Attack Strategy

In [26], the authors proposed the DeepFool algorithm, a simple yet accurate method based on the classification hyperplane for generating adversarial samples. The primary attack method used in our study is similar to this method. In the case where the classifier \(f\) is linear, from [26] we know that the minimal perturbation that changes the classifier’s decision equals the distance from the point to the classification hyperplane times the negative unit vector of \({\mathbf{w}}\), where \({\mathbf{w}}\) is the weight vector of the classification hyperplane. For the nonlinear case, we consider an iterative procedure to find the minimum perturbation vector, as shown in Fig. 1. In some situations we may not be able to reach the classification hyperplane in one step, as in the linear case, and multiple steps may be required. Consequently, in the high-dimensional feature space, the minimum perturbation vector of the adversarial sample can be expressed as

$$ {\mathbf{\varepsilon }}_{\Phi } = - \frac{{{\mathbf{w}}_{\Phi } ^{T} \Phi ({\mathbf{x}}) + b}}{{||{\mathbf{w}}_{\Phi } ||_{2}^{2} }}{\mathbf{w}}_{\Phi } , $$
(4)

where \({\mathbf{w}}_{\Phi }\) and \(b\) are given in Eq. (2) and Eq. (3), respectively.

In fact, \({\mathbf{w}}_{\Phi }\) can also be formally represented by all the support vectors in the high-dimensional feature space, namely

$$ {\mathbf{w}}_{\Phi } = \sum\limits_{{i \in S}} {\alpha _{i} } y_{i} \Phi ({\mathbf{x}}_{i} ). $$
(5)

Of course, \(\Phi ({\mathbf{x}}_{i} )\) has no explicit expression, so Eq. (5) is only a formal representation of \({\mathbf{w}}_{\Phi }\) and cannot be computed directly.

Fig. 1.

The minimum perturbation required to move a positive sample across the decision boundary of a nonlinear binary classifier. The left panel shows the input plane; the right panel gives a geometric illustration of the method.

Next, we present the kernel-based adversarial example generation method. Combining Eq. (3) and Eq. (5), the nonlinear decision function \(f({\mathbf{x}})\) is defined as follows

$$ \begin{aligned} f({\mathbf{x}})\,=\,& {\mathbf{w}}_{\Phi } ^{T} \Phi ({\mathbf{x}}) + b \\ = & \sum\limits_{{i \in S}} {\alpha _{i} } y_{i} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}) + \frac{1}{{|S|}}\sum\limits_{{i \in S}} {(y_{i} } - \sum\limits_{{j \in S}} {\alpha _{j} y_{j} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} )} ). \\ \end{aligned} $$
(6)

For an unseen test sample, if the value of \(f({\mathbf{x}})\) is positive, the sample is classified as a normal example; otherwise, it is classified as a malicious sample. The gradient of \(f({\mathbf{x}})\) with respect to \({\mathbf{x}}\) is thus given by

$$ \nabla _{{\mathbf{x}}} f({\mathbf{x}}) = \sum\limits_{{i \in S}} {\alpha _{i} } y_{i} \nabla _{{\mathbf{x}}} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}). $$
(7)

Here, if we use the Radial Basis Function (RBF) as the kernel function, i.e., \({\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) = e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} }}{{\sigma ^{2} }}}}\), its gradient is \(\nabla _{{\mathbf{x}}} {\mathbf{K}}({\mathbf{x}}_{i} ,{\mathbf{x}}) = - \frac{2}{{\sigma ^{2} }}e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}||_{2}^{2} }}{{\sigma ^{2} }}}} ({\mathbf{x}} - {\mathbf{x}}_{i} )\). Therefore, the gradient of \(f({\mathbf{x}})\) can be rewritten as

$$ \nabla _{{\mathbf{x}}} f({\mathbf{x}}) = - \frac{2}{{\sigma ^{2} }}\sum\limits_{{i \in S}} {\alpha _{i} } y_{i} e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}||_{2}^{2} }}{{\sigma ^{2} }}}} ({\mathbf{x}} - {\mathbf{x}}_{i} ). $$
(8)

According to Algorithm 1, we can thus find the adversarial sample.

[Algorithm 1]
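
Since the original Algorithm 1 is only available as a figure, the following sketch is our reconstruction of the attack from Eqs. (4)–(8): it repeatedly linearizes \(f\) around the current point, jumps toward the approximated boundary, and applies a small overshoot (Sect. 4 adds \(\eta = 0.02\) to the disturbance at each step; here we apply it multiplicatively, DeepFool-style, which is our own choice). The helper decision_function and the dual quantities svs, dual_coef, b, and sigma come from the previous sketch.

```python
import numpy as np

def grad_f(x, svs, dual_coef, sigma):
    """Eq. (8): gradient of f at x for K(x_i, x) = exp(-||x_i - x||_2^2 / sigma^2)."""
    diff = x[None, :] - svs                              # rows are (x - x_i)
    k = np.exp(-np.sum(diff ** 2, axis=1) / sigma ** 2)  # K(x_i, x) for each support vector
    return -(2.0 / sigma ** 2) * (dual_coef * k) @ diff

def kernel_attack(x0, svs, dual_coef, b, sigma, eta=0.02, max_iter=100):
    """Iteratively push x0 across the decision boundary with (approximately) minimal steps."""
    x = x0.astype(float)
    sign0 = np.sign(decision_function(x0, svs, dual_coef, b, sigma))
    for _ in range(max_iter):
        f = decision_function(x, svs, dual_coef, b, sigma)
        if np.sign(f) != sign0:                     # the predicted label has flipped: stop
            break
        g = grad_f(x, svs, dual_coef, sigma)
        step = -(f / (np.dot(g, g) + 1e-12)) * g    # jump to the linearized boundary, cf. Eq. (4)
        x = x + (1.0 + eta) * step                  # small overshoot so the sign actually flips
    return np.clip(x, 0.0, 1.0)                     # keep pixels inside [0, 1] (Sect. 4 setting)
```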

3 The Defense Based on Kernel Optimization

If we choose the RBF as the kernel function, then according to Eq. (1) the dual problem of the SVM can be written as

$$ \begin{array}{*{20}l} {\mathop {\min }\limits_{\alpha } {\text{ }}\frac{1}{2}\sum\limits_{{i = 1}}^{N} {\sum\limits_{{j = 1}}^{N} {\alpha _{i} } } \alpha _{j} y_{i} y_{j} e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} }}{{\sigma ^{2} }}}} - \sum\limits_{{i = 1}}^{N} {\alpha _{i} } } \hfill \\ {s.t.{\text{ }}\sum\limits_{{i = 1}}^{N} {\alpha _{i} } y_{i} = 0} \hfill \\ {\quad \quad \alpha _{i} \ge 0,{\text{ }}i = 1,2...,N.} \hfill \\ \end{array} $$
(9)

After solving Eq. (9) to obtain the value of \({\mathbf{\alpha }}\), we consider optimizing the kernel parameter to improve the defense against adversarial attacks. Denoting a support vector by \({\mathbf{x}}_{s}\), the discriminant function of the support vectors satisfies \(f({\mathbf{x}}_{s} ) = {\mathbf{w}}_{\Phi }^{T} \Phi ({\mathbf{x}}_{s} ) + b = \pm 1\). Combining this with Eq. (4), we obtain the minimum perturbation radius of a support vector against adversarial samples, given by

$$ {\mathbf{\varepsilon }} = \frac{1}{{||{\mathbf{w}}_{\Phi } ||_{2} }}. $$
(10)

To make our model more difficult to attack, we maximize this minimum perturbation radius. Therefore, the task of the defense is to maximize the value of Eq. (10), which can be achieved by minimizing \(||{\mathbf{w}}_{\Phi } ||_{2}^{2}\). Given the value of \({\mathbf{\alpha }}\) and combining with Eq. (5), the optimization of the kernel parameter to defend against the attacks is as follows

$$ \mathop {\min }\limits_{\sigma } {\text{ }}A(\sigma ) = \sum\limits_{{i \in S}} {\sum\limits_{{j \in S}} {\alpha _{i} } } \alpha _{j} y_{i} y_{j} e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} }}{{\sigma ^{2} }}}} . $$
(11)

This is an unconstrained optimization problem, which can be solved by the gradient descent method

$$ \sigma _{k} = \sigma _{{k - 1}} - \eta A^{\prime}(\sigma _{{k - 1}} ), $$
(12)

where \({\text{ }}A^{\prime}(\sigma ) = \frac{2}{{\sigma ^{3} }}\sum\limits_{{i \in S}} {\sum\limits_{{j \in S}} {\alpha _{i} } } \alpha _{j} y_{i} y_{j} ||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} e^{{ - \frac{{||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} }}{{\sigma ^{2} }}}}\). The Gaussian kernel parameter optimization algorithm for defending against the adversarial attack is shown in Algorithm 2. The initial value of the kernel parameter can be set to \(\sigma ^{{(0)}} = \sqrt {\frac{1}{{N(N - 1)}}\sum\limits_{{i = 1}}^{N} {\sum\limits_{{j = 1}}^{N} {||{\mathbf{x}}_{i} - {\mathbf{x}}_{j} ||_{2}^{2} } } }\), where N is the number of training samples.

In [26], the authors proposed a simple yet accurate method for computing and comparing the robustness of different classifiers to adversarial perturbations; they defined the average robustness \(\hat{\rho }_{{adv}} (f)\) as follows

$$ \hat{\rho }_{{adv}} (f) = \frac{1}{{|D|}}\sum\limits_{{{\mathbf{x}} \in D}} {\frac{{||{\mathbf{\hat{r}}}({\mathbf{x}})||_{2} }}{{||{\mathbf{x}}||_{2} }}} , $$
(13)

where \({\mathbf{\hat{r}}}({\mathbf{x}})\) is the estimated minimal perturbation for the test sample \({\mathbf{x}}\) and \(D\) is the test set. To verify the effectiveness of our defense method, we also use this measure to compare the robustness of the classifier under different kernel parameters.
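
As a small illustration, and under the same assumptions as the previous sketches, the average robustness of Eq. (13) can be estimated by running the attack on a set of test points and averaging the perturbation-to-input norm ratios; kernel_attack is the reconstruction given earlier, and D is assumed to be a matrix whose rows are test samples.

```python
import numpy as np

def average_robustness(D, svs, dual_coef, b, sigma):
    """Eq. (13): mean of ||r_hat(x)||_2 / ||x||_2 over the test set D (rows of a matrix)."""
    ratios = []
    for x in D:
        r = kernel_attack(x, svs, dual_coef, b, sigma) - x   # minimal perturbation r_hat(x)
        ratios.append(np.linalg.norm(r) / (np.linalg.norm(x) + 1e-12))
    return float(np.mean(ratios))
```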

[Algorithm 2]
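
As with Algorithm 1, the listing of Algorithm 2 appears only as a figure, so the following is our reconstruction of the kernel parameter optimization from Eqs. (11)–(12) and the initialization \(\sigma^{(0)}\) given above. Here alpha_y holds \(\alpha_i y_i\) for the support vectors svs; the learning rate, the number of steps, and the lower bound on \(\sigma\) are our own choices.

```python
import numpy as np

def initial_sigma(X):
    """sigma^(0) = sqrt( sum_{i,j} ||x_i - x_j||_2^2 / (N (N - 1)) )."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    n = X.shape[0]
    return np.sqrt(d2.sum() / (n * (n - 1)))

def optimize_sigma(svs, alpha_y, sigma0, lr=0.01, n_steps=100):
    """Gradient descent on A(sigma), Eqs. (11)-(12)."""
    d2 = np.sum((svs[:, None, :] - svs[None, :, :]) ** 2, axis=-1)  # ||x_i - x_j||_2^2
    coef = np.outer(alpha_y, alpha_y)                               # alpha_i alpha_j y_i y_j
    sigma = float(sigma0)
    for _ in range(n_steps):
        expo = np.exp(-d2 / sigma ** 2)
        grad = (2.0 / sigma ** 3) * np.sum(coef * d2 * expo)        # A'(sigma)
        sigma = max(sigma - lr * grad, 1e-3)                        # Eq. (12), kept positive
    return sigma
```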

4 Experimental Results

Datasets.

To demonstrate the effectiveness of the kernel optimization defense method, we validated it on the MNIST [28] and CIFAR-10 [29] image classification datasets, respectively. In these experiments, we only consider a standard SVM with the RBF kernel and choose data from two classes, considering one class as the benign class and the other as the attack class. The classes and numbers of samples employed in each training and test set are given in Table 1. In order to limit the range of the adversarial examples, each pixel of the examples in both datasets is normalized to \({\mathbf{x}} \in [0,1]^{d}\) by dividing by 255, in which d represents the number of features. For the MNIST dataset, each digit image is a grayscale image of \(28\,*\,28\) pixels, so the feature vectors have \(d = 28 * 28 = 784\) values, while for the CIFAR-10 dataset, each image is a color image with three channels of \(32 * 32\) pixels each, so the feature vectors have \(d = 32 * 32 * 3 = 3072\) features. In these experiments, only the kernel parameter \(\sigma\) is considered, and the regularization parameter \(c\) of the SVM is fixed to its default value.
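
As a preprocessing sketch (the dataset loaders themselves are omitted), the two-class selection, relabeling into \(\{-1,+1\}\), and scaling of pixel values into \([0,1]^d\) described above might look as follows; the function name and the example classes are ours, with the MNIST digits '1' and '7' taken from Fig. 3.

```python
import numpy as np

def binary_subset(X_raw, y_raw, benign_class, attack_class):
    """Keep two classes, flatten images, scale pixels to [0, 1], and relabel as {+1, -1}."""
    mask = np.isin(y_raw, [benign_class, attack_class])
    X = X_raw[mask].reshape(mask.sum(), -1).astype(np.float64) / 255.0  # x in [0, 1]^d
    y = np.where(y_raw[mask] == benign_class, 1.0, -1.0)                # benign -> +1, attack -> -1
    return X, y

# e.g. for the MNIST experiment with digit '1' as the benign class and '7' as the attack class:
# X_train, y_train = binary_subset(mnist_images, mnist_labels, benign_class=1, attack_class=7)
```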

Table 1. Datasets used for training and testing with RBF-SVMs

After training, \({\mathbf{\alpha }}\) is obtained, and we begin the kernel optimization. According to Sect. 3, the defense method’s task is to maximize Eq. (10), that is, to minimize the function \(A\) in Eq. (11). The gradient descent method is used to minimize the function \(A\), as described in Algorithm 2. The graph of the value of the function \(A\) varying with the value of \(\sigma\) is shown in Fig. 2. We found that the value of the function \(A\) grows with increasing \(\sigma\) on both datasets. Therefore, the minimum value of the function \(A\) is obtained at the initial value of \(\sigma\) on both datasets.

Then we verify the effectiveness of the defense method at different values of \(\sigma\). We use the method proposed in Sect. 2.2 to generate adversarial samples. In order to prevent the gradient from vanishing, we add a small value \(\eta {\text{ = 0}}{\text{.02}}\) to the perturbation each time we generate adversarial samples. The method used to generate the adversarial samples is shown in Algorithm 1. On the MNIST dataset, we selected the value of \(\sigma\) as 8.6 (the initial value of \(\sigma\)), 20, 40, and 100, respectively, and then compared the generated adversarial samples (shown in Fig. 3, top). On the CIFAR-10 dataset, we selected the value of \(\sigma\) as 19.6 (the initial value of \(\sigma\)), 30, 40, and 50, and then compared the resulting adversarial samples (shown in Fig. 3, bottom).

Fig. 2.

How the function \(A\) changes with different values of \(\sigma\) on the MNIST and CIFAR-10 datasets. The plot shows that the function \(A\) and \(\sigma\) are positively correlated.

Fig. 3.

Different defense effects. The top row shows the results obtained on the MNIST dataset: the first picture is the original example, representing the digit ‘1’, and the other four pictures are the adversarial samples generated from it under different kernel parameters, all classified as the digit ‘7’. The bottom row shows the results on the CIFAR-10 dataset: the first image is the original example, representing ‘dog’, and the other four are the adversarial samples generated from it under different values of \(\sigma\), all classified as ‘cat’.

Finally, we verified the robustness of the classifier under different values of the kernel parameter. As shown in Fig. 4, kernel optimization significantly increases the robustness of the classifier.

Fig. 4.

Relation between the robustness of the classifier and the kernel parameter on the MNIST and CIFAR-10 datasets. As the value of \(\sigma\) increases, the robustness of the classifier decreases; the effect is more obvious on the CIFAR-10 dataset.

5 Discussion and Conclusion

In this work, we are the first to propose a strategy for protecting SVMs against the kernel-based adversarial example generation method. In [26], the authors put forward a technique based on the classification hyperplane for generating adversarial examples for deep neural networks. We reasoned that a similar approach could also work for SVMs and applied it to SVM classifiers. Through experiments, we confirmed that this method is effective against SVMs, especially on the MNIST dataset, where it caused nearly 100% misclassification. Based on this observation, we proposed a strategy for protecting SVMs against the adversarial attack. This defense approach is based on kernel optimization in the SVM. We extensively evaluated our proposed attack and defense algorithms on the MNIST and CIFAR-10 datasets.

According to Fig. 3, we found that when \(\sigma\) takes its initial value, which corresponds to the minimum of the function \(A\) (see Fig. 2), the perturbation required to generate an adversarial sample is the largest, which means that the defense is at its best. This finding holds for both datasets. The experimental results also show that our proposed defense method can effectively increase the cost for attackers and achieve robust performance (see Fig. 4). This gives the classifier’s designer a better picture of the classifier’s performance under adversarial attacks.

In this paper, we first described a practical attack method which has already been confirmed to be effective. Then we proposed a defense method based on kernel optimization. The experimental results demonstrated that the defense method is useful and effective for improving the security of SVMs. Finally, we believe that our work will inspire future research towards developing more secure learning algorithms against adversarial attacks.