Abstract
Recently, the Laplacian pair-weight vector projection (LapPVP) algorithm was proposed for semi-supervised classification. Although LapPVP achieves good classification performance in semi-supervised learning, it may be sensitive to noise and outliers because it uses a neighbor graph with a fixed similarity matrix. To remedy this, this paper proposes a novel method named Laplacian pair-weight vector projection with adaptive neighbor graph (ANG-LapPVP), in which the graph induced by the Laplacian manifold regularization is constructed adaptively by solving an optimization problem. For binary classification problems, ANG-LapPVP learns a pair of projection vectors by solving pair-wise optimization formulations that maximize the between-class scatter and minimize both the within-class scatter and the adaptive neighbor graph (ANG) regularization. The ANG regularization learns a graph whose similarity matrix varies across iterations, which may resolve the issue of LapPVP. Thus, ANG-LapPVP simultaneously learns adaptive similarity matrices and a pair of projection vectors through an iterative process. Experimental results on an artificial dataset and real-world benchmark datasets show the superiority of ANG-LapPVP over related methods. Thus, ANG-LapPVP is promising for semi-supervised learning.
1 Introduction
Recently, semi-supervised classification tasks have attracted more and more research attention. On the basis of graph theory, manifold regularization has become a popular technique for extending supervised learners to semi-supervised ones [2, 4, 6]. By exploiting the structural information provided by unlabeled data, semi-supervised learners outperform their supervised counterparts and have more applications in practice [1, 3, 12, 15].
Because the performance of a semi-supervised learner partly depends on the corresponding supervised one, we need to choose a good supervised learner to construct a semi-supervised one. Inspired by the idea of non-parallel planes, many supervised algorithms have been designed for binary classification problems, such as the generalized eigenvalue proximal support vector machine (GEPSVM) [9, 16], twin support vector machine (TSVM) [7], least squares twin support vector machine (LSTSVM) [8], multi-weight vector support vector machine (MVSVM) [18], and enhanced multi-weight vector projection support vector machine (EMVSVM) [17]. In these non-parallel-plane learners, each plane is as close as possible to samples from its own class and as far as possible from samples of the other class. Owing to their outstanding generalization performance, TSVM and LSTSVM have been extended to semi-supervised learning through the manifold regularization framework [3, 12]. In [12], the Laplacian twin support vector machine (LapTSVM) constructs a more reasonable classifier from labeled and unlabeled data by integrating the manifold regularization. Chen et al. [3] proposed the Laplacian least squares twin support vector machine (LapLSTSVM) based on LapTSVM and LSTSVM. Different from LapTSVM, LapLSTSVM needs to solve only two systems of linear equations, with remarkably less computational time. These semi-supervised learners have shown that manifold regularization is a reasonable and effective technique. On the basis of EMVSVM and manifold regularization, Laplacian pair-weight vector projection (LapPVP) extended this line of work to semi-supervised binary classification [14]. LapPVP obtains a pair of projection vectors by maximizing the between-class scatter and minimizing both the within-class scatter and the manifold regularization.
The performance of semi-supervised learners is also partly determined by the neighbor graph induced by the manifold regularization. Generally, the neighbor graph is predefined and may be sensitive to noise and outliers [10]. To improve the robustness of LapPVP, we propose a novel semi-supervised learner, named Laplacian pair-weight vector projection with adaptive neighbor graph (ANG-LapPVP). ANG-LapPVP learns a pair of projection vectors by solving pair-wise optimization formulations that maximize the between-class scatter and minimize both the within-class scatter and the adaptive neighbor graph (ANG) regularization. In ANG, the similarity matrix is not fixed but adaptively learned from both labeled and unlabeled data by solving an optimization problem [11, 19, 20]. Moreover, the between- and within-class scatter matrices are computed separately for each class, which strengthens the discriminant capability of ANG-LapPVP. Therefore, ANG-LapPVP easily handles binary classification tasks and achieves good performance.
2 Proposed Method
The proposed method, ANG-LapPVP, is an enhanced version of LapPVP. In ANG-LapPVP, we learn an ANG based on the assumption that the smaller the distance between two data points, the greater their probability of being neighbors. Like LapPVP, ANG-LapPVP finds a pair of projection vectors by maximizing the between-class scatter and minimizing both the within-class scatter and the ANG regularization.
Let \(\textbf{X}= [\textbf{X}_\ell ; \textbf{X}_u]\in \mathbb {R}^{n\times m}\) be the training sample matrix, where n and m are the number of total samples and features, respectively; \(\textbf{X}_\ell \in \mathbb {R}^{\ell \times m}\) and \(\textbf{X}_u\in \mathbb {R}^{u\times m}\) are the labeled and unlabeled sample matrices, respectively; \(\ell \) and u are the number of labeled and unlabeled samples, respectively, and \(n=\ell +u\). For convenience, we use \(y_i\) to describe the label situation of sample \(\textbf{x}_i\). If \(y_i=1\), \(\textbf{x}_i\) is a labeled and positive sample; if \(y_i=-1\), \(\textbf{x}_i\) is a labeled and negative sample; if \(y_i=0\), \(\textbf{x}_i\) is unlabeled. Furthermore, the labeled sample matrix \(\textbf{X}_{\ell }\) can be represented as \(\textbf{X}_\ell = [\textbf{X}_1; \textbf{X}_2]\), where \(\textbf{X}_1 = [\textbf{x}_{11}, \textbf{x}_{12}, \dots , \textbf{x}_{1\ell _1}]^T \in \mathbb {R}^{\ell _1 \times m}\) is the positive sample matrix with a label of 1, \(\textbf{X}_2 = [\textbf{x}_{21}, \textbf{x}_{22}, \dots , \textbf{x}_{2\ell _2}]^T \in \mathbb {R}^{\ell _2 \times m}\) is the negative sample matrix with a label of \(-1\), \(\ell =\ell _1+\ell _2\), \(\ell _1\) and \(\ell _2\) are the number of positive and negative samples, respectively.
2.1 Formulations of ANG-LapPVP
For binary classification tasks, the goal of ANG-LapPVP is, like LapPVP, to find a pair of projection vectors. As mentioned above, the proposed ANG-LapPVP is an enhanced version of LapPVP, so we first briefly introduce LapPVP [14]. For the positive class, LapPVP solves the following optimization problem:
where \(\alpha _1 > 0\) and \(\beta _1>0\) are regularization parameters, \(\textbf{L}\) is the Laplacian matrix of all training data, and \(\textbf{B}_1\) and \(\textbf{W}_1\) are the between- and within-class scatter matrices of the positive class, respectively, which can be calculated by
and
where \(\textbf{v}_1\in \mathbb {R}^{m}\) is the projection vector of the positive class, \(\textbf{u}_1=\frac{1}{\ell _1}\sum _{i=1}^{\ell _1} \textbf{x}_{1i}\) is the mean vector of the positive samples, and \(\textbf{e}_1 \in \mathbb {R}^{\ell _1}\) and \(\textbf{e} \in \mathbb {R}^{\ell }\) are vectors of all ones with different lengths.
In the optimization problem (1), the Laplacian matrix \(\textbf{L}\) is computed in advance and is independent of the objective function. The concept of ANG was proposed in [11] and has been applied to feature selection for unsupervised multi-view learning [19] and semi-supervised learning [20]. We incorporate this concept into LapPVP to form ANG-LapPVP.
Similarly, the pair of projection vectors of ANG-LapPVP is obtained from pair-wise optimization formulations. On the basis of (1), the optimization formulation of ANG-LapPVP for the positive class is defined as:
where \(\textbf{S}_1\) is the similarity matrix for the positive class, \(\textbf{L}_{s_1}\) is the Laplacian matrix related to \(\textbf{S}_1\) for the positive class, and \(\gamma _1>0\) is a regularization parameter.
Compared with (1), (4) contains an additional third term, which we call the ANG regularization. \(\textbf{S}_1\) varies with iterations, and accordingly the Laplacian matrix \(\textbf{L}_{s_1} = \textbf{D}_{s_1} - \textbf{S}_1\) changes, where \(\textbf{D}_{s_1}\) is a diagonal matrix with diagonal elements \(({D}_{s_1})_{ii} = \sum _j ({S}_{1})_{ij}\). The first and second terms represent the between- and within-class scatters of the positive class, and the regularization parameter \(\alpha _1\) balances these two scatters. By maximizing the between-class scatter and minimizing the within-class scatter, ANG-LapPVP keeps data points in the same class as near as possible while keeping them as far as possible from the other class.
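As a concrete illustration of this construction, the Laplacian of a given similarity matrix follows directly from its row sums. The short sketch below (in Python with NumPy, which the paper does not prescribe) builds \(\textbf{L}_{s} = \textbf{D}_{s} - \textbf{S}\):

```python
import numpy as np

def graph_laplacian(S):
    """Build L_s = D_s - S, where D_s is the diagonal matrix with
    (D_s)_ii = sum_j S_ij (the row sums of S)."""
    S = np.asarray(S, dtype=float)
    D = np.diag(S.sum(axis=1))
    return D - S

# A tiny 3-node similarity matrix whose rows sum to one.
S = np.array([[0.0, 0.7, 0.3],
              [0.7, 0.0, 0.3],
              [0.3, 0.3, 0.4]])
L = graph_laplacian(S)
print(np.allclose(L.sum(axis=1), 0.0))  # True: Laplacian rows sum to zero
```

The zero row sums follow because each diagonal entry of \(\textbf{D}_{s}\) exactly cancels the off-diagonal weights of its row, a property the ANG regularization relies on.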
For the negative class, ANG-LapPVP has the following similar problem:
where \(\textbf{v}_2\) is the projection vector for the negative class, \(\alpha _2\), \(\beta _2\), and \(\gamma _2\) are positive regularization parameters, \(\textbf{S}_2\) is the similarity matrix for the negative class, \(\textbf{L}_{s_2}\) is the Laplacian matrix related to \(\textbf{S}_2\), \(\textbf{B}_2\) and \(\textbf{W}_2\) are the between- and within-class scatter matrices for the negative class, respectively, which can be written as:
and
where \(\textbf{u}_2=\frac{1}{\ell _2}\sum _{i=1}^{\ell _2} \textbf{x}_{2i}\) is the mean vector of the negative samples, and \(\textbf{e}_2 \in \mathbb {R}^{\ell _2}\) is a vector of all ones.
2.2 Optimization of ANG-LapPVP
Problems (4) and (5) form the pair of optimization problems of ANG-LapPVP, in which the projection vectors \(\textbf{v}_1\) and \(\textbf{v}_2\) and the similarity matrices \(\textbf{S}_1\) and \(\textbf{S}_2\) are unknown. It is difficult to find the optimal solution for all of them at the same time. Thus, we use an alternating optimization approach to solve (4) and (5): during the optimization procedure, we fix one set of variables and solve for the other.
When \(\textbf{S}_1\) and \(\textbf{S}_2\) are fixed, the optimization formulations of ANG-LapPVP can be reduced to
and
which are exactly LapPVP.
Following [14], we can find the solutions \(\textbf{v}_1\) and \(\textbf{v}_2\) to (8) and (9), respectively. As shown in [14], (8) and (9) can be converted to the following eigenvalue decomposition problems:
where \(\lambda _1\) and \(\lambda _2\) are eigenvalues for the positive and negative classes, respectively. Thus, the optimal solutions here are the eigenvectors corresponding to the largest eigenvalues.
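With the similarity matrices fixed, each subproblem therefore reduces to picking the top eigenvector of a data-dependent matrix. The sketch below is a hedged reading of this step: forming \(\textbf{M} = \textbf{B} - \alpha \textbf{W} - \beta \textbf{X}^T\textbf{L}\textbf{X}\) and taking its leading eigenvector is one plausible instantiation; the names and the exact form of \(\textbf{M}\) are our assumptions, not the paper's displayed equations.

```python
import numpy as np

def top_eigvec(B, W, XtLX, alpha, beta):
    """Return the unit eigenvector of the largest eigenvalue of
    M = B - alpha*W - beta*XtLX (an assumed form of the decomposed matrix)."""
    M = B - alpha * W - beta * XtLX
    M = (M + M.T) / 2                 # symmetrize for numerical safety
    _, vecs = np.linalg.eigh(M)       # eigh orders eigenvalues ascending
    return vecs[:, -1]                # column for the largest eigenvalue

# Toy 2-D scatter matrices: the first coordinate direction dominates.
B = np.diag([3.0, 1.0])
W = np.diag([0.5, 0.5])
XtLX = np.zeros((2, 2))
v = top_eigvec(B, W, XtLX, alpha=1.0, beta=1.0)
print(abs(v[0]))  # 1.0: the dominant direction is the first axis
```

Because `eigh` returns an orthonormal eigenbasis, the solution automatically satisfies a unit-norm constraint on the projection vector.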
Once we obtain the projection vectors \(\textbf{v}_1\) and \(\textbf{v}_2\), we take them as fixed and solve for \(\textbf{S}_1\) and \(\textbf{S}_2\). In this case, the optimization formulations of ANG-LapPVP reduce to
and
For simplicity, let \((Z_1)_{ij} = ||\textbf{v}_1^T\textbf{x}_i-\textbf{v}_1^T\textbf{x}_j||^2\) and \((Z_2)_{ij} = ||\textbf{v}_2^T\textbf{x}_i-\textbf{v}_2^T\textbf{x}_j||^2\). Then matrices \(\textbf{Z}_1\) and \(\textbf{Z}_2\) are constant when \(\textbf{v}_1\) and \(\textbf{v}_2\) are fixed. Thus, (11) and (12) can be rewritten as:
and
Because (13) and (14) are similar, we describe the optimization procedure only for (13). First, we generate the Lagrangian function of (13) with multipliers \(\delta _1\) and \(\zeta _1\) as follows:
According to the KKT conditions [13], we take the partial derivative of \(L(\textbf{S}_1,\delta _1,\zeta _1)\) with respect to the primal variable \(\textbf{S}_1\) and set it to zero, which results in
Similarly, the similarity matrix \(\textbf{S}_2\) is achieved by
where \(\delta _2\) and \(\zeta _2\) are positive Lagrange multipliers.
Since the constraint \(\textbf{S}\textbf{e}=\textbf{e}\) must hold, we have
and
where the parameters \(\gamma _1\) and \(\gamma _2\) can be computed as follows [11]:
and
where k is the number of neighbors in the graph, and \(\textbf{q}_{k}\) and \(\widetilde{\textbf{q}}_{k-1}\) are indicator vectors: \(\textbf{q}_k = [0, 0, \cdots , 0, 1, 0, \cdots , 0, 0]^T \in \mathbb {R}^n\), in which the k-th element is one and the others are zero, and \(\widetilde{\textbf{q}}_{k-1} = [1, 1, \cdots , 1, 0, \cdots , 0, 0]^T \in \mathbb {R}^n\), in which the first \(k-1\) elements are one and the others are zero.
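Under closed forms of this kind, each row of the similarity matrix keeps exactly k neighbors with weights that decay linearly in the projected squared distance. The sketch below follows the standard adaptive-neighbor update of [11] as we read it; the function and variable names are illustrative, not the paper's.

```python
import numpy as np

def update_similarity(Z, k):
    """Row-wise adaptive-neighbor update (after [11]): for each point i,
    keep its k nearest neighbors with weights
    s_ij = (z_(k+1) - z_ij) / (k * z_(k+1) - sum of the k smallest z_i),
    so each row is nonnegative and sums to one."""
    n = Z.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        z = Z[i].astype(float).copy()
        z[i] = np.inf                    # a point is not its own neighbor
        idx = np.argsort(z)              # neighbors by ascending distance
        zk = z[idx[:k]]                  # the k smallest distances
        denom = k * z[idx[k]] - zk.sum()
        if denom <= 0:                   # degenerate row: uniform weights
            S[i, idx[:k]] = 1.0 / k
        else:
            S[i, idx[:k]] = (z[idx[k]] - zk) / denom
    return S

# Five 1-D points in two clusters; Z holds projected squared distances.
pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1]).reshape(-1, 1)
Z = (pts - pts.T) ** 2
S = update_similarity(Z, k=2)
print(np.allclose(S.sum(axis=1), 1.0))  # True: each row sums to one
```

Note how points inside a cluster receive almost all of each other's weight, which is exactly the robustness to distant outliers that motivates replacing a fixed graph.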
2.3 Strategy of Classification
The pair of projection vectors \((\textbf{v}_1,\textbf{v}_2)\) can project data points into two different subspaces. The distance measurement is a reasonable way to estimate the class label of an unknown data point \(\textbf{x} \in \mathbb {R}^m\). Here, we define the strategy of classification using the minimum distance.
For an unknown point \(\textbf{x}\), we project it into two subspaces induced by \(\textbf{v}_1\) and \(\textbf{v}_2\). In the subspace induced by \(\textbf{v}_1\), the projection distance between \(\textbf{x}\) and positive samples is defined as:
In the subspace induced by \(\textbf{v}_2\), the projection distance between \(\textbf{x}\) and negative samples is computed as:
It is reasonable that \(\textbf{x}\) is taken as the positive point if \(d_1< d_2\), which is the minimum distance strategy. Thus, we assign a label to \(\textbf{x}\) by the following rule:
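Since the displayed definitions of \(d_1\) and \(d_2\) are omitted above, the sketch below uses one plausible instantiation of a projection distance (distance of the projected point to the projected class mean) purely to illustrate the minimum-distance rule; this distance definition is our assumption, not necessarily the paper's exact formula.

```python
import numpy as np

def predict(x, v1, v2, X1, X2):
    """Minimum-distance rule: assign +1 if x, projected by v1, is closer to
    the projected positive mean than it is, projected by v2, to the projected
    negative mean. The distance used here is an illustrative assumption."""
    d1 = abs(v1 @ x - v1 @ X1.mean(axis=0))   # distance in the v1 subspace
    d2 = abs(v2 @ x - v2 @ X2.mean(axis=0))   # distance in the v2 subspace
    return 1 if d1 < d2 else -1

# Toy data: positives near the origin, negatives near (5, 0).
X1 = np.array([[0.0, 0.0], [0.2, 0.1]])
X2 = np.array([[5.0, 0.0], [5.2, 0.1]])
v1 = v2 = np.array([1.0, 0.0])
print(predict(np.array([0.3, 0.0]), v1, v2, X1, X2))  # 1
```

A point near the negative cluster, e.g. \((4.9, 0)\), is assigned \(-1\) by the same rule.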
2.4 Computational Complexity
Here, we analyze the computational complexity of ANG-LapPVP. Problems (4) and (5) are non-convex, and the final pair of projection vectors is obtained by an iterative method. In each iteration, the optimization problems of ANG-LapPVP decompose into eigenvalue decomposition problems and constrained quadratic programming problems.
The computational complexities of an eigenvalue decomposition problem and a quadratic programming problem are \(O\left( m^2\right) \) and \(O\left( n^2\right) \), respectively, where m is the number of features and n is the number of samples. Let t be the number of iterations. Then, the total computational complexity of ANG-LapPVP is \(O\left( t\left( m^2+n^2\right) \right) \). In the iterative process of ANG-LapPVP, the convergence condition is the difference between the current and previous projection vectors, i.e., \(||\textbf{v}^t_{1}- \textbf{v}^{t-1}_{1}|| \le 0.001\) or \(||\textbf{v}^t_{2}- \textbf{v}^{t-1}_{2}|| \le 0.001\).
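The alternating procedure with this stopping rule can be sketched as follows; `solve_v` and `solve_S` stand in for the eigen-decomposition and similarity-update subproblems, and the `tol=1e-3` default matches the 0.001 tolerance above. The placeholder solvers in the usage example merely exercise the loop.

```python
import numpy as np

def alternate(solve_v, solve_S, S0, max_iter=50, tol=1e-3):
    """Alternate between the two subproblems until the projection
    vector is stationary: ||v^t - v^{t-1}|| <= tol."""
    S, v = S0, None
    for _ in range(max_iter):
        v_new = solve_v(S)            # fix S, update the projection vector
        S = solve_S(v_new)            # fix v, update the similarity matrix
        if v is not None and np.linalg.norm(v_new - v) <= tol:
            break
        v = v_new
    return v_new, S

# Trivial placeholder solvers just to exercise the stopping rule.
v_opt = np.array([1.0, 0.0])
v, S = alternate(lambda S: v_opt, lambda v: np.eye(2), np.eye(2))
print(np.allclose(v, v_opt))  # True: loop stops once v is stationary
```

In ANG-LapPVP the same loop runs once per class, producing \((\textbf{v}_1, \textbf{S}_1)\) and \((\textbf{v}_2, \textbf{S}_2)\).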
3 Experiments
This section presents our experiments. First, we compare ANG-LapPVP with LapPVP on an artificial dataset to illustrate the improvement achieved by the ANG regularization. We then compare ANG-LapPVP with other non-parallel-plane algorithms on benchmark datasets to analyze its performance.
3.1 Experiments on Artificial Dataset
An artificial dataset, called CrossPlane, is generated by perturbing points originally lying on two intersecting planes. CrossPlane contains 400 instances with only 2 labeled and 198 unlabeled ones for each class. The distribution of CrossPlane is shown in Fig. 1. Obviously, some data points belonging to Class \(+1\) are surrounded by the data points of Class \(-1\) and vice versa.
Figure 2 plots the projection vectors learned by LapPVP and ANG-LapPVP. We can see that the projection vectors learned by ANG-LapPVP are more suitable than those learned by LapPVP. The accuracy of LapPVP is \(91.50\%\), and that of ANG-LapPVP is \(97.00\%\). Clearly, ANG-LapPVP has better classification performance on the CrossPlane dataset; in other words, ANG-LapPVP is robust to noise and outliers. All in all, the ANG regularization can improve the performance of LapPVP, which makes ANG-LapPVP better.
3.2 Experiments on Benchmark Datasets
In the following experiments, we compare ANG-LapPVP with supervised algorithms, including GEPSVM, MVSVM, EMVSVM, TSVM and LSTSVM to evaluate the effectiveness of ANG-LapPVP, and compare it with semi-supervised algorithms (LapTSVM, LapLSTSVM and LapPVP) to verify the superiority of ANG-LapPVP on ten benchmark datasets. The benchmark datasets are collected from the UCI Machine Learning Repository [5]. We normalize the datasets so that all features range in the interval [0, 1].
Each experiment is run 10 times with a random 70% of the data for training and the remaining 30% for testing. The average classification results are reported as the final ones. The grid search method is applied to find the optimal hyper-parameters in each trial. Parameters \(\beta _1\) and \(\beta _2\) in both LapPVP and ANG-LapPVP are selected from \(\{2^{-10}, 2^{-9}, \dots , 2^{0}\}\), and the other regularization parameters in all methods are selected from \(\{2^{-5}, 2^{-4}, \dots , 2^{5}\}\). In the semi-supervised methods, the number of nearest neighbors is selected from \(\{3, 5, 7, 9\}\).
Comparison with Supervised Algorithms. We first compare ANG-LapPVP with GEPSVM, MVSVM, EMVSVM, TSVM and LSTSVM to investigate the performance of the adaptive neighbor graph. Specifically, we discuss the impact of different proportions of labeled data on these algorithms. For ANG-LapPVP, \(50\%\) of the training samples are treated as unlabeled.
Table 1 lists the results of the supervised algorithms and ANG-LapPVP with \(10\%\), \(30\%\) and \(50\%\) of the training data as labeled samples, where the best results are highlighted. The experimental results in Table 1 show the effectiveness of ANG-LapPVP. With an increasing number of labeled data, the accuracy of ANG-LapPVP on most datasets goes up gradually, which indicates that labeled data provide more discriminant information. Moreover, we observe that ANG-LapPVP trained with unlabeled data has the best classification performance on all ten datasets except Breast with \(50\%\) labeled data and German in all three settings, which fully demonstrates the significance of the adaptive similarity matrices learned from labeled and unlabeled training data. Generally speaking, semi-supervised algorithms outperform the related supervised ones, and the proposed ANG-LapPVP attains the most promising classification performance.
Comparison with Semi-supervised Algorithms. To validate the superiority of ANG-LapPVP, we further analyze the experimental results of LapTSVM, LapLSTSVM, LapPVP and ANG-LapPVP. Tables 2 and 3 list the mean accuracy and standard deviation obtained by the semi-supervised algorithms with \(30\%\) and \(50\%\) of the training samples as unlabeled ones, respectively, where the best results are in bold. In both settings, \(20\%\) of the training samples are labeled.
From the results in Tables 2 and 3, we can see that ANG-LapPVP has a higher accuracy than LapPVP on all ten datasets except Heart with \(30\%\) unlabeled data. The evidence further indicates that ANG-LapPVP with the ANG regularization well preserves the structure of training data and has a better classification performance than LapPVP. Moreover, compared with the other semi-supervised algorithms, ANG-LapPVP has the highest accuracy on eight datasets in Table 2 and on nine datasets in Table 3. That is to say, ANG-LapPVP has substantial advantages over LapTSVM and LapLSTSVM. On the whole, ANG-LapPVP has an excellent ability in binary classification tasks.
4 Conclusion
In this paper, we propose ANG-LapPVP for binary classification tasks. As an extension of LapPVP, ANG-LapPVP improves the classification performance by introducing the ANG regularization, which induces an adaptive neighbor graph whose similarity matrix changes with iterations. Experimental results on artificial and benchmark datasets validate the effectiveness and superiority of the proposed algorithm. In a nutshell, ANG-LapPVP has better classification performance than LapPVP and is a promising semi-supervised algorithm.
Although ANG-LapPVP achieves good classification performance on the datasets used here, the projection vectors it obtains may be insufficient for large-scale datasets. In that case, we could consider projection matrices, which may provide more discriminant information. Therefore, the dimensionality of the projection matrices is a practical problem to be addressed in our future work. In addition, extending the method to multi-class classification tasks is also under consideration.
References
Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7, 2399–2434 (2006)
Chapelle, O., Schölkopf, B., Zien, A.: Introduction to semi-supervised learning. In: Chapelle, O., Schölkopf, B., Zien, A. (eds.) Semi-Supervised Learning, pp. 1–12. The MIT Press, Cambridge (2006)
Chen, W., Shao, Y., Deng, N., Feng, Z.: Laplacian least squares twin support vector machine for semi-supervised classification. Neurocomputing 145, 465–476 (2014)
Culp, M.V., Michailidis, G.: Graph-based semisupervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 30(1), 174–179 (2008)
Dua, D., Graff, C.: UCI machine learning repository (2017). https://archive.ics.uci.edu/ml
Fan, M., Gu, N., Qiao, H., Zhang, B.: Sparse regularization for semi-supervised classification. Pattern Recogn. 44(8), 1777–1784 (2011)
Jayadeva, Khemchandani, R., Chandra, S.: Twin support vector machines for pattern classification. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 905–910 (2007)
Kumar, M.A., Gopal, M.: Least squares twin support vector machines for pattern classification. Expert Syst. Appl. 36(4), 7535–7543 (2009)
Mangasarian, O.L., Wild, E.W.: Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Trans. Pattern Anal. Mach. Intell. 28(1), 69–74 (2006)
Nie, F., Dong, X., Li, X.: Unsupervised and semisupervised projection with graph optimization. IEEE Trans. Neural Netw. Learn. Syst. 32(4), 1547–1559 (2021)
Nie, F., Wang, X., Huang, H.: Clustering and projected clustering with adaptive neighbors. In: Macskassy, S.A., Perlich, C., Leskovec, J., Wang, W., Ghani, R. (eds.) The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014, 24–27 August 2014, pp. 977–986. ACM, New York (2014)
Qi, Z., Tian, Y., Shi, Y.: Laplacian twin support vector machine for semi-supervised classification. Neural Netw. 35, 46–53 (2012)
Vapnik, V.: Statistical Learning Theory. Wiley, Hoboken (1998)
Xue, Y., Zhang, L.: Laplacian pair-weight vector projection for semi-supervised learning. Inf. Sci. 573, 1–19 (2021)
Yang, Z., Xu, Y.: Laplacian twin parametric-margin support vector machine for semi-supervised classification. Neurocomputing 171, 325–334 (2016)
Yang, Z.: Nonparallel hyperplanes proximal classifiers based on manifold regularization for labeled and unlabeled examples. Int. J. Pattern Recognit. Artif. Intell. 27(5), 1350015 (2013)
Ye, Q., Ye, N., Yin, T.: Enhanced multi-weight vector projection support vector machine. Pattern Recogn. Lett. 42, 91–100 (2014)
Ye, Q., Zhao, C., Ye, N., Chen, Y.: Multi-weight vector projection support vector machines. Pattern Recogn. Lett. 31(13), 2006–2011 (2010)
Zhang, H., Wu, D., Nie, F., Wang, R., Li, X.: Multilevel projections with adaptive neighbor graph for unsupervised multi-view feature selection. Inf. Fusion 70, 129–140 (2021)
Zhong, W., Chen, X., Nie, F., Huang, J.Z.: Adaptive discriminant analysis for semi-supervised feature selection. Inf. Sci. 566, 178–194 (2021)
Acknowledgments
This work was supported in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant Nos. 19KJA550002 and 19KJA610002, by the Priority Academic Program Development of Jiangsu Higher Education Institutions, and by the Collaborative Innovation Center of Novel Software Technology and Industrialization.
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Xue, Y., Zhang, L. (2022). Laplacain Pair-Weight Vector Projection with Adaptive Neighbor Graph for Semi-supervised Learning. In: Zhang, H., et al. Neural Computing for Advanced Applications. NCAA 2022. Communications in Computer and Information Science, vol 1637. Springer, Singapore. https://doi.org/10.1007/978-981-19-6142-7_18