1 Introduction

Clustering, which discovers the relationships among data samples, has been applied to many real-world problems, e.g., marketing, text mining and web analysis [1,2,3,4,5,6,7]. In particular, partition-based clustering methods [1, 8, 9] are widely used in real applications for their simplicity, e.g., the classical kmeans [10]. Partition-based clustering supposes there are k clusters, each with a cluster center, and assigns the data samples to these clusters according to their distances from the cluster centers. Kmeans assumes the cluster centers are points in the sample space, but other types of centers may fit a dataset better, e.g., planes in the “cross-planes” dataset [11,12,13]. As an extension of point centers, plane centers are able to capture richer structures in the sample space. Therefore, partition-based clustering with planes as cluster centers, called plane-based clustering, has been studied widely [14,15,16,17,18,19,20].

In plane-based clustering, starting from an initialization, the clustering result is obtained iteratively through alternating cluster assignment and cluster update steps. In the cluster assignment step, the samples are assigned to clusters according to their distances from the cluster center planes. In the cluster update step, the cluster center planes are obtained as the solutions of k optimization problems, which differ across the various plane-based methods. In general, the optimization problems capture two aspects: within-cluster compactness and between-cluster separation. The first proposed k-plane clustering (kPC) [14] measured the within-cluster scatter by a quadratic function, and proximal planes clustering (PPC) [15, 16] measured both the within-cluster and between-cluster scatters by the same quadratic function. Subsequently, twin support vector clustering (TWSVC) [17] improved the between-cluster scatter with the hinge function [12, 21], while robust twin support vector clustering (RTWSVC) [18] and fast robust twin support vector k-plane clustering (FRTWSVC) [18] improved both scatters with the absolute value function. Figure 1 exhibits the metric functions used in these methods. Since these plane-based clustering methods measure the within-cluster and/or between-cluster scatter by unbounded functions, some samples may influence the cluster center planes significantly, e.g., samples far from their corresponding cluster centers.

Fig. 1

Metric functions used in kPC, PPC, TWSVC, RTWSVC, FRTWSVC and RampTWSVC to measure the within-cluster and between-cluster scatters. The horizontal axis denotes the deviation of a sample from the cluster center, and the vertical axis denotes the cost to fit that sample. When a cost value is negative, the cost becomes a reward

To further reduce the influence of the samples far from the cluster center, a better choice is to employ bounded functions, instead of unbounded ones, to measure the within-cluster and between-cluster scatters. Therefore, we introduce the ramp functions [22] into plane-based clustering; ramp functions are bounded piecewise linear functions that have been applied successfully in semi-supervised and supervised learning [23,24,25]. In this paper, a ramp-based twin support vector clustering method, called RampTWSVC, is proposed by using ramp functions to measure the within-cluster and between-cluster scatters (see Fig. 1). RampTWSVC reduces the influence of the samples far from the cluster centers and thereby obtains stable cluster center planes. The optimization problems in our RampTWSVC are nonconvex programming problems that are more complicated than those of other plane-based clustering methods. These problems are recast as mixed-integer programming problems of the same form. An iterative algorithm is then proposed to solve the mixed-integer programming problem, and we prove that it terminates in a finite number of iterations at a local solution. In addition, RampTWSVC is extended to the nonlinear case via a kernel trick to cope with manifold clustering [26, 27]. Experimental results on benchmark datasets show that the proposed RampTWSVC performs better than other plane-based clustering methods.

2 Review of plane-based clustering

In this paper, we consider m data samples \(\{x_1,\,x_2,\ldots ,x_m\}\) in the n-dimensional real vector space \({\mathbb {R}}^{n}\). Assume these m samples belong to k clusters with corresponding labels \(y\in \{1,\,2,\ldots ,k\}\), and they are represented by a matrix \(X=(x_1,x_2,\ldots ,x_m)\in {\mathbb {R}}^{n\times m}\). We further organize the samples from X with the current label i into a matrix \(X_i\) and those with the remaining labels into a matrix \({\hat{X}}_i\), for \(i=1,\,2,\ldots ,k\). For readers’ convenience, the symbols X, \(X_i\), and \({\hat{X}}_i\) will also refer to the corresponding sets, depending on the context in which they appear. For example, the symbol X can be understood as a matrix in \({\mathbb {R}}^{n\times m}\) or as a set containing m samples. The cluster center planes are defined as

$$\begin{aligned} \begin{array}{l} f_i(x)=w_i^\top x+b_i=0,\,\,i=1,\ldots ,k, \end{array} \end{aligned}$$
(1)

where \(w_i\in {\mathbb {R}}^n\) is the weight vector and \(b_i\in {\mathbb {R}}\) is the bias.

The following plane-based clustering methods share the same clustering procedure. Starting from an initial assignment of the m samples to k clusters, the cluster center planes (1) are constructed from the current cluster assignment. Once the cluster center planes are obtained, the m samples are reassigned by

$$\begin{aligned} \begin{array}{l} y=\underset{i}{\arg \min }\,\,|w_i^\top x+b_i|, \end{array} \end{aligned}$$
(2)

where \(|\cdot |\) denotes the absolute value. The cluster center planes and the sample labels are updated alternately until some termination conditions are satisfied. In the following, we briefly describe the different constructions of the cluster center planes in kPC, PPC, TWSVC and RTWSVC.
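To make the procedure concrete, the reassignment rule (2) amounts to a few lines of linear algebra. The following is a minimal NumPy sketch (the variable names are ours, chosen for illustration, and this is not the released code):

```python
import numpy as np

def assign_labels(X, W, b):
    """Rule (2): assign each sample to the cluster whose plane is nearest.

    X: n x m samples (columns); W: n x k with columns w_i; b: length-k biases.
    """
    dev = np.abs(W.T @ X + b[:, None])   # k x m deviations |w_i^T x_j + b_i|
    return np.argmin(dev, axis=0)        # label of each sample
```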

2.1 kPC and PPC

kPC [14] requires each cluster center plane to be close to the samples of its own cluster. PPC [15] further requires it to be far from the samples of the other clusters. Therefore, the ith (\(i=1,\ldots ,k\)) cluster center plane in PPC is constructed by solving the problem

$$\begin{aligned} \begin{array}{l} \underset{w_i,b_i}{\min }\,\, ||X_i^\top w_i+b_ie||^2-c||{\hat{X}}_i^\top w_i+b_ie||^2\\ \mathrm{s.t.}\,\,\,\,||w_i||=1, \end{array} \end{aligned}$$
(3)

where \(||\cdot ||\) denotes the \(L_2\)-norm, e is a column vector of ones with appropriate dimension and \(c>0\) is a user-set parameter. The optimization problem in kPC omits the second term in the objective of (3).

From the objective of (3), it is clear that within-cluster samples receive quadratic costs and between-cluster samples receive quadratic rewards. Therefore, samples far from the cluster center have a large impact on the resulting cluster center planes, and PPC may obtain a cluster center plane that lies far from the current cluster.
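For intuition, the kPC subproblem (the objective of (3) without its second term) admits a well-known closed-form update [14]: for fixed \(w_i\) the optimal bias is \(b_i=-{\bar{x}}^\top w_i\), where \({\bar{x}}\) is the cluster mean, and \(w_i\) is then an eigenvector of the centered within-cluster scatter matrix corresponding to its smallest eigenvalue. A minimal NumPy sketch of this update (illustrative, not the authors' code):

```python
import numpy as np

def kpc_plane(Xi):
    """kPC update for one cluster; Xi is n x m_i with samples as columns."""
    xbar = Xi.mean(axis=1, keepdims=True)   # cluster mean
    C = Xi - xbar                           # centered samples
    B = C @ C.T                             # within-cluster scatter matrix
    vals, vecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    w = vecs[:, 0]                          # eigenvector of smallest eigenvalue
    b = -float(xbar[:, 0] @ w)              # optimal bias b = -xbar^T w
    return w, b
```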

2.2 TWSVC and RTWSVC

In contrast, TWSVC [17] caps the rewards of the samples from different clusters by solving, for \(i=1,\ldots ,k\), the problem

$$\begin{aligned} \begin{array}{l} \underset{w_{i},b_{i}}{\min }\,\,\frac{1}{2}\Vert X_i^\top w_{i}+b_{i}e\Vert ^{2}+ce^{\top }(e-|{\hat{X}}_i^\top w_{i}+b_{i}e|)_+, \end{array} \end{aligned}$$
(4)

where \((\cdot )_+\) replaces the negative parts by zeros.

From the second part of (4), it is clear that between-cluster samples affect the cluster center plane only when their deviations lie in [0, 1). Thus, TWSVC is more robust than PPC [17]. However, the same issue persists for the within-cluster samples because of the quadratic function in the first part of (4). RTWSVC [18] was therefore proposed to decrease this influence by replacing the \(L_2\)-norm in (4) with the \(L_1\)-norm. Note from Fig. 1 that the within-cluster metric function of RTWSVC is still unbounded. In order to eliminate the influence of the samples far from the cluster center, it is reasonable to employ a bounded metric function for the within-cluster scatter, following the same principle as the between-cluster metric function used in TWSVC.

3 RampTWSVC

Like the plane-based clustering methods reviewed in Sect. 2, our RampTWSVC starts with initial sample labels and then updates the cluster center planes and the sample labels alternately, until some termination condition is satisfied. In the following, we consider how to obtain one of the cluster center planes for given samples and labels.

3.1 Formulation

To obtain the ith (\(i=1,\ldots ,k\)) cluster center plane, our RampTWSVC considers the following problem

$$\begin{aligned} \begin{array}{l} \underset{w_i,b_i}{\min }\,\,c_1\sum \limits _{x_j\in X_i}R_{1}(x_j)+c_2\sum \limits _{x_j\in {\hat{X}}_i}R_{2}({x}_j)+\frac{1}{2}(||w_i||^2+||b_i||^2), \end{array} \end{aligned}$$
(5)

where \(c_1,c_2>0\) are trade-off parameters. \(R_{1}(x)\) and \(R_{2}(x)\) are two piecewise linear functions of the deviation \(|f_i(x)|=|w_i^\top x+b_i|\) (see Fig. 1):

$$\begin{aligned} R_{1}(x)=\left\{ \begin{array}{ll} 0&\quad \text {if}\,\,|f_i(x)|\le 1-\Delta ,\\ 1-s&\quad \text {if}\,\,|f_i(x)|\ge 2-\Delta -s,\\ |f_i(x)|-1+\Delta &\quad \text {otherwise}, \end{array}\right. \end{aligned}$$
(6)

and

$$\begin{aligned} R_{2}(x)=\left\{ \begin{array}{ll} 2+2\Delta &\quad \text {if}\,\,|f_i(x)|\le -s,\\ 1+\Delta -s&\quad \text {if}\,\,|f_i(x)|\ge 1+\Delta ,\\ -|f_i(x)|+2+2\Delta -s&\quad \text {otherwise}, \end{array}\right. \end{aligned}$$
(7)

where \(\Delta \in [0,1)\) and \(s\in (-1,0]\) are two constants that control the shape of the functions (for instance, \(\Delta =0.3\) and \(s=-0.2\) [23]). The terms \(||w_i||^2\) and \(||b_i||^2\) are regularization terms that control the model complexity.
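To make the piecewise definitions (6) and (7) concrete, the following is a direct NumPy transcription, using the illustrative constants \(\Delta =0.3\) and \(s=-0.2\) mentioned above (a sketch for exposition only):

```python
import numpy as np

def ramp_within(f, delta=0.3, s=-0.2):
    """R1 in (6): bounded cost of within-cluster deviations f = f_i(x)."""
    a = np.abs(f)
    return np.where(a <= 1 - delta, 0.0,
           np.where(a >= 2 - delta - s, 1 - s, a - 1 + delta))

def ramp_between(f, delta=0.3, s=-0.2):
    """R2 in (7): bounded cost of between-cluster deviations."""
    a = np.abs(f)
    return np.where(a <= -s, 2 + 2 * delta,
           np.where(a >= 1 + delta, 1 + delta - s, -a + 2 + 2 * delta - s))
```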

It is obvious that both \(R_1(x)\) (for the within-cluster set \(X_i\)) and \(R_2(x)\) (for the between-cluster set \({\hat{X}}_i\)) are bounded above and below. Thus, samples far from the cluster center plane cannot exert an arbitrarily large influence on it. This property indicates that our RampTWSVC is more robust than PPC, TWSVC and RTWSVC.

In the following, we extend our RampTWSVC to nonlinear manifold clustering, and the solutions to the linear and nonlinear RampTWSVC are elaborated in the next subsection.

The plane-based clustering methods can be extended to nonlinear manifold clustering easily via kernel tricks [13, 28]. For instance, by introducing a pre-defined kernel function \(K(\cdot ,X)\) [11, 17, 21], plane-based nonlinear clustering seeks k cluster center manifolds in the kernel-generated space as

$$\begin{aligned} \begin{array}{l} g_i(x)=K(x,X)^\top w_i+b_i=0,\,i=1,\ldots ,k. \end{array} \end{aligned}$$
(8)

Then, nonlinear RampTWSVC introduces the ramp functions into plane-based nonlinear clustering. Replacing \(f_i(x)\) with \(g_i(x)\) in (6) and (7) and substituting them into (5) yields the optimization problems that construct the cluster center manifolds (8). Once the k cluster centers (8) are obtained, a sample x is assigned to a cluster by

$$\begin{aligned} y=\underset{i}{\arg \min }\,|K(x,X)^\top w_i+b_i|. \end{aligned}$$
(9)

The procedure of the nonlinear case is the same as the linear one, so the details are omitted.
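As an illustration, the assignment rule (9) under the Gaussian kernel used in Sect. 4 can be sketched as follows. Here \(K(x,X)\) is the m-vector of kernel evaluations against the training samples, as in (8); the helper names are ours and \(\mu\) is a user-set parameter:

```python
import numpy as np

def gaussian_kernel(A, B, mu):
    """Gaussian kernel matrix; A: n x p, B: n x q -> p x q."""
    sq = ((A[:, :, None] - B[:, None, :]) ** 2).sum(axis=0)
    return np.exp(-mu * sq)

def assign_label_nonlinear(x, X, W, b, mu):
    """Rule (9): X is n x m training samples, W is m x k (columns w_i)."""
    kx = gaussian_kernel(X, x[:, None], mu)[:, 0]   # the m-vector K(x, X)
    dev = np.abs(W.T @ kx + b)                      # |K(x,X)^T w_i + b_i|
    return int(np.argmin(dev))
```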

3.2 Solution

In this subsection, we study the solution to problem (5); the corresponding problem for nonlinear RampTWSVC is solved similarly. For convenience, let \(u_i=(w_i^\top ,b_i)^\top\), let \(Z_i\) be the matrix whose jth column \(z_j\) is \(x_j\) augmented with an additional feature 1 (for each \(x_j\) belonging to the ith cluster), and let \({\hat{Z}}_i\) be the analogous matrix built from the samples that do not belong to the ith cluster. After some algebra, problem (5) is recast as

$$\begin{aligned} \begin{array}{l} \underset{u_i}{\min }\,\,\frac{1}{2}||u_i||^2+c_1e^\top (-1+\Delta -Z_i^\top u_i)_+ +c_1e^\top (-1+\Delta \\ \quad +\,Z_i^\top u_i)_++c_2e^\top (1+\Delta -{\hat{Z}}_i^\top u_i)_+ +c_2e^\top (1+\Delta \\ \quad +\,{\hat{Z}}_i^\top u_i)_+-c_1e^\top (s-2+\Delta -Z_i^\top u_i)_+ -c_1e^\top (s-2+\Delta \\ \quad +\,Z_i^\top u_i)_+-c_2e^\top (s-{\hat{Z}}_i^\top u_i)_+ -c_2e^\top (s+{\hat{Z}}_i^\top u_i)_+. \end{array} \end{aligned}$$
(10)

It is easy to see that the above problem is a nonconvex programming problem because of the concave parts \(-(\cdot )_+\). By introducing two auxiliary vectors \(p_1\in \{-1,0,1\}^{m_i}\) and \(p_2\in \{-1,0,1\}^{m-m_i}\), where \(m_i\) is the number of samples in the current ith cluster, the above problem is equivalent to the following mixed-integer programming problem

$$\begin{aligned} \begin{array}{l} \underset{u_i,p_1,p_2}{\min }\,\,\frac{1}{2}||u_i||^2+c_1e^\top (-1+\Delta -Z_i^\top u_i)_+\\ \quad +\,c_1e^\top (-1+\Delta +Z_i^\top u_i)_++c_2e^\top (1+\Delta -{\hat{Z}}_i^\top u_i)_+\\ \quad +\,c_2e^\top (1+\Delta +{\hat{Z}}_i^\top u_i)_++c_1p_1^\top Z_i^\top u_i +c_2p_2^\top {\hat{Z}}_i^\top u_i\\ \mathrm{s.t.}\,\,\,\,p_1(j)=\left\{ \begin{array}{ll} -1&\quad \text {if}\,\,z_j^\top u_i>2-\Delta -s,\\ 1&\quad \text {if}\,\,z_j^\top u_i<-2+\Delta +s,\\ 0&\quad \text {otherwise}, \end{array}\right. \forall z_j\in Z_i\\ \quad \,\,\,\,\,\,p_2(j)=\left\{ \begin{array}{ll} -1&\quad \text {if}\,\, z_j^\top u_i>-s,\\ 1&\quad \text {if}\,\,z_j^\top u_i<s,\\ 0&\quad \text {otherwise}, \end{array}\right. \forall z_j\in {\hat{Z}}_i \end{array} \end{aligned}$$
(11)

where \(p_1(j)\) and \(p_2(j)\) are the corresponding jth elements of \(p_1\) and \(p_2\), respectively.

Here, we propose an alternating iteration algorithm to solve the mixed-integer programming problem (11). Starting with an initialized \(u_i^{(0)}\), it is easy to compute \(p_1^{(0)}\) and \(p_2^{(0)}\) from the constraints of (11). For fixed \(p_1^{(t-1)}\) and \(p_2^{(t-1)}\) (\(t=1,2,\ldots\)), (11) becomes an unconstrained convex problem whose solution can be obtained easily by many algorithms (e.g., sequential minimal optimization (SMO) [29] or the fast Newton–Armijo algorithm [30]). Once \(u_i^{(t)}\) is obtained, \(p_1^{(t)}\) and \(p_2^{(t)}\) are updated again. The loop continues until the objective of (11) no longer decreases.
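For clarity, the following is a minimal NumPy sketch of the alternating iteration, in which a plain subgradient method stands in for the SMO or Newton–Armijo solvers cited above. This is an illustrative reimplementation under our own naming, not the released MATLAB code:

```python
import numpy as np

def plus(v):
    """Elementwise (.)_+: replace negative parts by zeros."""
    return np.maximum(v, 0.0)

def update_p(u, Z, Zh, delta, s):
    """Integer update: evaluate the constraints of (11) at the current u."""
    zu, zhu = Z.T @ u, Zh.T @ u
    p1 = np.where(zu > 2 - delta - s, -1.0, np.where(zu < -2 + delta + s, 1.0, 0.0))
    p2 = np.where(zhu > -s, -1.0, np.where(zhu < s, 1.0, 0.0))
    return p1, p2

def G(u, Z, Zh, p1, p2, c1, c2, delta):
    """Objective of (11) for fixed p1, p2."""
    zu, zhu = Z.T @ u, Zh.T @ u
    return (0.5 * u @ u
            + c1 * np.sum(plus(-1 + delta - zu) + plus(-1 + delta + zu))
            + c2 * np.sum(plus(1 + delta - zhu) + plus(1 + delta + zhu))
            + c1 * p1 @ zu + c2 * p2 @ zhu)

def solve_convex(u0, Z, Zh, p1, p2, c1, c2, delta, steps=2000):
    """Subgradient descent on the convex subproblem (stand-in for SMO/Newton)."""
    u, best, best_val = u0.copy(), u0.copy(), np.inf
    for t in range(1, steps + 1):
        zu, zhu = Z.T @ u, Zh.T @ u
        g = (u
             + c1 * Z @ ((zu > 1 - delta).astype(float)
                         - (zu < -1 + delta).astype(float) + p1)
             + c2 * Zh @ ((zhu > -1 - delta).astype(float)
                          - (zhu < 1 + delta).astype(float) + p2))
        u = u - g / np.sqrt(t)                 # diminishing step size
        val = G(u, Z, Zh, p1, p2, c1, c2, delta)
        if val < best_val:
            best, best_val = u.copy(), val
    return best

def ramp_plane(Z, Zh, c1, c2, delta=0.3, s=-0.2, max_iter=50, tol=1e-6):
    """Alternate between u and (p1, p2) until the objective stops decreasing.

    Z, Zh: (n+1) x m_i and (n+1) x (m - m_i); columns are samples (x_j, 1).
    """
    u, prev = 0.01 * np.ones(Z.shape[0]), np.inf
    for _ in range(max_iter):
        p1, p2 = update_p(u, Z, Zh, delta, s)
        u = solve_convex(u, Z, Zh, p1, p2, c1, c2, delta)
        val = G(u, Z, Zh, p1, p2, c1, c2, delta)
        if prev - val < tol:
            break
        prev = val
    return u                                   # u = (w_i^T, b_i)^T
```

Any convex solver can replace `solve_convex`; the integer update `update_p` mirrors the constraints of (11) exactly, which is what drives the finite termination argument in Theorem 3.1 below.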

Theorem 3.1

The above alternating iteration algorithm terminates in a finite number of iterations at a local optimal point, where a local optimal point of the mixed-integer programming problem (11) is defined as a point \((u_i^*,p_1^*,p_2^*)\) such that \(u_i^*\) is the global solution to problem (11) with fixed \((p_1^*,p_2^*)\), and vice versa.

Proof

From the procedure of the alternating iteration algorithm, it is clear that the global solutions to (11) with fixed \(u_i\) or fixed \((p_1,p_2)\) are obtained in each iteration. Since there are only finitely many choices of \(p_1\) and \(p_2\), there exist two finite numbers \(0<r_1<r_2\) such that \((p_1^{(r_1)},p_2^{(r_1)})=(p_1^{(r_2)},p_2^{(r_2)})\). Thus, we have \(u_i^{(r_1)}=u_i^{(r_2)}\), i.e., the objective values in the \(r_1\)th and \(r_2\)th iterations are equal. Since \(p_1^\top Z_i^\top u_i\le 0\) and \(p_2^\top {\hat{Z}}_i^\top u_i\le 0\) always hold, the objective value of (11) is nonincreasing over the iterations. Therefore, the objective is invariant after the \(r_1\)th iteration, and the algorithm terminates at the \(r_1\)th iteration.

Let us consider the point \((u_i^{(r_1)},p_1^{(r_1)},p_2^{(r_1)})\). From the above proof, we have \(G(u_i^{(r_1)},p_1^{(r_1)},p_2^{(r_1)})=G(u_i^{(r_1)},p_1^{(r_1+1)},p_2^{(r_1+1)})\), where \(G(\cdot )\) is the objective value of (11). If there is more than one global solution to (11) with fixed \(u_i\), we always select the same one for the same \(u_i\). Thus, we have \((p_1^{(r_1)},p_2^{(r_1)})=(p_1^{(r_1+1)},p_2^{(r_1+1)})\), which indicates \((u_i^{(r_1)},p_1^{(r_1)},p_2^{(r_1)})\) is a local optimal point. \(\square\)

Table 1 Details of the benchmark datasets, where m is the number of samples, n is the number of dimensions and k is the number of classes
Table 2 Linear clustering on benchmark datasets
Table 3 Nonlinear clustering on benchmark datasets

4 Experimental results

In this section, we analyze the performance of our RampTWSVC compared with kmeans [10], kPC [14], PPC [15], TWSVC [17], RTWSVC [18] and FRTWSVC [18] on several benchmark datasets [31, 32]. All the methods were implemented in MATLAB 2017a on a PC with an Intel dual-core 4.2 GHz processor and 16 GB RAM. The parameter c in PPC, TWSVC, RTWSVC and FRTWSVC and the parameters \(c_1\), \(c_2\) in RampTWSVC were selected from \(\{2^i|i=-8,-7,\ldots ,-1\}\). For the nonlinear case, the Gaussian kernel \(K(x_1,x_2)=\exp \{-\mu ||x_1-x_2||^2\}\) [28] was used, and its parameter \(\mu\) was selected from \(\{2^i|i=-10,-9,\ldots ,1\}\). Random initialization was used for kmeans, and the nearest neighbor graph (NNG) initialization [17] was used for the other methods. In the experiments, we used the accuracy (AC) [17] and mutual information (MI) [33] metrics to measure the performance of these methods. We ran kmeans ten times on each dataset and recorded the mean and standard deviation of its metrics.
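For reproducibility, the two metrics can be computed as below, assuming AC is the best-match accuracy obtained by the Hungarian algorithm (a common convention for the AC of [17]) and MI is taken as normalized mutual information; this is an illustrative sketch using SciPy and scikit-learn, not the code used in our experiments:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """AC: accuracy under the best one-to-one matching of cluster labels.

    Assumes labels are integers starting from 0.
    """
    k = int(max(y_true.max(), y_pred.max())) + 1
    count = np.zeros((k, k), dtype=int)          # contingency table
    for t, p in zip(y_true, y_pred):
        count[t, p] += 1
    rows, cols = linear_sum_assignment(-count)   # maximize matched pairs
    return count[rows, cols].sum() / len(y_true)

def mutual_information(y_true, y_pred):
    """MI: normalized mutual information between the two labelings."""
    return normalized_mutual_info_score(y_true, y_pred)
```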

Table 4 Linear clustering rank in Table 2
Table 5 Nonlinear clustering rank in Table 3
Fig. 2

Illustration of parameter influence of linear RampTWSVC on four datasets, where the parameters include \(c_1\) and \(c_2\)

Table 1 shows the details of the benchmark datasets. Tables 2 and 3 exhibit the linear and nonlinear clustering results on the benchmark datasets, respectively, with the highest metrics on each dataset in bold. From Table 2, it can be seen that our linear RampTWSVC performs better than the other linear methods on seven datasets in terms of both AC and MI, and it is more accurate than the other methods on six further datasets. On the remaining nine datasets, our linear RampTWSVC is also competitive with the best method. From Table 3, it is obvious that our nonlinear RampTWSVC achieves much higher AC and MI than the other methods on most of the datasets. More precisely, Tables 4 and 5 report the ranks of the seven methods on these datasets, derived from Tables 2 and 3, respectively, where equal numbers imply equal performance and smaller numbers imply better performance. The mean ranks are reported in the last two columns of these tables. Our RampTWSVC has the smallest mean ranks among these methods, which indicates that it performs better than the other methods on these datasets overall. It is worth noticing that there are several datasets on which other methods perform better than RampTWSVC. For example, kmeans obtains much higher performance than the others on dataset (g) in Table 2, which implies that points fit the cluster centers of dataset (g) better than planes do. Among the plane-based methods, our RampTWSVC cuts off the influence of the samples far from the cluster centers, and the cutoff bounds are decided by the parameters, so parameter selection is very important in RampTWSVC.

To evaluate the parameter influence in our RampTWSVC, we report, as examples, the AC metric under different parameter settings on four datasets. Figure 2 exhibits the results of linear RampTWSVC with parameters \(c_1\) and \(c_2\). It can be seen that these two parameters play an important role in this method; hence, they should be selected carefully. Figure 3 exhibits the results of nonlinear RampTWSVC with parameters \(c_1\), \(c_2\) and \(\mu\), where \(c_1=c_2\) for simplicity. From Fig. 3, we observe that a desirable result cannot be obtained if \(\mu\) is improper, which indicates that \(\mu\) is more important than the other two parameters for nonlinear RampTWSVC. This is because the kernel mapping plays a more important role in the nonlinear case.

Fig. 3

Illustration of parameter influence of nonlinear RampTWSVC on four datasets, where the parameters include \(c_1\), \(c_2\) and \(\mu\), and we set \(c_1=c_2\)

5 Conclusions

A plane-based clustering method (RampTWSVC) has been proposed in this paper. RampTWSVC employs bounded ramp functions to reduce the influence of the samples far from the cluster centers, which makes it robust. The nonconvex programming problems in RampTWSVC are recast as a series of mixed-integer programming problems and solved by a proposed alternating iteration algorithm, and convergence to local solutions of the nonconvex problems is guaranteed in theory. RampTWSVC has both linear and nonlinear formulations. Experimental results have indicated that our RampTWSVC performs better than other plane-based clustering methods on many benchmark datasets. For practical convenience, the corresponding MATLAB code for RampTWSVC has been uploaded to http://www.optimal-group.org/Resources/Code/RampTWSVC.html. Future work includes parameter regulation and efficient solver design for RampTWSVC.