Abstract
This study applies fuzzy set theory to least squares support vector machines (LS-SVM) and proposes a novel formulation called the fuzzy hyperplane based least squares support vector machine (FH-LS-SVM). The proposed FH-LS-SVM has two key characteristics: it assigns a fuzzy membership degree to every data vector according to its importance, and the parameters of the hyperplane, such as the elements of the normal vector and the bias term, are fuzzy variables. The proposed fuzzy hyperplane efficiently captures the ambiguous nature of real-world classification tasks by representing vagueness in the observed data using fuzzy variables. The fuzzy hyperplane also significantly decreases the effect of noise: noise increases the ambiguity (spread) of the fuzzy hyperplane, but the center of the hyperplane is unaffected. Experimental results for benchmark data sets and real-world classification tasks show that the proposed FH-LS-SVM model retains the advantages of an LS-SVM, namely a simple, fast and highly generalizable model, and increases fault tolerance and robustness by using fuzzy set theory.
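To illustrate the idea of assigning a membership degree to each data vector according to its importance, the following sketch uses a distance-to-class-center scheme, a common choice in the fuzzy SVM literature (cf. Lin and Wang 2002). It is illustrative only and is not the exact assignment rule used by FH-LS-SVM; the function name and the constant `delta` are hypothetical.

```python
import numpy as np

def membership_by_class_center(X, y):
    """Assign a fuzzy membership mu_i in (0, 1] per sample.

    Samples far from their class mean receive a low degree, so
    outliers and noisy points contribute less to training.
    Illustrative scheme, not the paper's exact rule.
    """
    delta = 1e-6  # keeps memberships strictly positive
    mu = np.empty(len(y))
    for label in np.unique(y):
        idx = (y == label)
        center = X[idx].mean(axis=0)          # class mean
        d = np.linalg.norm(X[idx] - center, axis=1)
        r = d.max() + delta                   # class radius
        mu[idx] = 1.0 - d / r                 # 1 at center, ~0 at radius
    return mu
```

A point lying far from its class center (a likely outlier) thus receives a membership close to zero, while typical points receive a membership close to one.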
References
An W, Liang M (2013) Fuzzy support vector machine based on within-class scatter for classification problems with outliers or noises. Neurocomputing 110(6):101–110
Blake CL, Merz CJ (1998) UCI repository of machine learning databases. Univ. California, Dept. Inform. Comput. Sci., Irvine, CA. Available: http://kdd.ics.uci.edu/
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Chen S-G, Wu X-J (2018) A new fuzzy twin support vector machine for pattern classification. Int J Mach Learn Cybern 9:1553–1564
Chen S, Cao J, Chen F, Liu B (2020) Entropy-based fuzzy least squares twin support vector machine for pattern classification. Neural Process Lett 51:41–66
Chiang J-H, Hao P-Y (2003) A new kernel-based fuzzy clustering approach: support vector clustering with cell growing. IEEE Trans on Fuzzy Syst 11(4):518–527
Day M-Y, Lee C-C (2016) Deep learning for financial sentiment analysis on finance news providers. In: 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 1127–1134
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Do TN (2021) Multi-class bagged proximal support vector machines for the ImageNet challenging problem. In: Dang TK, Küng J, Chung TM, Takizawa M (eds) Future data and security engineering. FDSE 2021. Lecture notes in computer science, vol 13076. Springer, Cham, pp 99–112
Do T-N, Le Thi HA (2022) Training support vector machines for dealing with the ImageNet challenging problem. In: Le Thi HA, Pham Dinh T, Le HM (eds) Modelling, computation and optimization in information systems and management sciences. MCO 2021. Lecture notes in networks and systems, vol 363, Springer, Cham, pp 235–246
Do TN, Poulet F (2017) Parallel learning of local SVM algorithms for classifying large datasets, In: Transactions on large-scale data- and knowledge-centered systems, vol XXXI, pp 67–93
Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9(4):1871–1874
Fletcher R (1987) Practical methods of optimization. John Wiley and Sons, Chichester
Gite S, Khatavkar H, Kotecha K, Srivastava S, Maheshwari P, Pandey N (2021) Explainable stock prices prediction from financial news articles using sentiment analysis. PeerJ Comput Sci 7:e340. https://doi.org/10.7717/peerj-cs.340
Haddoud M, Mokhtari A, Lecroq T, Abdeddaïm S (2016) Combining supervised term-weighting metrics for SVM text classification with extended term representation. Knowl Inf Syst 49(3):909–931
Hao P-Y (2016) Support vector classification with fuzzy hyperplane. J Intell Fuzzy Syst 30(3):1431–1443
Hao P-Y (2021) Asymmetric possibility and necessity regression by twin support vector networks. IEEE Trans Fuzzy Syst 29(10):3028–3042
Hao P-Y, Kung C-F, Chang C-Y, Ou J-B (2021) Predicting stock price trends based on financial news articles and using a novel twin support vector machine with fuzzy hyperplane. Appl Soft Comput 98:106806
Hao P-Y, Chiang J-H, Chen Y-D (2022) Possibilistic classification by support vector networks. Neural Netw 149:40–56
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR 2016), Las Vegas, NV, USA, 2016, pp 770–778. https://doi.org/10.1109/CVPR.2016.90
Huang J-L et al (2012) Establishment of a Chinese dictionary of language exploration and word counting. Chin J Psychol 54(2):185–201
Iman RL, Davenport JM (1980) Approximations of the critical region of the fbietkan statistic. Commun Stat Theory Methods 9(6):571–595
Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910
Klir GJ, Yuan B (1995) Fuzzy sets and fuzzy logic: theory and applications. Prentice-Hall, New Jersey
Kreßel UHG (1999) Pairwise classification and support vector machines. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods. MIT Press, Cambridge, pp 255–268
Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Kumar A, Singh SK, Saxena S, Singh AK, Shrivastava S, Lakshmanan K, Kumar N, Singh RK (2021) CoMHisP: a novel feature extractor for histopathological image classification based on fuzzy SVM with within-class relative density. IEEE Trans Fuzzy Syst 29(1):103–117. https://doi.org/10.1109/TFUZZ.2020.2995968
Laxmi S, Gupta SK (2020) Intuitionistic fuzzy proximal support vector machines for pattern classification. Neural Process Lett 51:2701–2735
Li K, Ma HY (2013) A fuzzy twin support vector machine algorithm. Int J Appl Innov Eng Manag 2(3):459–465
Li Q, Tan J, Wang J, Chen H (2021) A multimodal event-driven LSTM model for stock prediction using online news. IEEE Trans Knowl Data Eng 33(10):3323–3337
Lin C-F, Wang S-D (2002) Fuzzy support vector machines. IEEE Trans Neural Netw 13(2):464–471
Michie D, Spiegelhalter DJ, Taylor CC (1994) Machine learning, neural and statistical classification. Ellis Horwood. Available: http://www.maths.leeds.ac.uk/~charles/statlog/
Nasiri JA, Charkari NM, Jalili S (2015) Least squares twin multi-class classification support vector machine. Pattern Recogn 48(3):984–992
Pinheiro LDS, Dras M (2017) Stock market prediction with deep learning: a character-based neural language model for event-based trading. In: Proceedings of Australasian language technology association workshop, pp 6–15
Prechelt L (1994) PROBEN 1—a set of neural network benchmark problems and benchmarking rules. Technical Report 21/94. Fakultat fur Informatik, Universitat Karlsruhe, D-76128, Karlsruhe, Germany. Anonymous FTP: pub/papers/techreports/1994/1994-21.ps.Z on https://ftp.ira.uka.de
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2014) ImageNet large scale visual recognition challenge. arXiv:1409.0575
Saigo H, Vert J-P, Ueda N, Akutsu T (2004) Protein homology detection using string alignment kernels. Bioinformatics 20(11):1682–1689
Schölkopf B, Smola AJ, Williamson R, Bartlett PL (2000) New support vector algorithms. Neural Comput 12(5):1207–1245
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: The 3rd international conference on learning representations (ICLR2015). https://arxiv.org/abs/1409.1556
Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9:293–300
Szegedy C, Liu W, Jia Y et al (2015a) Going deeper with convolutions. In: 2015 IEEE Conference on computer vision and pattern recognition (CVPR), Boston, MA, USA, 2015, pp 1–9
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2015b) Rethinking the inception architecture for computer vision. CoRR arXiv:1512.00567
Tanaka H, Uejima S, Asai K (1982) Linear regression analysis with fuzzy model. IEEE Trans Syst Man Cybern 12(6):903–907
Tanveer M, Sharma A, Suganthan PN (2021) Least squares KNN-based weighted multiclass twin SVM. Neurocomputing 459(12):454–464
Tao X, Li Q, Ren C, Guo W, He Q, Liu R, Zou J (2020) Affinity and class probability-based fuzzy support vector machine for imbalanced data sets. Neural Netw 122:289–307
Tsujinishi D, Abe S (2003) Fuzzy least squares support vector machines for multiclass problems. Neural Netw 16:785–792
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Yasoda K, Ponmagal RS, Bhuvaneshwari KS, Venkatachalam K (2020) Automatic detection and classification of EEG artifacts using fuzzy kernel SVM and wavelet ICA (WICA). Soft Comput. https://doi.org/10.1007/s00500-020-04920-w
Yu L (2014) Credit risk evaluation with a least squares fuzzy support vector machines classifier. Discret Dyn Nat Soc 1:1–9
Yu J, Tao D, Wang M (2012) Adaptive hypergraph learning and its application in image classification. IEEE Trans Image Process 21(7):3262–3272
Yu J, Tan M, Zhang H, Rui Y, Tao D (2022) Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell 44(02):563–578
Yun H, Sim G, Seok J (2019) Stock prices prediction using the title of newspaper articles with Korean natural language processing. In: International conference on artificial intelligence in information and communication (ICAIIC)
Zhang S, Zhao S, Sui Y, Zhang L (2015) Single Object tracking with fuzzy least squares support vector machine. IEEE Trans Image Process 24(12):5723–5738
Zhang S, Lu W, Xing W, Zhang L (2018) Using fuzzy least squares support vector machine with metric learning for object tracking. Pattern Recogn 84:112–125
Zhang S, Zhang L, Hauptmann AG (2020) Fuzzy least squares support vector machine with adaptive membership for object tracking. IEEE Trans Multimed 22(8):1998–2011
Zhang J, Cao Y, Wu Q (2021) Vector of locally and adaptively aggregated descriptors for image feature representation. Pattern Recogn 116:107952
Zhang J, Yang J, Yu J, Fan J (2022) Semisupervised image classification by mutual learning of multiple self-supervised models. Int J Intell Syst 37(5):3117–3141
Zhao W, Zhang J, Li K (2015) An efficient LS-SVM-based method for fuzzy system construction. IEEE Trans Fuzzy Syst 23(3):627–643
Acknowledgements
This research work was supported in part by the Ministry of Science and Technology under Research Grant MOST 111-2221-E-992-071.
Author information
Authors and Affiliations
Contributions
Chien-Feng Kung and Pei-Yi Hao wrote the main manuscript text and Pei-Yi Hao prepared the experiments. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Derivation of Eq. (25)
The optimal solution for Eq. (24) lies on the saddle point of the following Lagrangian function:
where α1i and α2i (i = 1, …, N) are Lagrange multipliers. By the Kuhn–Tucker conditions [13], the Lagrange multipliers α1i and α2i can be either negative or positive because equality constraints are used. Calculating the first-order derivatives of L with respect to w, c, b, d, ξ1i, ξ2i, α1i and α2i gives the Kuhn–Tucker conditions for optimality:
Equation (A.8) then gives a set of linear equations that is expressed as:
where Y, 1, I, w and ξ1 are, respectively:
Let \({\mathbf{Z}} = (y_{1} \Phi ({\mathbf{x}}_{1} ), \ldots ,y_{N} \Phi ({\mathbf{x}}_{N} ))^{t}\) and \({\mathbf{G}} = (\Phi (|{\mathbf{x}}_{1} |), \ldots ,\Phi (|{\mathbf{x}}_{N} |))^{t}\). Then Eq. (A.10) becomes:
Equations (A.2) and (A.3) give the following equations:
where \({{\varvec{\upalpha}}}_{1} = (\alpha_{11} , \ldots ,\alpha_{1N} )^{t}\) and \({{\varvec{\upalpha}}}_{2} = (\alpha_{21} , \ldots ,\alpha_{2N} )^{t}\). Substituting Eqs. (A.6) and (A.16)–(A.17) into the matrix equation in (A.15) gives:
where \({\mathbf{S}} = diag\left( {\frac{1}{{C\mu_{1} }}, \ldots ,\frac{1}{{C\mu_{N} }}} \right)\) is an N × N diagonal matrix.
Similarly, Eq. (A.9) gives a set of linear equations that is expressed as:
With \({\mathbf{Z}}\) and \({\mathbf{G}}\) as defined above, Eq. (A.19) becomes:
Substituting Eqs. (A.7), (A.16) and (A.17) into the matrix equation in (A.20) then gives:
Equations (A.4) and (A.5) are arranged in matrix form as:
Therefore, the optimal fuzzy separating hyperplane of FH-LS-SVM is determined by solving the set of linear equations in Eqs. (A.18) and (A.21)–(A.23), rather than by solving a quadratic programming problem (QPP). This reduces the computational complexity, especially for large-scale problems. In matrix form, Eqs. (A.18) and (A.21)–(A.23) are expressed as:
where
and
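To illustrate the computational advantage of solving a linear system instead of a QPP, the following sketch solves a membership-weighted LS-SVM in its classic dual form (Suykens and Vandewalle 1999), with the per-sample regularizer 1/(Cμi) on the diagonal mirroring the matrix S defined above. This is a simplified stand-in, not the full FH-LS-SVM system of Eqs. (A.18) and (A.21)–(A.23); the function names and the choice of an RBF kernel are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix K[i, j] = exp(-gamma * ||a_i - b_j||^2)
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

def fuzzy_ls_svm_fit(X, y, mu, C=10.0, gamma=1.0):
    """Solve a membership-weighted LS-SVM by one linear system.

    Block system (classic LS-SVM dual with fuzzy weights):
        [[0, y^T], [y, Omega + S]] [b; alpha] = [0; 1]
    where Omega[i, j] = y_i y_j K(x_i, x_j) and
    S = diag(1/(C mu_1), ..., 1/(C mu_N)).
    """
    N = len(y)
    Omega = (y[:, None] * y[None, :]) * rbf_kernel(X, X, gamma)
    S = np.diag(1.0 / (C * mu))            # fuzzy-membership weights
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + S
    rhs = np.concatenate(([0.0], np.ones(N)))
    sol = np.linalg.solve(A, rhs)          # linear solve, no QPP
    return sol[1:], sol[0]                 # alpha, b

def fuzzy_ls_svm_predict(X_train, y, alpha, b, X_test, gamma=1.0):
    K = rbf_kernel(X_test, X_train, gamma)
    return np.sign(K @ (alpha * y) + b)
```

A single `np.linalg.solve` call (O(N^3) in the worst case, but with highly optimized dense solvers) replaces iterative QP optimization, which is the source of the speed advantage the appendix notes for large-scale problems.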
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kung, CF., Hao, PY. Fuzzy Least Squares Support Vector Machine with Fuzzy Hyperplane. Neural Process Lett 55, 7415–7446 (2023). https://doi.org/10.1007/s11063-023-11267-4
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11063-023-11267-4