Abstract
Short text classification is an active research topic in natural language processing. This paper proposes a new text representation model, N-of-DOC. To address the sparsity of Chinese text representations, it uses word2vec distributed representations. The resulting representation is fed to an improved convolutional neural network (CNN), whose convolutional (filter) layer extracts high-level features; a softmax classifier connected after the pooling layer yields the classification model. In the experiments, the traditional text representation model and the improved text representation model each serve as the input representation of the original data, and each is evaluated with traditional machine learning models (KNN, SVM, logistic regression, naive Bayes) and with the improved CNN. The results show that the proposed method not only addresses the curse of dimensionality and the sparsity of Chinese text vectors, but also improves classification accuracy by 10.23% compared with traditional methods.
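The pipeline named in the abstract (word vectors, a convolutional filter layer, pooling, then softmax) can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions, not the authors' model: the abstract does not specify N-of-DOC or what the CNN improvement consists of, so neither is implemented here. Random vectors stand in for trained word2vec embeddings, and the filter widths (2, 3, 4), the sizes, and all function names are hypothetical choices for illustration.

```python
import math
import random


def conv_max_pool(embedded, filt, bias):
    """Slide one filter over the token sequence, apply ReLU, keep the maximum."""
    width = len(filt)  # number of consecutive tokens the filter spans
    best = 0.0         # ReLU output is never below zero
    for start in range(len(embedded) - width + 1):
        act = bias
        for offset in range(width):
            token_vec = embedded[start + offset]
            act += sum(w * x for w, x in zip(filt[offset], token_vec))
        best = max(best, act)  # max-over-time pooling of the ReLU activations
    return best


def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(token_ids, embeddings, filters, biases, out_weights, out_bias):
    """Embedding lookup -> convolution + pooling per filter -> softmax layer."""
    embedded = [embeddings[t] for t in token_ids]
    features = [conv_max_pool(embedded, f, b) for f, b in zip(filters, biases)]
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(out_weights, out_bias)
    ]
    return softmax(scores)


# Assumption: random values stand in for trained word2vec vectors and weights.
rng = random.Random(0)
vocab_size, dim, n_classes = 20, 8, 3
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
filters = [
    [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]
    for width in (2, 3, 4)  # hypothetical filter widths, one filter each
]
biases = [0.0] * len(filters)
out_weights = [[rng.uniform(-0.5, 0.5) for _ in filters] for _ in range(n_classes)]
out_bias = [0.0] * n_classes

# A short "document" given as hypothetical token ids.
probs = classify([3, 7, 1, 12, 5], embeddings, filters, biases, out_weights, out_bias)
print(probs)
```

Because each filter is reduced to a single pooled value, the feature vector has a fixed length regardless of how many tokens the short text contains, which is what lets the softmax layer sit directly after pooling.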
Acknowledgements
First of all, I would like to thank my advisor, Professor Chen Qiaohong, for great care and help in my life and my studies. Professor Chen is virtuous, friendly, knowledgeable, and rigorous in scholarship. During my master's studies, Professor Chen taught me not only how to learn but also how to conduct myself, which will benefit me for life. Finally, I would like to thank my parents for their unwavering support. I love you.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, L., Chen, Q., Sun, Q., Jia, Y. (2019). Research on Short Text Classification Method Based on Convolution Neural Network. In: Xhafa, F., Patnaik, S., Tavana, M. (eds) Advances in Intelligent, Interactive Systems and Applications. IISA 2018. Advances in Intelligent Systems and Computing, vol 885. Springer, Cham. https://doi.org/10.1007/978-3-030-02804-6_53
DOI: https://doi.org/10.1007/978-3-030-02804-6_53
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02803-9
Online ISBN: 978-3-030-02804-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)