Abstract
With the substantial growth of e-commerce, social media, and online news portals, people increasingly express their views through short text. Most of this textual content is unstructured and messy, which makes it impractical and cumbersome for human experts to organize or manipulate. Therefore, developing an automatic short text classification model for low-resource languages, including Bengali, is critical. The crucial barriers to classifying short text in Bengali are the unavailability of text corpora, the scarcity of linguistic tools, the limited number of words in each text, and a lack of dependencies between the words. This paper presents a short text classification model using an ensemble of four base deep learning classifiers: Neural Network (NN), Convolutional Neural Network (CNN), Bidirectional Long Short Term Memory (BiLSTM), and Bidirectional Gated Recurrent Unit (BiGRU). Additionally, a corpus of around 0.13 million Bengali texts is developed for short text classification into six categories (international, national, sports, amusement, technology, and politics). The evaluation results on the developed corpus demonstrate that the proposed method outperformed all the baseline machine learning and deep learning models, obtaining the highest weighted F1-score of \(84.4\%\).
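The abstract does not state how the four base classifiers are combined. As a rough illustration only, the sketch below assumes the simplest rule, averaging the class probabilities of the base models and taking the most probable class, and then computes a weighted F1-score in the usual sense (per-class F1 weighted by class support). The function names `average_ensemble` and `weighted_f1` and the toy probabilities are hypothetical and do not come from the paper.

```python
from collections import Counter


def average_ensemble(prob_sets):
    """Average per-class probabilities across models, then pick the argmax.

    ASSUMPTION: plain probability averaging; the paper's actual
    combination rule is not given in the abstract.
    prob_sets[m][i][c] = probability model m assigns class c for sample i.
    """
    n_models = len(prob_sets)
    predictions = []
    for i in range(len(prob_sets[0])):
        n_classes = len(prob_sets[0][i])
        mean_probs = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean_probs.__getitem__))
    return predictions


def weighted_f1(y_true, y_pred):
    """Per-class F1, weighted by each class's share of the true labels."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Toy data (made up): two models, four samples, three classes.
model_a = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
model_b = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.2, 0.5, 0.3], [0.2, 0.6, 0.2]]
y_true = [0, 1, 2, 0]

y_pred = average_ensemble([model_a, model_b])
print(y_pred, weighted_f1(y_true, y_pred))
```

In practice the probability lists would come from the trained NN, CNN, BiLSTM, and BiGRU models, one list per model, over the six categories.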
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jannat, M., Hossain, E., Hoque, M.M., Rahaman, M.A. (2023). Multi-class Short Text Classification Using Ensemble of Deep Learning Classifier. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_45
DOI: https://doi.org/10.1007/978-3-031-19958-5_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19957-8
Online ISBN: 978-3-031-19958-5
eBook Packages: Intelligent Technologies and Robotics (R0)