Abstract
A major problem in text categorization is the high dimensionality of the input feature space. This paper proposes a novel approach to aggressive dimensionality reduction in text categorization. The method uses local feature selection to obtain more positive terms and then scales the weighting at the global level to suit the classifier. The weighting is then enhanced with the feature selection measure to improve its distinguishing capability. The validity of the method is tested on two benchmark corpora using an SVM classifier with four standard feature selection measures.
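The abstract does not give the formulas, so the pipeline it outlines can only be illustrated under assumptions. The minimal sketch below assumes chi-square as the feature selection measure, treats a term as "positive" for a category when it is positively correlated with it, takes the maximum per-category score as the global score, and multiplies tf-idf by that score as the "enhanced" weighting. All function names are hypothetical, and none of these choices should be read as the authors' exact method.

```python
import math
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: in-category docs with the term, b: out-of-category docs with the term,
    c: in-category docs without it,    d: out-of-category docs without it.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def local_scores(docs, labels):
    """Return {category: {term: score}}, keeping positively correlated terms only."""
    vocab = {t for doc in docs for t in doc}
    scores = {}
    for cat in set(labels):
        n_in = sum(1 for y in labels if y == cat)
        n_out = len(labels) - n_in
        scores[cat] = {}
        for term in vocab:
            a = sum(1 for doc, y in zip(docs, labels) if y == cat and term in doc)
            b = sum(1 for doc, y in zip(docs, labels) if y != cat and term in doc)
            c, d = n_in - a, n_out - b
            if a * d > b * c:  # assumption: "positive" means positively correlated
                scores[cat][term] = chi_square(a, b, c, d)
    return scores


def select_features(scores, k):
    """Union of the top-k terms of each category; global score = max over categories."""
    selected = {}
    for per_cat in scores.values():
        top = sorted(per_cat.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        for term, s in top:
            selected[term] = max(s, selected.get(term, 0.0))
    return selected


def weight_document(doc, selected, docs):
    """L2-normalized tf-idf over selected terms, multiplied by the selection score."""
    tf = Counter(t for t in doc if t in selected)
    vec = {}
    for term, f in tf.items():
        df = sum(1 for d in docs if term in d)
        vec[term] = f * math.log(len(docs) / df) * selected[term]
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else vec


# Toy corpus (invented for illustration only).
docs = [["ball", "goal", "team"], ["team", "goal", "win"],
        ["stock", "market", "win"], ["market", "bank", "stock"]]
labels = ["sport", "sport", "finance", "finance"]

selected = select_features(local_scores(docs, labels), k=2)
vector = weight_document(docs[0], selected, docs)
```

Selecting per category and then taking the union keeps terms that are strong for a small category but would rank low under a single global ranking, which is one common motivation for local selection. The resulting sparse vectors could then be passed to any SVM implementation.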
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zheng, W., Qian, Y. (2010). Aggressive Dimensionality Reduction with Reinforcement Local Feature Selection for Text Categorization. In: Wang, F.L., Deng, H., Gao, Y., Lei, J. (eds) Artificial Intelligence and Computational Intelligence. AICI 2010. Lecture Notes in Computer Science, vol 6319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16530-6_43
DOI: https://doi.org/10.1007/978-3-642-16530-6_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16529-0
Online ISBN: 978-3-642-16530-6
eBook Packages: Computer Science, Computer Science (R0)