A Perceptron-Like Linear Supervised Algorithm for Text Classification

Gkanogiannis, Anestis; Kalamboukis, Theodore

doi:10.1007/978-3-642-17316-5_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6440)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

A fast and accurate linear supervised algorithm is presented that compares favorably with other state-of-the-art algorithms on several real data collections for the problem of text categorization. Although the algorithm was already presented in [6], no proof of its convergence was given there. The geometric intuition behind the algorithm makes it evident that it is neither a Perceptron nor a gradient descent algorithm; an algebraic proof of its convergence is therefore provided for the case of linearly separable classes. Additionally, we present experimental results on many standard text classification datasets and on artificially generated linearly separable datasets. The proposed algorithm is simple to use and easy to implement, and it can be applied in any domain without modifying the data or estimating parameters.
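
For orientation only, the following is a minimal sketch of Rosenblatt's classical perceptron rule trained on an artificially generated linearly separable dataset. It is not the algorithm proposed in the paper, whose update rule is not reproduced on this page; it is the standard baseline that the abstract contrasts against. The helper names (make_separable, train_perceptron), the margin value, and the data distribution are assumptions chosen for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def make_separable(n, dim, margin=0.3, seed=0):
    """Hypothetical generator: n points in [-1, 1]^dim, labeled by a hidden
    unit-norm hyperplane through the origin, keeping only points that lie
    at least `margin` away from it so the classes are linearly separable."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1, 1) for _ in range(dim)]
    norm = dot(w_true, w_true) ** 0.5
    w_true = [v / norm for v in w_true]
    data = []
    while len(data) < n:
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        score = dot(w_true, x)
        if abs(score) >= margin:  # rejection sampling enforces the margin
            data.append((x, 1 if score > 0 else -1))
    return data


def train_perceptron(data, max_epochs=1000):
    """Classical perceptron: on each misclassified example (x, y),
    update w <- w + y * x and b <- b + y. Stops after the first epoch
    with no mistakes and returns (w, b, epochs_used)."""
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    for epoch in range(max_epochs):
        mistakes = 0
        for x, y in data:
            if y * (dot(w, x) + b) <= 0:  # wrong side of (or on) the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b, epoch + 1
    return w, b, max_epochs


if __name__ == "__main__":
    data = make_separable(n=200, dim=5)
    w, b, epochs = train_perceptron(data)
    errors = sum(1 for x, y in data if y * (dot(w, x) + b) <= 0)
    print("epochs:", epochs, "training errors:", errors)
```

Because the generated classes are separable with a positive margin, Novikoff's theorem bounds the number of perceptron updates, so the loop stops well before max_epochs. The paper's contribution is an analogous convergence proof for its own, different update rule.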

References

1. Buckley, C., Salton, G.: Optimization of relevance feedback weights. In: SIGIR 1995: Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 351–357. ACM, New York (1995)
2. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines (and other kernel-based learning methods). Cambridge University Press, Cambridge (2000)
3. Dagan, I., Karov, Y., Roth, D.: Mistake-driven learning in text categorization. In: 2nd Conference on Empirical Methods in Natural Language Processing, EMNLP 1997, pp. 55–63 (1997)
4. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley Interscience, Hoboken (November 2000)
5. Dumais, S., Platt, J., Heckerman, D., Sahami, M.: Inductive learning algorithms and representations for text categorization (1998)
6. Gkanogiannis, A., Kalampoukis, T.: A modified and fast perceptron learning rule and its use for tag recommendations in social bookmarking systems. In: ECML PKDD Discovery Challenge 2009 - DC 2009 (2009)
7. Harman, D.: Relevance feedback and other query modification techniques, pp. 241–263 (1992)
8. Hersh, W., Buckley, C., Leone, T., Hickman, D.: OHSUMED: an interactive retrieval evaluation and new large test collection for research (1994)
9. Joachims, T.: Text categorization with support vector machines: learning with many relevant features (1998)
10. Joachims, T.: Making large-scale support vector machine learning practical. In: Schölkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods: Support Vector Learning, pp. 169–184. MIT Press, Cambridge (1999), http://portal.acm.org/citation.cfm?id=299104
11. Joachims, T.: Training linear SVMs in linear time. In: KDD 2006: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226. ACM, New York (2006), http://dx.doi.org/10.1145/1150402.1150429
12. Karypis, G., Shankar, S.: Weight adjustment schemes for a centroid based classifier (2000)
13. Lang, K.: NewsWeeder: learning to filter netnews (1995)
14. Lewis, D.D.: Evaluating text categorization. In: Workshop on Speech and Natural Language HLT 1991, pp. 312–318 (1991)
15. Lewis, D.D., Schapire, E.R., Callan, P.J., Papka, R.: Training algorithms for linear text classifiers. In: 19th ACM International Conference on Research and Development in Information Retrieval SIGIR 1996, pp. 298–306 (1996)
16. Lewis, D.D., Yang, Y., Rose, T., Li, F.: RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research 5, 361–397 (2004)
17. Novikoff, A.B.: On convergence proofs for perceptrons. In: Proceedings of the Symposium on the Mathematical Theory of Automata, vol. 12, pp. 615–622 (1963), http://citeseer.comp.nus.edu.sg/context/494822/0
18. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)
19. Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65(6), 386–408 (1958)
20. Salton, G.: Automatic Text Processing – The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading (1989)
21. Schapire, R.E., Singer, Y., Singhal, A.: Boosting and Rocchio applied to text filtering. In: SIGIR 1998: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 215–223. ACM, New York (1998)
22. Sebastiani, F.: Machine learning in automated text categorization. ACM Computing Surveys 34(1), 1–47 (2002)
23. Yang, Y.: A study on thresholding strategies for text categorization (2001)

Author information

Authors and Affiliations

Department of Informatics, Athens University of Economics and Business, Athens, Greece
Anestis Gkanogiannis & Theodore Kalamboukis

Editor information

Editors and Affiliations

Faculty of Engineering and Information Technology, University of Technology Sydney, 2007, Sydney, NSW, Australia
Longbing Cao
College of Computer Science, Chongqing University, 400030, Chongqing, China
Yong Feng
College of Computer Science, Chongqing University, 400030, Chongqing, China
Jiang Zhong

About this paper

Cite this paper

Gkanogiannis, A., Kalamboukis, T. (2010). A Perceptron-Like Linear Supervised Algorithm for Text Classification. In: Cao, L., Feng, Y., Zhong, J. (eds) Advanced Data Mining and Applications. ADMA 2010. Lecture Notes in Computer Science, vol 6440. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17316-5_8

DOI: https://doi.org/10.1007/978-3-642-17316-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17315-8
Online ISBN: 978-3-642-17316-5
eBook Packages: Computer Science, Computer Science (R0)
