RNBL-MN: A Recursive Naive Bayes Learner for Sequence Classification

Kang, Dae-Ki; Silvescu, Adrian; Honavar, Vasant

doi:10.1007/11731139_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The Naive Bayes (NB) classifier relies on the assumption that the instances in each class can be described by a single generative model. This assumption can be restrictive in many real-world classification tasks. We describe RNBL-MN, which relaxes this assumption by constructing a tree of Naive Bayes classifiers for sequence classification, where each individual NB classifier in the tree is based on a multinomial event model (one for each class at each node in the tree). In our experiments on protein sequence and text classification tasks, we observe that RNBL-MN substantially outperforms the NB classifier. Furthermore, our experiments show that RNBL-MN outperforms the C4.5 decision tree learner (using tests on sequence composition statistics as the splitting criterion) and yields accuracies that are comparable to those of support vector machines (SVM) using similar information.
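The idea of a tree of multinomial NB classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how nodes are split or when recursion stops, so the sketch assumes that each node partitions its training instances by the node's own NB prediction and that growth stops at a fixed depth or a minimum partition size. All names (`kmer_counts`, `MultinomialNB`, `RecursiveNB`, `max_depth`, `min_size`) are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k=1):
    """Count the overlapping k-grams of a sequence (its composition)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class MultinomialNB:
    """Multinomial event model with Laplace smoothing, one model per class."""

    def fit(self, bags, labels, vocab, alpha=1.0):
        self.classes = sorted(set(labels))
        self.log_prior, self.log_lik = {}, {}
        for c in self.classes:
            total, n_c = Counter(), 0
            for bag, y in zip(bags, labels):
                if y == c:
                    total.update(bag)
                    n_c += 1
            denom = sum(total[w] for w in vocab) + alpha * len(vocab)
            self.log_prior[c] = math.log(n_c / len(labels))
            self.log_lik[c] = {w: math.log((total[w] + alpha) / denom)
                               for w in vocab}
        return self

    def predict(self, bag):
        def score(c):
            lik = self.log_lik[c]
            # k-grams outside the training vocabulary are ignored
            return self.log_prior[c] + sum(n * lik[w]
                                           for w, n in bag.items() if w in lik)
        return max(self.classes, key=score)


class RecursiveNB:
    """Tree of NB classifiers. The splitting and stopping rules here are
    assumptions made for illustration, not taken from the paper."""

    def __init__(self, max_depth=2, min_size=4):
        self.max_depth, self.min_size = max_depth, min_size

    def fit(self, bags, labels, vocab=None, depth=0):
        if vocab is None:
            vocab = sorted({w for bag in bags for w in bag})
        self.nb = MultinomialNB().fit(bags, labels, vocab)
        self.children = {}
        if depth >= self.max_depth:
            return self
        # Partition the training instances by the class this node predicts.
        parts = {}
        for bag, y in zip(bags, labels):
            parts.setdefault(self.nb.predict(bag), []).append((bag, y))
        for c, part in parts.items():
            ys = [y for _, y in part]
            # Refine only partitions that are large, impure, and strictly
            # smaller than the parent's data (so recursion makes progress).
            if self.min_size <= len(part) < len(bags) and len(set(ys)) > 1:
                child = RecursiveNB(self.max_depth, self.min_size)
                self.children[c] = child.fit([b for b, _ in part], ys,
                                             vocab, depth + 1)
        return self

    def predict(self, bag):
        c = self.nb.predict(bag)
        child = self.children.get(c)
        return child.predict(bag) if child else c


# Toy usage on made-up sequences (not data from the paper).
seqs = ["AAAA", "AAAB", "AABA", "BBBB", "BBBA", "BABB"]
labels = ["x", "x", "x", "y", "y", "y"]
model = RecursiveNB().fit([kmer_counts(s) for s in seqs], labels)
```

With a single well-separated model per class, as in this toy data, the tree stays a single node and behaves like plain NB. Child nodes appear only where a partition still mixes classes, which is where a single generative model per class is too coarse.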

Supported in part by grants from the National Science Foundation (IIS 0219699) and the National Institutes of Health (GM 066387).

Author information

Authors and Affiliations

Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University, Ames, IA, 50011, USA
Dae-Ki Kang, Adrian Silvescu & Vasant Honavar

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore
Wee-Keong Ng
Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
Masaru Kitsuregawa
School of Computer Science and Technology, Heilongjiang University, China
Jianzhong Li
School of Computer Engineering, Nanyang Technological University, 639798, Singapore, Singapore
Kuiyu Chang

About this paper

Cite this paper

Kang, DK., Silvescu, A., Honavar, V. (2006). RNBL-MN: A Recursive Naive Bayes Learner for Sequence Classification. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_8

DOI: https://doi.org/10.1007/11731139_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33206-0
Online ISBN: 978-3-540-33207-7
eBook Packages: Computer Science, Computer Science (R0)
