A Mixture-of-Experts Framework for Learning from Imbalanced Data Sets

Estabrooks, Andrew; Japkowicz, Nathalie

doi:10.1007/3-540-44816-0_4


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2189)

Included in the following conference series:

International Symposium on Intelligent Data Analysis


Abstract

Resampling methods are among the approaches proposed to deal with the class-imbalance problem. Although such approaches are very simple, tuning them effectively is not an easy task. In particular, it is unclear whether oversampling is more effective than undersampling, and which oversampling or undersampling rate should be used. This paper presents an experimental study of these questions and concludes that combining different expressions of the resampling approach in a mixture-of-experts framework is an effective solution to the tuning problem. The proposed combination scheme is evaluated on a subset of the REUTERS-21578 text collection (the 10 top categories) and is shown to be very effective when the data is drastically imbalanced.
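The idea of combining resampling strategies applied at several rates can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy 1-D threshold learner, the synthetic Gaussian data, the chosen rates, the majority vote inside each expert, and the "positive if any expert says positive" output rule are all hypothetical stand-ins for whatever learner and combination scheme the authors actually used.

```python
import random

def oversample(pos, neg, rate, rng):
    """Duplicate random minority (pos) examples to close `rate` of the class gap."""
    gap = len(neg) - len(pos)
    extra = [rng.choice(pos) for _ in range(round(gap * rate))]
    return pos + extra, list(neg)

def undersample(pos, neg, rate, rng):
    """Discard random majority (neg) examples to close `rate` of the class gap."""
    gap = len(neg) - len(pos)
    return list(pos), rng.sample(neg, len(neg) - round(gap * rate))

def train_threshold(pos, neg):
    """Toy 1-D learner (assumption): cut t minimizing training errors for 'x > t => positive'."""
    values = sorted(set(pos + neg))
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])]
    def errors(t):
        return sum(x <= t for x in pos) + sum(x > t for x in neg)
    return min(cuts, key=errors)

def build_expert(resample, pos, neg, rates, rng):
    """One expert = one resampling strategy applied at several rates."""
    return [train_threshold(*resample(pos, neg, r, rng)) for r in rates]

def expert_vote(thresholds, x):
    """Majority vote among the classifiers inside one expert (assumed rule)."""
    votes = sum(x > t for t in thresholds)
    return votes * 2 > len(thresholds)

def mixture_predict(experts, x):
    """Assumed output rule: positive if any expert votes positive."""
    return any(expert_vote(ts, x) for ts in experts)

# Synthetic, drastically imbalanced data: 200 majority vs. 10 minority examples.
rng = random.Random(0)
neg = [rng.gauss(0.0, 1.0) for _ in range(200)]
pos = [rng.gauss(2.0, 1.0) for _ in range(10)]
rates = [0.25, 0.5, 0.75, 1.0]
experts = [build_expert(oversample, pos, neg, rates, rng),
           build_expert(undersample, pos, neg, rates, rng)]
print(mixture_predict(experts, 3.0), mixture_predict(experts, -3.0))
```

The point of the sketch is that no single rate or strategy has to be chosen in advance: each (strategy, rate) pair yields its own classifier, and the tuning decision is deferred to the combination step.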

We would like to thank Rob Holte and Chris Drummond for their useful comments. This research was funded, in part, by an NSERC grant. The work described in this paper was conducted at Dalhousie University.



Author information

Authors and Affiliations

IBM Toronto Lab, Office 1B28B, 1150, Eglinton Avenue East, North York, Ontario, Canada, M3C 1H7
Andrew Estabrooks
SITE, University of Ottawa, 150 Louis Pasteur, P.O. Box 450 Stn. A, Ottawa, Ontario, Canada, K1N 6N5
Nathalie Japkowicz


Editor information

Editors and Affiliations

Royal Institute of Technology, Centre for Autonomous Systems, 10044, Stockholm, Sweden
Frank Hoffmann
Imperial College, Huxley Building 180 Queen’s Gate, London, SW7 2BZ, UK
David J. Hand & Niall Adams
Department of Computer Science, Vanderbilt University, Box 1679, Station B, Nashville, TN, 37235, USA
Douglas Fisher
Department of Computer Science, New University of Lisbon, 2825-114, Caparica, Portugal
Gabriela Guimaraes

About this paper

Cite this paper

Estabrooks, A., Japkowicz, N. (2001). A Mixture-of-Experts Framework for Learning from Imbalanced Data Sets. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_4

DOI: https://doi.org/10.1007/3-540-44816-0_4
Published: 03 September 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7
eBook Packages: Springer Book Archive
