Proposing a New Method for Non-relative Imbalanced Dataset

Parvin, Hamid; Ansari, Sara; Parvin, Sajad

doi:10.1007/978-3-642-32922-7_31

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 188)

Abstract

A well-known domain in which datasets are very likely to be imbalanced is patient detection. In such systems there are many clients, but only a few of them are patients; all the others are healthy. An imbalanced dataset is therefore very common in any system that must detect patients among many clients. Breast cancer detection is a special case of such systems: the task is to discriminate patients from healthy clients. The imbalance of a dataset can be either relative or non-relative. It is relative when the number of samples in the minority class is high in itself but much smaller than the number of samples in the majority class. It is non-relative when the number of samples in the minority class is low. This paper presents an algorithm that is well suited to non-relative imbalanced datasets. It is efficient in terms of both speed and learning efficacy. The experimental results show that the proposed algorithm outperforms some of the best methods in the literature.
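
The distinction between relative and non-relative imbalance drawn in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. The function name `describe_imbalance` and both thresholds are hypothetical, since the abstract gives no numeric cut-offs, and the sketch reads "high" and "low" as absolute minority-class counts.

```python
from collections import Counter


def describe_imbalance(labels, ratio_threshold=0.2, min_count_threshold=100):
    """Classify a label list as balanced, relative, or non-relative imbalance.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")

    minority = min(counts.values())
    majority = max(counts.values())

    # Minority class is not much smaller than the majority class.
    if minority / majority >= ratio_threshold:
        return "balanced"
    # Minority class is small only relative to the majority class.
    if minority >= min_count_threshold:
        return "relative"
    # Minority class is small in absolute terms as well.
    return "non-relative"


if __name__ == "__main__":
    screening = ["healthy"] * 1000 + ["patient"] * 20
    print(describe_imbalance(screening))
```

Under these assumed thresholds, a screening dataset with 1000 healthy clients and 20 patients falls into the non-relative case that the paper targets.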

Author information

Authors and Affiliations

Nourabad Mamasani Branch, Islamic Azad University, Nourabad, Mamasani, Iran
Hamid Parvin, Sara Ansari & Sajad Parvin

Corresponding author

Correspondence to Hamid Parvin.

Editor information

Editors and Affiliations

VŠB-TU Ostrava, 17. listopadu 15, Ostrava, 70833, Czech Republic
Václav Snášel
(MIR Labs), Scientific Network for Innovation and, Machine Intelligence Research Labs, Auburn, 98071-2259, Washington, USA
Ajith Abraham
Dept. de Informática y Automática, University of Salamanca, Plaza de la Merced, s/n, Salamanca, 37008, Spain
Emilio S. Corchado

About this paper

Cite this paper

Parvin, H., Ansari, S., Parvin, S. (2013). Proposing a New Method for Non-relative Imbalanced Dataset. In: Snášel, V., Abraham, A., Corchado, E. (eds) Soft Computing Models in Industrial and Environmental Applications. Advances in Intelligent Systems and Computing, vol 188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32922-7_31

DOI: https://doi.org/10.1007/978-3-642-32922-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32921-0
Online ISBN: 978-3-642-32922-7
eBook Packages: Engineering, Engineering (R0)
