Abstract
In this paper we propose a new fuzzy interval type-2 C-ordered-means clustering algorithm for incomplete data. The algorithm uses both marginalisation and imputation to handle missing values. Thanks to imputation, the values present in incomplete items are not lost; thanks to marginalisation, imputed data can be distinguished from original complete items. The algorithm employs rough fuzzy sets (interval type-2 fuzzy sets) to model imprecision and incompleteness of data. To handle outliers, the algorithm uses loss functions, an ordering technique, and typicalities: outliers are assigned low typicality values. The paper also describes a new imputation technique: imputation with values from k nearest neighbours.
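The abstract does not spell out the imputation procedure, so the following is only a minimal sketch of one plausible reading of "imputation with values from k nearest neighbours", not the paper's algorithm. It assumes that missing attributes are marked with `None`, that neighbours are drawn from complete items only, and that distances are computed by marginalisation (only attributes present in both items are compared, then rescaled). The function names `partial_distance` and `knn_candidates`, and the final min/max summary, are hypothetical choices made for illustration.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over attributes present (not None) in both items,
    rescaled by the share of attributes actually compared (assumption)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # nothing to compare, treat as infinitely far
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_candidates(item, data, k):
    """For each missing attribute of `item`, collect the values of that
    attribute found in the k nearest complete items of `data`."""
    complete = [row for row in data if None not in row]
    neighbours = sorted(complete, key=lambda row: partial_distance(item, row))[:k]
    return {
        j: [row[j] for row in neighbours]
        for j, value in enumerate(item)
        if value is None
    }


# Toy data set: the last item lacks its second attribute.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
    [1.0, None, 3.1],
]

candidates = knn_candidates(data[3], data, k=2)

# One possible summary (an assumption): bound each missing value by the
# smallest and largest value seen among the neighbours.
intervals = {j: (min(v), max(v)) for j, v in candidates.items()}
print(intervals)  # {1: (2.0, 2.1)}
```

Keeping all k candidate values, rather than collapsing them to a single mean, preserves the spread among neighbours, which is the kind of imprecision an interval-valued representation can carry.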
Acknowledgements
The research has been supported by the Rector’s Grant for Research and Development (Silesian University of Technology, grant number: 02/020/RGJ19/0165).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Siminski, K. (2020). FIT2COMIn – Robust Clustering Algorithm for Incomplete Data. In: Gruca, A., Czachórski, T., Deorowicz, S., Harężlak, K., Piotrowska, A. (eds) Man-Machine Interactions 6. ICMMI 2019. Advances in Intelligent Systems and Computing, vol 1061. Springer, Cham. https://doi.org/10.1007/978-3-030-31964-9_10
DOI: https://doi.org/10.1007/978-3-030-31964-9_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31963-2
Online ISBN: 978-3-030-31964-9
eBook Packages: Intelligent Technologies and Robotics (R0)