Abstract
Data fusion systems are widely used in areas such as sensor networks, robotics, video and image processing, and intelligent system design. Data fusion combines information from several sources to form a unified picture or decision. Anomaly detection algorithms (ADAs) are now used in a wide variety of applications, such as cyber security systems. In this research we focus on integrating the output of multiple ADAs that operate within a particular domain. More specifically, we propose a two-stage fusion process that is based on the expertise of each individual ADA, which is derived in the first stage. The main idea of the proposed method is to identify multiple types of outliers and to find a set of expert outlier detection algorithms for each type. We propose to use semi-supervised methods. We provide preliminary experiments for the single-type outlier case, which show that our method outperforms other benchmark methods from the literature.
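The two-stage idea can be illustrated with a minimal sketch for the single-type outlier case. The abstract does not say how expertise is measured or how scores are fused, so everything concrete here is an assumption made for illustration: expertise is estimated as the AUC on a small labeled subset (the semi-supervised part), scores are rank-normalized so that detectors are comparable, and the selected experts are combined by an expertise-weighted average. The names `rank_normalize`, `auc`, `expert_fusion`, and `n_experts` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple


def rank_normalize(scores: List[float]) -> List[float]:
    """Map raw scores to (0, 1] by rank; tied scores share their average rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def auc(scores: List[float], labels: List[int]) -> float:
    """Fraction of (outlier, inlier) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one labeled outlier and one labeled inlier")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def expert_fusion(
    detector_scores: Dict[str, List[float]],
    labeled: Dict[int, int],
    n_experts: int = 2,
) -> Tuple[List[float], List[str]]:
    """Assumed two-stage fusion: pick experts on labeled points, then fuse them."""
    # Stage 1 (assumption): expertise = AUC on the labeled subset only.
    idx = sorted(labeled)
    y = [labeled[i] for i in idx]
    expertise = {
        name: auc([s[i] for i in idx], y) for name, s in detector_scores.items()
    }
    experts = sorted(expertise, key=expertise.get, reverse=True)[:n_experts]

    # Stage 2 (assumption): expertise-weighted mean of rank-normalized scores.
    weights = {e: expertise[e] for e in experts}
    total = sum(weights.values())
    if total == 0:  # fall back to equal weights if no expert beats zero AUC
        weights = {e: 1.0 for e in experts}
        total = float(len(experts))
    normalized = {e: rank_normalize(detector_scores[e]) for e in experts}
    n = len(next(iter(detector_scores.values())))
    fused = [
        sum(weights[e] * normalized[e][i] for e in experts) / total for i in range(n)
    ]
    return fused, experts


# Toy data (made up): three detectors scoring six points; points 3 and 4 are outliers.
scores = {
    "det_a": [0.10, 0.20, 0.15, 0.90, 0.95, 0.12],
    "det_b": [0.30, 0.10, 0.20, 0.80, 0.70, 0.25],
    "det_c": [0.90, 0.80, 0.10, 0.20, 0.30, 0.70],  # misleading detector
}
labeled = {0: 0, 1: 0, 3: 1}  # point index -> label (1 = outlier), only three labels
fused, experts = expert_fusion(scores, labeled, n_experts=2)
top_two = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)[:2]
print(experts, top_two)
```

In this toy run the misleading detector is excluded in the first stage because it ranks the labeled outlier below both labeled inliers, so only the two remaining detectors contribute to the fused score. Extending the sketch to multiple outlier types would require one expert set per type, which the abstract proposes but does not detail.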
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
David, E., Leshem, G., Chalamish, M., Chiang, A., Shapira, D. (2014). Expert-Based Fusion Algorithm of an Ensemble of Anomaly Detection Algorithms. In: Cheng, SM., Day, MY. (eds) Technologies and Applications of Artificial Intelligence. TAAI 2014. Lecture Notes in Computer Science, vol 8916. Springer, Cham. https://doi.org/10.1007/978-3-319-13987-6_11
DOI: https://doi.org/10.1007/978-3-319-13987-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13986-9
Online ISBN: 978-3-319-13987-6
eBook Packages: Computer Science (R0)