Abstract
We present an extension of the usual cooperative work flow for agent-based data mining that adds a so-called adjustment work flow. The adjustment work flow enables various knowledge-based strategies that use information gathered from the miners and other agents to adjust the whole system to the particular data set being mined. In addition to the basic exchange of hints between the miners, these strategies include adjusting the miners' parameters and using a clustering miner to select good working data sets. Our experimental evaluation, mining rules for two medical data sets, shows that adding a loop with the adjustment work flow substantially improves the efficiency of the system, with all of the strategies contributing to this improvement.
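The loop described in the abstract can be illustrated with a toy sketch. This is not the authors' system: the `Miner` class, the `min_support` parameter, the frequency-based "mining", and the `adjust` and `run` functions are all hypothetical names chosen for illustration, and the clustering-based selection of working data sets is left out. The sketch only shows the general shape of a mining phase alternating with an adjustment phase that shares hints and tunes parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Miner:
    """Hypothetical miner with one tunable parameter (assumption, not from the paper)."""
    name: str
    min_support: float
    hints: set = field(default_factory=set)

    def mine(self, records):
        """Return the items whose relative frequency reaches min_support."""
        counts = {}
        for rec in records:
            for item in rec:
                counts[item] = counts.get(item, 0) + 1
        n = len(records)
        return {item for item, c in counts.items() if c / n >= self.min_support}


def adjust(miners, results, step=0.1, floor=0.1):
    """Adjustment phase: exchange hints and relax parameters of unproductive miners."""
    all_found = set().union(*results.values())
    for m in miners:
        # Hint exchange: record what the other miners found.
        # In this toy version hints are only stored, not used by mine().
        m.hints |= all_found - results[m.name]
        # Parameter adjustment: a miner that found nothing lowers its threshold.
        if not results[m.name]:
            m.min_support = max(floor, round(m.min_support - step, 2))


def run(miners, records, rounds=3):
    """Alternate the mining phase and the adjustment phase for a fixed number of rounds."""
    results = {}
    for _ in range(rounds):
        results = {m.name: m.mine(records) for m in miners}
        adjust(miners, results)
    return results
```

In this sketch a miner that starts with too strict a threshold is gradually relaxed until it produces results, which is one simple way an adjustment loop can adapt a system to the data set at hand.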
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gao, J., Denzinger, J. (2013). Improving the Efficiency of Distributed Data Mining Using an Adjustment Work Flow. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_6
DOI: https://doi.org/10.1007/978-3-642-39712-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science, Computer Science (R0)