Abstract
In this paper we study Distributed Data Mining from a Distributed Artificial Intelligence perspective. Databases are often too large to be mined as a whole. In such cases, Distributed Data Mining can be used to discover knowledge (rule sets) from parts of the entire training data set. This process requires cooperation and coordination between the processors: because each processor uses only partial data, inconsistent, incomplete and useless knowledge can be generated. Cooperation and coordination are important issues in Distributed Artificial Intelligence and can be accomplished with different techniques: planning (centralized, partially distributed and distributed), negotiation, reaction, etc. In this work we discuss a coordination protocol for the cooperative learning agents of a previously developed multi-agent system (MAS), comparing it conceptually with other learning systems. This cooperative process is hierarchical and works under the coordination of a manager agent. The proposed model aims to select the best rules for integration into the global model without decreasing its accuracy rate. We also report experiments comparing the accuracy and complexity of the knowledge generated by the cooperative agents.
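The idea of a manager agent admitting rules into a global model only when accuracy does not suffer can be illustrated with a minimal sketch. This is not the protocol from the paper, whose details are not given in the abstract. The rule representation, the first-match prediction scheme, the held-out validation set, and the names `predict`, `accuracy` and `integrate` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representations, assumed for this sketch only.
Example = Dict[str, float]
Rule = Tuple[Callable[[Example], bool], str]   # (condition, predicted label)
Labeled = Tuple[Example, str]                  # (example, true label)


def predict(rules: List[Rule], example: Example, default: str) -> str:
    """Return the label of the first rule whose condition fires, else the default."""
    for condition, label in rules:
        if condition(example):
            return label
    return default


def accuracy(rules: List[Rule], data: List[Labeled], default: str) -> float:
    """Fraction of labeled examples classified correctly by the rule list."""
    if not data:
        return 0.0
    hits = sum(predict(rules, x, default) == y for x, y in data)
    return hits / len(data)


def integrate(agent_rule_sets: List[List[Rule]],
              validation: List[Labeled],
              default: str) -> Tuple[List[Rule], float]:
    """Manager step (assumed): greedily add each agent's rules to the global
    model, keeping a rule only if validation accuracy strictly improves.
    Strict improvement also filters out rules that never fire."""
    global_rules: List[Rule] = []
    best = accuracy(global_rules, validation, default)
    for rule_set in agent_rule_sets:
        for rule in rule_set:
            trial = global_rules + [rule]
            score = accuracy(trial, validation, default)
            if score > best:
                global_rules, best = trial, score
    return global_rules, best
```

Requiring strict improvement is one possible design choice. It discards rules that never fire as well as harmful ones, at the cost of rejecting a rule that would only help in combination with a later one.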
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Paula, A.C.M.P., Ávila, B.C., Scalabrin, E., Enembreck, F. (2007). Using Distributed Data Mining and Distributed Artificial Intelligence for Knowledge Integration. In: Klusch, M., Hindriks, K.V., Papazoglou, M.P., Sterling, L. (eds) Cooperative Information Agents XI. CIA 2007. Lecture Notes in Computer Science, vol 4676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75119-9_7
DOI: https://doi.org/10.1007/978-3-540-75119-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75118-2
Online ISBN: 978-3-540-75119-9
eBook Packages: Computer Science, Computer Science (R0)