Abstract
A central issue in the design of cooperative multiagent systems is how to coordinate the behavior of the agents to meet the goals of the designer. Traditionally, this has been accomplished by hand-coding the coordination strategies. However, this task is complex due to the interactions that can take place among agents. Recent work in the area has focused on how strategies can be learned. Yet many of these systems suffer from convergence, complexity, and performance problems. This paper presents a new approach for learning multiagent coordination strategies that addresses these issues. The effectiveness of the technique is demonstrated using a synthetic domain and the predator and prey pursuit problem.
Cite this article
Ho, F., Kamel, M. Learning Coordination Strategies for Cooperative Multiagent Systems. Machine Learning 33, 155–177 (1998). https://doi.org/10.1023/A:1007562506751