Abstract
Over the last 5 years, the AI community has shown considerable interest in decentralized control of multiple decision makers or “agents” under uncertainty. This problem arises in many application domains, such as multi-robot coordination, manufacturing, information gathering, and load balancing. Such problems must be treated as decentralized decision problems because each agent may have different partial information about the other agents and about the state of the world. It has been shown that these problems are significantly harder than their centralized counterparts, requiring new formal models and algorithms to be developed. Rapid progress in recent years has produced a number of different frameworks, complexity results, and planning algorithms. The objectives of this paper are to provide a comprehensive overview of these results, to compare and contrast the existing frameworks, and to provide a deeper understanding of their relationships with one another, their strengths, and their weaknesses. While we focus on cooperative systems, we do point out important connections with game-theoretic approaches. We analyze five different formal frameworks, three different optimal algorithms, as well as a series of approximation techniques. The paper provides interesting insights into the structure of decentralized problems, the expressiveness of the various models, and the relative advantages and limitations of the different solution techniques. A better understanding of these issues will facilitate further progress in the field and help resolve several open problems that we identify.
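To make the kind of problem the paper studies concrete, here is a minimal sketch of a DEC-POMDP, the central model behind several of the surveyed frameworks: a shared reward, joint transitions, and per-agent observations, with each agent acting only on its own observation history. The tuple layout follows the standard ⟨S, {A_i}, P, R, {Ω_i}, O⟩ definition; the class names, dictionary encoding, and the toy two-agent instance below are illustrative choices, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class DecPOMDP:
    """Sketch of the DEC-POMDP tuple <S, {A_i}, P, R, {Omega_i}, O, b0>."""
    states: list   # S
    actions: list  # actions[i] = action set of agent i
    obs: list      # obs[i] = observation set of agent i
    P: dict        # P[(s, joint_a)] = {s': prob}
    R: dict        # R[(s, joint_a)] = shared reward (cooperative setting)
    O: dict        # O[(joint_a, s')] = {joint_o: prob}
    b0: dict       # initial state distribution

def evaluate(m, policies, horizon):
    """Expected total reward of a joint policy over a finite horizon.

    Each agent's policy maps its *own* observation history (a tuple) to a
    local action -- this is what makes the control decentralized: no agent
    sees the state or the other agents' observations.
    """
    n = len(m.actions)

    def go(s, hists, t):
        if t == horizon:
            return 0.0
        joint_a = tuple(policies[i](hists[i]) for i in range(n))
        v = m.R[(s, joint_a)]
        for s2, ps in m.P[(s, joint_a)].items():
            for joint_o, po in m.O[(joint_a, s2)].items():
                if ps * po > 0:
                    next_hists = tuple(hists[i] + (joint_o[i],) for i in range(n))
                    v += ps * po * go(s2, next_hists, t + 1)
        return v

    return sum(p * go(s, ((),) * n, 0) for s, p in m.b0.items())

# Hypothetical two-agent toy instance: deterministic drift s0 -> s1.
m = DecPOMDP(
    states=['s0', 's1'],
    actions=[['a', 'b'], ['a', 'b']],
    obs=[['o'], ['o']],
    P={('s0', ('a', 'a')): {'s1': 1.0}, ('s1', ('a', 'a')): {'s1': 1.0}},
    R={('s0', ('a', 'a')): 1.0, ('s1', ('a', 'a')): 2.0},
    O={(('a', 'a'), 's1'): {('o', 'o'): 1.0}},
    b0={'s0': 1.0},
)
policies = [lambda h: 'a', lambda h: 'a']  # each agent always plays 'a'
value = evaluate(m, policies, horizon=2)   # 1.0 now + 2.0 next step = 3.0
```

Even this sketch shows where the hardness comes from: policies range over observation *histories*, so the joint policy space grows doubly exponentially with the horizon, matching the NEXP-completeness result discussed in the survey.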
Additional information
This work was done while S. Seuken was a graduate student in the Computer Science Department of the University of Massachusetts, Amherst.
Cite this article
Seuken, S., Zilberstein, S. Formal models and algorithms for decentralized decision making under uncertainty. Auton Agent Multi-Agent Syst 17, 190–250 (2008). https://doi.org/10.1007/s10458-007-9026-5