Asymptotically Optimal Agents

Lattimore, Tor; Hutter, Marcus

doi:10.1007/978-3-642-24412-4_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6925)

Included in the following conference series: International Conference on Algorithmic Learning Theory

Abstract

Artificial general intelligence aims to create agents capable of learning to solve arbitrary interesting problems. We define two versions of asymptotic optimality and prove that no agent can satisfy the strong version, while in some cases, depending on discounting, there does exist a non-computable weak asymptotically optimal agent.

Author information

Authors and Affiliations

Research School of Computer Science, Australian National University, Australia
Tor Lattimore & Marcus Hutter
ETH Zürich, Switzerland
Marcus Hutter

Editor information

Editors and Affiliations

Department of Computer Science, University of Helsinki, Gustaf Hällströmin katu 2b, P.O. Box 68, 00014, Helsinki, Finland
Jyrki Kivinen & Esko Ukkonen
Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, AB, Canada
Csaba Szepesvári
Division of Computer Science, Hokkaido University, N-14, W-9, 060-0814, Sapporo, Japan
Thomas Zeugmann

About this paper

Cite this paper

Lattimore, T., Hutter, M. (2011). Asymptotically Optimal Agents. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2011. Lecture Notes in Computer Science, vol 6925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_29

DOI: https://doi.org/10.1007/978-3-642-24412-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24411-7
Online ISBN: 978-3-642-24412-4
eBook Packages: Computer Science, Computer Science (R0)
