Abstract
We consider a version of the classical stochastic multi-armed bandit problem in which the number of arms is large relative to the time horizon, with the goal of minimizing the cumulative regret. The mean reward (or value) of each newly chosen arm is assumed to be drawn i.i.d. from a common value distribution. We further make the simplifying assumption that the value of an arm is revealed once that arm is chosen. We present a general lower bound on the regret, together with learning algorithms that achieve this bound up to a logarithmic factor. In contrast to previous work, we do not assume that the functional form of the tail of the value distribution is known. We also consider a variant of the model in which sampled arms are non-retainable, i.e., an arm is lost unless it is played continuously, and obtain similar near-optimality results.
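To make the model concrete, the following is a minimal simulation sketch of the many-armed setting described above: each freshly sampled arm has an i.i.d. value, and (per the paper's simplifying assumption) that value is revealed as soon as the arm is chosen. The threshold-and-commit rule used here is only an illustrative strategy for this regime, not the paper's algorithm, and the Uniform(0, 1) value distribution is an assumption made for the example.

```python
import random


def threshold_strategy(horizon, threshold, seed=0):
    """Illustrative play in the many-armed model.

    Each new arm's value (mean reward) is drawn i.i.d. Uniform(0, 1) and,
    per the simplifying assumption, revealed once the arm is chosen. The
    agent samples fresh arms until one exceeds `threshold`, then commits
    to that arm for the remaining rounds.
    """
    rng = random.Random(seed)
    total = 0.0
    committed = None  # value of the arm we committed to, if any
    for _ in range(horizon):
        if committed is None:
            value = rng.random()      # i.i.d. value of a fresh arm
            total += value
            if value >= threshold:
                committed = value     # retain this arm from now on
        else:
            total += committed
    # Regret measured against always playing a near-optimal arm
    # (value approaching 1, the supremum of the Uniform(0, 1) support).
    return horizon * 1.0 - total
```

Raising the threshold trades longer exploration of fresh arms for a better committed arm; the paper's interest is in how to tune this trade-off when the tail of the value distribution near its supremum is unknown.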
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
David, Y., Shimkin, N. (2014). Infinitely Many-Armed Bandits with Unknown Value Distribution. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science(), vol 8724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44848-9_20
Print ISBN: 978-3-662-44847-2
Online ISBN: 978-3-662-44848-9