Abstract
We consider a variant of the Multi-Armed Bandit problem that involves a large pool of a priori identical arms (or items). Each arm is associated with a deterministic value, sampled from a probability distribution with unknown maximal value, and revealed once that arm is chosen. At each time instant the agent may either choose a new arm (with unknown value) or a previously chosen arm whose value has already been revealed. The goal is to minimize the cumulative regret relative to the best arm in the pool. Previous work established a lower bound on the regret for this model, which depends on the functional form of the tail of the value distribution, along with algorithms that attain this bound up to logarithmic terms. Here, we present a more refined algorithm that attains the same order as the lower bound. We further consider several variants of the basic model, including an anytime algorithm and the case of non-retainable arms. Numerical experiments demonstrate the superior performance of the suggested algorithms.
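The model described above can be illustrated with a toy simulation: arm values are drawn i.i.d. from a distribution (Uniform(0, 1) here, purely as an assumption for illustration), a chosen arm's value is revealed and retained, and regret is measured against the supremum of the value distribution. The simple explore-then-commit policy below is only a sketch to make the setting concrete; it is not the refined algorithm proposed in the paper.

```python
import random

def simulate(T=10_000, explore=200, seed=0):
    """Toy simulation of the many-armed bandit with deterministic rewards.

    Arm values are i.i.d. Uniform(0, 1). Querying a new arm reveals its
    value, which can then be replayed at will. The explore-then-commit
    policy used here is illustrative only, not the paper's algorithm.
    Returns the cumulative regret against the distribution's supremum.
    """
    rng = random.Random(seed)
    sup_value = 1.0        # supremum of the Uniform(0, 1) value distribution
    best_seen = 0.0        # best value revealed so far
    regret = 0.0
    for t in range(T):
        if t < explore:
            reward = rng.random()              # query a fresh arm
            best_seen = max(best_seen, reward)
        else:
            reward = best_seen                 # replay the best revealed arm
        regret += sup_value - reward
    return regret

print(simulate())
```

Tuning the exploration budget trades off the cost of sampling new arms against the per-step gap of the committed arm; the paper's contribution is an algorithm whose regret matches the model's lower bound without such a fixed budget.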
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
David, Y., Shimkin, N. (2015). Refined Algorithms for Infinitely Many-Armed Bandits with Deterministic Rewards. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science(), vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_29
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8