Computing the Performance of a New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

He, James K.; Villar, Sofía S.; Mavrogonatou, Lida

doi:10.1007/978-3-031-37717-4_10

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 711)

Included in the following conference series:

Science and Information Conference

Abstract

Designing experiments often requires balancing learning about the true treatment effects against earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can attain both optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g., statistical power) but substantially better in earning (e.g., direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
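
The learning-versus-earning trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' GI-modified rule, which is not specified on this page; as a stand-in it uses Thompson sampling, a different and simpler adaptive allocation rule. The function name `simulate`, the policy labels, and the Gamma(1, 1) prior on each arm's exponential rate are all assumptions made for illustration.

```python
import random


def simulate(policy, true_means, n_patients, seed=0):
    """Run one simulated experiment with exponential rewards.

    policy: "equal" (fixed alternating allocation) or "thompson"
            (Thompson sampling, used here as a stand-in adaptive rule;
            it is NOT the Gittins Index rule from the paper).
    Returns (total_reward, allocations_per_arm).
    """
    rng = random.Random(seed)
    k = len(true_means)
    # Gamma(shape, rate) posterior on each arm's exponential rate.
    # The Gamma(1, 1) prior is an illustrative assumption.
    shape = [1.0] * k
    rate = [1.0] * k
    counts = [0] * k
    total = 0.0
    for t in range(n_patients):
        if policy == "equal":
            arm = t % k
        else:
            # Draw a plausible rate for each arm from its posterior.
            # random.gammavariate takes (shape, scale), so scale = 1 / rate.
            draws = [rng.gammavariate(shape[a], 1.0 / rate[a]) for a in range(k)]
            # Mean reward is 1 / rate, so the smallest sampled rate wins.
            arm = min(range(k), key=lambda a: draws[a])
        # random.expovariate takes the rate, i.e. 1 / mean.
        reward = rng.expovariate(1.0 / true_means[arm])
        # Conjugate update: shape grows by 1, rate by the observed reward.
        shape[arm] += 1.0
        rate[arm] += reward
        counts[arm] += 1
        total += reward
    return total, counts


if __name__ == "__main__":
    for policy in ("equal", "thompson"):
        total, counts = simulate(policy, true_means=[1.0, 10.0], n_patients=500)
        print(policy, "total reward:", round(total, 1), "allocations:", counts)
```

Under these assumptions, the adaptive policy tends to assign most participants to the arm with the larger mean reward, while equal allocation splits them evenly regardless of the observed outcomes. Assessing the learning side (for example, statistical power) would need repeated simulations and a hypothesis test, which this sketch omits.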

Data Availability Statement

The results presented in this work were based solely on simulated data. The code used for generating the data and the subsequent analysis is available through the GitHub repository: https://github.com/james-helium/gittins_adaptive_trials.

Acknowledgment

JKH thanks the University of Cambridge MRC Biostatistics Unit Summer Internship Program for the support and training that made this research possible. LM and SSV acknowledge funding and support from the UK Medical Research Council (MC_UU_00002/15).

Author information

Authors and Affiliations

University of Cambridge, Cambridge, CB2 1TN, UK
James K. He, Sofía S. Villar & Lida Mavrogonatou
Yonder Technology Limited, London, EC1V 9HX, UK
James K. He

Contributions

LM and SV defined the internship project proposal that led to this work; LM supervised the internship work. JKH and LM designed the simulation studies; JKH performed the simulations and analysed results; JKH drafted the manuscript; JKH, LM and SV contributed to the writing and editing of the manuscript.

Corresponding author

Correspondence to James K. He.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai

Ethics declarations

Rights Retention Statement

For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.

About this paper

Cite this paper

He, J.K., Villar, S.S., Mavrogonatou, L. (2023). Computing the Performance of a New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards. In: Arai, K. (eds) Intelligent Computing. SAI 2023. Lecture Notes in Networks and Systems, vol 711. Springer, Cham. https://doi.org/10.1007/978-3-031-37717-4_10

DOI: https://doi.org/10.1007/978-3-031-37717-4_10
Published: 01 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37716-7
Online ISBN: 978-3-031-37717-4
eBook Packages: Intelligent Technologies and Robotics (R0)
