Computing the Performance of a New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

  • Conference paper
Intelligent Computing (SAI 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 711)

Abstract

Designing experiments often requires balancing learning about the true treatment effects against earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can attain both optimality and computational efficiency, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel modified GI design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of GI-based participant allocation to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
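
To make the learning-versus-earning trade-off concrete, the following is a minimal Python sketch. It is not the authors' GI-based design. It assumes a multi-armed experiment with exponential rewards in which larger rewards are better, and it swaps in Thompson sampling with a Gamma(1, 1) prior on each arm's rate as a simple stand-in adaptive rule. The function name `simulate`, the prior, and the arm means used below are illustrative assumptions.

```python
import random


def simulate(true_means, n_participants, adaptive, seed=0):
    """Allocate participants one at a time.

    Returns (counts, total): participants per arm and the summed reward.
    Illustrative stand-in only; this is NOT the Gittins Index rule.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k    # participants allocated to each arm
    sums = [0.0] * k    # sum of observed rewards on each arm
    total = 0.0
    for t in range(n_participants):
        if adaptive:
            # Thompson sampling (assumed stand-in): with a Gamma(1, 1) prior on
            # the exponential rate, the posterior is Gamma(shape=1 + n,
            # rate=1 + sum of rewards). gammavariate takes a SCALE parameter,
            # so the rate is inverted.
            draws = [
                rng.gammavariate(1 + counts[a], 1.0 / (1.0 + sums[a]))
                for a in range(k)
            ]
            # The smallest sampled rate corresponds to the largest mean reward.
            arm = min(range(k), key=lambda a: draws[a])
        else:
            # Non-adaptive design: fixed equal allocation in rotation.
            arm = t % k
        # expovariate takes the rate, i.e. 1 / mean.
        reward = rng.expovariate(1.0 / true_means[arm])
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return counts, total


if __name__ == "__main__":
    means = [1.0, 5.0]  # hypothetical arm means; arm 1 is the superior one
    for label, flag in (("equal", False), ("adaptive", True)):
        counts, total = simulate(means, 200, adaptive=flag, seed=1)
        print(label, counts, round(total, 1))
```

In this sketch, "earning" is the total reward collected and "learning" would be assessed separately, for example by the power of a test comparing the arms at the end of the experiment. An adaptive rule tends to place more participants on the better arm, which raises the total reward but leaves fewer observations on the other arms.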

Data Availability Statement

The results presented in this work were based solely on simulated data. The code used for generating the data and the subsequent analysis is available through the GitHub repository: https://github.com/james-helium/gittins_adaptive_trials.

Acknowledgment

JKH thanks the University of Cambridge MRC Biostatistics Unit Summer Internship Program for the support and training that made this research possible. LM and SSV acknowledge funding and support from the UK Medical Research Council (MC_UU_00002/15).

Author information

Contributions

LM and SSV defined the internship project proposal that led to this work; LM supervised the internship work. JKH and LM designed the simulation studies; JKH performed the simulations and analysed the results; JKH drafted the manuscript; JKH, LM and SSV contributed to the writing and editing of the manuscript.

Corresponding author

Correspondence to James K. He.

Ethics declarations

Rights Retention Statement

For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

He, J.K., Villar, S.S., Mavrogonatou, L. (2023). Computing the Performance of a New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards. In: Arai, K. (eds) Intelligent Computing. SAI 2023. Lecture Notes in Networks and Systems, vol 711. Springer, Cham. https://doi.org/10.1007/978-3-031-37717-4_10
