Introduction

“…I think we’re falling between two stools at the moment.… I think we have to take a step backward and address the basics of our game.”

––Donal Lenihan, 25 Nov 2020, RTÉ Rugby Podcast, on Ireland’s need to revise training strategy following a string of defeats to England.

Criticism of much animal-based preclinical research has centred on reproducibility issues and poor translation [1, 2]. Causes are systemic and multifactorial, and include poor model fidelity, clinical irrelevance of target biomarkers or molecular pathways, and between-lab disparities in models and procedures [3, 4]. Difficulties in verifying and replicating methodology [5], and methodological issues related to poor statistical design and analysis, are also major contributors [6,7,8,9,10]. Translational failure has massive economic repercussions. Advances in therapeutic agent or diagnostic development are more than offset by multimillion-dollar losses in investment and ultimately unsustainable research and development costs [6, 11, 12]. There is also a significant ethical component to these failures. If questionable methodology produces biased or invalid results, evidence derived from animal-based research cannot serve as a reliable bridge to human clinical trials [13]. It is difficult to justify the continued use of millions of animals each year if the majority are wasted in non-informative experiments that fail to produce tangible benefit.

In this commentary, I suggest that preclinical research has ‘fallen between two stools’ by conforming to neither the clinical trial nor the agricultural research tradition and skill set, and by displaying little of the rigour of either. The solution is a return to the basics for statistical educators and consultants: statistical training explicitly tailored to non-statistician investigators, and coverage of statistical issues and topics relevant to preclinical research. In particular, I urge a change in focus from statistics as ‘just maths’ to statistics as process. I argue that reform of introductory statistics curricula along these lines could go far towards reversing the statistical pathologies common to much of the preclinical research literature.

Main text

Two stools of competing traditions

The clinical trial and agricultural/industrial research traditions show considerable divergence in focus and methodology. Clinical trials are performed when there is uncertainty regarding the relative efficacy of a specific clinical intervention. They are constrained by the necessity to minimise subject risk of mortality and severe adverse events. In general, clinical trials tend to be relatively large and simple, with only two or a few comparator interventions randomly assigned to many subjects, ideally representative of the target population. Although clinical trials have a history going back several hundred years (e.g. [14]), the randomised controlled trial (RCT) became the gold standard only relatively recently: the first modern RCT was performed in 1946 [15, 16], and formalisation followed only in the late 1970s. Lagging implementation was due in part to resistance to the so-called “numerical approach” by supporters of the non-randomised “let's-try-it-and-see” attitude to clinical research problems [17, 18]. Meanwhile, methodology for observational studies was being developed in parallel. Cohort studies in particular have had a key role in epidemiological investigations of carcinogenic and environmental hazards when RCTs are not feasible [19]. Because factors are not randomly assigned to subjects, inferring causality requires stringent methodological safeguards for minimising confounding and bias [15, 20, 21].

In contrast, agricultural/industrial designs are characterised by small sample sizes and multiple factors studied simultaneously. In addition to randomisation, key design features include replication and blocking (‘local control’), coupled with formal, statistically structured arrangements of input variables, such as randomised complete block and factorial designs [22]. Agricultural designs were developed primarily by Sir Ronald Fisher in the first half of the twentieth century. These principles were subsequently extended to industrial experimentation by George Box and collaborators [23]. Industrial experiments are further distinguished by sequential implementation (data from a small or restricted group of runs in the original experiment can be used to inform the next experiment), with prompt feedback (immediacy), allowing iteration and relatively rapid convergence to target solutions [24]. For these applications, variable screening and model building are both of interest, and ‘design’ is essentially the imposition of a statistical model as a useful approximation to the response of interest [23, 25].
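To make blocking concrete, the sketch below is a minimal illustration (not taken from the cited sources; the litter and treatment names are hypothetical) of how a randomised complete block allocation might be generated in Python: each litter acts as a block and receives every treatment once, with the assignment order randomised independently within each litter.

```python
# Minimal sketch of a randomised complete block design (RCBD) allocation.
# Litters act as blocks; every treatment appears once in each litter, and the
# assignment within each litter is determined by a seeded random permutation.
import numpy as np

rng = np.random.default_rng(20231101)  # fixed seed so the allocation is reproducible

treatments = ["control", "low_dose", "high_dose"]            # hypothetical arms
litters = ["litter_A", "litter_B", "litter_C", "litter_D"]   # hypothetical blocks

allocation = {
    litter: list(rng.permutation(treatments))  # independent randomisation per block
    for litter in litters
}

for litter, order in allocation.items():
    print(litter, order)
# Treatments are compared within litters, so litter-to-litter (block) variation
# is separated from the error term rather than inflating it.
```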

Preclinical studies: between the stools

Animal-based research studies are unique in their explicit ethical obligation to minimise the numbers of animals used. Application of the Three Rs (Replacement, Reduction, Refinement) principles is based on the premise that maximum scientific value should be obtained with minimal harm [26]. However, over-emphasis on reducing animal numbers has contributed to underpowered experiments that generate unreliable, and ultimately non-informative, results [27, 28].

Small sample sizes, large variability, multi-group comparisons, and the exploratory nature of much preclinical research suggest that study designs should be more aligned with the agricultural/industrial tradition. Fisher-type designs (such as randomised complete blocks and factorials) are fit for purpose and have been vigorously promoted [12, 29,30,31,32,33], as have procedural methods for controlling variation without increasing sample size [34] and design features that increase validity [1, 35]. However, these methods seem to be virtually unknown in the preclinical literature [7, 8, 36,37,38]. Two-group comparisons more typical of clinical trials are common, although they are unsuited to assessing multiple factors and their interactions. Informal examination of introductory textbooks and statistics course syllabi suggests that these knowledge gaps are due in part to sparse formal training in experimental design and to neglect of analytical methods better suited to preclinical data. Compounding these problems is a lack of general statistical oversight. Unlike the situation for human-based studies [39], few animal research oversight committees in the USA have access to properly qualified biostatisticians, statistical analysis plans and study preregistration are not required, and protocol review criteria vary considerably between institutions [40].
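As a contrast with separate two-group comparisons, the following sketch (simulated data and hypothetical factor names; effect sizes chosen only for illustration) shows how a single 2 × 2 factorial experiment, analysed with a two-way ANOVA, estimates both main effects and their interaction from the same animals.

```python
# Minimal sketch: a 2 x 2 factorial experiment analysed with a two-way ANOVA.
# One experiment yields estimates of both main effects and their interaction,
# which separate two-group comparisons cannot provide.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_per_cell = 6  # hypothetical number of animals per factor combination

design = pd.DataFrame(
    [(drug, diet)
     for drug in ("vehicle", "drug")
     for diet in ("standard", "high_fat")
     for _ in range(n_per_cell)],
    columns=["drug", "diet"],
)

# Simulated response: additive main effects plus a drug-by-diet interaction and noise
effect = (
    1.0 * (design["drug"] == "drug")
    + 0.5 * (design["diet"] == "high_fat")
    + 1.5 * ((design["drug"] == "drug") & (design["diet"] == "high_fat"))
)
design["response"] = 10 + effect + rng.normal(0, 1, len(design))

model = smf.ols("response ~ C(drug) * C(diet)", data=design).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects and interaction in one table
```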

Statistical pathologies in the preclinical literature

Bad statistical practices are deeply entrenched in the preclinical literature. Many of the major errors observed in the research literature involve statistical basics [41,42,43]. Statistics service courses tend to emphasise the mathematical aspects of probability and null hypothesis significance testing at the expense of the non-mathematical components of statistical process [44,45,46]. Consequently, it is now part of the belief system of many investigators that ‘statistical significance (P < 0.05)’ is the major criterion for assessing the biological importance of results, and that P-values are an intrinsic property of the biological event or group of animals being studied [47]. As a result, there is over-reliance on rote hypothesis testing and P-values to interpret results. Related pathologies include reporting of orphan, inexact P-values without context, P-hacking, N-hacking, selective reporting, and spin [41, 48].
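The point that a P-value is not an intrinsic property of the animals being studied can be shown with a simple simulation. The sketch below (illustrative settings only; the sample sizes and effect sizes are hypothetical) repeats the same two-group experiment twenty times under identical conditions and prints the resulting P-values, which vary widely even though the underlying effect never changes.

```python
# Minimal simulation: identical small two-group experiments, identical true effect,
# widely varying P-values. A P-value reflects the sample and the procedure, not a
# fixed property of the biology being studied.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_per_group, true_diff, sd, n_repeats = 8, 1.0, 1.5, 20  # hypothetical settings

p_values = []
for _ in range(n_repeats):
    control = rng.normal(0.0, sd, n_per_group)
    treated = rng.normal(true_diff, sd, n_per_group)
    p_values.append(stats.ttest_ind(treated, control).pvalue)

print([round(p, 3) for p in p_values])
# Some replicates fall below 0.05 and others do not, despite identical conditions.
```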

A second problem area is poor understanding by investigators of basic statistical concepts and operational definitions. Statistical terms are frequently conflated with lay meanings, confused with other technical definitions, or ignored. Concepts that seem especially misunderstood include ‘study design’, ‘randomisation’, ‘cohort’, ‘unit of analysis’, and ‘replication’. To investigators, ‘study design’ refers primarily to descriptions of technical methodology and materials, e.g. [49]. To applied statisticians, ‘study design’ is the formal arrangement and structuring of the independent or predictor variables hypothesised to affect the response or outcome of interest. A good study design maximises the experimental signal by accounting for diverse sources of variability [31, 50, 51], and incorporates specific design features to ensure results are reliable and valid, such as correct specification of the unit of analysis, relevant outcome measures, inclusion and exclusion criteria, and bias-minimisation methods [8, 35, 52]. ‘Randomisation’ to statisticians is a formal probabilistic process that minimises selection bias and the effects of latent confounders, and is the cornerstone of statistical inference. In contrast, randomisation in preclinical studies seems to be frequently misinterpreted in the lay sense of ‘unplanned’ or ‘haphazard’ [53], or is likely not performed at all [8, 38, 54, 55]. The common habit of referring to a group of animals subjected to a given treatment or intervention as a ‘cohort’ likely reflects non-random allocation of subjects to a defined intervention group, an invalid and confounded assignment strategy [56]. The term ‘cohort’ actually refers to groups of subjects in observational studies, where group membership is defined by some common characteristic [19]; it does not refer to experimental treatment groups, for which allocation is determined by randomisation. The meaning of ‘unit of analysis’ is virtually unknown, or is confused with biological and observational units [56,57,58]. ‘Replication’ is frequently interpreted solely as duplication of the total sample size for ‘reproducibility’ [59], rather than as an independent repeat run of each combination of treatment factors [25].
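To illustrate what a formal probabilistic allocation looks like in practice, the sketch below (hypothetical animal identifiers and group sizes) assigns twelve animals to two treatment groups with a seeded random permutation that can be documented in the study protocol, rather than by informal or haphazard selection.

```python
# Minimal sketch of formal randomisation: animals are allocated to pre-specified
# groups by a documented, seeded random permutation, not by whichever animal
# happens to be picked up first.
import numpy as np

rng = np.random.default_rng(20240115)  # seed recorded in the study protocol

animal_ids = [f"mouse_{i:02d}" for i in range(1, 13)]  # hypothetical identifiers
groups = ["control"] * 6 + ["treated"] * 6             # pre-specified group sizes

allocation = dict(zip(rng.permutation(animal_ids), groups))
for animal, group in sorted(allocation.items()):
    print(animal, "->", group)
```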

A third area of concern is that the conventional statistical arsenal of t-tests, ANOVA, and χ² tests [60, 61] is unsuited to analysing the ‘problem’ data typical of many preclinical studies. ‘Problem’ data include non-Gaussian, correlated (clustered, nested, or time-dependent), or non-linear data; data that are missing at random or missing because of dropout or attrition; data characterised by over-representation of true zeros; and high-dimensional data. A major deficiency that must be addressed is the focus of introductory courses on methods virtually unchanged since the 1950s, with little coverage of modern methods more appropriate for such data [8, 35, 44].
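As one example of a method better matched to clustered preclinical data, the sketch below (simulated data; the cage structure, variable names, and effect sizes are hypothetical) fits a linear mixed-effects model with a random cage effect, so that animals housed together are not treated as independent replicates, as they implicitly are in a naive t-test.

```python
# Minimal sketch: a linear mixed-effects model for cage-clustered data.
# Treatment is applied per cage, so cage is the relevant grouping (random effect);
# analysing individual animals with a plain t-test would ignore this clustering.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_cages, animals_per_cage = 8, 4  # hypothetical

rows = []
for cage in range(n_cages):
    treatment = "treated" if cage % 2 else "control"  # treatment assigned per cage
    cage_effect = rng.normal(0, 0.8)                  # shared cage-level variation
    for _ in range(animals_per_cage):
        response = (5.0
                    + (1.0 if treatment == "treated" else 0.0)
                    + cage_effect
                    + rng.normal(0, 1.0))
        rows.append({"cage": f"cage_{cage}", "treatment": treatment, "response": response})

df = pd.DataFrame(rows)
model = smf.mixedlm("response ~ treatment", df, groups=df["cage"]).fit()
print(model.summary())  # fixed treatment effect plus a cage-level variance component
```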

Finally, little attention is paid to methods for identifying diverse sources of variation during experiment planning. Research papers rarely report auxiliary variables and conditions (related to animal signalment, environment, and procedures) that are only indirectly related to the main experiments, e.g. [62]. Such variables have unpredictable effects on animals and experimental results, producing uncontrolled variation that obscures true treatment effects. For example, systematic investigations of factors contributing to survival time in mouse models of amyotrophic lateral sclerosis suggested that claims for therapeutic efficacy were most likely due to the effects of uncontrolled variation rather than actual drug effects [12, 29, 33].

Outlook

Lack of knowledge on the part of investigators is related to training deficiencies on the part of statistics educators. The solution is a return to the basics: statistical education that meets the needs of non-statistician investigators, and curricula addressing design and data issues specific to preclinical research. This is hardly new: in 1954, John Tukey identified it as essential that “statistical methods should be tailored to the real needs of the user” [63], and the point has been repeated in the decades since [9, 44, 46, 64, 65]. Investigators still identify better training in statistics and statistical methods as a high priority [9, 64]. The June 2021 report by the Advisory Committee to the Director of the National Institutes of Health (NIH-ACD) made five major recommendations to improve the rigour and reproducibility of animal-based research, among them recognition of the need for “modern and innovative statistics curricula relevant to animal researchers” [9].

What do researchers need? The poor internal validity characterising much preclinical research [66] reflects poor understanding of the upstream basics of statistically based study design and data sampling strategies. Unreliable downstream results cannot be rescued by fancy analyses after the fact, as Fisher himself warned [67]. Therefore, the concept that good statistical principles must be built in during planning, before data are collected, must be introduced and reinforced. This can be accomplished, first, by more appropriate training of entry-level researchers, with courses and topic coverage more attuned to their specific needs, and, second, by removal of longstanding barriers (such as cost and academic credit) to early consultation with appropriately trained statisticians. Early formal involvement of applied statisticians in the planning process must be encouraged and rewarded [9, 68].

Statistical educators and consultants must be re-educated to better address actual research needs. ‘Statistics’ is neither just maths nor an analytical frill tacked on to a study after data have been collected. Instead, statisticians must first structure instructional materials to reflect the basic tenets of statistical process: design before inference, and data quality before analysis [69]. Data curation skills are also part of good statistical practice [46], and have been identified as such for nearly a century [70]. These practices are not strongly mathematical, and unfortunately statisticians tend to be uninterested in non-mathematical procedures [46, 71]. Second, service courses must shift away from pedagogical approaches common to applied maths or algebra, where uncritical analysis of a data set leads to a fixed ‘correct’ solution [46, 71, 72]. Procedural change could be accelerated by statisticians becoming more aware of best-practice expectations through evidence-based planning [73] and reporting [74] guidelines. These tools can direct early-stage study planning to ensure that procedures strengthening study validity are incorporated [4, 35, 74, 75].

Proper experimental design and analysis is an ethical issue [28, 66, 69]. Shifting the focus of statistical education from rote hypothesis testing to sound methodology should ultimately reduce the number of animals wasted in non-informative experiments and increase the overall scientific quality and value of published research.