Improving reporting standards for polygenic scores in risk prediction studies

Wand, Hannah; Lambert, Samuel A.; Tamburro, Cecelia; Iacocca, Michael A.; O’Sullivan, Jack W.; Sillari, Catherine; Kullo, Iftikhar J.; Rowley, Robb; Dron, Jacqueline S.; Brockman, Deanna; Venner, Eric; McCarthy, Mark I.; Antoniou, Antonis C.; Easton, Douglas F.; Hegele, Robert A.; Khera, Amit V.; Chatterjee, Nilanjan; Kooperberg, Charles; Edwards, Karen; Vlessis, Katherine; Kinnear, Kim; Danesh, John N.; Parkinson, Helen; Ramos, Erin M.; Roberts, Megan C.; Ormond, Kelly E.; Khoury, Muin J.; Janssens, A. Cecile J. W.; Goddard, Katrina A. B.; Kraft, Peter; MacArthur, Jaqueline A. L.; Inouye, Michael; Wojcik, Genevieve L.

doi:10.1038/s41586-021-03243-6

Perspective
Published: 10 March 2021

Volume 591, pages 211–219, (2021)

Hannah Wand^1,2^na1,
Samuel A. Lambert ORCID: orcid.org/0000-0001-8222-008X^3,4,5,6,7^na1,
Cecelia Tamburro⁸,
Michael A. Iacocca¹,
Jack W. O’Sullivan^1,2,
Catherine Sillari⁸,
Iftikhar J. Kullo ORCID: orcid.org/0000-0002-6524-3471⁹,
Robb Rowley⁸,
Jacqueline S. Dron ORCID: orcid.org/0000-0002-3045-6530^10,11,
Deanna Brockman¹⁰,
Eric Venner¹²,
Mark I. McCarthy ORCID: orcid.org/0000-0002-4393-0510^13,14,
Antonis C. Antoniou¹⁵,
Douglas F. Easton ORCID: orcid.org/0000-0003-2444-3247¹⁵,
Robert A. Hegele¹¹,
Amit V. Khera ORCID: orcid.org/0000-0001-6535-5839¹⁰,
Nilanjan Chatterjee^16,17,
Charles Kooperberg¹⁸,
Karen Edwards¹⁹,
Katherine Vlessis²⁰,
Kim Kinnear²⁰,
John N. Danesh^5,6,21,
Helen Parkinson^6,7,
Erin M. Ramos⁸,
Megan C. Roberts²²,
Kelly E. Ormond ORCID: orcid.org/0000-0002-1033-0818^20,23,
Muin J. Khoury²⁴,
A. Cecile J. W. Janssens²⁵,
Katrina A. B. Goddard^26,27,
Peter Kraft ORCID: orcid.org/0000-0002-4472-8103²⁸,
Jaqueline A. L. MacArthur ORCID: orcid.org/0000-0002-3550-9769⁷,
Michael Inouye ORCID: orcid.org/0000-0001-9413-6520^{3,4,5,6,21,29}^na2 &
Genevieve L. Wojcik ORCID: orcid.org/0000-0001-7206-8088³⁰^na2

Abstract

Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translation of PRSs into clinical care. Here, in a collaboration between the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog, we present the Polygenic Risk Score Reporting Standards (PRS-RS), in which we update the Genetic Risk Prediction Studies (GRIPS) Statement to reflect the present state of the field. Drawing on the input of experts in epidemiology, statistics, disease-specific applications, implementation and policy, this comprehensive reporting framework defines the minimal information that is needed to interpret and evaluate PRSs, especially with respect to downstream clinical applications. Items span detailed descriptions of study populations, statistical methods for the development and validation of PRSs and considerations for the potential limitations of these scores. In addition, we emphasize the need for data availability and transparency, and we encourage researchers to deposit and share PRSs through the PGS Catalog to facilitate reproducibility and comparative benchmarking. By providing these criteria in a structured format that builds on existing standards and ontologies, the use of this framework in publishing PRSs will facilitate translation into clinical care and progress towards defining best practice.

Main

The predisposition to common diseases and traits arises from a complex interaction between genetic and non-genetic factors. In the past decade there has been enormous success at discovering disease-associated genetic variants, facilitated by many collaborative consortia and large cohorts of well-phenotyped individuals with matched genetic information^1,2,3,4,5. In particular, genome-wide association studies (GWASs) have yielded summary statistics that describe the magnitude (effect size) and statistical significance of association between an allele and the outcome of interest^4,6. GWASs have been applied to many complex human traits and diseases, including height, blood pressure, cardiovascular disease, cancer, obesity and Alzheimer’s disease.

The associations identified through GWASs can be combined to quantify genetic predisposition to a heritable trait, and this information can be used to conduct disease risk stratification or to predict prognostic outcomes and response to therapy^7,8. Typically, information across many variants is combined by means of a weighted sum of allele counts, in which the weights reflect the relative magnitude of association between variant alleles and the trait or disease. These weighted sums can include millions of variants, and are frequently referred to as polygenic risk scores (PRSs), or genetic or genomic risk scores (GRSs), if they refer to risk estimates of disease outcomes; or, more generally, polygenic scores (PGSs) when referring to any outcome (Box 1). Although algorithms are actively being developed to decide how many and which variants to include and how to weigh them so as to maximize the proportion of variance explained or disease discrimination, there is an emerging consensus that inclusion of variants beyond those that meet stringent GWAS significance levels can boost predictive performance^9,10. Methodological research has also established theoretical limits of the performance of PGSs and PRSs on the basis of the genetic architecture and heritability of the trait^{11,12,13,14,15}.
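The weighted sum described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the three weights and the genotypes are hypothetical, and the standardization step assumes a reference population whose scores are available.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts (0, 1 or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(w * d for w, d in zip(weights, dosages))


def standardize(score, reference_scores):
    """Express a raw score in standard-deviation units of a reference population."""
    mu = mean(reference_scores)
    sd = pstdev(reference_scores)
    return (score - mu) / sd


# Hypothetical per-allele weights (for example, log odds ratios) for three variants.
weights = [0.10, -0.05, 0.20]
# Hypothetical effect-allele counts: four reference individuals and one target.
reference = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
target = [2, 0, 2]

reference_scores = [polygenic_score(g, weights) for g in reference]
raw = polygenic_score(target, weights)
z = standardize(raw, reference_scores)
print(f"raw score = {raw:.2f}, standardized score = {z:.2f}")
```

The standardized value depends entirely on which reference population is chosen, which is one reason the population used for normalization needs to be reported.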

In the last decade, the landscape of genetic prediction studies has transformed. Over 900 publications mention PGS or PRS, and there have been significant developments in how PRSs are constructed and evaluated, as well as many new proposed uses. The data available in the current era of biomedical research are larger and more consolidated than ever before. Biobanks and large-scale consortia have become dominant, yet researchers frequently have limited access to individual-level data. As individual data are often unavailable, most PRS models are developed from summary-level data (for example, GWAS summary statistics) in secondary datasets, each of which comes with its own specific methodological considerations^16,17,18. At the same time, there has been a push towards open data sharing as outlined in the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles^3,19, with an emphasis on ensuring that research is reproducible by all.

The capacity of PRSs to quantify genetic predisposition for many clinically relevant traits and diseases has begun to be established, with many potential clinical uses in settings related to disease risk stratification as well as proposed prognostic uses (for example, predicting responses to intervention or treatment). Readiness for implementation varies by outcome, and mature PRSs with potential clinical utility are available for only a few diseases—for example, coronary heart disease (CHD) and breast cancer (Boxes 2 and 3, respectively). There has also been a rapid rise of direct-to-consumer assays and for-profit companies (23andMe, Color, MyHeritage, and so on) that provide PGS and PRS results to customers outside of the traditional patient–provider framework. These concomitant developments have resulted in healthcare systems developing new infrastructures to deliver genetic risk information²⁰. Individually and combined, these advances have raised substantial challenges for PRS reporting standards—from the very basic (for example, reporting performance metrics on an external validation dataset) to the complicated (for example, making raw variant and weight information for a PRS available)—and necessitate the updating of existing standards for reporting genetic risk prediction studies to convey the increased scope of PRSs and the complexity of their clinical applications.

Poorly designed and/or described studies call into question the validity of some PRSs to predict their target outcome^21,22, and relatively few studies have externally benchmarked the performance of multiple scores. At present, there are no best practices for developing PRSs that are uniformly agreed upon, nor are there widely adopted standards or regulations that are sufficiently tailored to assess the eventual clinical readiness of a PRS. There are emerging applications of PRSs that further compound the heterogeneity in reporting; for example, using PRSs as tools for testing gene–environment interactions or shared aetiology between diseases^23,24,25,26. The rapid evolution in both methodological development and applications of PRSs makes it challenging to compare or reproduce claims about the predictive performance of a PRS for a specific outcome when studies are not properly documented. These deficiencies are barriers to PRSs being interpreted, compared, and reproduced, and must be addressed to enable the application of PRSs to improve clinical practice and public health.

Frameworks have been developed to establish standards around the transparent, standardized, accurate, complete and meaningful reporting of scientific studies. In 2011, an international working group published the Genetic Risk Prediction Studies (GRIPS) Statement—a set of reporting guidelines for risk prediction models that include genetic variants, from genetic mutations to gene scores²⁷. These guidelines are analogous to those developed for observational epidemiological studies (STROBE²⁸) and genome-wide association studies (STREGA²⁹), and are in line with the reporting guidelines for multivariate prediction models (TRIPOD³⁰). Adherence to reporting statements has been low, and the same holds for GRIPS. One reason might be that researchers feel that the GRIPS Statement inadequately addresses PRSs. Researchers are frequently uncertain as to what precisely should be reported for a PRS study to be assessed as rigorous, reproducible and ultimately translatable, especially with the increased push for data availability and transparency. Most PRS studies follow a prototypical process (Fig. 1) that can be used as a template for standardizing reporting and benchmarking in the field.

**Fig. 1: Prototype of PRS development and validation process.**

Here, the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog (Supplementary Note 1) jointly present the Polygenic Risk Score Reporting Standards (PRS-RS), an expanded and updated set of reporting standards for PRSs that addresses current research environments with advanced methodological developments to inform clinically meaningful reporting on the development and validation of PRSs in the literature, with an emphasis on reproducibility and transparency throughout the development process. Additional methods are detailed in Supplementary Note 2.

Box 1  Definitions of relevant genetic risk prediction terms

Polygenic score (PGS). A single value that quantifies an individual’s genetic predisposition to a trait. Typically calculated by summing the number of trait-associated alleles in an individual weighted by per-allele effect sizes from a discovery GWAS, and normalized using a relevant population distribution. Sometimes referred to as a genetic score.

Polygenic risk score (PRS). A subset of PGS that is used to estimate the risk of disease or other clinically relevant outcomes (binary or discrete). Sometimes referred to as a genetic or genomic risk score (GRS). See categories below.

Integrated risk model. A risk model for the outcome of interest that combines PRS with other risk factors, such as demographics (often age and sex), anthropometrics, biomarkers, and clinical measurements.

Categories of use for PRS and/or integrated risk models

The addition of PRSs to existing risk models has several potential applications, summarized below. Each aims to improve individual or subgroup classification such that there is clinical benefit.

Disease risk prediction: estimate an individual’s risk of developing a disease, on the basis of certain genetic and/or clinical variables.
Disease diagnosis: classify whether an individual has a disease, or a disease subtype, on the basis of certain genetic and/or clinical variables^9,37.
Disease prognosis: estimate the risk of further adverse outcome(s) subsequent to the diagnosis of disease³⁸.
Therapeutic: predict the response of a patient or subgroup to a particular treatment³⁹.

Box 2  Current CHD PRSs and their potential uses

Many PRSs have been developed for CHD, which vary in the computational methods used, number of variants included (50–6,000,000) and cohorts used for PRS training. For example, many of the latest CHD PRSs use GWAS summary statistics from the CardiogramPlusC4D study⁴⁰, and differ by the method of selecting and weighting individual variants (including LDpred^41,42, lassosum⁴³ and meta-scoring approaches⁴⁴) and how they are used in an integrated risk model. These PRSs may provide useful information for predicting the risk of CHD that is largely orthogonal to conventional risk factors (age, sex, hypertension, cholesterol, BMI, diabetes and smoking) as well as family history. Clinical applications may include:

Improved risk prediction for future adverse cardiovascular events when added to conventional risk models (such as the Framingham risk score⁴⁵, pooled cohort equations^43,44 and QRISK⁴³).
Reclassification of risk categories often leading to recommendations for risk-reducing treatments like statins^45,46,47.

Although the data for these clinical applications strongly suggest CHD PRSs may improve patient outcomes, clinical use through randomized clinical trials has yet to be established; however, a number of clinical trials are underway (https://clinicaltrials.gov).

Box 3  Current breast cancer PRSs and their potential uses

Many of the most recent and most predictive PRSs for breast cancer include a smaller number of variants (usually hundreds to thousands), possibly owing to a less polygenic architecture and more low-frequency variants having greater effect; however, scores composed of millions of variants also exist. PRS construction typically includes GWAS summary statistics and data from the Breast Cancer Association Consortium (BCAC), then variants passing genome-wide significance (lead SNPs), stepwise regression, penalized regression on individual-level genotypes⁴⁸, clumping and thresholding⁴¹ or Bayesian methods^42,49. In contrast to CHD, genetics is commonly used to measure breast cancer risk vis-à-vis testing for BRCA1 and BRCA2 mutations; however, routine screening for breast cancer is often performed in older women using non-genetic risk prediction tools, such as mammography. Research into PRSs for breast cancer includes multiple potential clinical uses and considerations:

Multiple PRSs exist to predict risk for subtypes of breast cancer (for example, ER-positive or -negative, luminal, and triple-negative^48,50), which could be used to stratify patients according to prognosis or for more beneficial treatments.
Integrated risk models which combine PRSs with non-genetic risk factors (such as age, family history, mammographic density, hormone replacement therapy)^{51,52,53,54,55,56,57,58}.
PRSs can provide important stratification of risk among carriers of pathogenic variants in genes that are already screened in clinical practice (for example, BRCA1, BRCA2, PALB2, CHEK2 and ATM)⁵⁹ and thus could improve clinical decision-making^{42,49,56,60,61,62,63}.

Indeed, the BOADICEA breast cancer risk prediction model includes the effects of common variants (PRS₃₁₃; ref. ⁴⁸) as well as other rare pathogenic genetic variants⁵⁶ and has been implemented in the CanRisk Tool (www.canrisk.org), which has been approved for use by healthcare professionals in the European Economic Area. The utility of PRSs has been studied in simulations⁶⁴ and is being evaluated in risk-based breast cancer screening trials in the US⁶⁵ and Europe (MyPeBS; https://mypebs.eu).

The PRS-RS

The PRS-RS is a set of standard items specifying the minimal criteria that need to be described in a manuscript to accurately interpret a PRS and reproduce results throughout the PRS development process³¹ (see Fig. 1 for a brief summary). It applies to PRS development and validation studies that aim to predict disease onset, diagnosis and prognosis, as well as response to therapies; however, other research uses of PRSs have overlapping steps that should be reported in a similar manner. Table 1 presents the full PRS-RS, with reporting items organized into key components along the developmental pipeline of PRSs for clear interpretation and to encourage their documentation from the inception of the study, well before publication.

Table 1 Polygenic Risk Score Reporting Standards (PRS-RS)

Reporting on the background of risk scores

The development and validation of a PRS tests a specific hypothesis with a defined outcome and study population. Therefore, authors should define a priori (in this and the following sections, inverted commas mark the reporting items listed in Table 1) the ‘study type’ (for example, development and/or validation), ‘risk model purpose’ (for example, risk prediction versus prognosis) and ‘predicted outcome’ (for example, CHD) in enough detail to understand why the study population and risk model selected are relevant (for example, the value of CHD risk stratification and primary prevention is higher in younger individuals than in those over 80, who have accumulated lifetime risk). As the PRS-RS is focused on clinical validity and implementation, authors must outline the study and appropriate outcomes to understand what risk is measured, what the purpose of measuring risk would be and why this purpose may be of clinical relevance. To establish the internal validity of a study, authors should use the appropriate data for the intended purpose (for example, prediction of incident disease versus prognosis), with adequate documentation of dataset characteristics to understand nuances in measured risk.

Reporting on study populations

The applicability of any risk prediction to an external target population (the who, where, and when) depends on its similarity to the original study populations that were used to derive the risk model. Therefore, authors need to define and characterize the details of their study population (‘study design and recruitment’), and describe study ‘participant demographics’ for key variables (most often age and sex) and ancestry. Notably, there are often inconsistent definitions and levels of detail associated with ancestry, and the transferability of genetic findings between different racial and ethnic groups can be limited^1,9,32. It is therefore essential for authors to provide a detailed description of the genetic ‘ancestry’ of participants—including how ancestry was determined—using a common controlled vocabulary where possible (for example, the standardized framework developed by the NHGRI-EBI GWAS Catalog¹). Authors should provide a sufficient level of detailed criteria for defining all of the factors relevant to the ‘outcome of interest’, including but not limited to those used in the risk model (‘non-genetic variables’). These details should accompany information about how the population was genotyped (‘genetic data’), including assays and all quality control measures.

Reporting on the development of risk models

At present there are several commonly used methods to select variants that constitute the PRS and fine-tune their weights^{7,16,17,18,31}. Methods using GWAS summary statistics should clearly cite the relevant GWAS, preferably using unique and persistent study identifiers from the GWAS Catalog (GCSTs)³³. As the performance and limitations of the combined risk model are dependent on methodological considerations, authors must provide complete details including the method used and how variants are combined into a single PRS (‘PRS construction and estimation’). Apart from genetic data, authors should also describe the defining criteria for other demographic and non-genetic predictors (‘non-genetic variables’) included in the model. Often authors will iterate through numerous models to find the optimal fit. In addition to the estimation methods, it is important to detail the ‘integrated risk model(s) fitting’ procedure, including the measures that were used for the selection of the final model. Translating the continuous PRS distribution to a risk estimate, whether absolute or relative, is highly dependent on assumptions and limitations that are inherent to the specific dataset used. When describing the ‘risk model type’, authors should detail the timescale used for prediction, or the study period and follow-up time for a relative hazard model. Furthermore, if relative risk is estimated, the reference group should be well described. These details should be described for the training set, as well as for validation and sub-group analyses.
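The dependence of a relative risk estimate on its reference group can be made concrete with a small sketch. Everything here is hypothetical: the helper names, the scores and outcomes, and the choice to compare the top 40% of the score distribution against the remaining 60%. A different reference group (for example, a middle quantile) would give a different estimate from the same data.

```python
def split_by_quantile(scores, outcomes, top_fraction):
    """Return (outcomes in the top fraction of scores, outcomes in the remainder)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_top = max(1, round(len(scores) * top_fraction))
    top = [outcomes[i] for i in order[:n_top]]
    rest = [outcomes[i] for i in order[n_top:]]
    return top, rest


def odds_ratio(exposed, reference):
    """Odds of the outcome in the exposed group divided by the odds in the reference group."""
    a = sum(exposed)               # cases in the exposed group
    b = len(exposed) - a           # non-cases in the exposed group
    c = sum(reference)             # cases in the reference group
    d = len(reference) - c         # non-cases in the reference group
    if 0 in (b, c, d):
        raise ValueError("odds ratio is undefined when one of these cells is empty")
    return (a / b) / (c / d)


# Hypothetical standardized scores and binary outcomes (1 = case) for ten people.
scores = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
outcomes = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

top, rest = split_by_quantile(scores, outcomes, top_fraction=0.4)
estimate = odds_ratio(top, rest)
print(f"odds ratio, top 40% versus remaining 60%: {estimate:.1f}")
```

Reporting the cut-off and the composition of both groups alongside the estimate lets readers judge how comparable it is to estimates from other studies.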

Reporting on the evaluation of risk models

Authors should report estimates for all evaluated models (including the methods used to derive them) to equip readers with the information necessary to evaluate the relative value of an increase in performance against other trade-offs. We recommend that authors provide summary information of the ‘PRS distribution’ to aid in model interpretation. The ‘predictive ability’, ‘calibration’, and ‘discrimination’ of the risk model should also be assessed and detailed with common descriptions including the risk score effect size, variance explained (R²), reclassification indices and metrics like sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The risk model ‘calibration’ and ‘discrimination’ should be described for all analyses, although their estimation and interpretation are most relevant for the PRS validation sample. It is imperative for the PRS and integrated risk models to be evaluated on a population that is external (independent, non-overlapping) to the individuals in the study population. The ability of the risk model to classify individuals of interest (‘risk model discrimination’) is commonly described and presented in terms of the area under the receiver operating characteristic (AUROC) or precision-recall curve (AUPRC), or the concordance statistic (C-index). Any differences in variable definitions or performance discrepancies between the training and validation sets should be described.
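As an illustration of the discrimination and classification metrics named above, the following sketch computes the AUROC by pairwise comparison of cases and controls, and sensitivity, specificity, PPV and NPV at a single threshold. The function names, the eight scores and labels, and the threshold of 0.5 are assumptions made for the example; a real analysis would use an external validation sample and would normally rely on an established statistics library.

```python
def auroc(scores, labels):
    """Probability that a random case scores higher than a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def classification_metrics(scores, labels, threshold):
    """Sensitivity, specificity, PPV and NPV when scores >= threshold are called high risk."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_high = s >= threshold
        if predicted_high and y == 1:
            tp += 1
        elif predicted_high:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical predicted risks and observed outcomes (1 = case) in a validation sample.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

auc = auroc(scores, labels)
metrics = classification_metrics(scores, labels, threshold=0.5)
print(f"AUROC = {auc:.3f}", metrics)
```

PPV and NPV depend on the proportion of cases in the sample, so values from a case-enriched study do not transfer directly to a population with a different prevalence.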

Reporting on interpretation

By explicitly describing the ‘risk model interpretation’ and outlining potential ‘limitations’ to the ‘generalizability’ of their model, authors will empower readers and the wider community to better understand the risk score and its relative merits. Authors should justify the clinical relevance and the ‘intended uses’ of the risk model, such as how the performance of their PRS compares to other commonly used risk models, or previously published PRSs. This may also include comparisons to other genetic predictors of disease (for example, mutations in high or moderate risk genes associated with Mendelian forms of the disease), family history, simple demographic models or conventional risk calculators (see Boxes 2 and 3 for disease-specific examples). What indicates a ‘good’ prediction can differ between outcomes and intended uses, but should be reported with similar metrics to those described in the evaluation section.

Reporting on model parameters

The underlying PRS (variant alleles and derived weights) should be made publicly available, preferably through direct submission to an indexed repository such as the PGS Catalog, to enable others to reuse existing models (with known validity) and to facilitate direct benchmarking between different PRSs for the same trait (thus promoting ‘data transparency and availability’). The current mathematical form of most PRSs—a linear combination of allele counts—facilitates clear model description and reproducibility. Future genomic risk models may have more complex forms; for example, allowing for explicit non-linear epistatic and gene–environment interactions, or deep neural networks of lesser clarity. It will be important to describe these models in sufficient detail to allow their implementation and evaluation by other researchers and clinical groups.
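To show why variant alleles and weights are sufficient to reproduce a linear PRS, the sketch below writes a small scoring table to tab-separated text, reads it back and recomputes a score. The column names, variant identifiers, weights and genotype encoding are illustrative assumptions and are not the required schema of the PGS Catalog or any other repository.

```python
import csv
import io

# Hypothetical column names for a variant-level scoring table.
FIELDS = ["variant_id", "effect_allele", "other_allele", "effect_weight"]


def write_scoring_file(rows):
    """Serialize variant-level weights as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_scoring_file(text):
    """Parse the tab-separated text back into rows with numeric weights."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{**row, "effect_weight": float(row["effect_weight"])} for row in reader]


def score_genotype(model, genotype):
    """Sum weight x effect-allele count; genotype maps variant_id to an allele pair such as 'AG'."""
    total = 0.0
    for variant in model:
        alleles = genotype[variant["variant_id"]]
        total += variant["effect_weight"] * alleles.count(variant["effect_allele"])
    return total


# Hypothetical two-variant model and one individual's genotypes.
model = [
    {"variant_id": "var1", "effect_allele": "A", "other_allele": "G", "effect_weight": 0.12},
    {"variant_id": "var2", "effect_allele": "T", "other_allele": "C", "effect_weight": -0.30},
]
text = write_scoring_file(model)
restored = read_scoring_file(text)
score = score_genotype(restored, {"var1": "AG", "var2": "TT"})
print(text)
print(f"recomputed score = {score:.2f}")
```

Stating the effect allele explicitly matters: counting the other allele instead would flip the sign of that variant's contribution.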

Supplementary Note 3 provides explanations of reporting considerations in addition to the minimal reporting framework in Table 1. Authors intending downstream clinical implementation should aim for the level of transparent and comprehensive reporting covered in both Table 1 and the Supplementary Notes, especially concerning points related to discussing the interpretation, limitations and generalizability of results. The proper reporting of the development and performance of PRSs can also have implications for seeking regulatory approval of the PRS as a clinical test. Although not a comprehensive list of regulatory requirements, we highlight aspects of the PRS-RS that would be considered evidence of analytical and clinical validity from the perspective of the College of American Pathologists (CAP) and the Clinical Laboratory Improvement Amendments (CLIA) (Supplementary Table 1). CAP and CLIA approvals are additional incentives for the reporting adherence of researchers who wish to translate their work to the clinic, as well as a caution for researchers who want to avoid their findings being put to unintended use. Finally, we reiterate the need for both methodological and data transparency, and we encourage the deposition of PRSs (variant-level information necessary to recalculate the genetic portion of the score) in the PGS Catalog (www.pgscatalog.org³⁴), which provides an invaluable resource for the widespread adoption and distribution of a published PRS. The PGS Catalog provides access to PGSs and related metadata to support the FAIR principles of data stewardship¹⁹, enabling subsequent applications and assessments of PGS performance and best practices (see Supplementary Table 2 for a description of the metadata captured in the PGS Catalog and its overlap with the PRS-RS).

Improving PRS research and translation

We surveyed 30 publications (selecting for a range of disease domains, risk score categories and populations) to understand how the information in the PRS-RS is presented and displayed as part of the larger iterative process to clarify and improve minimal reporting item descriptions. For 10 of these publications, we provide detailed annotations using the final minimal reporting requirements (Supplementary Table 3) and use these annotations to illustrate the detail necessary for each PRS-RS item (further described in Supplementary Note 3). The heterogeneity in the PRS reporting we observed in this pilot highlights a series of challenges. Critical aspects of PRS studies—including ancestry, predictive ability, and transparency or availability of information needed to reproduce PRSs—were frequently absent or reported in insufficient detail. This underscores the need for the PRS-RS to clearly and specifically define meaningful aspects of PRS development, testing and intended clinical use. However, these deficits in reporting are not unique to PRSs; previous reports of underreporting have found that 77% of GWAS publications in 2017 did not share summary statistics³⁵ and 4% of GWASs do not report any relevant ancestry information¹. In line with the push towards a culture of reproducibility and open data in genomics, we as the ClinGen Complex Disease Working Group and PGS Catalog joined to create this set of reporting standards (Table 1), which is specifically tailored to PRS research and adapts the previous standards on the basis of the opinions of multidisciplinary and international experts.

Researchers using the PRS-RS may identify fringe cases that are inadequately captured by these reporting items, as we have modelled our guidelines on prototypical steps for PRS development (Fig. 1). Although we anticipate that the field may change further as novel methods and technologies are generated, the PRS-RS items can be expanded and adapted to encompass new considerations. By updating previous standards, drawing on the knowledge of leaders in the field and tailoring the framework to common barriers observed in recent literature, we aim to provide a comprehensive and pragmatic perspective on the topic. In line with previous standards, the PRS-RS includes elements related to understanding the clinical validity of PRSs and consequent risk models. Items such as ‘predicted outcome’ and ‘intended use’ bookend our guidelines with the intended clinical framing of PRS reporting. In addition, we have modelled the guidelines by steps in experimental design—from hypothesis to interpretation—to more clearly emphasize the importance of considering the risk model’s intended purpose in defining what needs to be reported and to inform documentation throughout the process. As a reference, we have included a guide to where PRS-RS items should be reported in a manuscript in Supplementary Table 4. These expansions will further facilitate the curation and expert annotation of published PRSs as we move towards widespread clinical use.

Although the scope of our work encompasses clinical validity, it does not address the additional requirements that are needed to establish the clinical or public health utility of a PRS, such as randomized trials with clinically meaningful outcomes, health economic evaluations or feasibility studies³⁶. In addition, the translation of structured data elements into useful clinical parameters may not be direct. One example is that the case definitions used in training or validation in any particular PRS study may deviate (sometimes substantially) from those used in any specific health system. CHD symptoms commonly include angina (chest pain), whereas PRSs are frequently trained on stricter definitions excluding angina. Another example is that the definitions used for race or ancestry as outlined in the PGS Catalog and the GWAS Catalog¹ may differ from structured terms used to document ancestry information in the clinic. Consistent mappings and potentially parallel analyses may be necessary to translate from genetically determined ancestries to those that are routinely used in clinical care. Such translation issues potentially limit generalizability to target populations and warrant further discussion, and we reiterate the need for authors to be mindful of their intended purpose and target audience when discussing their findings. Authors’ understanding of potential translational barriers can be aided by considering the current CAP and CLIA analytical and clinical validity evidence requirements of peer-reviewed literature to ensure that the PRS-RS has value in informing later steps of the clinical translation spectrum, including clinical utility (Supplementary Table 1). Finally, although the principles of this work are clear, its scope does not include the complex commercial restrictions (such as intellectual property) that may be placed on published studies with regard to the reporting or distribution of PGSs or the data that underlie them. We hope that our work will inform downstream regulation and transparency standards for PRS as a commercial clinical tool.

The coordinated efforts of the ClinGen Complex Disease Working Group and PGS Catalog provide a set of compatible resources for researchers to deposit PGS- and PRS-related information. The PGS Catalog (www.PGSCatalog.org) provides an informatics platform, with data integration and harmonization to other PGSs as well as the source GWAS study through its sister platform, the GWAS Catalog^1,34. In addition, it provides a structured database of scores (variants and effect weights) that can be reused, along with metadata requested in the PRS-RS. With these tools, the PRS-RS can be mandated by leading peer-reviewed journals and, consequently, the quality and rigour of PRS research will be increased to a level that facilitates clinical implementation. We encourage readers to visit the ClinGen website (https://clinicalgenome.org/working-groups/complex-disease/) for any future changes or amendments to our reporting standards.
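To make concrete what reusing a deposited score (variants and effect weights) involves, the following is a minimal sketch of applying such a score to one individual as a weighted sum of effect-allele dosages. The variant identifiers, weights and function name are hypothetical and chosen only for illustration; real scoring files and genotype data require additional steps (allele harmonization, strand checks, handling of missing genotypes) that are not shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical score entry: (variant ID, effect allele, effect weight).
ScoreEntry = Tuple[str, str, float]


def polygenic_score(
    entries: List[ScoreEntry],
    dosages: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Return the weighted sum of effect-allele dosages and the IDs of
    score variants that were not found in the genotype data."""
    total = 0.0
    missing: List[str] = []
    for variant_id, effect_allele, weight in entries:
        key = (variant_id, effect_allele)
        if key not in dosages:
            # Report unmatched variants instead of silently skipping them.
            missing.append(variant_id)
            continue
        total += weight * dosages[key]
    return total, missing


# Illustrative (made-up) score with three variants.
entries = [("rs0001", "A", 0.5), ("rs0002", "G", -0.25), ("rs0003", "T", 0.125)]
# Illustrative dosages (0 to 2 copies of the effect allele) for one individual.
dosages = {("rs0001", "A"): 2.0, ("rs0002", "G"): 1.0}

score, missing = polygenic_score(entries, dosages)
print(score, missing)  # 0.75 ['rs0003']
```

Returning the list of unmatched variants alongside the score is a deliberate choice in this sketch: the proportion of score variants actually available in a target dataset is the kind of detail that affects whether a reported score can be reproduced.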

Although we have provided explicit recommendations on how to acknowledge study design limitations and their effects on the interpretation and generalizability of a PRS, future research should attempt to establish best practices to guide the field. Moving forward, supplementary frameworks should be developed for the reporting of new methods, such as deep learning, as well as requirements for clinical utility and readiness. Together, the PRS-RS enables the rapid development of PRSs as potentially powerful tools for the translation of genomic discoveries into clinical and public health benefits, and provides a framework for PRSs to transform multiple areas of research in human genetics.

References

1. Morales, J. et al. A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol. 19, 21 (2018).
2. MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
3. Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).
4. Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five years of GWAS discovery. Am. J. Hum. Genet. 90, 7–24 (2012).
5. Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
6. Pasaniuc, B. & Price, A. L. Dissecting the genetics of complex traits using summary association statistics. Nat. Rev. Genet. 18, 117–127 (2017).
7. Lambert, S. A., Abraham, G. & Inouye, M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 28 (R2), R133–R142 (2019).
8. Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018). References 7 and 8 are useful reviews outlining the potential benefits and clinical uses of polygenic risk scores.
9. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
10. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).
11. Wray, N. R., Yang, J., Goddard, M. E. & Visscher, P. M. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).
12. Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
13. Pharoah, P. D. P. et al. Polygenic susceptibility to breast cancer and implications for prevention. Nat. Genet. 31, 33–36 (2002).
14. Zhang, Y., Qi, G., Park, J.-H. & Chatterjee, N. Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits. Nat. Genet. 50, 1318–1326 (2018).
15. Chatterjee, N. et al. Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies. Nat. Genet. 45, 400–405 (2013).
16. Mak, T. S. H., Porsch, R. M., Choi, S. W., Zhou, X. & Sham, P. C. Polygenic scores via penalized regression on summary statistics. Genet. Epidemiol. 41, 469–480 (2017).
17. Choi, S. W. & O’Reilly, P. F. PRSice-2: polygenic risk score software for biobank-scale data. GigaScience 8, giz082 (2019).
18. Vilhjálmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592 (2015).
19. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).
20. Ganguly, P. NIH funds centers to improve the role of genomics in assessing and managing disease risk. National Human Genome Research Institute https://www.genome.gov/news/news-release/NIH-funds-centers-to-improve-role-of-genomics-in-assessing-and-managing-disease-risk (2020).
21. Martens, F. K., Tonk, E. C. M. & Janssens, A. C. J. W. Evaluation of polygenic risk models using multiple performance measures: a critical assessment of discordant results. Genet. Med. 21, 391–397 (2019).
22. Janssens, A. C. J. W. Validity of polygenic risk scores: are we measuring what we think we are? Hum. Mol. Genet. 28, R143–R150 (2019).
23. Warrier, V. et al. Polygenic scores for intelligence, educational attainment and schizophrenia are differentially associated with core autism features, IQ, and adaptive behaviour in autistic individuals. Preprint at https://doi.org/10.1101/2020.07.21.20159228 (2020).
24. Matoba, N., Love, M. I. & Stein, J. L. Evaluating brain structure traits as endophenotypes using polygenicity and discoverability. Hum. Brain Mapp. https://doi.org/10.1002/hbm.25257 (2020).
25. Coleman, J. R. I. et al. Genome-wide gene-environment analyses of major depressive disorder and reported lifetime traumatic experiences in UK Biobank. Mol. Psychiatry 25, 1430–1446 (2020).
26. Kraft, P. & Aschard, H. Finding the missing gene–environment interactions. Eur. J. Epidemiol. 30, 353–355 (2015).
27. Janssens, A. C. J. W. et al. Strengthening the reporting of genetic risk prediction studies (GRIPS): explanation and elaboration. Eur. J. Hum. Genet. 19, 615 (2011). GRIPS was the first to propose standards for the reporting of risk prediction studies using genetics, and forms the basis of our updated PRS-RS.
28. von Elm, E. et al. Strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Br. Med. J. 335, 806–808 (2007).
29. Little, J. et al. STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement. PLoS Med. 6, e1000022 (2009).
30. Moons, K. G. M. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann. Intern. Med. 162, W1–W73 (2015).
31. Choi, S. W., Mak, T. S.-H. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protocols 15, 2759–2772 (2020).
32. Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
33. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
34. Lambert, S. A. et al. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nat. Genet. https://doi.org/10.1038/s41588-021-00783-5 (2021). This paper describes the development and contents of the PGS Catalog (www.PGSCatalog.org), an open database of polygenic scores.
35. Thelwall, M. et al. Is useful research data usually shared? An investigation of genome-wide association study summary statistics. PLoS One 15, e0229578 (2020).
36. Burke, W. & Zimmern, R. Moving Beyond ACCE: An Expanded Framework for Genetic Test Evaluation (PHG Foundation, 2007).
37. Huo, D. et al. Comparison of breast cancer molecular features and survival by African and European ancestry in The Cancer Genome Atlas. JAMA Oncol. 3, 1654–1662 (2017).
38. Choi, J. et al. The associations between immunity-related genes and breast cancer prognosis in Korean women. PLoS One 9, e103593 (2014).
39. Marston, N. A. et al. Predicting benefit from evolocumab therapy in patients with atherosclerotic disease using a genetic risk score: results from the FOURIER trial. Circulation 141, 616–623 (2020).
40. Nikpay, M. et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
41. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
42. Mars, N. et al. Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat. Med. 26, 549–557 (2020).
43. Elliott, J. et al. Predictive accuracy of a polygenic risk score-enhanced prediction model vs a clinical risk score for coronary artery disease. J. Am. Med. Assoc. 323, 636–645 (2020).
44. Inouye, M. et al. Genomic risk prediction of coronary artery disease in 480,000 adults: implications for primary prevention. J. Am. Coll. Cardiol. 72, 1883–1893 (2018).
45. Abraham, G. et al. Genomic prediction of coronary heart disease. Eur. Heart J. 37, 3267–3278 (2016).
46. Ganna, A. et al. Multilocus genetic risk scores for coronary heart disease prediction. Arterioscler. Thromb. Vasc. Biol. 33, 2267–2272 (2013).
47. Kullo, I. J. et al. Incorporating a genetic risk score into coronary heart disease risk estimates: effect on low-density lipoprotein cholesterol levels (the MI-GENES clinical trial). Circulation 133, 1181–1188 (2016).
48. Mavaddat, N. et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am. J. Hum. Genet. 104, 21–34 (2019).
49. Mars, N. et al. The role of polygenic risk and susceptibility genes in breast cancer over the course of life. Nat. Commun. 11, 6383 (2020).
50. Zhang, H. et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 52, 572–581 (2020).
51. Zhang, X. et al. Addition of a polygenic risk score, mammographic density, and endogenous hormones to existing breast cancer risk prediction models: a nested case-control study. PLoS Med. 15, e1002644 (2018).
52. Lakeman, I. M. M. et al. Addition of a 161-SNP polygenic risk score to family history-based risk prediction: impact on clinical management in non-BRCA1/2 breast cancer families. J. Med. Genet. 56, 581–589 (2019).
53. Wilcox, A. N. et al. Prospective evaluation of a breast cancer risk model integrating classical risk factors and polygenic risk in 15 cohorts from six countries. Preprint at https://doi.org/10.1101/19011171 (2019).
54. Pal Choudhury, P. et al. Comparative validation of the BOADICEA and Tyrer-Cuzick breast cancer risk models incorporating classical risk factors and polygenic risk in a population-based prospective cohort. Preprint at https://doi.org/10.1101/2020.04.27.20081265 (2020).
55. Maas, P. et al. Breast cancer risk from modifiable and nonmodifiable risk factors among white women in the United States. JAMA Oncol. 2, 1295–1302 (2016).
56. Lee, A. et al. BOADICEA: a comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet. Med. 21, 1708–1718 (2019).
57. Garcia-Closas, M., Gunsoy, N. B. & Chatterjee, N. Combined associations of genetic and environmental risk factors: implications for prevention of breast cancer. J. Natl. Cancer Inst. 106, dju305 (2014).
58. Kapoor, P. M. et al. Combined associations of a polygenic risk score and classical risk factors with breast cancer risk. J. Natl. Cancer Inst. 113, djaa056 (2020).
59. Easton, D. F. et al. Gene-panel sequencing and the prediction of breast-cancer risk. N. Engl. J. Med. 372, 2243–2257 (2015).
60. Muranen, T. A. et al. Genetic modifiers of CHEK2*1100delC-associated breast cancer risk. Genet. Med. 19, 599–603 (2017).
61. Kuchenbaecker, K. B. et al. Evaluation of polygenic risk scores for breast and ovarian cancer risk prediction in BRCA1 and BRCA2 mutation carriers. J. Natl. Cancer Inst. 109, djw302 (2017).
62. Fahed, A. C. et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat. Commun. 11, 3635 (2020).
63. Barnes, D. R. et al. Polygenic risk scores and breast and epithelial ovarian cancer risks for carriers of BRCA1 and BRCA2 pathogenic variants. Genet. Med. 22, 1653–1666 (2020).
64. van den Broek, J. J. et al. Personalizing breast cancer screening based on polygenic risk and family history. J. Natl. Cancer Inst. 00, jdaa127 (2020).
65. Esserman, L. J., the WISDOM Study and Athena Investigators. The WISDOM Study: breaking the deadlock in the breast cancer screening debate. NPJ Breast Cancer 3, 34 (2017).

Acknowledgements

We acknowledge the input of the ClinGen Complex Disease Working Group members, including C. D. Bustamante, M. Meyer, F. Harrell, D. Kent, P. Visscher, T. Assimes, S. Plon and J. Berg. We also thank D. Durham for her editorial support and A. Paolucci for her administrative support in the preparation and submission of this manuscript. ClinGen is primarily funded by the National Human Genome Research Institute (NHGRI), through the following three grants: U41HG006834, U41HG009649 and U41HG009650. ClinGen also receives support for content curation from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), through the following three grants: U24HD093483, U24HD093486 and U24HD093487. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health (NIH). In addition, the views expressed in this article are those of the author(s) and not necessarily those of the National Health Service (NHS), the National Institute for Health Research (NIHR) or the Department of Health. Research reported in this publication was supported by the NHGRI under award number U41HG007823 (EBI-NHGRI GWAS Catalog, PGS Catalog). We also acknowledge funding from the European Molecular Biology Laboratory (EMBL). Individuals were funded from the following sources: M.I.M. was a Wellcome Investigator and an NIHR Senior Investigator with funding from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK; U01-DK105535) and Wellcome (090532, 098381, 106130, 203141 and 212259); M.I. is supported by the Munz Chair of Cardiovascular Prediction and Prevention at the Baker Heart and Diabetes Institute; M.I., S.A.L. and J.N.D. were supported by core funding from: the UK Medical Research Council (MR/L003120/1), the British Heart Foundation (RG/13/13/30194 and RG/18/13/33946) and the NIHR (Cambridge Biomedical Research Centre at the Cambridge University Hospitals NHS Foundation Trust); S.A.L. is supported by a Canadian Institutes of Health Research postdoctoral fellowship (MFE-171279); and J.N.D. holds a British Heart Foundation Personal Chair and a NIH Research Senior Investigator Award. This work was also supported by Health Data Research UK, which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and Wellcome.

Author information

These authors contributed equally: Hannah Wand, Samuel A. Lambert
These authors jointly supervised this work: Michael Inouye, Genevieve L. Wojcik

Authors and Affiliations

Stanford University School of Medicine, Stanford, CA, USA
Hannah Wand, Michael A. Iacocca & Jack W. O’Sullivan
Stanford Center for Inherited Cardiovascular Disease, Stanford, CA, USA
Hannah Wand & Jack W. O’Sullivan
Cambridge Baker Systems Genomic Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Samuel A. Lambert & Michael Inouye
Cambridge Baker Systems Genomic Initiative, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
Samuel A. Lambert & Michael Inouye
BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Samuel A. Lambert, John N. Danesh & Michael Inouye
Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
Samuel A. Lambert, John N. Danesh, Helen Parkinson & Michael Inouye
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
Samuel A. Lambert, Helen Parkinson & Jaqueline A. L. MacArthur
National Human Genome Research Institute, Bethesda, MD, USA
Cecelia Tamburro, Catherine Sillari, Robb Rowley & Erin M. Ramos
Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
Iftikhar J. Kullo
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
Jacqueline S. Dron, Deanna Brockman & Amit V. Khera
Western University, London, Ontario, Canada
Jacqueline S. Dron & Robert A. Hegele
Baylor College of Medicine, Houston, TX, USA
Eric Venner
Department of Human Genetics, Genentech, South San Francisco, CA, USA
Mark I. McCarthy
Wellcome Centre for Human Genetics, Oxford, UK
Mark I. McCarthy
Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Antonis C. Antoniou & Douglas F. Easton
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Nilanjan Chatterjee
Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
Nilanjan Chatterjee
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
Charles Kooperberg
Department of Epidemiology, University of California, Irvine, CA, USA
Karen Edwards
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Katherine Vlessis, Kim Kinnear & Kelly E. Ormond
National Institute for Health Research Cambridge Biomedical Research Centre, University of Cambridge and Cambridge University Hospitals, Cambridge, UK
John N. Danesh & Michael Inouye
Division of Pharmaceutical Outcomes and Policy, UNC Eshelman School of Pharmacy, Chapel Hill, NC, USA
Megan C. Roberts
Stanford Center for Biomedical Ethics, Stanford University School of Medicine, Stanford, CA, USA
Kelly E. Ormond
Centers for Disease Control and Prevention, Atlanta, GA, USA
Muin J. Khoury
Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
A. Cecile J. W. Janssens
Department of Translational and Applied Genomics, Kaiser Permanente Northwest, Portland, OR, USA
Katrina A. B. Goddard
Center for Health Research, Kaiser Permanente Northwest, Portland, OR, USA
Katrina A. B. Goddard
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Peter Kraft
The Alan Turing Institute, London, UK
Michael Inouye
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Genevieve L. Wojcik

Contributions

H.W., C.T., M.A.I., J.S.D., I.J.K., C.S., R.R., K.V., K.K., K.A.B.G., P.K. and G.L.W. conducted the literature review of reporting practices. H.W., S.A.L., C.T., M.A.I., J.W.O., I.J.K., R.R., D.B., E.V., M.I.M., A.C.A., D.F.E., R.A.H., A.V.K., N.C., C.K., K.E., E.M.R., M.C.R., K.E.O., M.J.K., A.C.J.W.J., K.A.B.G., P.K., J.A.L.M., M.I. and G.L.W. provided feedback on the reporting framework. H.W., S.A.L., C.T., M.A.I., J.W.O., J.S.D., I.J.K., A.V.K., J.N.D., H.P., E.M.R., M.C.R., A.C.J.W.J., K.A.B.G., P.K., J.A.L.M., M.I. and G.L.W. wrote and edited the manuscript text and items.

Corresponding author

Correspondence to Genevieve L. Wojcik.

Ethics declarations

Competing interests

M.I.M. is on the advisory panels for Pfizer, Novo Nordisk and Zoe Global; honoraria: Merck, Pfizer, Novo Nordisk and Eli Lilly; research funding: Abbvie, Astra Zeneca, Boehringer Ingelheim, Eli Lilly, Janssen, Merck, Novo Nordisk, Pfizer, Roche, Sanofi Aventis, Servier and Takeda. As of June 2019, he is an employee of Genentech with stock and stock options in Roche. No other authors have competing interests to declare.

Additional information

Peer review information Nature thanks Guillaume Lettre, Paul O’Reilly and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

This file contains Supplementary Notes, including additional Supplementary Methods detailing the development and revisions of the Polygenic Risk Score Reporting Statement (PRS-RS), as well as additional considerations for each of the reporting items.

Supplementary Table 1

Considerations for clinical translation. This table outlines considerations to inform downstream stakeholder perspectives regarding the translational spectrum of lab test development.

Supplementary Table 2

Mappings between the PGS Catalog and PRS-RS fields. As the PGS Catalog and PRS-RS do not completely align in their fields, we have provided a mapping between the two frameworks to assist in reporting of published PRS.

Supplementary Table 3

Example curations of 10 manuscripts using PRS-RS. We have detailed information from ten PRS manuscripts using the fields outlined in PRS-RS as an example for the research community.

Supplementary Table 4

Introduction, Methods, Results and Discussion (IMRAD) Mappings for PRS-RS. Additionally, we have provided mappings between PRS-RS reporting items and a traditional IMRAD structure to aid authors in the reporting of items in manuscript preparation.

About this article

Cite this article

Wand, H., Lambert, S.A., Tamburro, C. et al. Improving reporting standards for polygenic scores in risk prediction studies. Nature 591, 211–219 (2021). https://doi.org/10.1038/s41586-021-03243-6

Received: 20 April 2020
Accepted: 15 January 2021
Published: 10 March 2021
Issue Date: 11 March 2021
DOI: https://doi.org/10.1038/s41586-021-03243-6
Springer Nature Limited
