Introduction

It is now well established that multiple sclerosis (MS) risk has a significant genetic component, and that this genetic contribution to risk is polygenic. More than 230 human leucocyte antigen (HLA), non-HLA, and X-chromosome common genetic variants are now documented as being associated with MS risk [1••]. However, these discoveries still explain at best 48% of the heritability of MS, with a portion of the remainder explained by rare variants, as yet undiscovered common variants of small effect, epigenetic variation, gene-gene interactions, and gene-environment interactions [2, 3]. Although incomplete, our understanding of the genetic architecture of MS risk is becoming clearer and will likely make significant advances in the near future. However, the potential genetic associations with MS phenotype and severity are much less well understood and have not been studied adequately.

There is significant circumstantial evidence for a component of genetic control over MS severity. This is supported by the observation that outcomes in all forms of MS are highly heterogeneous [4,5,6,7]. Significantly, recent work from the UCL Clinically Isolated Syndrome (CIS) cohort has demonstrated that 30 years post-CIS, 23% of CIS cases remain so, 24% have mild disability (Expanded Disability Status Scale (EDSS) [8] scores ≤ 3.5), and 20% have progressed to secondary progressive MS (SPMS), with approximately 12% deceased due to MS [7]. With only a few clinical and MRI baseline variables that predict subsequent outcomes [9,10,11,12], there is likely to be a complex interplay between genetic, environmental, personal, stochastic, and treatment factors in the risk of MS progression.

There are several reasons why MS severity and phenotype are more difficult to study than MS risk. Firstly, measuring disability progression and defining MS phenotype are more difficult than defining the disease state. We currently have no widely applicable validated biomarkers of MS progression from MRI [13], and although neurofilament light chains in CSF and/or serum [14] are attracting significant attention as biomarkers, neither has provided a robust marker of disease progression, and both are heavily modified by treatment. Therefore, we rely on clinical markers of disability progression such as change in EDSS, MS Functional Composite (MSFC), or the MS Severity Scale (MSSS). Whilst changes in EDSS and MSFC require a longitudinal study with measures at two or more time points, the MSSS [15] can be used cross-sectionally as it generates a ranking for each case compared to the mean disability reached for a given disease duration based on the EDSS. However, the EDSS, on which the MSSS is based, is itself an unreliable measure of disability progression as it is nonlinear, has significant inter- and intra-rater variability, and is highly focused on ambulation [16]. Regarding phenotype, the classical clinical definition of MS separating it into relapsing and progressive onset forms is useful clinically, although whether this distinction defines two separate disease processes is uncertain. There is neither a biomarker nor a genetic marker from GWAS that defines either broad phenotype.
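The cross-sectional ranking idea behind the MSSS can be illustrated with a minimal sketch. This is a simplified illustration and not the published MSSS algorithm [15]: it assumes that cases are grouped by exact disease duration in years, ranks EDSS within each group (averaging ranks for ties), and rescales the rank to a score between 0 and 10. The function name, the input layout, and the example values are all hypothetical.

```python
from collections import defaultdict


def severity_scores(cases):
    """Rank-based severity score per case (simplified, MSSS-like).

    cases: iterable of (case_id, disease_duration_years, edss).
    Returns a dict mapping case_id to a score in the open interval (0, 10).
    """
    by_duration = defaultdict(list)
    for case_id, duration, edss in cases:
        by_duration[duration].append((case_id, edss))

    scores = {}
    for group in by_duration.values():
        n = len(group)
        for case_id, edss in group:
            below = sum(1 for _, other in group if other < edss)
            tied = sum(1 for _, other in group if other == edss)
            # 1-based rank, averaged over tied EDSS values
            avg_rank = below + (tied + 1) / 2
            # Dividing by n + 1 keeps the score strictly between 0 and 10
            scores[case_id] = 10 * avg_rank / (n + 1)
    return scores


# Hypothetical cases: (id, disease duration in years, EDSS)
example = [
    ("p1", 5, 1.0), ("p2", 5, 2.0), ("p3", 5, 6.0),
    ("p4", 20, 6.0), ("p5", 20, 6.5), ("p6", 20, 8.0),
]
scores = severity_scores(example)
print(scores["p3"], scores["p4"])  # prints: 7.5 2.5
```

In this toy example, an EDSS of 6.0 reached after 5 years ranks as more severe than the same EDSS reached after 20 years, which is the property that allows a single cross-sectional measurement to be compared across disease durations.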

Secondly, many variables almost certainly affect the rate of disability progression. These particularly include treatment factors, with mounting evidence that use of disease-modifying therapies (DMTs) has significantly altered the natural history of MS [17]. Similarly, smoking [18] and other personal factors such as vitamin D levels [19], sunlight exposure [20], and comorbidity burden [21] may all influence disease outcomes and therefore need to be factored into models of progression to isolate the potential role of genetic factors. Consequently, accurately modelling progression requires a longitudinal cohort study with multiple measures of outcome including biomarkers (particularly MRI) and complete accrual of potential confounding factors for progression, including personal, environmental, and treatment factors. The study must also be long enough to measure confirmed disability accrual and to create a clear distinction between those who have progressed and those who have not, that is, to delineate clearly separate disability trajectories. This will usually require a comprehensive cohort study of at least 5 years. These studies are by their nature difficult to do: the participants must commit for a long period of time, the close monitoring needed makes them expensive, and they are difficult to analyse because multiple variables may directly confound the outcome associations. Consequently, these types of studies are uncommon. Additionally, because of these constraints, these studies are generally small, usually fewer than 500 cases in size, and thus generally unsuitable for genome-wide association study (GWAS) methods, in which hundreds of thousands of single nucleotide polymorphisms (SNPs) are studied, with the multiple testing burden that this imposes statistically.
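The scale of this multiple testing burden can be made concrete with a Bonferroni correction, used here as an illustrative assumption rather than as the method of any cited study. The function name and the SNP counts below are hypothetical round numbers.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical numbers of SNPs tested, from a candidate panel to a full GWAS
for n_snps in (100, 500_000, 1_000_000):
    threshold = bonferroni_threshold(0.05, n_snps)
    print(f"{n_snps:>9,} tests -> p < {threshold:.1e}")
```

With one million tests, this gives 5 × 10⁻⁸, the threshold conventionally used for genome-wide significance. A cohort of a few hundred cases has little power to reach such a threshold unless the effect size is large.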

Examining the evidence for genetic regulation of disease course prediction

Disease risk and progression

One potential clue for a significant genetic basis for MS severity and phenotype could come from studies of homogeneity of phenotype and severity markers of MS cases within families. However, there is weak evidence of familial association with severity [22•]. The largest study of familial association identified 2310 individuals from 1083 families and concluded that there was modest concordance for clinical course amongst siblings with MS, but not parent-sib pairs, with no concordance found for MSSS [23].

The individual risk allele with the strongest effect on MS risk is HLA-DRB1*15:01, which is associated with a 2–3-fold increase in MS risk and is therefore an obvious candidate for an association with phenotype and severity. However, large studies have revealed no association with severity, no association with phenotype, and only a modest association with age at onset (AAO), with AAO lower by 1 year if heterozygous and by 3 years if homozygous for HLA-DRB1*15:01 [24, 25]. In MS cohort studies, HLA-DRB1*15:01 status has not been associated with relapse risk or disability progression [26•], except as a potential modifier of the effects of vitamin D level on relapse rate in children with MS [27]. To date, very large cross-sectional GWAS have failed to identify associations between genetic variation and either cross-sectional markers of MS severity (usually MSSS) or MS phenotypes [28,29,30]. There is some evidence, however, of a potential functional dichotomy between variants and pathways associated with risk as compared to progression [31].

Prior to 2015, there was little evidence for any true associations between MS severity and genetic variants, with many studies including candidate gene approaches, vitamin D pathway gene analysis, and reanalysis of GWAS data sets not clearly establishing a causal or even statistically significant link between the assessed genetic risk variants and candidate genes and MS severity or progression. From the GWAS, including those enriched for progressive MS phenotypes, there has been no convincing evidence for genetic variation in determining MS phenotype, with no study finding a different genetic architecture as defined by SNP typing and the risk of either a progressive or relapsing MS phenotype [22•]. This robust finding along with lack of pathological differences between these two phenotypes suggests that MS is a single entity with multiple disease trajectories. However, the possibility that separate genetic variants, not associated with MS risk per se, could influence progression rate is not excluded.

After 2015, there has been some progress in our discovery of putative genetic associations with progression of MS. This progress has resulted from the assessment of well-characterised longitudinal studies, although lack of replication, small numbers, and variability in outcome measures have hampered cross-validation. The published data since 2015 is summarised in Table 1.

Table 1 Summary of studies 2015 onwards correlating genetic variation with MS severity and phenotype

In 2017, we described the association between relapse risk and the gene low-density lipoprotein receptor-related protein 2 (LRP2) in the largest GWAS for relapse reported to date [32••]. This study was undertaken in three longitudinal cohorts: the Tasmanian MS longitudinal study (n = 141) and the Ausimmune longitudinal study (n = 127), with the top hits then explored in a paediatric longitudinal cohort (n = 181; combined n = 449). Here, the LRP2 SNP rs12988804 reached genome-wide significance in predicting relapse risk, with a hazard ratio (HR) of 2.18, p = 3.3 × 10⁻⁸. LRP2 is expressed by neurons and oligodendrocytes, and has been shown in animal studies to regulate axonal guidance and brain development. This SNP is intronic and non-functional; therefore, it likely tags a rare functional variant or may regulate expression. Most importantly, this variant has now been validated in an independent Belgian longitudinal cohort [33], indicating that LRP2 variation is the first genetic locus that has been proven to be associated with a marker of MS severity (relapse rate). Further, in 2018, another association between an MS risk allele, AHI1 (rs11154801), and increased relapse rate was reported [34], lending further support to the notion that some MS risk variants may additionally contribute to disease course. The above three studies highlight the need for detailed, longitudinal assessments to reveal associations between genetic variants and clinical course.

MS risk SNPs and genetic risk score approaches

Utilising a similarly well-phenotyped and longitudinally followed cohort from UCSF, Isobe and colleagues [35••] studied the role of the HLA region in MS progression utilising a genetic burden analysis method. They studied 652 patients with phenotypic information (586 with genetic data) and 455 controls. Outcome measures included MSSS and brain and spinal cord imaging. They constructed an HLA genetic burden (HLAGB) score, but found no correlation between HLAGB and MSSS. They also found no association between HLAGB and earlier age at onset, or time to conversion from CIS to clinically definite MS, after multiple hypothesis correction. They did find an association between HLAGB and reduced cerebral white matter fraction in women with RRMS that was robust to multiple hypothesis testing corrections. Interestingly, in their study, Isobe and colleagues reported that HLA-A*02:01 (the most protective allele against MS risk) was not associated with brain MRI outcomes, although HLA-A*24:02-HLA-B*07:02-HLA-DRB1*15:01 haplotypes were associated with shrinkage of the sub-cortical grey matter fraction. Further, they reported that the HLA-A*X-HLA-B*X-HLA-DRB1*15:01 haplotypes were not correlated with brain MRI outcomes. Likewise, several other small cohort studies of MRI outcomes [36] and relapse risk or disability progression [26•] did not find associations with HLA variants. However, in 2016, in another small cohort study, Balnyte and colleagues reported a weak association between HLA-DRB1*08 and longer time to conversion to SPMS [37]. Taken together, these findings suggest that the HLA region shows little evidence for an association with MS clinical course or phenotype, although pooling of studies, standardising of outcome measures, and application of genetic risk score (GRS) methods that examine joint genetic risk effects may clarify this question.

To minimise the multiple testing burden, many investigators have studied the known MS risk alleles and their associations with MS progression and phenotype, based on the reasonable hypothesis that genes that influence MS risk may also influence other disease metrics. These studies have focussed on the non-HLA risk variants, whilst still adjusting for the presence of the HLA alleles, particularly HLA-DRB1*15:01. Three of these studies, including a pooled analysis of > 7000 cases [38] and 2 other smaller studies [39, 40], constructed weighted genetic risk scores (wGRS) or cumulative genetic risk scores (cGRS) and tested for association with MSSS. The authors of these studies found no, or at best weak, associations with phenotype or other metrics of MS progression, although Esposito and colleagues did find an association between their genetic risk score and AAO, which strengthened with the exclusion of HLA-DRB1*15:01 [39].
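The two score types can be sketched as follows. This is a minimal illustration under common conventions, assuming that a cGRS is an unweighted count of risk alleles and that a wGRS weights each risk allele count by the natural logarithm of that allele's odds ratio for MS risk; the exact constructions used in the cited studies are not described here and may differ. The SNP identifiers, allele counts, and odds ratios are hypothetical.

```python
import math


def cumulative_grs(risk_allele_counts):
    """Unweighted score: total number of risk alleles carried (0, 1, or 2 per SNP)."""
    return sum(risk_allele_counts.values())


def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted score: each risk allele count multiplied by ln(odds ratio)."""
    return sum(
        count * math.log(odds_ratios[snp])
        for snp, count in risk_allele_counts.items()
    )


# Hypothetical genotype for one case and hypothetical per-allele odds ratios
counts = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
ors = {"snp_a": 1.2, "snp_b": 1.1, "snp_c": 1.3}

print(cumulative_grs(counts))
print(round(weighted_grs(counts, ors), 3))
```

Because a single score is tested against the outcome instead of each SNP individually, only one hypothesis test is needed per outcome, which is how these methods reduce the multiple testing burden.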

Using a cGRS in a longitudinal study, Pan and colleagues [26•] found that the seven top ranked and nominally associated known MS risk genes did significantly predict relapse and disease progression in a prospectively assessed cohort. Interestingly, the two gene sets were entirely different, suggesting that disability progression and relapse may have different genetic drivers, as previously suggested by Baranzini [31]. This effect was highly significant for both relapse and progression, with those carrying ≥ six disability risk SNPs having an annual change in EDSS 0.48 points greater than those carrying ≤ two disability risk SNPs. The r² of 0.32 suggests that this cGRS explained 32% of the observed variability in disability progression. Similarly strong findings were reported for relapse risk. It is important to note that this was a relatively small study of 127 cases followed for 5 years and that none of the SNPs reached significance on their own. Others have used similar wGRS methods [41] and found little convincing evidence for associations with relapse.

Overall, genetic risk score and genetic burden studies have not provided significant evidence of a role for novel genetic loci in MS phenotype and progression, again likely reflecting the difficulty of measuring progression reliably outside of longitudinal cohort studies.

Exome sequencing approaches

Recent technological advances allowing for more affordable exome sequencing have also seen the first studies published attempting to identify variants that are likely to be biologically significant (that is, genetic variants that are more likely to influence protein structure) in the context of MS severity. To date, this method has not yielded evidence for significant associations with phenotype or severity. Wang and colleagues [42•] used an exome sequencing approach in two unrelated extended MS families with a total of seven affected members with primary progressive MS (PPMS). They reported that rs61731956, an exonic SNP that produces an Arg to Gln mutation in the nuclear receptor NR1H3, is associated with familial MS, with the mutation associating with a rapidly progressive phenotype and with loss of function associated with gene dysregulation. They then looked for common variants in 2053 MS cases and 799 healthy controls (1687 with detailed phenotype) and found that tagging SNPs were not associated with RRMS but were significantly associated with PPMS. This work has, however, been controversial, as these variants could not be validated in other data sets [50]. In another recent publication, Sadovnick and colleagues [43] used a two-stage methodology, with an extremes-of-phenotype approach in the discovery cohort to maximise power. They recruited and exome sequenced severe (PPMS, EDSS ≥ 6 at 10 years, n = 50) and mild (RRMS, EDSS ≤ 3 at ≥ 14 years, n = 50) cohorts. Thirty-eight variants met their threshold levels for replication, with 33 of these sent for replication in 2016 patients. However, no variant was found to be statistically significantly associated with severity (MSSS), phenotype, or AAO. One other exome sequencing study in a Kuwaiti population [44] found an association between hemizygosity for the G allele of the PLXNA3 SNP rs5945430 and higher EDSS scores in males, but not females. As males in this study were selected for their higher EDSS scores, this finding requires further validation.

Candidate gene approaches

Candidate gene approaches have been used by a number of authors to assess the association between MS severity measures and biologically plausible candidate MS genes. Whilst findings of these studies are described below, it is worth noting that none of the identified associations described have been validated in independent cohorts to date.

Using a candidate gene approach in a Greek population of 389 cases and 336 controls, Dardiotis and colleagues targeted 147 SNPs located within nine genes associated with leucocyte trafficking. Here they found significant associations with severity as measured by MSSS, for ITGA4 (rs6721763; p = 3.00 × 10⁻⁶) and SPP1 (rs6532040; p = 0.009884), and with AAO, for FN1 (rs1250249; p = 0.0002), which was dose-dependent [45]. These findings need validation, but they arise from a genetically homogeneous population adjusted for treatment duration and with a minimum of 5 years of prospective follow-up.

Boiocchi et al. [46] utilised a targeted approach looking at an HSP70 polymorphism (rs2227956) in 191 MS cases and 365 healthy controls (HC). Severity was based on MSSS (< 3 vs ≥ 3). Here, they found a significant difference in MSSS between TT and CC homozygotes, with no effect of CT heterozygosity (the MSSS of CC homozygotes was 1.22 points higher (95% CI 0.22, 2.21) than that of TT homozygotes, p < 0.0001). Additionally, in a subset of 47 patients, they found that protein expression levels of HSP70-hom were directly correlated with MSSS scores, with lower expression associated with lower MSSS. Again, this study found associations with variants in a biologically plausible MS severity gene and added expression studies as additional support. In a separate study (n = 127), Zhou and colleagues [47] identified that common variants in the MBP gene (a biologically plausible MS severity gene) were associated with relapse risk and disability accumulation in a longitudinal cohort.

Building on previous work linking the PD-1 pathway to peripheral tolerance, to MS severity and phenotype, and to other autoimmune diseases [51], Pawlak-Adamska et al. [48] assessed PD-1 gene polymorphisms in relation to first symptom and severity in RRMS. Three SNPs, PD-1.3 (intron 4) and PD-1.5 and PD-1.9 in exon 5, were assessed in 479 Polish people (203 cases and 276 controls). The PD-1.5T allele was associated with pyramidal signs (and was protective against diplopia), and the PD-1.3G/PD-1.5T/PD-1.9T haplotype was associated with a greater time interval from first to second relapse.

Using post-mortem tissue, Melief and colleagues [49] studied the effect of glucocorticoid receptor (GR) haplotypes that confer increased sensitivity to glucocorticoids (GC) and their association with the rate of progression of MS. They assessed GR haplotypes together with the production of cortisol and soluble CD163 (sCD163, a receptor associated with inflammatory activity) by macrophages/microglia in 137 MS brain donors (77 SPMS, 34 PPMS, 26 unknown phenotypes). Brain tissue (cerebellum), and cortisol and sCD163 in CSF from the lateral ventricles, were collected post-mortem. Median age was 66 (IQR 55–78), with time from disease onset to death taken as an indicator of severity. Exclusions in this study were based on death not due to MS, and measurement of cortisol was excluded if death was due to sepsis. Here, the N363S (asparagine to serine mutation) and Bcl1 haplotypes, both with high sensitivity to GC, were associated with more rapidly progressive disease as defined by time from symptom onset to death or time from symptom onset to EDSS 6, suggesting that GR haplotypes that confer increased sensitivity to glucocorticoids are associated with faster disease progression. This is an interesting study as it combines brain tissue with CSF cortisol as a biomarker, although replication is clearly needed.

Gene expression studies

Several recent studies have assessed gene expression and severity or phenotype. Srinivasan et al. [52] undertook a transcriptome analysis in peripheral blood mononuclear cells (PBMCs) from 297 people with different stages and/or phenotypes of MS including CIS, RRMS, secondary-progressive MS (SPMS), and PPMS together with 96 healthy controls (HC). They used microarray, qPCR, and pathway analysis, with validation by literature mining. They found that genes near known MS risk SNPs were dysregulated compared to the healthy population. Interestingly, there was greater dysregulation in CIS, SPMS, and PPMS than expected but curiously not in RRMS, which does not support a role in disease severity. The use of PBMCs rather than selected cell populations may reduce the sensitivity of the analysis, as does the presence of small numbers of phenotypic subsets in each cohort. Similarly, Hellberg and colleagues [53] assessed stimulated CD4+ T cell responses in vitro from individuals with MS compared to HC. They identified a number of dysregulated genes. To narrow the study focus, they chose to examine dynamic response genes that were also linked to MS risk GWAS hits (n = 19). Four genes were selected for further study (CXCL10, CXCL1–3, CCL2, and osteopontin) and were linked in a gene network. The degree of dysregulation of this network predicted disease activity and therapeutic response to natalizumab therapy (n = 15; 6 responders, 9 non-responders). The authors postulated that this integration of GWAS and dynamic gene expression profiling could lead to personalised biomarkers. These findings are of interest, but further work on purified cell populations in larger longitudinal cohorts that have significant on- and off-treatment epochs is required to disentangle this very difficult area. Similarly, defining a significant change in gene expression (without assessing concomitant protein expression changes) makes this work very difficult to conduct. 
The ideal sample size for unbiased comparisons of gene expression features is no smaller than that required for GWAS, but mutual validation studies could markedly reduce the required sample size.

Epigenetics

The study of association between epigenetic change and severity and/or phenotype of MS is an emerging field. Whilst there are several different epigenetic modifications of the genome, methylation of CpG sites is the most widely studied due to the availability of high-throughput chip-based analyses. Kulakova et al. [54] provided preliminary evidence in a small cohort of never-treated RRMS (n = 14) and PPMS (n = 8) patients and controls (n = 8) that differential methylation of CpG sites in PBMCs occurs between RRMS and PPMS cases, although there was no adjustment for age, itself an important determinant of methylation state.

Epigenetic changes particularly in methylation can be influenced significantly by the environment and by ageing and potentially by treatments for MS. For example, smoking is associated with shortened time to SPMS [18, 55], and a faster rate of progression [56], and in addition, is also associated with significant change in the epigenetic profile as measured by methylation [57] potentially providing a link between an environmental/personal factor and genetic variation that affects severity.

In a longitudinal cohort, Zhou and colleagues [58] have shown that genetic variations within the gene that encodes miR-146a (an miRNA associated with MS risk) can influence relapse rate and time to second relapse. This variation may affect the structure of the mature miR-146a, thereby providing an interesting mechanism of epigenetic control of MS severity markers. Again, validation in a larger cohort is needed.

Conclusions

It is surprising that in 2017, apart from an association of HLA-DRB1*15:01 and AAO, there is only one currently validated genetic variant associated with an MS severity marker (relapse) [32••]. At present, we know nothing about this variant within LRP2 as it is intronic and does not appear to influence gene expression or splicing. Understanding how this variant can influence MS relapse rate is therefore of great interest. Potentially, it may tag a rarer functional SNP or variants within the gene or in nearby genes, and therefore, initial experiments should focus on sequencing the haplotype block involved. The validation of this finding in an independent cohort is critically important and represents the first occasion that this has occurred for an MS severity gene.

Most of the studies reported here are of modest size and report associations with a high risk of type I error. Similarly, the multiple hypothesis-testing burden can be overwhelming and requires careful attention to establish true associations rather than false positives. Given the very low rate of replication, we can in fact assume that most of the reported results in the literature may well be false positives. Larger sample sizes and longitudinal measurement of outcomes may overcome these problems to some extent, but the development of a validated biomarker of phenotype or progression in MS may also provide a significantly improved metric to assess genetic predictors. We would strongly advocate that, in all studies with severity as an outcome measure, a validation cohort be sought, as the risk of type I error is high. The use of cross-sectional studies has proven unsuccessful to this point in finding severity genes, and careful consideration of methodology and outcome measures is needed. Extremes of phenotype studies may help, as comparison can be made between those with very divergent and therefore distinct outcomes. Controlling for all potential confounders, particularly treatment, smoking, comorbidities, and age, is critical, as progression is clearly multifactorial. Gene-environment interactions, the effect of the environment on epigenetic changes, and epistatic interactions may all be important and need to be considered when assessing the role of genetic variations in MS severity.
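The inflation of type I error under multiple testing can be sketched numerically. The calculation assumes independent tests that are each performed at the same significance level, which is an idealisation because SNPs in linkage disequilibrium are correlated. The function name and the numbers of tests are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical numbers of uncorrected tests, each at alpha = 0.05
for n_tests in (1, 10, 20, 100):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:>3} tests -> P(at least one false positive) = {rate:.2f}")
```

Under these assumptions, the chance of at least one spurious association exceeds one half well before 20 uncorrected tests are run, which is one reason why replication in an independent cohort carries so much weight.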

There is a great untapped opportunity in validation studies across differing types of genomic assays. If target regions identified by GWAS recur in gene expression or methylation studies, this should be considered evidence for locus or pathway replication.

The lack of evidence for a genetic signal that can distinguish primary progressive MS from relapsing onset MS even in very large GWAS would suggest that the two clinical phenotypes are not genetically different in their onset risk. However, whether severity genetic variations have differential effects between the two phenotypes is not clear. This is an area that requires significantly more research; this research will need to be collaborative as no single study can answer these questions with certainty. Pooling of resources is the best mechanism to achieve validated outcomes that may provide clues as to therapeutic interventions to improve the lives of people with MS through precision medicine.