Introduction

Multiple sclerosis (MS) is a chronic autoimmune disorder of the central nervous system, in which inflammation, demyelination and neuroaxonal degeneration constitute the principal pathological hallmarks [1]. Neurodegenerative processes, in particular, are regarded as fundamental drivers of the gradual accumulation of irreversible disability (e.g. ambulation problems) [2, 3]. Due to extensive interindividual variation in the disease course and often clinically silent neuroaxonal damage throughout the initial phase [3], progressive deterioration is difficult to predict. Magnetic resonance imaging (MRI)-based indicators such as brain volume loss (BVL) have been established to reflect neurodegeneration in MS. However, longitudinal BVL assessment is subject to multiple confounders (e.g. “pseudoatrophy”) and is currently unreliable at the level of an individual patient [4, 5]. Given the ever-growing spectrum of therapeutic options, there is a need for additional biomarkers to monitor neurodegenerative changes and anticipate (further) disease progression.

An up-and-coming approach is “the eye as a window to the brain”. Since the retina is an unmyelinated structure, changes in the retinal architecture can serve as a model for MS-associated neuroaxonal degeneration. Indeed, in a post-mortem clinicopathological study, retinal neurodegenerative signs were discernible across most MS patients and correlated with several clinical parameters [6]. To appraise neuroaxonal loss, optical coherence tomography (OCT) captures high-resolution cross-sectional images of the retina and permits measurement of the individual retinal layers. In MS, the peripapillary retinal nerve fiber layer (pRNFL; unmyelinated axons of retinal ganglion cells) and the combined macular ganglion cell-inner plexiform layer (mGCIPL; cell bodies and dendrites of retinal ganglion cells) show a reduced thickness over time (Fig. 1) [7]. A commonly proposed pathophysiological mechanism is transsynaptic retrograde degeneration inflicted by (subclinical) damage along the entire visual pathway [8,9,10]. Over the past years, both pRNFL and mGCIPL atrophy have been acknowledged as valuable biomarkers of neurodegeneration in MS [7] and numerous studies have correlated retinal layer thinning with disability progression and MRI-derived BVL [11,12,13,14]. Nevertheless, various confounders (e.g. age and gender) and the unmet demand for well-defined OCT threshold values to predict disease progression have hampered widespread implementation of OCT in MS clinical practice [15].

Fig. 1

Representative macular OCT image (Spectralis, Heidelberg Engineering) of a healthy individual with annotated retinal layer segmentation. RNFL retinal nerve fiber layer, GCL ganglion cell layer, IPL inner plexiform layer, INL inner nuclear layer, OPL outer plexiform layer, ONL outer nuclear layer, ELM external limiting membrane, PR photoreceptor layer (inner and outer segment), RPE retinal pigment epithelium. Due to low contrast, GCL and IPL are often combined into GCIPL = ganglion cell-inner plexiform layer

In this systematic review, we summarize the evidence from recent longitudinal studies that investigated whether OCT-measured pRNFL and/or mGCIPL atrophy can predict the risk of future disability worsening in MS patients after adjustment for confounders.

Methods

Information sources and search strategy

We conducted a systematic literature search in four electronic databases (MEDLINE via PubMed, Embase, Web of Science Core Collection and Cochrane Library) from inception until 21 April 2022 (last date searched). Principal concepts, such as “probability”, “disease progression”, “multiple sclerosis” and “optical coherence tomography”, were used in combination to define our search strategy. We adopted the Peer Review of Electronic Search Strategies (PRESS) [16] and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [17] checklists to guide the conduct of this systematic review. A detailed search strategy is provided in Supplemental Information.

Selection process and eligibility criteria

Articles retrieved from the above-mentioned databases were selected by one reviewer (SS) in accordance with the following protocol. First, duplicate records were removed through a previously described algorithm in EndNote [18]. Then, titles and/or abstracts were screened based on the population (MS patients), intervention (OCT) and outcome (disability progression) [19]. Full-text studies were assessed for eligibility and included, provided that these articles met the following criteria: (1) longitudinal study design; (2) OCT-based measurement of retinal layer thickness and/or thinning; (3) spectral-domain OCT (SD-OCT); (4) pRNFL and/or mGCIPL; (5) risk of disability progression as outcome (i.e. odds ratio, hazard ratio or relative risk); (6) adjustment of risk for confounders; (7) Expanded Disability Status Scale (EDSS), Multiple Sclerosis Functional Composite (MSFC) and/or Multiple Sclerosis Severity Score (MSSS) as a component in the definition of “disability progression” and (8) adult patients with confirmed MS according to the respective McDonald criteria [20,21,22]. Only articles written in English, French or Dutch were considered. There was no date or geographic restriction. Reviews, case reports, expert opinions, abstracts from conferences and preprints were omitted. All studies that passed this selection process were incorporated in our qualitative synthesis. A comprehensive overview of inclusion and exclusion criteria is available in Supplemental Information.

Data collection process and synthesis

We applied a predetermined data extraction form to collect relevant information, which consisted of the experimental design, baseline participant characteristics and primary outcome measures (reviewer SS). Our aim was to describe the risk (odds ratio; hazard ratio; relative risk) of disability worsening relative to pRNFL and/or mGCIPL threshold values (µm; µm/year) in MS patients after adjustment for confounders. We did not perform a meta-analysis of the data.

Risk-of-bias assessment

The Quality In Prognosis Studies (QUIPS) tool was used to evaluate the risk of bias within each individual study (reviewer SS). This validated tool for prognostic factor studies estimates bias based on 6 domains: (1) study participation; (2) study attrition; (3) prognostic factor measurement; (4) outcome measurement; (5) study confounding and (6) statistical analysis and reporting [23]. We rated the overall risk of bias as “low” when all six domains were assigned as “low” risk of bias [24].
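The overall rating rule stated above can be written out as a short sketch. The domain names follow the QUIPS list in this section; the function name `overall_risk_of_bias`, the label "not low" and the example ratings are hypothetical and serve only to illustrate the rule, since the text specifies no other overall category.

```python
# Six QUIPS domains as listed in the text.
QUIPS_DOMAINS = (
    "study participation",
    "study attrition",
    "prognostic factor measurement",
    "outcome measurement",
    "study confounding",
    "statistical analysis and reporting",
)


def overall_risk_of_bias(ratings):
    """Return "low" only if every QUIPS domain is rated "low".

    `ratings` maps each domain name to a rating string. Any other
    combination is reported as "not low" (a hypothetical label).
    """
    missing = [d for d in QUIPS_DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    if all(ratings[d] == "low" for d in QUIPS_DOMAINS):
        return "low"
    return "not low"


# Hypothetical example: one study with all six domains rated "low".
example_study = {domain: "low" for domain in QUIPS_DOMAINS}
print(overall_risk_of_bias(example_study))
```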

Results

Figure 2 illustrates the selection process of the included articles. Our search strategy identified 422 records across the different electronic databases. After removal of duplicate references and exclusion based on title and/or abstract, 101 full-text articles were assessed for eligibility. Of those, 8 studies fulfilled the inclusion and exclusion criteria and were integrated in this systematic review.

Fig. 2

Flow diagram of included studies in the systematic review. One full-text article was not retrieved (after request via email and ResearchGate). OCT-A optical coherence tomography-angiography, CIS clinically isolated syndrome, RIS radiologically isolated syndrome, INL inner nuclear layer, TD-OCT time-domain OCT

All 8 included studies had a longitudinal design, which enabled the investigators to trace the evolution of clinical impairment in MS cohorts across several time points (i.e. visits) and relative to previous OCT scans. The follow-up duration ranged from 2 to 10 years (Table 1). Five studies focused on MS patients with a relapsing–remitting disease course (RRMS), whereas the other three articles also considered primary and/or secondary progressive subtypes of MS (PPMS and/or SPMS). Similar definitions for “disability progression” were used. However, Lambe et al. required a greater increase in EDSS [25], and three studies delineated a combined endpoint with the Symbol Digit Modalities Test (SDMT) as an additional factor. With regard to SD-OCT, Spectralis (Heidelberg Engineering) and/or Cirrus (Carl Zeiss Meditec) devices were used; intervals between scans varied from 1 year to ≥ 7 years (for longitudinal assessment). To calculate the risk of disability progression with reference to prior retinal layer measurement(s), the authors applied either proportional hazards (hazard ratio) or logistic regression (odds ratio) statistical models. Both hazard ratio (HR) and odds ratio (OR) convey a sense of risk, although these measures are not interchangeable. In short, the HR compares the instantaneous rate of a prespecified event (here, disability worsening) between groups throughout follow-up, whereas the OR compares the odds that the event has occurred by the end of the observation period. Detailed study and cohort characteristics are provided in Supplemental Tables 1 and 2.
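To illustrate why HR and OR are not interchangeable, the following minimal sketch assumes a constant hazard in each group (an exponential model chosen for simplicity, not taken from any included study). The hazards of 0.05 and 0.10 events per year are hypothetical and give an HR of 2; the function name `ratios_under_constant_hazard` is likewise an assumption.

```python
import math


def cumulative_risk(hazard_per_year, years):
    """Cumulative event risk under a constant hazard: 1 - exp(-h * t)."""
    return 1.0 - math.exp(-hazard_per_year * years)


def ratios_under_constant_hazard(hazard_ref, hazard_exposed, years):
    """Return (risk ratio, odds ratio) at a given follow-up duration."""
    p_ref = cumulative_risk(hazard_ref, years)
    p_exp = cumulative_risk(hazard_exposed, years)
    risk_ratio = p_exp / p_ref
    odds_ratio = (p_exp / (1.0 - p_exp)) / (p_ref / (1.0 - p_ref))
    return risk_ratio, odds_ratio


# Hypothetical hazards (events per year); the hazard ratio is fixed at 2.
HAZARD_REF, HAZARD_EXPOSED = 0.05, 0.10
for years in (1, 5, 10):
    rr, odds_r = ratios_under_constant_hazard(HAZARD_REF, HAZARD_EXPOSED, years)
    print(f"{years:>2} years: HR = 2.00, risk ratio = {rr:.2f}, odds ratio = {odds_r:.2f}")
```

Under these assumptions the HR stays constant, whereas the OR lies above it and drifts further away as follow-up lengthens and the event becomes more common. This is one reason why aHR and aOR values from different studies should not be compared directly.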

Table 1 Characteristics and results of included studies

In 2016, Martinez-Lapiscina et al. introduced a threshold value for the cross-sectional (i.e. one-time or baseline) measurement of pRNFL thickness based on the tertile distribution of this parameter within their large MS cohort (879 patients) (Table 1). A baseline pRNFL thickness of ≤ 88 µm (Spectralis) was associated with a twofold increased risk (adjusted HR or aHR 1.96) for disability progression in subsequent years [26]. Later, Bsteh et al. corroborated these results in RRMS patients (aHR 2.96) [27]. Within similar and shared RRMS populations, this research group determined threshold values for the cross-sectional measurement of mGCIPL thickness via receiver operating characteristic (ROC) analysis, as well as for the longitudinal (i.e. OCT scans at several time points) measurement of pRNFL and mGCIPL thinning at a predefined specificity of 90%. A baseline mGCIPL thickness of < 77 µm (Spectralis) correlated with a threefold increased risk (adjusted OR or aOR 2.9) for clinical deterioration. Moreover, a pRNFL thinning rate of > 1.5 µm/year (Spectralis) and a mGCIPL thinning rate of ≥ 1.0 µm/year (Spectralis) resulted in a 15- and 18-fold increased risk (aOR 15.1 and 18.3, respectively) for disability worsening [28, 29]. In a 2021 article, Schurz et al. confirmed the above-mentioned findings within a different and broader MS cohort (RRMS, PPMS and SPMS). Their proportional hazards model demonstrated a sixfold and sevenfold increased risk (aHR 5.7 and 6.8, respectively) for EDSS escalation, when the pRNFL thinning rate was > 1.5 µm/year (Spectralis) and the mGCIPL thinning rate was ≥ 1.0 µm/year (Spectralis) [32].
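As an illustration of how a cutoff can be read off at a predefined specificity, the sketch below scans candidate cutoffs and keeps the largest one whose specificity is still at least 90%. It is a simplified stand-in and not the procedure of the cited studies; the function name `threshold_at_specificity` and all thickness values are hypothetical.

```python
def threshold_at_specificity(values_non_progressors, values_progressors,
                             min_specificity=0.90):
    """Largest cutoff with specificity >= min_specificity.

    A measurement counts as test-positive when it lies below the cutoff
    (thinner layer). Returns (cutoff, sensitivity, specificity).
    """
    candidates = sorted(set(values_non_progressors) | set(values_progressors))
    best = None
    for cutoff in candidates:
        true_negatives = sum(v >= cutoff for v in values_non_progressors)
        specificity = true_negatives / len(values_non_progressors)
        if specificity >= min_specificity:
            true_positives = sum(v < cutoff for v in values_progressors)
            sensitivity = true_positives / len(values_progressors)
            # Specificity can only fall as the cutoff rises, so the last
            # qualifying candidate is the largest admissible cutoff.
            best = (cutoff, sensitivity, specificity)
    return best


# Hypothetical baseline thickness values in micrometres.
non_progressors = [70, 74, 76, 78, 79, 80, 81, 82, 84, 86]
progressors = [66, 68, 70, 72, 73, 75, 77, 80]
print(threshold_at_specificity(non_progressors, progressors))
```

Fixing specificity first keeps the share of false-positive classifications among non-progressors bounded, at the cost of a lower sensitivity.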

Three studies defined somewhat different thresholds, each applied to a specific clinical context or research question. In a fourth article, Bsteh et al. investigated the risk of treatment failure (i.e. disability progression after the initiation of treatment). Here, a pRNFL thinning rate of ≥ 2.0 µm/year (Spectralis) and a mGCIPL thinning rate of > 0.5 µm/year (Spectralis) corresponded to a three- and fivefold increased risk (aHR 2.7 and 4.5, respectively) for treatment failure over time [30]. Cilingir et al., on the other hand, concentrated on young (between 16 and 45 years old) RRMS patients with a short disease duration (within 5 years) and a rather low EDSS (less than 4) (Supplemental Table 1). In this population, they outlined a twofold increased risk (aHR 2.43) for disability worsening over the next few years, if the baseline pRNFL thickness was ≤ 97 µm (Spectralis) [31]. With a median follow-up duration of approximately 10 years, Lambe et al. were able to explore long-term disability progression in an MS cohort, and found that a baseline mGCIPL thickness of < 70 µm (Cirrus) translated into a fourfold increased risk for clinical deterioration [25].

All risk estimates were adjusted for confounders (aHR and aOR), i.e. factors that influence both the independent (retinal layer measurement) and the dependent (disability progression) variable. Most studies corrected at least for age, disease duration and baseline EDSS (Table 1). Five out of 8 studies accounted for treatment status. To reduce noise, patients with ophthalmological (e.g. severe myopia), neurological or drug-related causes of vision loss or retinal damage not attributable to MS were excluded across all studies. These criteria led to the exclusion of a considerable proportion of patients for non-MS-related reasons (5.1–14.5% of patients; not specified in every study) [29, 30, 32]. Another confounder, albeit MS-related, is optic neuritis (ON). Eyes with a history of ON (ON-eyes) display a reduced cross-sectional pRNFL and mGCIPL thickness in comparison to eyes without a history of ON (non-ON-eyes). Most neuroaxonal damage occurs within the first few months following acute ON. After 6 months, longitudinal retinal layer thinning of ON-eyes again runs parallel to that of non-ON-eyes [33,34,35]. Table 2 summarizes the most recurrent inclusion and exclusion criteria based on ON and these dynamics. A detailed overview per article is available in Supplemental Table 3. OCT quality assessment in accordance with the OSCAR-IB consensus recommendations was performed in all studies [36, 37], except Martinez-Lapiscina et al. [26] (Supplemental Table 2). In this systematic review, we graded the overall risk of bias as “low”.

Table 2 Most recurrent patient and/or eye inclusion and exclusion criteria based on optic neuritis (ON)

Discussion

Since MS is characterized by an unpredictable disease course, accurate prognosis and personalized treatment constitute an important challenge in clinical practice. In particular, when treating the individual MS patient, few tools are available to guide clinicians in choosing a drug that provides the required efficacy with acceptable side effects. In this systematic review, we assessed OCT as a candidate prognostic tool and found that (1) cross-sectional assessment of pRNFL thickness ≤ 88 µm; (2) cross-sectional assessment of mGCIPL thickness < 77 µm; (3) longitudinal assessment of pRNFL thinning > 1.5 µm/year; and (4) longitudinal assessment of mGCIPL thinning ≥ 1.0 µm/year were all associated with an increased risk for disability progression in subsequent years. These results were validated in different MS cohorts (RRMS alone or RRMS, PPMS and SPMS combined), replicated across two statistical models (proportional hazards and logistic regression models) and adjusted for confounders such as age, disease duration, baseline EDSS and treatment status. Inclusion and exclusion criteria accounted, to a considerable degree, for the effect of retinal degeneration after ON. Some (more specific) thresholds were introduced to monitor the risk of treatment failure, clinical deterioration in young MS patients and long-term (10 years) disease progression. However, these findings have not yet been corroborated.
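The four thresholds summarized above can be expressed as a small sketch. It assumes a two-point annualized thinning rate (baseline minus follow-up thickness, divided by the interval in years), which may differ from how the included studies derived their rates, and it applies the Spectralis-based cutoffs reported in this review. The function names and the example measurements are hypothetical, and the sketch is for illustration only.

```python
def annualized_thinning(baseline_um, followup_um, years_between):
    """Two-point thinning rate in micrometres per year (positive = thinning)."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (baseline_um - followup_um) / years_between


def threshold_flags(prnfl_baseline, prnfl_followup,
                    mgcipl_baseline, mgcipl_followup, years_between):
    """Check measurements against the four thresholds listed in the text."""
    prnfl_rate = annualized_thinning(prnfl_baseline, prnfl_followup, years_between)
    mgcipl_rate = annualized_thinning(mgcipl_baseline, mgcipl_followup, years_between)
    return {
        "baseline pRNFL <= 88 um": prnfl_baseline <= 88,
        "baseline mGCIPL < 77 um": mgcipl_baseline < 77,
        "pRNFL thinning > 1.5 um/year": prnfl_rate > 1.5,
        "mGCIPL thinning >= 1.0 um/year": mgcipl_rate >= 1.0,
    }


# Hypothetical measurements: two scans taken 2 years apart.
flags = threshold_flags(prnfl_baseline=92, prnfl_followup=88,
                        mgcipl_baseline=78, mgcipl_followup=76.5,
                        years_between=2)
for name, met in flags.items():
    print(f"{name}: {met}")
```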

Relative to pRNFL measurement, cross-sectional and longitudinal mGCIPL thresholds yielded consistently higher risk estimates. In fact, mGCIPL atrophy is recognized to be a better clinicopathological measure, to be more sensitive to neuroaxonal degeneration after acute ON and to be less susceptible to ON-related swelling than pRNFL atrophy [33, 35, 38, 39]. Concerning OCT-scan frequency, longitudinal pRNFL and mGCIPL assessment resulted in larger effect sizes than the cross-sectional approach in our analysis. In addition to this, previous studies reported that the rate of longitudinal retinal layer thinning after prior ON (≥ 6 months) is equivalent for ON-eyes and non-ON-eyes, whereas an absolute difference in cross-sectional retinal layer thickness persists [33,34,35]. Altogether, we confirm that longitudinal mGCIPL measurement is the most robust and preferred biomarker for disability progression in MS.

To limit interference from non-MS-attributable retinal damage, all studies in this systematic review used “optimized” MS cohorts. In agreement with the OSCAR-IB criteria, patients with ophthalmological, neurological or drug-related causes of vision loss or retinal damage were excluded [36]. However, comorbidities are prevalent in MS [40], which translated into a substantial fraction (5.1–14.5%) of patients being omitted based on the above-mentioned criteria in our included studies [29, 30, 32]. This somewhat limits the application of retinal layer measurement to MS patients with few comorbidities, but also emphasizes the importance of an in-depth collaboration between neurologists and ophthalmologists (e.g. for patient selection and interpretation of results).

Another imaging biomarker for neurodegeneration is the MRI-based measurement of brain atrophy, quantified as BVL. Indeed, MS patients with BVL > 0.4%/year had a significantly higher annualized rate of EDSS change than patients below this threshold [41]. Nevertheless, interpretation at the level of an individual MS patient remains difficult. This is in part due to methodological reasons; physiological brain volume variation related to factors such as hydration status, alcohol use and genetic variation; and “pseudoatrophy”, i.e. a decrease in brain volume that follows the reduction in inflammation after initiation of treatment [5, 42]. With regard to OCT, the estimated technical or physiological variation is minor (< 1%) [43]; and in contrast to brain volume, mGCIPL (as well as pRNFL in the absence of acute ON) thickness is not directly affected by inflammation [7]. Moreover, recent studies have underscored the excellent reproducibility of standardized retinal layer measurements between different raters, over time and in a multicenter context [44,45,46,47]. These arguments endorse the use of OCT-measured pRNFL and/or mGCIPL atrophy as longitudinal and intra-individual biomarkers for disability worsening in MS, complementary to MRI-based BVL.

This systematic review has some limitations. First and foremost, our search strategy and selection process yielded a modest number of 8 articles; 3 of these studies had a (for the most part) shared RRMS cohort. Second, we did not perform a meta-analysis, but only a qualitative appraisal of the available literature. Lastly, the majority of included studies (5/8) focused on an RRMS cohort, which restricts the recommendations in this systematic review to RRMS patients. However, we believe that these RRMS patients (especially at disease onset) represent the principal population of interest: disease progression has not yet been established in these patients and, therefore, the prognostic value of retinal layer measurement is more relevant.

In conclusion, this small and qualitative systematic review builds on the available body of evidence that OCT-measured pRNFL and/or mGCIPL atrophy can predict disability progression in RRMS patients. We therefore recommend close clinical follow-up or initiation/change of treatment in RRMS patients with increased risk for clinical deterioration based on retinal layer thresholds, in particular when other poor prognostic signs co-occur.