Introduction

Motoneuron disease (MND) is a heterogeneous group of fatal neurodegenerative diseases affecting adults. The clinical [1] and pathophysiological [2] heterogeneity of this group complicates early diagnosis and personalized prognosis for the individual patient. Misdiagnosis is relatively common in the early stages of the disease. The spinal form of MND is often mistaken for compressive radiculopathy or myelopathy, which have their highest incidence in an age group similar to that of MND [3]. As a result, about 13% of patients undergo unnecessary surgical treatment [4], which may, in addition, accelerate MND progression [5]. The differential diagnosis is also complicated when MND involvement is limited to lower motoneurons (LMN) [6]; in such cases, diagnostic uncertainty can lead to costly and potentially risky therapies (immunosuppression) without benefit to the patient [7]. The search for new biomarkers of MND that enable a timely and reliable diagnosis and a personalized prognosis has therefore become one of the main priorities of this research area.

Over the last 20 years, several studies have consistently shown increased levels of neurofilament light chain (NfL) and phosphorylated neurofilament heavy chain (p-NfH) in the cerebrospinal fluid (CSF) and serum of MND patients. Accordingly, these proteins have been proposed as potential diagnostic biomarkers for MND [8]; their utility as prognostic and potentially pharmacodynamic biomarkers has been explored in parallel. These neuron-specific cytoskeletal proteins from the family of intermediate filaments are predominantly expressed in large, myelinated axons and are classified according to the molecular weight of their subunits as light (NfL), medium (NfM), and heavy (NfH) chains, forming a triplet (within a tetramer structure) [9]. The approximate intracellular ratio of NfL/NfM/NfH is 5:3:1 [10]. In CSF and serum, increased levels of NfL [11, 12] and p-NfH [13, 14, 15] have also been reported in several other neurological disorders (multiple sclerosis, neuroinflammation, infarction, traumatic lesion, and other neurodegenerative diseases), suggesting that neurofilaments (Nfs) are non-specific markers of neuroaxonal damage. When MND is compared with these disorders, the discriminative ability of both Nfs towards MND seems to be significantly lower [15]. Furthermore, some studies have demonstrated significantly increased levels of neurofilaments in compressive myelopathies and radiculopathies as well [16, 17]. So far, other studies have not specifically focused on the ability of neurofilaments to discriminate between compressive root lesions and MND, and the group of myelo-/radiculopathies has been underrepresented [15, 18, 19, 20].
Furthermore, although several studies have shown significantly lower levels of NfL and p-NfH in CSF and serum in variants of MND with LMN-restricted involvement as compared to other MND phenotypes [18, 21, 22], data on the discriminative ability of Nfs in these variants, when compared to MND mimics, are lacking. Such data would be very meaningful from a clinical diagnostic perspective.

Therefore, in this retrospective, monocentric study, we evaluated the discriminative ability of CSF NfL, CSF p-NfH, and serum p-NfH in MND patients and patients with other neurological disorders relevant to the differential diagnosis of MND. In detail, we compared the levels of these proteins (1) between MND and compressive myelo-/radiculopathies and (2) between MND patients with isolated LMN involvement and patients with other clinically relevant conditions. Our secondary goal was to determine the clinical parameters in MND that influence levels of Nfs in serum and CSF.

Methods

Patients

Between 2012 and 2019, a total of 164 patients undergoing CSF examination as part of the routine diagnostic work-up in our neurological department were enrolled retrospectively. The study was approved by our institutional Ethics Committee.

A total of 71 patients with a diagnosis of MND established at our neuromuscular reference centre according to the 2015 revision of the El Escorial criteria (patients with an isolated LMN lesion in ≥ 2 regions were also included) [23] were enrolled (Table 1). First symptoms (bulbar, weakness, fasciculations), as reported by the patients, and the ALSFRS-R score closest to the time of lumbar puncture (LP) were ascertained from medical records. The time from MND onset to LP was determined. The disease progression rate (∆FS) was calculated as described elsewhere [24].

Table 1 Comparison of characteristics between MND (n = 71) and myelo-/radiculopathies (n = 30) and between MND and disease controls (n = 63)

Further, the MND population was divided into different categories according to three criteria. The first category was composed of MND patients for whom the differential diagnosis of compressive myelopathy or radiculopathy had been considered based on clinical-radiological correlation or who had been misdiagnosed before referral to our centre (MND-C, n = 18, Table 2). The criteria for a suspicious radiological finding were: (a) for the cervical region, a finding of myelopathy or cervical compression index (CCI) grade 3 [25] and/or ≥ 2 levels with foraminal stenosis, corresponding to the documented clinical deficit; (b) for the lumbar region, a dural sac cross-sectional area < 100 mm² [26] and/or ≥ 2 levels with foraminal stenosis (Table 3). The diagnosis of MND was confirmed by EMG (evidence of LMN dysfunction according to Awaji criteria found in regions without compressive lesion) and/or based on further clinical progression of the disease. The rest of the MND patients did not fulfil these criteria (n = 53, Table 2). The second category was composed of patients with LMN-isolated involvement (in ≥ 2 regions) at the time of LP (MND-LMN, n = 15, Table 4). The other two subgroups comprised patients fulfilling the Awaji criteria of at least possible ALS, with either spinal (MND-ALS, n = 36) or bulbar (MND-B, n = 20) involvement at onset. The third category was composed of MND patients divided according to their progression rate into groups of slow (∆FS < 0.3; 25th percentile, n = 20), fast (∆FS ≥ 1; 75th percentile, n = 21), and intermediate (∆FS 0.3–1, n = 30) progressors (Supplementary Table 1).
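The progression-rate grouping can be illustrated with a minimal sketch. It assumes the commonly used definition ∆FS = (48 − ALSFRS-R score) / months since symptom onset; the exact formula is the one given in [24] and is not reproduced in this text, so this definition, the function names, and the example values are assumptions. Only the thresholds (< 0.3, 0.3–1, ≥ 1) are taken from the text.

```python
ALSFRS_R_MAX = 48  # maximum ALSFRS-R score, i.e. no functional deficit


def delta_fs(alsfrs_r: float, months_since_onset: float) -> float:
    """ALSFRS-R points lost per month since onset (assumed definition)."""
    if months_since_onset <= 0:
        raise ValueError("months_since_onset must be positive")
    return (ALSFRS_R_MAX - alsfrs_r) / months_since_onset


def progression_group(rate: float) -> str:
    """Map a progression rate to the three groups used in the study."""
    if rate < 0.3:
        return "slow"
    if rate >= 1.0:
        return "fast"
    return "intermediate"


# Hypothetical patients: (ALSFRS-R at lumbar puncture, months since onset)
for score, months in [(45, 12), (42, 12), (30, 12)]:
    rate = delta_fs(score, months)
    print(f"ALSFRS-R {score} after {months} months: "
          f"dFS = {rate:.2f} ({progression_group(rate)})")
```

With these hypothetical inputs, the three patients fall into the slow, intermediate, and fast groups, respectively.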

Table 2 Comparison of characteristics between MND-C and other MNDs, between MND-C and myelo-/radiculopathies and between other MNDs and myelo-/radiculopathies
Table 3 Association between radiological and clinical findings and biomarker levels among MND-C patients (n = 18)
Table 4 Comparison of characteristics between MND-ALS, MND-LMN, and MND-B

There were 93 patients without MND (Table 1). The first cohort consisted of 30 patients with cervical spondylotic myelopathy or pluriradiculopathy due to lumbar spinal stenosis, multilevel foraminal stenosis, or discogenic compression (for details, see Supplementary Table 2). The medical records of each of these patients could be tracked for at least one year after LP and showed no signs of developing MND. The second cohort (63 patients; disease controls, DC) comprised subjects with various peripheral nervous system disorders or functional neurological disorders (Table 5).

Table 5 Demographics, clinical data, and biomarker concentration of the disease controls (DC)

Neurofilament assays

Samples from all 71 MND patients and 93 other patients were tested for CSF p-NfH and NfL levels, and samples from 62 MND patients and 62 other patients (26 myelo-/radiculopathies and 36 DC) were tested for p-NfH in serum with ELISA kits. CSF was obtained by lumbar puncture, centrifuged, aliquoted, and stored within 2 h at −80 °C until analysis. A paired sample of serum, obtained at the time of lumbar puncture, was processed likewise. The cell count and the concentration of albumin in CSF and serum, with a calculation of the CSF/serum albumin ratio (Qalb), were determined in a standard manner. Commercially available CE-marked ELISA kits for in vitro diagnostics (IVD) were used for both NfL (UmanDiagnostics AB, Umeå, Sweden, kit No. 10-7001 CE) and p-NfH (Euroimmun AG, Lübeck, Germany), according to the manufacturers’ instructions. For p-NfH in CSF, kit No. EQ 6561-9601 with a declared limit of detection of 27 pg/ml was used, and for serum, kit No. EQ 6562-9601 with a declared limit of detection of 1.7 pg/ml. Samples were measured in duplicate, and the mean intra-assay coefficient of variation was 1% for NfL, 8% for p-NfH in CSF, and 4.4% for p-NfH in serum.
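As a minimal sketch of the two derived quantities mentioned above, the following example computes an intra-assay coefficient of variation (CV) from duplicate measurements and the albumin quotient. The source does not state how the CV was derived, so the per-sample definition used here (sample standard deviation of the duplicate divided by its mean, averaged over samples) is an assumption, as are the function names and all readings.

```python
from statistics import mean, stdev


def duplicate_cv_percent(a: float, b: float) -> float:
    """CV (%) of one duplicate measurement: sample SD / mean * 100."""
    return stdev([a, b]) / mean([a, b]) * 100


def mean_intra_assay_cv(duplicates) -> float:
    """Average the per-sample CVs over all duplicate pairs."""
    return mean(duplicate_cv_percent(a, b) for a, b in duplicates)


def albumin_quotient(csf_albumin: float, serum_albumin: float) -> float:
    """Qalb = CSF albumin / serum albumin, both in the same unit."""
    return csf_albumin / serum_albumin


# Hypothetical duplicate readings (pg/ml) for three samples
readings = [(1200.0, 1230.0), (640.0, 610.0), (2150.0, 2210.0)]
print(f"mean intra-assay CV: {mean_intra_assay_cv(readings):.1f}%")

# Hypothetical albumin concentrations in g/l
print(f"Qalb: {albumin_quotient(0.25, 40.0) * 1000:.2f} x 10^-3")
```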

Statistical analyses

The normality of data was assessed with the Shapiro–Wilk test. For non-normally distributed data, a Kruskal–Wallis test was performed and, where the difference was statistically significant, followed by pairwise comparisons with the two-tailed Mann–Whitney test at a significance level of 5% with Bonferroni correction. Demographic (age, sex) and clinical data (phenotype, progression rate, and duration of disease, particularly for MND subcohorts) were assessed as covariates in a multiple linear regression model. Categorical parameters were tested with the Pearson chi-squared test (multiple comparisons) or Fisher’s exact test (pairwise comparison). Correlation analysis of non-normally distributed data was performed with the Kendall correlation test (due to multiple tied ranks and relatively small sample numbers). The diagnostic performance of p-NfH (CSF, serum) and NfL was assessed with receiver-operating characteristic (ROC) curves and the area under the curve (AUC). The optimal cut-off was calculated using three different methods: (1) the Youden J index, (2) a target sensitivity of 90%, to model a clinical screening situation minimizing false negatives, and (3) a target specificity of 90%, to model a confirmatory test minimizing false positives. For each optimal cut-off value, the sensitivity, specificity, and predictive values with corresponding 95% confidence intervals (CI) were calculated. Analyses were performed in IBM SPSS Statistics 28 (generation of tables), MS Office (generation of boxplot figures), and R 4.2 (pROC and ROCit packages).
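The three cut-off strategies can be sketched in plain Python. This is an illustrative re-implementation, not the pROC or ROCit code used in the study. It assumes that higher marker values indicate disease and that candidate cut-offs are the midpoints between consecutive observed values. The function names and the data are hypothetical.

```python
from itertools import product


def auc(cases, controls):
    """AUC as the probability that a case exceeds a control (ties count 0.5)."""
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c, k in product(cases, controls))
    return wins / (len(cases) * len(controls))


def candidate_cutoffs(cases, controls):
    """Midpoints between consecutive distinct observed values."""
    values = sorted(set(cases) | set(controls))
    return [(a + b) / 2 for a, b in zip(values, values[1:])]


def sens_spec(cases, controls, cutoff):
    """A value above the cut-off counts as a positive test."""
    sens = sum(v > cutoff for v in cases) / len(cases)
    spec = sum(v <= cutoff for v in controls) / len(controls)
    return sens, spec


def youden_cutoff(cases, controls):
    """Cut-off maximizing J = sensitivity + specificity - 1."""
    return max(candidate_cutoffs(cases, controls),
               key=lambda c: sum(sens_spec(cases, controls, c)) - 1)


def cutoff_for_target_sensitivity(cases, controls, target=0.90):
    """Highest cut-off that still reaches the target sensitivity."""
    ok = [c for c in candidate_cutoffs(cases, controls)
          if sens_spec(cases, controls, c)[0] >= target]
    return max(ok) if ok else None


def cutoff_for_target_specificity(cases, controls, target=0.90):
    """Lowest cut-off that already reaches the target specificity."""
    ok = [c for c in candidate_cutoffs(cases, controls)
          if sens_spec(cases, controls, c)[1] >= target]
    return min(ok) if ok else None


# Hypothetical marker levels (pg/ml), not study data
cases = [400, 900, 1300, 1500, 1800, 2200, 2600, 3100, 4000, 5200]
controls = [200, 300, 350, 500, 600, 700, 1000, 1100, 1600, 2400]

print(auc(cases, controls))                            # 0.83
print(youden_cutoff(cases, controls))                  # 1200.0
print(cutoff_for_target_sensitivity(cases, controls))  # 800.0
print(cutoff_for_target_specificity(cases, controls))  # 1700.0
```

In this toy data set, the screening-oriented cut-off (800) lies below the Youden cut-off (1200) and the confirmatory cut-off (1700) above it, which mirrors the trade-off between false negatives and false positives described in the text.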

Results

Clinical characteristics

We found no significant differences in age or sex between MND and the other two groups (Table 1). Among the MND subgroups, MND-C patients were significantly older than the other MND patients (Table 2). As for disease duration, fast progressors displayed the shortest and slow progressors the longest time-to-LP interval (Supplementary Table 1), whereas there were no statistically significant differences in this regard among the other subgroups. As for the progression rate (∆FS), the MND-LMN subgroup displayed significantly lower values than MND-B (Table 4). Among MND-LMN patients, slow progressors were significantly more frequent and fast progressors significantly less frequent than in the MND-B subgroup (Table 4). Within the entire MND-LMN group (n = 15), eight patients (53.3%) manifested involvement of upper motor neurons (UMN) during the course of the disease and fulfilled Awaji criteria for ALS, while in seven patients, the lesion remained limited to LMN signs throughout the entire course of the disease; these patients were diagnosed with progressive muscular atrophy (PMA). Patients with LMN-isolated lesion were significantly more frequently represented in the MND-C group (7/18; 38.9%) than in the rest of the MND population (8/53; 15.1%) (p = 0.046). Regarding the radiological findings among MND-C patients (Table 3), CCI grade 3, but not myelopathy, was significantly associated with the presence of a clinical UMN lesion. All but one patient with LMN-isolated lesion (within the MND-C subgroup) had radiological evidence of radicular compression at two or more levels, neuroanatomically corresponding to the clinical deficit, but this association was not statistically significant.

Concentrations of biochemical markers

We observed statistically significant differences between MND and myelo-/radiculopathies and between MND and disease controls in the levels of p-NfH (CSF), p-NfH (serum), NfL, and total protein and in Qalb values. Patients from the MND group had significantly higher levels of p-NfH (both CSF and serum) and NfL and significantly lower Qalb values and total protein levels compared to the other two groups (Table 1; Fig. 1a, b; Supplementary Fig. 1a). Furthermore, the group of myelo-/radiculopathies had significantly higher levels of NfL (p < 0.001) and p-NfH in CSF (p = 0.003), but not of p-NfH in serum (p = 0.89), than the DC group. In a multiple linear regression model accounting for age and sex, the significant effect of MND on higher levels of NfL and p-NfH in CSF and serum was confirmed (for both Nfs in CSF p < 0.001, for p-NfH in serum p = 0.04). In this model, a significant effect of age (increasing levels with increasing age) and sex (higher levels in males) on levels of NfL and p-NfH in CSF was observed in the entire study population (p < 0.001 for both Nfs and both parameters). For p-NfH in serum, the effect of age (p = 0.044), but not of sex (p = 0.195), was significant in this model. When only MND patients were tested in this model, no significant effect of age or sex on levels of CSF NfL (for age: p = 0.654, for sex: p = 0.223) or CSF p-NfH (for age: p = 0.138, for sex: p = 0.552) was found.

Fig. 1

a Comparison of transformed values of p-NfH (CSF, pg/ml) with natural logarithm between MND (n = 71) and compressive myelo-/radiculopathies (n = 30) and between MND and disease controls (DC; n = 63). b Comparison of transformed values of NfL (CSF, pg/ml) with natural logarithm between MND (n = 71) and compressive myelo-/radiculopathies (n = 30) and between MND and disease controls (DC; n = 63)

Correlation between parameters

A correlation analysis of all studied parameters (Supplementary Table 3) revealed a significant positive correlation between age and levels of both CSF Nfs among patients with compressive myelo-/radiculopathies (NfL τb = 0.471; p-NfH τb = 0.372) and disease controls (NfL τb = 0.369; p-NfH τb = 0.313). We observed no significant correlation between these parameters among MND patients. A significant positive correlation was found between age and p-NfH in serum among myelo-/radiculopathies (τb = 0.440), but not among disease controls or MND patients. In all groups, we found a significant positive correlation between levels of p-NfH and NfL in CSF (MND τb = 0.575, myelo-/radiculopathies τb = 0.526, DC τb = 0.657) and a significant, but only moderate, correlation between levels of p-NfH in CSF and in serum (MND τb = 0.280, myelo-/radiculopathies τb = 0.284, DC τb = 0.283). Furthermore, a significant strong positive correlation was found between levels of CSF NfL and ∆FS (τb = 0.390), a significant moderate positive correlation between CSF p-NfH and ∆FS (τb = 0.265), and a significant weak positive correlation between serum p-NfH and ∆FS (τb = 0.190).
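For readers less familiar with the τb statistic reported above, the following sketch computes Kendall's τb directly from its definition, which corrects for tied ranks. It is an illustration only. The data are hypothetical, and the study values were obtained with statistical software, not with this code.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)).

    C and D are the numbers of concordant and discordant pairs, Tx the pairs
    tied only in x, and Ty the pairs tied only in y. Pairs tied in both
    variables are ignored.
    """
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            tied_x += 1
        elif dy == 0:
            tied_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = sqrt((concordant + discordant + tied_x)
                       * (concordant + discordant + tied_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical paired observations containing ties
x = [1, 2, 2, 3, 4]
y = [1, 3, 2, 3, 5]
print(round(kendall_tau_b(x, y), 3))  # 0.889
```

In the example, 8 of the 10 pairs are concordant, one is tied only in x, and one only in y, giving τb = 8 / sqrt(9 × 9) = 8/9.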

Concentration of neurofilaments in different MND subgroups

Patients from the MND-C group (n = 18) displayed significantly lower levels of CSF NfL (p = 0.05), but not of p-NfH (in either CSF or serum), as compared to the rest of the MND patients (n = 53). Further, the MND-C group had significantly higher levels of NfL (p = 0.048) and p-NfH in CSF (p = 0.007) (Fig. 2a, b), but not of p-NfH in serum (p = 0.164), than patients from the group of compressive myelo-/radiculopathies. The rest of the MND patients had significantly higher levels of each Nf than patients from the myelo-/radiculopathy group (Table 2, Fig. 2a, b).

Fig. 2

a Comparison of transformed values (natural logarithm) of p-NfH (CSF, pg/ml) between MND-C (n = 18), MND non-C (n = 53), and compressive myelo-/radiculopathies (n = 30). b Comparison of transformed values (natural logarithm) of NfL (CSF, pg/ml) between MND-C (n = 18), MND non-C (n = 53), and compressive myelo-/radiculopathies (n = 30). c Comparison of transformed values (natural logarithm) of NfL (CSF, pg/ml) between MND-ALS (n = 36), MND-LMN (n = 15), and MND-B (n = 20). d Comparison of transformed values (natural logarithm) of p-NfH (CSF, pg/ml) between MND-ALS (n = 36), MND-LMN (n = 15), and MND-B (n = 20)

Patients from the MND-LMN subgroup had significantly lower levels of NfL than patients from the MND-ALS (p = 0.003) and the MND-B (p = 0.003) subgroups (Table 4, Fig. 2c). The MND-LMN group had significantly lower levels of p-NfH in CSF than the MND-ALS group (p = 0.008), but not the MND-B group (p = 0.027, NS after Bonferroni correction) (Fig. 2d). We found no significant difference in serum p-NfH levels among the three groups (p = 0.347). The effect of the MND-LMN subtype on NfL levels was also confirmed in a multiple linear regression model including age, sex, ∆FS, and time-to-LP (between MND-LMN and MND-B: p = 0.01; between MND-ALS and MND-LMN: p = 0.006). The same model confirmed a significant effect of MND-LMN on CSF p-NfH levels in comparison to MND-ALS (p = 0.042), but not to MND-B (p = 0.075). Within the MND-LMN group, slow progressors (n = 8) had significantly lower levels of NfL (p = 0.002) and p-NfH in CSF (p = 0.008) than intermediate and fast progressors. No significant differences in levels of p-NfH (CSF) (p = 0.95) or NfL (p = 0.68) were found between patients with PMA (n = 7) and those who later fulfilled ALS criteria (n = 8). Furthermore, the MND-LMN group had significantly higher levels of CSF NfL and p-NfH than disease controls (p < 0.001 for both), but not of p-NfH in serum (p = 0.642) (Table 6). No significant difference in any Nf levels between MND-LMN and myelo-/radiculopathies was observed (Table 6, Supplementary Fig. 1b, c).
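The remark "NS after Bonferroni correction" can be checked with a short calculation. The sketch assumes that the correction was applied across the three pairwise comparisons among MND-ALS, MND-LMN, and MND-B; the text does not state the number of comparisons explicitly, so this is an assumption, as is the function name.

```python
from itertools import combinations

ALPHA = 0.05
groups = ["MND-ALS", "MND-LMN", "MND-B"]

# Three groups give three pairwise comparisons (assumed correction family)
n_comparisons = len(list(combinations(groups, 2)))
threshold = ALPHA / n_comparisons  # about 0.0167


def significant_after_bonferroni(p, m=n_comparisons, alpha=ALPHA):
    """Compare an unadjusted p-value with the Bonferroni-adjusted threshold."""
    return p < alpha / m


# Unadjusted p-values for CSF p-NfH reported in the text
print(significant_after_bonferroni(0.008))  # MND-LMN vs MND-ALS: True
print(significant_after_bonferroni(0.027))  # MND-LMN vs MND-B: False
```

Under this assumption, p = 0.027 exceeds the adjusted threshold of about 0.0167, while p = 0.008 and p = 0.003 remain significant, consistent with the results as reported.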

Table 6 Comparison of characteristics between MND-LMN, myelo-/radiculopathies, and DC

Among the cohorts divided according to progression rate, fast progressors had the highest and slow progressors the lowest NfL levels in the CSF (Supplementary Table 1). The independent effect of a group on the levels of NfL was confirmed in a multiple linear regression model including age, sex, phenotype, and time-to-LP (between slow and fast progressors: p = 0.005, between intermediate and slow progressors: p = 0.049). The difference in levels of CSF p-NfH was significant only between the fast and slow progressor group, confirmed by the multiple linear regression model (p = 0.038). No significant difference in p-NfH levels in serum among the three groups was found.

Diagnostic performance of Nfs in discriminating MND from other disorders

In the entire study population (n = 164), the ability of both CSF Nfs to discriminate patients with MND from the other two groups combined was good (CSF p-NfH: AUC 0.885, CI 0.832–0.938; CSF NfL: AUC 0.892, CI 0.839–0.946), while for p-NfH in serum, it was poor (AUC 0.650, CI 0.553–0.746) (Table 7a; Fig. 3a, b; Supplementary Fig. 1d). The discriminative ability of CSF NfL and CSF p-NfH was comparable (AUC comparison between ROC curves p = 0.723) and significantly higher than that of p-NfH in serum (for both p < 0.001). All three cut-off values (Youden-J index, target sensitivity of 90%, and target specificity of 90%) were calculated (Table 7a). For CSF p-NfH, a Youden-J cut-off value of 1205 pg/ml reached a sensitivity of 81.7 (CI 73.2–90.1%) and a specificity of 86.0 (CI 78.5–92.5%), with PPV of 81.7 (CI 78.5–92.5%) and NPV of 86.0 (CI 76.9–92.3%). For target sensitivity of 90%, the cut-off value was 625 pg/ml, with a specificity of 71.0 (CI 61.3–80.7%), while for target specificity of 90%, a cut-off value of 2101.5 pg/ml reached a sensitivity of 67.6 (CI 55.5–78.2%). For CSF NfL, the Youden-J cut-off value of 2180 pg/ml was identical to that of the target specificity of 90%, with a sensitivity of 83.1 (CI 74.7–90.1%), PPV of 86.8 (CI 77.0–93.4%), and NPV of 87.5 (CI 79.9–94.5%). For the target sensitivity of 90%, a cut-off value of 1525 pg/ml reached a specificity of 80.6 (CI 73.1–88.2%). When only the MND and DC groups were included in the analysis (Supplementary Table 4a), the diagnostic performance of CSF p-NfH and CSF NfL (but not of p-NfH in serum) increased significantly further (for CSF p-NfH: AUC 0.954, CI 0.904–0.981, AUC comparison between ROC curves p = 0.044; for NfL: AUC 0.962, CI 0.928–0.996, p = 0.014; for p-NfH in serum: AUC 0.656, CI 0.545–0.767, p = 0.46), becoming excellent for both CSF Nfs.
Further, all the respective calculated cut-off values were lower, with higher sensitivities, specificities, PPV, and NPV as compared to the analysis with myelo-/radiculopathies included (Supplementary Table 4a).
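The relation between the reported sensitivity, specificity, and predictive values can be made explicit with a confusion matrix. The counts below are not given in the text. They are reconstructed from the reported percentages for the CSF p-NfH Youden-J cut-off (1205 pg/ml) and the group sizes (71 MND, 93 non-MND) and should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Reconstructed counts (assumption): 58 of 71 MND patients above the cut-off,
# 80 of 93 non-MND patients at or below it
metrics = diagnostic_metrics(tp=58, fn=13, tn=80, fp=13)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
# sensitivity: 81.7%
# specificity: 86.0%
# ppv: 81.7%
# npv: 86.0%
```

Because the numbers of false negatives and false positives happen to coincide in this reconstruction, PPV equals sensitivity and NPV equals specificity. The predictive values depend on the proportion of MND patients in the sample (71 of 164 here) and would differ in a population with another prevalence.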

Table 7 Results of receiver-operating characteristic (ROC) analysis discriminating patients
Fig. 3

a ROC curve for p-NfH in CSF (pg/ml) discriminating MND from myelo-/radiculopathies and disease control group; red point shows p-NfH cut-off value obtained as maximum value of Youden’s index; AUC area under the curve, CI confidence interval. b ROC curve for NfL (pg/ml) discriminating MND from myelo-/radiculopathies and disease control group. c ROC curve for p-NfH in CSF (pg/ml) discriminating MND-C from myelo-/radiculopathies group. d ROC curve for NfL (pg/ml) discriminating MND-C from myelo-/radiculopathies group. e ROC curve for p-NfH in CSF (pg/ml) discriminating MND-LMN from disease controls. f ROC curve for NfL (pg/ml) discriminating MND-LMN from disease controls

The ability of CSF p-NfH to discriminate the MND-C group from myelo-/radiculopathies was fair (Table 7b; AUC 0.733, CI 0.592–0.875; Fig. 3c) and not significantly lower than in the respective comparison between the rest of the MND group (n = 53) and myelo-/radiculopathies (Supplementary Table 4b; AUC 0.774, CI 0.627–0.869; AUC comparison between ROC curves: p = 0.341). For p-NfH in serum (AUC 0.627, CI 0.444–0.809) and NfL (AUC 0.672, CI 0.514–0.830; Fig. 3d), the discriminative ability was poor (Table 7b), but not significantly different as compared to other MND patients in this respect (Supplementary Table 4b; CSF NfL: AUC 0.768, CI 0.621–0.862; AUC comparison p = 0.341). For CSF p-NfH, a Youden-J cut-off value of 607.5 pg/ml reached a sensitivity of 94.4 (CI 83.3–100%) and a specificity of 50 (CI 33.3–66.7%), with PPV of 53.1 (CI 35.0–70.5%) and NPV of 93.7 (CI 67.7–99.7%). For target sensitivity of 90%, the cut-off value was 702 pg/ml, with a specificity of 50 (CI 33.3–66.7%), and for target specificity of 90%, a cut-off value of 5014 pg/ml reached a sensitivity of 22.2 (CI 5.6–44.4%). For NfL, the Youden-J cut-off value was 2180 pg/ml, with a sensitivity of 77.8 (CI 55.6–94.4%), specificity of 70 (CI 53.3–86.7%), PPV of 58.3 (CI 36.9–77.2%), and NPV of 83.3 (CI 61.8–94.5%). For target sensitivity of 90%, the cut-off value was 895 pg/ml, with a specificity of 33.3 (CI 16.7–50.0%), and for target specificity of 90%, a cut-off value of 7682 pg/ml reached a sensitivity of 11.1 (CI 1.9–36.0%). Taken together, we could not reach an optimal cut-off value without a substantial loss of either sensitivity or specificity, especially for p-NfH in CSF. Therefore, the cut-off values derived from the entire study population were applied (for p-NfH in CSF: 1205 pg/ml, for NfL: 2180 pg/ml). When these values were applied, 11 (36.7%) and 9 (30%) patients from the myelo-/radiculopathy group had levels above the cut-offs for CSF p-NfH and NfL, respectively.
In seven out of ten patients (70%) from the myelo-/radiculopathy group who underwent surgical decompression, the measured levels of both CSF Nfs were simultaneously above these cut-offs. On the other hand, five patients from the MND-C group (27.8%) underwent surgical treatment due to suspected compressive aetiology before referral to our centre. In two out of these five patients (40%), the levels of both Nfs in CSF were below the cut-off values.

When comparing MND-LMN patients with the disease control group, the discriminative ability of CSF p-NfH (AUC 0.848, CI 0.671–0.934) and NfL (AUC 0.838, CI 0.694–0.981) was good (Table 7c). For CSF NfL, this result was significantly lower than in the comparison between other MND patients (n = 56) and DC (Supplementary Table 4c; AUC 0.995, CI 0.981–0.999; AUC comparison between ROC curves: p = 0.032), while for CSF p-NfH, the difference was not significant in this respect (AUC 0.968, CI 0.920–0.987, AUC comparison between ROC curves: p = 0.064). Serum p-NfH failed to discriminate between MND-LMN and DC (AUC 0.457, CI 0.254–0.661). For p-NfH (CSF), the Youden-J cut-off value of 605 pg/ml reached a sensitivity of 80 (CI 60.0–95.7%) and a specificity of 81 (CI 71.4–90.5%), with PPV of 50 (CI 34.5–84.7%) and NPV of 94.4 (CI 82.1–97.2%). For target sensitivity of 90%, a cut-off value of 535 pg/ml had a specificity of 66.7 (CI 71.4–90.5%), and for target specificity of 90%, a cut-off value of 727 pg/ml reached a sensitivity of 66.7 (CI 38.4–88.2%), with PPV of 62.5 (CI 41.9–86.1%) and NPV of 92.1 (CI 78.0–97.0%). For NfL, the Youden-J cut-off value of 2170 pg/ml reached a sensitivity of 60 (CI 33.3–86.7%), with a specificity of 100 (CI 94.3–100%), PPV of 100 (CI 70.3–100%), and NPV of 91.4 (CI 76.9–100%). The cut-off value for target sensitivity of 90% was 800 pg/ml, with a specificity of 68.3 (CI 55.3–79.3%). For target specificity of 90%, a cut-off value of 1216.5 pg/ml reached a sensitivity of 66.7 (CI 46.5–86.7%). Regarding the choice of an optimal cut-off value, in a clinical situation where diagnostic confirmation is requested, high test specificity has to be preferred.
For discrimination between MND-LMN and disease controls, we therefore selected 727 pg/ml for p-NfH (CSF) and 2170 pg/ml for NfL as the optimal cut-offs, based on the highest achievable test specificity and positive predictive value while maintaining acceptable sensitivity and negative predictive values. When analysing the discriminative ability between all MND (n = 71) and DC groups, an identical cut-off value was obtained for CSF p-NfH, while for NfL, a lower value of 1425 pg/ml yielded the best diagnostic performance (Supplementary Table 4a). As there were seven patients with an LMN-restricted phenotype and a confounding radiological finding of multiple nerve root compression (46.7% of the MND-LMN subgroup), we compared the MND-LMN group to all controls combined (n = 93), but could not reach an optimal cut-off value without a substantial loss of either sensitivity or specificity and, most importantly, without very poor positive predictive values (Supplementary Table 4d). Therefore, we applied the cut-off values established from the entire study population (all MND, n = 71, compared to all controls, n = 93), i.e. 1205 pg/ml for CSF p-NfH and 2180 pg/ml for CSF NfL, to the MND-LMN group, to account for the clinical scenario of differentiating between MND-LMN and compressive myelo-/radiculopathies as well. These cut-off values yielded 8 (53.3%) and 6 (40%) false-negative MND-LMN patients, respectively.

Discussion

To the best of our knowledge, this is the first study analysing the discriminative ability of Nfs towards specific MND subgroups, where the clinician is effectively in doubt due to a restricted phenotype (limited clinical signs of the disease) or confounding radiological findings of a nerve root or spinal cord compression.

Our study demonstrated that both p-NfH and NfL in CSF have an overall good discriminative ability towards the group of all MND patients in the entire study population. When patients with myelo-/radiculopathy (a group with NfL and p-NfH levels in CSF significantly higher than in the group of disease controls) were excluded from the analysis, the discriminative ability of both Nfs further increased (Supplementary Table 4a). In addition, after this exclusion, our diagnostic performance and the respective calculated cut-off values towards MND strongly resembled those of previous studies in which the structure of the disease control (or mimic) groups was comparable to that of our DC group [18, 19, 20]. Given the non-specific nature of Nf elevation, which reflects the extent and pace of axonal degeneration, we assume that the main reason for the observed difference (myelo-/radiculopathies compared to disease controls) is a higher burden of neuroaxonal damage associated with acute (multiple) nerve root or spinal cord compression. Our own data support this assumption, as we observed significantly higher levels of both CSF Nfs among acute myelopathies, as well as pluriradiculopathies, compared to chronic myelopathies or monoradiculopathies (Supplementary Table 2). The effect of the pace and extent of neuroaxonal damage on p-NfH levels among compressive myelopathies or radiculopathies has been documented by other authors as well [16, 17]. On the other hand, the group of disease controls comprised aetiologies with chronic, slow, and predominantly peripheral nerve involvement or chronic, predominantly demyelinating polyradiculoneuropathies, where the degree of neuroaxonal damage seems to be less pronounced. After the inclusion of myelo-/radiculopathies, the discriminative ability of both Nfs remained good, although significantly lower than in the former analysis, while the respective cut-off values increased (e.g. for p-NfH in CSF from 840 to 1205 pg/ml) and positive predictive values were reduced (e.g. for p-NfH in CSF from 92.5%, CI 82.7–97.2% to 81.7%, CI 78.5–92.5%). Our data confirm the difficulty of harmonizing cut-off values among different studies (even when identical ELISA kits are used) and highlight the importance of the control group structure. When a wide range of disorders with a higher burden of neuroaxonal damage are included as controls, the cut-off values may vary substantially, given that elevation of Nf levels is independent of causal pathways.

In contrast, the diagnostic performance of p-NfH in serum was significantly lower than that of its CSF counterpart, as well as that of NfL in CSF, and was overall poor. The main rationale for using blood-based Nf measurement was its better accessibility, e.g. in the setting of an outpatient clinic. CSF, however, could more directly reflect CNS pathophysiology. Therefore, we examined how well serum measurements could serve as a proxy for CSF measurements. We used a commercially available CE-marked ELISA kit for p-NfH assessment in serum, as this would represent a widely accessible solution for routine laboratory testing outside of an academic setting. Furthermore, promising results from another research group [19] using this type of ELISA kit prompted us to replicate their results. Regarding NfL measurement in serum, the commercially available ELISA kit has been deemed unsuitable for the analysis of blood samples due to its limited sensitivity combined with significantly lower NfL levels in blood compared to CSF. Meanwhile, the ultrasensitive single-molecule array (SiMoA) digital immunoassay is about 100-fold more sensitive than ELISA for the quantification of NfL. Although SiMoA has been proposed as a potential gold standard for blood NfL quantitation, its high cost and research-use-only designation limit its accessibility. Owing to limited funding, we therefore restricted blood-based quantitation to p-NfH in our study. Our data show significantly lower diagnostic performance of p-NfH in serum as compared to similarly designed studies [19, 27]. Nevertheless, in a previous, differently designed study, which combined cross-sectional and longitudinal measurement of p-NfH in plasma, a broad overlap of values between MND and controls was observed [28]. The authors attributed this finding to a documented high variability of p-NfH levels over time among MND patients, depending on the actual stage of disease and progression rate.
The authors argued that inter-individual differences in antibody response to p-NfH, as well as antibody-independent aggregation (matrix effect) of p-NfH, could be responsible for the variation in p-NfH blood levels among MND patients. Consequently, they deemed the cross-sectional measurement of p-NfH levels in plasma unreliable. Furthermore, our data are in line with those of another study that compared the performance of three analytical approaches in serum (namely, ELISA and two different types of single-molecule array (SiMoA) kits) [29]. This comparative study found that SiMoA was superior to ELISA regarding analytical sensitivity in serum. Given these and our data, it can be speculated that p-NfH measured with the currently available second-generation ELISA kits has clearly limited value as a diagnostic blood marker.

Our secondary goal was to determine the clinical parameters in MND influencing levels of Nfs. While we found no significant effect of age or sex on NfL or p-NfH levels in CSF or serum among MND patients, such an effect was observed among patients with other neurological diagnoses. Both findings are in accordance with a meta-analysis of studies on NfL in CSF [12], as well as with other studies on p-NfH [15, 20]. We found two significant factors affecting the levels of NfL and p-NfH in CSF in MND patients: (1) clinical phenotype (lower levels in LMN-isolated disease, a model of the limited extent of involvement) and (2) disease progression rate. Previous studies also observed a significant effect of clinical phenotype, with lower CSF levels of NfL in isolated LMN phenotypes [21] and of both Nfs in patients with a limited extent of simultaneous UMN and LMN involvement [18]. Although, in light of our data, we could speculate about a correlation between CSF NfL levels and the burden of corticospinal tract involvement, a different design including MRI markers (fractional anisotropy, radial diffusivity) would be needed to add to existing evidence [30, 31], and such an assessment was beyond the scope of our study. The primary purpose of including the MND-LMN subgroup in our design was to represent a real-life clinical setting, where missing signs of UMN involvement in the early diagnostic phase broaden the spectrum of differential diagnoses. In these cases, a reliable diagnostic marker would have great added value. Furthermore, this group provided evidence for the limited reliability of phenotyping patients in the early phases of MND, given that half of its patients manifested UMN signs later on and fulfilled the criteria of classic ALS, while the other half fulfilled the PMA criteria (as a true restricted disease phenotype). Under such conditions, a difference in levels of Nfs between these patients would have been of great significance. 
Unfortunately, we did not observe any significant difference in this respect. A plausible explanation is the very similar progression rate that we observed in these two subcohorts.

The ability of both CSF Nfs to differentiate MND patients with isolated LMN involvement from disease controls was reduced compared to their respective ability to differentiate other MND patients. This effect was more prominent in the case of CSF NfL, reaching a statistically significant difference. For CSF p-NfH, although not statistically significant, the difference was also substantial. The significant association of the MND-LMN group with lower levels of both CSF Nfs (compared to the rest of the MND patients) resulted in a higher overlap of values between the MND-LMN and DC groups. Although the diagnostic performance of both CSF Nfs remained good from the statistical perspective, further in-depth analysis revealed a problematic point from the clinical perspective. Our cut-off values for the discrimination of MND-LMN from DC (as well as for discrimination of all MND from DC only) were similar to those in other published data [18,19,20]. Looking in detail at the false negative MND-LMN patients after our cut-offs were applied, all of them (5/15, 33.3% with the p-NfH cut-off; 6/15, 40% with the NfL cut-off) were simultaneously slow progressors, a sub-cohort with significantly lower levels of both CSF Nfs even within the MND-LMN group. In these patients, the correct diagnosis had to be established by means of a rather extensive diagnostic work-up, involving electrophysiology (exclusion of a motor conduction block, confirmation of a denervation–reinnervation syndrome in multiple myotomes), laboratory diagnostics (e.g. antiganglioside antibodies), molecular genetic testing (e.g. CAG expansion in the AR gene, SMN1 deletion, mutation in the HEXA gene, etc.), and clinical monitoring for a confirmatory disease progression pattern. Furthermore, as a substantial proportion of the MND-LMN group (46.6%) simultaneously displayed a confounding radiological finding, we analysed the discriminative ability of CSF Nfs between MND-LMN and both control groups combined (Supplementary Table 4d). 
In this analysis, most prominently, the positive predictive values at any of the respective cut-off values for both CSF Nfs did not even reach 50%, a major limitation for the clinical utility of these biomarkers. After the cut-off values established from the entire study population were applied, 6 (40%) and 8 (53.3%) MND-LMN patients had levels below the cut-off value for CSF NfL and p-NfH, respectively. Again, all these false negative patients were simultaneously slow progressors, and four of them had a confounding radiological finding. Thus, for the discrimination of LMN-predominant MND patients with a slow disease progression rate from disease mimics (and, among them, particularly from compressive radiculopathies), quantification of Nfs appears to be rather unsuitable. For the discrimination of all MND patients with suspicious radiological findings (MND-C) from patients with only compressive aetiology, the diagnostic performance of CSF p-NfH was fair, but in the case of p-NfH in serum and CSF NfL, it was poor. Significantly higher levels of both Nfs in myelo-/radiculopathies (compared to the DC group) again led to a broader overlap of values with MND patients. Consequently, the applied cut-off values for both CSF Nfs resulted in a high false positive rate among myelo-/radiculopathy patients. Looking at the clinical implications of these data in further detail, 70% of the surgically treated patients from the myelo-/radiculopathy group (all of whom subsequently improved after the treatment) tested above the cut-off levels for both Nfs. Labelled as false positives by the test, these patients would be at risk of having a therapeutic benefit withheld. On the other hand, there were still four (22.2%) MND patients with levels of both CSF Nfs below their respective cut-offs. Two of these false negative MND patients underwent surgical treatment (out of five surgically treated MND patients altogether) before a referral to our centre. 
The correct diagnosis in all the patients from the MND-C group was established by means of a careful clinico-radiological correlation (absence of radicular pain and sensory deficit, signs of a lower or upper motor neuron lesion in regions unaffected by the documented compression) and/or electrophysiological examination (proof of denervation–reinnervation syndrome in regions unaffected by the documented compression). Given the substantially higher prevalence of compressive myelo-/radiculopathies than MND in the general population, and the risk of withholding or delaying a potential therapeutic benefit through misdiagnosis, these data highlight the limited utility of Nfs as diagnostic markers in cases that are clinically difficult to distinguish.
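
The dependence of the positive predictive value (PPV) on prevalence, which underlies the argument above, can be illustrated with a minimal sketch based on Bayes' theorem. The function name and all numerical values below are hypothetical and chosen for illustration only; they are not results from this study.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    # Expected fraction of the tested population that is diseased and tests positive.
    true_positives = sensitivity * prevalence
    # Expected fraction that is not diseased but still tests positive.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics (assumptions, not study data).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Compare a setting where the target disease is common among tested patients
# with one where it is rare relative to its mimics.
for prevalence in (0.50, 0.10):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={ppv:.2f}")
```

With these assumed values, the same test yields a PPV of about 0.90 when half of the tested patients have the disease, but only about 0.50 when one in ten does. This is why a cut-off that performs well statistically may still be of limited help when the mimicking condition is far more prevalent.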

The main limitation of our study was its retrospective design, leading to a bias towards MND patients who underwent CSF examination at our department. These patients represented only about 40% of all patients examined at our centre during the time frame. Other MND patients were only examined at the outpatient clinic, as CSF examination had already been performed elsewhere (and was therefore not repeated). This limitation led to a reduced number of recruited patients when compared to similar studies. The effect of insufficient sample size was most pronounced in the comparison between subgroups of interest (MND-C, MND-LMN). A future study, preferably with a multi-centre design allowing for adequate enrichment in MND patients with concomitant radiological findings or a restricted phenotype, is therefore warranted to either confirm or contradict our preliminary findings.

In conclusion, from the perspective of a general MND population, both Nfs in CSF performed well as diagnostic markers. Despite this, caution must be applied, as higher levels of Nfs are not specific to MND, reflecting rather the burden of ongoing non-specific neuroaxonal damage. Nevertheless, when the clinical picture is highly suggestive of (and respecting exclusion criteria for) MND and is further supported by electrophysiology and not contradicted by radiological findings, Nf levels above their respective cut-off values could strongly support the definitive diagnosis. Both Nfs in CSF displayed comparable diagnostic performance. On the other hand, the significant covariates that render the clinical picture less specific to MND (LMN-restricted phenotype, slow disease progression) are simultaneously associated with lower levels of Nfs, which limits the utility of Nfs as a reliable diagnostic marker for LMN-restricted disease. Furthermore, due to the high overlap of values between MND patients and patients with compressive radiculopathies and myelopathies (mainly those with acute progression of deficits, who may benefit from surgical treatment), the diagnostic performance of Nfs is insufficient for reliable discrimination in this situation. Finally, although measurement of p-NfH in sera using ELISA would be a feasible alternative for an outpatient clinic, a broad overlap of values between MND and other neurological disorders, and its overall low sensitivity for MND, render this test unsuitable for screening. In the era of fourth-generation assays (SIMoA) with superior analytical sensitivity, the mere better accessibility of the blood-based ELISA method does not seem to outweigh its low diagnostic performance.