Introduction

Chronic low back pain (CLBP) is a multidimensional, qualitative phenomenon with three primary dimensions: sensory-discriminative, affective-motivational, and cognitive-evaluative [1]. Nociceptive and nociplastic pain mechanisms (e.g., peripheral and central sensitization) contribute to the sensory-discriminative component and are typically associated with one of five peripheral sources: discogenic, neuropathic, vertebrogenic, facetogenic, and paraspinal/myofascial [2]. While clinical imaging is often used to localize anatomic features as nociceptive/neuropathic sources, degenerative changes are common and usually non-specific for CLBP, including for discogenic low back pain (DLBP) [3]. Alternatively, provocative injections to the disc or facets can be more reliable complements to physical examination, but these are invasive and subjective, and do not capture mechanistic information that may inform optimal treatment selection: the “why” in addition to the “where.”

To address this need, we have previously developed and validated single-voxel magnetic resonance spectroscopy (MRS) as a non-invasive method to quantify chemical signatures of pain within individual lumbar discs. This approach quantifies chemical features of degenerating extracellular matrix [4] as well as metabolites of disc cell function related to pain, such as lactate [5] and propionate [6]. The clinical utility of MRS has been established by validation against a reference diagnostic standard, low-pressure provocative discography (PD), and correlation with short-term clinical outcomes [7].

We previously reported clinical results using custom MRS data acquisition and post-processing protocols (NOCISCAN-LS®, Aclarion Inc.) showing that when MRS-positive discs were treated, surgical success (at 6 and 12 months) was 97%, versus 57% when the treated level was MRS-negative and 54% when an MRS-positive disc was left untreated. We now report a similar comparison of lumbar surgery outcomes in that same study, with more patients and longer postoperative follow-up (up to 24 months).

Methods

Lumbar disc MRS study design

A multi-center, observational, single-voxel MRS clinical development and accuracy study was conducted under IRB approval with patient informed consent. The enrolled subjects received MRI and PD as part of their standard care for DLBP. MRS acquisitions with custom scanner settings and protocols were performed on a Siemens 3 T Verio scanner at a single center from November 2011 to May 2019. DLBP surgical treatment outcomes at 6, 12, and 24 months were assessed in relation to pre-surgical MRS-based diagnoses of painful (P) versus non-painful (NP) discs based on the MRS criteria [7].

DLBP patient and lumbar disc population

Patient inclusion/exclusion criteria (Table 1) focused enrollment on patients receiving PD for suspected intra-discal pain versus other potential extra-discal pain sources not quantified by the MRS approach. Ninety-three patients were enrolled, examined by MRS, and then received lumbar surgery. Of these, n = 78 were included in the surgical outcomes evaluation. The 15 excluded patients comprised 4 with technical failures of the MRS acquisition and 14 with deviations from the inclusion/exclusion criteria (8 for spondylolisthesis and/or spondylolysis, 1 for scoliosis, 5 for leg pain > back pain), with 3 patients falling into both categories.
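The exclusion counts above reconcile with the 78 evaluated patients only if the 3 patients excluded "for both" reasons are counted within the 4 technical failures and the 14 criteria deviations. A minimal arithmetic check under that assumed reading (the variable names are illustrative):

```python
# Counts taken from the text; the overlap interpretation is an assumption.
enrolled = 93
technical_failures = 4     # MRS acquisition failures
criteria_deviations = 14   # 8 spondylolisthesis/spondylolysis + 1 scoliosis + 5 leg pain > back pain
both_reasons = 3           # assumed to be counted in both categories above

# Inclusion-exclusion: patients with both reasons are counted only once.
excluded = technical_failures + criteria_deviations - both_reasons
evaluated = enrolled - excluded

print(excluded, evaluated)  # 15 78
```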

Table 1 Pain patient inclusion/exclusion criteria for study

Disc levels were selected for MRS by physician preference, typically including discs receiving PD plus other levels (≥ 3 MRS levels recommended per patient). PD was performed using physician preferred techniques, with positive discogram (PD +) results requiring low-pressure provocation (< 50 psi), ≥ Grade III annular tear, and a negative control disc (PD-) [8]. PD was performed after the MRS exam, except for two cases where PD was performed at least six weeks prior to MRS. The NP group included the discogram negative (PD-) discs. The P group included the PD + discs, plus other non-PD discs that were physician-diagnosed as painful based on other clinical criteria (e.g., patient-reported symptoms, physical exam and neurologic deficits, X-ray, MRI, myelogram, diagnostic injections) in patients receiving PD at other levels.
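The three PD + requirements can be expressed as a simple predicate. This is a sketch only: the record type, its field names, and the explicit `pain_provoked` flag are illustrative assumptions, not part of the study protocol.

```python
from dataclasses import dataclass


@dataclass
class DiscogramResult:
    """Hypothetical record of one provocative discography injection."""
    pain_provoked: bool         # assumed flag: pain was reproduced on injection
    pressure_psi: float         # pressure at which pain was provoked
    annular_tear_grade: int     # annular tear grade (III is written as 3)
    has_negative_control: bool  # a PD- control disc exists in the same patient


def is_pd_positive(result: DiscogramResult) -> bool:
    """Apply the stated criteria: < 50 psi, >= Grade III tear, negative control."""
    return (
        result.pain_provoked
        and result.pressure_psi < 50
        and result.annular_tear_grade >= 3
        and result.has_negative_control
    )


print(is_pd_positive(DiscogramResult(True, 30.0, 3, True)))   # True
print(is_pd_positive(DiscogramResult(True, 60.0, 4, True)))   # False (high pressure)
```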

Lumbar disc MRS exam protocol

The custom MRS protocol included a Point RESolved Spectroscopy (PRESS) sequence with CHEmical Shift Selective (CHESS) water suppression, as previously described [7]. Shortened T1-weighted and T2-weighted imaging in the sagittal, coronal, and axial planes enabled the scanner operator to prescribe the single voxel (SVS) to encompass the disc nucleus and exclude the vertebral body (Fig. 1).

Fig. 1

Three-plane voxel prescription in center slices of three-plane T2-weighted images for the L4L5 disc (left: mid-sagittal; center: coronal; right: axial). Optimum voxel prescription includes ≥ 1 cc volume and ≤ 4 mm height

The MRS data were collected as part of a secondary MRI session after routine clinical MRI was already performed as indicated by standard-of-care. MRS was not utilized in the surgical decision-making process.

Custom MRS data post-processing—spectral quality, quantitation, classification

Disc MRS data post-processing was performed using an investigational NOCISCAN-LS® software post-processor, as previously described; this software is now commercially available (Aclarion, Inc.; Broomfield, CO).

Spectral feature quantification including peak value, SNR, and area-under-the-curve (AUC) was performed for:

  • Carbohydrate/collagen (CA) and proteoglycan (PG) regions as structural integrity markers expected to decrease with disc degeneration, and

  • Alanine (AL), lactic acid (LA), and propionic acid (PA) regions as acidic pain markers (e.g., from hypoxia, inflammation, and/or Propionibacterium acnes infection) expected to increase with discogenic pain (Fig. 2).

Fig. 2

Post-processed absorption spectrum (y-axis) versus chemical shift (x-axis) for L4L5 disc MRS acquisition, indicating peak regions associated with degenerative pain biomarkers quantified by further post-processing

As previously described [7], the quantified spectral features were used to calculate a NOCISCORE Total for each disc (0–10 scale), which was then normalized relative to the highest NOCISCORE Total in the patient to generate the NOCISCORE Normalized (0–1 scale; Fig. 3). Each disc was classified as positive or negative using thresholds on both its NOCISCORE Total and its NOCISCORE Normalized values. If either score was below its defined low threshold, the disc was classified negative (NOCI-), while if both were above their defined high thresholds, the disc was classified positive (NOCI +) (Table 2 and Fig. 4). These thresholds were previously defined via statistical training against PD data [7].
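The normalization and two-threshold classification rule can be sketched as follows. The threshold values here are placeholders chosen for illustration (the actual values are given in Table 2 and are not reproduced in this text), and labeling every remaining disc as NOCImild is an assumption based on how that label is used later in the Methods.

```python
# Placeholder thresholds for illustration only; NOT the Table 2 values.
LOW_TOTAL, HIGH_TOTAL = 3.0, 6.0   # on the 0-10 NOCISCORE Total scale
LOW_NORM, HIGH_NORM = 0.4, 0.8     # on the 0-1 NOCISCORE Normalized scale


def normalize(totals):
    """Divide each disc's NOCISCORE Total by the highest Total in the patient."""
    top = max(totals.values())
    if top <= 0:
        return {level: 0.0 for level in totals}
    return {level: total / top for level, total in totals.items()}


def classify(total, normalized):
    """Either score below its low threshold -> NOCI-; both above high -> NOCI+."""
    if total < LOW_TOTAL or normalized < LOW_NORM:
        return "NOCI-"
    if total > HIGH_TOTAL and normalized > HIGH_NORM:
        return "NOCI+"
    return "NOCImild"  # assumed label for the intermediate range


# Hypothetical patient with three scanned discs.
totals = {"L3-L4": 2.0, "L4-L5": 8.0, "L5-S1": 5.0}
normalized = normalize(totals)
classes = {level: classify(totals[level], normalized[level]) for level in totals}
print(classes)
```

Because the normalized score is relative to the patient's own highest-scoring disc, a disc with a moderate Total can still be classified negative when another disc in the same patient scores much higher.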

Fig. 3

Patient example. Left: mid-sagittal T2-weighted MRI image of lumbar spine. Right: six calculated biomarker ratios after normalizing to maximum thresholds and applying differential weighting, and combined Total and Normalized NOCISCOREs, for each disc tested

Table 2 Thresholds and factors used in processing MRS data for classification and analysis, including: Chemical ratio adjustment factors (saturation normalization thresholds & weighting factors) for calculating disc-specific NOCISCORES; and NOCISCORE range thresholds for classifying discs as NOCI + or NOCI-
Fig. 4

Patient Example. Left: mid-sagittal T2-weighted MRI image of lumbar spine. Center: NOCISCORE Total values for disc levels tested. Right: associated post-processed spectra for disc levels tested

Averaged PG spectral measurements (peak, AUC, SNR) for each disc were also normalized to the highest value in that patient to generate a relative “SI-SCORE” as an indicator of structural integrity (scale = 0–1).
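One possible reading of the SI-SCORE calculation is sketched below. It assumes that each PG measurement (peak, AUC, SNR) is first divided by the highest value of that measurement among the patient's discs and that the three ratios are then averaged; the text does not specify the order of averaging and normalization, so this ordering, the function name, and the example values are all assumptions.

```python
def si_scores(pg_features):
    """Relative structural-integrity score per disc on a 0-1 scale.

    pg_features maps disc level -> {'peak': ..., 'auc': ..., 'snr': ...}
    for the proteoglycan (PG) spectral region.
    """
    keys = ("peak", "auc", "snr")
    # Highest value of each measurement across the patient's discs.
    maxima = {k: max(d[k] for d in pg_features.values()) for k in keys}
    scores = {}
    for level, d in pg_features.items():
        ratios = [d[k] / maxima[k] if maxima[k] > 0 else 0.0 for k in keys]
        scores[level] = sum(ratios) / len(ratios)
    return scores


# Hypothetical values for two discs in one patient.
example = {
    "L3-L4": {"peak": 10.0, "auc": 40.0, "snr": 20.0},
    "L4-L5": {"peak": 5.0, "auc": 20.0, "snr": 10.0},
}
print(si_scores(example))
```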

Patients/surgical outcomes

All study patients included in this analysis received lumbar spine surgery by a single surgeon (co-author, MG). All surgeries were performed at P disc levels (determined by PD plus other clinical criteria) independent of MRS scores. For the 78 patients included for surgical outcomes analysis, all were included in the 6 month outcomes, 71 at 12 months, and 61 at 24 months (Fig. 5).

Fig. 5

Nociscanned surgical patient distribution with postsurgical follow-up data at (a) 6 months, (b) 12 months, and (c) 24 months

The majority of the surgeries (n = 42) were fusions, while n = 19 were total disc replacement (TDR), and n = 14 were a “hybrid” combination of TDR and fusion. Patients were divided into two primary groups for outcome analysis. Group C included patients whose surgery, at one or more disc levels, was concordant with the MRS-based diagnosis, i.e., performed only at discs that were NOCI + or NOCImild (for this purpose, limited to discs with the highest NOCISCORE that were classified as NOCImild in patients without a NOCI + disc). Group D included patients whose surgery, at one or more disc levels, was discordant with the MRS-based diagnosis, i.e., surgery at only a NOCI- disc (NOCI-) or surgery that did not treat a NOCI + disc (NOCI + adj).

Table 3 shows the clinical characteristics of the overall group and the two outcomes comparison Groups C and D. While there was a statistically significant difference in age between Groups, with the mean age of Group D subjects about 5.4 years older, this was not considered sufficiently relevant to bias outcomes, and other noted characteristics were similar.

Table 3 Pre-operative baseline characteristics of surgery patients included for postsurgical outcomes evaluation

A third Group C + was also identified for 2-level surgeries that included both a NOCI + disc and a NOCI- disc (or “NOCI ± ”), for which surgical outcomes were evaluated separately and apart from the comparisons made between the other two Groups C and D where the concordance or discordance between surgery and MRS-based diagnosis was more distinct.
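The group definitions can be summarized as a decision rule. The sketch below is one interpretation of the wording above, not the study's actual allocation code: the function name and data layout are hypothetical, all treated levels are assumed to have been scanned, and any combination the text does not explicitly cover is returned as "unassigned".

```python
def assign_group(classes, totals, treated):
    """Assign a patient to Group C, D, or C+ from MRS classes and treated levels.

    classes: disc level -> "NOCI+", "NOCImild", or "NOCI-"
    totals:  disc level -> NOCISCORE Total
    treated: set of surgically treated disc levels (assumed to be scanned)
    """
    positives = {lv for lv, c in classes.items() if c == "NOCI+"}
    treated_classes = {classes[lv] for lv in treated}

    # A NOCI+ disc left untreated is discordant ("NOCI+adj").
    if positives - treated:
        return "D"
    if positives:
        if treated == positives:
            return "C"       # surgery only at NOCI+ discs
        if "NOCI-" in treated_classes:
            return "C+"      # surgery included both NOCI+ and NOCI- discs
        return "unassigned"
    # No NOCI+ disc in this patient.
    if treated_classes == {"NOCI-"}:
        return "D"           # surgery only at NOCI- disc(s)
    top = max(totals, key=totals.get)
    if treated == {top} and classes[top] == "NOCImild":
        return "C"           # highest-scoring NOCImild disc treated
    return "unassigned"


# Hypothetical patient with a NOCI+ disc at L4-L5.
classes = {"L3-L4": "NOCI-", "L4-L5": "NOCI+", "L5-S1": "NOCImild"}
totals = {"L3-L4": 2.0, "L4-L5": 8.0, "L5-S1": 5.0}
print(assign_group(classes, totals, {"L4-L5"}))            # C
print(assign_group(classes, totals, {"L5-S1"}))            # D
print(assign_group(classes, totals, {"L3-L4", "L4-L5"}))   # C+
```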

Our study protocol included two validated instruments for patient-reported outcomes: Oswestry Disability Index (ODI; 0–100 scale) and Visual Analog Scale (VAS; 1–10 scale) for back pain. We chose a 15-point ODI improvement as the indicator of surgical success because it is the FDA benchmark, it is more stringent than the calculated minimum clinically important difference (12.5 points [9]), and it is accepted throughout the spine research literature [10,11,12]. As a secondary success metric, we also used a 2-point improvement in VAS for back pain.
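The two success criteria reduce to threshold tests on the pre- to postoperative change. A minimal sketch, assuming "improvement" means a decrease of at least the stated number of points (the function names and example scores are illustrative):

```python
ODI_SUCCESS_POINTS = 15  # primary criterion: ODI improvement in points
VAS_SUCCESS_POINTS = 2   # secondary criterion: VAS back pain improvement in points


def odi_success(pre_odi, post_odi):
    """True if ODI decreased by at least 15 points from baseline."""
    return (pre_odi - post_odi) >= ODI_SUCCESS_POINTS


def vas_success(pre_vas, post_vas):
    """True if VAS back pain decreased by at least 2 points from baseline."""
    return (pre_vas - post_vas) >= VAS_SUCCESS_POINTS


def percent_change(pre, post):
    """Signed percent change from baseline (negative means improvement)."""
    return 100.0 * (post - pre) / pre


# Hypothetical patient: ODI 56 -> 20, VAS 7 -> 2.
print(odi_success(56, 20), vas_success(7, 2), percent_change(50, 20))
```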

Data analysis

Statistical analyses for the MRS-based NOCISCORE correlations to P and NP data were performed using the R programming language (R Foundation for Statistical Computing, Vienna, Austria).

Association between MRS results and surgical success was evaluated using ODI improvement relative to correspondence between the treated level and the MRS-based classifications for the patient’s discs. We calculated the proportions of Group C and Group D patients reporting at least a 15-point ODI improvement (“ODI Success”) and compared success rates using a Pearson chi-square test (JMP Pro, V16). We also compared the distributions of absolute and percent changes in ODI and VAS between Groups C and D, with p-values calculated using an independent-groups, two-tailed Student’s t-test. Patient allocation into these groups was based on an objective MRS algorithm applied to patient scans (the vast majority of which were acquired before discography), and patients were enrolled at diagnostic baseline, prior to any surgical planning; there was therefore no opportunity for assignment bias in the surgical outcomes analysis. These comparisons were also made for different surgical treatment types (fusion, total disc replacement (TDR), “hybrid” fusion + TDR, and a limited few instances of “other” decompression surgeries) and for both 1-level and 2-level surgeries.
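For readers who wish to reproduce this type of comparison, a Pearson chi-square test on a 2 × 2 table of success and failure counts can be computed directly. The study analysis was run in JMP Pro; the standard-library sketch below is only an illustration, it assumes no continuity correction and non-zero row and column totals, and the counts in the example are invented rather than study data.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are groups; columns are success and failure counts.
    Assumes every row and column total is greater than zero.
    Returns (chi-square statistic, two-sided p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only: 18/20 successes in one group, 10/20 in the other.
chi2, p = pearson_chi2_2x2(18, 2, 10, 10)
print(round(chi2, 3), p < 0.05)
```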

Results

Relationship of MRS-based NOCISCORE to surgical treatment outcomes

The pre-operative ODI and VAS scores were not statistically different for the Group C and Group D patients (Table 4 and Fig. 6).

Table 4 ODI and VAS distributions for Group C & D patients at 6-, 12-, 24-month surgical follow-up
Fig. 6

Postoperative Oswestry Disability Index (ODI) and Visual Analog Scale (VAS) pain scores were significantly better when surgery was performed at all NOCI + discs (Group C) versus when one or more NOCI + discs were not treated (Group D): (a) average ODI and (b) average VAS for patients at baseline, 6 months, 12 months, and 24 months after surgical treatment. * statistically different at p < 0.05; ** at p < 0.01

However, Group C had significantly lower ODI versus Group D patients at each of the three end-points, with mean scores that were lower by as much as 13 points at 12 months and remained 9 points lower at 24 months. There was also a significantly greater reduction in ODI from pre-operative baseline for Group C versus Group D at [absolute # change in score (p), % change in score (p)]: 6 months: − 35 versus − 23 (p < 0.05), − 61% versus − 39% (p < 0.05); 12 months: − 39 versus − 22 (p < 0.01), − 69% versus − 39% (p < 0.01); 24 months: − 38 versus − 26 (p < 0.05), − 66% versus − 48% (p < 0.05). Group C surgeries also had lower VAS than Group D at 6 months: 1.7 versus 2.9 (p < 0.056); and at 12 months: 1.8 versus 3.7 (p < 0.01).

Postsurgical success rates were also higher for Group C than for Group D at every follow-up time point, by 22 to 35 percentage points, with the difference reaching statistical significance at 6 and 12 months: 6 months (88 vs. 62%; p = 0.01), 12 months (91 vs. 56%; p < 0.001), and 24 months (85 vs. 63%; p = 0.07) (Table 5 and Fig. 7). Differences between Group C surgeries that treated all NOCI + discs (NOCI + sub-Group) versus Group D surgeries that left a NOCI + disc untreated (“NOCI + NoSx” sub-Group) were: 6 months (89% vs. 64%; p < 0.05); 12 months (94% vs. 54%; p < 0.001); 24 months (87% vs. 70%; p = 0.2). Those NOCI + sub-Group success rates were also higher than those of the Group D surgeries that only treated NOCI- discs in patients without an adjacent NOCI + disc (NOCI- sub-Group) at 6 months (57%; p = 0.03); 12 months (60%; p = 0.02); 24 months (50%; p = 0.03).

Table 5 6-, 12-, 24-month % surgical success rates (≥ 15pt ODI reduction) for: (a) ALL patients, and Groups C, D, and C + ; and (b) respective sub-groups for Groups C and D
Fig. 7

Percent (%) surgical success rates, per ≥ 15-point ODI reduction, for Groups C and D at 6-, 12-, 24-month postsurgical follow-up

Longitudinal durability of outcomes between follow-up time points

Among the 71 patients reaching the 12-month follow-up, only 2 (2.8%) improved from failure to success, and only 3 (4.2%) reverted from success to failure, relative to their 6-month data, with 66 (93%) of the patients remaining relatively stable in their longitudinal outcomes from 6- to 12-month follow-up. Among the 67 patients reaching the 24-month follow-up time point, only 4 (6%) improved from failure to success, and only 3 (4.5%) reverted from success to failure, relative to their 12-month outcomes, while 5 improved and 6 reverted versus their earlier 6-month outcomes, with 56/67 (84%) of the patients remaining relatively stable in their outcomes from 6- to 24-months.
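The stability figures above come from comparing each patient's success status at two follow-up time points. A sketch of that tabulation is shown below, followed by a check of the reported 6- to 12-month percentages; the function name, data layout, and example patient identifiers are hypothetical.

```python
from collections import Counter


def transition_counts(earlier, later):
    """Count outcome transitions for patients with data at both time points.

    earlier, later: dicts mapping patient id -> True (success) or False (failure).
    """
    counts = Counter()
    for patient in earlier.keys() & later.keys():
        before, after = earlier[patient], later[patient]
        if before == after:
            counts["stable"] += 1
        elif after:
            counts["improved"] += 1   # failure -> success
        else:
            counts["reverted"] += 1   # success -> failure
    return counts


# Hypothetical example: p4 has no later follow-up and is ignored.
month_6 = {"p1": True, "p2": False, "p3": True, "p4": True}
month_12 = {"p1": True, "p2": True, "p3": False}
print(transition_counts(month_6, month_12))

# Check of the reported 6- to 12-month figures (n = 71).
improved, reverted, stable = 2, 3, 66
total = improved + reverted + stable
print(total, round(100 * improved / total, 1),
      round(100 * reverted / total, 1), round(100 * stable / total))
```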

1-level versus 2-level surgeries

One-level versus 2-level surgeries had similar outcomes between patient sub-groups (Table 5). Success rates for 1-level surgeries were significantly higher for Group C versus Group D [6, 12, 24 months]: 90% versus 61% (p < 0.01), 92% versus 56% (p = 0.002), and 84% versus 62% (p = 0.1). These success rates were even higher, and the difference between groups greater, for Group C surgeries treating only NOCI + discs: 93% versus 61% (p = 0.01), 96% versus 56% (p = 0.001), 88% versus 62% (p = 0.07). Success rates for 2-level surgeries were also higher for Group C versus Group D, despite a relatively smaller sample size [6, 12, 24 months]: 82% versus 67% (p = 0.3); 89% versus 50% (p = 0.2); 86% versus 67% (p = 0.5).

Discussion

The current data continue to reinforce that MRS can guide disc level selection when interventional procedures are indicated for DLBP patients. Rates of clinical success were significantly higher, by 26 and 35 percentage points at 6 and 12 months, respectively, when the surgical levels matched the pre-operative MRS classification as painful (NOCI +). Group C outcomes also remained higher than Group D outcomes at 24 months, though the difference did not reach statistical significance, likely because fewer patients reported at that longer-term follow-up. Longitudinal data also demonstrate that Group C patients with surgery at all NOCI + discs had a significantly higher success rate at 6 and 12 months (89% and 94%) versus surgeries in Group D patients that left a NOCI + disc untreated (64% and 54%). While all Group C sub-Group success rates were high, the highest success rates were for NOCI + surgeries in the following settings: (a) 100% success at all time points for 2-level Fusion + TDR (33–50% higher vs. Group D); and (b) 96% success at 12 months for Fusion (22/23, 42% higher vs. Group D) and for 1-level surgeries (25/26, 40% higher than Group D), with only 1 patient failing success criteria in each sub-group.

These results demonstrate the added value of the MRS-based NOCISCOREs for diagnosis and treatment planning. These data compare very favorably to meta-analyses summarizing randomized clinical trials of surgical care for DLBP, where typical surgical success rates range from 41 to 57% (which is commensurate with the results from the Group D discordant surgeries that did not follow the MRS-based NOCISCORE guidance in our data) [13].

While the treatment in this current study was surgery with either fusion or disc arthroplasty, MRS diagnostic information may similarly improve success of other interventions meant to treat discogenic pain. In particular, biologic therapies to regenerate the disc or suppress inflammatory processes also require specificity for level selection during pre-operative planning. Indeed, MRS may be even more critical in this case because biologically-based therapies target discs early in the degenerative cascade, where painful levels may differ only subtly from adjacent asymptomatic discs on routine clinical imaging. Further, because biologically treated discs are not surgically removed or reinforced with implants, MRS can be used to track the activity of the therapy over time. Of note, because MRS has the potential to discriminate features of anaerobic bacterial activity [6], and because of the growing awareness of subclinical disc infection as an important discogenic pain mechanism [14], MRS may be uniquely valuable to distinguish discs that should be treated using antibiotic versus regenerative therapies.

Clinical outcomes are influenced both by the accuracy of the diagnosis and the potency of the intervention. While we have previously validated MRS using PD as a reference standard, a potential limitation of this study is the suitability, execution, and durability of the surgical treatment. Because the surgeries were performed by a single surgeon, variation in treatment execution is reduced as a confounding factor. Additionally, while our Group D patients were slightly older (44.2 versus 38.8; p = 0.02), there were no statistically significant differences in other potential confounding factors such as race (p = 0.60), smoking (p = 0.29), BMI (p = 0.59), and workers’ compensation status (p = 0.59) (Table 3) [15,16,17,18]. Further, the 6- and 12-month time periods are sufficient for evaluating the accuracy of the diagnosis, whereas outcomes at longer times are more likely related to the durability of the chosen surgical treatment (and potential evolution of new pathologies, such as adjacent segment disease). As such, our data continue to support the notion that proper level selection in fusion surgery is important and that limitations in this regard may explain discordance between historically improving rates of technical success (e.g., radiographically-confirmed fusion) and lack of proportionate gains in patient outcomes [19].

Another limitation of this study is that successful MRS execution may not be feasible at some spinal levels. As discs degenerate, they dehydrate and lose height, potentially degrading the quality of the MRS signal. Currently, MRS voxel heights are limited to a minimum of 3 mm, which may prevent successful data acquisition from severely degenerated discs. This aspect, however, does not meaningfully limit clinical utility in most situations since MRS information is combined with other clinical data when making treatment choices. For example, traditional clinical examination and other radiographic features may show that severely degenerated discs are candidates for fusion surgery. Alternatively, severely degenerated discs are not suited for biologic therapies, and may be excluded using routine clinical imaging. The unique value that MRS provides is to reduce ambiguity when making treatment decisions for discs that are not severely degenerated and where routine clinical imaging provides insufficient actionable information.

Despite these limitations, we show that MRS-derived NOCISCAN-LS data, which distinguish PD + from PD- discs (especially in non-herniated axial DLBP patients), have the potential to significantly improve surgical outcomes. Six- to 24-month surgical success rates were very high for patients treated at all NOCI + discs (and even NOCImild discs in patients without a NOCI + disc), but low for patients treated at only NOCI- discs or with a NOCI + disc left untreated. These data motivate the use of MRS as a valuable new approach to help doctors, in combination with other available clinical information, better diagnose and evaluate treatment options toward more successful outcomes.

Future studies are being planned to investigate MRS applications in the cervical and thoracic spines and develop algorithms that optimally combine MRS with other clinical data that capture the multidimensional aspects of pain within individual DLBP patients. We also anticipate future use of this MRS tool for other purposes, such as screening for infection, tracking success of biologic therapies meant for disc repair, and studying biochemical mechanisms of action for such new therapies.