Introduction

Recently, attention has been placed on the safety, diagnostic value, and efficacy of minimally invasive pain procedures for the evaluation and treatment of spinal pain. Although discography has been utilized for over 50 years, it continues to be scrutinized and is surrounded by controversy. Originally, Lindblom [1, 2] introduced discography in the 1940s as a modality to assist in the visualization of a herniated nucleus pulposus (HNP). During that period, diagnostic tools for evaluation of the lumbar spine were limited. Although rapid advances in radiology have occurred, the ability to diagnose discogenic low back pain caused by internal disc disruption (IDD) is still inadequate. Therefore, provocation discography continues to be utilized to identify symptomatic discs through the interpretation of the morphological characteristics of the disc and the addition of patient-reported symptoms. Current guidelines and systematic reviews have continued the debate by offering opposing conclusions [3, 4, 5••]. This article will provide an updated review with specific emphasis on both longstanding and new areas of debate. Four major areas will be covered: 1) false-positive rates, 2) technical parameters, 3) clinical utility, and 4) risk of procedural-related disc damage.

The Role of Interventional Diagnostic Procedures

Chronic low back pain, with an annual prevalence ranging from 15% to 45%, is associated with significant health and socioeconomic costs to the individuals suffering from the condition and to the health care system. From 1997 to 2006, the national expenditures for spine-related problems increased 82% (average of 7% per year) [6]. Although a large percentage of the increase was related to inpatient, prescription, and emergency services, outpatient services including interventional therapies contributed to the drastic increase in health care costs. Even though expenditures for spine treatments have increased, measures of self-reported mental and physical health and functional improvements have declined [6].

In about 90% of chronic low back pain cases, a specific pain generator cannot be identified with any certainty [7]. Diagnostic spinal procedures (eg, discography, medial branch blocks, and selective nerve root blocks) have been proposed as a way to guide decision-making for both interventional pain procedures and surgical operations, with the goal of improving outcomes through enhanced patient selection [8, 9].

Diagnostic spinal injections are often utilized because of the shortcomings of radiographic imaging. Radiographic imaging can often identify morphological changes in low back structures; however, it cannot conclusively identify them as pain generators. Boden et al. [10] performed and interpreted lumbar magnetic resonance imaging (MRI) scans on 67 asymptomatic individuals. In those individuals younger than 60 years, 20% had an HNP. In the study population older than 60 years, abnormal findings existed in about 57% of the individuals. Jensen et al. [11] demonstrated that only 36% of asymptomatic patients had normal lumbar discs at all levels. A bulge at one or more levels was present in 52% of the patients, and 19% had annular tears. In addition, 38% had an abnormality of more than one intervertebral disc. The high-intensity zone (HIZ) detected on T2 MRI images has been advocated as a surrogate marker for painful internal disc disruption (IDD). The HIZ sign’s clinical significance may be limited by poor sensitivity and a documented high prevalence rate of 25% in asymptomatic individuals with known risk factors for disc degeneration [12, 13]. In symptomatic patients, the sensitivity of the HIZ in identifying painful IDD has been estimated to be 81% [14]. Based on these studies, it can be concluded that MRI should not be used in isolation for decision making.

Before discussing the controversy surrounding discography, it is important to examine the difficulties with establishing the etiology of low back pain in a given individual and the utility of diagnostic spinal procedures for low back pain. The essential features of a diagnostic test are accuracy in making the correct diagnosis, safety, and reproducibility. Spinal diagnostic tests are plagued by multiple issues including the placebo response, nocebo effect, patient participation, and centralization of pain, which often lead to low levels of sensitivity and specificity [15]. North et al. [16] demonstrated that false-positive responses are common with diagnostic nerve blocks and suggested a limited role for uncontrolled local anesthetic blocks. Blocks distal to the anatomic source of pain and anesthetizing an uninjured structure may relieve pain for a temporary period of time [16, 17]. Marks [18] found that in 385 observations of 138 patients, no consistent segmental or sclerotomal pattern was found for lumbar facet joint–mediated pain. A prospective study involving mechanical stimulation of cervical nerve roots in patients with cervical radicular symptoms undergoing diagnostic selective nerve root blocks demonstrated a distinct difference between dynatomal (referred symptoms) and dermatomal maps. In 12% of C6 nerve root stimulations, the patient reported symptoms in all five fingers, and none of the cervical levels (C4–C8) were associated with the classic dermatomal distribution greater than 50% of the time [19]. Many diagnostic tests to evaluate nonspecific neck and low back pain lack a reference standard with external validation to assist in the determination of the sensitivity and specificity of the tests.

False-Positive Rates

Lumbar provocative discography provides information on both the morphological characteristics of the disc and the provoked pain response. The reliability associated with discography for evaluating morphological characteristics has been examined for both test-retest reliability and intra- and interrater reliability [20, 21]. When using the Adams Grading System (Fig. 1) for discograms, both inter- and intraobserver reliability levels are high (κ = 0.77–0.85) [20, 22]. Milette et al. [21] also reported high rates of interrater reliability for the detection of annular degeneration (κ = 0.67) and annular disruption (κ = 0.66) with discography.

Fig. 1 A type 5 Adams Classification L4–L5 disc demonstrating an annular tear (black arrow) with contrast escaping to the anterior epidural space (white arrow). See [20, 22] for Adams classification

Although discography appears to detect morphological changes accurately and consistently, morphological changes alone do not identify a tested disc as the pain generator; thus, a great emphasis is placed on pain provocation. One of the major concerns surrounding discography is the possibility of an unacceptably high false-positive rate in asymptomatic individuals. Holt [23] was the first to suggest an unacceptably high false-positive rate of 37% when he examined the results of discograms performed in 30 asymptomatic inmates. The validity of this study has been extensively questioned, with major criticisms including the radiographic contrast agent (diatrizoate) utilized, the study population, and the lack of consideration of patient response, pressure criteria, and control discs in the determination of a positive discogram [24, 25••, 26].

Carragee et al. [27–29] published additional studies on discography and reported high false-positive rates in specific patient populations. Through a series of clinical experiments, the major diagnostic criteria for discography were examined in individuals without discogenic pain having clinical and demographic features commonly seen in individuals with intractable back pain. In the first study, the accuracy of the concordant pain response and a patient’s ability to differentiate anatomical pain generators was questioned when 50% of the individuals with normal psychometric testing and no history of low back pain who had prior iliac crest bone graft harvesting experienced a concordant painful sensation with lumbar discography [27]. In the second study, three subgroups were examined, including pain-free, chronic cervical pain, and somatization disorder groups [28]. The pain-free group had a false-positive rate of about 10%, which was higher than the 0% false-positive rate reported by Walsh et al. [26]. The chronic cervical pain and somatization disorder groups had higher false-positive rates of 40% and 83%, respectively [28].

Discography has often been performed on individuals who have persistent pain after low back surgery. In a third study, Carragee et al. [29] also suggested that this population is at risk for high false-positive rates. In this study, 240 individuals who previously had limited lumbar discectomies underwent discography. In 40% of asymptomatic individuals with normal psychometric testing, significant pain occurred during injection. Based on these studies, caution is warranted in the interpretation of discography results in individuals with abnormal psychometric testing, a history of previous back surgery, and other nonrelated chronic pain conditions.

Additional studies have confirmed the importance of considering psychological factors when interpreting discography. Personality factors, as indicated in the Minnesota Multiphasic Personality Inventory (MMPI) scores, were found to significantly influence pain response. Individuals who had higher mean scores for the hypochondriasis, depression, and hysteria scales were more likely to report pain in nondisrupted discs [30]. Furthermore, individuals who have elevated MMPI scores typically indicated pain in nonanatomic patterns on pain drawings. Ohnmeiss et al. [31] demonstrated that patients with abnormal drawings had a significantly higher false-positive discography rate of 50% in comparison to a 12.3% false-positive rate in individuals with normal pain drawings. Because multiple studies have demonstrated that false-positive rates are significantly higher in individuals with abnormal psychological function, it is important for clinicians to consider psychological screening before performing discography. Unfortunately, our ability to detect psychopathology and psychological distress may be inadequate. Referral to a specialist or the utilization of standardized questionnaires should be considered before the performance and interpretation of discography. A prospective blinded study examining 400 patients who presented to a university spine center demonstrated that surgeons and nonoperative specialists were only able to detect psychological distress correctly in 28.7% and 41.7% of cases, respectively [32•].

Two of the main criticisms of the Carragee et al. [27–29] studies were how an asymptomatic individual can define concordant pain and why manometric pressure readings were not required as a diagnostic criterion for the designation of a positive disc [25, 33]. To address this second criticism, a retrospective analysis was performed on the three prior publications using a low-pressure guideline of less than 22 psi above opening pressure. The low-pressure requirement resulted in a false-positive rate of about 25% in asymptomatic individuals [34]. Patients without psychological distress, chronic pain, or previous surgery were associated with significantly lower false-positive rates.

A recent systematic review of lumbar provocation discography in asymptomatic patients, with a meta-analysis of false-positive rates, further suggests that false-positive rates could be lowered with appropriately set low-pressure diagnostic criteria and patient selection criteria [25••]. The meta-analysis of five studies, using the International Spine Intervention Society (ISIS) standard, resulted in a specificity of 0.94 and a false-positive rate of 0.06. In the systematic review, utilizing strict diagnostic criteria, the false-positive rates for specific populations were 5.6% for chronic pain, 50% for somatization disorder, and 15% for postdiscectomy subgroups.
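The two pooled figures are arithmetically linked: the false-positive rate is the complement of specificity. The following Python sketch illustrates the relationship using hypothetical counts (94 true negatives and 6 false positives), chosen only to reproduce the pooled values quoted above; they are not the counts reported in the meta-analysis, and the function names are illustrative.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of disease-free observations correctly classified as negative."""
    total = true_negatives + false_positives
    if total <= 0:
        raise ValueError("need at least one disease-free observation")
    return true_negatives / total


def false_positive_rate(true_negatives: int, false_positives: int) -> float:
    """Fraction of disease-free observations incorrectly classified as positive."""
    total = true_negatives + false_positives
    if total <= 0:
        raise ValueError("need at least one disease-free observation")
    return false_positives / total


# Hypothetical counts for illustration only (not taken from the cited review).
tn, fp = 94, 6
print("specificity:", round(specificity(tn, fp), 2))
print("false-positive rate:", round(false_positive_rate(tn, fp), 2))
```

Because both quantities share the same denominator (all disease-free observations), they always sum to 1, which is why a specificity of 0.94 implies a false-positive rate of 0.06.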

Technical Parameters

To further limit false-positive rates, multiple technical refinements and practice guidelines have been proposed for discography. Research has provided additional insight into possible reasons for false-positive responses besides psychological factors. In 2004, ISIS published practice guidelines with the goal of improving the diagnostic validity of the test [33]. One of the recommendations was the measurement of intradiscal pressures through manometry as an operational criterion to improve test validity [33, 35]. A national multispecialty survey demonstrated poor compliance with the ISIS recommendations, with only 65% and 72% of respondents measuring opening and pain response pressures, respectively [36].

Because a great amount of emphasis has been placed on the clinical significance of manometry, one must comprehend the foundation and limitations behind this recommendation. In 1999, Derby et al. [37] suggested that pressure-controlled discography improved the diagnostic specificity of the test and helped select the appropriate surgical technique. Diagnostic categories were developed based on intradiscal pressures at pain provocation, with a chemically sensitive disc having concordant pain provocation occurring at less than 15 psi above opening pressure, and a mechanically sensitive disc having pain provocation occurring between 15 and 50 psi above opening pressure. An indeterminate disc was classified with pain provocation occurring between 51 and 90 psi above opening pressure.
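The pressure categories described above can be expressed as a simple lookup. The Python sketch below is illustrative only: the function name is hypothetical, it assumes that concordant pain was actually provoked, and two boundary choices are assumptions not stated in the text (values between 50 and 51 psi are treated as indeterminate, and values above 90 psi are returned as "unclassified").

```python
def classify_by_provocation_pressure(psi_above_opening: float) -> str:
    """Map the pressure at concordant pain provocation (psi above opening
    pressure) to the diagnostic categories described in the text."""
    if psi_above_opening < 0:
        raise ValueError("pressure above opening cannot be negative")
    if psi_above_opening < 15:
        return "chemically sensitive"
    if psi_above_opening <= 50:
        return "mechanically sensitive"
    if psi_above_opening <= 90:
        # Assumption: the 50-51 psi gap in the text is folded into this band.
        return "indeterminate"
    # Assumption: the text does not name a category above 90 psi.
    return "unclassified"


for pressure in (10, 15, 50, 70, 95):
    print(pressure, "psi ->", classify_by_provocation_pressure(pressure))
```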

Although limited research exists for defining cutoff values for positive and negative discs, incorporation of pressure measurements does seem to limit false-positive results. A pressure-controlled lumbar discography study in volunteers without low back symptoms indicated that false-positive rates could be limited to less than 10% if the operational criteria are set at a pressure not greater than 50 psi above opening and an intensity of concordant pain greater than 4 on a 0 to 10 numerical rating scale [38]. The false-positive rate could be lowered to zero if the threshold pressure is lowered to 30 psi above opening and the required pain score is held at greater than 4 out of 10. An additional study, assessing the diagnostic relevance of pressure-controlled discography, demonstrated that concordant pain responses occurred at significantly lower intradiscal pressures [39].
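The operational criteria described above combine three conditions: concordant pain, a pain score above a minimum, and a provocation pressure at or below a ceiling. A minimal Python sketch follows; the function and parameter names are hypothetical, and the defaults simply encode the 50 psi and greater-than-4 thresholds quoted in the text rather than any validated clinical rule.

```python
def meets_operational_criteria(
    concordant: bool,
    pain_score: float,
    psi_above_opening: float,
    max_pressure_psi: float = 50.0,
    min_pain_score: float = 4.0,
) -> bool:
    """Return True when a disc response satisfies all three criteria:
    concordant pain, pain score strictly greater than min_pain_score
    (0 to 10 scale), and pressure not greater than max_pressure_psi
    above opening pressure."""
    if not 0 <= pain_score <= 10:
        raise ValueError("pain score must be on the 0 to 10 scale")
    if psi_above_opening < 0:
        raise ValueError("pressure above opening cannot be negative")
    return bool(
        concordant
        and pain_score > min_pain_score
        and psi_above_opening <= max_pressure_psi
    )


# Same response evaluated under the 50 psi and the stricter 30 psi ceilings.
print(meets_operational_criteria(True, 6, 40))
print(meets_operational_criteria(True, 6, 40, max_pressure_psi=30))
```

Lowering `max_pressure_psi` makes the test stricter: a response provoked at 40 psi counts as positive under the 50 psi ceiling but not under the 30 psi ceiling.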

Multiple factors influence pressure measurements during discography, including injection speed, contrast viscosity, location of sensors, and needle profile [40, 41••]. Pain reproduction during discography closely correlates with peak dynamic pressure, not static postinjection pressure. At high injection speeds, the difference between dynamic and static intradiscal pressures is significantly increased. Therefore, high speeds of injection may lead to substantially increased dynamic pressures. High injection speed, high viscosity, and needle characteristics (ie, small diameter and long length) all increase the dynamic pressure [40]. Automated pressure-controlled discography devices have been advocated as a way to control the speed of injection and reduce operator error [42]. Also, intrasyringeal pressure sensors are not as accurate as extrasyringeal sensors [41••]. Therefore, as the science of pressure manometry is advanced, it will be essential to standardize the speed of injection, viscosity of the injected material, diameter and length of the needle, and sensor placement.

Other technical factors, including the transfer of disc pressure to adjacent discs and endplate deflection, may be responsible for the generation of false-positive pain responses. An in vivo porcine discography study demonstrated pressure transmission to adjacent discs. The median value of intradiscal pressure rise in the adjacent disc was 16% over the baseline pressure [43•]. Therefore, a concordant pain response from a disc during discography could originate from a pressure increase in an adjacent abnormal disc. Derincek et al. [44] demonstrated that a pressurized pain response in a morphologically normal disc may be due to referred pain from an adjacent abnormal disc. When an adjacent morphologically abnormal disc was anesthetized and discography was repeated on a normal disc that had previously provoked a pain response, none of the patients experienced reproduction of concordant pain. Additionally, discography results in vertebral endplate deflection and deformation. In thoracolumbar cadaver spine segments, the average endplate deflections were 0.3 mm [45]. The stimulated disc may not be the sole source of pain during discography, and other structures including bony elements and adjacent discs need to be considered.

Clinical Utility

An additional controversy that has plagued discography is whether the interpreted results positively influence treatment outcome. Before determining whether preprocedural discography influences treatment outcome, it is important to understand the efficacy of present treatments for IDD. Treatment options for IDD are limited and are often associated with low levels of efficacy. The results of intradiscal electrothermal therapy are mixed [46–49]. The current evidence for efficacy is weak for pain relief and inconclusive for associated improvements in function [50]. Other intradiscal therapies have been developed such as intradiscal biacuplasty, though with only limited evidence generated from pilot studies and case series [51].

The surgical data for IDD and axial low back pain are heterogeneous, with success rates varying from less than 50% to more than 86% [52–55, 56•]. One systematic review of randomized trials comparing lumbar fusion surgery to nonoperative care for low back pain demonstrated that fusion may not be more effective than structured rehabilitation programs that include cognitive behavioral therapy [57]. Furthermore, lumbar fusion for degenerative disc disease is associated with complications and negative consequences, including the acceleration of degenerative changes in the adjacent level discs, nerve injury, hardware failure, infection, and higher reoperation rates [58–61].

It is difficult to accurately evaluate the utility and predictive value of diagnostic discography. This is impacted by the lack of a highly effective minimally invasive disc procedure. Additionally, a surgical procedure that consistently results in pain reduction, functional improvement, and an acceptable risk profile does not exist. Multiple studies have examined the correlation between discography results and treatment outcomes. Because of the low quality and heterogeneous results of these studies, it is difficult to interpret the evidence and to determine if presurgical discography predicts and enhances the degree of pain reduction or improves functional and quality-of-life measures [62]. Colhoun et al. [63] reported superior outcomes in individuals who had morphologically abnormal discs and provoked pain during discography (clinical success = 89%) than in individuals who only had morphological abnormalities (clinical success = 52%). Other studies have not found discography to favorably influence outcomes [64–66]. Poor results from single-level anterior lumbar interbody fusion for discogenic back pain were found in 47% of individuals who had discogram-concordant pain [64]. A study comparing patients who underwent surgical intervention for discogenic back pain, with and without presurgical screening discography, demonstrated no significant improvement in outcome. A total of 75.6% of patients in the nondiscography group and 81.2% in the discography group reported improved Oswestry Disability Index scores [66].

An additional question that arises is how to treat an individual that has a normal MRI scan but a positive discography result. A study examining functional results after L5–S1 lumbosacral fusion in patients with a positive discogram with either a normal or an abnormal MRI scan would suggest caution in the normal MRI patient category [67]. The success rate for surgical intervention in individuals with normal MRI findings was 50% versus the 75% success rate for those individuals with abnormal MRI findings. Surgical guidelines have further advanced this recommendation for avoiding intervention in individuals with normal MRIs [9, 68, 69].

In a small retrospective study analyzing the outcomes of patients with documented single-level discogenic pain determined by discography that were considered candidates for surgery but did not elect to progress forward, improvements in pain and disability scores occurred in 68% of patients at a mean follow-up of 4.9 years [70]. Although the study has limitations, it does suggest that the natural history of discogenic pain managed nonoperatively may be superior to surgical intervention in specific cases.

Risk of Procedural-Related Disc Damage and Progression of Disc Degeneration

The controversy surrounding the diagnostic validity of discography and its ability to improve clinical outcomes should be placed into further context with the risks associated with the procedure. Besides the traditional complications mentioned, including infection (epidural abscess, discitis, and osteomyelitis), neurological injury, and drug reactions, recently published in vivo, in vitro, and human studies have implied that the annular puncture from discography may have significant clinical and biological consequences [71, 72••, 73, 74•].

As early as the 1950s, Goldie [75, 76] raised concerns about changes observed in the intervertebral disc after discography, including hyaline droplets, but denied the areas of necrosis that others had described. A recent, prospective, 10-year, controlled, matched cohort study of disc degeneration examined individuals who underwent discography with either 22-gauge or 25-gauge spinal needles and patients who did not undergo discography [72••]. The researchers observed the effects of modern discography techniques with limited pressurization on rates of lumbar disc degeneration. An MRI was performed at initial enrollment and repeated 7 to 10 years after baseline assessment. When examining the subset of discs that underwent needle puncture, the percentage of individuals with progression of disc degeneration (35%) was significantly greater than in the control group (14%). The discography group had a significantly greater incidence of new herniations, with the herniations disproportionately occurring on the side of the annular puncture (foraminal and far lateral). No differences between the groups were observed in the degeneration patterns of the nonpunctured discs (L1–L3). Several limitations were noted in this study, including patient selection from a group of individuals that had a history of a greater than average risk of disc degeneration.

Although the above study raises concerns, other studies suggest that discography does not result in disc damage. Johnson [77] found no evidence that discography resulted in accelerated rates of disc degeneration or higher rates of subsequent disc herniation. In individuals with normal psychometric test results, there were no reports of the development of significant long-term back pain at 1 year after discography [78].

In vitro and in vivo animal models of disc degeneration, which provide further insight into the effect of needle puncture on disc biomechanics, degeneration, and cell viability, indicate that there should be concern about this possible complication. Annular puncture has been shown to have negative consequences, including changes in biochemical and structural properties, cell viability, and biosynthesis (Table 1) [74•, 79–83]. Needle puncture with a 25-gauge needle demonstrated harmful changes in dynamic modulus and creep (ie, tendency of the material to deform). Cell viability was decreased in the area of insertion [79].

Table 1 Needle puncture effects on disc biosynthetic and structural properties based on animal data

Concerns have been raised about the effects of the injectate on nucleus pulposus viability. Analgesic discography, which involves the injection of local anesthetic, has been proposed as a way to increase the sensitivity of diagnostic discography [84]. In cultured nucleus pulposus cells, 0.5% bupivacaine exerted cytotoxic effects, with 51% cell death [85••]. A time-dependent response was seen with 0.25% bupivacaine. High-dose intradiscal antibiotics, which are often used to help prevent discitis, also have been shown to have detrimental effects on disc cell viability, proliferation, and metabolism [86].

At this point, there are at least preliminary clinical and animal data that suggest there should be concern about the possibility of disc degeneration associated with discography. These data also bring into question the risk–benefit profile of validating discography results with a normal control disc. Future clinical investigation is needed to further assess this risk (Table 2).

Table 2 Areas for future investigation on needle puncture and injectate effects on the degenerative cascade in human disc tissue

Conclusions

The controversy surrounding discography continues to exist with widespread discussion over its validity and clinical utility. Most of the research suggesting high false-positive rates comes from one institution and provides insight into the importance of appropriate patient selection and other factors, in addition to anatomical structures, that can confound the results [27–29, 34]. Currently, false-positive rates seem to be effectively lowered to acceptable levels with appropriate use of operational criteria and patient selection. At present, other diagnostic techniques (ie, radiographic imaging) are not associated with better diagnostic capabilities for symptomatic IDD. Additional research and refinements are needed in equipment design and technical parameters to improve the validity of the test. Furthermore, discography is an invasive procedure that is associated with complications, including the possibility of accelerated disc degeneration. The decision to utilize this procedure should take these risks into account with the understanding that the current treatment of discogenic back pain is limited and still elusive. The diagnostic utility of discography cannot be fully evaluated until a consistently efficacious treatment with an acceptable risk profile is developed for discogenic back pain. Extensive research is needed to further the diagnosis and treatment of discogenic back pain with emphasis on ways to limit the negative consequences of treatment.