Introduction

Early-onset scoliosis (EOS) continues to present challenges to pediatric spine surgeons. Underlying etiologies are grouped into idiopathic, either infantile or juvenile, or nonidiopathic, which includes syndromic, neuromuscular, congenital, or secondary for other reasons [1,2,3]. Often, these patients are resistant to bracing, so surgical management is frequently indicated [1, 4,5,6]. Posterior spinal fusion (PSF) is recognized as a means to definitively halt curve progression [6], but, given the early-onset nature, many patients are too young or small to be primarily fused. Bridging operations such as distraction-based growing rod (GR) instrumentation are performed in an effort to improve spinal length, lung growth, and pulmonary function [1,2,3, 5, 7].

Despite the noble goals, GR procedures can be fraught with issues, with complication rates upwards of 58% [8]. Complications include hardware failure, infection, and wound breakdown [1, 5,6,7,8,9]. One complication that has been recognized but sparsely investigated and poorly understood is spontaneous spinal fusion, or autofusion (AF).

AF has been observed in growing constructs in case series from 1984 [10], 1991 [11], and 1995 [12]. It was not until 2010 that AF was retrospectively reported to be a continued issue, even with current instrumentation and techniques [13]. The following year, a multicenter study described the Law of Diminishing Returns [14] for growing constructs, in which it was postulated that “a possible explanation for the diminishing returns may be progressive stiffness of the immature spine that develops from prolonged instrumentation or even autofusion”. AF continues to be reported as a GR complication [2, 8, 9, 15, 16], but, despite this acknowledgment, understanding of AF and its impacts has progressed minimally over the past four decades.

This study set out to answer many of the unknowns surrounding AF by investigating, quantifying, and determining its effects in a prospective design. The study had three aims. The first aim was to quantify AF, including its graded severity, to determine its incidence and extent. The second aim was to identify risk factors for AF, both preoperatively and during the GR treatment period, to help explain how and why AF occurs. The third aim was to determine whether AF affects scoliosis management by quantifying its impact, by severity, on curve correction and spinal lengthening.

Methods

This study was designed as a prospective cohort of EOS patients, was managed at a single academic institution, and was approved by the institutional review board (IRB). The patient care timeline began from preoperative evaluation, continued through all surgeries including index GR placement, all lengthening(s), and any additional procedures, and ended after the definitive posterior spinal fusion (PSF). Patients were enrolled from 2016 through 2021, prior to PSF. Some patients enrolled preoperatively, and others enrolled following index GR placement. For the latter, demographics, preoperative measurements, and GR treatments up to that point were able to be collected through electronic medical record (EMR) chart review.

Inclusion criteria required a diagnosis of EOS. GR constructs could be of any type of distraction-based growing instrumentation. All patients in this study received either traditional (TGR) or magnetically controlled (MCGR) GR constructs. TGR necessitated open lengthenings to divergently translate rods along a cross connector, while MCGR involved implantation of magnetic expansion control rods (Nuvasive; San Diego, CA, USA), which are lengthened transdermally, without surgical intervention. GR were placed by one of the institution’s two pediatric orthopaedic spine surgeons. The surgical technique for both GR types involved exposing the vertebrae at the cephalad and caudal ends of the designated construct levels for placement of the GR vertebral fixation and creation of the fusion masses. For TGR, the midportion of the spine was incised down to the thoracolumbar fascia for the rods to be placed and connected, while for MCGR, the rods were tunneled subcutaneously to bridge the ends. Qualifying patients needed to maintain the entirety of their care within the institution, including index GR placement, all subsequent surgical and clinical care, and definitive PSF.

Exclusion criteria included patients who had received surgical or clinical management at an external institution at any point during the collection timeline, due to inability to control and track external indications and surgical technique. Patients were excluded if they did not complete the entirety of the described timeline. Patients were also excluded if they had undergone any spine surgery prior to placement of the index GR, such as neurosurgical procedures for spina bifida repair or tethered cord.

Data collection began with patient demographics, EOS classification, and preoperative scoliosis measurements, which included both apical curve size and spinal length. Apical curve size was each patient’s maximal Cobb angle on the preoperative posterior-anterior radiograph. The cephalad and caudal end vertebrae were recorded, and the same end vertebrae were used for all subsequent Cobb measurements. Spinal length was always measured T1–S1. Both measurements were performed on calibrated X-rays within MergePACS (IBM; Armonk, NY, USA). The date and age of the patients were noted at the time of the index GR placement, along with the type and length of construct utilized and the scoliosis measures immediately after index GR placement, prior to any lengthenings. During the GR management period, the numbers of open and closed lengthenings were collected, along with all other surgical interventions. Other surgical interventions included repeat spine surgery for any indication during the treatment period, such as hardware revision or infection management. Finally, patients underwent removal of GR instrumentation and conversion to PSF. All PSF instrumentation used the same GR fusion masses for the cephalad and caudal extents, with no PSF extending beyond these levels in any patient. Age at this final surgery relative to index GR placement determined the length of the overall GR management timeline. During PSF, all vertebral levels were exposed, allowing AF to be directly assessed, as described below. Lastly, after completion of the PSF, final radiographic scoliosis measurements were recorded.

For AF assessment, there is currently no “gold standard” evaluation modality. Both computed tomography (CT) and magnetic resonance imaging (MRI) have known accuracy limitations with nearby metal [17], along with additional challenges, costs, risks, and evaluation biases in attaining such imaging in pediatric patients. Therefore, a standardization of a previously described direct assessment method [13] was utilized. One of two senior surgeons, with 42 years of combined pediatric spine experience, individually assessed each vertebral level between the cephalad and caudal fusion masses for mobility and for the presence of bridging bone at any intervertebral location. If two vertebrae moved as a unit without differential mobility and/or were connected by solid bridging bone, that level was documented as AF. Conversely, if two vertebrae moved independently with frank mobility between them, that level was documented as not AF. By dividing the number of AF levels by the total number of levels between fusion masses, an AF percentage could be determined. These percentages were converted to an IRB-approved severity grade, detailed in Table 1. Grades I–V correlate with AF percentages, ranging from 0 to 100% AF. Those graded I or II (< 50% AF) are additionally described in this study as low-grade, and III–V (≥ 50% AF) as high-grade.
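The percentage and grading step described above can be sketched in a few lines of Python. This is only an illustrative sketch: Table 1 is not reproduced in the text, so the grade cutoffs used here (25%, 50%, 75%, and 100%) are assumptions inferred from the low-/high-grade split at 50% and from the percentages reported later in the paper, and the function names are hypothetical.

```python
from typing import Sequence


def af_percentage(fused_levels: Sequence[bool]) -> float:
    """Percentage of assessed levels documented as AF.

    Each entry is True if that level between the fusion masses was
    documented as AF, False if it moved independently.
    """
    if not fused_levels:
        raise ValueError("at least one level must be assessed")
    return 100.0 * sum(fused_levels) / len(fused_levels)


def af_grade(percent: float) -> str:
    """Map an AF percentage to a grade I-V.

    ASSUMPTION: quartile-style cutoffs, with Grade V reserved for
    complete (100%) AF. The actual Table 1 boundaries may differ.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent < 25.0:
        return "I"
    if percent < 50.0:
        return "II"
    if percent < 75.0:
        return "III"
    if percent < 100.0:
        return "IV"
    return "V"


def af_category(percent: float) -> str:
    """Low-grade is < 50% AF; high-grade is >= 50% AF (as stated in the text)."""
    return "low-grade" if percent < 50.0 else "high-grade"


# Hypothetical patient: 12 levels between fusion masses, 4 documented as AF.
levels = [True] * 4 + [False] * 8
pct = af_percentage(levels)
print(round(pct, 1), af_grade(pct), af_category(pct))
```

Only the 50% boundary between low- and high-grade is stated explicitly in the text; the remaining cutoffs should be replaced with the Table 1 values where they differ.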

Table 1 Autofusion grading

Data analysis progressed in three stages. The first stage was statistical evaluation of patient demographics, construct types, and interval lengthenings and surgeries. Changes in scoliosis curve and spinal length were then calculated. In stage two, AF grades were applied to each patient. Patients were evaluated for 22 different exposure factors (Table 2). Given this novel research topic, standard values do not exist for these factors, so the factors were classified as follows. Risk factors were characterized as binary categorical (e.g. gender), presence or absence (e.g. fixation to pelvis), or continuous values above or below the sample mean (e.g. age at index GR placement), as labeled in Table 2. If the continuous values were discrete (e.g. vertebral levels), the mean was rounded to the nearest whole value to define the cutoff, while those that were nondiscrete were kept at the statistical mean (e.g. cm of length added).
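As a rough illustration of how a continuous factor could be split at the sample mean, the following Python sketch dichotomizes a list of values. The function names and sample values are hypothetical, and two details are assumptions not stated in the text: values exactly equal to the cutoff are treated as "not above", and halves are rounded up when a discrete cutoff is rounded.

```python
import math
from statistics import mean
from typing import List, Sequence


def factor_cutoff(values: Sequence[float], discrete: bool) -> float:
    """Cutoff for a continuous exposure factor.

    Non-discrete factors use the sample mean directly. Discrete factors
    (e.g. vertebral levels) use the mean rounded to the nearest whole
    value. ASSUMPTION: halves round up (floor(x + 0.5)).
    """
    m = mean(values)
    return float(math.floor(m + 0.5)) if discrete else float(m)


def dichotomize(values: Sequence[float], discrete: bool = False) -> List[bool]:
    """True where a value lies above the cutoff.

    ASSUMPTION: a value equal to the cutoff counts as not above it.
    """
    cutoff = factor_cutoff(values, discrete)
    return [v > cutoff for v in values]


# Hypothetical construct lengths in vertebral levels (a discrete factor).
construct_levels = [13, 13, 14]
print(factor_cutoff(construct_levels, discrete=True))
print(dichotomize(construct_levels, discrete=True))
```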

Table 2 Risk factors

Each exposure factor was recorded for each patient and analyzed against that patient’s AF grade, generating a Pearson correlation coefficient R with p value, a relative correlation, and an odds ratio. Pearson correlations are described as either direct (+) or inverse (−). Values of R between − 0.1 and 0.1 were considered uncorrelated, ± 0.1 to ± 0.25 weakly correlated, ± 0.25 to ± 0.5 mildly correlated, ± 0.50 to ± 0.75 moderately correlated, and ± 0.75 to ± 1.00 strongly correlated. R values were considered significantly correlated if the p value was ≤ 0.05 and significantly uncorrelated if p ≥ 0.95. Odds ratios were significant if their 95% confidence interval did not cross 1.00. Those whose interval remained > 1.00 were statistically significant risk factors, while those whose interval remained < 1.00 were statistically significant protective factors.
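The classification rules above can be illustrated with a short Python sketch. Several points are assumptions rather than details taken from the study: the confidence interval uses the common log (Woolf) approximation, values falling exactly on a band boundary are assigned to the stronger band, and the 2×2 counts in the example are hypothetical. All function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r: float) -> str:
    """Label |R| with the bands described in the text.

    ASSUMPTION: a value exactly on a boundary goes to the stronger band.
    """
    a = abs(r)
    if a < 0.1:
        return "uncorrelated"
    if a < 0.25:
        return "weak"
    if a < 0.5:
        return "mild"
    if a < 0.75:
        return "moderate"
    return "strong"


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    ASSUMPTION: log (Woolf) interval; the study's exact method is not stated.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


def interpret_odds_ratio(low: float, high: float) -> str:
    """Significant only if the interval does not cross 1.00."""
    if low > 1.0:
        return "risk factor"
    if high < 1.0:
        return "protective factor"
    return "not significant"


# Hypothetical counts, not taken from the study's tables.
or_value, lo, hi = odds_ratio_ci(10, 2, 3, 13)
print(round(or_value, 1), interpret_odds_ratio(lo, hi))
print(correlation_strength(-0.3))
```

With cell counts as small as those in a 28-patient cohort, this approximation produces wide intervals, which is one reason a large odds ratio can sit close to the significance threshold.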

The third and final analysis stage involved directly comparing the low- and high-grade patients to determine whether the severity of AF impacted curve correction, by way of Cobb angle measures, or length correction, by way of spinal length measured T1–S1. Student’s t tests were performed, and statistical difference was defined as p ≤ 0.05.
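For readers unfamiliar with the comparison, a minimal Python sketch of the two-sample Student's t statistic follows. It assumes the classic pooled-variance (equal-variance) form, which the text does not specify, and the sample values are hypothetical. It returns only the statistic and degrees of freedom; the p value would then be read from the t distribution with a statistics package.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic and degrees of freedom.

    ASSUMPTION: equal variances (pooled form), independent groups.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if sp2 == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


# Hypothetical changes in Cobb angle (degrees) for two small groups.
low_grade = [-5.0, -3.0, -1.0]
high_grade = [2.0, 5.0, 8.0]
t_stat, dof = pooled_t(low_grade, high_grade)
print(round(t_stat, 2), dof)
```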

Results

Thirty-two patients were enrolled. Twenty-eight patients qualified for this study after four were excluded. Two excluded patients never underwent definitive PSF. Another patient was excluded after receiving their index GR placement and initial management at another facility prior to transferring their care to our hospital. And the fourth exclusion was a patient who had undergone multiple neurosurgical spinal procedures as an infant for tethered cord complicated by subsequent thoracolumbar infection.

Preoperative EOS diagnoses included 11 idiopathic, 7 syndromic, 7 neuromuscular, 2 congenital, and 1 secondary to an infantile cardiothoracic surgery that did not involve any surgery on the spine (Fig. 1). The average age at the time of index GR surgery was 8.54 years (2.74–11.13). Patients were lengthened over 4.81 years (2.09–8.27) to a mean age of 13.31 years (10.54–15.98), at which time they underwent GR removal and conversion to PSF.

Fig. 1 EOS diagnoses. This graph shows the numerical and percentage breakdown of the underlying EOS diagnosis for each patient enrolled in the study

At the index operation, 53.6% of patients received MCGR and the remaining received TGR. The vertebral level of cephalad fixation was most commonly T2 or T3, though one patient’s construct ended at T1. The caudal fixation was more variable, with L3 being the most frequent level. Nearly 18% of patients’ GR constructs included the pelvis. Overall, the GR constructs bridged 13.4 levels on average. Further details on these index GR constructs can be found in Table 3.

Table 3 Operative technique

Over the lengthening period, those with TGR underwent 3.9 (1–9) open, operative lengthenings, and those with MCGR had 11.4 (2–23) closed, magnetic lengthenings. The net average number of lengthenings for the cohort was 10.8 (1–23). It is notable that 53.8% (7/13) of TGR were converted to MCGR, and one of these seven was converted back to TGR. None of those primarily implanted with MCGR (0/15) were converted to TGR. Throughout the lengthening period, patients averaged 0.8 (0–5) additional open procedures. These included revision of screw(s) or hook(s) (11, in 7 patients), upsizing of MCGR (2, in 2 patients), rod revision for breakage (4, in 4 patients), and incision and drainage for infection (5, in 2 patients). There were no occurrences of crankshaft or junctional kyphosis necessitating intervention. Combining open lengthenings, rod conversions, and other open procedures results in an average of 2.9 (0–16) total open procedures per patient over the treatment timeline between index GR placement and PSF.

When looking at changes in the spine measures before and after GR placement (Table 4), spinal curves on average were corrected from 68.5° (48.5–109.4°) preoperatively to 35.3° (14.7–65.0°) after GR placement. After converting GR to definitive PSF, the curve severity remained similar to the index GR correction, averaging 35.6° (14.5–70.1°) (p = 0.93). Regarding spinal length, on average length increased from 30.0 cm preoperatively to 33.9 cm after index GR placement. After an average 4.8 years of lengthening, an additional 5.6 cm were gained for a final length of 39.5 cm after PSF.

Table 4 Scoliosis correction and lengthening

AF grading (Fig. 2) was as follows: 28.6% Grade I, 25.0% Grade II, 17.9% Grade III, 25.0% Grade IV, and 3.6% Grade V. To categorize this grading, 53.6% were low-grade (Grade I or II, < 50%) and 46.4% were high-grade (Grades III–V, ≥ 50%). The grades were compared with the underlying EOS diagnosis by chi-square analysis. There were no significant correlations between grade and etiology, though idiopathic patients had relatively lower AF grades on average (2.0) compared to nonidiopathic patients (2.8) (p = 0.37).

Fig. 2 AF grading. This graph depicts the number and percentage of each AF grade as calculated at the time of PSF

The results of the 22 exposure factors are presented in Table 5. Significant risk factors for AF include GR placement before age 8 (10.4×, p = 0.01), any interval open procedures (6.3×, p = 0.05), and residual curve > 30° after index GR (13.7×, p = 0.02). Two protective factors include preoperative spinal length of > 30.0 cm (0.11×, p = 0.01) and index MCGR rather than TGR (0.16×, p = 0.03). Cephalad level, number of total lengthenings, > 50% initial correction, and > 12.5% initial length added were all shown to have no statistical correlation with AF (all p values > 0.95).

Table 5 Risk factor correlation with autofusion

Given the known change in the cohort’s spine characteristics presented in Table 4, and the described AF grade findings, the cohort was then subcategorized into low- and high-grade AF to directly compare whether the severity of AF impacted curve correction or spinal lengthening. When comparing the change in Cobb angles (Table 6A) from index GR placement to after PSF, there was slight additional correction in those with low-grade AF (− 3.5°) compared with minor curve progression in those with high-grade AF (+ 4.7°) (p = 0.08). There was a 20.5% higher frequency of correction loss from index GR through PSF in the high-grade subgroup (53.9%, vs 33.3% for low-grade), but this difference did not reach significance (p = 0.29).

Table 6 A and B Spinal parameters between low- and high-grade AF

Looking at subgroups by length gained over the lengthening period (Table 6B), as previously discussed in the risk factor results, a longer preoperative spine was a protective factor against AF, and this length difference held consistent throughout the treatment period, as seen by the longer T1–S1 measurements at all three timepoints (p = 0.02, 0.008, 0.03, respectively). Despite the absolute length difference, there was no difference in the absolute or relative length added between low- and high-grade AF (p = 0.50, 0.18, respectively). Low-grade patients lengthened an average of 5.3 cm, and high-grade patients lengthened an average of 6.1 cm.

Discussion

As mentioned, AF has been sparsely discussed in the literature spanning the past four decades. Moe et al. back in 1984 [10], 5 years after describing the subcutaneous technique for Harrington instrumentation, reported within their case series the observation of “spontaneous fusion” as a finding. A number of years later in 1991, Fister et al. [11] published a 9-patient series on Luque Trolley instrumentation undergoing revision 4 years following index instrumentation. The report observed “spontaneous fusion” in all patients “most often at or distal to the thoracolumbar junction” and, in two patients, the AF was “solid” throughout. These authors surmised that this fusion limited further correction.

It was not until 1995 that the first article came to the literature dedicated to observing spontaneous fusion. In a three-patient case report by Fisk et al. [12], AF was diagnosed by x-ray and CT in one patient and by direct visualization during revision surgery in the other two. The authors concluded broadly that “spontaneous spine fusion can occur before reaching maturity”.

Over a decade later, Cahill et al. [13] revived the AF discussion in their 2010 publication. Unfortunately, only nine GR patients from 1985 to 2004 were identified through medical record review. Spontaneous fusion was documented in eight (89%). The authors did not believe that the presence of AF inhibited lengthening, although their lengths were compared to a previous study that did not assess for or report the presence of AF. It was proposed that “distractive force leads to [fusion] growth.” Patient age and treatment duration were also discussed as possibly being involved in the process. Risk factors for the AF were not investigated, as it was termed “multifactorial” with proposed reasons including immobilization, muscular disturbance, and immature bone healing. The 2011 Law of Diminishing Returns [14] demonstrated diminishing length gains after subsequent GR lengthenings, and it was “the experience of many of the authors” that spines were “stiffer than would be expected” as “several of these segments have undergone autofusion”.

Therefore, for decades, AF has been described and repeatedly cited as a GR complication without any deeper understanding of it. This prospective cohort study was designed to identify and address numerous questions surrounding AF in EOS growing constructs.

First, the question of frequency and severity of AF was evaluated. While the Cahill retrospective series [13] found an incidence of 89%, this was by chart review of documentation for the presence of AF in only nine patients. Comparatively, this study’s 28-patient cohort demonstrated AF to be essentially ubiquitous, but occurring along a severity continuum, ranging from minimal (0%) to complete (100%). This adds a new layer of understanding in comparison to previous reports that utilized the binary definition of “present” or “absent” to describe AF. The proposed grading system (Table 1) was designed to better delineate this continuum and found a fairly even spread between grades I–IV. This cohort showed that, while 71.4% of patients had ≥ 25% AF, only 28.6% had ≥ 75% AF. Therefore, noting merely the presence of AF at a single or even a few levels does not necessarily mean a patient is completely fused, as most patients fall somewhere along the spectrum in between.

The second question that this paper addressed was which factors put patients at risk for AF. The regression analysis and odds ratios identified factors that put patients at risk for AF, factors that protect patients from AF, and, also importantly, factors that did not correlate with AF rates and severity. At the index GR procedure, a residual Cobb angle of > 30° carried a 13.7× odds ratio, and performing the procedure before the age of 8 years increased the odds of AF by 10.4×, while utilizing MCGR rather than TGR and performing the surgery after reaching 30 cm of preoperative T1–S1 length were protective against AF formation, with odds ratios of 0.16× and 0.11×, respectively. Also, for surgical planning of the index procedure, it is valuable to know that the level of cephalad fixation, > 50% index Cobb correction with ≥ 12.5% increased length, and index spine lengthening were uncorrelated with AF. During the GR management period, the need to perform any open procedure carried a 6.3× risk, while increasing the number of magnetic or combined lengthenings did not correlate with AF development or severity.

Some of these factors can be controlled by the surgeon, such that AF may be minimized by avoiding factors that increase AF odds and incorporating those that add protection. But it must also be recognized that these individual factors cannot be viewed in a vacuum. The authors understand that many of these factors are associated with others, such as age and spine length. With that said, the factors are still discrete and were not equally influential on AF. The odds ratios can aid surgical decision making and discussions with families, as patients are likely to have a mixture of risk, protective, and noncorrelative factors. In addition, the authors recognize that these surgical factors will only play a part in the decision, as there are separate medical, social, and global considerations that go into the decision-making process, such as family situation, medical complexity, other surgical needs, and tolerance of nonoperative measures. But, as an overarching takeaway, modifiable risk factors such as surgical timing, type of implant selected, amount of initial correction, and construct levels can be considered in surgical planning to mitigate AF risk.

The final question this paper addressed was if and how AF matters. Two of the main goals of growing constructs in EOS are to improve scoliosis curvature and to allow the spine to continue to grow and lengthen. This study showed that the correction achieved after index GR placement remains similar to the final correction after PSF, though patients with high-grade AF were relatively more likely to have a slight loss of reduction. The absolute difference between curvatures in low- and high-grade AF patients was 8%, and this did not achieve statistical significance (p = 0.076). It was not within the scope of this study to determine if that difference was clinically significant. Additionally, while it previously had been postulated [13,14,15,16] that AF may interfere with spinal lengthening, this study showed that patients lengthened similarly between those with low- and high-grade AF (p = 0.50). The 5.7 cm average length gained in this study (1.2 cm/year) is similar to previously published expected length gains [14]. Therefore, in summation, these findings provide reassurance for providers, patients, and families that even more severe AF should not have a significant impact on curve correction or length gains. Other potential impacts of AF have yet to be elucidated.

Critiques, limitations

Despite this being the largest AF cohort series to date, EOS remains a rare entity and the findings in this study are limited by the relatively small sample size. One common way to address this is to perform multicenter studies or work through study groups to increase numbers, but adding institutions generates further challenges with controlling for surgical technique, patient care, treatment protocols, etc. A power analysis was not applicable, as baseline values for the questions being researched have not previously been published. This study enrolled all eligible candidates over the approved study period.

There is currently no gold standard for assessing AF with any clinical, intraoperative, or imaging technique. Advanced imaging modalities such as CT and MRI are not validated for this purpose, and they do not eliminate subjectivity or error, especially given the associated metal artifact, particularly if the implants contain magnets. Furthermore, advanced imaging also comes with risks (radiation, sedation), costs, and technical challenges for the EOS population. This study proposed a new evaluation and grading system to describe AF severity. Previous literature reported only on retrospective observations of AF [10,11,12,13], an approach that has many limitations. In this study’s design, prospective collection limits recall and omitted-variable biases, and level quantification limits measurement bias. But, even when performed by senior surgeons, human evaluation is affected by observer bias (subjectivity) and confirmation bias, which limit the impact of the conclusions. A future, large-scale validation study could ascertain the reliability of the described methodology, and a direct comparison with alternative modalities such as CT or MRI would help elucidate the value of each technique.

Lastly, the authors recognize there may be other surgical, medical, or patient factors that may influence AF but were not identified in this study. Future studies may find additional protective or risk factors.

Conclusion

Prior to this project, autofusion was a recognized but poorly understood growing rod complication in early-onset scoliosis management. This study showed that autofusion is nearly ubiquitous in these patients but presents with variable severity. Risk factors, along with noncorrelative and protective factors, do exist and are associated with autofusion development. It is possible that modifying these factors may minimize autofusion development. Cobb angles after PSF tend to be similar to the index GR correction, regardless of the amount of autofusion. And all patients, even those with more severe autofusion, still achieved increased spinal length. Finally, autofusion was not an apparent impediment to definitive posterior fusion.