Introduction

Osteoid osteoma (OO) is a benign bone tumor of the young that typically presents with chronic pain in the extremities that worsens at night. Surgical excision was the “gold standard” of treatment until 1992, when Rosenthal et al. [1] reported the first percutaneous thermal ablation (PTA), more precisely, computed tomography-guided radiofrequency ablation (CT-RFA) (Figs. 1, 2). Since then, numerous clinical trials have reported success rates close to 100 %, thus making PTA the treatment of choice. Advantages over surgery include the low invasiveness of percutaneous versus open access, minimal postinterventional observation, the ability to treat high-risk localizations, such as intra-articular and spinal sites, and lower cost.

Fig. 1 Large cortical OO of the tibia (white arrow) in a 7-year-old boy before (A) and after (B) RF needle insertion

Fig. 2 Small OO of the medial cortex of the femur (white arrow) in a 23-year-old man before (A) and after (B) RF needle insertion

A minority of patients, however, have only a partial response. In some cases, the investigators report an unsuccessful technique; nevertheless, it is accepted that a small percentage of patients will have a symptomatic recurrence, even after several pain-free months, no matter how satisfactory the procedure initially was.

The present study aimed to assess the incomplete and recurrent cases of OO treated by PTA in the literature and to list the causes and the suggestions reported by the investigators. Second, we analyzed the overall safety and efficacy of PTA during long-term follow-up and classified the incidence of OO in relation to anatomical size and patient age in a large cohort of patients. Third, we detailed a sizable number of complications and the reported methods to prevent them. To our knowledge, this is the first systematic literature review of PTA of OO.

Materials and Methods

Literature Searches

This study followed the Cochrane Handbook for Systematic Reviews of Interventions [2]. We searched MEDLINE, MEDLINE In-Process, EMBASE, and the Cochrane databases from 1992 to 2013, targeting our search based on condition and intervention using the keywords “osteoid,” “osteoma,” “ablation,” “coagulation,” and “thermocoagulation” combined in appropriate algorithms. In addition, the bibliographies of the resulting articles were screened for further inclusion.

Selection Criteria

All studies matching the following criteria were eligible for inclusion: (1) prospective or retrospective cohort study for PTA of OO under CT guidance; (2) patient OO diagnosis by way of at least one CT or MR examination; (3) RFA or interstitial laser ablation (ILA) as the object of the study; (4) English language; (5) population >10 patients; (6) follow-up ≥12 months; and (7) results not already published by the same investigator in a previous article. We also included those studies focused on comparison with another technique or those in which part of the cohort had undergone previous treatment for OO. Two reviewers [E. L. (radiology resident with 6 years of experience in literature research) and Y. T. (radiologist with 6 years of OO ablation practice)] assessed the studies for inclusion and resolved conflicts by consensus.

Data Extraction

Eligible articles were systematically assessed for data extraction against a precompiled electronic sheet (Excel 2011; Microsoft, Redmond, WA, USA), which recorded the following features: study design, demographic characteristics, type of anesthesia, technical specifications, tumor localization, technical success, clinical success at multiple time points, biopsy success, postprocedural management, complications, incomplete treatments and recurrences, follow-up protocol, retreatments, patients lost to surgery, and investigators.

When an investigator reported the median age of the population instead of the mean, the mean was estimated with a dedicated algorithm [3]. The same calculation was used when the mean follow-up time was not explicitly reported.
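The text does not name the algorithm cited in [3]. As an illustration only, the sketch below uses one widely known range-based estimator, often attributed to Hozo and colleagues, which approximates the mean from the minimum, median, and maximum. Whether this is the method actually used here is an assumption, and the function name and the example cohort are hypothetical.

```python
def estimate_mean_from_median(minimum: float, median: float, maximum: float) -> float:
    """Approximate a sample mean from its minimum, median, and maximum.

    Range-based estimator: mean ~ (min + 2 * median + max) / 4.
    Assumption: this stands in for the unnamed algorithm of reference [3].
    """
    if not minimum <= median <= maximum:
        raise ValueError("expected minimum <= median <= maximum")
    return (minimum + 2 * median + maximum) / 4


# Hypothetical cohort: ages from 6 to 40 years with a median of 18 years.
estimated_mean_age = estimate_mean_from_median(6, 18, 40)
print(estimated_mean_age)
```

The same function could be applied to follow-up times when only a median and a range are reported.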

Regarding clinical success, we considered the number of patients reporting a satisfactory treatment at different time intervals: I = 0–1 months; II = 1 month and 1 day to 6 months; III = 6 months and 1 day to 12 months; IV = 12 months and 1 day to 24 months. In particular, interval IV was considered only when at least half the population was assessed at this follow-up point. If a patient had a recurrence during one time interval and a satisfactory retreatment in a following interval, the case was counted as clinically unsuccessful in the former and as successful in the latter. If part of a study population did not have 12-month follow-up, this subgroup was separated and excluded from the analysis [4].
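The interval rules above can be sketched as a small classifier. This is a minimal illustration, assuming follow-up time is expressed in months as a number; the function names are hypothetical and not part of the study's extraction sheet.

```python
from typing import Optional


def follow_up_interval(months: float) -> Optional[str]:
    """Map a follow-up time in months to interval I-IV.

    Returns None for times beyond 24 months, which fall outside
    the four intervals defined in the text.
    """
    if months < 0:
        raise ValueError("follow-up time cannot be negative")
    if months <= 1:
        return "I"
    if months <= 6:
        return "II"
    if months <= 12:
        return "III"
    if months <= 24:
        return "IV"
    return None


def include_interval_iv(assessed_at_iv: int, cohort_size: int) -> bool:
    """Interval IV counts only if at least half the cohort was assessed then."""
    return 2 * assessed_at_iv >= cohort_size


print(follow_up_interval(3))        # a patient seen at 3 months
print(include_interval_iv(9, 20))   # fewer than half assessed at interval IV
```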

Regarding biopsy, we considered it successful when the pathologist’s response was diagnostic for OO or osteoblastoma (OB). Notably, not all investigators performed this procedure, and among those who did, only a subgroup of patients was considered suitable. Hence, our calculations for biopsy success ratio apply to these subgroups and not to the overall study cohort.

A treatment was considered incomplete when the investigator specifically reported an unsatisfactory procedure, usually recognized before or immediately after the first postinterventional evaluation. In contrast, a recurrence was defined as a new episode of the same pain for which the patient was originally treated that occurred after a successful procedure and a reported or deducible pain-free interval of at least 1 month. If such a patient opted or was referred for surgery instead of retreatment, this was counted as “patient referred to surgery.”

Complications were defined as all perioperative events that required unpreventable medical care, such as additional clinical evaluation, prolonged observation, further imaging studies, or interventions. When a complication led to an incomplete treatment, the two were counted as distinct events. If a complication led to a surgical intervention, the patient was again counted as “patient referred to surgery.” A retreatment was recorded when a patient needed more than one ablation in the same study to attain pain relief, regardless of the outcome.

For each of these categories, we extracted and categorized the investigators’ pertinent opinions on both likely causes and actions to take to avoid undesired results. Because not all articles contributed equally to the analysis, for every field we report the number of articles (n p) that provide pertinent data (Table 1).

Table 1 Articles included in the analysis, sorted by year

Methodological Quality Assessment

The assessment of quality was performed with a modified version of the Newcastle–Ottawa Scale (NOS) [5]. This scale evaluates the quality of nonrandomized studies to be included in a systematic review and uses a “star system” to judge three aspects of the study groups: (1) selection, (2) comparability, and (3) ascertainment of either the exposure or outcome of interest for case–control or cohort studies, respectively. Considering the implicit absence of a control group in the cohorts included in our review, we adapted the NOS to judge the following aspects: (I) representativeness of cohort, (II) ascertainment of exposure, (III) outcome of interest, (IV) assessment of outcome, (V) adequate duration of follow-up, and (VI) adequate follow-up of cohort. According to the answer to each specific question (Table 2), an article was awarded a grade A*, B*, C, or D for each aspect, but only grades A* and B* are worth a star. Consequently, an article of the best possible quality was awarded six stars.
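The star count described above amounts to one star per aspect graded A* or B*. The sketch below illustrates that rule only; the aspect keys, function name, and example grades are hypothetical, and the actual grading questions are those of Table 2.

```python
from typing import Mapping

# The six aspects of the modified NOS, as numbered in the text.
ASPECTS = ("I", "II", "III", "IV", "V", "VI")
# Only these grades are worth a star.
STARRED_GRADES = {"A*", "B*"}


def count_stars(grades: Mapping[str, str]) -> int:
    """Return the number of stars (0-6) for one article's six aspect grades."""
    missing = [aspect for aspect in ASPECTS if aspect not in grades]
    if missing:
        raise ValueError(f"missing grades for aspects: {missing}")
    return sum(1 for aspect in ASPECTS if grades[aspect] in STARRED_GRADES)


# Hypothetical article: every aspect starred except duration of follow-up (V).
example_grades = {"I": "A*", "II": "A*", "III": "B*", "IV": "A*", "V": "C", "VI": "A*"}
print(count_stars(example_grades))
```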

Table 2 Newcastle–Ottawa Quality Assessment scale modified by Lanza E

Results

Two hundred fourteen studies were initially found. After applying the criteria, 28 clinical trials were included. During the first assessment, 2 articles were excluded because the same investigator had already presented the results in a different journal. Two studies were excluded because ablation was performed with alcohol injection and cementoplasty, respectively. After reference evaluation, 3 frequently cited articles, initially not considered, were also included.

In total, this review comprises 27 articles (n p) describing thermal ablation for OO concerning 1,772 patients (1,205 males, 554 females, 13 not specified; mean age = 20.8 years; σ = 2.5) from 1998 to 2013: 23 involved RFA, 3 involved ILA, and 1 involved combined RF-ILA.

Twelve investigators (44 %) performed the ablations with the patient under general (GA) or locoregional (LR) anesthesia; 9 (33 %) under GA; 2 (7 %) did not report specifically; 2 (7 %) used GA, LR, or local anesthesia; 1 (4 %) used both GA and conscious sedation (CS); and 1 (4 %) intervened with the patient under GA, LR, or CS depending on the case.

Mean technical success was 100 % (σ = 0.02, n p = 27). The OOs were located as follows: femur 41.6 %, tibia 20.1 %, not specified 10.9 %, foot 6.1 %, humerus 4.9 %, pelvis 4.9 %, spine 3.5 %, fibula 1.7 %, ulna 1.5 %, radius 1.1 %, hand 1.1 %, scapula 0.5 %, clavicle 0.1 %, and ribs 0.1 % (Fig. 3).

Fig. 3 Clinical success at 1, 6, 12, and 24 months after thermal ablation

The mean biopsy detection rate was 59 % (σ = 0.24, n p = 16). The available data were insufficient to estimate the frequency of biopsy procedures in the considered population.

Electrodes or laser fibers used were as follows: 5 unspecified needles attached to a Radionics RF generator; 5 unspecified needles and generator; 4 Cool-Tip (Valleylab); 3 Diomed laser fibers (Cambridge); 2 Soloist (Boston Scientific); 2 Celon ProSurge micro (Celon); 1 UniBlate (AngioDynamics); 1 RFG-3 C Radionics (Tyco Healthcare); 1 Cool-Tip (Century Medical); 1 combined Starburst (RITA)/Cool-Tip (Radionics)/LeVeen (Boston Scientific)/SDE (RITA)/Celon ProSurge micro (Celon) systems; and 1 water-cooled 9F-Power Laser Set (Somatex) plus Starburst (RITA). One investigator (4 %) used a thermo-coupling device for temperature measurement. No substantial differences in outcome were documented between the different needle types.

In 21 (78 %) of the articles, a coaxial approach was used, meaning that the electrode or laser fiber reached the nidus through another device (15 bone needles, two 9F sheaths, one 7F sheath, and 3 spinal needles). A noncoaxial approach was preferred by 4 investigators (15 %). Two articles (7 %) did not specify the approach.

When reported, target ablation temperature in RFA papers (n p = 19) was approximately 90 °C (mean 88.8 °C, σ = 10.3) with a mean ablation time of 6.8 min (σ = 3.2, n p = 22), whereas ILA investigators had a mean ablation time of 9 min (σ = 1.4, n p = 4). Insufficient information was available to express a quantitative reference target for end of ablation during ILA.

Sixteen investigators (59 %) did not specifically prescribe any pain medication immediately after the procedure; 5 (19 %) performed direct or subperiosteal injection of local anesthetic at the end of ablation; and 6 (22 %) recommended generic pain medication for the immediate postinterventional time window.

On average, at least 1-year follow-up data were available for 96 % of treated patients (σ = 0.12, n p = 27). Sixteen investigators (59 %) required imaging follow-up in symptomatic patients only; 4 (15 %) did not follow up with imaging or failed to specify; 1 (4 %) recommended follow-up at 60 months; 1 (4 %) performed magnetic resonance imaging (MRI) the day after treatment; 1 (4 %) performed MRI within 1 week; 1 (4 %) performed X-ray and final CT at 24 months; 1 (4 %) performed MRI and scintigraphy at 6 and 12 months; 1 (4 %) performed CT/X-ray at 6 months; and 1 (4 %) performed X-ray during the follow-up period.

Pooled mean clinical success was 96, 95, 94, and 98 % at time intervals I, II, III, and IV, respectively (interval I: σ = 0.06, n p = 23; interval II: σ = 0.05, n p = 24; interval III: σ = 0.06, n p = 26; interval IV: σ = 0.06, n p = 17) (Figs. 4, 5). Reinterventions numbered 92 (5.2 %, mean 3.5, σ = 4.2, n p = 27). Incomplete treatments numbered 32 (1.8 %, mean 1.2, σ = 1.7, n p = 27). Thirty-seven complications were reported (2.1 %, mean 1.5, σ = 1.5, n p = 27) as follows: 12 cases of skin burn, 5 cases of muscle burn, 4 cases of infection, 3 cases of nerve lesion, 3 cases of tool breakage, 2 cases of delayed skin healing, 2 cases of hematoma, 2 cases of unreachable target temperature, 1 case of fracture, 1 case of pulmonary aspiration, 1 case of cardiac arrest, and 1 case of thrombophlebitis.

Fig. 4 Trends of clinical success during 24-month follow-up

Fig. 5 Anatomical localizations

Recurrences were reported in 86 patients (4.9 %, mean 3.2, σ = 4.5, n p = 27). Because not all investigators reported when the recurrence happened, it was not possible to correlate recurrences with time. Twenty patients underwent surgery during the follow-up period (1 %, mean 0.7, σ = 1, n p = 27) because of pain recurrence or a procedure-related complication.
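The percentages and per-article means quoted for these event counts can be reproduced from the totals given in the text. The sketch below assumes that each percentage is relative to the 1,772 pooled patients and each mean is the event count divided by the 27 articles; the function name is hypothetical. The standard deviations cannot be recomputed this way, since they require the per-article counts.

```python
TOTAL_PATIENTS = 1772  # pooled cohort reported in the review
TOTAL_ARTICLES = 27    # articles contributing data (n p = 27)


def summarize_events(events: int) -> tuple:
    """Return (percentage of all patients, mean events per article), 1 decimal each."""
    percentage = 100 * events / TOTAL_PATIENTS
    mean_per_article = events / TOTAL_ARTICLES
    return round(percentage, 1), round(mean_per_article, 1)


# Event counts taken from the text.
for label, count in [("recurrences", 86), ("incomplete treatments", 32), ("surgery", 20)]:
    percentage, mean = summarize_events(count)
    print(f"{label}: {percentage} % of patients, mean {mean} per article")
```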

After qualitative assessment of the investigators’ considerations in the discussion sections, we quantified them as follows: 6 (22 %) investigators suggested multiple ablations with needle repositioning for large OOs; 4 (15 %) identified prolonged heating at sustained temperature as a relevant factor to prevent recurrences; 2 (7 %) suggested that no weight-bearing should be allowed after the procedure, whereas another 2 (7 %) affirmed the opposite; 1 (4 %) highlighted the use of contrast MRI as a tool to predict recurrences and undertake early retreatment; 1 (4 %) suspected that a misjudgment of the nidus on CT can lead to incorrect needle positioning; and 1 (4 %) noted that recurrences are more frequent in young patients. Nine investigators (33 %) did not provide relevant considerations regarding the causes of pain recurrence.

Risk of bias assessment resulted in 6 of 6-star scores for 25 of 27 articles (Table 3) according to the NOS, whereas 2 articles were given 5 of 6 stars (overall mean = 5.93 stars). No papers were excluded after this assessment.

Table 3 Number of stars awarded to each article for each question after assessment with modified NOS

Discussion

PTA is the treatment of choice for OO. RFA is the most widely adopted technique; ILA is a newer method that has proven equally safe and effective. Recent papers [6, 7] also show promising results of cryoablation as an alternative approach.

The high rate of clinical success is a distinctive feature of this procedure along with the high incidence of OO in the young. Our results confirm, on the largest scale and regardless of OO location, PTA technique, or tools used, that PTA is a curative treatment. In fact, among 1,772 patients, we registered a 90–100 % success rate, which was defined as pain disappearance immediately after the procedure and sustained over the long term. The complication rate was very low (2 %), and most complications were minor. Moreover, most investigators (78 %) freed their patients from daily, long-term medication immediately after ablation, whereas the rest recommended it only for the first days after intervention.

However successful, this technique does not adequately treat a small percentage of patients (5 %), who responded partially or not at all. This was confirmed by the presence of recurrences across all articles considered. Even if small, this subgroup of patients is obliged to undergo retreatment, which ultimately multiplies the risk of complications. Furthermore, some may opt for surgery and thus not profit from the advantages of the percutaneous approach. This was true for 20 patients in our review, who needed surgery primarily because of pain recurrence and secondarily because of a procedure-related complication.

Are Recurrences in Fact Incomplete Treatments?

Of the 5 % of nonresponding patients, only 1.8 % underwent an unsatisfactory procedure. Some investigators may not have adequately investigated the cause of a recurrence; thus, it is reasonable to deem this figure “underestimated.” In this regard, at least one follow-up at 12 months with CT or MRI for all patients seems a judicious, if not mandatory, precaution. However, 74 % of the investigators did not plan any routine imaging follow-up. Nevertheless, this analysis points to a negative answer to the question. Unfortunately, the differences in study design did not allow for statistical comparison. The present investigators believe that a clearer answer to this question is still of interest, both to better understand the behavior of this benign tumor, such as long-term regrowth after therapy, and to highlight errors in differential diagnosis.

Should the Bar of “Technical Satisfaction” Be Raised?

We searched for adjustments to the ablation technique based on the investigators’ own observations. The heterogeneity of the study designs did not lead to conclusive results in this matter: Not all of the investigators confronted the issue of recurrences, and it was not possible to extract pertinent assertions from those papers. Still, many considerations can be made. There is agreement that large osteomas tend to recur if not ablated completely [8–12]. To achieve complete ablation, multiple investigators converged on the need to reposition the needle and perform multiple coagulations [8–10]. In addition, there seems to be a consensus regarding the importance of confirming the needle tip position inside the nidus [11]. In this regard, the majority of investigators (93 %) used GA in some, if not all, of their procedures. There seems to be agreement that GA is needed to avoid undesired movements during ablation, although LR, CS, and, less frequently, even local anesthesia were employed in peripheral locations or cooperative patients. Furthermore, investigators agree that during ablation, a temperature as high as approximately 90 °C maintained for more than 6 min is key to low recurrence rates. In addition, Mahnken and Bruners [13] proposed that contrast-enhanced MRI could be used as a tool to predict early recurrence and thus indicate the need for an early retreatment, hypothetically before recurrence.

Finally, complications also contributed slightly to unsatisfactory treatments. Again, neither consensus data nor statistical evidence can be drawn from the investigators’ conclusions on which measures, if any, should be undertaken or avoided to lower the risk of complications. For example, two articles strongly advised against weight-bearing after lower-limb ablation [14, 15], whereas two other articles expressly labeled the same precaution as not needed [16, 17].

Conclusion

The investigators convincingly proved the superiority of thermal ablation for OO compared with other techniques. Thanks to the high efficacy of this technique, with success rates solidly close to 100 %, this task was achieved with studies of relatively simple design, mainly focused on technical aspects but in some cases lacking meticulous clinical follow-up. Regrettably, when the technique failed, the interventionists were not able to provide “the best external evidence for a specific clinical question” [18] or to adequately investigate the causes of failure.

Considering the 5 % of nonresponding patients, it could be stated that in 95 of 100 treated patients, the interventionist will have proven the great benefits of thermal ablation; to the remaining 5 nonresponders, he or she will not be able to provide a good explanation for failure.

Implications for Practice

No level 1 or 2 evidence can be derived from the examined papers regarding the nature of recurrences in OO ablations or the technical adjustments needed to prevent them. The reported indications are to be considered as deriving directly from investigators’ experience because the dissimilarity in the design of available clinical trials did not allow for consistent statistical analysis. However, a set of level 3 evidence suggestions for the prevention of recurrences and avoidance of complications after thermal ablation of OO could be compiled; it is listed in Table 4 and should be used for practice development.

Table 4 Level III evidence suggestions to avoid complications

Implications for Research

Future clinical trials for OO ablation should consider reporting essential procedure details and follow-up findings to allow for meta-analysis. The main goal should be clear differentiation between recurrences and incomplete treatments, which may help define the long-term behavior of OO. Recommended standards for future reporting, derived from the present study, are listed in Table 5.

Table 5 Guidelines for future reporting