Introduction

Desmoid fibromatosis is a locally aggressive, nonmetastasizing mesenchymal tumor that is often difficult to resect and frequently recurs. The World Health Organization defines desmoid fibromatosis as a benign soft-tissue neoplasm consisting of a clonal proliferation of uniform spindle cells (fibroblasts and myofibroblasts) that produce collagen and grow with infiltrative margins [1]. Although these intermediate-grade tumors lack the potential to metastasize, they may cause significant morbidity because of their infiltrative growth and tendency to recur locally [1, 2].

Traditional management of desmoid fibromatosis has been surgical resection. High rates of local recurrence after excision, reported to be 20–60 % at 5 years [3, 4], can be mitigated with adjuvant radiation [5, 6]. Recent studies have shown that systemic medical treatment is associated with improved outcomes in patients. Hormonal therapy, anti-inflammatory drugs, tyrosine kinase inhibitors, or cytotoxic chemotherapy represent the various systemic medical treatments available [7]. Overall response rates to combination cytotoxic therapy in single-arm studies range between 17 and 100 %, with a median response rate of 50 %. The most commonly used cytotoxic regimens include doxorubicin-based chemotherapy, often combined with dacarbazine, or a combination of methotrexate with a vinca alkaloid [8].

On MRI, active desmoid fibromatosis is often heterogeneously isointense on T1-weighted images, heterogeneously hyperintense on T2-weighted images, and avidly enhances after intravenous gadolinium administration. Characteristic bands of low signal on all sequences are due to the distinctive histology of hypocellularity and abundant collagen, with high T2 signal and enhancement thought to reflect increased cellularity and active disease, whereas low T2 signal denotes collagenization and maturation [9, 10].

Changes in imaging features are often used as a marker for treatment response; however, there is a lack of consensus on what constitutes a good response [11, 12]. Historically, size-based criteria, such as the Response Evaluation Criteria in Solid Tumors (RECIST), have been used to assess treatment response, but such an assessment is insensitive to the biological response observed in some soft-tissue sarcomas [13, 14]. Desmoid fibromatosis has a unique histology in that as it matures or becomes biologically quiescent, it undergoes collagenization, which is reflected by increased T2 hypointensity and decreased enhancement [10]. We hypothesized that these changes might often occur independently of changes in the single largest dimension (Dmax). The purpose of this study is to evaluate the imaging characteristics of extra-abdominal desmoid fibromatoses after systemic therapy, and to determine whether any size- or signal-based MRI features are predictive of a positive clinical outcome.

Materials and methods

Subjects

Institutional Review Board (IRB) approval was obtained for this study; in accordance with the requirements for a retrospective review, the requirement for informed consent was waived. We retrospectively identified 70 consecutive patients with tissue-confirmed desmoid fibromatosis treated with systemic agents at our institution. Exclusion criteria included a lack of both pre- and post-treatment MRI, the absence of clinical follow-up at completion of therapy, or intra-abdominal disease. This resulted in the exclusion of 47 patients and the inclusion of 23 patients with 29 discrete extra-abdominal lesions. All lesions were sporadic; none of the patients carried a diagnosis of familial adenomatous polyposis syndrome.

Systemic therapy regimens

Cytotoxic chemotherapy regimens included methotrexate plus a vinca alkaloid drug (either vinorelbine or vinblastine): 10 patients, with a mean of 317 days of treatment; and a doxorubicin-based regimen (± dacarbazine): 9 patients, mean 135 days. Targeted chemotherapy included 1 patient each on maintenance imatinib (1,644 days), tamoxifen (343 days), and sorafenib (280 days). One patient was treated with nonsteroidal anti-inflammatory drugs (NSAIDs) for 253 days. Mean MRI intervals were 408 and 196 days for the methotrexate/vinca alkaloid and doxorubicin-based regimens respectively. The treating medical oncologist categorized the disease status as progressive, stable, or partial response based on a composite of clinical (symptoms, physical examination) and imaging findings obtained at the time of treatment. Aggressive fibromatosis rarely, if ever, completely responds.

Clinical response

Clinical assessment of response was based on the patient's subjective assessment of pain levels and interference with activities of daily living, together with changes in physical examination findings, including range of motion and tumor firmness on palpation. These evaluations were carried out prospectively during routine clinical care visits and then identified retrospectively. It should be noted that MRIs ordered for response assessment influenced the clinical judgment of response.

MRI evaluation

The latest pre-treatment and earliest post-treatment MRIs were reviewed to avoid confounding by interval tumor growth during periods off therapy or variations attributable to follow-up duration. Technical parameters of MRI varied considerably because this review also included patients referred to our tertiary care center with outside imaging studies (n = 11, DICOM files uploaded directly into the institutional PACS). Fat-suppressed proton-density- or T2-weighted, and fat-suppressed post-contrast T1-weighted sequences were used for the assessment of tumor size and signal characteristics.

Imaging parameters were recorded in consensus by a senior radiology resident and a fellowship-trained musculoskeletal radiologist (with 5 years’ musculoskeletal experience) blinded to clinical response. Maximum tumor diameter was recorded and response characterization was adapted from RECIST 1.1 (Fig. 1) [15], as follows:

Fig. 1

A 47-year-old woman with desmoid fibromatosis of the right chest wall. Axial T1 post-contrast images of the chest a before and b after treatment with methotrexate/vinblastine show a 35 % decrease in maximum diameter (11.3 to 7.3 cm; arrows), which meets RECIST 1.1 criteria for a partial response to therapy

  • Progressive disease (PD): at least a 5-mm absolute and at least a 20 % increase in diameter of the target lesion compared with baseline, or the appearance of new lesions

  • Stable disease (SD): neither sufficient shrinkage to qualify for partial response nor a sufficient increase to qualify for progressive disease

  • Partial response (PR): at least a 30 % decrease in the longest diameter of the target lesion compared with the baseline

  • Complete response (CR): disappearance of all target lesions
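
The categories above can be sketched as a small per-lesion classifier. This is a minimal illustration rather than the software used in the study: the function name `classify_recist_lesion`, its parameters, and the use of millimeters are assumptions, and the rules simply mirror the bullets above, with each lesion compared against its own baseline.

```python
def classify_recist_lesion(baseline_mm, followup_mm, new_lesions=False):
    """Classify one lesion with the per-lesion adaptation of RECIST 1.1.

    Hypothetical helper: diameters are the longest diameters in millimeters,
    and the comparison is always made against the baseline measurement.
    """
    if baseline_mm <= 0:
        raise ValueError("baseline diameter must be positive")

    # Appearance of new lesions counts as progression regardless of size.
    if new_lesions:
        return "PD"

    # Complete response: the target lesion has disappeared.
    if followup_mm == 0:
        return "CR"

    absolute_change = followup_mm - baseline_mm
    relative_change = absolute_change / baseline_mm

    # Progressive disease needs BOTH a >= 20 % and a >= 5 mm increase.
    if relative_change >= 0.20 and absolute_change >= 5:
        return "PD"

    # Partial response: at least a 30 % decrease in the longest diameter.
    if relative_change <= -0.30:
        return "PR"

    # Everything in between is stable disease.
    return "SD"
```

Applied to the lesion in Fig. 1 (11.3 cm to 7.3 cm, i.e. `classify_recist_lesion(113, 73)`), the sketch returns a partial response, in agreement with the caption.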

We deviated from strict methodology in which maximum lesion diameters are summed, and treated each lesion separately, given our small sample size and our interest in tracking individual tumor responses within the same patient. The subset of patients with multiple lesions was included in a subanalysis where the sum of maximum tumor diameters was used, as outlined in RECIST 1.1. Maximum orthogonal diameters were measured to calculate an approximate tumor volume for an ellipsoid given by the equation below:

$$ \mathrm{Volume}\approx \frac{1}{2}{D}_1{D}_2{D}_3 $$

This approximation is analogous to the 3D Children’s Oncology Group (3D COG) measurement recently shown to closely correlate with true volumetric calculations based on labor-intensive tumor segmentation [16]. Previous work by Mozley et al. showed a 12.4 % standard deviation for volumetric tumor assessments, and we adopted their suggestion that ±2 SDs be used to define partial response (−25 %) or progressive disease (+25 %) [17].
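
As a rough illustration of the volumetric approach, the sketch below computes the approximate volume from three orthogonal diameters and applies the ±25 % thresholds. The function names, the use of centimeters (so that the product is in milliliters), and the inclusive treatment of the thresholds are assumptions made for illustration; the factor of 1/2 stands in for the exact ellipsoid coefficient π/6 ≈ 0.524.

```python
import math

# Exact coefficient for an ellipsoid measured by its three full diameters;
# the text rounds this (about 0.524) down to 1/2.
EXACT_ELLIPSOID_COEFFICIENT = math.pi / 6


def approx_volume_ml(d1_cm, d2_cm, d3_cm):
    """Approximate tumor volume in ml from three orthogonal diameters in cm."""
    return 0.5 * d1_cm * d2_cm * d3_cm


def classify_volumetric(baseline_ml, followup_ml, threshold=0.25):
    """Classify volumetric response using a symmetric relative threshold.

    Assumption: a change exactly equal to the threshold counts as PR or PD.
    """
    if baseline_ml <= 0:
        raise ValueError("baseline volume must be positive")
    relative_change = (followup_ml - baseline_ml) / baseline_ml
    if relative_change <= -threshold:
        return "PR"
    if relative_change >= threshold:
        return "PD"
    return "SD"
```

For example, a hypothetical lesion measuring 10 × 5 × 4 cm has an approximate volume of 100 ml; a follow-up volume of 75 ml or less would then count as a partial response, and 125 ml or more as progressive disease.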

Finally, quantitative T2 hyperintensity and contrast enhancement values were calculated by drawing the largest possible circular region of interest (ROI) fully circumscribed by the tumor, using the image where maximum tumor diameter was observed. The mean signal intensity within this ROI was then normalized to the mean signal intensity in adjacent muscle, so that a tumor:muscle T2 hyperintensity ratio was obtained, similar to the modified Choi technique described by Stacchiotti et al. [13].
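
The normalization step can be sketched as follows. This is a simplified illustration only: the image is represented as a plain list of pixel rows, the ROI center and radius are supplied by the reader rather than derived from the tumor outline, and the helper names `circular_roi_values` and `tumor_to_muscle_ratio` are hypothetical.

```python
from statistics import mean


def circular_roi_values(image, center_row, center_col, radius):
    """Collect pixel values whose centers fall inside a circular ROI.

    `image` is a 2D list of signal intensities (rows of pixels).
    """
    values = []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2:
                values.append(value)
    return values


def tumor_to_muscle_ratio(tumor_values, muscle_values):
    """Mean tumor signal normalized to mean signal in adjacent muscle."""
    muscle_mean = mean(muscle_values)
    if muscle_mean == 0:
        raise ValueError("muscle reference signal must be non-zero")
    return mean(tumor_values) / muscle_mean
```

The same function would be applied to the T2-weighted (or proton-density-weighted) and the post-contrast T1-weighted images to obtain the two ratios used in the analysis.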

Statistical analysis

Statistical analysis was performed using Stata 13.1 (StataCorp, College Station, TX, USA). Tumor measurements and signal ratios from the pre-treatment and post-treatment examinations were compared using the paired t test for same-subject longitudinal comparisons, and the two-sample t test and ANOVA for comparing means among response groups. Changes in tumor size (unidimensional and volumetric) and signal (T2 hyperintensity and contrast enhancement ratios) were correlated using Pearson’s correlation coefficient (r); according to Evans’ convention, Pearson’s correlation r < 0.20 is very weak, 0.20–0.39 is weak, 0.40–0.59 is moderate, 0.60–0.79 is strong, and ≥ 0.80 is very strong [18]. Because the distribution of ratios is often skewed (decreases are restricted between 0 and 1, whereas increases are unbounded), a logarithmic transformation using the natural logarithm was performed for tumor:muscle signal ratios of T2 hyperintensity and contrast enhancement to normalize the variable distribution and facilitate parametric statistical analysis. The change in signal parameters after treatment was expressed as a ratio of post-:pre-treatment values. Subanalysis of treatment-specific changes in imaging features between cytotoxic chemotherapy regimens was carried out. For all analyses, results were considered statistically significant at p < 0.05.
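
Two of the steps described above lend themselves to a short sketch: the log transformation of the post-:pre-treatment signal ratio, and the verbal labels of Evans' convention. The function names are hypothetical, and applying the labels to the absolute value of r is an assumption; the study itself was analyzed in Stata, not with this code.

```python
import math


def log_signal_change(pre_ratio, post_ratio):
    """Natural log of the post-:pre-treatment tumor:muscle signal ratio.

    Negative values mean the lesion became less hyperintense relative to
    muscle; a halving and a doubling are symmetric around zero.
    """
    if pre_ratio <= 0 or post_ratio <= 0:
        raise ValueError("signal ratios must be positive")
    return math.log(post_ratio / pre_ratio)


def evans_strength(r):
    """Verbal label for a Pearson correlation, following Evans' convention."""
    magnitude = abs(r)  # assumption: sign is ignored when grading strength
    if magnitude < 0.20:
        return "very weak"
    if magnitude < 0.40:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    if magnitude < 0.80:
        return "strong"
    return "very strong"
```

The symmetry is the reason for the transformation: on the raw scale a halving is a change of −50 % but a doubling is +100 %, whereas on the log scale they are −0.693 and +0.693.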

Results

Subjects

The patients’ mean age was 40.7 years (range 19–83 years); there were 8 men and 15 women. Ten patients had undergone surgical treatment of lesions with subsequent recurrence before the start of chemotherapy. The locations of the 29 tumors included 7 in the thigh or gluteal region, 8 in the lower leg, 5 in the upper extremity, and 9 in the extra-abdominal trunk. Three lesions (in 2 patients) had clinical progression; 5 lesions (in 4 patients) were judged to be clinically stable disease, whereas 21 lesions (in 17 patients) had a clinical response.

Size-based response

Table 1 summarizes the results of maximum diameter and volumetric changes according to lesion response. For a unidimensional response, only 2 lesions showed the decrease of at least 30 % in maximum diameter necessary to be classified as a partial response by RECIST 1.1; the other 27 out of 29 lesions would be classified as stable (mean change −6.8 %). In a subanalysis of the 5 patients with multiple lesions (n = 11), the sum of the longest diameters of individual lesions demonstrated stability in all patients by RECIST 1.1 criteria (mean −9.0 %, range: −18.3 % to +3.0 %). Differences in maximum tumor diameter change among the three clinical response categories did not reach statistical significance (p = 0.11, ANOVA) (Fig. 2a). When clinical response was dichotomized to PR/SD vs PD, PD demonstrated a mean change of −0.33 % in maximum diameter vs −9.6 % for PR/SD (p = 0.25, t test). Importantly, whether lesion maximum diameters were treated individually, or summed when in the same patient, the application of RECIST criteria failed to detect clinically apparent deteriorating disease.

Table 1 Size changes during treatment according to clinical response
Fig. 2

Box plots show a the relative change in maximum diameter (Dmax) and b the relative change in volume for lesions stratified by clinical response. Although the change in Dmax was significant for partial responders, this assessment failed to discriminate those with progressive disease, and the overall differences among response groups were not significant for Dmax (p = 0.11, ANOVA). For volumetric measurements, lesions classified as having a positive clinical response (n = 21, mean −29 %) were significantly different from those classified as either stable (−19 %) or progressive (+33 %), p = 0.002 (ANOVA). Each box defines the 75th and 25th percentiles, with the median represented by the horizontal line within the box; the upper whisker marks the largest value ≤ upper quartile + 1.5 × interquartile range; the lower whisker marks the lowest value ≥ lower quartile − 1.5 × interquartile range. Small points represent outliers beyond these ranges

Volumetric assessment more closely aligned with clinical judgments of response; positive responders averaged 132 ml (SD = 188 ml) before, and 90 ml (SD = 121 ml) after treatment. More modest decreases were observed in the group of clinically stable lesions, whereas those lesions that progressed showed an average increase in volume of almost 33 %. Figure 2b shows that the volumetric change in lesions that responded positively (−29.4 %) differed significantly from that in lesions that were stable (−19.2 %) or progressive (+32.5 %), p = 0.002 (ANOVA). Change in maximum diameter strongly correlated with change in tumor volume (Pearson’s r = 0.66); Fig. 3 highlights a counter-example to this trend.

Fig. 3

A 76-year-old woman with posterior calf desmoid fibromatosis; axial fat-suppressed proton-density-weighted images a before and b after treatment with methotrexate/navelbine. Maximum tumor diameter decreased by only 2 % (black and white arrows), but the approximate volume decreased by 43 % because most tumor shrinkage occurred on a minor anterior–posterior axis

When clinical response was dichotomized, PR/SD demonstrated a statistically significant volumetric decrease (−27.4 ± 23.6 %) vs PD (+32.5 ± 32.4 %, p < 0.001, t test). With the threshold set at a 25 % volumetric change, 14 lesions showed PR, of which 12 were deemed clinically responsive; 13 showed stable disease, of which 3 were judged clinically stable and 1 clinically progressive. Finally, both tumors that reached the +25 % 3D threshold for PD were also judged clinically to have progressive disease. Numbers of subjects categorized as having PR, SD, or PD according to RECIST 1.1, volumetric analysis, and clinical judgment are summarized in Table 2.

Table 2 Number of subjects with response characteristics based on RECIST 1.1 criteria, 3D volumetric approximation, and global clinical assessment

Signal-based response

Figures 4 and 5 show examples of change in T2 hyperintensity and contrast enhancement respectively, with relatively little change in tumor size. Changes in T2 hyperintensity and contrast enhancement among response groups are depicted in Fig. 6. Among responders, the mean difference in the T2 hyperintensity ratio was −0.688, or a 50 % decrease after treatment (range −83 % to +59 %). Patients with stable disease showed a similar mean reduction in T2 hyperintensity (54 %); however, the three lesions with progressive disease demonstrated a mean difference of −0.103, or only an approximately 10 % decrease (range −44 % to +28 %). When dichotomized to positive response/stable disease vs progressive disease, these differences were statistically significant (p = 0.049, t test).
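
Assuming the mean differences quoted above are differences in the natural-log-transformed tumor:muscle ratios (as described under Statistical analysis), they can be converted back to relative changes by exponentiation:

$$ e^{-0.688}\approx 0.50\quad (\text{a decrease of about } 50\,\%),\qquad e^{-0.103}\approx 0.90\quad (\text{a decrease of about } 10\,\%) $$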

Fig. 4

A 31-year-old woman with desmoid fibromatosis of the left gluteal region shows a decrease in T2 hyperintensity after treatment. Axial proton-density-weighted, fat-saturated images of the left gluteus region demonstrating a pre- and b post-treatment changes to the tumor after chemotherapy with methotrexate/navelbine. Note that the maximum dimension is unchanged (white arrows); however, the substantial decrease in T2 hyperintensity laterally (asterisk) indicates increased collagenization and response to therapy

Fig. 5

A 51-year-old woman with desmoid fibromatosis of the right forearm. Axial T1-weighted fat-saturated post-contrast images a pre- and b post-treatment with methotrexate/vinblastine demonstrate decreased overall contrast enhancement (arrows), suggestive of partial response. Again, note the relative lack of change in the size of this lesion

Fig. 6

Box plot shows changes in T2 hyperintensity and contrast enhancement after treatment. Signal intensity ratios were logarithmically transformed to facilitate statistical analysis; negative values indicate decreased lesional T2 hyperintensity or enhancement (relative to muscle). There was a weak trend toward decreased T2 hyperintensity in lesions that responded, but this did not reach statistical significance (p = 0.14, ANOVA); there was no difference in contrast enhancement among response groups (p = 0.57)

Results for contrast enhancement ratios were broadly similar, with responders showing the greatest decrease (mean ± SD difference of −0.256 ± 0.45, corresponding to a mean decrease in signal ratio of 23 % ± 57 %) compared with those with progressive disease where contrast enhancement was essentially unchanged (mean ± SD difference of 0.0096 ± 0.21, corresponding to a mean decrease in signal ratio of 0 % ± 23 % between pre- and post-treatment values). However, these results were not statistically significant after the response was dichotomized as above (p = 0.37, t test).

Size and signal response comparisons

T2 hyperintensity and contrast enhancement changes correlated strongly with each other (r = 0.75). T2 hyperintensity change correlated moderately with volumetric change (r = 0.40), but weakly with change in maximum diameter (r = 0.25). Contrast enhancement change correlated weakly with volumetric change (r = 0.26) and very weakly with unidimensional change (r = −0.07). Neither T2 signal intensity nor enhancement ratios were significantly different among RECIST categories: the mean T2 signal ratio decreased 47 % in SD (n = 27) vs 49 % in PR (n = 2) (p = 0.92), and the mean enhancement ratio decreased 18 % in SD (n = 23, as not all scans showed contrast enhancement) vs 17 % in PR (n = 2; p = 0.95). However, the value of examining signal changes becomes more apparent when changes in T2 hyperintensity are plotted against changes in tumor volume (Fig. 7): lesions with progressive disease exhibited both increased volume and little change or an increase in T2 hyperintensity, whereas 18 out of 21 PR lesions showed regression not only in volume, but also in signal intensity.

Fig. 7

Plot of change in T2 hyperintensity (as in pre-treatment/post-treatment T2 hyperintensity ratios) vs percentage change in tumor volume; “0” indicates stability on both the x-axis and the y-axis. The plot demonstrates that those lesions deemed progressive lie in the upper outer quadrant (shaded region), reflecting both increased volume and stable/slightly decreased T2 ratios. Moreover, the plot highlights that 18 out of 21 lesions with PR and 3 out of 5 lesions with stable disease cluster in the lower inner quadrant, reflecting both decreased size and signal, suggesting that a multiparametric approach to desmoid characterization could outperform conventional unidimensional criteria in stratifying therapeutic response

Regimen-specific response

Imaging parameters between chemotherapy regimens only differed with regard to change in T2 hyperintensity, with the methotrexate/navelbine group showing a mean relative decrease in T2 signal of 58 %, compared with doxorubicin ± dacarbazine, which showed a mean decrease of 29 % (p = 0.01, t test). Otherwise, differences were insignificant for a change in maximum tumor diameter (−12.3 % vs −7.3 %, p = 0.40, t test), tumor volume (−35 % vs −19 %, p = 0.17), and contrast enhancement (−28 % vs −10 %, p = 0.27, t test). However, owing to the limited number of subjects in each group, no attempt was made to adjust for confounders, such as previous treatment failures in this analysis.

Discussion

Despite having no metastatic potential, desmoid fibromatosis has long posed treatment dilemmas because of the morbidity that arises from local aggressive behavior. As yet, there is no consensus on the optimal management of these lesions, partly because of their unpredictable clinical behavior and location-dependent potential for morbidity [19–21]. Surgical excision as the first-line treatment is falling out of favor because of the high rates of local recurrence [22], and evidence that in many cases eventual tumor maturation and quiescence [23], or even spontaneous regression [24], may justify a cautious observational approach. However, because it remains difficult to predict which tumors follow an indolent course or recur after surgery [25], nonsurgical treatments that promote quiescence or regression are critical to avoiding clinical scenarios where watchful-waiting permits unchecked tumor growth and rising morbidity.

Our results show that decreases in tumor volume and T2 hyperintensity reflect the positive response of desmoid fibromatosis to systemic therapy. Importantly, the magnitude of these radiological changes suggests that they could potentially outperform the unidimensional-based standards for determining treatment response, as laid out in RECIST 1.1. Moreover, changes in T2 signal and enhancement patterns occur relatively independently of change in tumor size, suggesting that both morphological and functional features might be used to fully characterize imaging response.

The RECIST criteria were introduced in 2000 to standardize and simplify the assessment of tumor response, and were revised in 2009 as RECIST version 1.1. To qualify as partial response, there must be a ≥30 % decrease in the sum of the longest diameters of target lesions compared with baseline [15]. However, for some tumors these criteria are neither sensitive nor accurate for detecting treatment response. For example, Canter et al. reported that 0 out of 25 soft-tissue sarcomas treated with neoadjuvant radiotherapy showed any response according to RECIST [26]. Desmoid fibromatosis behaves similarly, exhibiting diminished tumor volume with treatment, despite relative constancy in maximum diameter. If maximum diameter is to be used as a response criterion, our data show that revising one-dimensional thresholds downward is probably justified for partial response, although Dmax still shows limited capacity to effectively discriminate tumor burden increases indicative of progressive disease (Fig. 2a).

A more comprehensive imaging assessment should also include features of internal signal change, particularly with desmoid fibromatosis. Choi et al. developed the prototypical example of this kind of analysis, and showed that a decrease in size of more than 10 % or a decrease in Hounsfield units of greater than 15 % had a sensitivity of 97 % and specificity of 100 % in detecting a good response in gastrointestinal stromal tumors, whereas RECIST substantially underestimated the response [14]. Stacchiotti et al. used modified Choi criteria to achieve much higher sensitivity than RECIST in predicting a good (88 % vs 32 %) and very good (82 % vs 41 %) histological response in high-grade soft-tissue sarcomas [13]. Their methodology was based on manually constructing ROIs around entire tumor volumes, and measuring the average tumor signal intensity relative to muscle signal intensity as a reference; it is unclear whether they accounted for differences in ROI area when computing average signal values. To avoid that problem, we adapted their methodology and selected the single slice where maximum tumor diameter was observed to construct the ROI. However, we acknowledge that complete tumor segmentation would likely allow more reproducible assessment of size and signal changes, better reflect desmoid morphology, which often deviates from that of a true ellipsoid, and would be particularly well-suited to addressing the degree of stromal heterogeneity through calculations of first- or even second-order texture analysis.

Our results echo these data in confirming the poor sensitivity of RECIST in detecting early, favorable biological changes induced or at least accelerated in desmoid fibromatosis by our systemic treatment protocols. Although RECIST thresholds were chosen in part to minimize type 1 errors and increase the reproducibility of response categorization, in desmoid tumors, RECIST implementation risks an excess of type 2 errors, i.e., failing to identify a treatment effect when one actually exists. We found that only two tumors would have been classified as exhibiting a PR, despite the fact that most tumors showed substantial decreases in volume and, in aggregate, showed significant decreases in T2 signal and contrast enhancement. These patterns of MRI signal change reflect increasing collagenization in the tumor [27], and our results contrast with those of Castellazzi et al., who reported relative stability in size and MRI signal characteristics of desmoid fibromatosis over time, regardless of systemic treatment [12]. However, this may be partly due to a higher proportion of lesions subjected to cytotoxic chemotherapy (24 out of 29 in our study compared with 18 out of 29 in the Castellazzi study) and the fact that ordinal RECIST categories were used in that study, rather than the continuous data we report here.

Among the oncologist’s armamentarium for desmoid fibromatosis are NSAIDs, tamoxifen, tyrosine kinase inhibitors, and chemotherapy. A full review of the results of previous trials with these agents is beyond the scope of this article, but Lev et al. recently described high rates of at least a PR to both tamoxifen and chemotherapy [28]. Treatment with imatinib has been shown in several phase II trials to result in a median time to progression of 9–25 months [29]. Recently identified molecular markers may help to direct individualized treatment regimens to maximize therapeutic response, although such information would require tissue sampling. Higher nuclear expression of beta-catenin, and specifically the S45F point mutation, inversely correlated with meloxicam (a selective COX-2 inhibitor) efficacy [30]. The advent of these therapies demands an objective, non-invasive means of assessing response, for which MRI is well-suited. For example, several trials of systemic therapy have relied on primary endpoints defined by RECIST [29, 30]. The results reported here, in highlighting inherent limitations in the RECIST 1.1 evaluation of desmoid fibromatosis, could have an impact on both clinical management and trial protocols, as the underestimation of regimen efficacy could result in the premature and unjustified termination or alteration of a treatment strategy.

We acknowledge several limitations with the current study, foremost among them being the difficulty in establishing a clinical response independent of imaging findings in this retrospective study. Unlike malignancies for which histological necrosis in the resected tumor specimen can provide a reference standard, desmoid tumors in this patient population were not resected after therapy. Additionally, size alone may not be a good predictor of symptomatology and morbidity owing to patient-specific factors, making it exceedingly challenging to evaluate radiological findings in isolation. We believe that the clinical assessment used here based on a composite of radiological and physical examination findings, and patient signs/symptoms, was the best endpoint available within the limitations of this retrospective study design. Ideally, a comprehensive clinical instrument should be developed for measuring subjective and objective aspects of tumor response and would include a patient-reported survey (e.g., the Short Form 36 Health Survey, SF-36 [31]), physical examination findings (e.g., Musculoskeletal Tumor Society [MSTS] score), and objective imaging findings.

There is a strong selection bias in the subjects included in this study, who were referred to a tertiary care center, frequently because of multiple failures with previous treatments. It is possible that the responses observed here underestimate the magnitude of treatment changes that might be seen if more treatment-naïve patients were to be included. Moreover, significant reductions in tumor burden may continue months or years after systemic therapy has been completed, leading to underestimation of the efficacy in this study, where the response interval was defined more narrowly by the dates of therapy administration [32]. Only three lesions progressed; thus, our observations as they pertain to disease progression should be regarded as provisional. Future work should focus on serial evaluations and surveillance to determine the clinical (e.g., CTNNB1 or APC mutation status) and true volumetric imaging features most predictive of durable response. Moreover, incorporation of advanced MRI techniques, such as diffusion-weighted imaging and dynamic contrast-enhanced MRI [33] would provide additional quantitative parameters for tumor characterization.

Conclusion

Desmoid fibromatosis response to systemic therapy is better characterized by changes in volume and T2 hyperintensity than by maximum dimension alone, as only 2 out of 29 lesions qualified as exhibiting a partial response according to RECIST 1.1. Decreases observed in T2 hyperintensity and contrast enhancement occurred relatively independently of changes in size. These findings indicate the necessity of a multiparametric approach to response evaluation in the setting of systemic therapy for desmoid fibromatosis.