Introduction

Soft-tissue sarcomas are malignancies of mesenchymal origin and are exceedingly rare [1]. In 2010, it was estimated that 10,520 new cases of soft-tissue sarcoma were diagnosed and 3,920 patients died in the United States [2]. The rarity of these neoplasms has made it difficult to determine predictive factors of post-surgical survival time in high-grade soft-tissue sarcoma. However, known risk factors for post-surgical mortality include both the histology of the lesion and the presence of metastasis before or after the time of surgical resection [1]. While there have been great strides in improving outcomes for some sarcomas, e.g. osteosarcoma, there has unfortunately been less progress in the medical treatment of soft-tissue sarcoma [3, 4]. A complex interplay of factors related to tumor biology, intervention measures, and individual patient characteristics determines overall prognosis in soft-tissue sarcoma. Because high-grade soft-tissue sarcomas currently have a 40–60% incidence of metastatic disease and a similarly high rate of mortality, research interest in this field has intensified [5–7]. Improved treatments and outcomes for these cancers can likely be achieved with more accurate identification of prognostic variables. Multiple prognostic variables have been reported in the literature and include clinical and intervention factors such as age, gender, symptoms at presentation, anatomic location, histology, treatment course, and surgery type [8–18]. Many factors predict endpoints such as metastasis, local recurrence, death, and disease-free survival. Tumor grade [5, 6, 10], tumor depth [5, 6, 8, 9], tumor size [5–10], and local recurrence [5, 6, 19, 20] have been found to be predictors of the development of distant metastasis.
Causal relationships and factors predicting local recurrence have been more difficult to establish but, depending on the study, include microscopic resection margins [5–7, 10, 15, 19], grade [5–7, 10, 15, 21], recurrence at presentation [5, 10, 14], age [5, 10, 14], peri-operative blood transfusions [22], and location in the body [5, 14–16, 23].

Although these data are generally accepted as risk factors, other studies have shown them to be insignificant [11, 20, 24–26]. Larger tumor size [5, 6, 8–10, 13, 14], tumor grade [8, 10–12], advanced patient age [6, 14–16], and inadequate margins [6, 12, 13, 16, 18, 27] have consistently been inversely correlated with overall survival. Although also somewhat controversial, biopsy at a facility other than the treating facility has been shown to correlate with poorer patient outcomes [28, 29].

While it is certainly true that many of these factors significantly contribute to a patient’s risk for post-surgical mortality, these and the effects of other factors (such as the patient’s gender, race, age, pre-operative or post-operative radiation, pre-operative or post-operative chemotherapy, positive or negative surgical margins, location of the tumor, estimated blood loss, and the need for an operative blood transfusion) should be re-examined to better identify which, if any, are most predictive of post-resection survival. Data were collected from 129 patients diagnosed with high-grade soft-tissue sarcomas at our institution between February 2002 and June 2010 in order to investigate the effects of these variables on post-surgical survival. Each patient underwent surgical removal of a malignant high-grade soft-tissue sarcoma and was followed over time with the primary endpoint being death related to high-grade soft-tissue sarcoma. The goal of this project is to use these data to determine the best predictors of survival time and to develop a Cox proportional hazards model for predicting post-surgical survival time for high-grade sarcoma patients from the time of surgical excision.

Materials and methods

This retrospective evaluation was IRB approved (Protocol #2010C0011). The data used for this project were compiled from 129 patients admitted to our institution for surgery to remove a high-grade soft tissue sarcoma. At the time of surgery, specific patient information and histopathologic findings were recorded, and each patient was followed after their surgery with the primary endpoint being tumor related death. Summary demographics are shown in Table 1. A total of 118 patients was included in our final survival analysis. Eleven patients were excluded. Seven of the 11 patients were removed from analysis because information pertaining to the patient’s pre-operative or post-operative radiation and chemotherapy could not be ascertained from the medical records. Three of the eleven patients were removed from analysis because a local recurrence had occurred prior to the patient’s original surgery at our institution, meaning the patient’s primary sarcoma resection was not performed at our institution. The final patient was removed from analysis because an estimated blood loss value could not be found from the operative report or in the medical records.

Table 1 Patient demographics and clinical data

The outcome of interest in the statistical analysis of these data was time to sarcoma-related death following surgery, survtime, which was coded as a continuous variable in number of days. Because censoring is central to time-to-event analysis, it was specified by the variable death, which indicated whether the patient died from a sarcoma-related cause: 1 if yes, 0 otherwise. Thirteen variables were considered for this study as prognostic factors of time to sarcoma-related death: pre-operative radiation therapy (preoprad), postoperative radiation therapy (postoprad), pre-operative chemotherapy (preopchemo), postoperative chemotherapy (postopchemo), patient age (age), patient gender (sex), patient race (race), pre-operative metastatic disease (pres_mets), tumor size (size), postoperative surgical margin (margin), tumor location (location), operative blood transfusion (op_units), and estimated blood loss (ebl). Because the outcome variable is a continuous variable involving survival time, time-to-event analysis (survival analysis) was used to determine whether these variables could be related to survival time or sarcoma-related death.

In order to determine which of the 13 variables provide the best prediction of survival time, we built a Cox Proportional Hazards model using the forward selection technique. This analysis was performed using STATA 9.2 statistical software. Before building the model, we considered the proportional hazards assumption for all of the variables using the Likelihood Ratio Test (LR Test).

The significance level used was α = 0.05. After testing the proportional hazards assumption for each variable, we first built a Cox Regression model by considering a univariate model for locrtime and survtime (or bivariate model if the covariate is time-dependent as discovered by testing the proportional hazards assumption). We ran a Cox model for each of the 13 predictor variables taken individually.
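The forward selection procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the STATA workflow used in the study: the function names `forward_select` and `quoted_aic` are hypothetical, the model-fitting step is abstracted into a scoring function, and any model whose AIC is not quoted in the text is scored as infinity (that is, treated as offering no improvement).

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(candidates: Sequence[str],
                   aic_of: Callable[[Tuple[str, ...]], float]) -> Tuple[List[str], float]:
    """Greedy forward selection: at each step add the single variable that
    gives the lowest AIC, and stop when no addition lowers the AIC."""
    selected: List[str] = []
    remaining = list(candidates)
    best_aic = aic_of(tuple(selected))  # score of the model with no covariates
    while remaining:
        # Score every one-variable extension of the current model.
        step_aic, step_var = min((aic_of(tuple(selected + [v])), v) for v in remaining)
        if step_aic >= best_aic:
            break  # no candidate improves the fit
        selected.append(step_var)
        remaining.remove(step_var)
        best_aic = step_aic
    return selected, best_aic


def quoted_aic(variables: Tuple[str, ...]) -> float:
    """Hypothetical stand-in for fitting a Cox model. Only the two AIC values
    quoted in the Results are used; all other models are ASSUMED unscored."""
    quoted = {("size",): 341.2038, ("size", "pres_mets"): 335.134}
    return quoted.get(variables, float("inf"))


chosen, final_aic = forward_select(["age", "size", "pres_mets", "margin"], quoted_aic)
print(chosen, final_aic)
```

In practice `aic_of` would fit a Cox model on the listed covariates and return its AIC; separating the search loop from the scoring function keeps the selection logic independent of the statistical package.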

After testing and validating our final model, we generated survival curves and interpreted the relative risks for each risk factor using its hazard ratio. The coefficient for each risk factor was calculated by taking the natural logarithm of that variable’s hazard ratio. We then multiplied each coefficient by its respective increase in value from the final model and summed the products. Taking the antilog of this sum yielded the relative risk for a given combination of risk factors.

Results

The outcome of interest is time to sarcoma-related death following surgery. Forty-two patients (35.59%) died from a sarcoma-related cause, while the remaining 76 (64.41%) patients were censored because they were alive at the end of follow-up. Summary statistics for the continuous variables and for the dichotomous and categorical variables are provided in Tables 2 and 3, respectively.

Table 2 Summary statistics for the continuous variables
Table 3 Summary statistics for the dichotomous and categorical variables

At the start of the forward selection technique in building the main effects model, a univariate Cox model was run individually for each of the 13 risk factor variables. Of the 13 variables, size yielded the lowest AIC value of 341.2038 (LR χ2 = 13.21, P = 0.0003). After incorporating size into the model and analyzing the remaining 12 variables in a bivariate model, the addition of the variable pres_mets gave the lowest AIC value of 335.134 (LR χ2 = 21.28, P < 0.0001). Further Cox analysis of trivariate models containing size, pres_mets, and each of the remaining eleven variables did not lower the AIC value. The main effects model thus contained the following variables: size and pres_mets. The other eleven variables provided either collinear information to variables already included in the model or no statistically significant improvement of prediction for survival time.
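The reported test statistics can be cross-checked with a short calculation. The sketch below assumes that each likelihood ratio statistic compares the fitted model with the model containing no covariates, with one degree of freedom per covariate (1 for size alone, 2 for size plus pres_mets); the text does not state the degrees of freedom, so this is an assumption. The helper `chi2_sf` is a hypothetical name and handles only those two cases.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic.
    Closed forms exist for df = 1 and df = 2; other df are not handled here."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# P values for the two likelihood ratio statistics quoted in the text.
p_size = chi2_sf(13.21, 1)   # size alone (df ASSUMED to be 1)
p_both = chi2_sf(21.28, 2)   # size + pres_mets (df ASSUMED to be 2)

# AIC = -2 log L + 2k, so adding one parameter changes the AIC by
# -(increase in LR statistic) + 2. The reported AIC drop should therefore
# match the LR increase minus 2.
aic_drop_reported = 341.2038 - 335.134
aic_drop_implied = (21.28 - 13.21) - 2 * (2 - 1)

print(f"p_size = {p_size:.4f}, p_both = {p_both:.6f}")
print(f"AIC drop reported = {aic_drop_reported:.2f}, implied = {aic_drop_implied:.2f}")
```

Under these assumptions the AIC values and the likelihood ratio statistics are mutually consistent, and the second P value is small enough that it would print as 0.0000 at four decimal places.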

After obtaining the main effects model, interaction between these variables was then investigated. Only one plausible interaction variable could be created from the two variables of the main effects model: the interaction between size and pres_mets. The biological plausibility of this interaction lies in the fact that larger tumors are more likely to metastasize [5–8, 12, 26]. This interaction term was named size × pres_mets and was added to the main effects model. It was found that this interaction variable did not yield a lower AIC value than the main effects model. Thus, the interaction variable was not retained in the model.

The final model using Cox proportional hazards calculation for the prediction of survtime included the non-time-dependent variables size and pres_mets. The equation for our predictive model was:

$$ \ln \left[ {h\left( t \right)/h_{0} \left( t \right)} \right] = \beta_{1} x_{1} + \beta_{2} x_{2} + \varepsilon_{0} $$

where β1 and x1 are the coefficient and indicator variable of tumor size (greater or less than 8 cm in any cross section), β2 and x2 are the coefficient and indicator variable of whether metastasis was present at the time of surgery, and ε0 is the random error contained within the model. This equation models the log of the hazard at time t, relative to the baseline hazard h0(t), given a set of variables xi. Variable parameter estimates and the corresponding hazard ratios are presented in Table 4. Substitution of these values into the equation gives:

Table 4 Parameter estimates and hazard ratios for all model variables
$$ \ln \left[ {h\left( t \right) \, / \, h_{0} \left( t \right)} \right] \, = \, 1.147 \times {\text{size }} + \, 1.244 \times {\text{pres\_mets}} $$

This model can be used to evaluate the hazard ratio between any two individuals with a high-grade soft tissue sarcoma based on differences in tumor size and presence of pre-surgical metastatic disease. The hazard ratio can be obtained by taking exp(Σj βjΔj), where βj represents the coefficient of variable j and Δj represents the unit difference between the two individuals for that variable. These results are summarized in Table 4.
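The origin of this formula can be made explicit by writing the fitted log-hazard equation for two individuals and subtracting. The subscripts a and b are notation introduced here for illustration only, and the step assumes the fitted model contains no terms beyond those shown above:

$$ \ln \left[ \frac{h_{a}(t)}{h_{b}(t)} \right] = \sum_{j} \beta_{j} \left( x_{j,a} - x_{j,b} \right) = \sum_{j} \beta_{j} \Delta_{j} , \qquad \frac{h_{a}(t)}{h_{b}(t)} = e^{\sum_{j} \beta_{j} \Delta_{j}} $$

The baseline hazard h0(t) cancels in the subtraction, so the ratio does not depend on t. This is the proportional hazards property on which the model rests.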

The hazard ratios for the variables size and pres_mets are 3.149 and 3.468, respectively. A high-grade soft tissue sarcoma larger than 8 cm in any cross section carries a risk for death that is 3.149 times higher after controlling for other variables. With the presence of pre-surgical metastatic disease, a patient’s risk for high-grade sarcoma-related death is 3.468 times higher than a patient who does not have pre-surgical metastasis, after controlling for the other variables in the model. The relative risk (RR) of post-surgical sarcoma related death can be estimated based on values for both predictor variables. The overall RR of an individual can be obtained by taking the antilog of the sum of the coefficients multiplied by the increase in the respective indicator variables. Therefore, the relative risk of a patient who has a tumor size larger than 8 cm and pre-surgical metastatic disease is:

$$ \begin{aligned} \ln \left[ {h\left( t \right) \, / \, h_{0} \left( t \right)} \right] = & 1.147 \times 1 + \, 1.244 \times 1 \\ = & 1.147 \, + \, 1.244 \, = \, 2.391 \\ \end{aligned} $$
$$ {\text{e}}^{2.391} = \, 10.924 $$

This means that a person with a tumor size larger than 8 cm and pre-surgical metastasis is 10.9 times more likely to die of a sarcoma-related death than an individual who had a tumor smaller than 8 cm without pre-surgical metastasis.
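The same arithmetic can be reproduced for any pair of patients. The following is a minimal sketch using the two coefficients from the fitted equation; the dictionary `BETA` and the function `hazard_ratio` are hypothetical names introduced here, and the 0/1 coding of each indicator follows the description in the text.

```python
import math

# Coefficients taken from the fitted equation in the text.
BETA = {"size": 1.147, "pres_mets": 1.244}


def hazard_ratio(patient: dict, reference: dict) -> float:
    """Hazard ratio of `patient` relative to `reference`:
    exp(sum of coefficient * difference in each indicator)."""
    log_hr = sum(BETA[name] * (patient[name] - reference[name]) for name in BETA)
    return math.exp(log_hr)


baseline = {"size": 0, "pres_mets": 0}   # tumor under 8 cm, no pre-surgical metastasis

large_only = hazard_ratio({"size": 1, "pres_mets": 0}, baseline)   # about 3.15
mets_only = hazard_ratio({"size": 0, "pres_mets": 1}, baseline)    # about 3.47
both = hazard_ratio({"size": 1, "pres_mets": 1}, baseline)         # about 10.92

print(f"{large_only:.3f} {mets_only:.3f} {both:.3f}")
```

Because the model has no interaction term, the combined hazard ratio is simply the product of the two individual hazard ratios.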

The overall Kaplan–Meier survival curve and Nelson–Aalen cumulative hazard estimate are shown in Figs. 2 and 3 of the Appendix. According to the Kaplan–Meier curve, approximately 67% of patients will be free from a sarcoma-related death at 1,000 days post-surgically, and approximately 50% at 2,000 days. Correspondingly, the Nelson–Aalen curve shows a cumulative hazard of approximately 0.4 at 1,000 days and approximately 0.6 at 2,000 days post-surgically. Kaplan–Meier curves for the non-time-dependent model variables are shown in Figs. 4 and 5 in the Appendix. From the figures, it is clear that cumulative survival is higher if tumor size is less than 8 cm and the patient is free of pre-surgical metastasis.
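For readers unfamiliar with the two estimators, the sketch below computes both from (time, event) pairs coded in the same way as survtime and death. The five observations are invented toy data, not study data, and the function name `km_and_na` is hypothetical. The last line illustrates the general relationship S(t) ≈ exp(−H(t)) between survival and cumulative hazard: a cumulative hazard of 0.4 corresponds to a survival probability of about 0.67.

```python
import math
from typing import List, Sequence, Tuple


def km_and_na(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (time, Kaplan-Meier survival, Nelson-Aalen cumulative hazard)
    at each distinct time where at least one death occurred.
    events[i] is 1 for a death and 0 for a censored observation."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, cum_hazard = 1.0, 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk     # Kaplan-Meier product term
            cum_hazard += deaths / n_at_risk     # Nelson-Aalen increment
            out.append((t, surv, cum_hazard))
        n_at_risk -= removed                     # deaths and censored leave the risk set
    return out


# Toy data (ASSUMED for illustration): days to event, 1 = death, 0 = censored.
toy_times = [100, 200, 200, 300, 400]
toy_events = [1, 0, 1, 1, 0]
for t, s, h in km_and_na(toy_times, toy_events):
    print(f"t={t}: S={s:.2f}, H={h:.2f}")

print(f"exp(-0.4) = {math.exp(-0.4):.2f}")
```

Censored patients contribute to the number at risk until their censoring time but never count as deaths, which is why the censoring indicator must accompany every survival time.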

Discussion

There were three essential objectives of this investigation. First, we evaluated potential factors that contribute to postoperative mortality in patients with a high-grade soft tissue sarcoma. Secondly, we constructed a Cox proportional hazards model in order to identify important predictors of time to high-grade sarcoma-related death around the time of surgical excision. Finally, we used these predictor variables to quantify relative risk for mortality.

Soft tissue sarcoma is a relatively rare neoplasm. With a mortality rate near 50%, patients with soft tissue sarcoma are in need of an accurate prognosis. With accurate prediction, patients at low risk for disease-specific death can be safely reassured, whereas patients at high risk can be considered for adjuvant systemic therapy. Several studies have identified prognostic factors in soft tissue sarcoma in the past. Although knowledge of these is useful for research and clinical trial design, patient counseling requires integration of the various prognostic factors to arrive at a single prognosis for the individual patient. The simple counting of a patient’s risk factors does not optimize the information available for prediction. Counting risk factors assumes each factor has equal weight and would require that a continuous variable, such as patient age at diagnosis, be categorized for counting, which loses information. A team at Memorial Sloan-Kettering Cancer Center created a nomogram that integrates these factors for patient counseling (Fig. 1) [30]. The nomogram predicts the probability that the patient will die of soft tissue sarcoma within 12 years of surgery, assuming he or she does not die of another cause first. The value of the nomogram lies in the fact that it seems to predict disease-specific death more accurately than would be achieved with straightforward subset analysis with the Kaplan–Meier method. The nomogram could be used to identify patients by computing their probability of sarcoma-specific death at 12 years, followed by offering adjuvant therapy to those whose prediction is higher than a predetermined, treatment-dependent threshold.

Fig. 1

A nomogram for patient counseling developed by specialists at Sloan-Kettering. From: Kattan et al. [30]

Our study supports many of the associations between each predictor variable and sarcoma-specific death employed by the nomogram. The current study corroborates the nomogram on multiple variables, including the worsening of prognosis as tumor size increases. In the nomogram, deep tumors seem to be somewhat less favorable than superficial tumors. Patients with tumors of the extremities seem to do better than those with tumors located in other sites, i.e. centrally occurring tumors. Older patients have a higher sarcoma-specific death prediction than younger patients. Finally, it is easy to see the shift in prognosis associated with grade of the tumor. Patients with low-grade disease and intermediate to high death predictions would have substantially higher death predictions with high-grade disease. The morbidity and mortality of high-grade histology is what prompted limiting our current study to high-grade disease. Even though the nomogram suffers from several weaknesses, it provides the most accurate predictions presently available. The nomogram predicts better than chance (P < 0.05) and better than subset analysis.

Of the 13 variables we investigated, only two were found to provide a statistically significant prediction of survival time: tumor size and the presence of pre-surgical metastatic disease. Preoperative or postoperative radiation or chemotherapy did not provide a statistically significant prediction of survival time. The use of chemotherapy in the treatment of soft-tissue sarcomas continues to have mixed results [21, 31–33]. Regarding radiotherapy, previous studies have shown improved local control with no significant improvement in patient survival [6, 34]. Patient age, gender, and race also did not provide a statistically significant prediction of survival time, which agrees with previously studied cohorts [5]. However, other study cohorts show a positive correlation with these variables [6, 8, 35]. Moreover, tumor location, requiring an operative blood transfusion, and estimated blood loss were not found to be predictors of survival time, which is consistent with other large cohort studies [5, 6, 35, 36]. Surprisingly, the presence of positive surgical margins did not provide a statistically significant prediction of survival time. However, the presence of positive margins and its effect on survival remains controversial [21, 23, 35–43]. The results of the final model demonstrated that tumor size and presence of pre-surgical metastasis provided the best Cox proportional hazards model for the prediction of survival time. These variables did not violate the proportional hazards assumption, meaning their effects on the hazard did not change significantly over time.

Our final model demonstrates that tumor size larger than 8 cm in any given cross-sectional measurement carries a risk for a sarcoma related death that is 3.149 times higher, after controlling for the other variables. This vast increase of risk exemplifies the importance of discovering and treating soft tissue sarcoma early, before its size can increase. Moreover, our model reveals that the hazard ratio comparing individuals with and without metastatic disease prior to surgical excision is 3.468. This means that the risk of dying from a high-grade sarcoma for a patient with pre-surgical metastasis is 3.468 times the risk for a patient without pre-surgical metastasis, after controlling for tumor size. This is consistent with Fig. 5, which demonstrates that cumulative survival is higher for patients without pre-surgical metastatic disease. Previous research has consistently demonstrated that tumor size and presence of metastasis at presentation are negative prognostic factors for patient survival [6, 8, 17, 26]. Kaplan–Meier survival curves for analyzed dichotomous variables not included in the main effects model are shown in Figs. 6 through 14. While survival appears to be different in the analysis of these variables, none of these variables were statistically significant predictors of survival in our multivariate analysis and thus were not included in the model.

On the surface, the internal validity of this study is very good. All 13 variables considered were objectively obtained, eliminating the risk of recall bias. The sample size was relatively small (118 patients), but all 118 patients were accounted for at the end of the study: 42 had died from a sarcoma-related cause, and 76 were alive. The high retention rate of this study supports its internal validity. In addition, the content validity was appropriate in that 13 well-researched and established variables affecting survival in sarcoma were considered. The results of this model are not surprising and clearly corroborate established orthopaedic oncologic literature on high-grade soft tissue sarcoma. However, its calculations of hazard ratios could be used to provide advanced prognostic measures that may serve to guide a more personalized and effective treatment regimen.

While our model is certainly consistent with the literature regarding prognostic factors of survival time in malignant sarcoma, it is not without its limitations. While the internal validity seems fine on the surface, as with any study, this study may be subjected to selection bias and confounding. There was no randomization process involved in selection, which may subject the study to selection bias, and we did not control for cancer stage to prevent confounding. Moreover, the data for this study were collected from one musculoskeletal oncology surgeon at a single institution, which raises some questions regarding the external validity of the study. Particular variables involved in our study, such as race and age are likely to be very different in other parts of the world. Moreover, factors such as preoperative and postoperative radiation and chemotherapy, as well as surgical margins, estimated blood loss, and operative blood transfusions are somewhat hospital and surgeon dependent.

Summary

This study successfully created a Cox proportional hazards model using data taken from 118 high-grade sarcoma patients to predict the survival time to a sarcoma-related death. It was found that, at the time of surgical removal of a soft tissue sarcoma, tumor size and the presence of tumor metastasis both affect the hazard rate. The model satisfied the underlying assumptions of Cox proportional hazards analysis and its results were significant. Hazard ratios agreed with what is already known in the orthopaedic oncology literature regarding high-grade soft tissue sarcoma: tumor size and the presence of metastasis at the time of surgery are important prognostic factors affecting survival. The hazard ratios calculated from this model could be used for prognosis and would help enhance treatment regimens for sarcoma patients. This study could also be used to guide future research regarding additional prognostic factors of high-grade sarcoma survival.