Introduction

Osteosarcoma (OS) is an invasive malignant tumor of the skeletal system with rapid progression and poor prognosis, and it has become a major fatal disease in children and adolescents [1]. OS most often occurs near the ends of the long bones of the extremities, such as the proximal tibia and distal femur. It grows rapidly, causes large bone defects and limits motor function, and leads to pulmonary or skeletal metastasis, which significantly reduces the 5-year overall survival rate [2]. Despite great advances in surgery and chemotherapy in recent years [3], the outlook for patients with metastatic and recurrent OS remains poor: the 5-year survival rate is below 25%, owing to the lack of effective prognostic assessment and therapeutic strategies [4]. Therefore, there is an urgent need to investigate the pathogenic mechanisms of OS and to establish a reliable predictive model for the prognosis of OS patients.

Dysregulation of cellular energetics, or metabolic reprogramming, is a striking feature of solid tumors. Normal cells proliferate at a limited rate, whereas tumor cells proliferate rapidly and without restriction, so tumor cells must produce more energy to meet the demands of proliferation. Regardless of whether oxygen is sufficient, tumor cells tend to rely on glycolysis rather than aerobic oxidation, a phenomenon known as the Warburg effect [5]. Clinical observations show that glucose uptake by tumor tissues is significantly higher than that of adjacent normal tissues [6], so glycolysis is considered a potential prognostic indicator in tumors [7]. Recently, glycolysis has been shown to be closely related to the occurrence and development of OS [8]. Han et al. showed that the effect of TUG1 on the viability of OS cells was affected by glycolysis [9]. Additional evidence indicated that restricting glycolysis could inhibit the metastasis of OS in vivo and in vitro [10]. Moreover, tumor glycolysis is interdependent with the tumor microenvironment (TME) and immune evasion [11, 12]. If tumor cells consume a large amount of glucose for glycolysis, immune activity in the TME may be limited by competition for glucose between immune cells and tumor cells [13]. Currently, there are few studies on the relationship between glycolysis and the immune response in OS. A more complete understanding of this interaction is therefore needed and could open up new avenues for OS treatment and prognostic strategies.

A growing body of literature has reported methods for predicting OS prognosis. For example, Yang et al. established a glycolysis-related risk signature based on three genes (P4HA1, ABCB6, and STC) that could predict the prognosis of OS patients and provided new ideas [14]. However, a prognostic model of OS based on a combined glycolysis-immune evaluation has not been reported. In the present study, we focused on constructing a prognostic model based on glycolysis-immune-related genes (GIGs) that could be used to predict the prognosis of OS patients.

Materials and methods

Study objects

We downloaded mRNA data for 53 patients (GSE21257) from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) and for 88 patients from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET, https://ocg.cancer.gov/programs/target) database. The clinical information of the 53 GEO patients is shown in Table 1, and 84 of the 88 TARGET patients had complete survival information.

Table 1 Clinicopathological characteristics of Osteosarcoma patients from GSE21257 database

Differentially expressed genes (DEGs) analysis

The limma package in R (version 4.1.0) [15] was used to screen DEGs from patient samples, with thresholds of |log2FC| > 0.5 and adj. p.val < 0.05 (adj. p.val denotes the corrected p value).
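The two thresholds can be illustrated with a minimal Python sketch (the study itself used limma in R). The Benjamini-Hochberg adjustment is an assumption about how adj. p.val was computed; it is limma's default, but the text does not name the method. The gene names and values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, pvals, fc_cut=0.5, p_cut=0.05):
    """Keep genes with |log2FC| > fc_cut and adjusted p < p_cut."""
    adj = bh_adjust(pvals)
    return [g for g, fc, q in zip(genes, log2fc, adj)
            if abs(fc) > fc_cut and q < p_cut]


# Hypothetical toy input, not data from the study.
genes = ["g1", "g2", "g3", "g4"]
log2fc = [1.2, -0.8, 0.3, 0.9]
pvals = [0.001, 0.010, 0.020, 0.600]
degs = select_degs(genes, log2fc, pvals)
```

In this toy input, g3 fails the fold-change threshold and g4 fails the adjusted p-value threshold, so only g1 and g2 are kept.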

Functional enrichment analysis

GO (including Molecular Function, Cellular Component, and Biological Process) and KEGG enrichment analyses of the DEGs were conducted with the clusterProfiler package in R; terms with p < 0.05 were considered significantly enriched.

Immune Score

OS patients with complete clinical survival information were clustered in R on the basis of their mRNA expression profiles using the “K-means” method. The “ESTIMATE” package was used to calculate the Immune Score of OS samples.

LASSO Cox regression analysis

Univariate Cox regression analysis was performed on the gene expression values of OS samples, and p < 0.05 was used as the threshold to screen genes significantly associated with OS prognosis. LASSO Cox regression was then conducted with the glmnet package in R [16] to further refine the genes related to OS prognosis. The screened genes were used to create the Risk Score of each sample by the following formula:

$$ \text{Risk Score} = \sum_{i=1}^{n} \text{Coef}_{i} \times X_{i} $$

Here $\text{Coef}_{i}$ is the risk coefficient of each gene calculated by the LASSO-Cox model, and $X_{i}$ is the expression value of each gene. Next, the optimal threshold of the Risk Score was determined with the R packages survival and survminer and a two-sided log-rank test. The patients were divided into low- and high-risk groups according to the median.

Survival analysis

The overall survival rate of patients was estimated by the Kaplan-Meier (KM) method with the R packages survival and survminer, and the significance of the difference in survival between groups was assessed with the log-rank or Breslow test. A multivariate Cox regression model was used to analyze whether the Risk Score could predict the survival of OS patients independently of other factors.
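For readers unfamiliar with the KM method, the product-limit estimate can be sketched in a few lines of Python. This is an illustration only, not the survival package's implementation, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (time units are arbitrary).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve themselves, which is why the estimate only steps down at observed deaths.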

Evaluation of immune cell infiltration (ICI)

The relative proportions of 22 immune cell types in each cancer sample were calculated with CIBERSORT [17]. Given the gene expression matrix, CIBERSORT applies a deconvolution algorithm to characterize the composition of infiltrating immune cells using 547 preset barcode genes. The estimated proportions of all immune cell types in each sample sum to one.
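The sum-to-one constraint can be illustrated with a small Python sketch. It shows only a normalisation step applied to hypothetical regression weights and does not reproduce CIBERSORT's deconvolution itself; clipping negative weights to zero before normalising is an assumption made for this illustration.

```python
def to_fractions(weights):
    """Convert raw per-cell-type weights into fractions that sum to one."""
    # Assumption: negative weights are treated as zero before normalising.
    clipped = {cell: max(w, 0.0) for cell, w in weights.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError("no positive weights to normalise")
    return {cell: w / total for cell, w in clipped.items()}


# Hypothetical raw weights for three of the 22 cell types.
raw = {"T cells CD8": 0.30, "Macrophages M0": 0.90, "Monocytes": -0.05}
fractions = to_fractions(raw)
```

Because the output is a set of relative fractions, it describes the composition of the immune compartment within a sample rather than the absolute amount of immune infiltration.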

Establishment of Nomogram prognostic model

Nomograms have been widely used to predict cancer prognosis. To predict the 1-, 3- and 5-year survival probabilities of patients, all independent prognostic factors identified by multivariate Cox regression analysis were used to build a nomogram with the R package rms [18].

Results

Glycolysis-immune-related analysis could define different prognosis

In this study, a total of 296 glycolysis-related genes (Table S1) were obtained from GSEA. Based on these genes, we used the “K-means” method in R to cluster the OS samples in GSE21257. According to the sum of squared errors (Fig. 1A), we selected k = 2 as the number of clusters and divided the samples into two clusters (Fig. 1B). The KM curves showed a significant difference in overall survival between the two clusters (Fig. 1C), with cluster 2 having the better prognosis. We therefore defined cluster 2 as the low glycolysis group and cluster 1 as the high glycolysis group.
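The elbow criterion compares the within-cluster sum of squared errors (SSE) across candidate values of k. The sketch below is a minimal Python illustration of that idea using Lloyd's algorithm; the study used R, and the deterministic initialisation, the iteration count, and the toy two-dimensional points are all assumptions made for this example.

```python
def kmeans_sse(points, k, n_iter=20):
    """Run Lloyd's algorithm and return the within-cluster sum of squared errors."""
    step = max(len(points) // k, 1)
    centres = [points[i * step] for i in range(k)]  # simple deterministic start

    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(p)
        # Recompute each centre as the mean of its members (keep it if empty).
        centres = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centres[j]
            for j, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centres) for p in points)


# Hypothetical toy data with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sse = {k: kmeans_sse(points, k) for k in (1, 2, 3)}
```

On data like this, the SSE drops sharply from k = 1 to k = 2 and only slightly afterwards, which is the flattening that an elbow plot is read for.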

Fig. 1

Glycolysis-immune analysis defines different prognoses. (A) Elbow plot for the optimal number of clusters. The horizontal and vertical axes represent the number of clusters K and the sum of squared errors, respectively; the point where the decline flattens indicates the optimal number of clusters, K = 2. (B) Sample clustering diagram; different colors represent different clusters. (C) KM survival curve defined by glycolysis. The p value was obtained by the log-rank test. (D-E) The best cutoff value and KM survival curve defined by Immune Score. (F) KM survival curve defined by the glycolysis-immune score

For further survival analysis, the “ESTIMATE” package was used to calculate the Immune Score of the samples, which were then grouped accordingly. The optimal cutoff value divided the samples into two groups (Fig. 1D), and the prognosis of the high Immune Score group was significantly better than that of the low Immune Score group (Fig. 1E).

According to glycolysis status and Immune Score, we further divided the samples into a low glycolysis-high immunity group (Low/High), a high glycolysis-low immunity group (High/Low), and a mixed group containing all other samples (Mix). The Low/High group had the best prognosis of the three groups, the High/Low group had the worst, and the Mix group fell between the two; the differences among the groups were significant (Fig. 1F).
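The three-way grouping described above amounts to a simple rule, sketched here in Python. The function name and the "low"/"high" string labels are hypothetical conventions for this illustration.

```python
def glycolysis_immune_group(glycolysis, immune):
    """Combine glycolysis status and Immune Score group into one label.

    Both arguments are expected to be "low" or "high".
    """
    if glycolysis == "low" and immune == "high":
        return "Low/High"
    if glycolysis == "high" and immune == "low":
        return "High/Low"
    # Low/low and high/high samples form the mixed group.
    return "Mix"


labels = [
    glycolysis_immune_group("low", "high"),
    glycolysis_immune_group("high", "low"),
    glycolysis_immune_group("high", "high"),
    glycolysis_immune_group("low", "low"),
]
```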

Identification of candidate GIGs

We also analyzed the DEGs in each comparison. There were 1960 DEGs between the low and high glycolysis groups (Fig. 2A), 659 DEGs between the low and high immunity groups (Fig. 2B), and 2517 DEGs between the Low/High and High/Low groups (Fig. 2C); 305 genes overlapped among the three comparisons (Table S2, Fig. 2D). GO and KEGG enrichment analysis indicated that these 305 genes were significantly enriched in GO terms such as extracellular matrix organization and in KEGG pathways such as Nicotine addiction. The 10 most significantly enriched GO terms are shown in Fig. 2E and the top 10 KEGG terms in Fig. 2F. The detailed results of the enrichment analysis are listed in Table S3.

Fig. 2

DEGs and functional enrichment. (A-C) Volcano plots of DEGs; blue dots are down-regulated genes and red dots are up-regulated genes. (D) Venn diagram of DEGs. (E-F) The top 10 terms of the GO and KEGG enrichment analyses

Construction and validation of prognostic model

The prognostic model was established and validated as follows. First, we used the GSE21257 samples to conduct univariate Cox regression analysis with the expression values of the 305 overlapping genes as continuous variables, and calculated the hazard ratio (HR). Eight genes were selected using p < 0.01 as the threshold (Fig. 3A). Next, LASSO Cox regression analysis was performed on these 8 genes to select those most closely related to prognosis. Based on the lambda values corresponding to different numbers of genes, the optimal number of genes was determined to be 5 (Fig. 3B, at the minimum lambda value). The five genes were RAI14, MAF, CLEC5A, TIAL1 and CENPJ. We then weighted the expression values of the 5 genes in the GSE21257 dataset by their LASSO Cox regression coefficients to establish a Risk Score model for predicting patient survival, which provides a uniform cutoff for dividing patients into high- and low-risk groups: Risk Score = (-0.13626991 × RAI14) + (-0.13438968 × MAF) + (-0.30713757 × CLEC5A) + (0.09256004 × TIAL1) + (0.05059066 × CENPJ).
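As a hedged illustration, the five-gene formula can be written as a short Python function. The coefficients are copied from the text; the expression values in the example are hypothetical, and the sketch assumes expression is measured on the same scale as the data used to fit the model.

```python
# LASSO Cox coefficients as reported in the text.
COEFFICIENTS = {
    "RAI14": -0.13626991,
    "MAF": -0.13438968,
    "CLEC5A": -0.30713757,
    "TIAL1": 0.09256004,
    "CENPJ": 0.05059066,
}


def risk_score(expression):
    """Weighted sum of the five gene expression values for one sample."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical expression values for a single sample.
example = {"RAI14": 8.0, "MAF": 7.0, "CLEC5A": 6.0, "TIAL1": 9.0, "CENPJ": 5.0}
score = risk_score(example)
```

The signs of the coefficients indicate the direction of each gene's contribution: higher RAI14, MAF or CLEC5A expression lowers the score, whereas higher TIAL1 or CENPJ expression raises it.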

Fig. 3

Construction of the OS prognostic model. (A) Forest plot of the univariate analysis of 8 genes significantly associated with OS prognosis. HR: hazard ratio; 95% CI: 95% confidence interval. (B) Optimization of the parameter lambda in the LASSO regression model. The minimum lambda value was taken as optimal. (C-D) The predicted Risk Score for patients. The vertical dotted lines represent the median Risk Score, which separates high-risk (red) from low-risk (green) patients; the panels show the proportion of patients, survival time and status (alive/dead), and expression of the genes used to build the model. (E-F) KM survival curves in the GEO and TARGET datasets, respectively. The p value was obtained by the log-rank test

We then calculated the Risk Score for each patient and divided the GSE21257 dataset and the TARGET validation samples into high-risk and low-risk groups by the median. This gave the relationship between patients ranked by predicted risk and their survival time. As shown in Fig. 3C-D, survival time in the low-risk group was slightly longer than in the high-risk group, and the number of deaths was higher in the high-risk group than in the low-risk group. In addition, survival analysis showed that high-risk OS samples had poorer overall survival than low-risk samples in the GSE21257 dataset (Fig. 3E) as well as in the TARGET dataset (Fig. 3F). Overall, these results suggest that the Risk Score can distinguish the prognosis of OS patients.
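The median split can be sketched in Python as follows. The patient identifiers and scores are hypothetical, and assigning scores exactly equal to the median to the low-risk group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient "high" or "low" risk relative to the median score."""
    cutoff = median(scores.values())
    # Assumption: scores equal to the median are labelled "low".
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical Risk Scores for four patients.
scores = {"P1": -2.1, "P2": -1.4, "P3": -0.9, "P4": -0.2}
groups = split_by_median(scores)
```

A median cutoff yields two groups of roughly equal size, which keeps both arms of the subsequent KM comparison reasonably well populated in a small cohort.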

Risk Score was an independent prognostic marker of OS

To determine whether the Risk Score was an independent prognostic indicator, age, gender, tumor stage and Risk Score were included in a multivariate Cox regression analysis (Fig. 4A). The results showed that the Risk Score was significantly correlated with overall survival: samples with a high Risk Score had a greater risk of death, making a high Risk Score a poor prognostic factor (HR = 5.35, 95% CI: 1.547–18.5, p < 0.008).
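The reported hazard ratio, confidence interval and p value can be checked against each other with a short Python calculation. The sketch assumes the interval is a Wald interval on the log scale (log HR ± 1.96 × SE), which is the usual form for Cox models but is not stated in the text.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log-HR, its standard error, the z statistic and p value."""
    beta = math.log(hr)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return beta, se, z, p


beta, se, z, p = wald_from_hr(5.35, 1.547, 18.5)
```

Under that assumption, the log of the hazard ratio sits at the midpoint of the log-scale interval and the implied two-sided p value is close to 0.008, so the three reported numbers are mutually consistent.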

In addition, the prognostic value of the Risk Score in OS samples with different clinicopathological factors (age and sex) was further evaluated. We regrouped OS patients according to these factors and performed KM survival analysis. As shown in Fig. 4B-C, the overall survival rate of male samples in the high-risk group was significantly lower than that in the low-risk group, whereas the p value of the KM survival curve for female samples was greater than 0.05, possibly because there were too few female samples. Meanwhile, overall survival was significantly lower in the high-risk group than in the low-risk group both in patients aged 16 years or younger and in patients older than 16 years (Fig. 4D-E). These results indicate that the Risk Score can be used as an independent indicator to predict the prognosis of OS patients.

Fig. 4

Risk Score was an independent prognostic marker of OS. (A) Forest plot of the multivariate Cox regression analysis. Compared with the reference sample, samples with a hazard ratio greater than 1 had a higher risk of death, and those with a hazard ratio less than 1 had a lower risk of death. (B-C) KM survival curves of OS samples by gender. (D-E) KM survival curves of OS samples by age group

Nomogram model predicts the prognosis and survival of OS patients

A nomogram is an effective pictorial representation of a complex mathematical formula and is now widely used in oncology to assist with the prediction of prognosis [19]. In this study, we constructed a nomogram based on the Risk Score to further verify its accuracy and rationality (Fig. 5A). As shown in Fig. 5B-D, the calibration curve was close to the ideal curve (the 45-degree line with slope 1 through the origin), indicating that the predictions were in good agreement with the actual outcomes.

Fig. 5

Nomogram predicted survival of OS patients. (A) Nomogram model was used to predict the probability of 1-, 3-, and 5-year overall survival of OS patients. (B-D) Calibration curves for Nomogram to predict the probabilities of 1-, 3-, and 5-year overall survival of OS patients. The X-axis and Y-axis represented the survival rate predicted by Nomogram and actual survival rate, respectively

Characterisation of the ICI in OS patients

We also used CIBERSORT with the LM22 signature matrix to evaluate differences in the infiltration of 22 immune cell types in OS patients. In the ICI results of the 53 OS patients (Fig. 6A), the infiltrating proportions of various immune cells differed between the high- and low-risk groups (Fig. 6B). The variation in the proportions of tumor-infiltrating immune cells among patients may reflect intrinsic individual differences. Among the 22 cell types, plasma cells, T cells CD8, T cells CD4 memory resting, T cells follicular helper, NK cells activated, Monocytes and Macrophages M0 differed significantly in their infiltrating proportions (Fig. 6C). Further PCA showed that, based on these 7 immune cell types, the samples could be clustered into two groups (Fig. 6E).

Fig. 6

Immune infiltration of OS patients in the high- and low-risk groups. (A) Relative proportions of infiltrating immune cells in patients. (B-C) Violin plots of different immune cells in the high- and low-risk groups. The p value was calculated by the Wilcoxon method; significance labels are ns: p > 0.05, *: p ≤ 0.05, **: p ≤ 0.01, ***: p ≤ 0.001, ****: p ≤ 0.0001. (D) Correlation matrix of the proportions of the 22 immune cell types. Orange represents positive correlation and blue represents negative correlation; the darker the color, the stronger the correlation. (E) Three-dimensional PCA clustering map
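The significance labels in this caption follow a fixed set of thresholds, which can be expressed as a small Python helper. The function name is hypothetical; the cutoffs are those listed in the caption.

```python
def significance_label(p):
    """Map a p value to the star notation used in the violin plots."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.05, 0.008, 0.0005, 0.00001)]
```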

Relationship between Risk Score and immune checkpoint genes

The expression of immune checkpoints has become a biomarker for selecting OS patients for immunotherapy. In this study, we investigated whether the Risk Score was associated with key immune checkpoints (CTLA4, PDL1, LAG3, TDO2). As expected, the Risk Score was associated with all of them (Fig. 7A). Furthermore, the expression of CTLA4 and LAG3 differed significantly between OS patients in the high- and low-risk groups (Fig. 7B, C).

Fig. 7

Relationship between four immune checkpoints, tumor purity and Risk Score. (A) Chord diagram of the Risk Score and four prominent immune checkpoints; the wider the line between two items, the stronger their correlation. (B) Violin plot of immune checkpoints with significantly different expression levels in the high- and low-risk groups; different colors represent the high- and low-risk groups, the vertical axis represents expression level, and the p value was calculated by the Wilcoxon method

Discussion

OS progresses rapidly, and the prognosis of patients in the middle and late stages is often unsatisfactory [20]. It is therefore urgent to evaluate recurrence risk accurately and to provide appropriate monitoring and treatment for patients at high risk. Metabolic reprogramming in the TME is involved in the survival, growth, and proliferation of tumor cells [21]. Recent studies have focused on the role of glycolysis in the proliferation, invasion and drug resistance of OS cells [9, 22]. For example, Wen et al. [23] showed that lncRNA-SARCC-mediated cisplatin sensitivity may act via glycolysis in miR-143-inhibited OS cells. Although prognostic signatures for patients with OS have been studied, the prognostic significance of glycolysis-related genes is still unclear. In this study, we established a comprehensive evaluation model that combines glycolysis status with Immune Score and distinguishes differences in patient prognosis more clearly. With this model, patients could be grouped easily. Patients in different groups showed different degrees of glycolysis and different immune status, and their prognosis also differed significantly.

To formulate this model numerically, we analyzed 296 glycolysis-related genes and calculated the Immune Score of OS patients to establish a GIG Risk Score. We then verified by multivariate Cox analysis that the newly established model was an independent prognostic indicator for OS patients. Five prognostic genes related to glycolysis and immunity in OS (RAI14, MAF, CLEC5A, TIAL1 and CENPJ) were selected from 305 genes by univariate and LASSO Cox regression analysis. Most of these genes have been studied in various tumors. RAI14 is highly expressed in gastric cancer, and its expression level is related to poor prognosis [24]. MAF is a transcription factor that is considered a tumor suppressor or an oncogene depending on the cell type [25]. For instance, MAF overexpression is a common tumorigenic event in multiple myeloma, triggering pathological interactions with bone marrow stromal cells and promoting proliferation [26]. One study showed that CLEC5A expression was significantly increased in ovarian cancer cells and that higher expression was associated with lower patient survival [27]. CENPJ has also been shown to be tightly associated with the prognosis of breast cancer [28]. Moreover, stable silencing of TIAL1 expression in HeLa cells promoted cell proliferation, anchorage-dependent and -independent growth, and invasion [29]. However, almost none of these 5 prognostic genes has been studied in OS. In the present work, the Risk Score constructed from these five genes could satisfactorily predict the prognosis of OS patients.

Lactic acid produced by glycolysis in tumor cells accumulates in the extracellular environment, which lowers the pH of the TME and inhibits the innate immune response to the tumor, creating a barrier to treatment [30]. Immune Risk Score models based on tumor-infiltrating immune cells have already been reported to predict the prognosis of OS. For example, Chen et al. [31] constructed an immune Risk Score model based on tumor-infiltrating immune cell types selected by a forward stepwise approach, providing valuable ideas for further research on OS prognosis. In this study, our results demonstrated that the prognosis of the low Immune Score group was significantly worse than that of the high Immune Score group, suggesting that the low Immune Score group has a greater risk of death. Meanwhile, between the high- and low-risk groups defined by the Risk Score, there were differences in the 22 immune cell types, of which 7 (plasma cells, T cells CD8, T cells CD4 memory resting, T cells follicular helper, NK cells activated, monocytes, and macrophages M0) differed significantly. These findings suggest that glycolysis affects the composition of the immune cell population in OS. Hence, it is reasonable for our prognostic model to take both tumor glycolysis and immunity into account.

Compared with traditional methods, our GIG Risk Score model may predict the prognosis of OS patients more conveniently and accurately. In clinical practice, the expression levels of these five genes can be readily obtained to calculate a patient’s Risk Score and predict survival. Our research also has some limitations. The sample size obtained from the GEO and TARGET datasets was small, which may affect the reproducibility of the results, and the molecular mechanisms have not been verified experimentally. These issues may be the focus of further research.

Conclusions

In conclusion, our study revealed an association between the clinical outcomes of osteosarcoma and both glycolysis and tumor immunity. The Risk Score model based on 5 glycolysis-immune-related genes was reliable in predicting the prognosis of osteosarcoma.