Introduction

Lung cancer is the leading cause of cancer-related death worldwide, and adenocarcinoma is its major pathological type, accounting for over 40% of cases [1]. Despite progress in treatment, the 5-year overall survival (OS) rate of lung adenocarcinoma (LUAD) is less than 20%. At present, the prognosis-related mechanisms of LUAD remain ambiguous [2], and tumor heterogeneity makes it difficult to evaluate the prognosis of each patient precisely [3], which is also a shortcoming of the tumor-node-metastasis (TNM) staging system [4,5,6]. Therefore, accurate, individualized evaluation and improvement of the survival of LUAD patients remain a great challenge.

The tumor microenvironment (TME) is the environment in which the tumor is located and is composed of various immune cells, stromal cells (including mesenchymal cells and endothelial cells), extracellular matrix molecules, and a variety of cytokines [7, 8]. Emerging evidence suggests that components of the TME can define the immune phenotypes of cancers and therefore influence patients’ prognosis [9,10,11]. For instance, low-level infiltration of cytotoxic immune cells can help tumor cells escape immune attack, thereby reducing patients’ survival probability [9]. In addition to immune cells, stromal cells also regulate tumor immune phenotypes; cancer-associated fibroblasts, for example, exert a direct immunosuppressive effect [12]. Furthermore, abnormal changes in the TME not only affect patients’ prognosis but could also be used as biomarkers for therapies such as immunotherapy. Given the crucial role of the TME in the carcinogenic process, evaluating TME status may be an effective way to predict prognosis and therapeutic benefit for individual patients.

Although evaluating TME status is a powerful approach to prognosis estimation and drug response prediction, clarifying the components of the TME is difficult, so an intuitive and effective tool is needed. In 2013, Yoshihara et al. proposed the ESTIMATE algorithm, which calculates immune, stromal, and ESTIMATE scores from the expression of marker genes of immune and stromal cells to predict TME status [13]. This bioinformatics method provides a useful tool for evaluating TME status from gene expression data. Therefore, based on large-scale RNA-seq data from The Cancer Genome Atlas (TCGA) cohort and the Gene Expression Omnibus (GEO) database, we used the ESTIMATE method to investigate potential TME-related prognostic genes and constructed a TME-related risk score (TMErisk) for predicting OS in LUAD patients.

Results

Calculation of immune/stromal scores and their correlation with patients’ clinical features

A total of 510 LUAD patients in the TCGA database were included in our study to calculate immune and stromal scores. Stromal scores of these patients ranged from −1779.3 to 2106.9 and immune scores ranged from −932.6 to 3449.2. Immune and stromal scores were distributed significantly differently between normal (adjacent) and tumor tissue (Fig. S2A, B), between female and male patients (Fig. S2C), and between smokers and non-smokers (Fig. S2D). A higher immune score was associated with lower T and American Joint Committee on Cancer (AJCC) TNM stages, while a higher stromal score was associated with a lower M stage (Fig. S2E). Kaplan–Meier (KM) and Cox regression analyses showed that patients with higher immune and stromal scores had more favorable OS than those with lower immune and stromal scores (Fig. S2F, G). Furthermore, immune and stromal scores correlated significantly and strongly negatively with tumor purity inferred by the consensus measurement of purity estimations (CPE) and ESTIMATE algorithms (Fig. S2H).

Construction of the TMErisk score

Between the low- and high-score groups with distinct OS, the volcano plots showed 1366 differentially expressed genes (DEGs) related to immune scores and 1500 DEGs related to stromal scores (Fig. 1A, B). In the weighted gene co-expression network analysis (WGCNA), 12 was selected as the optimal soft threshold (Fig. S3). According to the WGCNA results, 22 co-expressed gene modules (excluding the gray module, in which genes were not co-expressed) were recognized (Fig. 1C). The tan, light green, red, purple, and brown modules had a strong correlation with stromal score, whereas the royal blue, dark-green, brown, and green modules had a strong correlation with immune score (Fig. 1D). The Venn plot showed the number of DEGs and their intersection with strong TME-correlated gene modules; these genes were then used as input to the LASSO-Cox regression analysis (Fig. 1E). CCR2, CX3CR1, SERPINE1, LOX, OR7E47P, and RUBCNL (also named C13orf18) were selected from stromal-related genes, whereas CCL3L1, CCR2, CD101 (also named IGSF2), CD200R1, CD74, GBP1, HLA-DQB1-AS1, IRF1, RAC2, RUBCNL, and STAP1 were selected from immune-related genes (Fig. 1E). The prognostic value of the 16 genes was evaluated separately in the TCGA and KM Plotter databases (Fig. S4). Based on our screening criteria, RUBCNL, LOX, OR7E47P, CX3CR1, SERPINE1, CCR2, GBP1, IRF1, STAP1, and CD200R1 were entered into stepwise regression. The model consisting of LOX, OR7E47P, SERPINE1, CX3CR1, GBP1, IRF1, STAP1, and CD200R1 had the minimal AIC value (AIC = 1858.92). Therefore, the final risk model was: TMErisk = 0.7869 × LOX − 0.4203 × OR7E47P + 0.3138 × SERPINE1 − 0.4006 × CX3CR1 + 0.6168 × GBP1 + 0.6100 × IRF1 − 0.8024 × STAP1 − 0.7196 × CD200R1.
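As an illustration of how the final risk model can be evaluated for a single patient, the following minimal Python sketch applies the coefficients reported above. It assumes that each gene enters the model as a binary indicator (1 for high expression, 0 for low expression), as described in the Materials and methods; the function name `tmerisk` and the example patient are hypothetical and are not taken from the study.

```python
# Coefficients copied from the final TMErisk model reported in the text.
TMERISK_COEFFICIENTS = {
    "LOX": 0.7869,
    "OR7E47P": -0.4203,
    "SERPINE1": 0.3138,
    "CX3CR1": -0.4006,
    "GBP1": 0.6168,
    "IRF1": 0.6100,
    "STAP1": -0.8024,
    "CD200R1": -0.7196,
}


def tmerisk(high_expression):
    """Return the TMErisk score from per-gene indicators (1 = high, 0 = low).

    `high_expression` maps each model gene to 0 or 1. The cut-offs that turn
    a continuous expression value into this indicator are not reproduced
    here and must be supplied by the caller.
    """
    missing = set(TMERISK_COEFFICIENTS) - set(high_expression)
    if missing:
        raise ValueError(f"missing genes: {sorted(missing)}")
    return sum(
        beta * int(bool(high_expression[gene]))
        for gene, beta in TMERISK_COEFFICIENTS.items()
    )


# Hypothetical patient (illustration only): high LOX, SERPINE1 and STAP1.
example_patient = {
    "LOX": 1, "OR7E47P": 0, "SERPINE1": 1, "CX3CR1": 0,
    "GBP1": 0, "IRF1": 0, "STAP1": 1, "CD200R1": 0,
}
example_score = tmerisk(example_patient)  # 0.7869 + 0.3138 - 0.8024
```

Under the binary coding, the score is bounded by the sum of the negative coefficients (−2.3429) and the sum of the positive coefficients (2.3275).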

Fig. 1: Screening for TME-related genes.

A DEG analysis between the low and high immune score groups. B DEG analysis between the low and high stromal score groups. C Gene modules identified by WGCNA. D Correlation between gene modules and immune/stromal scores. Strongly correlated modules (|Cor| > 0.5, P < 0.05) are marked with black frames. *P < 0.05; **P < 0.01; ***P < 0.001. E LASSO analysis for selecting candidate genes. The Venn plot shows the number of genes shared between the DEG analysis and WGCNA. These genes were then used as input to the LASSO analysis. The Y-axis shows LASSO coefficients and the X-axis shows −log(lambda). Dotted vertical lines mark the lambda values at the minimum and at 1 standard error. The genes selected at the 1-standard-error value of lambda were used for further analysis. TME tumor micro-environment, DEG differentially expressed gene, WGCNA weighted gene co-expression network analysis, LASSO Least Absolute Shrinkage and Selection Operator.

Survival analysis of the TMErisk score

We first investigated the distribution of the TMErisk score in patients of different gender, smoking status, and TNM stage. Patients with a smoking history and a higher N or TNM stage had higher TMErisk scores (Fig. 2A). Univariate and multivariate Cox regression analyses were used to test the impact of TMErisk on prognosis. A higher TMErisk score indicated poorer OS in all six datasets; after adjusting for age, gender, performance score, smoking history, radiation therapy, drug therapy, and TNM stage, the TMErisk score was identified as an independent risk factor for OS in both TCGA and the four GEO datasets (GSE30219, GSE31210, GSE37745, and GSE81089; Fig. 2B). Meanwhile, time-dependent AUC suggested that the TMErisk score had considerable value in predicting the OS of LUAD patients in the TCGA and GEO datasets (Fig. 2C). As illustrated in Fig. 2D, TMErisk had significantly higher C statistics than TNM stage in the TCGA and GSE30219 datasets, and equivalent C statistics in the other three datasets. Notably, combining TMErisk with TNM stage significantly improved the C statistics for predicting OS in TCGA and the four GEO datasets (Fig. 2D).

Fig. 2: Evaluation and validation for the prognostic value of the TMErisk score.

A Difference analysis of the distribution of TMErisk scores in different T, N, M, TNM stage, gender, and smoking history. Statistical difference of three or more groups was compared by the Kruskal–Wallis test and that of two groups was compared by the Wilcoxon test. *P < 0.05; **P < 0.01; ***P < 0.001; ns not significant. B Forest plot of Cox analysis in TCGA and GEO datasets. 1: For multivariate Cox regression analysis, HR value of TMErisk was adjusted by age, smoking history, TNM stage, performance score, radiation therapy (radiation therapy includes intensity modulated radiotherapy, radioactive vicryl mesh, gamma knife, etc.; these therapies were used as adjuvant, palliative therapies and in progression/recurrence) and drug therapy (drug therapy includes cisplatin, paclitaxel, carboplatin, gefitinib, erlotinib, etc.; these therapies were used as adjuvant, palliative therapies and in progression/recurrence). 2: The multivariate analysis was not performed in GSE3141 because of missing clinical information. 3: For multivariate Cox regression analysis, HR value of TMErisk was adjusted by age, T stage and gender. 4: For multivariate Cox regression analysis, HR value of TMErisk was adjusted by age, smoking history, gender, and TNM stage. 5 and 6: For multivariate Cox regression analysis, HR value of TMErisk was adjusted by age, gender, and TNM stage. C Time-dependent AUC value in TCGA, GSE3141, GSE30219, GSE31210, GSE37745, and GSE81089. D C statistics of TNM stage and TMErisk in TCGA, GSE30219, GSE31210, GSE37745, and GSE81089. C statistics was not calculated in GSE3141 because of missing clinical information. AUC (t) time-dependent AUC, T Tumor, N node, M metastasis, TNM tumor-node-metastasis, HR hazard ratio, 95% CI 95% confidence interval.

Patients were subsequently assigned to low- and high-TMErisk groups. KM curves showed that patients in the high-TMErisk group had poorer OS (Fig. S5). The stratified analysis showed a significant difference in OS between the low- and high-TMErisk groups within subgroups defined by age, gender, smoking status, performance status, anatomic subdivision, TNM stage, radiation therapy, and drug therapy (Fig. S6). To adjust for potential bias between the high- and low-TMErisk groups, we used propensity score matching to balance patients’ epidemiological and clinical characteristics (Table S2). Overall, the results of the stratified analysis before and after propensity score matching were similar. However, with respect to race, black and Asian/Indian patients showed no significant difference in outcomes between the low- and high-TMErisk groups after propensity score matching (Fig. S6).

TMErisk was associated with LUAD immune signature

To explore the potential mechanisms leading to the different outcomes between the low- and high-TMErisk groups, we performed GSEA with annotations from GO (Table S3, S4) and KEGG gene sets (Table S5). Significantly enriched pathways with |normalized enrichment score (NES)| > 1 and FDR < 0.05 are shown in Fig. 3A. Gene sets involved in cell cycle, DNA repair, Wnt, and NF-κB signaling were enriched in the high-TMErisk group, while those related to migration and proliferation of immune cells, immune response, inflammatory response, chemokine, and JAK/STAT signaling were enriched in the low-TMErisk group. In addition, immune and stromal scores were significantly lower in the high-TMErisk group, while tumor purity was higher (Fig. 3B).

Fig. 3: Function enrichment analysis for TMErisk and correlation between TMErisk and expression of the HLA family genes/immune checkpoints.

A KEGG and GO enrichment of TMErisk. In GSEA, patients were categorized into low- and high-TMErisk groups and the results are shown as a volcano plot in which red dots represent pathways enriched in the high-TMErisk group and blue dots represent pathways enriched in the low-TMErisk group. Only significantly enriched pathways (FDR q-value < 0.05 and |NES| > 1) are marked in red or blue. The size of each dot corresponds to the ratio of enriched genes in the pathway. Only pathways of interest are labeled. B Association between immune score, stromal score, tumor purity, and TMErisk score and their distribution in the low- and high-TMErisk groups. C Analyses for the expression of immune checkpoints and HLA family genes in different TMErisk groups. D Correlation analysis for TMErisk score and expression of HLA family genes/immune checkpoints. *P < 0.05; **P < 0.01; ***P < 0.001; ns not significant, GO gene ontology, KEGG Kyoto Encyclopedia of Genes and Genomes, GSEA gene set enrichment analysis.

We also compared the expression of 24 HLA family genes and 48 immune checkpoints between the high- and low-TMErisk groups. According to the Wilcoxon test, 20 HLA family genes and 33 immune checkpoints were differentially expressed in the high-TMErisk group (Fig. 3C). Our analysis also showed that the TMErisk score (as a continuous variable) was significantly associated with the expression of 16 HLA genes and 28 immune checkpoints, including HLA-DRA, CTLA-4, BTLA, B7-H3, and VISTA (Fig. 3D).

The distribution of infiltrating immune cells inferred by TIMER, CIBERSORT, and xCell in the low- and high-TMErisk groups was also investigated (Fig. 4). Our results demonstrated that most immune and stromal cell types were less abundant in the high-TMErisk group. However, common lymphoid progenitors, T helper 1 (Th1) cells, and T helper 2 (Th2) cells showed higher infiltration in this group (Fig. 4). It should also be noted that the ratio of M2 macrophages/(M2 macrophages + M1 macrophages) was higher in patients in the low-TMErisk group (P < 0.05, Fig. S7), indicating that patients in the low-TMErisk group tended to present an M2 phenotype. Meanwhile, we did not observe a significant difference in Th2/(Th1 + Th2) between the two groups (P > 0.05, Fig. S7).

Fig. 4: Landscape of immune and stromal cell infiltrations in the low- and high-TMErisk groups.

The heatmap shows the normalized scores of immune and stromal cell infiltrations. Blue represents cells with lower infiltration in the high-TMErisk group and red represents cells with higher infiltration in the high-TMErisk group. The statistical difference between the two groups was compared by the Wilcoxon test. *P < 0.05; **P < 0.01; ***P < 0.001; ns not significant. The tumor mutation burden, which was calculated by package “maftools”, was also displayed. The clinical features and gene mutation patterns of patients were also illustrated as an annotation.

Mutation status in LUAD patients in the high- and the low-TMErisk groups

To investigate TMErisk-related mechanisms in LUAD, somatic mutations from the TCGA database were also analyzed. When the mutation frequency was compared between samples of the low- and high-TMErisk groups, more somatic mutations, both non-synonymous and synonymous, were observed in the high-TMErisk group (Fig. 5A). Meanwhile, maftools analysis showed that 19 genes were mutated more frequently in LUAD patients in the high-TMErisk group, namely TP53, ASTN1, RYR2, XIRP2, LRP1B, CSMD1, RYR3, PCDH15, TTN, KEAP1, ADAMTS12, ERICH3, DMD, CDH10, SI, SYNE1, NRXN1, CACNA1E, and COL11A1 (Fig. 5B). Among these genes, eight were differentially expressed between the wild type and the mutant type (Fig. S8). The mutation frequency of these genes is also shown in Fig. 4. Moreover, significant co-occurrences were observed among mutations of these genes (Fig. 5C). Co-occurring mutations in at least two of the 19 genes were found in 82.1% of patients in the high-TMErisk group and 58.3% in the low-TMErisk group (P < 0.001); the TP53-KRAS co-mutation rate in the high-TMErisk group was close to that in the low-TMErisk group (10.6% and 7.4%, respectively, P > 0.05). Interestingly, co-mutations and single mutations of KEAP1, KRAS, STK11, PBRM1, and SMARCA4, which indicate a worse response to immunotherapy, occurred significantly more often in the high-TMErisk group (55.3%) than in the low-TMErisk group (42.3%) (P < 0.05, Fig. 5D). Meanwhile, we observed more truncating mutations of TP53 in the high-TMErisk group (23.5%) than in the low-TMErisk group (15.9%) (P < 0.05, Fig. 5E).

Fig. 5: TMErisk was related to tumor mutation status.

A Association between synonymous mutation counts, non-synonymous mutation counts, all mutation counts, and TMErisk score and their distribution in the low- and the high-TMErisk groups. B Forest plot of genes mutating differentially in patients of the low- and the high-TMErisk groups. C Interaction effect of genes mutating differentially in patients in the low- and the high-TMErisk groups. D Constitution of wild type, single mutation, and co-mutation among the 19 genes, TP53-KRAS and KEAP1-X. 1: X represented KRAS, STK11, SMARCA4, and PBRM1. E Lollipop chart displaying mutation sites of TP53 proteins.

TMErisk predicts therapeutic benefits

To assess the potential of TMErisk as a biomarker for predicting the response of LUAD patients to drugs (including chemotherapy, targeted therapy, and immunotherapy), we inferred the IC50 values of 138 drugs in TCGA-LUAD patients. We found that patients in the low-TMErisk group might be more sensitive to pyrimethamine, lapatinib, gemcitabine, etc., while patients in the high-TMErisk group might be more sensitive to sorafenib, paclitaxel, docetaxel, etc. (Fig. S9A). In terms of response to immunotherapy, patients in the low-TMErisk group had a lower TIDE score and a higher IPS (Fig. S9B, C), indicating that patients in the low-TMErisk group might respond better to immunotherapy.

Discussion

In the present study, we developed a risk scoring system (TMErisk) based on immune and stromal scores and performed external validation for its performance. Our results showed that TMErisk was an independent prognostic factor and had a considerable effect on predicting the OS of LUAD patients. Patients in the high-TMErisk group presented lower levels of immune and stromal cell infiltration, lower immunogenicity, higher tumor purity, and higher somatic mutation status than patients in the low-TMErisk group. Moreover, some gene mutations that negatively influence immunotherapy response were enriched in the high-TMErisk group. Finally, TMErisk could be a potential biomarker for predicting immunotherapeutic response and efficacy considering the distribution of the TIDE score and IPS in the high- and the low-TMErisk groups.

Various studies have applied the ESTIMATE algorithm to cancers [14,15,16,17] and demonstrated its effectiveness in exploring the prognostic value of the TME. Interestingly, in contrast to previous studies of gastric cancer [18], clear cell renal cell carcinoma [17], and glioblastoma [14], higher stromal and immune scores suggested better OS for LUAD patients, indicating that TME scores may have distinct prognostic value in different tumor types. Therefore, a pan-cancer analysis may be needed to clarify their effect in different tumor types. Kinoshita et al. demonstrated that smokers with LUAD had higher numbers of CD8+ T cells than non-smokers [19]. However, in our study, smokers had lower immune scores, suggesting that smoking may reduce the infiltration of other immune cells. In addition, the smoking history of patients was based on clinical interviews, and this potential bias may also have influenced the results of this analysis.

Obtaining bulk gene expression profiles to calculate ESTIMATE scores is costly. Therefore, we constructed a prognostic model using only eight genes that were selected based on stromal and immune scores. The genes in the TMErisk model represent positive or negative regulators of the TME. For example, LOX encodes a protein belonging to the lysyl oxidase family, whose major function is the covalent crosslinking of collagens and elastin in the extracellular matrix [20]. In addition, its expression exhibits a suppressive effect on T-cell infiltration. CX3CR1 encodes a protein mainly involved in the CX3CL1-CX3CR1 pathway, which has been reported to mediate macrophage polarization, cell migration, and other TME-related processes [21, 22]. SERPINE1 belongs to the serine proteinase inhibitor (serpin) superfamily and is also known as a plasminogen activator inhibitor. SERPINE1 plays a crucial role in extracellular matrix degradation and remodeling [23], as well as cell migration [23, 24]. Proteins encoded by other genes in the TMErisk model are immune checkpoints (such as CD200R1, which regulates myeloid cell and T-cell activity through binding to CD200 [25]) or transcription factors of immune checkpoints (such as IRF1, which promotes the expression of programmed death-ligand 1 (PD-L1)). In addition, OR7E47P requires further investigation because its expression affects OS while the underlying mechanism remains unclear.

A strength of the present study is that the TMErisk model was validated in five independent datasets, which supports its robustness. Time-dependent AUC showed that TMErisk had good accuracy in predicting OS in TCGA and the five GEO datasets. In addition, the C statistics suggested that combining TMErisk with TNM stage could significantly improve the accuracy of evaluating the OS of patients. The stratified analysis also confirmed that TMErisk could predict survival outcomes in different TNM stages. Taken together, these results indicate that TMErisk could have clinical applications as a supplement to the TNM stage. Additionally, the drug sensitivity analysis revealed that TMErisk might be helpful in therapeutic applications. Predicted sensitivity to some drugs, such as gemcitabine, cisplatin, docetaxel, paclitaxel, and erlotinib, differed between the low- and high-TMErisk groups. These results suggest that the TME influences the response to chemotherapy and targeted therapy. Studies have shown that different TME phenotypes are related to the therapeutic efficacy of chemotherapy, targeted therapy, and immunotherapy [26,27,28]. Therefore, a strategy that optimizes combination regimens of chemotherapy, targeted therapy, and immunotherapy based on the TME may be effective.

Immunotherapy is an emerging treatment for multiple cancers, including lung cancer. Identifying the patients who may benefit from immunotherapy remains a great challenge. PD-L1 expression, microsatellite instability status, and mutation load have been reported to be insufficient for representing the potential benefit from immunotherapy [29,30,31]. Emerging evidence shows that different TME phenotypes may be associated with different survival outcomes and different degrees of benefit from immunotherapy [32, 33]. Our results showed that high TMErisk tended to predict an immune-suppressed status. Lower immune activity was observed in the high-TMErisk group, including lower immune cell infiltration and downregulation of HLA-I and HLA-II expression. The expression of checkpoints (except B7-H3) was also downregulated, which may be associated with the overall downregulation of immune activity. Our results also showed that patients in the low-TMErisk group were more likely to present an M2 phenotype. Tumor-associated macrophages are thought to differentiate into two main phenotypes: proinflammatory M1 and protumorigenic M2. However, in some studies of NSCLC, some M2 phenotypes (such as CD204+) and a high islet density of M2 macrophages were enriched in the group with better prognosis [34, 35], which is similar to our findings. Therefore, how macrophage phenotype affects the prognosis of patients with lung cancer requires more evidence and discussion.

Tumor mutational burden (TMB) has been identified as a biomarker of immunotherapy response [36, 37], where higher TMB predicts greater benefit from immunotherapy [36]. We found that patients in the high-TMErisk group had higher TMB. However, as discussed above, patients in the high-TMErisk group presented lower immune activity, suggesting that high TMB does not necessarily predict high immunogenicity. Further analysis revealed that mutations of the 19 genes were the major reason for the high TMB in the high-TMErisk group. Interestingly, these genes had a high frequency of co-mutation, indicating that co-occurring mutations of these genes may lead to an unknown change in the regulation of the TME. Whether these co-mutations influence patients’ response to immunotherapy warrants further investigation.

Among the 19 genes, TP53 and KEAP1 were mutated more frequently in the high-TMErisk group. Mutations of these two genes were recently reported to be related to immunotherapy response. Assoun et al. found that TP53 mutations reflected TMB and were associated with immunotherapy benefit in advanced NSCLC [38]. Other studies also reported that TP53 mutation and its related co-mutations (such as TP53+KRAS) increased TMB and the expression of immune checkpoints, thereby influencing patients’ response to immunotherapy. However, some studies revealed that the different responses to immunotherapy between patients with wild-type and mutant TP53 might depend on specific mutation types [39, 40]. For example, nonsense [40] or truncating [39] TP53 mutations, which were more frequent in the high-TMErisk group, may predict a poor response to immunotherapy. A recent study by Marinelli et al. found that mutations in KEAP1, STK11, SMARCA4, and PBRM1 predicted non-response to immunotherapy in LUAD patients [41]. Furthermore, according to their research, co-mutations in at least two genes among KEAP1, STK11, SMARCA4, and PBRM1 had stronger power in predicting immunotherapy response [41]. Additionally, TP53-KRAS co-mutation and KEAP1-KRAS co-mutation were reported to predict better and poorer outcomes, respectively, in patients who received immunotherapy. In our study, TP53 was highly mutated in the high-TMErisk group, in which patients had higher TMB. However, there was no higher expression of immune checkpoints in this group. Additionally, KEAP1, STK11, SMARCA4, PBRM1, and KRAS mutations and their co-mutations were also more frequent in the high-TMErisk group, while only a minor difference in TP53-KRAS co-mutation was observed between the low- and high-TMErisk groups. Taken together, these results and those of the TIDE score and IPS suggest that patients in the low-TMErisk group may respond better to immunotherapy.
Overall, our model may be a useful clinical tool to help clinicians evaluate prognosis and guide therapies (including chemotherapy, targeted therapy, and immunotherapy).

Although our model may help evaluate prognosis and guide therapy, it should be prospectively validated, and clinicopathological factors should be considered when using it. Additionally, although we evaluated the predicted response of LUAD patients to immunotherapy in TCGA, we could not analyze whether immunotherapy-treated LUAD patients with different TMErisk scores benefit differently, because expression data from LUAD patients undergoing immunotherapy were lacking. Future work should compare TMErisk scores with current biomarkers and explore the correlation between TMErisk and immunotherapy response in LUAD patients.

In summary, we constructed a TMErisk model to predict the OS of LUAD patients and validated it externally in multiple datasets. Functionally, the TMErisk score was related to the immune response of LUAD patients. Its favorable performance in the validation datasets suggests that the model is robust and may be broadly applicable.

Materials and methods

Data source and analysis for immune and stromal scores

TCGA RNA-seq data (FPKM normalized) and follow-up data of LUAD patients were downloaded from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Normalized microarray gene expression data of GSE3141, GSE30219, GSE31210, GSE37745, and GSE81089 were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The TCGA dataset was used to investigate potential TME- and prognosis-related genes, based on which TMErisk was subsequently established. The GEO datasets were used independently for external validation. Immune and stromal scores were calculated using the ESTIMATE algorithm [13], which is provided in the R package “estimate”. The tumor purity of patients was inferred by the ESTIMATE and consensus measurement of purity estimations (CPE) algorithms [42], and the correlation between tumor purity and immune/stromal scores was evaluated using the Pearson method. To determine the optimal cut-off point for all continuous variables in our study, maximally selected rank statistics from the “maxstat” R package and X-tile were used. The cut-off points used in the study are listed in Table S1.

Differentially expressed gene (DEG) analysis and weighted gene co-expression network analysis (WGCNA) based on RNA-seq data

DEG analysis was performed using the Wilcoxon test. DEGs were defined as genes with a false discovery rate (FDR) value (the adjusted P-value calculated using the Benjamini–Hochberg method) < 0.05 and |Log2 (Fold Change (FC))| > 1. WGCNA was used to recognize co-expressed gene modules closely related to the immune and stromal scores. Gene modules with |correlation coefficient| > 0.5 were considered strong TME-correlated modules. The intersections of DEGs and strong TME-correlated gene modules were used as input to the Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis. Genes with an FPKM-normalized expression value of 0 in more than 50% of cases in TCGA were excluded from these two analyses.
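The DEG criteria above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the per-gene P-values are assumed to have been computed already (for example with the Wilcoxon test), the function names `benjamini_hochberg` and `select_degs` are hypothetical, and the gene labels and numbers in the example are toy values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that each adjusted
    # value is the minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, p_values, log2_fold_changes,
                fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with FDR < fdr_cutoff and |log2 FC| > lfc_cutoff."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, q, lfc in zip(genes, fdr, log2_fold_changes)
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff
    ]


# Toy input (illustration only, not data from the study).
toy_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_p = [0.001, 0.002, 0.03, 0.5]
toy_lfc = [1.8, -2.4, 0.3, 3.0]
toy_degs = select_degs(toy_genes, toy_p, toy_lfc)
```

In the toy input, GENE_C passes the FDR threshold but fails the fold-change threshold, and GENE_D shows the opposite pattern, so only GENE_A and GENE_B are retained.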

LASSO regularization and development of TME-related risk score

LASSO is a regularization and dimension-reduction method that can be combined with the Cox model to screen biomarkers for survival analysis [43]. The immune- and stromal-related genes screened by both the DEG analysis and WGCNA were entered separately into the LASSO-Cox regression to identify the hub genes. These genes were then transformed into binary variables (high versus low expression), and univariate Cox regression analyses were performed to evaluate their prognostic value in the TCGA and Kaplan–Meier (KM) Plotter [44] databases. The genes whose univariate Cox P-value was < 0.1 in both the TCGA and KM Plotter databases were entered into the Akaike information criterion (AIC)-based stepwise Cox regression to develop the TMErisk. In this process, the model with the minimal AIC value was taken as the final model. The risk score was calculated by the formula: TMErisk = ∑ βi*Expi, where βi is the coefficient of each gene in the final Cox model and Expi represents the gene expression value. For each gene, high expression was given 1 point and low expression 0 points. A flow diagram is provided in the supplementary files (Fig. S1) to help understand the process of TMErisk development.
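The AIC-based selection step can be illustrated with a short Python sketch. It assumes a backward elimination strategy, which is only one possible form of stepwise regression (the direction is not specified above), and it uses AIC = 2k − 2 log L, where k is the number of variables in the model. The callable `log_likelihood_of` is a hypothetical stand-in for fitting a Cox model and returning its partial log-likelihood; the toy function and numbers below exist purely to make the example runnable.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(candidates, log_likelihood_of):
    """Greedy backward elimination driven by AIC.

    `log_likelihood_of(variables)` must return the log-likelihood of the
    model fitted with exactly those variables. A variable is dropped only
    if removing it lowers the AIC; otherwise the search stops.
    """
    current = list(candidates)
    current_aic = aic(log_likelihood_of(current), len(current))
    while current:
        # Score every model obtained by removing exactly one variable.
        trials = []
        for var in current:
            reduced = [v for v in current if v != var]
            trials.append((aic(log_likelihood_of(reduced), len(reduced)), reduced))
        best_aic, best_model = min(trials, key=lambda t: t[0])
        if best_aic >= current_aic:
            break
        current, current_aic = best_model, best_aic
    return current, current_aic


# Toy log-likelihood (illustration only): each variable adds a fixed gain.
TOY_GAINS = {"g1": 5.0, "g2": 0.4, "g3": 3.0}


def toy_log_likelihood(variables):
    return -100.0 + sum(TOY_GAINS[v] for v in variables)


toy_model, toy_aic = backward_stepwise(["g1", "g2", "g3"], toy_log_likelihood)
```

In the toy example, removing g2 costs only 0.4 units of log-likelihood, which is less than the penalty of one parameter, so it is dropped, while g1 and g3 are kept.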

Analysis of tumor immune signatures and function enrichment for TMErisk

Tumor immune signature was evaluated in two aspects: (1) the expression of the human leukocyte antigen (HLA) gene family and immune checkpoints [45, 46]; and (2) the levels of infiltrating immune and stromal cells calculated by the TIMER [47], CIBERSORT [48], and xCell [49] algorithms. These results are available from the TIMER2.0 website (http://timer.comp-genomics.org/). The potential biological functions related to TMErisk were identified using the gene set enrichment analysis (GSEA) method and annotated with the gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) databases. In GSEA, an FDR value < 0.05 was considered significantly enriched.

Analysis of the tumor mutation status in the low- and high-TMErisk group

Somatic mutation data for the TCGA samples were downloaded from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Among the mutation types, Frame_Shift_Del, Frame_Shift_Ins, In_Frame_Del, In_Frame_Ins, Missense, Nonsense, Nonstop, Splice_Site, and Translation_Start_Site were treated as non-synonymous mutations. Silent and other mutation types, including Intron, 3′UTR, 5′UTR, 3′Flank, 5′Flank, IGR, RNA, and Splice region, were regarded as synonymous mutations. In our analysis, a synonymous mutation was regarded as wild type when comparing mutation frequency between groups, since it does not cause a protein change. Frame_Shift_Del, Frame_Shift_Ins, and Nonsense were coded as truncating mutations, consistent with the study of Zhao et al. [39]. Significantly mutated genes (P < 0.05) between the low- and high-TMErisk groups and the interaction effect of gene mutations were analyzed by maftools. In these two analyses, only genes mutated more than 30 times in at least one group were considered. Differences in the proportion of mutations were tested by the one-sided z-test and the two-sided Chi-square test, and P < 0.05 was considered significant.
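The variant grouping described above can be expressed as a small lookup in Python. This is a sketch under assumptions rather than the study's code: the set contents follow the categories listed in the text, while the helper names and the handling of the `_Mutation` suffix (so that a label such as Missense_Mutation is read as Missense) are illustrative choices.

```python
# Variant classes treated as non-synonymous, as listed in the text.
NON_SYNONYMOUS = {
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Missense", "Nonsense", "Nonstop", "Splice_Site",
    "Translation_Start_Site",
}
# Variant classes coded as truncating, as listed in the text.
TRUNCATING = {"Frame_Shift_Del", "Frame_Shift_Ins", "Nonsense"}


def normalize_label(label):
    """Strip a trailing '_Mutation' (assumption about input labels)."""
    suffix = "_Mutation"
    return label[: -len(suffix)] if label.endswith(suffix) else label


def classify_variant(label):
    """Return flags for one variant classification label."""
    name = normalize_label(label)
    return {
        "non_synonymous": name in NON_SYNONYMOUS,
        "truncating": name in TRUNCATING,
    }


def gene_status(variant_labels):
    """Return 'mutant' if any variant is non-synonymous, else 'wild_type'.

    Synonymous variants (Silent, Intron, UTR, ...) count as wild type,
    mirroring the rule stated in the text.
    """
    if any(classify_variant(v)["non_synonymous"] for v in variant_labels):
        return "mutant"
    return "wild_type"
```

For example, `gene_status(["Silent", "Intron"])` returns `"wild_type"`, whereas `gene_status(["Silent", "Missense_Mutation"])` returns `"mutant"`.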

Prediction of therapeutic sensitivity in patients with different TMErisk scores

We investigated the capacity of TMErisk to predict the response to immunotherapy and to 138 chemotherapy/targeted therapy drugs. The 50% inhibiting concentration (IC50) values of the 138 drugs were inferred using the pRRophetic algorithm and were normally transformed. The potential response of patients to immunotherapy was inferred by the tumor immune dysfunction and exclusion (TIDE) score and immunophenoscore (IPS). Generally, a lower TIDE score and a higher IPS predict a better response to immunotherapy.

Survival and other statistical analysis

For categorical variables, the KM and Cox regression analyses were used to calculate the significance of differences in OS. The statistical difference in OS in the KM curve analysis was compared using the log-rank test. For continuous variables, Cox regression was used to calculate the hazard ratio and the significance of differences in OS. The time-dependent area under the receiver operating characteristic curve (AUC) and C statistics (also named concordance index, C-index) were used to evaluate the predictive power of TMErisk and TNM stage for OS. A higher value of either indicator represents higher accuracy. Stratified analysis was performed to investigate the survival difference in subgroups defined by age, gender, TNM stage, race, smoking status, performance score, and therapies. To balance the characteristics of the low- and high-TMErisk groups, 1:1 propensity score matching was performed, after which the stratified analysis was repeated. The Spearman method was used to calculate the correlation coefficient and the P-value in the correlation analysis of TMErisk and other variables. Statistical differences in distribution among three or more groups were examined by the Kruskal–Wallis test and those between two groups by the Wilcoxon test. In each analysis above, only patients with complete related information were included. For example, 9 patients were excluded from the survival-related analysis since their OS time was unavailable. Unless otherwise stated, the P-values were two-sided and P < 0.05 was considered statistically significant.
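To make the C statistic concrete, the sketch below computes a concordance index for right-censored survival data in plain Python. It follows Harrell's pairwise definition, which is assumed here because the exact variant used in the analysis is not stated above; the function name `concordance_index` and the toy data are hypothetical, and pairs with tied survival times are simply skipped for brevity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored data.

    times       -- follow-up time of each patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk

    A pair (i, j) is comparable when patient i had an observed event
    strictly before the follow-up time of patient j. The pair is
    concordant when patient i also has the higher risk score; ties in
    the risk score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustration only): four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_c = concordance_index(toy_times, toy_events, [0.9, 0.1, 0.4, 0.7])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; in the toy data, three of the five comparable pairs are ordered correctly, giving 0.6.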