Introduction

Lung cancer is the most common type of malignancy and one of the leading causes of cancer-related death worldwide [1, 2]. Non-small cell lung cancer (NSCLC) accounts for approximately 80 % of all lung cancers, and its incidence is increasing annually. Although advances in diagnosis and treatment have improved the survival of lung cancer patients, the 5-year overall survival (OS) rate of NSCLC is only 15 % [3]. This poor prognosis has been attributed to tumor invasion, metastasis, and recurrence. Therefore, there is an urgent need to identify biomarkers that enable early diagnosis and accurate prediction of prognosis and that may serve as novel therapeutic targets.

SOX4 is a member of the sex-determining region Y-related high mobility group box (SOX) transcription factor family, which consists of at least 20 highly conserved transcription factors in humans [4]. The SOX4 gene is located on 6p22.3 and encodes a 47 kDa protein (474 amino acids) [5]. The SOX4 protein is characterized by a highly conserved high mobility group (HMG) DNA-binding domain, which mediates direct binding of SOX4 to a short target DNA sequence [5]. The 79-amino-acid HMG domain binds to the consensus target sequence (A/T)ACAA(T/A) in the minor groove of DNA and modifies the chromatin structure to generate a conformation that facilitates various DNA-dependent activities [5]. SOX4 is involved in many developmental processes, such as thymocyte development, nervous system development, embryonic cardiac development, and T cell differentiation [6–8]. In addition, recent studies have suggested that SOX4 is one of the most frequently overexpressed proteins in human cancers and contributes to tumorigenesis and tumor progression in hepatocellular carcinoma [9, 10], prostate cancer [11, 12], breast cancer [13, 14], colorectal cancer [15, 16], bladder cancer [17, 18], pineoblastoma [19], uterine leiomyoma [20], pancreatic cancer [21], pheochromocytoma [22], glioblastoma [23], endometrial cancer [24], medulloblastoma [25], and adenoid cystic carcinoma [26]. However, little is known about the significance of SOX4 in patients with NSCLC.

In lung cancer, SOX4 has been identified as a target of gene amplification at chromosome 6p [27]. Moreover, a microarray analysis indicated that SOX4 levels were significantly increased in small cell lung cancer (SCLC) compared with normal lung tissues [28]. So far, no study has evaluated the pathological role of SOX4 in NSCLC patients. The aim of this study was to assess the association between SOX4 expression and clinical characteristics, including prognosis, in NSCLC patients and to explore the role of SOX4 in NSCLC development and progression.

Materials and methods

Analysis of microarray data

Two microarray data sets were retrieved from the GEO database: GSE3268, comprising five pairs of squamous cell carcinoma and adjacent normal lung tissue specimens submitted by Wachi, and GSE19804, comprising sixty pairs of adenocarcinoma and adjacent normal lung tissue specimens submitted by Lu. Differentially expressed genes were screened, and candidates were verified by real-time PCR in the subsequent experiments.

Sample collection

A total of 10 fresh-frozen NSCLC samples and 10 normal lung samples were collected from Xianning Central Hospital. All fresh samples were immediately preserved in liquid nitrogen. In addition, 168 paraffin-embedded NSCLC specimens and 41 normal lung specimens were retrieved from Xianning Central Hospital. No patient had received any form of tumor-specific therapy before diagnosis. Prior consent from the patients and approval from the Institutional Ethics Committee of Xianning Central Hospital were obtained before these clinical samples were used. The histopathological diagnosis of each sample was made independently by two pathologists. Clinical staging was based on the 7th edition of the AJCC Cancer Staging Manual. The 168 NSCLC cases comprised 98 males and 70 females, with ages ranging from 21 to 73 years (median 58.19 years). The clinical follow-up time ranged from 6 to 60 months. OS was defined as the interval from the date of diagnosis to NSCLC-related death.

Real-time PCR

Total RNA was extracted using Trizol (Invitrogen) according to the manufacturer’s protocol. After purification, complementary DNA (cDNA) was synthesized from 10 μg of total RNA using the PrimeScript RT Master Mix (Takara). The primers (Invitrogen) were as follows: for human SOX4, forward 5′-CTTGACATGATTAGCTGGCATGATT-3′ and reverse 5′-CCTGTGCAATATGCCGTGTAGA-3′; for human GAPDH, forward 5′-CCCACTCCTCCACCTTTGAC-3′ and reverse 5′-ATGAGGTCCACCACCCTGTT-3′. Real-time PCR was performed with SYBR Premix Ex Taq™ II (Takara) on a LightCycler (Roche). Relative RNA expression was calculated using the 2^−ΔΔCt method. Data are presented as the relative quantity of target mRNA, normalized to GAPDH and expressed relative to a calibrator sample. Each sample was examined in triplicate.
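The 2^−ΔΔCt calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name and all Ct values are hypothetical, and the sketch assumes that triplicate Ct values are averaged before normalization, which the text does not specify.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene by the 2^-ddCt method.

    ct_target / ct_reference: Ct of SOX4 and GAPDH in the test sample.
    ct_target_cal / ct_reference_cal: the same two Ct values in the calibrator.
    """
    delta_ct_sample = ct_target - ct_reference              # normalize to GAPDH
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator  # relative to calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values (not data from this study).
tumor_sox4, tumor_gapdh = [24.1, 24.0, 23.9], [18.0, 18.1, 17.9]
normal_sox4, normal_gapdh = [26.0, 26.1, 25.9], [18.0, 17.9, 18.1]

fold_change = relative_expression(
    mean(tumor_sox4), mean(tumor_gapdh), mean(normal_sox4), mean(normal_gapdh)
)
print(f"Relative SOX4 expression (tumor vs. calibrator): {fold_change:.2f}")
```

A ΔΔCt of −2 therefore corresponds to a fourfold higher level of the target transcript than in the calibrator.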

Western blot

Western blotting was carried out as previously described [29] with an anti-SOX4 antibody (1:1,000; Abcam, ab80261). An HRP-conjugated anti-rabbit IgG antibody was used as the secondary antibody (1:2,000; Cell Signaling Technology). Signals were detected using enhanced chemiluminescence reagents (Pierce).

Immunohistochemistry

Paraffin sections from NSCLC and normal lung specimens were deparaffinized in xylene and rehydrated through a descending ethanol series (100, 95, 90, 80, 70 % ethanol) and double-distilled water according to standard protocols. Heat-induced antigen retrieval was performed by boiling the sections in citrate buffer for 10 min. After antigen retrieval, sections were treated with 3 % hydrogen peroxide to block endogenous peroxidase activity and with 1 % bovine serum albumin to block non-specific binding. The sections were incubated with the SOX4 antibody (Abcam, ab80261, dilution 1:100) overnight at 4 °C. After washing with phosphate-buffered saline, the sections were incubated with the biotinylated secondary antibody and then with the streptavidin–horseradish peroxidase complex, each for 20 min at room temperature. Diaminobenzidine was used as the chromogen, and the sections were counterstained with haematoxylin and viewed under a bright-field microscope.

Evaluation of staining

Tissue sections stained immunohistochemically for SOX4 were reviewed and scored separately by two pathologists blinded to the clinical parameters. Any disagreements were arbitrated by a third pathologist. Staining intensity was graded (0, negative; 1, weak; 2, moderate; 3, strong), and the percentage of positively stained cells was scored (0, ≤10 %; 1, 11–50 %; 2, 51–75 %; 3, ≥76 %). The final score was the product of the intensity score and the proportion score [29] and ranged from 0 to 9. For statistical analysis, final scores of 0–4 and 6–9 were considered low and high expression, respectively.
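The scoring scheme above can be expressed as a short sketch. The function names are hypothetical, and the handling of percentages that fall between the listed categories (for example, 75.5 %) is an assumption made here for illustration, since the text gives whole-number ranges only.

```python
def proportion_score(percent_positive):
    """Map the percentage of positively stained cells to a score of 0-3."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 10:
        return 0
    if percent_positive <= 50:
        return 1
    if percent_positive <= 75:  # assumption: values up to 75 % fall in category 2
        return 2
    return 3


def final_score(intensity, percent_positive):
    """Final staining score = intensity score x proportion score (0-9)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * proportion_score(percent_positive)


def expression_level(score):
    """Scores of 0-4 are classed as low expression, 6-9 as high."""
    return "high" if score >= 6 else "low"


# Hypothetical case: moderate staining in 80 % of tumor cells.
score = final_score(intensity=2, percent_positive=80)
print(score, expression_level(score))
```

Because the final score is a product of two values in the range 0–3, only the scores 0, 1, 2, 3, 4, 6, and 9 can occur, so the low (0–4) and high (6–9) groups cover every possible case.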

Statistical analysis

All statistical analyses were performed using SPSS version 13.0 and GraphPad 5.0 software. The unpaired t test was used to compare SOX4 mRNA expression between NSCLC tissues and normal lung tissues. The Chi-square test was used to examine differences in SOX4 protein expression between NSCLC tissues and normal lung tissues and to examine the relationship between SOX4 expression levels and clinicopathologic characteristics. Survival curves were plotted using the Kaplan–Meier method and compared using the log-rank test. The significance of survival variables was analyzed using the Cox multivariate proportional hazards model. A P value of less than 0.05 was considered statistically significant.
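For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines. This is an illustration only: the analyses in this study were run in SPSS and GraphPad, the function name is hypothetical, and the follow-up data below are invented for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times: follow-up time in months for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up data (months) for six patients.
months = [6, 12, 12, 24, 36, 60]
died = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(months, died):
    print(f"{t:>3} months: S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk up to their last follow-up but do not lower the survival estimate, which is why the curve only steps down at death times.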

Results

SOX4 mRNA and protein were overexpressed in NSCLC tissues

In the microarray data sets, SOX4 was more highly expressed in lung squamous cell carcinoma and adenocarcinoma tissues than in paired adjacent normal lung tissues (Fig. 1).

Fig. 1 Increased SOX4 expression in lung squamous cell carcinoma and adenocarcinoma, shown by microarray analysis of the GSE3268 and GSE19804 data sets retrieved from the GEO database

Furthermore, we performed real-time PCR to verify the expression of SOX4 mRNA transcripts in ten fresh NSCLC tissues and ten fresh normal lung tissues. Compared with normal lung tissues, NSCLC tissues showed higher expression levels of SOX4 mRNA (P = 0.001, Fig. 2).

Fig. 2 Expression of SOX4 mRNA is increased in NSCLC tissues compared with normal lung tissues by real-time PCR

We measured SOX4 protein expression in 168 archived paraffin-embedded NSCLC samples and 41 normal lung samples by immunohistochemical staining (Fig. 3a–f). Specific SOX4 staining was found in the nucleus. SOX4 protein was overexpressed in 61.3 % (103/168) of NSCLC samples. In comparison, only 31.7 % of normal lung samples showed high SOX4 expression, a significantly lower proportion than in the NSCLC samples (P = 0.001, Table 1). We also measured SOX4 protein levels in NSCLC tissues and adjacent normal lung tissues by Western blot, which showed that SOX4 protein was significantly overexpressed in NSCLC tissues compared with adjacent normal lung tissues (Fig. 4).
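The comparison of high-expression proportions can be illustrated with a Pearson Chi-square test on a 2 × 2 table. This is a sketch under stated assumptions, not a reproduction of the SPSS output: the count of 13 high-expressing normal samples is inferred here from the reported 31.7 % of 41, no continuity correction is applied, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). No continuity correction is applied.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: NSCLC, normal lung; columns: high, low SOX4 expression.
# 103/168 is reported in the text; 13/41 is inferred from 31.7 % (assumption).
statistic, p_value = chi_square_2x2(103, 65, 13, 28)
print(f"chi-square = {statistic:.2f}, P = {p_value:.4f}")
```

With these counts the statistic is close to 11.7, which corresponds to a P value well below the 0.05 threshold used in this study.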

Fig. 3 Immunohistochemical staining of SOX4 in NSCLC tissues (original magnification ×400). a Negative expression of SOX4 in normal lung tissues; b–f Expression of SOX4 in NSCLC tissues (b final score = 0; c final score = 2; d final score = 4; e final score = 6; f final score = 9)

Table 1 Expression of SOX4 protein in lung cancer and normal lung tissues

Fig. 4 SOX4 protein expression was higher in NSCLC tissues than in adjacent normal lung tissues (N normal tissue, T tumor tissue)

Association between clinicopathological characteristics and expression of SOX4 in NSCLC patients

The association between clinicopathological characteristics and SOX4 expression levels in patients with NSCLC is summarized in Table 2. We found no significant association of SOX4 expression levels with gender (P = 0.538), age (P = 0.618), smoking (P = 0.417), or pathological classification (P = 0.809). However, SOX4 expression was significantly associated with degree of differentiation (high vs. middle, P = 0.004; high vs. low, P < 0.001), clinical stage (I–II vs. III–IV, P < 0.001), T classification (T1–T2 vs. T3–T4, P = 0.004), N classification (N0–N1 vs. N2–N3, P = 0.002), and M classification (M0 vs. M1, P = 0.011) in NSCLC.

Table 2 Correlation between the clinicopathologic characteristics and expression of SOX4 protein in NSCLC

Survival analysis

To explore the prognostic value of SOX4 expression in NSCLC, we assessed the association between SOX4 expression levels and patient survival using Kaplan–Meier analysis with the log-rank test. In the 168 NSCLC patients with prognostic information, the level of SOX4 protein expression was significantly associated with OS: patients with lower SOX4 expression had better survival than those with higher SOX4 expression (P < 0.001, Fig. 5). Increased SOX4 expression also predicted poor prognosis in NSCLC patients regardless of clinical stage, T classification, N classification, and M classification. Multivariate analysis showed that increased SOX4 expression was an independent predictor of poor prognosis for NSCLC patients (P = 0.002, Table 3).

Fig. 5 Increased SOX4 protein expression predicts an unfavorable prognosis. The association between patient survival and SOX4 expression was estimated using the Kaplan–Meier method and the log-rank test

Table 3 Summary of univariate and multivariate Cox regression analyses of overall survival duration

Discussion

SOX4 is a member of the group C subfamily of SOX transcription factors and plays a significant role during embryogenesis, when it is widely expressed and regulates the development of numerous tissues [30]. In many cancers, deregulated expression of SOX4 has been correlated with increased cancer cell growth and survival, suppression of apoptosis, and tumor progression through the induction of an epithelial-to-mesenchymal transition (EMT) and metastasis [30]. In adenoid cystic carcinoma, knockdown of SOX4 results in decreased viability and increased apoptosis, indicating that SOX4 contributes to the malignant phenotype [31]. Moreover, Vervoort et al. reported that TGF-beta-mediated upregulation of SOX4 is required for the induction of a mesenchymal phenotype during EMT in human mammary epithelial cells [32]. Similarly, Zhang et al. found in breast cancer that SOX4 expression induces EMT and promotes metastasis in vitro and in xenograft models in vivo [13].

SOX4 expression is increased in a wide variety of tumors, including leukemia, colorectal cancer, prostate cancer, and breast cancer, suggesting a fundamental role in the development of these malignancies [30]. However, the role of SOX4 in NSCLC is still unclear. In the microarray data of Wachi et al. (GSE3268) and Lu et al. (GSE19804), we found that SOX4 levels were higher in lung squamous cell carcinoma and adenocarcinoma samples than in paired adjacent normal lung samples. We then performed real-time PCR to verify the expression of SOX4 mRNA and found that SOX4 mRNA expression was also elevated in NSCLC samples. Furthermore, immunohistochemistry and Western blot showed that SOX4 protein expression was increased in NSCLC, consistent with the microarray data and the real-time PCR result.

To further explore the role of SOX4 in the development and progression of NSCLC, we analyzed SOX4 expression in 168 NSCLC patients and found that SOX4 overexpression was significantly associated with degree of differentiation, clinical stage, tumor size (T classification), lymph node metastasis (N classification), and distant metastasis (M classification). SOX4 overexpression in NSCLC may accelerate tumor growth and enhance local invasion and metastasis, and our results imply that SOX4 may be involved in NSCLC progression. Similarly, increased levels of SOX4 expression correlated with more aggressive tumors and poor prognosis in prostate cancer [11, 33]. In addition, Zhang et al. demonstrated that SOX4 expression was elevated in breast cancer compared with normal mammary tissue and correlated with histologic grade and with the status of estrogen receptor, progesterone receptor, and HER2 [13]. However, the correlation between SOX4 expression and the survival of NSCLC patients has seldom been reported.

In the past few years, SOX4 overexpression in tumor cells has been shown to be an independent prognostic factor in several types of tumors, with favorable or unfavorable prognostic significance depending on the tumor type. In primary gallbladder carcinoma, SOX4 overexpression was significantly related to better OS and disease-free survival, and SOX4 expression was an independent prognostic factor for both OS and disease-free survival in multivariate analyses [34]. Moreover, Jafarnejad et al. reported that SOX4 expression was remarkably reduced in metastatic melanoma compared with primary melanoma, and that reduced SOX4 expression correlated with poorer disease-specific survival of melanoma patients and was an independent prognostic factor [35]. In contrast, more evidence indicates that SOX4 overexpression is an unfavorable prognostic factor, for example in gastric cancer [36], acute lymphoblastic leukemia (ALL) [37], and colorectal cancer [16]. In gastric cancer patients, overexpression of SOX4 was significantly correlated with metastasis, poor differentiation, and unfavorable prognosis [36]. Similarly, Ramezani-Rad et al. showed that high levels of SOX4 expression in ALL cells at the time of diagnosis predicted poor outcome, and identified SOX4 as a critical activator of PI3K/AKT and MAPK signaling in ALL cells [37]. The differing prognostic significance of SOX4 expression across human cancers may be attributed to tumor heterogeneity. In the present study, we present the first evidence that SOX4 protein expression in NSCLC is inversely correlated with patient OS: patients with higher SOX4 protein expression had shorter survival. According to multivariate analyses, increased expression of SOX4 protein was a significant predictor of unfavorable prognosis for NSCLC patients.

In summary, our study showed that the mRNA and protein expression levels of SOX4 were significantly increased in NSCLC tissues and were associated with the malignant status of NSCLC. Moreover, our results indicated that SOX4 was a significant prognostic factor for NSCLC patients. Because of the limited sample size of our study, further studies are needed to strengthen these findings and to verify the role of SOX4 as a reliable clinical predictor of outcome for NSCLC patients.