Introduction

Cancer-associated fibroblasts are present at all phases of disease progression, including metastasis, and they are a substantial component of the general host response to tissue damage induced by tumour cells [1]. Cancer-associated fibroblasts become synthetic machines that produce diverse extracellular matrix (ECM) components, growth factors, cytokines, proteases, and hormones. They have generally been characterized by expression of α-smooth muscle actin (α-SMA) [2]. Cancer-associated fibroblasts also express biomarkers such as vimentin, fibroblast specific protein, fibroblast activation protein (FAP), osteonectin, and desmin [3]. However, distinct biomarkers have been identified in different tumour subtypes [4], and a unique biomarker that can identify all of the cancer-associated fibroblasts in breast cancer is still lacking.

Stromal cancer-associated fibroblasts of breast cancer have four specific cellular origins [5]. The first is activated normal stromal fibroblasts, which are the primary source and account for 80% of breast cancer-associated fibroblasts [6]. The second is epithelial cells that undergo epithelial–mesenchymal transition (EMT), or endothelial cells that undergo endothelial–mesenchymal transition [7]. The third is bone marrow-derived mesenchymal stem cells (MSCs), which contribute to the regeneration of mesenchymal tissues and are crucial in supporting the growth and differentiation of primitive hemopoietic cells within the bone marrow microenvironment [8]. The fourth comprises transdifferentiated cells in breast tissue, such as pericytes, adipocytes, or smooth muscle cells [9]. The significance of cancer-associated fibroblasts in the clinical diagnosis and prognosis of breast cancer has been studied broadly, and the possible use of some cancer-associated fibroblast biomarkers has been examined. PDGF receptor β and FAP are biomarkers significantly associated with breast cancer [10]. Alteration and/or loss of tumor suppressor genes are also critical features of breast cancer-associated fibroblasts; for example, significant alteration of the tumor suppressor gene Cav-1 has been reported in breast cancer-associated fibroblasts [11]. Stromal cancer-associated fibroblasts are therefore also relevant to triple-negative breast cancer (TNBC).

Triple receptor-negative breast carcinoma (TNBC) is the most frequent and aggressive mammary gland malignancy in female adults. It is identified by the absence of the estrogen receptor (ER) and progesterone receptor (PR) and the lack of overexpression of human epidermal growth factor receptor 2 (HER2), all of which are involved in progression of the disease [12]. It is a rapidly fatal malignancy, and the majority of patients with TNBC suffer from a poor quality of life [13]. At present, the accepted clinical treatment is surgical resection of the malignant tissues, followed by radiotherapy and chemotherapy [14]. However, patients who receive these treatments may rapidly develop resistance to chemotherapy [15]. Recent studies have focused on identifying candidate biomarkers of TNBC progression in order to develop a more efficient therapeutic strategy [16].

To explore the mechanisms of tumor initiation, progression and metastasis and to develop new targeted therapies for TNBC, studies in the past few years have focused on the deregulation of signalling pathways and the gene alterations related to TNBC. Gene expression profiling is a useful tool for finding differentially expressed genes (DEGs) in human TNBC, and thus for finding critical genes or transcription factors that may play crucial roles in the regulation of TNBC development and progression [13]. Numerous previous studies have identified genes that may be used as diagnostic markers or therapeutic targets for TNBC. For example, FOXA1, KRT18, and XBP1 are over-expressed in TNBC and are involved in many important cell functions that contribute to tumorigenesis [17]. FOXA1 has been reported to participate in many physiological and pathological processes such as lymphocyte homing and activation, cell survival and migration, and tumour growth and metastasis [18]. FOXA1, KRT18, and XBP1 are reported to be expressed at high levels, both transcriptionally and translationally, in TNBC, and their expression is related to the degree of malignancy [19]. Furthermore, the up-regulation of FOXA1, KRT18, and XBP1 in malignant TNBC may be an indication of tumour cell growth and migration. ASNA1, NDUFS8, NDUFV1 and NDUFB7 [20] have all been studied, and their potential as targets for the diagnosis and therapy of TNBC has also been evaluated. However, these studies noted only a few DEGs, and the interactions among these genes remain unknown.

In 2016, Marsh et al. [21] performed a microarray data analysis based on GSE75333 to identify gene expression changes related to TNBC among normal fibroblasts, granulin-stimulated fibroblasts and cancer-associated fibroblasts. They obtained numerous differentially expressed genes (DEGs) among these three groups and described the DEGs and their related functions. Nevertheless, a gap remains between laboratory findings and clinical application, and more analyses are needed to identify further candidates for TNBC therapy.

To identify more genetic candidates for TNBC therapy, we used the expression microarray dataset GSE75333, deposited in the Gene Expression Omnibus (GEO) by Marsh et al. [21]. The DEGs in this dataset were analyzed, and their interactions were investigated with a protein-protein interaction (PPI) network. Moreover, the possible functions of the DEGs involved in the PPI network were evaluated by gene and pathway enrichment analyses. The functions of the significant DEGs involved in significant function terms are discussed, and some of these genes might be treated as candidates for TNBC therapy.

Materials and methods

Affymetrix microarray data

The gene expression profile data of GSE75333, based on the GPL570 platform (Affymetrix Human Genome U133_Plus_2 Array; Affymetrix Inc., Santa Clara, CA, USA), were downloaded from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/geo/); the data were deposited by Marsh et al. [21]. The dataset used in this analysis contained nine samples: three normal fibroblast cultures, three granulin-stimulated fibroblast cultures and three cancer-associated fibroblast cultures.

Identification of DEGs

The raw data of the mRNA expression profiles were downloaded and analyzed with the R language [22]. Background correction, quantile normalization, and probe summarization were applied to the raw data. The limma package [23] in Bioconductor (http://www.bioconductor.org/) was used to identify genes that were differentially expressed among normal fibroblasts, granulin-stimulated fibroblasts and cancer-associated fibroblasts; the significance of DEGs was determined by t test and expressed as a p value. To reduce the risk of false positives, p values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR) method, and the adjusted p value is reported as the FDR [24]. FDR < 0.05 was used as the cutoff value for DEG screening.
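The Benjamini–Hochberg adjustment mentioned above can be illustrated with a short Python sketch. This is not the R/limma code used in the study; the function name `benjamini_hochberg` and the five raw p values are hypothetical and serve only to show how adjusted values are derived and compared with the 0.05 cutoff.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for five probes (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(raw_p)
# Indices of probes passing the FDR < 0.05 cutoff.
significant = [i for i, q in enumerate(fdr) if q < 0.05]
```

In this toy input, the third and fourth probes have raw p values below 0.05 but adjusted values above it, which is the kind of false-positive control the adjustment is meant to provide.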

Gene ontology (GO) analysis of DEGs

Gene ontology is an effective tool for organizing an enormous number of gene annotation terms [25]. Lynx, a database and knowledge extraction engine for integrative medicine [26], is a bioinformatics resource consisting of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological functional annotation from large gene/protein lists, such as those derived from high-throughput genomic experiments. To gain a broader understanding of the biological functions of the DEGs, the Lynx tool was used to obtain the enriched GO terms of the DEGs, with p < 0.05 set as the threshold value.

Bio-pathway analysis of DEGs

WIKIPATHWAYS, REACTOME, NCI and KEGG are database resources for understanding the functions of gene lists at the molecular level [27,28,29,30]. Lynx is a valuable tool for functionally interpreting results from experimental techniques in genomics [26]. This web-based application consolidates different sources of information for discovering groups of genes with similar biological meaning. The enrichment analysis of Lynx is crucial in the analysis of high-throughput experiments. In this study, Lynx was used to test the statistical enrichment of DEGs in pathways, with p < 0.05 set as the threshold value.
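A common way to test statistical enrichment of a gene list in a pathway or GO term is a one-sided hypergeometric test. Whether Lynx computes its p values in exactly this way is an assumption here; the sketch below only illustrates the general idea, and the function name and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes annotated to the pathway or term
    sample:     number of DEGs tested
    overlap:    DEGs annotated to the pathway or term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of observing `overlap` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 DEGs of which 2 fall in the pathway.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=2)
is_enriched = p_value < 0.05
```

With these toy counts the overlap is not significant at the 0.05 threshold; real analyses use a background of thousands of genes, where the same overlap fraction can give a much smaller p value.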

Construction of protein-protein interaction (PPI) network

The online database resource InnateDB provides broad coverage of, and access to, predicted and experimental interaction information [31]. Interactions in InnateDB are assigned a confidence score [31]. In the current study, InnateDB (http://innatedb.com/) was used to predict PPIs based on a confidence score > 0.4 and otherwise default parameters. A PPI network was then built from the analyzed DEGs and visualized using NetworkAnalyst (http://www.networkanalyst.ca/) [32]. A hypergeometric algorithm was used, and p < 0.05 was considered statistically significant.
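The filtering step described above (keep interactions above a confidence score and build a network around the DEGs) can be sketched in Python as follows. This is not the InnateDB or NetworkAnalyst interface; the function name, the tuple layout, the scores and the partner names `PARTNER_X`, `PARTNER_Y` and `PARTNER_Z` are hypothetical.

```python
def build_ppi_edges(interactions, deg_symbols, min_score=0.4):
    """Return undirected edges with score > min_score that touch a DEG.

    interactions: iterable of (protein_a, protein_b, confidence_score)
    deg_symbols:  gene symbols of the differentially expressed genes
    """
    deg_set = set(deg_symbols)
    edges = set()
    for a, b, score in interactions:
        if score <= min_score or a == b:
            continue  # drop low-confidence pairs and self-interactions
        if a in deg_set or b in deg_set:
            # Sort the pair so that (a, b) and (b, a) collapse to one edge.
            edges.add(tuple(sorted((a, b))))
    return edges


# Hypothetical interaction records (scores are made up for illustration).
toy_interactions = [
    ("VCAM1", "ITGA6", 0.9),
    ("ITGA6", "VCAM1", 0.8),           # duplicate of the pair above
    ("TGM2", "PARTNER_X", 0.3),        # below the confidence cutoff
    ("KRT18", "PARTNER_Y", 0.7),
    ("PARTNER_Y", "PARTNER_Z", 0.95),  # no DEG involved
]
toy_edges = build_ppi_edges(toy_interactions, ["VCAM1", "ITGA6", "TGM2", "KRT18"])
```

Storing each edge as a sorted tuple is one simple way to treat the network as undirected.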

Prediction of target miRNAs for DEGs

DIANA-TarBase v7.0 (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=tarbase/index) [33] is an integrated system for exploring large sets of genes. DIANA-TarBase was used to identify the hsa-mirs corresponding to the DEGs. Hsa-mirs targeting more than two DEGs and with raw p < 0.05 were identified as target hsa-mirs. The gene hsa-mir network was visualized with NetworkAnalyst (http://www.networkanalyst.ca/).
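The selection rule above (more than two target DEGs and raw p < 0.05) can be sketched in Python. The input layout, the function name and the placeholder identifiers (`mir-A`, `G1`, and so on) are hypothetical and do not reflect the DIANA-TarBase export format.

```python
from collections import defaultdict


def select_target_mirnas(pairs, p_values, min_targets=3, alpha=0.05):
    """Keep miRNAs with at least `min_targets` distinct DEG targets
    (that is, more than two) and a raw p value below `alpha`.

    pairs:    iterable of (mirna, gene) interactions
    p_values: dict mapping mirna -> raw p value
    """
    targets = defaultdict(set)
    for mirna, gene in pairs:
        targets[mirna].add(gene)
    return {
        mirna: sorted(genes)
        for mirna, genes in targets.items()
        if len(genes) >= min_targets and p_values.get(mirna, 1.0) < alpha
    }


# Placeholder miRNA-gene pairs and p values (illustration only).
toy_pairs = [
    ("mir-A", "G1"), ("mir-A", "G2"), ("mir-A", "G3"),
    ("mir-B", "G1"), ("mir-B", "G2"),
    ("mir-C", "G1"), ("mir-C", "G2"), ("mir-C", "G4"),
]
toy_p = {"mir-A": 0.01, "mir-B": 0.001, "mir-C": 0.2}
selected = select_target_mirnas(toy_pairs, toy_p)
```

Here `mir-B` is dropped for having only two targets and `mir-C` for its p value, leaving `mir-A`.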

Results

Preprocessing of data

After preprocessing and data integration, we obtained nine samples with an expression data matrix of 52,238 gene probes. Comparing the expression values of the microarrays before and after normalization showed that, after normalization, the medians of the expression values of the nine samples lay on a straight line (Fig. 1).
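Why the sample medians line up after normalization can be seen from a minimal Python sketch of quantile normalization, assuming this is the normalization applied in the preprocessing step. The function name and the three toy samples are hypothetical, and ties are handled naively.

```python
import statistics


def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (one list per sample)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)
    ]
    normalized = []
    for col in columns:
        # Rank each value within its sample, then substitute the
        # reference value that has the same rank.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Three hypothetical samples with three probes each.
toy_samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0], [3.0, 4.0, 8.0]]
normalized = quantile_normalize(toy_samples)
medians = [statistics.median(col) for col in normalized]
```

Because every sample is mapped onto the same reference distribution, all samples end up with the same median, which appears as aligned median lines in a box plot.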

Fig. 1

Box plots of the gene expression data before and after normalization. Horizontal axis represents the sample symbol and the vertical axis represents the gene expression values. The black line in the box plot represents the median value of gene expression

Identification of DEGs

Among the 52,238 genes identified from the gene expression profile microarray analysis, 190 differentially expressed genes were screened out (|logFC| ≥ 1 and p < 0.05) (Table 1). Among these, 66 genes were up-regulated and 124 genes were down-regulated. A volcano plot of the differentially expressed genes was constructed by plotting p values against fold changes (Fig. 2). The corresponding Venn diagrams are shown in Fig. 3. The overlapping part represents the number of overlapping DEGs between groups, whereas the non-overlapping part represents the number of unique DEGs between groups.
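The screening rule stated above (|logFC| ≥ 1 and p < 0.05) can be expressed as a small Python sketch. The function name and the placeholder records are hypothetical and are not taken from Table 1.

```python
def classify_degs(results, min_abs_logfc=1.0, alpha=0.05):
    """Split (gene, logFC, p value) records into up- and down-regulated DEGs."""
    up, down = [], []
    for gene, logfc, p_value in results:
        # A gene must pass both the significance and the fold-change cutoff.
        if p_value < alpha and abs(logfc) >= min_abs_logfc:
            (up if logfc > 0 else down).append(gene)
    return up, down


# Placeholder records: (gene, log2 fold change, p value).
toy_results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -2.3, 0.004),
    ("GENE_C", 0.4, 0.0001),   # significant but below the fold-change cutoff
    ("GENE_D", -1.0, 0.03),    # exactly at the fold-change cutoff
    ("GENE_E", 2.5, 0.20),     # large change but not significant
]
up_genes, down_genes = classify_degs(toy_results)
```

A |logFC| of 1 on a log2 scale corresponds to a twofold change, which matches the threshold used for the volcano plot.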

Table 1 The statistical metrics for key differentially expressed genes (DEGs)

Fig. 2

Volcano plot of differentially expressed genes. Genes with a significant change of more than twofold were selected

Fig. 3

Venn diagram of differentially expressed genes (DEGs) in three groups. A vs. B represents DEGs between A and B; A vs. C represents DEGs between A and C; B vs. C represents DEGs between B and C. A normal fibroblasts group, B granulin-stimulated fibroblasts group, C cancer-associated fibroblasts group
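The regions of a three-way Venn diagram like the one described in this caption can be computed with set operations. The Python sketch below assumes three DEG lists, one per pairwise comparison; the function name, region labels and gene identifiers are hypothetical.

```python
def venn_regions(a_vs_b, a_vs_c, b_vs_c):
    """Return the seven regions of a three-set Venn diagram as sets."""
    ab, ac, bc = set(a_vs_b), set(a_vs_c), set(b_vs_c)
    return {
        "all_three": ab & ac & bc,
        "ab_ac_only": (ab & ac) - bc,
        "ab_bc_only": (ab & bc) - ac,
        "ac_bc_only": (ac & bc) - ab,
        "ab_only": ab - ac - bc,
        "ac_only": ac - ab - bc,
        "bc_only": bc - ab - ac,
    }


# Placeholder DEG lists for the three pairwise comparisons.
regions = venn_regions(
    a_vs_b=["g1", "g2", "g3", "g4"],
    a_vs_c=["g2", "g3", "g5"],
    b_vs_c=["g3", "g4", "g6"],
)
region_sizes = {name: len(genes) for name, genes in regions.items()}
```

The sizes in `region_sizes` are the numbers that would be printed in each area of the diagram.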

Hierarchical clustering analysis of DEGs

After extracting the expression values of the DEGs, hierarchical clustering analysis was conducted for the DEGs. As shown in Fig. 4, the DEGs could clearly distinguish the normal fibroblast samples from both the granulin-stimulated fibroblast samples and the cancer-associated fibroblast samples. In the granulin-stimulated fibroblast and cancer-associated fibroblast samples, there were more down-regulated genes than up-regulated genes (Fig. 4).

Fig. 4

Heat map of differentially expressed genes. Red represents down-regulated genes and green represents up-regulated genes in TNBC samples. The legend at the top left indicates the log fold change of genes (A1, A2, A3: normal fibroblasts group; B1, B2, B3: granulin-stimulated fibroblasts group; C1, C2, C3: cancer-associated fibroblasts group)

GO enrichment analysis

After the GO enrichment analysis, the DEGs in the PPI network of up-regulated genes were mainly associated with the GO BP categories of cell-cell signalling (including ADM, EFNB2, NOV, SSTR1, TFAP2C and TRHDE, p = 0.0000562) and negative regulation of cell proliferation (including ADM, BCHE, CDKN2A, CDKN2B, IGFBP3, SSTR1, and TWIST2, p = 0.0000733), the GO MF categories of insulin-like growth factor II binding (including IGFBP3 and IGFBP5, p = 0.000201) and insulin-like growth factor I binding (including IGFBP3 and IGFBP5, p = 0.000258), and the GO CC categories of insulin-like growth factor binding protein complex (including IGFBP3 and IGFBP5, p = 0.0000217) and integral component of plasma membrane (including EFNB2, ENPP2, EPHA5, FLRT3, KCND2, KCNJ8, MME, NLGN1, PTGER3, SLC38A5, SSTR1 and TRHDE, p = 0.000258) (Table 2).

Table 2 Enriched functions of DEGs

Furthermore, the down-regulated genes were mainly involved in the GO BP categories of cell adhesion (including CCL2, CDH6, COL18A1, COL8A1, COMP, CXCL12, ENG, ITGA6, MCAM, NCAM1, NTM, PCDHB15, PODXL, SORBS2, SUSD5, and VCAM1, p = 9.03e−10) and extracellular matrix organization (including ADAMTS5, BMP4, COL10A1, COL11A1, COL18A1, COL8A1, COMP, EFEMP1, ITGA6, NCAM1, TGFB2 and VCAM1, p = 4.65e−8), the GO MF categories of N-acetylgalactosamine 4-sulfate 6-O-sulfotransferase activity (including CHST11 and CHST15, p = 0.0000243) and calcium ion binding (including ANXA3, CDH6, CELSR1, COMP, EFEMP1, FLG, GALNT3, JAG1, MGP, PCDHB14, PCDHB15, SCUBE3 and SYTL2, p = 0.0000335), and the GO CC categories of extracellular space (including ACTA2, BMP4, CCL2, COL18A1, COMP, CXCL12, EFEMP1, ENG, GPX3, IL26, KRT81, LRRC17, MCAM, PLAU, PODXL, TGFB2, TIMP3 and VCAM1, p = 0.000258) and extracellular matrix (including COL18A1, COL8A1, COMP, EFEMP1, MGP, TGFB2 and TIMP3, p = 0.000258) (Table 2).

Pathway enrichment analysis

After the pathway enrichment analysis, up-regulated genes were mainly associated with the WIKIPATHWAYS categories of myometrial relaxation and contraction pathways (including ADM, ATF5, IGFBP3 and IGFBP5, p = 0.001) (Table 3).

Table 3 Enriched pathways of DEGs

Furthermore, the down-regulated genes were mainly involved in the WIKIPATHWAYS categories of endochondral ossification (including ADAMTS5, CHST11, COL10A1, MGP, PLAU, RUNX3, SOX6, TGFB2 and TIMP3, p = 4.27e−11) and TGF beta signalling pathway (including BMP4, ENG, LEF1 and RUNX3, p = 0.00016), the REACTOME categories of extracellular matrix organization (including ADAMTS5, CHST11, COL10A1, MGP, PLAU, RUNX3, SOX6, TGFB2 and TIMP3, p = 9.35e−9) and integrin cell surface interactions (including COL10A1, COL18A1, COL8A1, COMP, ITGA6 and VCAM1, p = 0.00000408), the PID_NCI category of beta1 integrin cell surface interactions (including COL11A1, COL18A1, ITGA6, PLAU, TGM2 and VCAM1, p = 8.35e−7), and the KEGG categories of malaria (including CCL2, COMP, TGFB2 and VCAM1, p = 0.000102) and glycosaminoglycan biosynthesis - chondroitin sulfate/dermatan sulfate (including CHST11, CHST15 and CHSY3, p = 0.000127) (Table 3).

PPI network analysis

The PPI networks of up- and down-regulated genes are shown in Figs. 5a and 6a, respectively. The up-regulated network was constructed with 472 nodes and 529 edges (Fig. 5a). The proteins cyclin dependent kinase inhibitor 2A (CDKN2A, degree = 148), membrane metalloendopeptidase (MME, degree = 78), PBX homeobox 1 (PBX1, degree = 39), insulin like growth factor binding protein 3 (IGFBP3, degree = 25) and transcription factor AP-2 gamma (TFAP2C, degree = 23) were hub nodes in this network. The distribution of node degrees followed an exponential distribution, with an R squared of 0.618 and a correlation coefficient of 0.898 (Fig. 7a). The down-regulated PPI network was constructed with 1143 nodes and 1374 edges (Fig. 6a). The proteins vascular cell adhesion molecule 1 (VCAM1, degree = 426), keratin 18 (KRT18, degree = 89), transglutaminase 2 (TGM2, degree = 69), actin, alpha 2, smooth muscle, aorta (ACTA2, degree = 55) and STAM binding protein (STAMBP, degree = 44) were hub proteins in this network. The distribution of node degrees again followed an exponential distribution, with an R squared of 0.684 and a correlation coefficient of 0.914 (Fig. 7b). After the connectivity degree analysis, the top 20 nodes with high degrees in the up- and down-regulated PPI networks were screened (Table 4). The connectivity degrees of the top 20 nodes in the up-regulated and down-regulated PPI networks were all higher than 5.
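The connectivity degree used above to rank hub nodes is simply the number of edges attached to each node. A minimal Python sketch, with a hypothetical function name and a toy edge list that does not reproduce the networks in Figs. 5 and 6:

```python
from collections import Counter


def rank_hubs(edges, top_n=5):
    """Return the `top_n` nodes with the highest degree as (node, degree)."""
    degree = Counter()
    for a, b in edges:
        # Each undirected edge adds one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    # Sort by decreasing degree, breaking ties alphabetically.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy undirected edge list with placeholder node names.
toy_network = [("HUB", "N1"), ("HUB", "N2"), ("HUB", "N3"), ("N1", "N2")]
top_hubs = rank_hubs(toy_network, top_n=2)
```

The alphabetical tie-break is a design choice that keeps the ranking reproducible when several nodes share the same degree.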

Fig. 5

Protein-protein interaction network of the DEGs (a). Top four PPI subnetworks in the co-expression module (b–e) drawn from the protein-protein interaction network of the differentially expressed genes. The green circles represent the up-regulated genes. The PPI pairs were identified with the required confidence (combined score) > 0.9 as a threshold, and the PPI network of these connections was constructed using NetworkAnalyst software

Fig. 6

Protein-protein interaction network of the DEGs (a). Top four PPI subnetworks in the co-expression module (b–e) drawn from the protein-protein interaction network of the differentially expressed genes. The red circles represent the down-regulated genes. The PPI pairs were identified with the required confidence (combined score) > 0.9 as a threshold, and the PPI network of these connections was constructed using NetworkAnalyst software

Fig. 7

a Scatter plot of node degree distribution for up-regulated genes; b scatter plot of node degree distribution for down-regulated genes

Table 4 Top 20 nodes with higher connectivity degrees in the protein-protein interaction network of the up-regulated and down-regulated differentially expressed genes

Moreover, several hub genes were also identified in the four subnetworks screened from the up-regulated genes, such as MMP3 (Fig. 5b), CDH10 (Fig. 5c), LSAMP (Fig. 5d) and CSGALNACT1 (Fig. 5e), and in the four subnetworks screened from the down-regulated genes, such as CARD16 (Fig. 6b), COL10A1 (Fig. 6c), COL18A1 (Fig. 6d), and CELSR1 (Fig. 6e).

Prediction of target hsa-mir for DEGs

The target hsa-mirs for up- and down-regulated genes are presented in Figs. 8 and 9, respectively. In the up-regulated gene hsa-mir network, hsa-mir-759 (n = 3) and hsa-mir-4446-5p (n = 4) regulated the most up-regulated genes. Insulin like growth factor binding protein 5 (IGFBP5) was regulated by 143 hsa-mirs, such as hsa-mir-219a-1-3p and hsa-mir-4446-5p. In the down-regulated gene hsa-mir network, hsa-mir-26a-5p (n = 5) and hsa-mir-301a-3p (n = 5) regulated the most down-regulated genes. The gene myosin regulatory light chain interacting protein (MYLIP) was the hub node and was regulated by 113 hsa-mirs, including hsa-mir-26a-5p and hsa-mir-301a-3p.

Fig. 8

Target gene-miRNA network of up-regulated genes. Green nodes stand for up-regulated genes, and blue diamonds stand for miRNAs. The network was constructed and visualized using NetworkAnalyst software

Fig. 9

Target gene-miRNA network of down-regulated genes. Red nodes stand for down-regulated genes, and blue diamonds stand for miRNAs. The network was constructed and visualized using NetworkAnalyst software

Discussion

In total, 66 up-regulated and 124 down-regulated genes were identified among three normal fibroblast cultures, three granulin-stimulated fibroblast cultures and three cancer-associated fibroblast cultures. This result indicates that the common DEGs might play important roles during TNBC development and progression. To explore the potential roles of these DEGs, we performed functional analyses on them. The GO term analysis showed that up-regulated DEGs were mainly involved in cell-cell signalling and negative regulation of cell proliferation, with highly significant p values. A previous study showed that cell-cell signalling is involved in TNBC development [34]. ADM, EFNB2, NOV, SSTR1, TFAP2C and TRHDE are the biomarkers enriched in cell-cell signalling. Smirnov et al., 2006 [35] showed that ADM (adrenomedullin) is a potent vasodilator and hypotensive agent that is mainly associated with metastatic carcinomas. Higher expression of EFNB2 (ephrin B2) controls arterial/venous specialization and vessel branching in TNBC [36]. NOV (nephroblastoma overexpressed), a connective tissue growth factor, is highly expressed in TNBC [37]. The hormone receptor gene SSTR1 (somatostatin receptor 1) plays an important role in breast cancer [38]. Most breast cancers express estrogen receptor-α (ERα), and TFAP2C (transcription factor AP-2 gamma) controls the expression of ERα directly by binding to the ERα promoter region [39]. The TRHDE (thyrotropin releasing hormone degrading enzyme) gene might increase the risk of TNBC development [40]. Negative regulation of cell proliferation is associated with cancer metastasis and angiogenesis [41]. ADM, BCHE, CDKN2A, CDKN2B, IGFBP3, SSTR1 and TWIST2 are the biomarkers enriched in negative regulation of cell proliferation. Increased expression of BCHE (butyrylcholinesterase) has been found in TNBC [42]. 
Germline mutations in CDKN2A (cyclin-dependent kinase inhibitor 2A) and CDKN2B (cyclin-dependent kinase inhibitor 2B) are associated with an increased risk of cancer [43, 44]. High expression of mutated versions of the CDKN2A and CDKN2B genes has been reported in TNBC [42, 45]. Activation of altered IGFBP3 (insulin like growth factor binding protein 3) stimulates mitosis and inhibits apoptosis [46], and Key et al. [47] found that increased expression of IGFBP3 was correlated with prognosis in patients with breast cancer. TWIST2 (twist family bHLH transcription factor 2) is a highly conserved basic helix-loop-helix transcription factor that plays a role in embryogenesis and promotes cancer invasion [48]. Accumulating evidence shows that over-expression of TWIST2 is involved in breast cancer development [49]. Similarly, GO term analysis showed that down-regulated DEGs were mainly involved in cell adhesion and extracellular matrix organization, with highly significant p values. Cell-to-cell and cell-to-extracellular matrix adhesion controls the social behavior of cells in tumour development [50, 51]. Carey et al., 2010 [52] reported that cell adhesion is primarily involved in TNBC development. Molecular markers such as CCL2, CDH6, COL18A1, COL8A1, COMP, CXCL12, ENG, ITGA6, MCAM, NCAM1, NTM, PCDHB15, PODXL, SORBS2, SUSD5 and VCAM1 were enriched in cell adhesion. CCL2 (C-C motif chemokine ligand 2) mediates interactions between cancer cells and stromal fibroblasts that control cancer progression [53]. Expression of CCL2 is involved in TNBC development [54]. CDH6 (cadherin 6) is expressed during the metastatic progression of many cancer types [55]. Polymorphism of the COL18A1 (collagen type XVIII alpha 1 chain) gene has been reported in breast cancer [56]. COL8A1 (collagen type VIII alpha 1 chain) shows variable expression and is detected in ~ 10–15% of breast cancers [57]. 
COMP (cartilage oligomeric matrix protein) is one of the most promising serologic markers for predicting the development of cancer and also plays key roles in chondrogenesis and cartilage development in cancer [58]. CXCL12 (C-X-C motif chemokine ligand 12) and its specific receptor, CXCR4, have recently been shown to be involved in tumorigenesis, proliferation and angiogenesis [59]. CXCL12 plays a critical role in the progression of breast cancer by participating in cell adhesion [60]. ENG (endoglin) is highly expressed in the tumor-associated vascular endothelium and is a component receptor for TGF-β that has been implicated in cancer cell detachment, migration, and invasiveness [61]. The molecular marker ENG plays a critical role in breast cancer progression [62]. ITGA6 (integrin subunit alpha 6) is a tumour suppressor gene, and mutation of this gene results in cancer progression [63]. The biomarker ITGA6 is strongly associated with TNBC [64]. MCAM (melanoma cell adhesion molecule) is a cell-surface glycoprotein that is actively expressed in leading human carcinomas [65]. MCAM protein has been confirmed to induce cell adhesion and to promote TNBC [66]. NCAM1 (neural cell adhesion molecule 1) expression has been correlated with the presence of perineural invasion in specimens from a variety of tumours [67]. NCAM1 is highly expressed in advanced breast cancer [68]. NTM (neurotrimin) is a novel cell adhesion biomarker expressed in many cancer types [69] and has been reported in TNBC development [70]. Expression of mutated PCDHB15 (protocadherin beta 15) was associated with TNBC [71]. The molecular marker PODXL (podocalyxin like) is involved in the regulation of both adhesion and cell morphology and in cancer development [72]. Castro et al. [73] showed PODXL involvement in TNBC progression. 
SORBS2 (sorbin and SH3 domain containing 2) is an adapter protein that plays a key role in the assembly of signalling complexes, linking ABL kinases and the actin cytoskeleton, and is also expressed during the development of most cancers [74]. SUSD5 (sushi domain containing 5) is involved in metastatic colonization in most cancers [75] and is one of the prognostic markers in breast cancer [76]. VCAM1 (vascular cell adhesion molecule 1) is an endothelial cell membrane glycoprotein that has been implicated in leukocyte/endothelial cell interactions in cancer cell metastasis [77]. VCAM1 is a pivotal contributor to TNBC progression [78]. ADAMTS5, BMP4, COL10A1, COL11A1, COL18A1, COL8A1, COMP, EFEMP1, ITGA6, NCAM1, TGFB2 and VCAM1 are the genes highly enriched in extracellular matrix organization. The extracellular matrix enzyme ADAMTS5 (ADAM metallopeptidase with thrombospondin type 1 motif 5) has emerged as a key player in angiogenesis and cancer development [79]. BMP4 (bone morphogenetic protein 4) controls distinct cellular processes, such as proliferation, differentiation, and apoptosis [80]. Expression of BMP4 plays a positive role in the progression of breast cancer [81]. COL10A1 (collagen type X alpha 1 chain) promotes cell proliferation in cancer development [82]. Accumulating evidence has demonstrated that expression of COL10A1 is involved in TNBC development [83]. COL11A1 (collagen type XI alpha 1 chain) is important in cell invasiveness and tumour formation [84]. The molecular marker COL11A1 is involved in TNBC progression [85]. Polymorphisms of the EFEMP1 (EGF containing fibulin like extracellular matrix protein 1) gene have been identified in breast cancer and might contribute to susceptibility to the progression of TNBC [86]. A polymorphism in the promoter of TGFB2 (transforming growth factor beta 2) that enhances expression of the protein was associated with lymph node metastasis in TNBC patients, pointing to a role of TGFB2 in the process of invasion [87].

The enriched WikiPathways category for the up-regulated DEGs was myometrial relaxation and contraction pathways. Previous studies have shown that genes of the myometrial relaxation and contraction pathways that are up-regulated in TNBC development can predict the overall survival of TNBC patients [88]. Several genes enriched in this pathway (ADM, ATF5, IGFBP3 and IGFBP5) were up-regulated. The biomarker ATF5 (activating transcription factor 5) promotes invasion by activating the expression of integrin-alpha2 and integrin-beta1 in several human cancer cells [89]. Accumulating evidence shows that over-expression of ATF5 is involved in TNBC [90]. Repression of IGFBP3 (insulin like growth factor binding protein 3) and IGFBP5 (insulin like growth factor binding protein 5) disrupts the balance between their proliferative action and programmed cell death in breast cancer [91, 92]. The WikiPathways results showed that the down-regulated genes were significantly enriched in two pathways: endochondral ossification and the TGF beta signalling pathway. Endochondral ossification plays a key role in vascularization, particularly in cancer development [93]. TGF-beta signalling reinforces tumour recurrence through IL-8-dependent expansion of cancer stem-like cells (CSCs) [94]. Genes such as ADAMTS5, CHST11, COL10A1, MGP, PLAU, RUNX3, SOX6, TGFB2 and TIMP3 were significantly enriched in endochondral ossification. The enzyme CHST11 (carbohydrate sulfotransferase 11) catalyzes the transfer of sulfate to position 4 of the N-acetylgalactosamine (GalNAc) residue of chondroitin [95]. CHST11 may play a specific role in the progression of breast cancer, and its expression is controlled by DNA methylation [96]. Among the proteins involved in vascular calcium metabolism, the vitamin K-dependent MGP (matrix Gla protein) plays a dominant role in breast cancer development [97]. PLAU (plasminogen activator, urokinase) plays essential roles in tumour invasion and metastasis [98]. 
The PLAU gene has been reported to participate in breast cancer development [99]. RUNX3 (Runt-related transcription factor 3) is a candidate tumour suppressor gene that is down-regulated in diverse cancers [100] and is also linked to activation of Wnt/β-catenin signalling in TNBC [101]. SOX6 (SRY-box 6) has a tumor-suppressive function, and its inactivation results in cancer progression [102]. Pinto et al. [103] found that low expression of SOX6 results in breast cancer development. Clinical practice has found that TIMP3 (TIMP metallopeptidase inhibitor 3) is silenced in a number of cancer types [104]. Yuan et al. [105] revealed that TIMP3 silencing results in TNBC development. BMP4, ENG, LEF1 and RUNX3 are the down-regulated genes enriched in the TGF beta signalling pathway. LEF1 (lymphoid enhancer binding factor 1) acts in the WNT/β-catenin pathway, which plays a dominant role among the aberrantly controlled signalling pathways in cancer, since it has been shown to be critically involved in a wide range of cancers [106]. Delaunay and colleagues [107] demonstrated that the LEF1 gene plays an important role in breast cancer development. The REACTOME pathway results showed that the down-regulated genes were most significantly enriched in extracellular matrix organization and integrin cell surface interactions. ADAMTS5, CHST11, COL10A1, MGP, PLAU, RUNX3, SOX6, TGFB2 and TIMP3 are the genes enriched in extracellular matrix organization. Modification of extracellular matrix organization results in TNBC progression [108]. Integrin cell surface interactions, including 23 alterations in integrin, laminin and collagen genes, result in TNBC progression [109]. COL10A1, COL18A1, COL8A1, COMP, ITGA6 and VCAM1 are the genes enriched in integrin cell surface interactions. The PID_NCI pathway results showed that the down-regulated genes were most significantly enriched in beta1 integrin cell surface interactions. 
Beta1 integrin cell surface interactions induce cellular proliferation, which results in breast cancer development [110]. COL11A1, COL18A1, ITGA6, PLAU, TGM2 and VCAM1 are the genes enriched in beta1 integrin cell surface interactions. Finally, the KEGG pathway results showed that the down-regulated genes were significantly enriched in malaria and glycosaminoglycan biosynthesis - chondroitin sulfate/dermatan sulfate. Glycosaminoglycan biosynthesis is crucial in TNBC development [111]. CHST11, CHST15 and CHSY3 are the genes enriched in glycosaminoglycan biosynthesis. These results are therefore consistent with previous studies, and we next identified critical genes involved in the PPI network.

We constructed the PPI network with the up-regulated DEGs and listed the top-degree hub genes: CDKN2A, MME, PBX1, IGFBP3 and TFAP2C. CDKN2A was identified as one of the hub genes exhibiting the highest degree of connectivity. Over-expression of MME (membrane metalloendopeptidase) plays a key role in the pathogenesis of TNBC [112]. Another hub gene, PBX1 (PBX homeobox 1), is a TALE homeodomain protein and a proto-oncogene involved in the development of different types of cancer [113]. Moreover, the proto-oncogene PBX1 promotes breast cancer development [114]. TFAP2C (transcription factor AP-2 gamma) is a transcription factor that plays a very important role in the control of both estrogen receptor-alpha (ERα) and c-ErbB2/HER2 (Her2) and also promotes breast cancer development [115]. We also constructed a PPI network for the down-regulated genes, in which VCAM1, KRT18, TGM2, ACTA2 and STAMBP were the top five hub genes. VCAM1 was identified as one of the hub genes exhibiting the highest degree of connectivity. KRT18 (keratin 18) is an epithelial differentiation marker that encodes the type I intermediate filament chain keratin 18 in breast cancer [116]. TGM2 (transglutaminase 2) plays a crucial role in cancer cell growth and survival through the anti-apoptosis signalling pathway [117]. Down-regulation of TGM2 in cancer cells is an important pathogenic factor in breast cancer [118]. Down-regulation of the epithelial–mesenchymal transition-associated (EMT-associated) gene ACTA2 (actin, alpha 2, smooth muscle, aorta) was correlated with invasion in TNBC [119]. Mutation of the STAMBP (STAM binding protein) gene results in cancer development [120].

Subnetwork analysis of the PPI network for up-regulated genes revealed that the development of TNBC was associated with the MMP3, CDH10, LSAMP and CSGALNACT1 genes. MMP3 (matrix metallopeptidase 3) encodes a member of the matrix metalloproteinase family of enzymes and is associated with tumour cell invasion and metastasis, with promoter polymorphisms regulating its level of transcription [121]. MMP3 is a pivotal contributor to TNBC progression and could serve as a potential therapeutic target [122]. The CDH10 (cadherin 10) gene encodes a member of the cadherin family of calcium-dependent glycoproteins, which mediate cell adhesion and control many cellular events during cancer development [123]. CDH10 is an important biomarker in the maintenance of cell adhesion and polarity, alterations of which contribute to TNBC development [124]. LSAMP (limbic system-associated membrane protein) is a tumour suppressor gene, and mutation of this gene results in cancer development [125]. Evidence indicates that inactivation of the LSAMP gene may result in TNBC progression [126]. The CSGALNACT1 (chondroitin sulfate N-acetylgalactosaminyltransferase 1) gene encodes an enzyme that generates chondroitin sulfate glycosaminoglycans, and altered expression of this gene results in breast cancer development [127]. We also extracted a subnetwork from the PPI network of down-regulated genes; CARD16, COL10A1, COL18A1 and CELSR1 were the hub genes with the highest degree of connectivity in this subnetwork. Mutation of the CARD16 (caspase recruitment domain family member 16) gene is relevant to breast cancer development [128]. The CELSR1 (cadherin EGF LAG seven-pass G-type receptor 1) gene encodes a member of the flamingo subfamily, which is part of the cadherin superfamily, and is a major factor in breast cancer development [129].

MicroRNAs play an essential role in cancer progression [130]. Aberrant microRNA expression profiles have been identified in breast cancer [131]. Apart from the DEGs and their functions, microRNAs such as hsa-mir-759 and hsa-mir-4446-5p for up-regulated genes, and hsa-mir-26a-5p and hsa-mir-301a-3p for down-regulated genes, may be important for the progression of TNBC. Recently, it has been reported that expression of the fibrinogen alpha gene, regulated by hsa-mir-759, was associated with susceptibility to TNBC [132]. The microRNA hsa-mir-4446-5p has a binding site on SHANK2 (SH3 and multiple ankyrin repeat domains protein 2) and is responsible for breast cancer progression [133]. The microRNA hsa-miR-26a-5p acts as a tumour suppressor by targeting oncogenes in several types of cancer, such as breast cancer [134], nasopharyngeal carcinoma [135], and hepatocellular carcinoma [136]. A previous report indicated that hsa-miR-301a-3p acts as an oncogene in TNBC [137].

In conclusion, our data provide a comprehensive bioinformatics analysis of DEGs that may be involved in the progression of TNBC. On the basis of this preliminary study, we propose that these DEGs, including CDKN2A, MME, PBX1, IGFBP3, TFAP2C, VCAM1, KRT18, TGM2, ACTA2, and STAMBP, may play a role in TNBC development and could be candidate molecular targets for the treatment of TNBC. In addition, extracellular matrix organization and cell adhesion may play important roles in promoting the development of TNBC. The study provides a set of useful targets for future investigation into the molecular mechanisms and biomarkers of TNBC. However, further molecular biological experiments are required to confirm the function of the identified genes in TNBC.