Introduction

Gene fusions resulting from somatic genomic rearrangements, most often translocations, are known to play an important role in the onset and development of distinct tumour types and clinical features, such as leukaemias and sarcomas. The estimated proportion of tumours with gene fusions is 100 % in chronic myeloid leukaemia (CML), 20 % in acute myeloid leukaemia (AML), 15–30 % in mature B-cell and T-cell neoplasms, and 15–20 % in bone and soft tissue tumours [1].

Until recently, the occurrence of gene fusions in malignant epithelial tumours was believed to be rare. Gene fusions were found in 40 % of papillary thyroid carcinomas, but thyroid cancer is a rare disease, accounting for only 1 % of all cancers. In the most common solid epithelial tumour types, such as breast carcinomas (with the exception of rare, secretory breast cancers), lung tumours, and digestive tract tumours, gene fusions were found in fewer than 1 % of all cases. Therefore, the commonly held view was that gene fusions played a minor role in the pathogenesis of carcinomas [1].

This view changed when the number of gene fusions and gene rearrangements was found to be equal among the different tumour types, with no significant differences between haematological disorders, mesenchymal tumours and epithelial tumours [2]. Therefore, it was suggested that there are no tissue-specific differences in the genetic mechanisms by which tumours are initiated, and that as yet unidentified gene fusions play a role in the onset of epithelial tumours. The reason that gene fusions were not identified in common solid tumours could be attributed to technical difficulties associated with the cytogenetic analysis of these tumours. Furthermore, disease progression coincides with cytogenetic heterogeneity within the tumour population. Some tumours consist of completely unrelated clones without evidence of a common origin. These observed anomalies were found in only 3 % of haematological malignancies compared to as many as 80 % of epithelial carcinomas, and might be secondary to some rare, as yet unidentified and genetically essential changes such as gene fusions [3].

To bypass the technical limitations of cytogenetics in epithelial cancers, Tomlins et al. applied a bioinformatics approach (cancer outlier profile analysis, COPA) to 132 gene expression array data sets available in Oncomine (www.oncomine.org) to search for genes that were highly overexpressed in a subset of prostate cancers (PCa) rather than in all samples [4•]. The focus was on the identification of candidate oncogenes in PCa that were activated by chromosomal rearrangements or high copy number changes. ERG and ETV1 (belonging to the ETS family of transcription factors), also known to be involved in Ewing’s sarcoma, were among the top ten outlier genes that showed high expression in a subset of PCa cases.
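The outlier idea can be illustrated with a minimal sketch. It assumes the commonly described form of the transform (median-centre each gene, scale by its median absolute deviation, then rank genes by a high percentile of the transformed values). The gene names and expression values below are invented toy data, and the function names are hypothetical; this is not the Oncomine implementation.

```python
from statistics import median

def copa_transform(values):
    """Median-centre one gene's expression values and scale by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # A gene with no spread carries no outlier information.
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of pre-sorted values."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def copa_rank(expression, q=90):
    """Rank genes by the q-th percentile of their transformed values."""
    scores = {}
    for gene, values in expression.items():
        scores[gene] = percentile(sorted(copa_transform(values)), q)
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (hypothetical): one gene is very high in only 2 of 10 samples,
# one is uniformly high in all samples, one is uniformly low.
toy = {
    "GENE_OUTLIER": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 9.0, 8.5, 1.1],
    "GENE_UNIFORM_HIGH": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.1, 4.9, 5.0],
    "GENE_FLAT": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0],
}
ranking = copa_rank(toy)
print(ranking)
```

Because each gene is centred on its own median, a gene that is uniformly high across all samples gains no advantage; only a gene with a marked subset of high-expressing samples rises to the top of the ranking.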

The ETS family of oncogenic transcription factors is one of the largest families of transcription regulators (27 members in the human genome). Its members contain a highly conserved DNA-binding domain of 85 amino acids (the ETS domain) that displays sequence-specific binding to purine-rich DNA sequences containing a 5′-GGAA/T-3′ core sequence. ETS transcription factors play an important role in diverse biological processes, including cell proliferation, apoptosis, differentiation, angiogenesis and invasiveness. In PCa’s that over-expressed these ETS family members, fusions of the 5’-untranslated region of TMPRSS2 to the coding sequence of ERG or ETV1 were identified [4•]. Further experiments revealed that TMPRSS2 could also be fused to ETV4 and ETV5 [5, 6]. TMPRSS2 is prostate-specific and is expressed under the transcriptional control of androgens. TMPRSS2-ETS fusions therefore place the ETS member under the control of the androgen-responsive TMPRSS2 promoter, leading to increased expression of the ETS member in response to androgens.

Based on the success of Tomlins et al., others have adopted unbiased high-throughput methods, with increased resolution, for genome-wide detection of chromosomal rearrangements in cancer (e.g. next generation sequencing, SAGE-like sequencing, paired-end transcriptome sequencing) [7]. Using these approaches, other fusion partners for ERG were discovered, including SLC45A3, HERPUD1 and NDRG1 [6–9]. TMPRSS2-ERG, SLC45A3-ERG, HERPUD1-ERG and NDRG1-ERG fusions are responsible for the over-expression of ERG in ~50–60 % of PCa’s [7, 8]. Gene fusions involving other ETS family members or members of the RAS-RAF signalling pathway occur at a lower frequency (summarized in Table 1).

Table 1 Gene fusions in prostate cancer

Etiology of Gene Fusions in Prostate Cancer

Approximately 60 % of PCa’s are characterized by the presence of gene fusions. ERG is the most commonly rearranged ETS gene in PCa. For most gene fusions in human neoplasms, no specific initiating factor has been identified, and currently gene fusions have to be considered as stochastic events for which DNA double-strand breaks are required. Defects in genes involved in maintenance of chromosomal stability and DNA repair can lead to an increase of chromosomal rearrangements, and thus the occurrence of gene fusions.

The prostate, however, seems to be predestined for the TMPRSS2-ERG fusion. The genes for TMPRSS2 and ERG are located 3 Mb apart in the same orientation on chromosome 21q22.2. Under the influence of androgens, changes in chromatin organization are induced that can juxtapose the transcription units of both genes, thereby facilitating the genesis of this gene fusion [10]. This androgen-induced proximity may explain why this gene fusion is restricted to the prostate [11]. Double-strand breaks are induced through AR, and mediated by TOP2B, at precise TMPRSS2-ERG rearrangement junction sites [12]. Broken DNA ends can become illegitimately repaired by the double-strand break repair machinery to create de novo TMPRSS2-ERG gene fusions [12]. A genome-wide linkage analysis in TMPRSS2-ERG-positive PCa families revealed variants in two genes, ESCO1 (N191S) and POLI (F532S), that were significantly associated with TMPRSS2-ERG-positive PCa. Both genes encode proteins that contribute to the avoidance and repair of DNA double-strand breaks. Functional loss may lead to chromosomal instability and translocation events [13].

TMPRSS2-ERG can be generated by intra-chromosomal deletion (Edel), or through insertion of the intervening region into another chromosome (Esplit) [14–16]. The deletion is located between TMPRSS2 and ERG on chromosome 21q22.2-3, and is observed in 39–60 % of the TMPRSS2-ERG-positive cases [17, 18]. Using Oncomine, significantly down-regulated genes (HMGN1, ETS-2) located in the 3 Mb area of deletion were identified. The loss of one or more of these genes may be associated with cancer progression, in addition to the oncogenic potential of the TMPRSS2-ERG fusion product. Therefore, these genes may act as tumour suppressor genes [17].

Studies revealed more than 20 mRNA transcripts and protein isoforms of wildtype ERG due to alternative splicing. It is not surprising that numerous TMPRSS2-ERG transcript variants occur as well. It has become apparent that next to alternative splicing, other recombination mechanisms (e.g. translocations and interstitial deletions) may also contribute to the distinct gene fusion transcripts, of which deletion seems to be the most common mechanism.

The reason that gene fusions between TMPRSS2 and other ETS family transcription factors (ETV1, ETV4, ETV5) occur at a lower frequency is that they are located on different chromosomes. However, ligand-bound AR promotes the spatial proximity of TMPRSS2 to either ETV1 or ERG in a non-random way, and it is suggested that the number of AR binding sites within ETV1, ETV4 and ETV5 could determine the translocation frequency of TMPRSS2 to either of these genes [11, 12, 19].

SLC45A3, FLJ35294, CANT1, HERPUD1, NDRG1, ACSL3 and KLK2 are, like TMPRSS2, androgen-induced 5’ fusion partners [7, 8]. The proviruses HERV-K and HERVK17 are also expressed in an androgen-responsive, prostate-specific pattern [20]. Therefore, fusions of these genes with ETS family members are probably functionally similar to TMPRSS2-ETS fusions. Although C15orf21 is prostate-specific, it is repressed by androgens. Therefore, the function of a gene fusion with C15orf21 is likely to differ from that of TMPRSS2-ETS fusions. HNRPA2B1, a strongly expressed housekeeping gene, and DDX5 are not prostate-specific and are androgen-insensitive. Therefore, non-tissue-specific promoter elements will drive ETS expression in the case of fusions of HNRPA2B1 and DDX5 to ETS transcription factors.

The majority of the observed ETS gene fusions, including the most commonly found TMPRSS2 exon 1-ERG exon 4 (T1-E4) variant, encode either truncated or null fusion proteins (Table 1). The first reported chimeric protein in PCa is composed of 102 N-terminal amino acids of DDX5 fused to 419 C-terminal amino acids of ETV4. This chimeric protein is expressed at high levels in prostate cancer, although its role and function need to be addressed in future studies [9].

The Role of These Gene Fusions in the Onset of Prostate Cancer

The variety of ETS gene rearrangements in the majority of PCa’s demonstrates that inactive oncogenes can be activated in PCa by gene fusions to tissue-specific or overall active genes. Since 2005, many studies have provided evidence that ERG gene fusions lead to ERG over-expression in two-thirds of PCa patients. The gene fusion rate is highest in the United States (42–60 %) and lowest in Asia (21 % in Korea and 16–28 % in Japan) [21, 22]. In the Chinese population, lower frequencies of the 3 Mb deletion between the ERG and TMPRSS2 genes and of deletion of 10q23, including the PTEN gene locus, were observed when compared to the UK population. Furthermore, in China there is a higher frequency of RAS and BRAF mutations in PCa compared to the Western population [23]. These genetic differences may underlie the regional/ethnic difference in clinical incidence, and suggest different pathways of prostate carcinogenesis in these populations. These data imply that Western men may be exposed to causative factor(s) for these specific genetic alterations. Dietary habits (plant food in Korea and Japan versus high-fat animal food in the United States and Europe) and environmental pollution are correlated with PCa incidence and may also be a reason for the differences in gene fusion rate [24].

Several groups have studied the biological effects of TMPRSS2-ERG gene fusions in PCa. TMPRSS2-ERG fusions are early events. Benign prostatic tissue, BPH and proliferative inflammatory atrophy (PIA) do not harbour TMPRSS2-ERG gene fusions [4•, 25, 26]. The gene fusion is found in 19 % of high grade prostatic intraepithelial neoplasia (HGPIN) lesions, 19 % of clinically staged T1 PCa’s, 48.5 % of clinically localized PCa’s, 30 % of hormone naive metastases and 33 % of hormone refractory metastases [27–29]. Rajput et al. observed a higher frequency of TMPRSS2-ERG gene fusions in moderately to poorly differentiated tumours compared to well-differentiated PCa’s [30].

Genetically engineered mouse models (GEMM) showed that ERG over-expression plays a central role in the development of a large proportion of PCa’s and is required for PCa initiation. However, by itself ERG can induce the formation of epithelial hyperplasia and focal HGPIN lesions, but is not sufficient to induce the development of carcinoma [31, 32]. The same holds true for the loss of the tumour suppressor PTEN, which is a critical regulator of growth factor signalling and an inhibitor of the PI3K pathway [32]. Loss of PTEN and the presence of the fusion gene are events significantly associated with PCa [31, 33]. It was shown that PTEN loss and ERG over-expression cooperate in the development of pre-neoplastic lesions such as HGPIN and of invasive carcinoma. Furthermore, ERG in combination with AKT up-regulation was also implicated in neoplastic transformation. Although activation of AKT in human PCa occurs through loss of PTEN, the activated-AKT and PTEN-deficient GEMM demonstrated different degrees of disease progression. Constitutively active AKT only resulted in HGPIN lesions, whereas PTEN deletion led to PCa metastasis [34, 35]. Bi-allelic PTEN inactivation, by either homozygous deletion or deletion of one allele and mutation of the other, occurred in most PTEN-defective cancers, and characterized a particularly aggressive subset of metastatic and hormone-refractory PCa’s [36].

TMPRSS2-ERG gene fusion acts as an ‘on switch’ to trigger PCa. ERG expression is associated with elevated levels of HDAC1, subsequent down-regulation of HDAC1 target genes, activation of WNT/beta-catenin signalling pathway and inhibition of apoptotic signalling [15]. Activation of the WNT/beta-catenin signalling pathway activates AR, C-MYC and Cyclin D1. Activation of AR results in an increase of AR transcription and expression, enhanced transcription of TMPRSS2-ERG and high levels of ERG [37]. Over-expression of AR alone does not lead to neoplastic transformation, but when combined with high levels of ERG, it promotes the progression of HGPIN lesions to poorly differentiated, invasive PCa [32].

Activation of the C-MYC oncogene results in the disruption of the normal prostate differentiation program and interferes with the DNA-binding function of AR [38]. ERG can shut down androgen signalling by blocking the AR, and thereby prevents the normal development of prostate cells [39]. The subsequent up-regulation of the polycomb protein EZH2 by TMPRSS2-ERG induces an embryonic stem cell-like dedifferentiation program. Dedifferentiation as a result of dysregulation of the transcriptional memory machinery of a normal prostate cell may contribute to the lethal progression of PCa. It was shown that PCa’s with high concentrations of EZH2 have a poor prognosis [40].

Results from two large studies in watchful waiting cohorts showed that the TMPRSS2-ERG fusion is associated with an aggressive PCa phenotype. In a Swedish population, men were diagnosed with PCa by transurethral resection of the prostate (TURP) for symptomatic benign prostatic hyperplasia (BPH) without PSA screening [28]. The patients were followed without curative treatment (watchful waiting) by clinical examinations, laboratory tests and bone scans every six months during the first 2 years, and subsequently at 12-month intervals. In 15 % of the men, the TMPRSS2-ERG fusion was identified. The low frequency of TMPRSS2-ERG fusion may reflect the high percentage of low-grade tumours in this population-based cohort without PSA pre-screening. The presence of a TMPRSS2-ERG fusion was significantly associated with PCa-specific death. Another study confirmed the low frequency (19 %) of this fusion in clinical stage T1 cancers. Patients without TMPRSS2-ERG fusions demonstrated 90 % survival at 8-year follow-up [29].

FISH studies demonstrated that PCa’s with a duplication of TMPRSS2-ERG (two or more copies of 3’ERG) in the absence of sequences 5’ to ERG (due to an interstitial deletion Edel), also known as 2+Edel, are associated with poor clinical outcome [29]. This is consistent with the view that ERG over-expression is responsible for driving cancer progression, and that the 3 Mb deletion (containing genes with tumour suppressor activity) may add to the oncogenic potential of the TMPRSS2-ERG fusion product [4•, 17]. Furthermore, Mehra et al. demonstrated that all of the androgen independent metastatic PCa sites harbouring TMPRSS2-ERG were associated with Edel. These findings suggest that TMPRSS2-ERG with Edel is an aggressive, uniformly lethal, molecular sub-type of PCa associated with androgen-independent disease [18].

Different subclasses of ERG fusion transcripts are linked to poor clinical outcomes. A particular fusion transcript between exon 2 of TMPRSS2 and exon 4 of ERG (T2-E4) that encodes a TMPRSS2-ERG fusion protein is associated with aggressive disease [41]. Furthermore, gene fusions with the first in-frame ATG codon present in ERG exon 3 are associated with poor clinical outcome, because of their association with seminal vesicle invasion [41].

In a setting of surgical intervention, some studies found either no correlation between the presence of gene fusions and clinical outcome, or a correlation with favourable prognosticators such as longer recurrence-free survival, pathological stage, negative surgical margins and Gleason score [42–47]. One explanation for these observed differences is that patients from different geographical and ethnic backgrounds harbour different genomic alterations. Other explanations are differences in the size of patient cohorts, clinical settings (surgical or other interventions immediately after diagnosis versus watchful waiting), the outcome measure used (PSA biochemical failure versus cancer-specific death), and technical differences in sample collection and determination of gene fusions. Most data suggest a trend towards unfavourable outcome of the disease.

Another explanation could be the diversity of the TMPRSS2-ERG fusions. A total of 19 variant structures containing different combinations of sequences have been reported. Five transcripts do not code for functional ERG proteins, two transcripts can encode normal full-length ERG proteins, one encodes a chimeric TMPRSS2-ERG protein, and the other nine encode N-terminal truncated ERG proteins [48•, 49•]. For some of the N-terminal truncated ERG proteins, the transcriptional activation of target genes is seriously impaired [49•]. Furthermore, when over-expressed, these truncated ERG proteins may compete for binding to the ETS binding sites in the promoters, thereby blocking the oncogenesis process [50]. Therefore, these TMPRSS2-ERG fusions may lead to less aggressive PCa features and favourable clinical outcomes [49•].

As a clonal event, TMPRSS2-ERG fusions are distributed among tumour nuclei within a discrete tumour nodule [17]. However, fusion transcripts can be detected in HGPIN lesions, but not in the PCa present in the same gland. Furthermore, the majority (70 %) of cases demonstrated heterogeneous TMPRSS2 gene rearrangements between different tumour foci [51]. These observations support the hypothesis that prostate carcinogenesis may be a multicentric process, in which at least two independent pathogenetic pathways may coexist in the same prostate, leading to independent neoplasias with or without the involvement of the ETS pathway [26].

Expression profiling revealed that distinct molecular subtypes of PCa exist [52]. Some molecular aberrations are seen in indolent PCa’s (loss of 5q21.1-q21.3, loss of 6q15, over-expression of AZGP1), whereas other PCa’s have a different set of alterations (deletion of 8p21 (NKX3-1), deletion of 21q22 resulting in TMPRSS2-ERG fusion) that drive the progression from HGPIN to invasive and aggressive carcinomas [52, 53]. These findings suggest that in human PCa, the most potent function of ETS gene fusions may be to synergize with alternative genetic events and provide different pathways (e.g. AR, C-MYC, PI3K-PTEN axis) for carcinoma production and invasive behaviour.

Svensson et al. showed that rearranged and non-rearranged nuclei can occur in the same cancer focus, and different types of gene fusions may occur within a single focus as well [54]. This intra-focal heterogeneity was described as a rare event [27, 54]. However, a study by Minner et al. indicated that intra-focal heterogeneity is more frequent than was assumed [55]. Immunohistochemistry was performed on 178 large tumour samples (obtained from 178 patients) to determine ERG expression. Positive staining for ERG was observed in 58 % (103/178) of the patients. The staining was homogeneous in 16 % (29/178) and heterogeneous in 42 % (74/178). Intra-focal heterogeneity was observed in 39 % (69/178) of the cases and inter-focal heterogeneity in 3 % (5/178), i.e. in 7 % of the 74 heterogeneous cases. However, the extent of heterogeneity may be overestimated, because of the difficulties in distinguishing the different foci in large cancers.
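The proportions in this paragraph can be re-derived from the reported counts. The short sketch below recomputes each percentage against both possible denominators (all 178 patients, and the 74 cases with heterogeneous staining), since the choice of denominator changes the figure considerably for small counts. Only the counts are taken from the text; the variable and function names are illustrative.

```python
def pct(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)

n_patients = 178      # tumour samples, one per patient
erg_positive = 103    # positive ERG staining
homogeneous = 29      # homogeneous staining
heterogeneous = 74    # heterogeneous staining
intra_focal = 69      # heterogeneity within a focus
inter_focal = 5       # heterogeneity between foci

# Internal consistency of the reported counts.
counts_consistent = (
    homogeneous + heterogeneous == erg_positive
    and intra_focal + inter_focal == heterogeneous
)

table = {
    "ERG positive / all": pct(erg_positive, n_patients),
    "homogeneous / all": pct(homogeneous, n_patients),
    "heterogeneous / all": pct(heterogeneous, n_patients),
    "intra-focal / all": pct(intra_focal, n_patients),
    "inter-focal / all": pct(inter_focal, n_patients),
    "intra-focal / heterogeneous": pct(intra_focal, heterogeneous),
    "inter-focal / heterogeneous": pct(inter_focal, heterogeneous),
}

for label, value in table.items():
    print(f"{label}: {value} %")
```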

The molecular heterogeneity may be the result of tumour progression and may lead to different tumour types and clinical outcomes [56•]. To date, the largest tumour in a prostate is considered to be biologically the most significant, and defines the outcome of the disease. In 83 % of cases, TMPRSS2-ERG can be linked to the dominant tumour. However, in 17 % of cases, TMPRSS2-ERG is seen in secondary tumours [51]. Perner et al. demonstrated that through clonal selection the metastatic prostate cancer lesion harbours the same TMPRSS2-ERG fusion type as that present in the primary prostate tumour foci. The latter was not necessarily the largest tumour or the one with the highest Gleason score [57]. They concluded that because of the correlation with a more aggressive phenotype, TMPRSS2-ERG -positive PCa’s, irrespective of their size, need to be detected and treated. Men with TMPRSS2-ERG, and especially with fusion subtypes Edel, 2+Edel and T2-E4, may particularly benefit from early curative intervention.

Gene Fusions and Clinical Implications

To avoid over-diagnosis and over-treatment of patients due to the low specificity and unclear benefit of serum PSA testing, a PCa-specific biomarker test is required. Currently, the CE-marked Progensa™ PCA3 test is the first fully translated RNA-based molecular diagnostic assay available to the urologist for the detection of PCa in the urine [58]. It was shown that in men undergoing repeat biopsy, the noncoding RNA PCA3 was superior to serum PSA in predicting whether PCa is found on prostate biopsies. Similar to PCA3, the TMPRSS2-ERG fusion transcript can be detected in urine after digital rectal examination (DRE) [59, 60]. TMPRSS2-ERG in urine has a high specificity (93 %) and positive predictive value (94 %) for PCa detection [60]. In urine, TMPRSS2-ERG is associated with high serum PSA, the presence of cancer, tumour volume, PCa burden at prostatectomy, pathological stage, Gleason score ≥ 7, Epstein criteria for significant PCa (Gleason score, tumour volume, percentage of cancer per biopsy core, number of positive cores) and PCa-related death in prostatectomy and biopsy patients [28, 61–63].
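Specificity and positive predictive value are both derived from a 2×2 table of test result against biopsy outcome, and neither constrains sensitivity. The sketch below shows the standard definitions. The counts are invented for illustration only and are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures, returned as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased men who test positive
        "specificity": tn / (tn + fp),  # disease-free men who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true cancers
        "npv": tn / (tn + fn),          # negative tests that are truly negative
    }

# Hypothetical counts (NOT study data): 100 men with cancer on biopsy,
# 100 men without, and a urine marker that is rarely positive without cancer.
example = diagnostic_metrics(tp=45, fp=5, fn=55, tn=95)

for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

In this made-up example the marker is highly specific and a positive result is very likely to indicate cancer, yet more than half of the cancers are missed, which is one reason a highly specific marker may be combined with other markers.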

Improved detection of clinically significant PCa can be gained when the gene fusions are combined with PCA3 [60, 64, 65]. TMPRSS2-ERG has independent additional predictive value to PCA3 and the ERSPC risk calculator parameters for predicting PCa [66•]. TMPRSS2-ERG + PCA3 improve the multivariate PCPT risk calculator for predicting PCa diagnosis on biopsy [61]. TMPRSS2-ERG in urine adds significant predictive value to the ERSPC risk calculator to predict biopsy Gleason score and clinical tumour stage, whereas PCA3 does not [66•]. Men stratified by TMPRSS2-ERG + PCA3 scores in urine have markedly different risks of cancer, high-grade cancer, and clinically significant cancer upon biopsy. For instance, men with negative biopsies and highest TMPRSS2-ERG + PCA3 scores may benefit from active follow-up with biopsies, since they have a high chance of having clinically significant cancer [61].

Using an outlier meta-analysis (meta-COPA), SPINK1 was found to be exclusively expressed in 10 % of PCa’s without TMPRSS2-ETS fusions. Over-expression of SPINK1 was associated with an aggressive molecular sub-type of PCa. SPINK1 could be detected non-invasively in urine, and thus could serve to complement gene-fusion based urine testing for PCa [67]. Urinary SPINK1, GOLPH2 and TMPRSS2-ERG were, like PCA3, independent predictors of PCa upon repeat biopsy [68]. By combining PCA3 with these markers in a quantitative multiplexed RT-PCR analysis, the ROC AUC value improved from 0.66 (PCA3 alone) to 0.76. This multiplexed urine-based assay had 66 % sensitivity and 76 % specificity for detecting PCa in repeat biopsies. In men with elevated serum PSA, TMPRSS2-ERG + PCA3 may aid in decision making on biopsy and management of the disease [64, 68]. By combining SPINK1 with TMPRSS2-ERG and PCA3 in a multiplexed panel, the risk stratification of PCa may be further improved.

Gene Fusions as Therapeutic Target

For patients with advanced stage PCa, there are no curative therapeutic options available. Treatment of advanced PCa includes androgen deprivation therapy (ADT) using anti-androgen drugs (bicalutamide or flutamide). At first these drugs will work, but over time the cancer cells become resistant to therapy, resulting in recurrence of the cancer. Attard et al. showed that ~40 % of men with castration-resistant PCa have ERG rearrangements. Recently, evidence was provided that ERG-positive PCa’s may respond better to anti-hormonal therapy than ERG-negative PCa’s [69].

The restricted expression of gene fusions to cancer cells makes them desirable therapeutic targets. Currently, studies are ongoing that target the TMPRSS2-ERG fusion or its downstream signalling. It was shown that knockdown of the TMPRSS2-ERG fusion in a cancer cell line inhibited primary tumour growth, which makes TMPRSS2-ERG an attractive therapeutic target [70]. Recently, targeting the most common and clinically significant alternatively spliced isoforms of the TMPRSS2-ERG mRNA with specific siRNAs via liposomal nanovectors was shown to be a promising therapy for men with PCa [71•]. The siRNAs were designed to span the junction of the fusion mRNAs to avoid targeting native ERG, as the fusion mRNAs are only present in cancer cells. In vivo delivery of targeted siRNAs resulted in specific targeting of each TMPRSS2-ERG isoform and tumour growth inhibition, with no apparent toxicity and no evidence of down-regulation of the native ERG protein. Thus, delivery of junction-spanning siRNAs to fusion-positive PCa could be a potentially efficacious treatment with low toxicity [71•]. In the near future, more fusion-specific therapeutic solutions are expected to appear for men with fusion-positive PCa.
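The junction-spanning principle can be sketched as a simple window search: a candidate target site is kept only if it covers bases on both sides of the fusion junction, so that it cannot match either unfused transcript. The sequences below are arbitrary placeholders (not real TMPRSS2 or ERG sequence), the 19-nt length and 5-nt minimum flank are assumed parameters, and the function name is hypothetical. Real siRNA design involves further criteria that are not modelled here.

```python
def junction_windows(upstream, downstream, length=19, min_flank=5):
    """List (start, sequence) windows of `length` nt that span the junction.

    Each window contains at least `min_flank` nt from the upstream (5')
    partner and at least `min_flank` nt from the downstream (3') partner.
    """
    fusion = upstream + downstream
    junction = len(upstream)  # index of the first downstream base
    first_start = max(0, junction - (length - min_flank))
    last_start = min(junction - min_flank, len(fusion) - length)
    return [
        (start, fusion[start:start + length])
        for start in range(first_start, last_start + 1)
    ]

def is_fusion_specific(site, unfused_transcripts):
    """True if the site is absent from every unfused (wild-type) transcript."""
    return all(site not in transcript for transcript in unfused_transcripts)

# Placeholder sequences (hypothetical, DNA alphabet for readability).
upstream_exon = "ACGTTGCAAGGCTTAGCCATGGTACCGA"    # stands in for the 5' partner
downstream_exon = "TTGACCGGTAACGTCATGGCAATCGGTA"  # stands in for the 3' partner

candidates = junction_windows(upstream_exon, downstream_exon)
specific = [
    site for _, site in candidates
    if is_fusion_specific(site, [upstream_exon, downstream_exon])
]
print(len(candidates), "junction-spanning sites;", len(specific), "fusion-specific")
```

Under these assumptions a 19-nt site with 5-nt minimum flanks can be placed at ten positions around a given junction; each fusion isoform has its own junction and would need its own set of candidates.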

Conclusion

The presence or absence of TMPRSS2-ERG gene fusions has provided evidence that multifocal PCa arises from multiple, independent (sub)clonal expansions with diverse molecular pathways. Furthermore, patients from different geographical and ethnic backgrounds have different genomic alterations. Therefore, common cancers like PCa should be divided into smaller subsets that are defined by diverse genetic abnormalities. In the era of individualized therapy, the combination of several biomarkers (e.g. TMPRSS2-ERG + PCA3) is necessary to accurately predict the presence of PCa and the potential clinical outcome of the disease. More gene fusions may be discovered in PCa that may involve other regulatory pathways. It is important to evaluate all these differences to enable individual risk assessment and to arrive at optimal therapeutic strategies.