1 Introduction

One hundred years ago, Theodor Boveri in his famous book Zur Frage der Entstehung maligner Tumoren [1] proposed an idea that later became known as the somatic mutation theory of cancer, which essentially states that cancer originates in a single cell by a mitotic disturbance leading to chromosomal damage. The acquired genetic change is then propagated during subsequent mitoses to all descendants of the originally transformed cell. This concept is today the paradigmatic view of cancer pathogenesis, supported by a wealth of experimental evidence. It long remained a theoretical idea, however, which could not be examined critically until technical improvements in human cytogenetic analysis were made half a century later, culminating in the description of the normal human chromosome complement by Tjio and Levan in 1956 [2].

The discovery only 4 years later by Nowell and Hungerford [3] of an acquired characteristic marker chromosome consistently seen in patients with chronic myelogenous leukaemia (CML), later designated the Philadelphia chromosome (Ph) after the city where it had been found, immediately provided strong support for the idea that chromosome aberrations indeed may play a major role in the initiation of the carcinogenic process. It was reasonable to assume that the specific chromosomal abnormality – a perfect example of a somatic mutation in a haematopoietic stem cell – was the direct cause of the neoplastic state, i.e., the true verification of Boveri’s somatic mutation theory. The discovery of the Ph chromosome greatly stimulated interest in cancer cytogenetics in the 1960s. However, the results obtained over the next decade were disappointing. Chromosome aberrations were detected in most tumours but no specific change comparable to the Ph was found. The abnormalities varied within the same tumour types and among patients, and at the end of the 1960s most scientists agreed that chromosome aberrations were secondary epiphenomena – not the cause, but the consequence, of neoplasia. The Ph was the exception to the rule that chromosome changes did not play any important pathogenetic role in carcinogenesis.

2 Chromosome Banding

The situation changed dramatically in 1970 with the introduction of chromosome banding by Caspersson and co-workers [4]. Each chromosome, chromosome arm, and even chromosome region could now be precisely identified on the basis of its unique banding pattern, and hence aberrations that previously had not been possible to detect could now be visualized. The first characteristic cytogenetic changes in cancer cells discovered with the help of the new technique appeared in 1972 (see Mitelman and Heim [5] for a review of the early data): a 14q+ marker chromosome in Burkitt lymphoma (BL), a deletion of the long arm of a chromosome 20 in polycythemia vera, +8 in acute myeloid leukaemia (AML), and −22 in meningiomas. The first balanced rearrangements were reported shortly afterwards. In 1973, Rowley first identified a reciprocal translocation between chromosomes 8 and 21, i.e., t(8;21)(q22;q22), in the bone marrow cells of a patient with AML [6] and the very same year she showed that the Ph in CML originated through a t(9;22)(q34;q11), not a deletion of the long arm of chromosome 22 as previously thought [7]. A steadily increasing number of characteristic, specific, sometimes even pathognomonic balanced rearrangements, in particular translocations, were soon described in various haematologic disorders and malignant lymphomas, including t(8;14)(q24;q32), t(2;8)(p11;q24), and t(8;22)(q24;q11) in BL [8–11], t(15;17)(q22;q21) in acute promyelocytic leukaemia [12], t(4;11)(q21;q23) in acute lymphoblastic leukaemia [13], t(8;16)(p11;p13) in acute monocytic leukaemia [14], and t(14;18)(q32;q21) in follicular lymphoma [15]. The first specific translocations in experimental neoplasms and, as it turned out, the perfect equivalents of the characteristic rearrangements in human BL were identified by Ohno et al. [16] in mouse plasmacytomas (MPC) by the end of the 1970s.

The following decade saw a similar explosion of data emerging from studies of solid tumours, initially in particular among mesenchymal tumours. Several of the aberrations identified in the solid tumours were as specific as those previously found among haematologic malignancies, e.g., t(2;13)(q36;q14) in alveolar rhabdomyosarcoma [17], t(11;22)(q24;q12) in Ewing sarcoma [18, 19], and t(12;16)(q13;p11) in myxoid liposarcoma [20]. At this time, it also became clear that many benign tumours carried characteristic aberrations, including reciprocal translocations, e.g., t(3;8)(p21;q12) in salivary gland adenoma [21], t(3;12)(q27–28;q13–15) in lipoma [22, 23], and t(12;14)(q14;q24) in uterine leiomyoma [24–26]. All published abnormal karyotypes in neoplasia detected by banding analyses are presented in Mitelman et al. [27], and a comprehensive review of the presently known recurrent and specific chromosome aberrations may be found in Heim and Mitelman [28].

3 Recombinant DNA Technology

Technical developments in the late 1970s enabling the identification and characterization of genes in the breakpoints of chromosome rearrangements made it possible to elucidate the molecular consequences of the recurrent cancer-associated chromosome changes, and analyses in the early 1980s of the specific translocations in MPC, BL, and CML proved particularly pivotal for our understanding of how chromosome aberrations contribute to neoplastic transformation. When the different pieces of the puzzle were assembled, it became apparent that balanced rearrangements exert their effects by one of two mechanisms: transcriptional up-regulation of an oncogene in one of the breakpoints through exchange of regulatory sequences in the other breakpoint, or the creation of a hybrid gene through fusion of parts of two genes, one in each breakpoint [29]. Deregulation of an oncogene by juxtaposition to a constitutively active gene region was predicted by Klein as early as 1981 [30] and the principle was soon demonstrated in MPC and human BL. The breakpoints of the characteristic translocations in mice and humans were found to be located within or close to the MYC oncogene and one of the immunoglobulin heavy- or light-chain genes (IGH, IGK or IGL). As a consequence of the translocations, the entire coding part of MYC is juxtaposed to one of the immunoglobulin genes, resulting in deregulation of MYC because the gene is now driven by regulatory elements of the immunoglobulin genes. The alternative mechanism – the creation of a fusion gene – was documented at the same time in CML with the demonstration that the Ph chromosome, i.e., the der(22)t(9;22)(q34;q11), contains a fusion in which the 3′ part of the ABL1 oncogene from 9q34 has become juxtaposed with the 5′ part of a gene from 22q11 called the BCR gene, resulting in the creation of an in-frame BCR/ABL1 fusion transcript.
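What "in-frame" means for a fusion transcript such as BCR/ABL1 can be illustrated with a small calculation. The sketch below is only an illustration under simplifying assumptions: the function name, its two inputs, and all numbers are hypothetical and do not correspond to real BCR or ABL1 coordinates. It assumes that the 3′ partner keeps its original reading frame when the number of coding nucleotides supplied by the 5′ partner and the number of coding nucleotides lost from the 3′ partner leave the same remainder on division by three.

```python
def is_in_frame(five_prime_coding_nt: int, three_prime_coding_offset: int) -> bool:
    """Return True if the 3' partner keeps its original reading frame.

    five_prime_coding_nt: coding nucleotides contributed by the 5' partner,
        counted from its start codon up to the junction (hypothetical input).
    three_prime_coding_offset: coding nucleotides of the 3' partner that lie
        upstream of the junction and are therefore absent from the fusion.
    """
    # The first retained 3' nucleotide had codon position (offset % 3) in its
    # own gene and has codon position (five_prime_coding_nt % 3) in the fusion.
    return five_prime_coding_nt % 3 == three_prime_coding_offset % 3


# Invented toy junctions, not real gene coordinates.
examples = {
    "junction_1": (300, 90),  # both remainders are 0: frame preserved
    "junction_2": (301, 90),  # remainders 1 and 0: frame shifted
    "junction_3": (301, 91),  # both remainders are 1: frame preserved
}
results = {name: is_in_frame(*pair) for name, pair in examples.items()}
print(results)
```

A frame-shifted junction would generally be expected to disrupt the 3′ partner's protein sequence, which is why the in-frame property matters for a chimeric gene product.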

The first confirmation of the BL scenario in another B-cell neoplasm was the demonstration in 1984 that the t(14;18)(q32;q21) in follicular lymphoma results in overexpression of BCL2 [31] due to its juxtaposition to the IGH locus, and in 1986 an analogous situation was established in T-cell acute lymphoblastic leukaemia in which regulatory elements of the T-cell receptor alpha (TRA) gene were found to deregulate the expression of MYC [32–34]; other 3′ partner genes, e.g., LYL1, TAL1, LMO1, and LMO2, rearranged with the TRB and TRD loci were soon identified in T-cell leukaemias/lymphomas carrying various translocations [35–39]. The CML scenario, i.e., the creation of a chimeric fusion gene, was firmly established in both haematologic malignancies and solid tumours in the early 1990s: PML/RARA in acute promyelocytic leukaemia with t(15;17)(q22;q21) [40, 41], RET/CCDC6 in thyroid carcinomas with inv(10)(q11q21) [42], DEK/NUP214 in AML with t(6;9)(p22;q34) [43], RUNX1/RUNX1T1 in AML with t(8;21)(q22;q22) [44], and EWSR1/FLI1 in Ewing sarcoma with t(11;22)(q24;q12) [45].

The molecular insights into the pathogenetic mechanisms of cancer-specific chromosome aberrations sparked an enormous interest in cancer cytogenetics as a powerful tool to locate and identify genes important in tumourigenesis. Further technical improvements during the 1980s, in particular the development of fluorescence in situ hybridization (FISH), multi-colour FISH, and the widespread adoption of the polymerase chain reaction (PCR), added further sophistication to the analysis and radically increased the precision with which new gene fusions could be identified [28]. This course of action – the genomic characterization of the breakpoints in cytogenetically detected specific balanced aberrations – remained the unrivalled method to identify fusion genes in cancer for a quarter of a century and led to the detection of more than 700 fusion genes (Table 1.1) caused by acquired translocations, inversions, and insertions characterizing various tumour entities [27].

Table 1.1 Gene fusions in neoplasia reported 1980–2014, based on data contained in Mitelman et al. [27]

There was a major limitation of this remarkably successful approach, however. It was in principle restricted to haematological malignancies and mesenchymal tumours, which typically have simple abnormalities, often seen as a sole anomaly. Malignant epithelial tumours, representing the dominant cause of human cancer morbidity and mortality, characteristically have complex karyotypes with numerous numerical and structural abnormalities and were therefore not amenable to analysis. As a result, very few fusion genes were detected in carcinomas. By 2005, only 29 fusion genes were known in carcinomas, all organs combined, as compared to 56 in mesenchymal neoplasms and 272 in haematological malignancies. These quantitative differences led to the generally held view that fusion genes are not an important mechanism in carcinoma pathogenesis. Indirect evidence that fusion genes actually may play the same fundamental role in epithelial carcinogenesis as they do for the initiation of haematologic and mesenchymal neoplasms was presented by Mitelman et al. [46], and direct evidence clearly substantiating this view was soon produced with the help of new powerful technologies developed during the last decade.

4 Next-Generation Sequencing

The breakthrough in the search for fusion genes by alternative methods to chromosome banding analysis followed by reverse transcriptase-PCR and Sanger sequencing was made by Chinnaiyan and co-workers in 2005 [47]. They took a bioinformatics approach to look for genes in prostate cancer that showed a very high expression in RNA microarray experiments, and demonstrated that two of the outlier genes – ERG and ETV1 – were frequently fused to the 5′ part of the prostate-specific androgen-regulated gene TMPRSS2. Subsequently other ETS family genes were found to be fused with TMPRSS2, and several other 5′ partner genes that activate ETS genes were also discovered [48]. The frequencies of the various fusions vary slightly in different patient series depending on the populations studied but altogether about 80 % of prostate cancers harbour one of the presently known fusion genes, the most common being TMPRSS2/ERG. Very soon afterwards, an EML4/ALK gene fusion was found in a subset of non-small cell lung cancer by screening a retroviral cDNA expression library from cancer samples [49]. The importance of these results in prostate and lung cancer cannot be overestimated. They showed, for the first time, that cytogenetically undetectable gene fusions may be a causative factor in a substantial fraction of common human cancers, and the findings underscored the need for high-resolution methods to be used in parallel with chromosome banding to characterize cancer genomes. The advancement of next-generation sequencing (NGS) at this time revolutionized the search for new fusion genes, enabling unprecedented opportunities to process thousands of tumours for systematic mutation and fusion gene discovery without any knowledge of the genetic constitution. The first report using the new sequencing technology to find fusion genes in cancer was presented by Stratton and co-workers in 2008 [50]. Numerous studies of common cancer types, such as carcinomas of the breast, lung, prostate, and uterus, quickly followed (e.g., [51–57]), and the results have dramatically changed the gene fusion landscape. A myriad of new gene fusions – more than 1,300 – the great majority involving previously unsuspected genes, have been identified with the help of NGS-based analysis [27]. Table 1.1 shows the dramatic increase of gene fusions detected since 2010, in particular among malignant epithelial tumours, and Table 1.2 presents the distribution of all presently reported fusions among major neoplasia subtypes.
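The idea behind the outlier-expression screen used in prostate cancer, i.e., looking for genes that are very highly expressed in only a subset of tumours, can be sketched in a few lines. The following is a minimal illustration and not the published method of [47]: the scaling by median and median absolute deviation, the choice of the 90th percentile, the function name, and the gene names and expression values are all assumptions made for the example.

```python
from statistics import median


def outlier_scores(expression, percentile=0.9):
    """Score each gene by a high percentile of its robustly scaled expression.

    expression: dict mapping a gene name to a list of expression values,
        one per tumour sample (toy input, not real microarray data).
    """
    scores = {}
    for gene, values in expression.items():
        med = median(values)
        # Median absolute deviation: a spread estimate that a small group
        # of extreme samples does not inflate.
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # gene shows no variation; it cannot be scaled
        scaled = sorted((v - med) / mad for v in values)
        scores[gene] = scaled[int(percentile * (len(scaled) - 1))]
    return scores


# Invented values: GENE_A is very high in 3 of 10 samples, GENE_B is flat.
toy_expression = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 9.0, 10.0, 11.0],
    "GENE_B": [2.0, 2.1, 1.9, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0, 2.2],
}
scores = outlier_scores(toy_expression)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Scaling by the median rather than the mean keeps the baseline anchored to the majority of samples, so a gene that is strongly overexpressed in a minority of tumours receives a high score instead of being averaged away.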

Table 1.2 Number of gene fusions and genes involved in fusions in major neoplasia subtypes, based on Mitelman et al. [27]

As can be seen from Table 1.1, the total number of gene fusions now exceeds 2,000 and at least 65 % of these were identified by various sequencing technologies during the last 5 years. Clearly, the presently known gene fusions represent only the tip of an iceberg. Given the extraordinary rate at which The Cancer Genome Atlas (TCGA) project is generating cancer genomic data [58, 59], a huge number of new genomic rearrangements can be expected to be discovered within the next few years. It is important in this context to mention two notable differences between the fusion genes detected on the basis of cytogenetically identified aberrations and those so far identified by NGS. First, multiple NGS-detected fusion genes are generally found within the same tumour, e.g., more than 25 different fusions in one prostate cancer, and secondly, very few of the NGS-detected fusion genes have been found to be recurrent. A major challenge will be to verify by functional studies which of the alleged gene fusions are primary, pathogenetically important, and which are either secondary progressional changes or non-consequential “noise” abnormalities.