Introduction

It is assumed that most cancers evolve from a single cell that has acquired a mutation, but in few clinical cases there is definitive identification of the founder mutation or the precise cell of origin. Childhood acute lymphoblastic leukemia (ALL) has multiple genetically and clinically distinct subtypes but the two most common variants have either a t(12;21) chromosomal translocation, which results in a chimeric ETV6-RUNX1 fusion gene, or chromosomal hyperdiploidy.1 There is strong evidence that the former is a recurrent founder mutation. Monozygotic twins concordant for ALL were found to share the fusion gene break point sequence, suggesting that a common pre-leukemic clone was spawned in utero and shared between the twins.2 The blood chimeras that developed between these twins are a consequence of a shared single placenta and intraplacental vascular anastomoses.3 Archived neonatal blood spots of patients with ALL were also found to be positive for the gene fusion confirming ETV6-RUNX1 as an early event in the pathogenesis of this disease.4 More recent single-cell analyses showed the presence of the gene fusion in every leukemic cell.5, 6 The fusion gene is the only acquired, ‘driver’ (recurrent) mutation in the coding region in ETV6-RUNX1+ ALLs developed in monozygotic twins7, 8 and at relapse, the ETV6-RUNX1+ genomic fusion sequence is always preserved, whereas copy number abnormalities change.9 Murine models with human precursor cells harboring the ETV6-RUNX1 fusion gene generated a pre-leukemic state that only resulted in an overt leukemic phenotype upon the acquisition of additional genetic abnormalities.10, 11, 12, 13, 14

ETV6-RUNX1+ ALLs can be characterized by a CD19+ B-cell lineage precursor immunophenotype with ongoing recombinase-associated IGH rearrangements.15 These features reflect the predominant level of differentiation arrest in the B-cell lineage but not necessarily the cell of origin in which the initial functional impact of the fusion gene occurs. This could lie anywhere antecedent to B precursor cells in the lineage hierarchy. Experiments with mouse and human stem and progenitor cells suggest that ETV6-RUNX1 may arise in a multi-lineage stem cell as a weak oncogene but transforms or impacts more profoundly on early pro-B cells. Models are, however, by their nature artifactual and may not definitively define the cell of origin in a clinical context.

Sequential rearrangements at the multi-gene segment loci IGH, IGK, IGL and TCRA/D, G and B are unique to the B-cell and T-cell lineages, respectively, and are initiated by DH to JH segment rearrangements in pro-B-cells.16, 17 These markers are present in >90% of ALL samples18, 19 but cannot in themselves define the cell of origin as they are primarily products of extensive and continuing RAG-driven clonal diversification in downstream progeny of the cell that was originally transformed.20, 21 However, we reasoned that the clonal status of immunoglobulin/T-cell receptor (IG/TCR) rearrangements in the context of monozygotic twins concordant for ETV6-RUNX1+ ALL could be indicative of the fetal cell of origin that initially expanded. Although IG/TCR rearrangements are ongoing in ALL, including replacement variable-diversity-joining (V(D)J) switches, any diversity-joining (DJ) or V(D)J rearrangements shared by a twin pair would be indicative of the IG/TCR status of the cell giving rise to pre-leukemic clones. Limited screening of IG/TCR rearrangements in twin pairs has been previously reported.2, 22, 23 We have carried out a detailed and comprehensive screen of IG/TCR rearrangements in five twin pairs with concordant ETV6-RUNX1+ ALL. For one twin pair, we expanded the analysis incorporating single leukemic cells and included genomic mutations and copy number alterations (CNAs) common to the leukemias of each twin. Phylogenetic analysis combined all data describing disease evolution in this twin pair.

Materials and methods

Patient samples

Diagnostic bone marrow material was available from five monozygotic, monochorionic twin pairs concordant for pediatric ETV6-RUNX1+ precursor B-cell ALL. DNA was available from all patients with additional methanol/acetic acid-fixed cells available from one twin pair. Clinical and pathological information along with SNP-array data for all patients has previously been described.7 For one twin pair (3A/B), whole-genome sequencing data were also published.8 Ethical Committee approval from the Royal Marsden NHS Trust and written informed consent from the patients’ parents have been obtained for the study, which was conducted in accordance with the principles expressed in the Declaration of Helsinki.

IG/TCR gene rearrangement screening

Standard screening

Our standard IG/TCR screening approach has recently been described in detail.24 In brief, for all cases, polymerase chain reaction (PCR) amplifications of IG heavy chain variable-diversity-joining (IGH V(D)J; complete and incomplete), IG kappa variable-joining (IGK VJ), IGK V-kappa-deleting element (Kde), intron recombination signal sequence-Kde, IG lambda (IGL), TCR delta (TCRD), TCR gamma (TCRG) and TCR beta (TCRB) gene rearrangements were performed using primer mixes and conditions recommended by the BIOMED-2 Consortium.25, 26 Clonality was assessed by GeneScan profiling (Applied Biosystems, Paisley, UK)27 and in the case of positivity, monoplex PCR reactions were performed followed by cloning of the products (pCR2.1 TA Cloning kit; Life Technologies, Paisley, UK or pMOSBlue Blunt-Ended Cloning Kit; GE Healthcare Life Sciences, Amersham, UK) and Sanger sequencing of randomly picked colonies using BigDye chemistry. Sequences were aligned by basic local alignment search tool search to the human germ-line sequences deposited in GenBank and junction analyses were performed using the Ig basic local alignment search tool (www.ncbi.nlm.nih.gov/igblast/) and/or the ImMunoGeneTics tools (www.imgt.org).

Targeted next-generation sequencing

We developed a targeted capture approach using the NimbleGen system (Roche, Burgess Hill, UK) by tiling DNA baits across the V, D and J segments of the IGH, IGK, IGL, TCRD, TCRG and TCRB loci. This method avoids potential bias that can be caused by the initial PCR and cloning steps in standard screening methods. Genomic regions of interest were assayed using 100 ng DNA from six patients (twin pairs 2, 3 and 4) and a previously described capture protocol.28 Samples were barcoded using Illumina indexes, multiplexed and sequenced on a MiSeq platform generating 150-base pair paired-end reads. After base calling and quality control metrics, fastq reads were aligned to the reference human genome (build GRCh37) using BWA 0.6.2 and Stampy 1.0.20. On average, a median depth of 325 × was achieved in the genomic regions of interest across the six diagnostic samples analyzed. In addition to manual inspection of the whole data set in IGV, identification of clonal complete and incomplete IG/TCR rearrangements was also performed with a combination of Samtools and ImMunoGeneTics tools. Rare variants were considered to be true if the rearrangement was present at least three times independently in both forward and reverse high-quality reads.
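The read-support rule for rare variants can be sketched as a simple filter. This is only an illustration, assuming that supporting high-quality reads have already been tallied per junction and per strand; the `JunctionSupport` record, its field names and the example counts are hypothetical, and the criterion is read here as at least three reads on each strand.

```python
from dataclasses import dataclass

# Threshold taken from the text; applied per strand (an interpretation, see lead-in).
MIN_READS_PER_STRAND = 3


@dataclass(frozen=True)
class JunctionSupport:
    """Hypothetical tally of high-quality reads supporting one candidate junction."""
    junction: str
    forward_reads: int
    reverse_reads: int


def is_supported(j: JunctionSupport, min_reads: int = MIN_READS_PER_STRAND) -> bool:
    """Return True if the junction is seen often enough on both strands."""
    return j.forward_reads >= min_reads and j.reverse_reads >= min_reads


def filter_junctions(candidates):
    """Keep only candidates that pass the strand-aware read-support rule."""
    return [j for j in candidates if is_supported(j)]


# Made-up example counts, not data from the study.
candidates = [
    JunctionSupport("example junction A", forward_reads=12, reverse_reads=9),
    JunctionSupport("example junction B", forward_reads=5, reverse_reads=1),
    JunctionSupport("example junction C", forward_reads=3, reverse_reads=3),
]
kept = filter_junctions(candidates)
print([j.junction for j in kept])
```

Requiring support on both strands is a common way to guard against strand-specific sequencing or alignment artifacts, which is presumably why the rule is stated in terms of forward and reverse reads.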

Cross-test of IG/TCR gene rearrangements with allele-specific Taqman PCR

Each clonotypic rearrangement detected in only one twin (with either screening method) was screened in his/her co-twin by Taqman PCR using custom allele-specific assays, covering the unique junctional regions; assays were designed using the Primer Express v3.0 software (Applied Biosystems). Custom assay probes were labeled with FAM reporter dye and a commercial VIC-labeled Taqman RNase P copy number assay (Applied Biosystems) was used as endogenous reference. Optimal thermocycling conditions were established and reactions were performed in duplicates or triplicates depending on the amount of sample available. All experiments were performed using an ABI 7900HT Fast Real-Time Q-PCR System equipped with SDS 2.2.2 software (Applied Biosystems). Positivity was interpreted according to the ESG-MRD-ALL guidelines29 and the specificity of each Taqman PCR reaction was confirmed by cloning and sequencing the products using the methods described above.

Single-cell genotyping

Single-cell genetic analysis was performed for twin pair 3 using available methanol/acetic acid-fixed cells and our previously established multiplex quantitative-PCR (Q-PCR) approach with minor modifications.6 In brief, fixed propidium-iodide-labeled single cells were sorted into individual wells of 96-well plates by fluorescence-activated cell sorting (BD FACSAria I SORP). Eleven control cord blood cells were included in each plate. After cell lysis (G-Storm GS4 Multi Block Thermal Cycler), multiplex target-specific DNA amplification was completed using custom Taqman assays (designed with Primer Express v3.0) or commercially available DNA copy number assays (Life Technologies) to amplify regions of interest. IG/TCR rearrangements, the ETV6-RUNX1 gene fusion, ETV6 and RUNX1 gene CNAs and nine coding-region somatic nucleotide variants (SNVs) (previously identified by whole-genome sequencing8) were analyzed simultaneously. Amplified samples (diluted 1:6) and assays were then loaded into a 96.96 dynamic array (Fluidigm, San Francisco, CA, USA) and the final detection Q-PCR reaction was completed using the BioMark HD System (Fluidigm). Assays aiming to detect CNAs were completed in quadruplicates, whereas all remaining assays were used in duplicates. The gene fusion, CNA, SNV and IG/TCR status for each single cell was determined as described previously6 and the clonal phylogeny from these single-cell data was inferred using the Maximum Parsimony approach.6, 30 Data were assembled using an in-house Perl script and all parsimony analyses were performed using the computer software PAUP* version 4.0b10 for Linux.6 Trees were visualized using Dendroscope Software version 3. Owing to limitations arising from the junctional nucleotide sequence, it was impossible to include an assay for detection of the IGHD3-10/J4 junction in the Fluidigm microfluidic multiplex Q-PCR experiment. This rearrangement was found to be present in both patients of twin pair 3; thus, we performed an additional single-cell experiment using an ABI 7900HT fast real-time Q-PCR System and analyzed this incomplete IGH junction together with the ETV6-RUNX1 gene fusion and other previously identified IG rearrangements in 60 cells from each of twins 3A and 3B.
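To illustrate the maximum parsimony criterion, the following minimal sketch counts the fewest marker changes needed to explain observed subclone profiles on a given rooted binary tree, using Fitch's small-parsimony algorithm. It is not the PAUP* analysis used in this study: the subclone names, marker profiles and candidate trees are hypothetical, and gains and losses of a marker are assumed to cost the same.

```python
def fitch_changes(tree, leaf_states):
    """Minimum number of state changes for one marker on a rooted binary tree.

    tree: nested 2-tuples whose leaves are subclone names (str).
    leaf_states: mapping of leaf name -> observed state (e.g. 0 = absent, 1 = present).
    """
    def visit(node):
        if isinstance(node, str):              # leaf: observed state, no cost
            return {leaf_states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:                             # children agree: no extra change
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def tree_length(tree, profiles):
    """Sum of per-marker Fitch costs; the most parsimonious tree minimises this."""
    n_markers = len(next(iter(profiles.values())))
    return sum(
        fitch_changes(tree, {name: profile[i] for name, profile in profiles.items()})
        for i in range(n_markers)
    )


# Hypothetical profiles over three markers (1 = marker present).
profiles = {
    "normal": (0, 0, 0),
    "s1": (1, 0, 0),
    "s2": (1, 1, 0),
    "s3": (1, 1, 1),
}
tree_a = ("normal", ("s1", ("s2", "s3")))      # stepwise acquisition of markers
tree_b = (("normal", "s3"), ("s1", "s2"))      # alternative grouping

print(tree_length(tree_a, profiles), tree_length(tree_b, profiles))
```

In this toy case the stepwise tree needs three changes and the alternative needs four, so the stepwise tree would be preferred. A full analysis also has to search over tree topologies, which is the task delegated to dedicated software.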

Results

IG/TCR gene rearrangement profiling

Standard screening for clonal IG and TCR gene rearrangements was performed on the diagnostic DNA of all 10 individual patients by multiplex PCR reactions and ABI GeneScan profiling. Thirty-seven reactions detected clonalities with the following distribution: IGH DJ (#0), IGH V(D)J (#10), IGK VJ (#3), IGK Kde (#8), IGL (#4), TCRD (#4), TCRB (#2) and TCRG (#6) (Supplementary Table 1). In each patient, at least two affected genes were found. We next performed monoplex PCR reactions to scrutinize all loci that showed clonal rearrangements. Amplified products were cloned, random colonies picked and directly sequenced followed by junction analyses using the Ig basic local alignment search tool and ImMunoGeneTics tools. In total, >670 bidirectional sequencing reactions were carried out and 65 rearrangements were identified in the five twin pairs.

Targeted next-generation sequencing was successfully performed for twin pairs 2, 3 and 4. On average, a median depth of 325 × was achieved in the genomic regions of interest (Supplementary Table 2). In total, 42 genuine IG/TCR gene rearrangements present in at least three independent reads were identified in twin pairs 2A/B, 3A/B and 4A/B.

All IG/TCR rearrangements detected in the five twin pairs using our screening approaches are summarized in Supplementary Table 3. Identical IGH V(D)J and IGK Kde rearrangements were observed in the siblings of twin pairs 2A/B and 3A/B, whereas twins 4A/B shared an identical TCRD Vδ2-Dδ3 junction.

In order to further refine the immunogenotypes of twins in the context of their sibling’s profile, clonal rearrangements detected in only one patient were screened in his/her sibling by allele-specific Taqman PCR. For IGH gene rearrangements, complete V(D)J and incomplete DJ junctions were investigated separately. In a subset of rearrangements, it was not possible to design sensitive allele-specific assays due to the sub-optimal sequence at the junctions; a technical limitation of this approach that is well-known from previous studies.19 With these sibling cross-tests, we were able to identify three additional rearrangements with identical junctions common to twins 1A/B (TCRD Vδ2-Dδ3), 3A/B (IGH DJ) and 5A/B (IGH DJ) (Supplementary Table 4). The number and pattern of all rearrangements detected in our twin cohort are summarized in Figure 1.
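The comparison of immunogenotypes between siblings amounts to partitioning rearrangements into shared and patient-specific sets. The sketch below assumes each rearrangement is summarised as a (locus, junction identifier) pair; the function name and example entries are hypothetical, and exact identity of the identifier stands in for the experimental allele-specific PCR cross-test.

```python
def classify_rearrangements(twin_a, twin_b):
    """Partition rearrangements into shared and twin-specific sets.

    twin_a, twin_b: sets of (locus, junction_id) tuples, where junction_id
    is any exact representation of the junctional sequence.
    """
    return {
        "shared": twin_a & twin_b,     # identical junction in both siblings
        "only_a": twin_a - twin_b,     # detected in twin A only
        "only_b": twin_b - twin_a,     # detected in twin B only
    }


# Placeholder identifiers, not real junction sequences from the study.
twin_a = {("TCRD", "junction-1"), ("IGH V(D)J", "junction-2"), ("IGK Kde", "junction-3")}
twin_b = {("TCRD", "junction-1"), ("IGH V(D)J", "junction-4")}

result = classify_rearrangements(twin_a, twin_b)
print(sorted(result["shared"]))
```

Under the reasoning used in this study, entries in the shared set are the candidates for an in utero origin, whereas twin-specific entries are candidates for later, patient-specific diversification.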

Figure 1

Pattern of immunoglobulin (IG) and T-cell receptor (TCR) gene rearrangements detected in the five monozygotic twin pairs with concordant ETV6-RUNX1+ ALL. The figure includes all rearrangements identified by standard screening (Sanger sequencing), targeted next-generation sequencing or allele-specific Taqman PCR cross-tests between siblings. The number of identical rearrangements shared by the siblings of each twin pair is highlighted in italics. Additional alterations found in each individual are indicated with normal numbers. The incomplete IGH DJ rearrangements shown here belong to the complete IGH V(D)J rearrangements indicated in the same twin pair.

Previous bulk DNA SNP-array analysis of these twin pairs confirmed that all samples harbor diploid genomes,7 allowing oligoclonality to be interpreted on a background of two rearranged allelic variants of the IG or TCR genes. Oligoclonality was observed by analyzing the IGH gene in patients 1A, 2A, 2B, 3A, 3B, 4B and 5B. In patients 2A and 2B, V-replacement on one allele produced a third rearrangement, whereas in patients 1A, 3A, 3B and 5B identical intra-sample DJ junctions were not observed, suggesting a post-oncogenic DJ recombination event generating multiple rearrangements. In case 4B, both an oligoclonal IGH DJ constitution and V(D)J junctions suggesting V-replacement activity could be detected. Oligoclonal IGK repertoires including three different Kde junctions were observed in cases 3B and 5B. In patients 1A and 5A, more than two IGK rearrangements were identified; however, monoclonality cannot be excluded in these cases as the rearrangements include an IGK VJ and an intron recombination signal sequence-Kde that can be located on the same allele.31

Among the cross-lineage rearrangements, TCRD showed oligoclonality in patients 1B, 4A and 4B. In case 4A, the two Vδ2-Jα29 junctions comprised two independently detected Vδ2-Dδ3 joints, indicating a direct clonal relationship. Identical Vδ2-Dδ3 joints were not found in case 1B, suggesting that the various rearrangements are located on different alleles or possibly found in different cells. In patient 4B, aside from clonally related Vδ2-Dδ3 and Vδ2-Jα29 junctions, three unrelated TCRD rearrangements were detected. In both twin pairs 1A/B and 4A/B, identical Vδ2-Dδ3 rearrangements were shared by the siblings coupled with TCRD oligoclonalities mentioned above. These data demonstrate that both the B-cell lineage-specific IG genes and the cross-lineage TCRD locus may be accessible for recombination pre- and postnatally during leukemia development.

All clonal rearrangements detected in our twin pairs are presented in the context of pre- and postnatal leukemogenesis in Figures 2a–e. In each pair, we observed shared identical clonal IG/TCR rearrangement(s) in addition to the founder lesion ETV6-RUNX1, and therefore these rearrangements must have occurred in utero. The vast majority of clonal markers proved to be patient-specific and are likely to have arisen postnatally in patient-specific subclones.

Figure 2

Postulated evolutionary sequence of clonal immunogenotypic markers identified in each twin pair (a–e: twin pairs 1–5). Rearrangements are depicted in the context of in utero origin and diverging postnatal evolution of the leukemia in twin siblings. Copy number abnormality (CNA) data have been adopted from Bateman et al.,7 in which the leukemias from the same twin pairs were examined. HSC: hematopoietic stem cell.

Single-cell genotyping

Archived diagnostic leukemia cells from twin pair 3 were subjected to unbiased high-throughput single-cell multiplex Q-PCR analysis that allowed the simultaneous investigation of previously defined gene fusions, CNAs, SNVs and IG/TCR gene rearrangements for each leukemia. The gene targets included the ETV6-RUNX1 gene fusion, ETV6 and RUNX1 copy numbers, nine coding-region SNVs and eight IG/TCR rearrangements. A complete list of genetic targets analyzed and assays applied can be found in Supplementary Table 5.

After quality-control assessments and analysis as previously described,6 111 and 110 leukemic cells provided reliable data for patients 3A and 3B, respectively. Five subclones with 9–10 genetic markers and eight subclones with 7–15 markers were identified in patients 3A and 3B, respectively. The most likely phylogenetic history and relationships between the subclones found in each twin are shown in Supplementary Figures 1 and 2. The phylogenetic trees are branching and complex in both leukemias but share a common ancestry.
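Defining subclones from single-cell data can be pictured as grouping cells with identical marker profiles and reporting the size of each group. The sketch below assumes each cell has already been reduced to a tuple of marker calls; the function name, the three-marker layout and the example cells are hypothetical and do not reproduce the quality-control or calling steps of the actual analysis.

```python
from collections import Counter


def subclone_frequencies(cells):
    """Group cells by identical marker profile.

    cells: iterable of per-cell marker tuples (1 = marker detected, 0 = not).
    Returns a list of (profile, cell_count, fraction), largest subclone first.
    """
    counts = Counter(tuple(cell) for cell in cells)
    total = sum(counts.values())
    return [(profile, n, n / total) for profile, n in counts.most_common()]


# Ten made-up cells scored for three hypothetical markers.
cells = [(1, 1, 0)] * 7 + [(1, 1, 1)] * 2 + [(1, 0, 0)]

for profile, n, fraction in subclone_frequencies(cells):
    print(profile, n, f"{fraction:.0%}")
```

Fractions of this kind correspond to the subclone sizes reported as percentages of leukemic cells; with real data, allelic dropout and other assay failures would need to be handled before grouping.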

For patient 3A, a subclonal architecture with relatively modest complexity was found (Supplementary Figure 1). In addition to the ETV6-RUNX1 gene fusion, four somatic SNVs (three shared with the co-twin and one patient-specific) and an IGK Kde rearrangement were observed in every leukemic cell, suggesting that these alterations occurred early in leukemogenesis. Two complete IGH V(D)J junctions were also acquired early in the development of this leukemia; the disappearance of one of these rearrangements in 8% of the leukemic cells (subclones A4 and A5) is most likely due to the ongoing RAG enzyme-driven recombination activity replacing parts of the V(D)J junction with upstream V segments. The subclonal position of the TCRG rearrangement and ETV6 gene loss suggests that these events occurred later. ETV6 gene loss is considered to be a potential secondary ‘driver’ alteration in pediatric ETV6-RUNX1+ ALL. The re-iterative loss of ETV6, twice in independent branches of the tree, confirms convergent evolution and strengthens the notion that this gene is a secondary ‘driver’.5 The maximum parsimony algorithm predicted the most recent common ancestor from which all subclones of the phylogenetic tree descended.

Phylogenetic analysis of patient 3B defined a more complex tree than in the co-twin (Supplementary Figure 2). ETV6-RUNX1 and the three identical somatic SNVs shared with patient 3A were observed in every leukemic cell. Patient-specific SNVs (five) and IGK VJ and TCRG rearrangements were detected in 97% of leukemic cells with an early acquired IGH V(D)J rearrangement (IGHV3-7/D3-10/J4) showing loss in minor subclones (B5 and B7—total 9%); presumably a later event due to ongoing V-replacement. Re-iterative loss of ETV6 and gain of RUNX1 were also common, supporting previous findings that CNAs in ETV6-RUNX1+ ALL can occur multiple times during leukemogenesis.5 Subclone B8 harboring the highest number of genetic changes, including both ETV6 loss and RUNX1 gain, was not numerically dominant, albeit this result is based on analysis at a single time point. Again, phylogenetic analysis predicted an undetectable common ancestor from which all subclones were potentially derived.

A striking finding in this analysis was that patient 3B harbored a minor subclone (B1) with IG rearrangements that dominated the leukemia of co-twin 3A but none of the patient-specific SNVs or IG/TCR rearrangements common to the other subclones in twin 3B. Whereas this unique genetic pattern was found only in 3% of cells, the 11 markers that distinguished subclone B1 from the closest clonally related subclone (B2) provide strong evidence that this is a true population of cells.

In the additional single-cell experiment performed using an ABI 7900HT fast real-time Q-PCR System, all analyzed ETV6-RUNX1+ leukemic cells of patient 3A were found to be positive for the IGKV2-30/Kde and negative for the IGHD3-10/J4 rearrangement. Ninety percent of these cells harbored both complete IGH rearrangements analyzed (IGHV4-55/D3-9/J5 and IGHV3-35/D3-3/J4), whereas 10% were positive only for one of those; this proportion is comparable to the combined size of subclones A4 and A5 (8% in total) detected in the Fluidigm microfluidic experiment. In twin 3B, all ETV6-RUNX1+ leukemic cells were positive for the incomplete IGHD3-10/J4 rearrangement and 96% of them were negative for all three complete or end-stage IG rearrangements mentioned above. Four percent of cells harbored the IGKV2-30/Kde rearrangement previously detected in subclone B1 during the Fluidigm experiment.

The leukemias that arose in twins 3A/B have a common in utero origin, supported by the identical ETV6-RUNX1 genomic fusion gene sequence and three somatic SNVs detected in both siblings. Thus, the phylogenetic trees derived for each sibling should be considered as diverging subclonal architectures of a single leukemia. We combined and analyzed all subclones detected in the twins to acquire a more comprehensive picture of leukemogenesis (Figure 3). This integrated parsimonious tree revealed a very close phylogenetic relationship between subclone B1 detected in patient 3B and the subclones observed in co-twin 3A. The inferred most recent common ancestor of this combined tree harbored the ETV6-RUNX1 gene fusion, the three shared SNVs and an incomplete IGH DJ rearrangement; a profile characteristic of a cell at the pro-B-cell stage of development. Two of the three IG gene rearrangements found to be shared by the twins by bulk DNA analysis were present only in the minor subclone B1 in sibling 3B and not in all cells as initially presumed. The third IG rearrangement (IGH DJ) found in all leukemic 3B cells was detected in patient 3A only by bulk DNA analysis, suggesting its presence in a very small subclone. These results demonstrate that immunogenotype diversification began before birth in this twin pair, with the formation of multiple in utero subclones, originating from the same pre-leukemic clone. Before birth, these subclones could pass from one twin to the other and contribute to the development of clinically overt leukemia.

Figure 3

Combined phylogenetic history of ETV6-RUNX1+ ALLs revealed in monozygotic twins by single-cell genotyping. The phylogenetic tree was generated based on subclones detected in twins 3A/B using maximum parsimony analysis.6 Leukemic subclones are represented by red circles and the normal state is indicated by a black circle. The size of the circle is proportional to the number of cells in each subclone and the detected genetic markers are listed below each circle. Inferred subclones are represented by gray boxes; these are groups of cells that have died out, been outcompeted or, if still present, exist at low frequencies below the level of reliable detection using our approach. Tree branch lengths are directly proportional to the number of evolutionary changes inferred and the points at which the branches diverge (nodes) represent the ancestor state of a clonal clade, that is, a monophyletic group that includes all descendants of the ancestor. The number in italics at each node indicates the jackknifing value. The phylogeny shows how the clonal expansion has evolved from a common ancestor toward the observed states. Branched subclonal architectures were observed in both twins 3A and 3B. Maximum parsimony analysis has shown that subclone B1, detected in twin 3B, is the least evolved, earliest occurring subclone with a closer phylogenetic relationship to the subclones detected in co-twin 3A than to its own subclonal counterparts. IGHD3-10/J4 was added to the phylogenetic tree based on a separate single-cell experiment and bulk DNA Taqman PCR analysis.

Discussion

ETV6-RUNX1+ ALLs can be characterized by phenotypes (CD19+, CD10+, TdT+ and RAG1/2+) and genotypes (ongoing IG rearrangements) indicative of differentiation arrest in the B-cell precursor developmental compartments.32 This cancer usually originates in utero during fetal hemopoiesis with the ETV6-RUNX1 gene fusion as the likely initiating genetic event.33 The sequencing of the ETV6-RUNX1 fusion region suggests that it arises as a consequence of non-homologous end-joining as there are no binding motifs indicative of RAG1/2 or TdT activity.34, 35, 36 In contrast, recurrent secondary postnatal CNAs do have consistent signatures of RAG1/2 and TdT activity.36 These data are compatible with the notion that ETV6-RUNX1 fusion arises in utero, prior to the pro-B-cell stage in which RAGs are active, whereas secondary genetic hits impart significant clonal advantage and possibly self-renewal capacity to downstream B-cell precursors. Modeling data involving murine12, 37 and human11 cells have also suggested a possible lympho-myeloid precursor or stem cell origin of the fusion gene. Yet, ETV6-RUNX1+ leukemias never present or relapse with a myeloid phenotype. This is in contrast to some MLL-AF4 pro-B-cell ALLs38 or pre-B-cell ALL blast crisis of prior BCR-ABL1+ CML39 and BCR-ABL1+ ALL15 that may derive from lympho-myeloid stem cells and can ‘switch’ lineages.

The investigation of bulk DNA from monozygotic, monochorionic twin pairs suggests that the pre-leukemic clone spawned in utero has IG/TCR rearrangements indicative of a pro- or pre-B-cell in all cases. In those cases where only identical TCR gene rearrangements were shared by the siblings, it is theoretically possible that the proliferation associated with pre-leukemic transformation started before B-lineage commitment. However, cross-lineage rearrangements in B-cell ALL are thought to occur owing to continuing recombinase activity after the clonal expansion in the B-lineage has commenced,40 and our single-cell data, demonstrating the later occurrence of TCRG compared with early IG rearrangements, also support this concept. Scrutiny of the subclonal architecture at single-cell level in one of the twin pairs also provided evidence that the pre-leukemic transformation may occur at the pro-B-cell stage, even in cases where complete and end-stage IG rearrangements are shared by the twins.

The most parsimonious interpretation of these results is that the ETV6-RUNX1 fusion arises in a fetal progenitor or stem cell that lies upstream of B-cell lineage-restricted RAG1/2 active precursors, but is either permissive only for B-cell lineage differentiation or only has a proliferative fitness impact on early B-cells. The pre-leukemic clone therefore arises and expands preferentially in the pro- or pre-B-lineage compartment, initially in one twin, possibly in the fetal liver and undergoes DJH and V(D)JH rearrangements. All or most clonally descendent cells with self-renewal stem cell activity are shared and sustained in both twins after birth as potential targets for secondary genetic hits essential for clinical development of ALL. This interpretation is compatible with our prior observation that identical twins discordant for ALL share a pre-leukemic stem cell population with a common DJH rearrangement.11 The observation that all detectable ETV6-RUNX1+ cells in both pre-leukemic11, 41 and overtly leukemic populations42, 43 are CD19+ also indicates that if the fusion gene does indeed arise in a lympho-myeloid progenitor, then it imparts no clonal advantage to these cells—and may even totally block myeloid differentiation as opposed to partial differentiation arrest in the B-cell lineage.

It is striking that ETV6-RUNX1+ ALL is extremely rare after the age of 15 years and that most of the cases diagnosed and assessed at ages up to 14 years have a prenatal origin of the fusion gene.33 One interpretation of this finding is that there is a progenitor target cell for ETV6-RUNX1 transformation that is unique to fetal development, or a progenitor-tissue microenvironmental relationship that is restricted to this narrow prenatal time frame. A precedent for this exists with acute megakaryoblastic leukemia in patients with Down’s syndrome, which may arise in a megakaryocytic/erythroid progenitor in fetal liver that is uniquely (compared with adult) dependent upon GATA1 signals.44

In summary, we have performed the first comprehensive screen of IG/TCR rearrangements in a cohort of twins concordant for ALL, using an approach including massively parallel sequencing. We have also carried out the first in-depth scrutiny of subclonal immunogenotypic architecture in twins with ALL by a highly multiplexed combined analysis of IG/TCR rearrangements and three different types of somatic aberrations. Our results suggest that the pre-leukemic transformation conferring the in utero clonal expansion of ETV6-RUNX1+ cells occurs in an early B-cell lineage committed progenitor, most likely at the pro-B or possibly at the pre-B-cell stage. Further functional studies are needed to unveil the key mechanisms channeling the evolution of ETV6-RUNX1+ lympho-myeloid progenitor or stem cells into the B-lineage in this type of ALL.