Introduction

Chronic lymphocytic leukemia (CLL), the most frequent lymphoproliferative disease in adults, is characterized by the clonal expansion of mature CD5+ B cells. CLL accounts for 25–30% of all types of leukemia [1], and typically affects elderly individuals [2]. Next-generation sequencing (NGS) technologies have clarified the recurrent genetic lesions in CLL and identified the molecular pathways involved in CLL pathogenesis. These studies revealed that CLL genomes are heterogeneous both between patients and among cells of the same patient [3]. Moreover, CLL leukemogenesis has been described as a multistep process initiating from immature hematopoietic stem cells (HSCs) [4]. In addition, the impressive efficacy of the kinase inhibitor ibrutinib and of the BCL-2 antagonist venetoclax has changed the standard of care for specific subsets of patients with CLL [5]. This review focuses on recent insights into CLL leukemogenesis, emphasizing the role of genetic lesions and the various steps involved. We also introduce the progress in the prospective isolation of human B1 cells and in understanding their ontogeny. B1 cells are biologically similar to CLL cells, and have been studied as the possible cellular origin of CLL cells.

Biological features and genetic lesions of CLL

CLL is a B cell malignancy, which is characterized by accumulation of clonal mature CD5-expressing B cells in the blood, bone marrow, and lymphoid tissues [6,7,8]. The prevalence of CLL increases dramatically with age. CLL cells express functional B cell receptors (BCRs) on their cell surfaces [6, 9, 10]. CLL is divided into two subgroups based on the presence of somatic hypermutations within the variable regions of the immunoglobulin heavy chain (IGHV) genes: patients with mutated IGHV genes (IGHV-M CLL) have a more favorable prognosis than patients with unmutated IGHV genes (IGHV-UM CLL) [11]. It has been proposed that CLL originates from self-reactive B cell precursors, and that the BCR somatic hypermutation status does not indicate the origin of CLL cells [12,13,14].

IGH-related translocations are rare in CLL, but more common in other types of mature B cell malignancies. The most frequent genetic lesions in CLL are deletions of 13q14 (del13q14) (50–60%) [15, 16]. Most del13q14 deletions are monoallelic and are found more frequently in IGHV-M CLL than in IGHV-UM CLL. In general, del13q14 is associated with a favorable prognosis, but the clinical course of CLL is accelerated in patients with large del13q14 deletions that affect the retinoblastoma gene (RB1) [17]. The acquisition of an additional copy of chromosome 12 (trisomy 12) occurs in ~15% of patients with CLL [15, 16]. Trisomy 12 was regarded as an intermediate-risk genetic lesion, but a study revealed that the presence of NOTCH1 mutations in patients with trisomy 12 was associated with poor survival [18]. Moreover, patients with CLL and trisomy 12 have a higher risk of progression to Richter syndrome (RS) [19,20,21]. Deletions in the 11q22–23 chromosomal region (del11q) are detected in ~15% of CLL cases [15, 16, 22], and del11q results in the loss of the tumor suppressor gene ATM (ataxia telangiectasia mutated), which encodes a DNA damage response kinase [23]. About 25% of patients with CLL and del11q harbor mutations in the remaining ATM allele, and the combination of del11q and ATM mutation in CLL is associated with poor prognosis [24]. Deletions in the 17p13 chromosomal locus (del17p) are observed in ~10% of patients [15, 16, 22], and are frequently observed in IGHV-UM CLL [15]. Del17p deletions usually involve the entire short arm of chromosome 17, leading to the loss of the tumor suppressor gene TP53 [25]. Missense mutations in the remaining TP53 allele are found in ~80% of patients with CLL and del17p [26, 27]. Consistent with the inactivation of TP53, patients with del17p exhibit high genomic complexity and a poorer overall prognosis than those with wild-type TP53 [15, 25,26,27,28,29].

In addition to the large chromosomal abnormalities described above, advances in NGS technologies have revealed recurrent driver mutations in CLL in genes such as SF3B1, ATM, TP53, NOTCH1, POT1, CHD2, XPO1, BIRC3, BRAF, MYD88, EGR2, MED12, FBXW7, ASXL1, KRAS, NRAS, MAP2K1, NFKBIE, TRAF3, and DDX3X [16, 30,31,32,33,34].

SF3B1 mutations are the most frequently observed point mutations in CLL (10–15% of cases) [30,31,32]. SF3B1 mutations cause alternative splicing in CLL cells and induce RNA changes affecting multiple CLL-associated pathways [35].

NOTCH1 is also a frequently mutated gene in CLL (~10% of cases) [16, 34, 36]. NOTCH1 mutations are preferentially observed in IGHV-UM CLL. Of note, ~40% of patients with NOTCH1-mutated CLL harbor trisomy 12, implying the relevance of these two genetic aberrations in the pathogenesis of CLL [18, 37]. The vast majority of NOTCH1 mutations in CLL increase the level of the nuclear NOTCH intracellular domain through the abrogation of the PEST domain, which is necessary for F-box and WD repeat containing protein 7 (FBXW7)-mediated proteasomal degradation of NOTCH1 [3, 34, 38]. Interestingly, FBXW7-inactivating mutations have been found in patients with CLL without NOTCH1 mutations (~3% of patients with CLL), indicating an analogous outcome of enhanced NOTCH1 signaling. Moreover, NOTCH1 activation independent of NOTCH1 mutations has been reported in CLL cells [39, 40]. Thus, the activation of the NOTCH1 pathway via multiple mechanisms can be involved in the pathogenesis of CLL [41].

POT1 mutations are found in 3–7% of patients with CLL, and are frequently observed in IGHV-UM CLL [16, 30, 31, 34, 42]. POT1 plays an important role in telomere protection [43]. During normal hematopoiesis, POT1 activity is required for maintaining the activity of self-renewing HSCs [44]. POT1 mutations alter the telomeric DNA binding domain, leading to structural aberrations and chromosomal instability [42].

In all, the genetic lesions of CLL can be categorized into several biological pathways such as NOTCH1 signaling, BCR signaling, DNA damage response, genome/chromatin structure, RNA and ribosomal processing, inflammatory pathways, NF-κB signaling, cell cycle, and apoptosis [16, 34]. These deregulated biological pathways coordinately drive CLL leukemogenesis in humans (Fig. 1).

Fig. 1. Summary of the pathways and molecules involved in the pathogenesis of CLL. These deregulated biological pathways, affected by genetic and non-genetic mechanisms, coordinately drive the leukemogenesis of CLL. The sizes of the triangles indicate the frequency of mutations reported in CLL.

Multistep leukemogenesis of CLL initiating from HSCs

Having described the important molecular pathways that NGS studies have implicated in the pathogenesis of CLL, we now focus on how such oncogenic events initiate and accumulate during the complex leukemogenesis process. In other types of human leukemia (including acute myeloid leukemia, acute lymphoblastic leukemia, and chronic myeloid leukemia), HSCs and immature progenitor cells play roles in pathogenesis, whereas CLL has been thought to originate from mature B cells. To trace the origins of human CLL, it is important to note that CLL is not always monoclonal [45, 46]. Moreover, a large cohort study demonstrated that virtually all patients with CLL had prior monoclonal B cell lymphocytosis (MBL) [47]. MBL is a preleukemic state of CLL representing the asymptomatic proliferation of clonal B cells with circulating numbers < 5000/μl [48]. The prevalence of MBL increases with age [47, 49], and it ranges from < 1% [50, 51] to 18% [52]. Of note, human MBL sometimes comprises oligoclonal B cell clones [53,54,55,56,57].

The progression from MBL to CLL reflects a step-wise process, but the stage at which the first oncogenic event occurs remains unknown. The existence of oligoclonal B cell clones in patients with both CLL and MBL suggests that the first oncogenic event may be traced as far back as progenitor cells or HSCs. These observations led us to evaluate the primitive HSC fraction in patients with CLL, and we found that the propensity to generate clonal mature B cells is already present in HSCs. CLL cells never directly engrafted in xenograft models, but HSCs derived from patients with CLL gave rise to abnormal monoclonal or oligoclonal mature B cells in vivo [58]. Moreover, NGS studies confirmed that CD34+CD19− hematopoietic stem/progenitor cells (HSPCs) and/or myeloid cells from patients with CLL shared somatic mutations identical to those detected in CLL cells. Such recurrent mutations include those in NOTCH1, SF3B1, BRAF, TP53, XPO1, MED12, NFKBIE, and EGR2 [33, 59, 60]. Whole genome sequence analyses also confirmed shared mutations between MBL/CLL cells and their respective polymorphonuclear cells, suggesting that the acquisition of some somatic mutations occurs before disease onset, likely at the HSC stage [61]. In addition, the activation of the NOTCH1 pathway is deeply involved in CLL leukemogenesis [41], and a study showed that NOTCH1 is aberrantly activated in HSPCs from patients with CLL compared with HSPCs from healthy donors, regardless of NOTCH1 mutation status. This indicates that activation of NOTCH1 is an early event in CLL leukemogenesis that may contribute to the development of aberrant HSPCs in patients with CLL [60]. Consistent with this, advances in the analysis of IGH genes using NGS technology confirmed the presence of independent oligoclonal B cell clones, even in patients with immunophenotypically monoclonal CLL [62]. Thus, the initial oncogenic events occur in human self-renewing HSCs, and further oncogenic events accumulate during the multistep leukemogenesis of CLL.
In addition to CLL, studies have clarified that the initial oncogenic events target HSPCs in several human mature lymphoid malignancies [63,64,65,66] as well as in murine models of mature lymphoid malignancies [63, 65, 67,68,69].

These studies have revealed novel steps in the complex process of leukemogenesis/lymphomagenesis: the cellular stages of tumor initiation and of final transformation are different, and stage-specific oncogenic events coordinately drive tumor progression. Further studies will help clarify the molecular mechanisms involved in this step-wise leukemogenesis/lymphomagenesis of mature lymphoid malignancies.

Understanding B1 cell biology to clarify mechanisms leading to CLL

Next, we will focus on the normal B cell counterparts of CLL. As described above, the initial oncogenic events start and accumulate in HSCs, and such mutated HSCs continuously produce their progeny including mature B cells harboring identical mutations. For the development of MBL, the preleukemic state of CLL, mature B cells derived from such HSCs expand clonally and are maintained while accumulating subsequent oncogenic events that lead to the progression of CLL [4]. The question is which mature B cells are the direct cellular origin of human CLL.

Over the years, different types of B cells have been proposed as the normal counterparts of CLL. BCR signaling can play a critical role in the development of CLL. Of note, CLL cells express a restricted BCR repertoire including antibodies with quasi-identical CDR3 [70,71,72,73]. The striking degree of structural restriction of BCRs in CLL suggests that CLL cells may be driven by recognition of similar antigens in vivo, and supports the hypothesis that an antigen-driven clonal selection process can be involved in the pathogenesis of CLL [7]. Such antigens may include autoantigens, because the BCRs of CLL cells exhibit autoreactivity and polyreactivity, suggesting that CLL cells originate from self-reactive B cell precursors [12]. Studies have identified several autoantigens recognized by BCRs of CLL cells [74,75,76]. Moreover, others have demonstrated that BCRs in patients with CLL often have the capacity for autonomous signaling via self-ligation to BCRs independently of ligands [77, 78]. Thus, an important biological feature of CLL is signaling through autoreactive BCRs. In addition to this autoreactivity, another important biological characteristic of human CLL cells is the expression of CD5 and IgM.

Because these biological features and the immunophenotype of CLL cells are very similar to those of mouse B1 cells, B1 cells have been regarded as the possible normal counterparts of CLL cells [79, 80]. B1 cells were first reported as a rare CD5+ B cell subpopulation that secretes IgM [81]. In contrast to conventional B cells (B2 cells), B1 cells were identified at a relatively low frequency in the spleen, but they were abundant in the peritoneal cavity [82, 83]. Importantly, B1 cells differ functionally from B2 cells in their spontaneous secretion of natural IgM, which is more germline-like than the immunoglobulins of B2 cells and is characterized by minimal N-region addition, broad reactivity, a restricted repertoire, and autoreactivity [84,85,86,87]. Such natural IgM secreted from B1 cells plays an important role in the early defense against bacterial and viral infections [88,89,90]. B1 cells are divided into two subsets according to CD5 expression: CD5+ B1a and CD5− B1b cells [91].

Regarding their ontogeny, B1 cells emerge independently of HSCs during early embryonic development [82, 92, 93], and they have their own self-renewal capacity [94], whereas B2 cells are derived from HSCs [95]. Studies have shown that mouse B1 cells are also generated from adult HSCs, but the extent to which adult HSCs contribute to B1 cell development, especially to B1a cells, has been debated [95,96,97,98].

Based on these unique biological features, B1 cells have been investigated as the cellular origin of CLL in mouse models. Studies using mouse models have revealed that CLL-like disorders develop efficiently from B1 cells in aged model mice [99,100,101].

Prospective isolation of human B1 cells

Despite the biological similarities between murine B1 cells and CLL cells, human B1 cells have not been intensively investigated as a normal counterpart of CLL, for three reasons. First, about half of the patients with CLL have IGHV-M CLL, and these CLL cells have extensive somatic hypermutations, which is not compatible with the biological character of B1 cells [79, 95]. Second, the normal counterpart of human CLL has been mainly investigated based on gene expression profiling (GEP) analysis [13, 14, 102]. Early GEP studies revealed a relatively homogeneous GEP irrespective of IGHV mutation status [13, 14], similar to that of human CD27+ memory B cells [13]. A recent GEP analysis comparing CD5+ CLL cells with prospectively isolated human B cell subsets suggested that the GEP of IGHV-M CLL was similar to that of the CD5+CD27+ post-germinal center (GC) subset, whereas the GEP of IGHV-UM CLL resembled that of CD5+CD27− pre-GC B cells [102]. GEP analysis is useful for identifying similarities and/or differences among specific cellular populations, but care should be exercised when interpreting its results. For example, the investigated cellular populations are not always homogeneous. If the analyzed subsets consist of several distinct cellular components, interpretation of the results becomes difficult. Third, the characteristics of human B1 cells, including their immunophenotype, remain unclear, so that information on the GEP of human B1 cells is insufficient. Thus, GEP analyses may not have been adequate to compare CLL cells with human B1 cells.

In 2011, CD20+CD27+CD43+CD70− cells were isolated as human B1 cells, which had murine B1 cell-specific properties such as spontaneous IgM secretion, efficient T cell stimulation, and tonic intracellular signaling [103]. To date, CD19+CD20+CD27+CD38lo/intCD43+ is regarded as an accurate immunophenotype of human B1 cells [104, 105]. Interestingly, the CD19+CD20+CD27+ immunophenotype of human B1 cells is shared with human memory B cells, indicating that human B1 cells were analyzed as part of the memory B cell population in the GEP study showing that memory B cells had the GEP most similar to that of CLL [13]. Therefore, further studies comparing the GEP of CLL cells with that of prospectively isolated human B1 cells will improve our understanding of the normal human B cell counterparts of CLL.

Human B1 cells are derived from adult HSCs

Mouse B1 cells emerge independently of HSCs during early embryonic development and have their own self-renewal capacity, but they are also generated from adult HSCs [82, 92,93,94,95,96,97,98]. A follow-up question is therefore whether human B1 cells can be generated from adult HSCs. Studies have been conducted to clarify the origin of human B1 cells. Xenotransplantation of CD34+CD38−/lo human HSCs from cord blood and adult BM reconstituted human B1 cells in vivo, and the analysis of patients who underwent autologous/allogeneic hematopoietic stem cell transplantation (HSCT) showed early development of human B1 cells after HSCT [106]. This study provides evidence that human B1 cells can be generated from self-renewing HSCs, but the possibility that the transplanted cells were contaminated with B1 cells cannot be excluded. To overcome the limitations of transplantation analysis, a recent study by Kageyama et al. [107] analyzed B1 cells in patients with paroxysmal nocturnal hemoglobinuria harboring somatic PIGA mutations, and found a population of B1 cells derived from PIGA-mutated adult HSCs. Thus, this analysis of unmanipulated human hematopoiesis provides fresh evidence that a fraction of B1 cells are derived from adult HSCs.

These findings are important when we consider the relevance of human B1 cells to diseases including immunodeficiency, autoimmune diseases, and CLL. Somatic mutations accumulate within self-renewing human HSCs in an age-dependent manner, and their progeny carry the identical mutations, leading to the emergence of clonal hematopoiesis [108, 109]. The extent to which such mutated HSCs contribute to the production of human B1 cells is unclear, but the fact that human HSCs differentiate into B1 cells in steady-state hematopoiesis indicates that age-related clonal hematopoiesis (ARCH) [110] can also involve B1 cells in elderly individuals. Further studies are necessary to assess how ARCH affects human B1 cell functions such as spontaneous IgM secretion and efficient T cell stimulation.

Conclusions and perspectives

Advances in NGS technologies have clarified the genetic abnormalities acquired in many types of hematological malignancies. In addition to the identification of recurrent mutations, further understanding of the clonal architectures of hematological malignancies has prompted the search for the origin of these diseases. In this review, we focused on the multistep leukemogenesis of CLL. Given that initial oncogenic events occur and accumulate within HSCs of patients with CLL, the fact that human B1 cells originate from adult HSCs and the biological features shared by B1 cells and CLL cells suggest that human B1 cells may be the normal counterparts of CLL cells. The characterization of human B1 cells is still ongoing, and further studies are required to assess the biological significance of human B1 cells, especially in the fields of immunology and hematology.