1 Introduction

Inherited primary immunodeficiency diseases (PIDs) are a group of disorders caused by defects of the immune system. PIDs usually present with common clinical manifestations such as recurrent or severe infections, including viral, bacterial, fungal and protozoal infections that are difficult to manage with conventional treatments. Patients may also suffer from a variety of autoimmune or autoinflammatory complications. Compelling evidence has demonstrated that most PIDs are caused by genetic defects, and therefore many patients develop severe disease during the first years of life [1]. Although the reported incidence of PIDs varies between countries, ranging from 1 in 700 to 1 in 19,000, growing evidence suggests that PIDs are not rare disorders and are more common than generally thought [4,5,4]. In the United States, approximately 1 in 1200 individuals is diagnosed with a PID [4]. According to the classification of the International Union of Immunological Societies (IUIS) Expert Committee for PIDs, PIDs can be divided into nine groups: (1) combined immunodeficiencies without syndromic features; (2) combined immunodeficiencies with syndromic features; (3) predominantly antibody deficiencies; (4) diseases of immune dysregulation; (5) congenital defects of phagocyte number, function, or both; (6) defects in innate immunity; (7) autoinflammatory disorders; (8) complement deficiencies; and (9) phenocopies of PID [5]. Because PIDs are a significant cause of premature death in children, early diagnosis and appropriate management are vital to save patients and to reduce devastating permanent damage. Although the typical clinical features and basic laboratory evaluation for immunodeficiency are valuable, identification of specific gene mutations is considered the most reliable method for establishing a definitive diagnosis.
To date, approximately 320 genes associated with PIDs have been reported in the literature, and 249 of them were recognized and classified by the IUIS in 2014 (Table 1) [5]. Sanger-based single-gene sequencing is time-consuming and expensive; thus, physicians often face a major challenge in choosing a reasonable number of the most likely candidate genes for sequencing from more than 240 PID-associated genes. Many of these disorders are clinically indistinguishable from each other, and targeted functional studies are usually not clinically available, so sequencing all of the disease-related genes would be ideal for the molecular diagnosis of PIDs. In addition to clinical heterogeneity, there is also a high degree of genetic heterogeneity, which makes Sanger sequencing of a manageable number of known target genes insufficient and inefficient for identifying novel mutations. Over the past 5 years, the clinical application of next-generation sequencing (NGS) technology has developed to address these limitations [6]. NGS is a massively parallel sequencing technology that can sequence all targeted regions (multiple genes, the whole exome or the whole genome) of the human genome in a single assay. Currently, there are three common NGS-based approaches: targeted NGS panels, whole exome sequencing (WES), and whole genome sequencing (WGS). The development of NGS technology has made it possible to sequence all known disease-causative genes simultaneously in clinical laboratories. Furthermore, NGS has become a successful technology for the discovery of novel genes for Mendelian disorders. Indeed, NGS-based target gene panels and WES have been rapidly adopted by clinical laboratories. WES has not only brought tremendous progress in disease diagnosis but has also led to the discovery of many novel disease genes [7, 8]. Compared with targeted NGS panels, WES and WGS are more comprehensive, but much more expensive and time-consuming.
Although the rapid development of NGS technology may ultimately overcome the shortcomings of WES and WGS and make them cheaper and more accessible, interpretation of the vast majority of variants in genes or regions of unknown clinical significance, as well as of incidental findings, remains challenging. As a result, gene discovery remains primarily a research activity [9,10,11]. In this chapter, we describe the most recent applications of NGS in PIDs with a focus on clinical molecular diagnosis.

Table 1 249 PID associated genes were classified into nine categories by the International Union of Immunological Societies (IUIS) Expert Committee for Primary Immunodeficiency (April 2014)

2 Next-Generation Sequencing (NGS) Approaches in Primary Immunodeficiency Diseases (PIDs)

NGS, a high-throughput, massively parallel sequencing technology, allows multiple genes to be sequenced simultaneously. NGS-based gene sequencing is therefore particularly suitable for the molecular diagnosis of PIDs. Multiple NGS approaches have been established in PIDs over the past few years for both clinical diagnosis and research purposes. The three most common approaches that have been applied to PIDs are the targeted NGS gene panel, WES and WGS [12,13,14]. The targeted NGS panel is designed to simultaneously analyze genes known to be associated with a particular clinical disease phenotype, and enables clinicians to focus on a specific group of genes of interest. Thus, the targeted NGS panel allows deeper sequencing of the genes relevant to the disease [15, 16]. WES is designed for diseases with non-specific clinical features and/or diagnosis; it sequences the complete coding regions and flanking noncoding regions of the human exome, where approximately 85% of disease-causing mutations are located [17]. WES is becoming practical for clinically hard-to-diagnose Mendelian disorders because of its reduced cost [18]. Notably, WES has demonstrated enormous potential in the discovery of novel disease-causative genes [19]. WGS aims to sequence the complete DNA sequence of the whole genome, including deep intronic and other untranslated regions that are not covered by WES. In addition to the challenge of interpreting an enormous number of variants, WGS is not yet time- and cost-effective. For these reasons, WGS has not yet been widely applied in clinical use, although recent progress in WGS applications is promising [20, 21].

3 Broadly Targeted Next-Generation Sequencing (NGS) Approach for the Diagnosis of Primary Immunodeficiency Diseases (PIDs)

A targeted NGS panel analyzes only the genes known to be related to a particular disease phenotype, and thus avoids analysis of unrelated or only possibly related genes [22]. Because the panel analysis focuses on the target genes of interest, it is possible to achieve deeper sequence coverage and higher sensitivity and accuracy in mutation detection [23, 24]. This approach has become the first-line test in PIDs and has been used successfully to identify mutations in known PID genes. At least three different target enrichment methods have been adopted by clinical laboratories: RainDance emulsion PCR (RainDance Technologies, Lexington, MA, USA), hybridization-based capture (SureSelect, Agilent Technologies Inc., Santa Clara, CA, USA; SeqCap EZ system, Roche NimbleGen; Nextera and TruSeq capture systems, Illumina), and HaloPlex PCR target enrichment (Agilent Technologies Inc., Santa Clara, CA, USA). Nijman et al. [12] developed a targeted NGS panel to facilitate a genetic diagnosis in any of 170 known PID-related genes. Sequencing was performed on an AB SOLiD 5500XL sequencer (Applied Biosystems, Bleiswijk, The Netherlands). Two different enrichment approaches were adopted, yielding coverage at 20× of 93.77% with array-based capture (Agilent SurePrint G3 1 M Custom CGH Microarray, Agilent Technologies Inc., Santa Clara, CA, USA) and 91.78% with SureSelect capture (Agilent SureSelectXT Target Enrichment System, Agilent Technologies Inc., Santa Clara, CA, USA). Forty PID patients with known mutations were analyzed, and 1500 variants per person were detected after the primary analysis. To prioritize variants by likely pathogenicity, the group developed an internal classification pipeline using the Cartagenia BENCHlab NGS module (Cartagenia, Leuven, Belgium). Briefly, variants were first checked against an internal database and then against the Human Gene Mutation Database (HGMD).
Variants were considered benign if the minor allele frequency was greater than 5% in the following databases: dbSNP, Exome Variant Server, and 1000Genomes. Synonymous variants and variants located more than 20 bp into flanking intronic sequences were discarded. Nonsense, frameshift, and canonical splice site variants were considered pathogenic. In addition, the Alamut mutation interpretation software (Interactive Biosoftware, Rouen, France) was applied for interpretation and classification of the variants. This pipeline left approximately 15–25 variants per patient for further in-depth expert evaluation. The study indicated that both capture designs had high sensitivity (>99.5%) and specificity (>99.9%) for the detection of point mutations, but a success rate of only 85% for the detection of small deletion/insertion variants. To evaluate the efficiency of this NGS panel for the reclassification of PID patients, 26 patients who had previously failed to receive a genetic diagnosis were selected for testing. These patients fell into three groups: combined immunodeficiency (CID, n = 20), autoimmune lymphoproliferative syndrome-like disease (ALPS, n = 4), and hemophagocytic lymphohistiocytosis-like disease (HLH, n = 2). A genetic diagnosis was established in four patients (three with CID and one with HLH); three of these four presented with atypical phenotypes according to the disease diagnostic criteria [12]. This study demonstrated that the targeted NGS approach, using either the glass array capture or the solution-based capture method, is accurate and efficient for the detection of mutations in PID-related genes, and that a targeted NGS-based panel can be used as a first-line genetic test for PID patients. Of note, 9 genes in this panel had inadequate sequence coverage: CARD9, C4A, C4B, CFD, ELANE (ELA2), FCGR1A (CD64), FCGR2B (CD32), IKBKG (NEMO), and NCF1.
The low coverage is likely due to high GC content, highly homologous pseudogenes, or both. Therefore, this panel is not sufficiently sensitive when mutations are suspected in any of these 9 genes [12].
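
The rule-based part of the prioritization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline of Nijman et al.: the `Variant` record, its field names, the `triage` function, and the example calls are hypothetical; the internal database, HGMD, and Alamut steps are omitted; and taking the highest frequency across the population databases is one possible reading of the 5% rule.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; all names below are illustrative assumptions.
MAF_BENIGN_THRESHOLD = 0.05   # minor allele frequency above 5%
MAX_INTRONIC_DISTANCE = 20    # bp into flanking intronic sequence
LOSS_OF_FUNCTION = {"nonsense", "frameshift", "canonical_splice"}


@dataclass
class Variant:
    gene: str
    consequence: str             # e.g. "missense", "synonymous", "nonsense"
    max_population_maf: float    # assumed: highest MAF across the databases
    intronic_distance: int = 0   # 0 for exonic variants


def triage(variant: Variant) -> str:
    """Return 'benign', 'discard', 'pathogenic', or 'review' for one variant."""
    if variant.max_population_maf > MAF_BENIGN_THRESHOLD:
        return "benign"
    if variant.consequence == "synonymous":
        return "discard"
    if variant.intronic_distance > MAX_INTRONIC_DISTANCE:
        return "discard"
    if variant.consequence in LOSS_OF_FUNCTION:
        return "pathogenic"
    # Everything else is left for in-depth expert evaluation.
    return "review"


# Made-up calls, used only to exercise each branch of the rules.
calls = [
    Variant("RAG1", "nonsense", 0.0),
    Variant("JAK3", "synonymous", 0.001),
    Variant("IL7R", "missense", 0.12),
    Variant("ADA", "intronic", 0.0, intronic_distance=35),
    Variant("STAT5B", "missense", 0.0002),
]
labels = [triage(v) for v in calls]
```

In this sketch only the variants labelled `review` would reach the expert evaluation step, which corresponds to the reduction from about 1500 to 15–25 variants per patient reported in the study.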

Stoddard et al. [25] developed a targeted NGS panel that combined HaloPlex custom target enrichment with Ion Torrent PGM technology, which allows rapid screening of large panels of genes [26]. The panel included 173 genes that were known or highly suspected to be associated with particular PIDs [5]. For capture design, 2455 target regions, comprising the coding exons plus 10 flanking bases of the 173 genes, were submitted for DNA capture probe design using the Agilent SureDesign web-based application. The final probe design was expected to yield 42,909 amplicons covering 99.53% of the submitted target regions. A custom-designed HaloPlex Target Enrichment kit (Agilent Technologies) was used to capture the target regions in four steps: (1) digestion of genomic DNA with restriction enzymes; (2) hybridization of fragments with the complementary probes; (3) capture of target DNA using streptavidin beads and ligation of circularized fragments; and (4) PCR amplification of the captured target libraries. The library templates were clonally amplified using the Ion One Touch 2™, followed by an enrichment process for the recovered template-positive ion sphere particles. A standard Ion PGM 200 Sequencing V2 protocol using Ion 318 V2 chips (Life Technologies) was used for sequencing. In total, 11 healthy controls, 13 previously evaluated PID patients, and 120 patients with undiagnosed PIDs were studied. The panel detected variants with 98.1% sensitivity and >99.9% specificity. Moreover, a molecular diagnosis was made for 18 of 120 patients (15%) who previously lacked a genetic diagnosis, including 9 patients who presented with atypical clinical manifestations and had previously undergone extensive genetic and functional testing.
Interestingly, although the HaloPlex kit provided >90% coverage for most target regions, a few genes had low-coverage regions, including HLA-DRB5, TNFRSF13C, UNC93B1, CD79A and NCF4. The poor coverage in some regions of these genes was probably due to high GC content, repeat regions, and highly homologous sequences. Like the other sequencing strategies, this NGS panel was not able to detect large deletions, insertions and chromosomal abnormalities; additional techniques are required to evaluate copy number variations. The HaloPlex Target Enrichment System enables fast, simple and efficient analysis of genomic regions of interest in a large number of samples. Because it uses single-tube target amplification and removes the need for separate library preparation and for dedicated instrumentation or automation, total sample preparation time and cost are reduced. However, restriction enzyme digestion can result in unexpected coverage gaps, especially when fragments are longer than the read length. Despite these technical limitations, Stoddard et al. [25] demonstrated that their targeted NGS panel was a cost-effective first-line genetic test for PIDs. This targeted NGS panel would be most appropriate as a first test for PID patients who present with atypical or widely variable/nonspecific clinical phenotypes.

Similarly, Moens et al. [27] developed another targeted NGS panel using selector-based target enrichment (HaloPlex system, Halo Genomics). The selector assays were designed to cover all exons and UTRs, +/− 25 bp, of 179 genes, covering all known disease-causing genes in PIDs [27]. Sequencing was performed on Illumina’s Genome Analyzer IIx (Illumina, San Diego, CA, USA). In this study, 33 patients were examined, 18 of whom had at least one known causal mutation prior to the experiment. HaloPlex-based enrichment followed by Illumina sequencing provided a minimum coverage of 20 reads in 88% of the target regions on average. The average read depth in the targeted region was 1304 ± 662 reads per base. In the 18 individuals with known mutations, this NGS approach detected 20 of 24 mutations (83%) and resolved the diagnosis for 14 of the 18 individuals (78%) in a single assay. Of the 4 missed mutations, 3 (2 missense, 1 splice variant) had low read depth and 1 (a small deletion) was not covered by design. There were other regions of no or low coverage in this study, which might be due to overall low read depth: ~21% of the target genes contained one or more exons with <20X average read depth across all samples. Interestingly, the CFD gene showed low overall coverage both in this NGS panel and in the panel of Nijman et al. described above [12]. Despite these shortcomings of the targeted NGS panel, the majority of PID cases could be resolved using this sequencing approach.
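
The coverage figures quoted in these studies (the fraction of target bases with at least 20 reads, the mean read depth, and exons whose average depth falls below 20X) can be computed from per-base depths as in the following minimal sketch. The function names, the input layout (a list of depths per targeted base, and a dictionary of exon label to depths), and the example numbers are assumptions made for illustration; they are not taken from any of the cited pipelines.

```python
def coverage_summary(depths, min_depth=20):
    """Summarise per-base read depths over the targeted bases of one sample."""
    if not depths:
        raise ValueError("depths must contain at least one targeted base")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "breadth": covered / len(depths),         # fraction of bases at >= min_depth
        "mean_depth": sum(depths) / len(depths),  # average reads per base
    }


def low_coverage_exons(exon_depths, min_mean_depth=20):
    """Return the labels of exons whose average depth is below min_mean_depth."""
    flagged = []
    for exon, depths in exon_depths.items():
        if depths and sum(depths) / len(depths) < min_mean_depth:
            flagged.append(exon)
    return sorted(flagged)


# Made-up depths for eight targeted bases of one sample.
example = coverage_summary([0, 5, 19, 20, 150, 1300, 2400, 900])

# Made-up per-exon depths; the exon labels are placeholders.
flagged = low_coverage_exons({
    "GENE_A_exon1": [3, 8, 12],
    "GENE_A_exon2": [400, 650, 700],
    "GENE_B_exon1": [25, 10, 15],
})
```

A high mean depth does not guarantee high breadth: in the example the mean depth is close to 600 reads per base, yet three of the eight bases fall below the 20-read threshold, which is the same pattern the study reports for individual exons.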

More recently, Al-Mousa et al. developed an unbiased targeted NGS approach for PIDs using the Ion Torrent Proton sequencing platform [28]. This comprehensive NGS panel included 162 genes associated with PIDs. To evaluate the panel’s clinical utility, sensitivity and specificity, a total of 261 patients with suspected PIDs were tested. Of these, 122 had known disease mutations and were used to assess specificity and sensitivity. The actual coverage of the targeted regions (coding regions and 10-bp flanking regions of the associated introns) was 96.5%, and only 9 of the 162 genes had coverage below 90%. Sensitivity for single nucleotide variants reached 96%; the missed mutations were due to low read depth. The overall specificity of the panel was 88.2%. Interestingly, the panel detected previously unknown mutations and led to a genetic diagnosis in 35 of the 139 unsolved cases.

Although coverage differed somewhat with the platforms, enrichment methods and numbers of genes used in the different NGS panels, all of these studies demonstrated the efficacy of NGS panels for the diagnosis of PIDs (Table 2).
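
As a worked check of the proportions quoted for the four panels, the counts given in the text can be converted to percentages as follows. The helper name `rate` and the dictionary labels are illustrative; the counts are the ones reported above. The cohorts and study designs differ, so the resulting figures are not directly comparable with one another.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, 1)


# Counts as reported in the text for each study.
reported = {
    "Nijman et al. [12]: undiagnosed patients solved": (4, 26),
    "Stoddard et al. [25]: undiagnosed patients solved": (18, 120),
    "Moens et al. [27]: known mutations detected": (20, 24),
    "Moens et al. [27]: known cases solved": (14, 18),
    "Al-Mousa et al. [28]: unsolved cases diagnosed": (35, 139),
}

summary = {label: rate(n, d) for label, (n, d) in reported.items()}

for label, percent in summary.items():
    print(f"{label}: {percent}%")
```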

Table 2 Summary of targeted-NGS panels in PIDs

4 Specific Targeted-NGS Sub-panels in Primary Immunodeficiency Diseases

Targeted gene sub-panels aim to establish a definitive diagnosis in subgroups of PID patients who have similar clinical and cellular manifestations (Table 1). Compared with the large all-inclusive NGS panels described above (~170 genes), these specific targeted NGS sub-panels (10–40 genes) generate far fewer gene variants. In addition to being less time-consuming, the specific targeted NGS method can achieve a higher molecular diagnostic yield. One reason is that 100% coverage can be provided for the genes on a sub-panel, because the limited low-coverage regions of these genes can be rescued by Sanger sequencing. The advantage of small panels is particularly significant for PIDs with a defined clinical phenotype that fit particular diagnostic criteria, for example familial hemophagocytic lymphohistiocytosis (FHL). FHL is a rare primary immunodeficiency disease characterized by an uncontrolled hyperinflammatory response [29]. Five FHL subtypes (FHL1, FHL2, FHL3, FHL4, and FHL5) have been described, and four causative genes have been identified: PRF1 (FHL2), UNC13D (FHL3), STX11 (FHL4), and STXBP2 (FHL5) (Table 1, under the category of diseases of immune dysregulation). These genetic abnormalities affect granule-dependent lymphocyte cytotoxicity by impairing trafficking, docking, priming for exocytosis, or membrane fusion of cytolytic granules. The function of this pathway may also be severely impaired by the loss of functional perforin, the key pore-forming protein for the delivery of proapoptotic granzymes. Diverse mutations in this pathway all give rise to similar clinical phenotypes (albeit of variable severity). Although FHL has an autosomal recessive inheritance pattern, our recent study has indicated that a digenic mode of inheritance may also exist as a result of a synergistic functional effect among genes involved in cytotoxic lymphocyte degranulation (Fig. 1) [30, 31].
Many FHL patients develop the disease within the first few months or years of life and, occasionally, in utero, although later childhood or adult onset is more common than previously suspected [32, 33]. Without treatment, most FHL patients die within 2 months of disease onset because of the progression of hemophagocytic lymphohistiocytosis (HLH). Although a possible diagnosis of FHL can be made on the basis of 8 clinical criteria (fever, splenomegaly, bicytopenia, hypertriglyceridemia and/or hypofibrinogenemia, hemophagocytosis, low/absent NK-cell activity, hyperferritinemia, and high soluble interleukin-2 receptor levels) [34], the definitive diagnosis of a genetic form of HLH (FHL) is often challenging because these diagnostic criteria lack specificity and correlate poorly with defects in particular gene(s). In addition, there are overlapping clinical features between FHL and several other inherited immune disorders associated with highly lethal hemophagocytic syndromes, including X-linked lymphoproliferative disease (SH2D1A and XIAP), Hermansky-Pudlak syndrome (AP3B1 and BLOC1S6), Chediak-Higashi syndrome (LYST), Griscelli syndrome type 2 (RAB27A), X-linked immunodeficiency with magnesium defect, Epstein-Barr virus infection, and neoplasia (MAGT1), CD27 deficiency, and interleukin-2-inducible T-cell kinase (ITK) deficiency [35,36,37,38,39,40,41,42,43,44,45]. For this reason, a comprehensive genetic diagnostic panel is needed. The Molecular Genetics Laboratory at the Cincinnati Children’s Hospital Medical Center (CCHMC) has developed a specific targeted NGS panel for FHL. This panel has 14 genes: AP3B1, BLOC1S6 (PLDN), ITK, LYST, MAGT1, PRF1, RAB27A, SH2D1A, SLC7A7, STX11, STXBP2, TNFRSF7 (CD27), UNC13D and XIAP (BIRC4).
All coding exons and 20 bases of the flanking intronic regions, as well as the 5′ and 3′ untranslated regions (20 bases from the first or last exon), were enriched using microdroplet PCR technology (RainDance Technologies Inc., USA) as previously published [46], followed by sequencing on the Illumina HiSeq 2500 instrument (Illumina Inc., USA). The resulting sequence reads were aligned against the reference DNA sequence, and variants were called using NextGENe software (SoftGenetics, LLC, USA) [47]. PCR/Sanger sequencing was then used to fill gaps in insufficiently covered regions and to confirm pathogenic and novel variants. The analytic sensitivity of this methodology is >99%. Although small deletions and insertions of <10 bases can be routinely detected with this panel, larger deletions or duplications cannot be detected by this technology. More than 98% of the target regions of the different panels are covered at >20X (Table 3). In a review of the NGS panel results from the first 370 clinical cases of suspected HLH, single or bi-allelic HLH pathogenic variants were identified in 31 patients, 175 patients had variants of uncertain clinical significance, and 13 patients carried variants in more than two genes. Although the detection of exonic deletions and insertions from NGS data has been reported (Feng YM et al., GIM 2015 17:99, PMID 25032985; Wang J et al., GIM, PMID 26402642), we have not validated our NGS assay for this purpose. We have therefore developed targeted deletion and duplication analysis for each gene on this panel as a complementary test. The deletion and duplication test is usually recommended for patients with a normal NGS result or a single (heterozygous) mutation.
Gross deletions and duplications have been identified in 5 patients in several FHL-related genes (unpublished data). Given its lower cost, faster turn-around time and higher yield of causative mutations compared with traditional gene-by-gene PCR/Sanger analysis, the HLH targeted NGS panel has been recommended as the first-line test for patients presenting with FHL-like syndromes.
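
The Sanger fill-in step mentioned above requires a list of the stretches within each target region that did not reach the depth threshold. The following minimal sketch shows one way such a list could be derived from per-base depths; the function name, the inclusive coordinate convention, the 20-read cut-off as a default, and the example values are assumptions made for illustration and do not describe the laboratory's actual workflow.

```python
def low_coverage_intervals(start, depths, min_depth=20):
    """Return inclusive (first, last) positions of runs with depth < min_depth.

    `start` is the genomic position of the first base in `depths`.
    """
    intervals = []
    run_start = None
    for offset, depth in enumerate(depths):
        position = start + offset
        if depth < min_depth:
            if run_start is None:
                run_start = position      # a new under-covered run begins
        elif run_start is not None:
            intervals.append((run_start, position - 1))
            run_start = None
    if run_start is not None:             # run extends to the end of the region
        intervals.append((run_start, start + len(depths) - 1))
    return intervals


# Made-up depths for a 7-base region starting at position 100.
gaps = low_coverage_intervals(100, [30, 25, 10, 5, 40, 50, 3])
```

Each returned interval would then be a candidate for PCR/Sanger sequencing, so that the reported result covers the full target even where NGS depth was insufficient.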

Fig. 1 The granule exocytosis pathway of cytotoxic lymphocytes in familial hemophagocytic lymphohistiocytosis (FHL). Perforin is the key delivery molecule for proapoptotic granzymes in the perforin-dependent cytotoxic lymphocytes and is associated with FHL2. Defects in other FHL-associated genes (MUNC13-4, STX11, STXBP2 and RAB27A) may also affect granule-dependent lymphocyte cytotoxicity by impairing trafficking, docking, priming for exocytosis, or membrane fusion of cytolytic granules. Synergistic effects of these different molecules in this cytotoxic pathway have also been observed.

Table 3 Summary of Data Quality Metrics of the FHL, ALPS and SCID targeted PID panels

Similarly, a targeted sub-panel for the diagnosis of autoimmune lymphoproliferative syndrome (ALPS) is clinically available. ALPS is a disorder of T cell dysregulation caused by defective Fas-mediated apoptosis [48]. ALPS patients usually present with lymphadenopathy, hepatosplenomegaly, autoimmunity and increased rates of malignancy. The diagnosis of ALPS is based on a constellation of clinical findings, laboratory abnormalities, and identification of mutations in genes relevant to the tumor necrosis factor receptor superfamily member 6 (Fas) pathway of apoptosis [49]. However, reaching a definite diagnosis is always challenging because of the heterogeneity, variability and variable expressivity of the disease. An ALPS NGS panel can serve as an important aid for the molecular diagnosis of ALPS. The targeted ALPS NGS panel includes the following 9 genes: FAS, FASLG, CASP10, CASP8, FADD, KRAS, NRAS, MAGT1 and ITK. Among these genes, mutation(s) identified in FAS, FASLG, or CASP10 can confirm the ALPS diagnosis [50]. Mutations in FADD are associated with patients who have many of the biochemical markers of ALPS but lack the characteristic clinical features of lymphadenopathy and splenomegaly [51]; biallelic mutations in CASP8 result in a rare immunodeficiency characterized by lymphadenopathy and splenomegaly, marginal elevation of double negative T cells (DNTCs), and defective FAS-mediated apoptosis, in addition to frequent bacterial and viral infections. Mutations in NRAS and KRAS may lead to an ALPS-like condition known as RAS-associated lymphoproliferative disease [50, 52]. Mutations in ITK and MAGT1 are not associated with ALPS, but these genes are included in the panel as part of the differential diagnosis of lymphoproliferative disorders. The ALPS panel has shown a reasonable clinical sensitivity.
Of the 80 patients tested recently (personal communication), 6 (7.5%) had pathogenic (4) or likely pathogenic (2) variants, establishing a definite molecular diagnosis. In addition, 5 patients (6.3%) had variants of unknown clinical significance (unpublished data).

Severe combined immunodeficiency (SCID) is a group of distinct congenital disorders involving combined cellular and humoral immunodeficiency that results from absent or significantly impaired function of T lymphocytes and B lymphocytes. SCID is the most severe form of PID [53]. Patients with SCID usually develop disease between 3 and 6 months of age and typically present with recurrent or persistent infections (severe bacterial, viral or fungal infections) and failure to thrive [54, 55]. Although the different forms of SCID are currently classified according to the presence or absence of T, B, and NK cells, the discovery of novel causative genes has added new complex clinical phenotypes [5, 56]. X-linked SCID (X-SCID) is the most common form of SCID and affects male infants. It results from defects in the IL2RG gene, which encodes the common gamma chain, gamma c, of the leukocyte receptors for interleukin-2 and multiple other cytokines [57]. Puck et al. [57] identified deleterious IL2RG mutations in 87 of 103 families (84.5%) with males affected by non-ADA–deficient SCID, suggesting a high frequency of IL2RG mutations in X-linked SCID. The remaining SCID disorders are caused by autosomal recessive mutations. The estimated prevalence is 1 in 50,000 births, with a higher prevalence in males [56]. SCID is considered a pediatric emergency and is often fatal by 6–12 months of age without treatment. For this reason, at least 34 US states have already implemented or agreed to move forward with newborn screening for SCID. The screening is performed by assaying for T-cell receptor excision circles (TRECs). This test has led to early identification of SCID patients and has made it possible to provide appropriate management before serious damage occurs. Although TREC screening has not been adopted nationally, the outcome of SCID screening has been very encouraging [58].
However, follow-up sequencing of the SCID-related genes, which is regarded as the gold standard, is required to establish a definite SCID diagnosis. Currently, a SCID NGS panel of 20 genes is available (ADA; CD3D; CD3E; CD45(PTPRC); DCLRE1C; FOXN1; IL2RG; IL7R; JAK3; LIG4; NHEJ1; ORAI1; PNP; RAG1; RAG2; RMRP; STAT5B; STIM1; TBX1; ZAP70). These genes are associated with SCID and/or SCID-type conditions such as Omenn syndrome, Cartilage-Hair hypoplasia, and Velocardiofacial syndrome [59,60,61,62,63,64,65]. Omenn syndrome is characterized by an absence of circulating B cells and an infiltration of the skin and the intestine by activated oligoclonal T lymphocytes. Along with immunodeficiency, Omenn syndrome presents with severe erythroderma, desquamation, alopecia, lymphadenopathy, eosinophilia and elevated IgE. Cartilage-Hair hypoplasia, which can be caused by mutations in the RMRP gene, is characterized by metaphyseal chondrodysplasia presenting with short stature and short limbs; in addition, many patients present with SCID-type immunodeficiency [66, 67]. Because immunodeficiency secondary to thymic hypoplasia is common in Velocardiofacial syndrome, and TBX1 is the most important gene for this syndrome [68], this gene has also been included in the SCID panel. The SCID panel detects 60–90% of the reported mutations, and the sensitivity of DNA sequencing is over 99% for the detection of nucleotide base changes, small deletions and insertions in the genes of interest. In 50 patients recently tested with the SCID panel at CCHMC, we identified 9 pathogenic variants, 4 likely pathogenic variants and 26 variants of unknown clinical significance. All of these variants were confirmed by Sanger sequencing. Overall, 10% of patients (5/50) reached a definite molecular diagnosis of SCID by carrying either two pathogenic variants or, in males, one X-linked pathogenic variant (unpublished data).
Because large exonic deletions have been reported in ADA, DCLRE1C, IL2RG, JAK3, NHEJ1, PTPRC, RAG1, RAG2, RMRP, STAT5B and TBX1, deletion/duplication testing is indicated as a follow-up test in patients with a single mutation in any of these genes. To address this issue, Yu et al. developed a target gene capture/NGS assay with deep coverage that allows simultaneous detection of single nucleotide variants and exonic copy number variants in one comprehensive assessment [69].
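
The two decision rules stated above (a definite molecular diagnosis requires two pathogenic variants or, in males, one X-linked pathogenic variant; a single mutation in a gene with reported exonic deletions prompts deletion/duplication follow-up) can be expressed as a small sketch. The `Finding` record, the `interpret` function, and the example results are hypothetical. The sketch deliberately simplifies: it does not check whether two variants lie on different alleles, it counts only variants classified as pathogenic, and it applies the recessive rule to every autosomal gene on the panel, so it should be read as an illustration of the logic rather than a clinical algorithm.

```python
from dataclasses import dataclass

# Genes with reported large exonic deletions, as listed in the text.
DELDUP_GENES = {"ADA", "DCLRE1C", "IL2RG", "JAK3", "NHEJ1", "PTPRC",
                "RAG1", "RAG2", "RMRP", "STAT5B", "TBX1"}
# IL2RG is the X-linked gene on the 20-gene SCID panel described above.
X_LINKED_GENES = {"IL2RG"}


@dataclass
class Finding:
    gene: str
    classification: str   # "pathogenic", "likely_pathogenic", or "vus"
    allele_count: int = 1  # 1 = heterozygous or hemizygous, 2 = homozygous


def interpret(findings, sex):
    """Apply the two rules to one patient's panel result (phase not checked)."""
    alleles = {}
    for f in findings:
        if f.classification == "pathogenic":
            alleles[f.gene] = alleles.get(f.gene, 0) + f.allele_count
    diagnosed = sorted(
        gene for gene, n in alleles.items()
        if n >= 2 or (gene in X_LINKED_GENES and sex == "male")
    )
    follow_up = sorted(
        gene for gene, n in alleles.items()
        if gene not in diagnosed and n == 1 and gene in DELDUP_GENES
    )
    return {"diagnosed": diagnosed, "deldup_follow_up": follow_up}


# Made-up results for three patients.
boy = interpret([Finding("IL2RG", "pathogenic")], sex="male")
girl = interpret([Finding("RAG1", "pathogenic"),
                  Finding("RAG1", "pathogenic")], sex="female")
carrier = interpret([Finding("JAK3", "pathogenic"),
                     Finding("IL7R", "vus")], sex="female")
```

In the third example the single JAK3 variant does not establish a diagnosis on its own, but because exonic deletions have been reported in JAK3 the sketch flags the gene for deletion/duplication testing.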

As the pathophysiology of PIDs is better characterized and new genes emerge with unprecedented speed [5, 70, 71], targeted panels need to be expanded and updated frequently to meet diagnostic needs and improve clinical sensitivity. In fact, more than 30 new gene defects were added by the IUIS in its 2014 update of the PID classification, compared with the previous classification in 2011 [5]. It is likely that novel PID-related genes will continue to be identified in the future with the rapid advances in NGS technology as well as the widespread use of WES and WGS.

5 Whole Exome Sequencing (WES) in Primary Immunodeficiency Diseases (PIDs)

The current gene panels can only offer rapid genetic diagnosis for PIDs caused by mutations in known genes. When facing a new phenotype, an atypical phenotype, or a phenotype that is difficult to classify into any category of PID, WES is currently the most useful tool for clinical molecular diagnosis as well as for the discovery of novel disease-associated genes. Using WES, Dickinson et al. [13] examined 4 unrelated patients with an immunodeficiency syndrome involving loss of dendritic cells, monocytes, and B and natural killer cells (DCML deficiency) [72]. They identified novel disease-causing mutations in the GATA2 gene in all 4 patients. GATA2 is a transcription factor with 2 highly conserved zinc finger domains that mediate protein-DNA and protein-protein interactions, and it is required for stem cell homeostasis [73]. Furthermore, functional studies indicated that haploinsufficiency and dominant-negative loss of GATA2 function were potential mechanisms of pathogenesis in DCML deficiency. This study again demonstrated that WES is a powerful tool for identifying disease-causing mutations in a small number of unrelated, sporadic cases of PID [13]. Hermansky-Pudlak syndrome (HPS) is a rare autosomal recessive disorder characterized by platelet dysfunction, oculocutaneous albinism, and life-threatening pulmonary fibrosis. By WES, Badolato et al. [35] identified a homozygous nonsense mutation, c.232C > T (p.Q78X), in PLDN in a female with an HPS-like primary immunodeficiency syndrome. In vitro, this PLDN mutation caused defective NK-cell degranulation and cytolysis, suggesting that the c.232C > T (p.Q78X) change in PLDN is pathogenic.

WES is particularly useful for patients who have no identifiable mutations after the available NGS panels are exhausted. Patel et al. [74] reported an infant with low TRECs and non-SCID T lymphopenia. An early diagnosis of SCID, followed by appropriate treatment and management (including avoidance of exposure to viral infections or live virus vaccines, immune-restoring treatments, or early hematopoietic stem cell transplantation (HSCT)), would change the patient’s prognosis significantly. Targeted NGS of SCID-associated genes (ADA, AK2, CD3D, CD3zeta, DCLRE1C, IL2RG, IL7R, JAK3, LIG4, NHEJ1, PNP, PTPRC, RAC2, RAG1, RAG2, RMRP, and ZAP70; GeneDx, Gaithersburg, MD) did not reveal any pathogenic mutations. By contrast, WES analysis identified two nonsense mutations, c.842 T > G (p.L281X) and c.1030C > T (p.Q344X), in a compound heterozygous state in the NBN gene. These mutations were predicted to result in a loss of function, and this prediction was consistent with the absence of protein by immunoblotting and with radiosensitivity testing on the patient's lymphocytes. NBN encodes nibrin, a component of a molecular complex involved in the early recognition and subsequent repair of DNA damage [75]. Thus, WES led to a definite diagnosis of Nijmegen breakage syndrome (NBS). The clinical phenotype of NBS is variable, although most patients with NBS present with immunodeficiency [76]. The appropriate application of WES therefore avoided a complicated differential diagnosis among a large group of immunodeficiency diseases. Although many new PID-associated genes have been identified, more novel disease-causative genes are expected to be discovered in the future by NGS technology [5].

One limitation of targeted NGS panels is that the gene list has to be updated frequently, and each update must be followed by clinical validation, which is laborious and expensive. WES overcomes this limitation. For example, although more than 14 different SCID genes have been reported, no specific gene defects have yet been detected in many patients with hereditary abnormalities [77]. In a nonconsanguineous patient with early-onset profound combined immunodeficiency and immune dysregulation, Punwani et al. [78] did not find any mutations with a comprehensive NGS panel of known SCID genes. WES analysis, however, revealed compound heterozygous mutations, c.1019-2A > G and c.1060delC (p.Y353fs*18), in a new gene, MALT1. Functional studies indicated that both T cells and B cells were affected and that the NF-κB signaling pathway was impaired. This study suggested that the immunodeficiency in this patient was due to a MALT1 defect. Based on these molecular findings, the patient was effectively treated by HSCT. This example highlights how WES not only made a definite diagnosis possible but also led to a successful outcome through appropriate intervention.

Due to its cost-effectiveness, WES has been broadly used for discovering new genetic etiologies of immunodeficiency. Zhang et al. [79] reported a new syndrome of severe atopy, recurrent infections, autoimmunity, vasculitis, renal failure, and lymphoma, associated with motor and neurocognitive impairments. Using WES combined with Sanger sequencing, they identified two mutations in PGM3, c.1585G > C (p.E529Q) and c.1438_1442del (p.L480Sfs*10), in a compound heterozygous state in one family, and a homozygous mutation, c.975 T > G (p.D325E), in another family. Further functional studies indicated that the mutations reduced enzymatic activity and caused abnormal glycosylation. The mutations segregated with the disease. Interestingly, all these patients showed hypo-sialylation of O-linked serum glycans, consistent with impaired PGM3 function. The PGM3 gene encodes phosphoglucomutase 3 (PGM3), a member of the hexose phosphate mutase family that catalyzes the reversible conversion of GlcNAc-6-phosphate (GlcNAc-6-P) to GlcNAc-1-P, which is required for protein glycosylation [80]. These results defined, for the first time, a new PGM3-mediated disorder characterized by severe atopy, immune deficiency, autoimmunity, intellectual disability and hypomyelination.

Willmann et al. [81] studied a large consanguineous pedigree with two patients presenting with combined immunodeficiency, including recurrent, severe bacterial and viral infections and Cryptosporidium infection. Combining WES with single-nucleotide polymorphism (SNP) array-based homozygosity mapping, they identified a single homozygous variant, c.1694C > G (p.P565R), in MAP3K14 on chromosome 17q21. This gene encodes NF-κB-inducing kinase (NIK), a serine/threonine protein kinase. NIK binds to TRAF2 and stimulates NF-κB activity. Interestingly, patients with mutated NIK exhibit B-cell lymphopenia, decreased frequencies of class-switched memory B cells and hypogammaglobulinemia due to impaired B-cell survival, and impaired ICOSL expression. In this study, the unexpectedly broad range of phenotypic aberrations (affecting B-, T- and NK-lineages) highlighted essential roles for NIK and for adequate control of non-canonical NF-κB signaling in the generation and maintenance of the human immune system, thus establishing functional NIK deficiency as a novel, pervasive combined primary immunodeficiency syndrome. By WES, Martin et al. [82] identified a homozygous mutation (c.1692-1G > C) in CTPS1 in 8 patients from 5 unrelated families with a novel and life-threatening immunodeficiency. All patients presented with early onset of severe infections, mostly caused by herpes viruses, including EBV and varicella zoster virus (VZV), and also suffered from recurrent encapsulated bacterial infections, a spectrum of infections typical of a combined deficiency of adaptive immunity. CTPS1 encodes CTP synthase 1, which is responsible for the catalytic conversion of uridine triphosphate to cytidine triphosphate (CTP). CTP is a building block required for the biosynthesis of DNA, RNA and phospholipids [83]. CTP synthase activity may play an important role in DNA synthesis in lymphocytes [84]. 
This CTPS1 mutation is predicted to affect a splice donor site at the junction of intron 17–18 and exon 18, leading to the expression of an abnormal transcript lacking exon 18. Functional studies demonstrated that CTPS1 deficiency impaired the capacity of activated T and B cells to proliferate in response to antigen receptor-mediated activation. As a result of WES, a new type of PID was identified.
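The strategy of combining WES with homozygosity mapping, as in the NIK deficiency study described above, can be illustrated with a minimal sketch. It is not the pipeline used in the cited study: the `Region` and `Call` types, the gene labels, and all coordinates are hypothetical and chosen only to show the filtering logic, which retains homozygous exome calls that fall inside regions of homozygosity shared by the affected family members.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A region of homozygosity from SNP array mapping (hypothetical type)."""
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive


@dataclass(frozen=True)
class Call:
    """One exome variant call (hypothetical type)."""
    chrom: str
    pos: int
    gene: str
    genotype: str  # "hom" or "het"


def in_any_region(call: Call, regions: list[Region]) -> bool:
    """True if the call lies inside at least one mapped region."""
    return any(
        r.chrom == call.chrom and r.start <= call.pos <= r.end for r in regions
    )


def homozygous_candidates(calls: list[Call], regions: list[Region]) -> list[Call]:
    """Keep homozygous calls located inside regions of homozygosity."""
    return [c for c in calls if c.genotype == "hom" and in_any_region(c, regions)]


# Hypothetical coordinates and gene labels, for illustration only.
roh = [Region("17", 40_000_000, 46_000_000)]
calls = [
    Call("17", 43_000_000, "GENE_A", "hom"),  # homozygous, inside region: kept
    Call("17", 43_500_000, "GENE_B", "het"),  # heterozygous: dropped
    Call("2", 1_000_000, "GENE_C", "hom"),    # outside any region: dropped
]
candidates = homozygous_candidates(calls, roh)
print([c.gene for c in candidates])
```

In a consanguineous pedigree this intersection typically shrinks the candidate list considerably before functional follow-up, which is why the two methods are often used together.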

Despite the non-specific and overlapping clinical and laboratory features of PIDs, new gene defects continue to be identified by WES, leading to a rapid classification of new types of PIDs. One good example is the identification of the dedicator of cytokinesis 2 gene (DOCK2) [85]. The DOCK2 gene encodes a hematopoietic cell-specific, Caenorhabditis elegans Ced-5, mammalian DOCK180 and Drosophila melanogaster myoblast city (CDM) family protein that is indispensable for lymphocyte chemotaxis. DOCK2 is specifically expressed in hematopoietic cells, predominantly in peripheral blood leukocytes, and may be involved in the remodeling of the actin cytoskeleton required for lymphocyte migration, through the activation of RAC [86]. Dobbs et al. [85] performed WES and immunologic studies on 5 unrelated children who presented with a distinctive type of combined immunodeficiency characterized by early-onset, invasive bacterial and viral infections; T-cell lymphopenia; impaired T-cell, B-cell, and NK-cell function; and defective interferon immunity in both hematopoietic and non-hematopoietic cells. They detected biallelic mutations in DOCK2 in all 5 patients, and all mutations were predicted to be deleterious. Functional studies of DOCK2 deficiency in humans revealed impaired RAC1 activation and defects in actin polymerization, T-cell proliferation, chemokine-induced lymphocyte migration, and NK-cell degranulation. Thus, they demonstrated DOCK2 deficiency as a new Mendelian disorder with pleiotropic defects of hematopoietic and non-hematopoietic immunity. Furthermore, normalization of immunologic abnormalities and resolution of infections were obtained in 3 patients after HSCT. This rescue of the clinical phenotype was possibly due to the generation of a source of cells producing interferon-α/β (e.g., plasmacytoid dendritic cells), which complemented the defect in non-hematopoietic tissues. 
By contrast, the 2 patients who did not receive HSCT died early in childhood. In conclusion, the definite molecular diagnosis helped physicians develop a personalized treatment plan and resulted in better outcomes for these patients. Although we have described only a few examples of using WES for clinical diagnosis and the discovery of new genetic defects in this chapter, these impressive outcomes promise a broad application of WES in PIDs.

6 Selective Whole Exome Sequencing

Although many NGS panels are available, none of them includes all known disease-causing genes for a given disease or group of diseases. With the advent of NGS technology, novel genes are constantly being identified; however, these genes cannot be added immediately to the corresponding panels. There are several major reasons: (1) the dynamic nature of the genetics field, (2) the complexity of NGS panel design, and (3) the time-consuming clinical validation process. Thus, selective exome sequencing (SES) is an excellent alternative that can efficiently capture all currently known and future disease-causing genes. The NGS technology and sequencing process for SES are similar to those for WES. The strategy of the SES approach is to sequence the whole exome but analyze only a group of genes of interest. SES is best suited for patients with clearly defined, genetically heterogeneous conditions for which a comprehensive gene panel is not available, or for patients with a single-gene disorder for which clinical testing is not currently available. Moreover, SES permits the analysis of genes related to the patient’s phenotype and thus offers the flexibility of incorporating new clinical genes at any time. In comparison with regular WES, SES offers a shorter turnaround time because the analysis is restricted to a specific number of genes, a “focus panel”. Unlike WES trio analysis (a common WES strategy), SES testing is performed only on the proband and does not use samples from family members, which reduces the cost of the test. Another difference from WES is that SES reports do not include incidental findings: because SES is a targeted panel, non-panel genes are not analyzed, including those recommended by the American College of Medical Genetics (ACMG) guidelines. 
SES is not a replacement for established panel tests because we expect its sensitivity to be somewhat reduced by regions with low or no coverage. However, the coverage is still acceptable. For example, the comprehensive SES PID panel composed of 336 genes had an average coverage of 112X, with 98, 98, 97, 96, 94 and 90.2% coverage at the 3X, 5X, 10X, 20X, 30X and 40X levels, respectively (unpublished data). Similar coverage numbers were observed with other, smaller sub-PID panels. For the aforementioned reasons, the SES approach is predicted to be one of the approaches that will transform molecular diagnostics.
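The two ideas in this section, restricting analysis to a "focus panel" and summarizing coverage at several depth levels, can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any laboratory's pipeline: the `Variant` type, the function names, the three-gene panel, the non-panel gene, and the toy depth values are all hypothetical. Coverage at a level such as 20X is taken here to mean the percentage of positions with read depth of at least 20.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Variant:
    """One annotated exome variant (hypothetical type)."""
    gene: str  # gene symbol from annotation
    hgvs: str  # variant description


def restrict_to_panel(variants: Iterable[Variant], panel: set[str]) -> list[Variant]:
    """Keep only variants in panel genes; non-panel genes are never analyzed."""
    return [v for v in variants if v.gene in panel]


def coverage_breadth(depths: list[int], thresholds: Iterable[int]) -> dict[int, float]:
    """Percentage of positions with read depth >= each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    return {t: 100.0 * sum(d >= t for d in depths) / total for t in thresholds}


# Toy exome calls: two in panel genes, one in a non-panel gene.
exome_calls = [
    Variant("NBN", "c.1030C>T"),
    Variant("BRCA1", "c.68_69del"),  # non-panel gene, dropped before analysis
    Variant("MALT1", "c.1060delC"),
]
pid_panel = {"NBN", "MALT1", "DOCK2"}  # hypothetical three-gene focus panel
panel_calls = restrict_to_panel(exome_calls, pid_panel)
print([v.gene for v in panel_calls])

# Toy per-position depths over the panel targets (not real data).
toy_depths = [0, 4, 12, 25, 35, 50, 60, 80, 120, 150]
breadth = coverage_breadth(toy_depths, [3, 5, 10, 20, 30, 40])
for level, pct in breadth.items():
    print(f"{level}X: {pct:.1f}%")
```

Because the panel is just a set of gene symbols applied after sequencing, adding a newly reported gene only changes the set and leaves the wet-lab assay untouched, which is the flexibility that distinguishes SES from a fixed capture panel.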

7 Whole Genome Sequencing (WGS) in Primary Immunodeficiency Diseases (PIDs)

WGS is designed to target the whole genome, which includes both the protein-coding regions and the non-coding regions. Theoretically, WGS allows characterization of all variants across the genome, including the large deletions and duplications that WES fails to detect [10]. Consequently, WGS yields a large number of variants. The clinical adoption of WGS has been challenging because of its high cost and because interpreting the variants, particularly those in non-coding regions, is difficult and time-consuming [10].

Notably, WGS has proven to be an effective tool for the molecular diagnosis of several genetic disorders in which previous NGS testing had detected no candidate gene variants [20]. The study was performed by Taylor et al. on 500 patients with diverse genetic disorders in whom other NGS tests had identified no disease-causing variants or candidate genes. On average, 82.7% of the genome, including 88.2% of the exome, was covered by at least 20X. To evaluate the clinical efficacy of WGS, only 156 patients or families with Mendelian and immunological disorders were summarized. Overall, they identified disease-causing variants in 21% of cases, with the proportion increasing to 34% (23/68) for Mendelian disorders and 57% (8/14) in family trios. In addition, they detected 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder. Interestingly, there were two candidate pathogenic variants outside the coding fraction of the genome. One was in the 5′ UTR of the EPO gene in two independent families with erythrocytosis and co-segregated with the disease; the other was a complex deletion of 1.4 kb of the X chromosome with insertion of 50 kb from chromosome 2p in a patient with X-linked hypoparathyroidism. This variant lay 81.5 kb downstream of SOX3 and segregated with the phenotype. The discovery of these two pathogenic candidates demonstrated the value of WGS for screening the noncoding genome. This study implied that WGS for clinical diagnosis is becoming realistic in spite of the many challenges ahead.
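As a quick arithmetic check, the subgroup yields quoted above can be recomputed from the reported counts. The helper name `percent` is our own; the counts (23/68 and 8/14) are the ones given in the text.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, 1)


mendelian_yield = percent(23, 68)  # Mendelian disorders
trio_yield = percent(8, 14)        # family trios
print(mendelian_yield, trio_yield)
```

Both values round to the whole-number percentages reported in the study (34% and 57%). Because the trio figure rests on only 14 families, it should be read as a rough estimate.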

WGS has shown strong potential for identifying mutations in noncoding regions of the genome in PIDs. Mousallem et al. [14] performed WGS on a girl with a clinical diagnosis of SCID who presented in infancy with a history of failure to thrive and recurrent infections. Flow cytometry did not show any T or B cells, but revealed an elevated percentage of NK cells. T-cell proliferation studies showed no responses to mitogens. In this study, WGS detected and defined the exact breakpoints of a homozygous 82-kb deletion spanning exons 1 to 4 in DCLRE1C. WGS analysis of the other two SCID patients revealed the same deletion in DCLRE1C in addition to a splice mutation (c.362 + 2 T > A). The combined presence of this frameshift deletion and the c.362 + 2 T > A (IVS5 + 2 T > A) splice site mutation was predicted to be the cause of SCID in these patients. This study demonstrated the promising potential of WGS to reveal the size and precise breakpoints of large complex mutations, which cannot be achieved by WES and other NGS testing.

8 Summary

Even though the application of NGS technology to clinical molecular diagnosis is still at an early stage, impressive results have been obtained in PIDs, leading to accurate, rapid diagnosis. NGS panels can detect variants in most known disease genes at once and have thus made comprehensive PID diagnosis easier and faster. Selective (focused) exome sequencing is a middle-of-the-road test: compared with NGS panels, it can accommodate larger, more comprehensive gene lists and incorporate newly discovered disease-associated genes, and it promises a faster turnaround time than the full WES test. WES, originally a powerful research tool for dissecting genetics, is now much more affordable and has been widely adopted in PID diagnostics. In addition to its high diagnostic yield, WES has identified a large number of novel disease-causing genes. WES is greatly accelerating our understanding of genetic diseases, including the PIDs. While WGS for clinical use is still in its infancy, it has raised the exciting possibility that all gene variants can be detected at once. As knowledge of the clinical significance of the millions of variants carried by each individual is rapidly acquired, WGS may one day be employed as a routine genetic testing strategy for clinical diagnosis. Despite the limitations of targeted NGS panels, WES and even WGS, NGS-based gene sequencing tests have clearly demonstrated their unique potential for the most complicated diagnoses in PIDs. However, we have to be aware that each method has its own specific limitations. It is critical for clinicians and molecular geneticists to choose the most appropriate NGS-based test in order to reach the best outcome. For atypical PID patients with a Mendelian inheritance pattern, or for patients with a negative result from targeted NGS panels, WES should be recommended. 
WGS has not yet been formally adopted by clinical laboratories, and it may reveal many variants in genes whose function remains unknown. Thus, the interpretation of these variants should be approached cautiously, especially when they lie in a noncoding or regulatory region.

Despite the emerging challenges, NGS has proven revolutionary and has significantly impacted all fields of genetics and genomics. NGS-based technologies have not only empowered clinical molecular diagnostics but also provided the best tool for dissecting the genetic basis of unknown diseases. In the next few years, the application of NGS (targeted NGS panels and WES) will undoubtedly continue to be the leading force in clinical work and in clinical and/or basic genetic research. WGS has attracted a great deal of attention because of its potential for detecting gene defects across the whole genome. Therefore, in the near future, WGS should also be considered as an ultimate approach for identifying unknown genetic disorders in the clinic.