Introduction

Diabetes is an epidemic, with almost 90 million patients with pre-diabetes, type 1, or type 2 diabetes in the USA today. In an effort to control their blood sugar, patients with diabetes endure a litany of lifestyle adjustments, dietary restrictions, insulin injections, and medications (and their respective side effects). While this does improve health outcomes and decrease co-morbidities, it severely impacts their quality of life. Because of the proportions of this public health problem, there is increased demand for a cellular replacement therapy as an alternative to daily insulin injections and a potential cure for diabetes.

A decline in the number of functional insulin-secreting beta cells is a part of the underlying pathogenesis of both type 1 and type 2 diabetes [1,2,3]. Over the years, there have been three primary approaches to generating new beta-cell mass: inducing self-renewal of existing beta cells, reprogramming other cell types to become beta cells, and directed differentiation of pluripotent stem cells to a beta-cell fate. While all three approaches have particular merits, strengths, and weaknesses, the directed differentiation of pluripotent stem cells into beta cells (abbreviated as SC-β cells) has garnered the most attention as of late [4,5,6,7,8,9,10]. Protocols to produce SC-β cells have been shown to work in both human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs), which are pluripotent cells reprogrammed from somatic cells [11, 12, 13••, 14, 15]. These advances enable researchers to bypass the ethical and legal considerations endemic to using human ESCs and catalyze the translational potential of SC-β cells in drug discovery and disease modeling. Specifically, the use of iPSCs to generate SC-β cells has garnered the most enthusiasm because of the possibility of using these cells for autologous transplantation therapies in the future.

The protocols for producing SC-β cells are informed by our understanding of developmental biology [16, 17]. Cocktails of growth factors and small molecules are applied in a defined temporal sequence to activate the transcription factor cascades that specify endoderm, foregut, pancreatic progenitors, pancreatic endocrine progenitors, and finally differentiated pancreatic islet cells. These differentiation protocols recapitulate endogenous beta-cell differentiation, as understood from mouse and human studies [6]. In the past, the ability to genetically engineer mice conferred an advantage when using mouse embryos as a model system to understand beta-cell differentiation. While these models have proven invaluable to our understanding of beta-cell development, there are inherent differences between human and mouse development and physiology that have restrained our ability to generate robust, functional beta cells in vitro. Over the last decade, human pluripotent stem cell differentiation protocols have evolved from producing low numbers of poly-hormonal cells to producing large numbers of functional insulin-producing cells in vitro [18,19,20, 15, 13••, 21, 14, 22]. Coupled with the improvements in gene editing, these protocols position the field to expand our understanding of human beta-cell development and regeneration, and to work towards human beta-cell replacement therapy. The remainder of this review will focus on how gene editing technology, specifically CRISPR/Cas-based editing, can be used to deepen our mechanistic understanding of development and disease, with the goal of developing new therapies for patients with diabetes.

Methods of Gene Editing

Targeted modification of DNA sequences has been a long-standing approach to understanding development and disease. From the use of phage for site-directed mutagenesis to the co-opting of endogenous DNA repair mechanisms to facilitate homologous recombination, decades of research have focused on how insertion, deletion, or replacement of DNA can facilitate our understanding of gene function. The bottleneck in using gene editing has traditionally been specificity and efficiency. Within the last two decades, three technologies have emerged that employ enzymatic “scissors” to cut DNA at specific gene loci: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeats-associated Cas protein system (CRISPR/Cas) [23]. All three of these technologies induce a double-strand break (DSB) in the DNA and rely on the endogenous DNA repair mechanisms, either homologous recombination (HR) or non-homologous end-joining (NHEJ), to repair the DNA breaks, as illustrated in Fig. 1 [24].

Fig. 1

ZFN, TALEN, and CRISPR/Cas9 gene editing techniques result in double-stranded DNA breaks. Endogenous DNA repair pathways activate non-homologous end-joining (NHEJ) or homology-directed repair (HDR, using donor DNA) to repair the break in DNA and potentially add, delete, or edit DNA, depending on the design of the experiment. Unlike ZFNs and TALENs, which require a custom-designed DNA-binding domain for each target and must dimerize to cut, Cas9 is directed by a guide RNA, has two endonuclease domains, and is able to generate a double-strand break without the need for dimerization. The CRISPR/Cas9 system is currently favored because of its modularity, flexibility, specificity, decreased toxicity, and ease of designing target single-guide RNA (sgRNA). Gene editing can facilitate a patient-specific approach to studying beta-cell development, understanding the molecular mechanisms that cause diabetes, and generating new beta cells as a cellular therapy for patients with diabetes.

ZFNs consist of a custom-engineered N-terminal DNA binding domain and a C-terminal endonuclease (usually a non-specific cleavage domain from the bacterial FokI endonuclease) that dimerize to cleave DNA [25, 26]. The DNA binding domain is an array of multiple zinc-finger domains, each of which recognizes a specific three base pair sequence of DNA [27]. Four to six zinc-finger domains are fused together in an array that recognizes a specific 12–18 base pair sequence [28]. While the ZFN approach is highly specific, it is limited because some three-nucleotide sequences are not recognized by any zinc-finger domain, and because it requires expertise in protein engineering to design arrays that flank target sites [28,29,30,31,32]. Once the DNA has been cleaved by the ZFN dimer, endogenous DNA repair machinery is activated to repair the DSB by homologous recombination (if a donor DNA template is present) or NHEJ [33, 34]. NHEJ may be harnessed to disable an autosomal dominant gene, but can also result in oncogenic translocations or gene silencing [34]. Currently, ZFN technology is being used in a clinical trial to mutate the CCR5 gene in CD4+ T cells used for autologous cell transplantation to treat HIV infection [35, 36].
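The relationship between the number of zinc fingers and the length of the recognized site can be made concrete with a minimal sketch. It assumes only what is stated above (one finger reads one three base pair unit); the function names and the example sequence are hypothetical, and the sketch does not model the fact that some triplets have no usable finger.

```python
BP_PER_FINGER = 3  # each zinc-finger domain reads one 3-bp unit (from the text)


def zfn_site_length(n_fingers: int) -> int:
    """Length in bp of the site read by an array of n_fingers zinc fingers."""
    if n_fingers < 1:
        raise ValueError("an array needs at least one finger")
    return n_fingers * BP_PER_FINGER


def split_into_triplets(site: str) -> list[str]:
    """Split a target site into consecutive 3-bp units, one per finger."""
    site = site.upper()
    if len(site) % BP_PER_FINGER != 0:
        raise ValueError("site length must be a multiple of 3 bp")
    return [site[i:i + BP_PER_FINGER] for i in range(0, len(site), BP_PER_FINGER)]


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, "fingers ->", zfn_site_length(n), "bp")
    # Hypothetical 12-bp site, shown only to illustrate the triplet layout.
    print(split_into_triplets("GCGTGGGCGGAT"))
```

Four fingers give a 12 base pair site and six fingers give an 18 base pair site, which is the 12–18 base pair range quoted above.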

TALENs, like ZFNs, contain an N-terminal DNA binding domain and a C-terminal FokI endonuclease domain. The transcription activator-like effector (TALE) DNA binding domain is a highly conserved sequence of 33–35 amino acids, with variation at the 12th and 13th positions that confers sequence specificity [37, 38]. Multiple TALE domains can be arrayed to flank the specific area of interest, and to dimerize and cleave DNA [39, 40]. Because the endogenous DNA repair machinery is employed to close the DNA strand breaks, TALEN technology is prone to the same concerns about off-target effects and translocations that are present with ZFNs [41, 42]. TALENs are easier to engineer and have an improved cytotoxicity profile compared to ZFNs, but their usefulness is restricted in instances when a thymidine is not part of the target sequence or when targeting highly methylated gene regulatory sequences [43,44,45,46].

Clustered Regularly Interspaced Short Palindromic Repeats-Associated Cas Protein System

CRISPR/Cas is adapted from the adaptive immune defense of prokaryotes against bacteriophages, other viruses, and invading plasmids [47, 48]. The prokaryotic genome has A–T-rich leader sequences directly next to 27–42 base pair palindromic repeats. The repeats are separated by “interspaced” DNA segments, known as spacers, which are copied from previously encountered bacteriophage DNA (the protospacers) and serve as a signature of past infection [49,50,51,52, 53••, 54]. The leader sequence, repeats, and spacers are transcribed into RNA that, after processing, is referred to as CRISPR RNA (crRNA). These crRNAs are bound by Cas proteins and direct them to foreign DNA through the spacer region of the crRNA, which is complementary to the protospacer. The mature Cas complexes are capable of cleaving nucleic acids complementary to the crRNA. In the case of Cas9, a type II Cas protein, a second RNA, the trans-activating crRNA (tracrRNA), pairs with the repeat sequence in the crRNA and activates the Cas9 endonuclease activity [55]. It has been shown that the crRNA and tracrRNA can be custom-engineered into a single-guide RNA (sgRNA) that simultaneously targets and activates Cas9 endonuclease activity at chosen sites throughout the mammalian genome [56]. The ability to express the Cas9 protein in concert with any number of custom-designed sgRNAs makes the CRISPR/Cas9 approach a versatile modular system able to economically and efficiently target genomic sites for editing.
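To illustrate why sgRNA design is considered straightforward, the following minimal sketch lists candidate Cas9 target sites in a DNA sequence. It rests on assumptions that are not stated in this review: a 20-nucleotide targeting sequence immediately followed by an NGG protospacer-adjacent motif (PAM), which is the commonly described requirement for Streptococcus pyogenes Cas9. The function names are hypothetical, and the sketch is not a substitute for a dedicated design tool.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_sites(seq: str, guide_len: int = 20):
    """List candidate sites as (strand, start, target, pam) tuples.

    Assumes an NGG PAM directly 3' of a guide_len-nt target (an assumption
    about S. pyogenes Cas9, not taken from the review). For the '-' strand,
    'start' is an index into the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


if __name__ == "__main__":
    # Hypothetical input sequence, used only to exercise the function.
    example = "TTGACCTGAAGCTGGTAGCCATGCAAGGCTTACGGATCCGA"
    for site in find_cas9_sites(example):
        print(site)
```

Because only the short targeting sequence changes from one site to the next, retargeting Cas9 amounts to choosing a different sequence from such a list, in contrast to the protein re-engineering needed for ZFNs and TALENs.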

In comparison to ZFN and TALEN gene editing, CRISPR/Cas9 has lower cytotoxicity and higher targeting efficiency, especially when multiple sgRNAs target a single gene [57,58,59]. Although highly specific, CRISPR/Cas9 can still bind off-target sites, possibly because of mismatch tolerance between genomic DNA and guide RNAs, and this can lead to off-target insertions, deletions, or translocations [60]. GUIDE-seq, Digenome-seq, and CIRCLE-seq are unbiased genome-wide DNA sequencing approaches that have been effective in quantifying the off-target effects of CRISPR/Cas9 activity [61,62,63]. Bioinformatic design of sgRNAs based on Cas9 binding specificity can decrease off-target binding [64,65,66]. Engineered variants of Cas9 have also improved targeting precision. Gene targeting by Cas9-HF1 or eSpCas9, variants with neutralizing mutations that reduce non-specific binding to DNA, did not result in any off-target events detectable by GUIDE-seq [67, 68]. Cas9n, a nickase variant in which one endonuclease domain is inactivated so that it makes only a single-strand break, is paired with sgRNAs targeted to opposite strands and has increased targeting fidelity [69, 70]. Several systems have also been designed to make Cas9 inducible. The iCRISPR system has doxycycline-inducible expression of Cas9 in human embryonic stem cells, while a light-inducible form of Cas9 was engineered by the inclusion of a caged lysine amino acid [71, 72].
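The idea of mismatch tolerance can be sketched in a few lines. The example below counts mismatches between a guide sequence and equal-length candidate genomic sites and flags near-matches as potential off-targets. It is a deliberately simplified illustration: the function names and the cutoff of three mismatches are hypothetical, and published design tools use more elaborate, position-weighted scoring than a plain mismatch count.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which guide and site differ (Hamming distance)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def flag_off_targets(guide: str, candidates: dict, max_mismatches: int = 3):
    """Return (name, mismatches) for imperfect sites within the cutoff.

    Perfect matches (0 mismatches) are treated as the intended target and
    are not reported. The cutoff is an illustrative assumption.
    """
    hits = []
    for name, site in candidates.items():
        mm = count_mismatches(guide, site)
        if 0 < mm <= max_mismatches:
            hits.append((name, mm))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical guide and candidate sites, for illustration only.
    guide = "ACGTACGTACGTACGTACGT"
    candidates = {
        "on_target": guide,
        "site_a": guide[:18] + "AA",
        "site_b": "TTTTT" + guide[5:],
    }
    print(flag_off_targets(guide, candidates))
```

Sites that differ from the guide at only a few positions are the ones most worth checking experimentally, for example with the genome-wide sequencing approaches mentioned above.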

As the targeting specificity of the CRISPR/Cas9 system has improved, considerable efforts have also been made to increase gene knock-in efficiency. Homology-directed repair (HDR), which is the use of exogenous targeting vectors as a template to introduce precise genetic modifications at DNA DSBs, is the alternative cellular pathway that can be co-opted to insert genes of interest at CRISPR-targeted DSBs. The efficiency of HDR is between 0.5% and 20% (and between 2% and 4% in hPSCs), and HDR is only active in cells in S-phase and late G2 [56]. To improve knock-in efficiency, groups have chemically or genetically inhibited NHEJ to force HDR, while others have synchronized cells, pushed cells into S-phase, and nucleofected pre-assembled Cas9 complexes into cells [73,74,75,76]. These techniques variably improved HDR efficiency between 4- and 50-fold, with up to 38% of hPSCs harboring the HDR knock-in in at least one allele. Varying the length of the homology arms (HAs) on the donor vector has also been effective in improving knock-in efficiency [77,78,79,80]. Using single-stranded oligodeoxynucleotides (ssODNs) with 90-nucleotide HAs or double-cut donor plasmids (with sgRNA recognition sequences flanking the insertion cassette) with 600-nucleotide HAs increased insertion efficiency to as high as 30% in hPSCs [76].

It is possible that the epigenetic state of pluripotent stem cells restricts the efficiency of HDR. Recent work reported that naïve hPSCs, directly reprogrammed from the fibroblasts of a patient with β-thalassemia, were corrected via CRISPR/Cas9-mediated HDR; naïve hPSCs are stem cells whose transcriptome reflects ground-state (naïve) pluripotency. The CRISPR/Cas9 insertion efficiency was higher in the naïve hPSCs than in iPSCs, but still relatively low (4.7%) [81]. It has been reported that highly active sgRNAs and Cas9 are located in euchromatic areas of the nucleus, and that nucleosomes directly prevent Cas9 binding and cleavage [82, 83].

CRISPR/Cas9 technology can be used for purposes other than genome editing. If both endonuclease domains are inactivated, the catalytically inactive Cas9 (dCas9) can be tethered to other proteins to modulate gene activity [84]. The dCas9 fusion system has been used to turn on gene expression by dCas9-VP16 fusion or to repress gene expression by dCas9-KRAB fusion to block RNA polymerase [85]. It was recently reported that this dCas9 system was used to activate INS expression in fibroblasts from patients with type 1 diabetes [86]. A light-activated Cas9 effector (LACE) fused Cas9 to the Cry2 and Cib proteins to induce transcription of endogenous target genes in the presence of blue light [87]. dCas9 mutants have also been fused to TET1 or DNMT3a to direct demethylation or methylation, respectively, of DNA at precise locations [88]. Taken together, the modularity of the CRISPR/Cas9 system makes it a flexible, multifunctional platform from which gene editing, activation, and repression can be modulated to study disease, development, and drug discovery.

While this review is focused on the CRISPR/Cas9 system, it is important to note that other class II Cas endonucleases are emerging as potential alternatives to Cas9-mediated gene editing. One such variant is the endonuclease Cpf1, which is a single RNA-guided endonuclease that differs from Cas9 in that it does not require tracrRNA and it has a T-rich PAM sequence proximal to the protospacer region [89, 90••]. While Cas9 cleavage results in a blunt-end DSB, Cpf1 cleaves DNA via a staggered DSB, thus creating a 5’ overhang. This results in sticky ends that can be exploited to control the orientation of an inserted gene of interest. Genome-wide analyses suggest that Cpf1 has a similar, if not lower, indel mutation rate and that off-target mutagenesis is very low [91, 92]. CRISPR-Cpf1 has been used to make knockout mice [93, 94]. It was recently reported that the CRISPR-Cpf1 nuclease can efficiently correct DMD mutations in patient-derived iPSCs and mdx-null mice, allowing for restoration of dystrophin expression in cells generated from patients and mice with Duchenne muscular dystrophy [95]. Taken together, improvements in the specificity, fidelity, and targeting of CRISPR-based gene editing, whether with Cas9 or emerging variants like Cpf1, make CRISPR-based gene editing a versatile and efficient tool to study development and disease.

Using Gene Editing to Understand Development and Disease

Using gene editing to manipulate the genome of human pluripotent stem cells has extended the possibilities of how we can study beta-cell development, of how we can investigate the mechanisms that underlie diabetes, and of how we can pursue new cellular therapies for curing diabetes (see Fig. 1).

Gene Editing to Understand Beta-Cell Development

Studies in human pluripotent stem cells have recently revealed differences between mouse and human endocrine cell development. A key example is the requirement for NGN3. In mouse models, Ngn3 is required for the differentiation of all endocrine cell types of the pancreas and the intestine [96,97,98]. Yet, multiple case studies have described patients with mutations in NGN3 who have no enteroendocrine cells, but are born making c-peptide and are non-diabetic, although these patients develop diabetes later in childhood [99,100,101]. This suggests that the requirement for NGN3 expression in human beta-cell differentiation is less stringent than in mice. To address this discrepancy, James Wells’ group generated NGN3-null human embryonic stem cells using CRISPR/Cas9 gene editing [102]. They describe hESCs that were able to efficiently form endoderm and pancreatic progenitor cells, but were unable to form endocrine cells in vitro. In contrast, using a line of hESCs with siRNA knockdown of NGN3 expression, they concluded that as little as 10% of NGN3 activity was sufficient to induce SC-β cell differentiation. This suggested that the patients in the clinical case reports had mutations that decreased, but did not eliminate, NGN3 activity, leaving enough activity to induce pancreatic endocrine cell differentiation [102]. More recently, Danwei Huangfu’s group used both TALEN and CRISPR/Cas9 gene editing to delete eight transcription factors (PDX1, RFX6, ARX, NGN3, PTF1A, HES1, GLIS3, and MNX1) from human embryonic stem cells [103••]. In contrast to the Wells study, the Huangfu NGN3-/- hESCs were able to generate a small percentage of c-peptide+ SC-β cells [103••]. This suggests there are alternative pathways to human beta-cell development that do not require NGN3, and that NGN3 is not just necessary for endocrine differentiation, but also for beta-cell maturation.
This is a potentially divergent role of NGN3 in humans and mice that could only be elucidated with the use of gene editing. More recently, the same group used human pluripotent stem cells to investigate how haploinsufficiency of GATA6 impairs human pancreatic progenitor formation and identified a dose-dependent requirement for GATA4 in pancreatic progenitor cell formation [104]. Taken together, these studies support the use of human pluripotent stem cells as a model for investigating the genetic basis of human pancreatic, endocrine, and beta-cell differentiation. It will be interesting to see if iPSCs from patients with these genetic mutations exhibit the same phenotypes as the CRISPR/Cas9 models and if they can be corrected and used as isogenic SC-β cells capable of restoring glucose homeostasis.

In addition to providing new insights into human development, CRISPR/Cas9-edited human iPSCs are being used to decipher the mechanisms by which genes identified by genome-wide association studies (GWASs) underlie beta-cell dysfunction. Over the past decade, GWASs have identified genetic variations in more than 90 loci, including SNPs, copy number variants, insertions, and deletions in intergenic and intragenic regions, that are associated with type 2 diabetes [105,106,107,108,109,110,111]. Yet, studies designed to understand how those variations dysregulate beta-cell function have been lacking. A recent study took on this problem by using CRISPR/Cas9 to delete three genes in which variants have been associated with beta-cell function (CDKAL1, KCNJ11, and KCNQ1) in human ESCs [112]. While deletion of these genes did not affect beta-cell differentiation, SC-β cells from each knockout hESC line displayed impaired insulin secretion, and in the case of CDKAL1, hypersensitivity to glucolipotoxicity [112]. This is an important first step to understanding how the genetic variations identified by GWAS may translate into beta-cell dysfunction. In the future, it will be enlightening to either induce or correct GWAS-identified variations in iPSCs and evaluate changes in beta-cell function.

Gene Editing to Escape the Immune System

Like allogeneic donor islet transplantations, current allogeneic SC-β replacement therapies would require life-long patient immunosuppression if the cells were not protected in a macroencapsulation vessel [116, 117]. A significant amount of capital has been invested in testing the efficacy of SC-β cells in encapsulation devices and their ability to improve glucose homeostasis in patients with type 1 diabetes [113,114,115]. There are multiple companies working on this problem, and some have moved into clinical trials (https://clinicaltrials.gov/show/NCT02239354). While encapsulation devices represent a potentially imminent solution to the autoimmunity problem in patients with type 1 diabetes, a long-term solution could be to use CRISPR/Cas9 gene editing to engineer a beta cell that could escape autoimmune attack. A clinical case report describing a patient with long-standing type 1 diabetes afflicted with an insulinoma may provide a hint that this is possible [118]. Histologic evaluation of the resected pancreatic tissue showed small, glucagon-dominant islets in the healthy tissue, while the insulinoma had robust insulin staining. The patient was afflicted with hypoglycemic episodes and produced measurable c-peptide until the insulinoma was removed, suggesting that the insulin cells in the tumor were functional [118]. This patient’s case suggests that the insulinoma stem cells had made immunogenic modifications that enabled them to evade autoimmune attack in a patient with type 1 diabetes. Close examination of this case and other similar cases may provide revolutionary insight into how to engineer patient-specific iPSCs that can differentiate into SC-β cells and evade an autoimmune attack when transplanted into patients with type 1 diabetes. Several groups are working towards making universally compatible hPSCs by using gene editing technologies to modify genes critical to immunogenicity.
For example, hESCs deficient in β-2-microglobulin maintained pluripotency, were devoid of HLA class I proteins, and were hypoimmunogenic when transplanted into mice [119]. iPSCs null for β-2-microglobulin have been differentiated into universally compatible blood platelets, and studies have suggested this could serve as a potential supply of renewable platelets for patients [120]. Other approaches to generating universally compatible stem cells have focused on deleting CIITA or ectopically expressing immunosuppressive molecules [121,122,123]. Taken together, these studies suggest that it may be possible to generate non-immunogenic patient-specific iPSCs and differentiate them into SC-β cells for an autologous stem cell transplant for patients with type 1 diabetes.

Conclusions

Gene editing has transformed the way we can use human induced pluripotent stem cells to study development and diabetes. The flexibility and efficiency of CRISPR/Cas9 technology have accelerated the pace at which new discoveries can be made. Yet the complexities of human diversity confound our ability to convincingly and consistently model diabetes in humans [124, 125]. Many issues in this field remain to be addressed, which suggests that the goal of developing a personalized cellular therapy to treat patients with diabetes is in our future, not in our present. Nonetheless, we are progressing towards that goal faster and more efficiently with the power of gene-editing technologies.