Introduction

The ability to genetically engineer mammalian cells and organisms has broadened our understanding of biology and disease mechanisms. For more than two decades, however, this goal was difficult to achieve because of the extremely low gene-targeting efficiency of traditional DNA homologous recombination (HR). Recently, a number of genome-editing tools have emerged, such as zinc-finger nucleases (ZFNs) [1, 2], transcription activator-like effector nucleases (TALENs) [3, 4], and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) [5, 6]. The first two gene-targeting technologies are engineered endonucleases that contain a sequence-specific DNA-binding domain and a nonspecific DNA cleavage domain. The most recent gene-targeting tool, CRISPR/Cas9, originates from the bacterial adaptive immune system; it was developed by fusing the CRISPR RNA (crRNA) and the trans-activating crRNA into a single guide RNA (gRNA), which directs the Cas9 protein to cleave a specific DNA sequence [7, 8]. As with ZFNs and TALENs, cleavage by Cas9 creates double-strand breaks (DSBs) at the targeted site. DSBs stimulate cellular DNA repair mechanisms, including error-prone nonhomologous end joining (NHEJ) and, if a repair template is present, homology-directed repair (HDR) [9]. CRISPR/Cas9 has been successfully applied for efficient gene targeting in several cell types and organisms, including induced pluripotent stem cells [8], mice [10], rats [5, 6], zebrafish [11–13], rabbits [8, 14, 15], monkeys [16], C. elegans [17], and plants [18]. However, because CRISPR/Cas9 does not cleave all loci with similar efficiency, cleavage activity must be evaluated before the system is delivered to target cells or embryos.

When the CRISPR/Cas9 system is applied to target genes in cells, especially to target multiple genes in a single step, testing the specificity and cleavage activity of the designed gRNAs is the first priority. The currently dominant approach for testing cleavage activity is the T7 endonuclease 1 (T7E1) assay. T7E1 is a mismatch-specific DNA endonuclease that recognizes and cleaves imperfectly matched DNA [19, 20]. However, the digested heteroduplex DNA containing mutant and wild-type alleles is difficult to visualize on an agarose gel, and the production of the template DNA by polymerase chain reaction (PCR) can also affect the sensitivity of the T7E1 assay. To facilitate the detection of the cleavage activity of a customized Cas9/gRNA, fluorescent reporters have been established based on Cas9/gRNA-mediated HDR restoration of a mutated green fluorescent protein (GFP) gene. In contrast to HDR, single-strand annealing (SSA) is initiated between two homologous sequences in the same orientation that flank the targeting site when a DSB occurs in cells [21–24]. The two repeated homologous sequences anneal to each other, thereby restoring the DNA as a continuous duplex. We accordingly developed a GFP-based SSA reporter assay to facilitate testing. Once a break is introduced by Cas9/gRNA cleavage, the two repeated mutant GFP sequences anneal and recombine into an intact GFP sequence. Thus, the cleavage efficiency can be conveniently measured by fluorescence-activated cell sorting (FACS).

In this study, we compared the GFP-based SSA reporter assay and the traditional HDR assay for testing the cleavage activity of the Cas9/gRNA system in HEK293 cells. The targeting efficiency was measured by flow cytometry. The SSA recombination assay showed a higher sensitivity than the HDR assay. Given this advantage, we then attempted to disrupt the immunoglobulin M (IgM) gene and the nephrosis 1 (NPHS1) gene in porcine somatic cells. T7E1 assay and Sanger sequencing were also performed to confirm the actual cleavage activity at the endogenous genes in cells. We report the feasibility of the SSA-GFP reporter system for estimating the targeting efficiency of the Cas9/gRNA system in mammalian cells, which provides a convenient and rapid basis for subsequent targeting applications.

Methods

Vector Construction

The vectors used in this work were constructed through standard cloning methods. The human codon-optimized spCas9 gene flanked by two nuclear localization signals (NLSs) was synthesized as previously reported [8]. The gRNA scaffold harboring BbsI enzyme sites was synthesized and cloned into the pMD18T vector (Takara) under the control of a human U6 promoter. The gRNA-targeting sites were designed according to the GN20GG rule. Two complementary oligos containing the gRNA target site and cohesive ends were synthesized and annealed into double-stranded DNA, which was ligated into the BbsI sites of the U6-gRNA cloning vector. The oligos used for constructing IgM-gRNA and NPHS1-gRNA were as follows:

  • IgM-gF: CACCGATTACTATGCTATGGATCTC;

  • IgM-gR: AAACGAGATCCATAGCATAGTAATC;

  • NPHS1-gF: CACCGTGGGAAACTGGGGATCCT; and

  • NPHS1-gR: AAACAGGATCCCCAGTTTCCCAC.

The oligos used for constructing IgM-gRNA with mismatches were as follows:

  • IgM-gRNA OT1 F: CACCGATTACTATGCTATGGATCTC;

  • IgM-gRNA OT1 R: AAACHAGATCCATAGCATAGTAATC;

  • IgM-gRNA OT3 F: CACCGATTACTATGCTATGGATDTC;

  • IgM-gRNA OT3 R: AAACGAHATCCATAGCATAGTAATC;

  • IgM-gRNA OT5 F: CACCGATTACTATGCTATGGBTCTC;

  • IgM-gRNA OT5 R: AAACGAGAVCCATAGCATAGTAATC;

  • IgM-gRNA OT7 F: CACCGATTACTATGCTATHGATCTC;

  • IgM-gRNA OT7 R: AAACGAGATCDATAGCATAGTAATC;

  • IgM-gRNA OT9 F: CACCGATTACTATGCTBTGGATCTC;

  • IgM-gRNA OT9 R: AAACGAGATCCAVAGCATAGTAATC;

  • IgM-gRNA OT11 F: CACCGATTACTATGDTATGGATCTC;

  • IgM-gRNA OT11 R: AAACGAGATCCATAHCATAGTAATC;

  • IgM-gRNA OT13 F: CACCGATTACTAVGCTATGGATCTC;

  • IgM-gRNA OT13 R: AAACGAGATCCATAGCBTAGTAATC;

  • IgM-gRNA OT15 F: CACCGATTACVATGCTATGGATCTC;

  • IgM-gRNA OT15 R: AAACGAGATCCATAGCATBGTAATC;

  • IgM-gRNA OT17 F: CACCGATTBCTATGCTATGGATCTC;

  • IgM-gRNA OT17 R: AAACGAGATCCATAGCATAGVAATC;

  • IgM-gRNA OT19 F: CACCGAVTACTATGCTATGGATCTC; and

  • IgM-gRNA OT19 R: AAACGAGATCCATAGCATAGTABTC.
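The relationship between each forward and reverse oligo in the lists above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' design tool: the function names are hypothetical, and the `CACC` and `AAAC` overhangs are simply read off the listed oligos (they are assumed here to be the cohesive ends for the BbsI-digested vector). The complement table includes the degenerate IUPAC codes B, D, H, and V that appear in the mismatch oligos.

```python
# IUPAC complements, including the degenerate codes used in the mismatch oligos
# (B = not A, D = not C, H = not G, V = not T).
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_oligos(target, fwd_overhang="CACC", rev_overhang="AAAC"):
    """Build a forward/reverse oligo pair for a gRNA target sequence.

    The overhang defaults are assumptions inferred from the oligos listed
    in the text, not parameters documented by the authors.
    """
    target = target.upper()
    forward = fwd_overhang + target
    reverse = rev_overhang + reverse_complement(target)
    return forward, reverse


# Target sequence taken from the IgM-gF oligo (everything after CACC).
igm_forward, igm_reverse = design_oligos("GATTACTATGCTATGGATCTC")
print(igm_forward)
print(igm_reverse)
```

Applied to the IgM and NPHS1 target sequences, this reproduces the IgM-gF/IgM-gR and NPHS1-gF/NPHS1-gR pairs given above, which is a quick way to check a hand-typed oligo list for transcription errors.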

The HDR reporter and SSA reporter vectors were constructed on the basis of the pEGFP-N1 (Clontech) plasmid. For the HDR reporter vector, a 122-bp sequence of the enhanced GFP (EGFP) gene was deleted, and a stop codon (TAA) flanked by SpeI and MluI cloning sites was added by PCR. The primer sequences were as follows: HDR-EGFP-F: TACGTTTAACACGCGTCTACAACAGCCACAACGTCT and HDR-EGFP-R: TACGTTTAACACTAGTTTAGTCGTCCTTGAAGAAGATGGT. A PCR product containing the entire EGFP gene sequence without the start codon (ATG) was amplified from pEGFP-N1 and used as the repair donor. The primer sequences were as follows: dEGFP-F: GTGAGCAAGGGCGAGGAGCT and dEGFP-R: TTACTTGTACAGCTCGTCCAT. For the SSA reporter vector, two segments of the EGFP sequence sharing an identical 200-bp homologous region, together with a stop codon (TAA) flanked by SpeI and MluI cloning sites, were introduced into the pEGFP-N1 vector by PCR. The primer sequences were as follows: SSA-EGFP-F: TACGTTTAACACGCGTCCATGCCCGAAGGCTACGT and SSA-EGFP-R: TACGTTTAACACTAGTTTAGTCCATGATATAGACGTTGTGGCT. To generate the gRNA target site-specific reporter vectors, 200-bp sequences of the porcine IgM and NPHS1 genes containing the gRNA-targeting sites were amplified by PCR and cloned into the SpeI and MluI cloning sites of the HDR reporter and SSA reporter vectors. The primer sequences were as follows: IgM-TF: AACACTAGTAGGGCATTGGCCGTCGCA; IgM-TR: AACACGCGTCCTCCCGAGGATCAGAGTCAG; NPHS1-TF: AACACTAGTTGCCTGAAAACTTGACGGTG; and NPHS1-TR: AACACGCGTCTGGTGGCGCAGGATCTCAG.

Cell Culture and Transfection

Human embryonic kidney 293 (HEK293) cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM) containing 10 % fetal bovine serum (FBS). Porcine fetal fibroblasts (PFFs) were isolated from the E35d embryos of a Chinese mini pig through Collagenase IV digestion as previously described. Dissociated cells were cultured in DMEM supplemented with 10 % FBS, 0.5 % penicillin–streptomycin, 1 % nonessential amino acids, 2 mM GlutaMAX, and 1 mM sodium pyruvate. These cells were maintained at 37 °C and 5 % CO2 in a humid environment.

For the HDR-based detection, 0.5 μg of Cas9, 0.1 μg of gRNA, 0.5 μg of HDR reporter, and 0.1 μg of repair donor plasmids were co-electroporated into 5.0 × 10^5 HEK293 cells. To detect the activity of Cas9/gRNA with the SSA reporter, 0.5 μg of Cas9, 0.1 μg of gRNA, and 0.5 μg of SSA reporter plasmids were co-electroporated into 5.0 × 10^5 HEK293 cells by using a Neon transfection system in accordance with the manufacturer’s instructions (Life Technologies). At 48 h after transfection, EGFP expression was observed under a fluorescence microscope using appropriate filters, and the ratio of EGFP-positive cells was measured by flow cytometry.

PFFs were cultured in 10-cm dishes until subconfluent. Approximately 1 × 10^7 PFFs were electroporated with 2 μg of Cas9, 0.5 μg of gRNA, and 0.5 μg of pCMV-Td tomato vectors by using the Neon transfection system (Life Technologies) at 1350 V, 30 ms, and 1 pulse in 100 μL of Buffer B. After selection with G418 (1 mg/mL) for 10 days, 100 cell clones were expanded and screened by PCR analysis. The PCR products were then sequenced to identify mutations. The primer sequences were as follows: IgM-TF: CGCTTCACTTGGGCGTCAG; IgM-TR: TCAGAACTTCCCACAGGCTC; NPHS1-TF: CACTGAGCAAGGCCAGGGAT; and NPHS1-TR: TGTCTCTGAGCGGCTGACC. The PCR products were cloned into the pMD-18T vector (Takara). At least six TA clones selected from each transformation were sequenced to obtain detailed information on the mutations.

T7E1 Assay

The PCR products (100 ng) were amplified from the genome of the targeted PFF cells and purified with a Tiangen PCR Purification Kit. For the T7E1 assay, the purified PCR products were denatured and re-annealed in NEBuffer 2 (New England Biolabs) by using a thermocycler with the following protocol: 95 °C, 5 min; 95–85 °C at −2 °C/s; 85–25 °C at −0.1 °C/s; hold at 4 °C. Hybridized PCR products were treated with 5 U of T7E1 at 37 °C for 30 min in a 10 μL reaction. The products were resolved on 8–10 % polyacrylamide gels. The gels were stained with ethidium bromide and then imaged with a gel-imaging system.

Densitometry measurement was performed using ImageJ. The NHEJ percentage was calculated as described by Guschin et al.: % gene modification = 100 × [1 − (1 − fraction cleaved)^(1/2)].
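As a worked illustration of this formula, the following Python sketch converts band intensities into a gene-modification percentage. It assumes the common convention that the fraction cleaved is the summed intensity of the cleavage bands divided by the total intensity of cleaved plus uncleaved bands; the text does not spell out this definition, and the function names and example intensities are hypothetical.

```python
import math


def fraction_cleaved_from_bands(uncut, cut_bands):
    """Fraction cleaved = cleaved band intensity / total band intensity.

    This definition is an assumption (a common convention), not a detail
    stated in the Methods.
    """
    cut = sum(cut_bands)
    return cut / (uncut + cut)


def percent_gene_modification(fraction_cleaved):
    """% gene modification = 100 * (1 - (1 - fraction cleaved)^(1/2))."""
    if not 0.0 <= fraction_cleaved <= 1.0:
        raise ValueError("fraction_cleaved must lie between 0 and 1")
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Made-up densitometry values for one lane (arbitrary units).
frac = fraction_cleaved_from_bands(uncut=64.0, cut_bands=[20.0, 16.0])
print(f"fraction cleaved = {frac:.2f}")
print(f"gene modification = {percent_gene_modification(frac):.1f} %")
```

The square root reflects that re-annealing pairs strands at random, so a duplex escapes T7E1 cleavage only when both strands are unmodified.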

Flow Cytometry Analysis

HEK293 cells were washed with phosphate-buffered saline (Life Technologies), harvested by treatment with 0.05 % Trypsin-EDTA (Life Technologies), re-suspended in PBS/0.1 % BSA, and then analyzed using an Accuri C6 flow cytometer (Accuri Cytometers, Ann Arbor, MI). At least 10,000 cells were analyzed per run.

Results

Cas9/gRNA System

For the Cas9 expression vector, the Cas9 gene sequence was human codon optimized, fused with NLSs, and then placed downstream of the human cytomegalovirus (CMV) immediate early enhancer and promoter (pCMV) (Fig. 1a). The customized gRNA was driven by the human U6 polymerase III promoter (Fig. 1b). We selected the porcine IgM and NPHS1 genes as candidate genes to test whether the CRISPR/Cas9 mutagenesis system could work. The 20-bp gRNA-targeting sequences were followed by an NGG protospacer adjacent motif (PAM), which is necessary for Cas9 cleavage (Fig. 1c). The U6-gRNA plasmid was digested with BbsI and then gel purified using a Gel Extraction Kit (Tiangen). A pair of complementary oligos for each of the IgM and NPHS1 targeting sites was annealed and ligated into the linearized U6-gRNA vector to generate a gRNA-expressing plasmid.
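The requirement that a targeting sequence be followed by an NGG PAM can be illustrated with a short scan for candidate sites. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, only the forward strand is searched, and the input is a made-up toy sequence rather than a porcine locus. The optional `require_5prime_g` flag mimics a rule that the targeting sequence starts with G.

```python
import re


def find_target_sites(sequence, protospacer_len=20, require_5prime_g=False):
    """List (start, protospacer, PAM) for sites followed by an NGG PAM.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    seq = sequence.upper()
    # A zero-width lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for match in pattern.finditer(seq):
        protospacer, pam = match.group(1), match.group(2)
        if require_5prime_g and not protospacer.startswith("G"):
            continue
        sites.append((match.start(), protospacer, pam))
    return sites


# Toy input: 2 nt of padding, a 20-nt protospacer, a TGG PAM, 2 nt of padding.
toy_sequence = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_target_sites(toy_sequence):
    print(start, protospacer, pam)
```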

Fig. 1

Schematic overview of detecting Cas9/gRNA activity with the fluorescent reporter system. a, b Diagrams showing the Cas9 and gRNA expression vectors. A human codon-optimized SpCas9 fused with nuclear localization signals (NLSs) was driven by the CMV promoter. The customized guide RNA (gRNA) was driven by the human U6 polymerase III promoter. c Cas9/gRNA-mediated DNA double-strand break. Cas9 unwinds the DNA duplex and cleaves both strands upon recognition of a target sequence by the gRNA, but only if the correct PAM is present at the 3′ end. d, e, h Homologous recombination-based strategy to detect Cas9/gRNA activity. An EGFP-coding sequence was disrupted by the insertion of a stop codon and a target genomic sequence. f, g, h Single-strand annealing-based strategy to detect Cas9/gRNA activity. A mutated EGFP-coding sequence was divided into two segments, which were separated by a stop codon and a target genomic sequence. Both segments contained an identical 200-bp homology region. The Cas9/gRNA and SSA reporter (or HDR reporter and a repair donor) plasmids were co-transfected into a HEK293T cell line. After double-strand breaks were introduced into the target site of the reporter vector, the mutated EGFP was repaired by the endogenous cell repair machinery via the SSA or HDR mechanism

Construction of SSA Reporter and HDR Reporter Vectors

We constructed two GFP reporter plasmids to test the site specificity and cleavage activity of the designed IgM-gRNA and NPHS1-gRNA. For the HDR reporter system, a full-length EGFP-coding sequence was disrupted by inserting a termination codon together with the targeting sequence (Fig. 1d). Green fluorescence was not detected until the targeting region was cleaved by Cas9/gRNA and an intact gene was restored via traditional HDR (Fig. 1e, h). For the SSA reporter plasmid, two 200-bp repeated GFP sequences flanking the targeting site were introduced into the middle of the GFP-coding region, which was driven by the pCMV promoter (Fig. 1f). After a break was introduced by Cas9/gRNA cleavage, the two repeated GFP sequences annealed to each other and recombined into an intact GFP-coding sequence, whose expression can be detected by FACS to evaluate the efficiency and specificity of newly designed gRNAs (Fig. 1g, h). The overall strategy is shown in Fig. 1.
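The sequence-level outcome of SSA repair on such a reporter can be modeled with simple string operations. The sketch below is a conceptual illustration only: the function name is hypothetical, and the short toy sequences stand in for the real EGFP segments, the 200-bp repeat, and the stop codon plus target insert. It collapses the two direct repeats into one copy and discards the sequence between them, which is the net result that restores the reading frame.

```python
def ssa_repair(reporter, repeat):
    """Model the product of single-strand annealing between two direct repeats.

    Keeps everything up to and including the first repeat copy, drops the
    intervening sequence and the second copy, and keeps the rest.
    """
    first = reporter.find(repeat)
    if first == -1:
        raise ValueError("reporter does not contain the repeat")
    second = reporter.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("reporter must contain two copies of the repeat")
    return reporter[: first + len(repeat)] + reporter[second + len(repeat):]


# Toy stand-ins (not real EGFP sequence).
left_arm = "ATGAAA"                # 5' part of the coding sequence
repeat = "CCCGGG"                  # stands in for the 200-bp homology region
insert = "TAAACTAGTTTTACGCGT"      # stop codon plus target site between cloning sites
right_arm = "TTTTGA"               # 3' part of the coding sequence

reporter = left_arm + repeat + insert + repeat + right_arm
print(ssa_repair(reporter, repeat))
```

In this model the repaired product no longer contains the stop codon or the target site, so the reporter cannot be cut again once it has been restored.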

Detecting the Efficiency of Cas9/gRNA with the SSA and HDR Reporter Systems

To test and compare the functionality of the SSA and HDR reporter systems in detecting the cleavage activity of the Cas9/gRNA system, we separately transfected the HDR reporter and SSA reporter plasmids, combined with Cas9/gRNA targeting the porcine IgM and NPHS1 genes, into HEK293 cells and analyzed the percentage of GFP-positive cells by flow cytometry. We first transfected the HDR reporter plasmid and repair template together with the pCMV-hCas9 and U6-gRNA plasmids into 293T cells. Flow cytometry results showed that the percentages of GFP-positive cells for IgM-gRNA and NPHS1-gRNA were 18.7 and 11.1 %, respectively (Figs. 2 and 4); these values were significantly higher than those obtained when the HDR reporter was transfected alone or without the repair donor or gRNAs. We then transfected the SSA reporter plasmid together with the pCMV-hCas9 and U6-gRNA plasmids into 293T cells. A large number of GFP-positive cells was detected after 48 h under a confocal microscope by using appropriate filters. Flow cytometry results also showed that the percentages of GFP-positive cells for IgM-gRNA and NPHS1-gRNA reached 59.9 and 27.1 %, respectively (Figs. 3 and 4). The transfection of pCMV-hCas9 without the gRNA plasmid was used as a negative control. These data demonstrated that, compared with the HDR reporter system, the SSA reporter system was more sensitive in detecting the cleavage activity of designed gRNAs and can be harnessed for further genome editing.
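To make the comparison concrete, the fold difference between the two reporters can be computed directly from the percentages reported above. The snippet is a small arithmetic sketch; the dictionary layout and function name are illustrative choices, and only the four percentages come from the text.

```python
# Percentages of GFP-positive cells reported in the text.
gfp_positive = {
    "IgM": {"HDR": 18.7, "SSA": 59.9},
    "NPHS1": {"HDR": 11.1, "SSA": 27.1},
}


def fold_increase(values):
    """Ratio of the SSA reporter signal to the HDR reporter signal."""
    return values["SSA"] / values["HDR"]


for gene, values in gfp_positive.items():
    print(f"{gene}: SSA/HDR = {fold_increase(values):.1f}-fold")
```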

Fig. 2

Detecting the activity of Cas9/gRNA targeted to porcine IgM and NPHS1 genes with HDR reporters. HDR reporter plasmid alone or combined with Cas9/gRNA and repair plasmid was transfected into HEK293 cells. EGFP expression was detected with a confocal microscope using appropriate filters after 48 h, and the ratio of EGFP-positive cells was measured via flow cytometry

Fig. 3

Detecting the activity of Cas9/gRNA targeted to porcine IgM and NPHS1 genes with SSA reporters. SSA reporter plasmid alone or combined with the Cas9/gRNA plasmid was transfected into HEK293 cells. EGFP expression was detected under a confocal microscope using appropriate filters after 48 h, and the ratio of EGFP-positive cells was measured by flow cytometry

Fig. 4

The efficiency of EGFP restored by Cas9/gRNA based on HDR and SSA repair machinery. Error bars represent s.e.m., n = 3

Detecting the Off-Target Effect of Cas9/gRNA with the SSA Reporter

Previous studies showed that CRISPR/Cas9 has considerable off-target effects in mammalian cells and embryos. To test the off-target effect, ten IgM-gRNAs (IgM-gRNA OTs), each bearing a different single substitution, were generated and separately co-transfected with the Cas9 and IgM SSA reporter plasmids into HEK293T cells. At 48 h post-transfection, the percentages of GFP-positive cells were measured by flow cytometry. Compared with the on-target IgM-gRNA, the first six IgM-gRNA OTs had significantly lower cleavage activity. However, the other IgM-gRNA OTs, which carried a single mismatch near the 5′ terminus, showed activities comparable to that of the on-target gRNA (Fig. 5).

Fig. 5

The efficiency of EGFP restoration by IgM-gRNA and IgM mismatch gRNAs. Error bars represent s.e.m., n = 2

T7E1 Assay

We further performed a polyacrylamide gel electrophoresis-based T7E1 cleavage assay to confirm the results of the SSA reporter assay. The PCR products spanning the IgM- and NPHS1-targeting sites in the genomic region were purified. The hybridized PCR products were digested with T7E1 for 30 min and then resolved on 8 % polyacrylamide gels. The T7E1 assay showed that the modification frequencies of the IgM and NPHS1 genes in PFF cells were 39 and 19 %, respectively (Fig. 6a), suggesting that the results of the SSA reporter system were consistent with those of the T7E1 assay.

Fig. 6

Cas9/gRNA system mediates gene targeting in PFFs. a The efficiency of Cas9/gRNA targeted to porcine NPHS1 and IgM genes detected in PFFs using the T7E1 assay. b Representative sequencing results of PCR fragments from cell colonies with targeted IgM and NPHS1 genes, showing double peaks at the mutation site around the PAM region. c Detailed mutations in the targeting sites of IgM and NPHS1 in mutant colonies

CRISPR/Cas9-Mediated Gene Targeting in PFF Cells

To evaluate the targeting effectiveness of the Cas9/gRNA system in the porcine genome, we selected the IgM gene as the first gene of interest in PFF cells. We co-transfected Cas9-nickase- and IgM-gRNA-expressing vectors into PFFs derived from a 35-day-old fetus by electroporation. We also co-transfected the Cas9- and NPHS1-gRNA-encoding vectors into the PFFs to target the porcine NPHS1 gene. Approximately 10 days after the G418 selection for the IgM gene, 120 single cell-derived colonies were selected and individually analyzed by sequencing the PCR products covering the target locus. For the NPHS1 gene, 108 cell colonies were collected and screened by PCR sequencing. Among these colonies, 85 cell clones were identified as carrying different mutations in the targeted genes. The percentages of cell colonies containing indels in the IgM and NPHS1 genes were 45.8 and 27.8 %, respectively (Fig. 6b), and various types of mutations were found at the target loci, including deletions and insertions (Fig. 6c).

Discussion

The advent of genome-editing technologies such as ZFNs, TALENs, and CRISPR/Cas9 is continuously revolutionizing the study of genes and their functions in science, medicine, and biotechnology. Although ZFNs and TALENs enable targeted genome modifications, their design and assembly require a great deal of optimization to achieve site-specific gene targeting. By comparison, CRISPR/Cas9 offers several advantages over ZFNs and TALENs, including ease of customization, higher targeting efficiency, and the capability to facilitate multiple gene modifications [25]. Previous studies have shown that the cleavage activity of designed gRNAs varies greatly among different sites within the same locus [26]. Strategies based on repairing mutated fluorescence genes through HDR, NHEJ, and SSA have been used to validate the gene-targeting efficiency mediated by customized endonucleases [27–29]. However, most of them are labor-intensive and time-consuming, and their sensitivity and efficiency in detecting gene targeting also vary. Simpler methods are therefore needed for evaluating the targeting efficiency of gRNAs in genome engineering, especially when multiple genes are to be knocked out in one step. In this study, we described the feasibility of applying the GFP-based SSA reporter system to rapidly identify the most effective gRNA before practical application.

We first compared the capability of the GFP-based SSA reporter system with that of the HDR reporter system in detecting the cleavage activity of Cas9/gRNA targeted to porcine IgM and NPHS1 genes in human HEK293 cells. The targeting efficiency was measured as the ratio of EGFP-positive cells by flow cytometry. Our data showed that the HDR-based approach restored EGFP at frequencies of 11.1–18.7 %. By contrast, the SSA-based approach restored EGFP at frequencies of 27.1–59.9 %, showing markedly higher sensitivity and efficiency. The SSA-based approach also did not require a donor, and its sensitivity was similar to that of the conventional T7E1 assay. Moreover, with the mismatch gRNAs, the off-target effect of Cas9/gRNA was detected with the SSA reporter. Consistent with prior reports [29, 30], mismatches adjacent to the PAM resulted in much weaker off-target cleavage, which showed that the SSA reporter could also be used for screening specific on-target gRNAs. Nevertheless, considering that cleavage at the target locus could be affected by chromatin structure and epigenetic state, the SSA reporter system established in this study is better suited to ruling out inactive gRNAs than to selecting active gRNAs.

Encouraged by the high efficiency of the designed Cas9/gRNAs in the GFP-based SSA assay, we then investigated the targeting efficiencies of the IgM and NPHS1 genes in pig fetal fibroblasts via the CRISPR/Cas9 system. Their gene-targeting efficiencies were 45.8 and 27.8 %, respectively, which were significantly higher than the efficiency obtained using the traditional HR method. Various mutations were found near the PAM, including small deletions and a few insertions. The gene-targeted pig fibroblast cells obtained with this high efficiency could serve as donor cells for subsequent somatic cell nuclear transfer.

In summary, we developed a GFP-based SSA reporter system that provides a simple and rapid method to evaluate and compare the efficiencies of gRNAs at inducing indel mutations with the CRISPR/Cas9 system. Thus, by selecting the most effective gRNAs, our method would reduce the uncertainties and greatly expand the practical possibilities of CRISPR/Cas9-mediated genome engineering in livestock.