Introduction

Maize has a dual role, being a major crop species and a model species in genetics. Genome-edited waxy maize characterized by modified starch composed entirely of amylopectin was one of the first crops edited using CRISPR–Cas9 technology that obtained clearance to be cultivated and sold without GM-type oversight by the US Department of Agriculture (Waltz 2016). This example illustrates the intense interest in the potential of CRISPR–Cas9 technology for both applied and fundamental research studies. The starch industry has appreciated waxy maize for decades, because the absence of amylose makes starch easier to process. Although the waxy trait is not novel, CRISPR–Cas9 technology allowed the direct creation of waxy deletions in elite lines over one or two generations, avoiding time-consuming backcrosses and genetic drag experienced with conventional introgression (Cigan et al. 2017).

In fundamental research, understanding the contribution of genes to phenotypic traits in maize has been a challenge for many decades. By comparing a standard (wild type) to a mutant, the contribution of a genetic sequence to a biological process can be assessed. Although a large number of natural maize mutants exist, increasing their diversity through mutagenesis has been a long-term goal (Candela and Hake 2008). For example, the original waxy mutation discovered in 1909 (Collins 1909) has been joined over time by hundreds of additional alleles reflecting emerging mutagenesis tools. In the 1960s, mutagenesis induced by the chemical agent ethyl methanesulfonate (EMS) was popular and allowed the generation of an allelic series at the Waxy locus with different levels of residual amylose (Briggs et al. 1965). A few years later, irradiation mutagenesis, which preferentially creates deletions rather than point mutations, helped to generate true loss-of-function mutants that are more informative for functional genetics (Amano 1968). With the advent of molecular biology, transposon mutagenesis was developed, since transposon insertions could be easily localized in the genome. The cloning of the Waxy gene by transposon tagging was a prime example of the success of this strategy (Shure et al. 1983). Over the past few years, CRISPR–Cas9 technology has emerged as an appreciated alternative to sequence-indexed mutant collections (Settles et al. 2007; Vollbrecht et al. 2010), mainly because these collections do not saturate the maize genome, and because CRISPR–Cas9 technology can be targeted.

In contrast to random mutagenesis tools, which require the molecular screening of large mutagenized populations to find a mutation in a given gene, targeted mutagenesis of a gene gives ready access to a specific mutant and its phenotype, but has been a major challenge. It has been made possible by the development of techniques inducing double-strand breaks (DSBs) of genomic DNA at a predetermined site (Puchta and Fauser 2014). DSBs are then repaired by one of several cellular repair mechanisms that can be non-conservative and can, therefore, lead to mutations at the desired location. The most frequent repair process in plants is non-homologous end joining (NHEJ), during which a DNA ligase joins the damaged strands, and which can be classified as either classical NHEJ or alternative NHEJ, also known as microhomology-mediated end joining (MMEJ) (Lieber 2010). Classical NHEJ primarily induces the insertion or deletion of a low number of nucleotides, whereas MMEJ generally leads to larger deletions (McVey and Lee 2008; Puchta and Fauser 2014). DSBs can also be repaired by homologous recombination (HR), which can also be classified into two classes: conservative and non-conservative. Non-conservative HR, called single-strand annealing, occurs if a repeated sequence of more than 30 nucleotides is present upstream and downstream of the DSB (McVey and Lee 2008). The presence of these repeated sequences renders single-strand annealing very efficient for mutagenesis, as it repairs up to 1/3 of the DSBs and can generate large deletions (Siebert and Puchta 2002; Steinert et al. 2016). Conservative HR is of particular interest because it can be used for the replacement or the insertion of a sequence of interest, present on an extra chromosome, at the desired genomic locus. However, although the presence of DSBs increases the efficiency of conservative HR, this remains two orders of magnitude lower than that of NHEJ (Steinert et al. 2016).

Inducing a DSB at a predetermined site in the genome requires both the recognition of the target sequence and the cleavage of the DNA, hitherto achieved using endonucleases. Several technologies have been developed to direct endonucleases to sequences of interest, either by engineering the DNA binding domains of naturally occurring meganucleases (Choulika et al. 1994) or by linking modular DNA-binding domains such as zinc finger (Bibikova et al. 2003; Porteus and Carroll 2005; Shukla et al. 2009) or transcription activator-like effector (TALE) domains (Christian et al. 2010) to endonuclease domains such as FokI. All three technologies have been successfully implemented in maize (Bibikova et al. 2003; Porteus and Carroll 2005; Shukla et al. 2009; Gao et al. 2010; Liang et al. 2014; Char et al. 2015). In 2013, the adaptation of the bacterial immune system CRISPR–Cas9 of Streptococcus pyogenes offered a novel type of technology in which the recognition of the DNA was not due to a protein domain but to a short guide RNA (sgRNA) that forms an active complex with the Cas9 protein (Jinek et al. 2012; Cong et al. 2013; Nekrasov et al. 2013; Shan et al. 2013). The sgRNA is composed of 20 nucleotides which are homologous to the genomic region targeted, followed by a short hairpin RNA (shRNA), also referred to as scaffold RNA. Within the genome the 20 targeted nucleotides should be followed by a protospacer adjacent motif (PAM) composed of the nucleotides NGG. DSBs induced by Cas9 are generally located 3 bp upstream of the PAM site. The ease of design and low cost explain the rapid success of this user-friendly and efficient technology in a wide range of organisms including plants.
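The target geometry described above (20 nucleotides followed by an NGG PAM, with the break 3 bp upstream of the PAM) can be illustrated with a minimal Python sketch. The function name and the example sequence are hypothetical and only the forward strand is scanned; this is an illustration of the rule, not a design tool.

```python
import re

def find_cas9_targets(seq):
    """Return forward-strand targets as (start, protospacer, pam, cut_index)."""
    seq = seq.upper()
    targets = []
    # A lookahead is used so that overlapping 23-mers are all reported.
    for m in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
        start = m.start()
        protospacer, pam = m.group(1), m.group(2)
        # The break lies 3 bp upstream of the PAM, i.e. just before the
        # base at 0-based index start + 17.
        cut_index = start + 17
        targets.append((start, protospacer, pam, cut_index))
    return targets

# Hypothetical sequence: one 20-nt target followed by a TGG PAM.
example = "ACGT" * 5 + "TGG" + "CCA"
for start, protospacer, pam, cut in find_cas9_targets(example):
    print(start, protospacer, pam, cut)
```

Targets on the opposite strand would be found by applying the same scan to the reverse complement of the sequence.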

In the last 5 years, CRISPR–Cas9 technology has been successfully adapted to maize. For the introduction of the CRISPR–Cas9 machinery, direct DNA transfer to protoplasts (Liang et al. 2014; Xing et al. 2014), particle bombardment of immature embryos (Xing et al. 2014; Svitashev et al. 2015; Feng et al. 2016; Zhu et al. 2016) and Agrobacterium-mediated transformation of immature embryos (Xing et al. 2014; Svitashev et al. 2015; Feng et al. 2016; Zhu et al. 2016; Char et al. 2017) have been used. Protoplast experiments serve mainly for the evaluation of the efficiency of different sgRNA designs, since there is presently no protocol for the regeneration of maize plants from protoplasts. Biolistics avoid the use of Agrobacterium, which is regulated in certain countries since it is a plant pathogen. Agrobacterium-based stable transformation with subsequent elimination of the CRISPR–Cas9 cassette by backcross nonetheless remains the most widely used method. The transfer is almost exclusively based on DNA molecules encoding Cas9 and the sgRNA but the bombardment of Cas9-expressing plants with sgRNA (Xing et al. 2014; Svitashev et al. 2015; Feng et al. 2016; Zhu et al. 2016) and of wild-type plants with pre-assembled Cas9–sgRNA ribonucleoproteins (RNP) has also been reported (Svitashev et al. 2016). Multiplexing with more than one guide RNA in a single construct is of particular interest in maize due to the lengthy and not very efficient transformation protocol. Two techniques have been developed: one based on a multi-guide RNA activated by a single promoter and processed by tRNA motif-mediated self-cleavage into several sgRNAs, and another based on tandem repeats of different U3 and U6 promoters each controlling one guide RNA (Qi et al. 2016; Char et al. 2017). As expected, the mutations resulting from targeted mutagenesis are mainly deletions or insertions of a few nucleotides probably due to classical NHEJ. 
Larger deletions of more than ten bases, potentially resulting from an MMEJ repair, have also been reported but are less frequent (Xing et al. 2014; Svitashev et al. 2015; Feng et al. 2016; Zhu et al. 2016). Furthermore, true genome editing, i.e. the predetermined modification of an allele based on a repair matrix carrying the desired mutation by HR (Xing et al. 2014; Svitashev et al. 2015; Feng et al. 2016; Zhu et al. 2016), as well as HR-mediated promoter swap have also been achieved in maize (Svitashev et al. 2015; Shi et al. 2017).

Here, we describe the CRISPR–Cas9-based mutagenesis of 20 maize genes selected for their putative implication in maize kernel development. The mutagenesis efficiency, the type of mutations obtained, the simultaneous knockout of genetically tightly linked genes and the rate of transmission to the next generation will be addressed.

Materials and methods

Plant material and growth conditions

The maize (Zea mays) inbred line A188 (Gerdes and Tracy 1993) and derived transgenic or edited plants were grown in growth chambers that fulfill the French S2 safety standards for the culture of transgenic plants. In the 15 m2 growth chambers, the plants were illuminated by a mixture of 10 LED spots of 500 W (Neptune LED, Ste Anne sur Gervonde) set at 60% intensity and eight high-pressure sodium lamps of 400 W, resulting in the spectrum presented in Online Resource 5 and a photosynthetic photon flux density (PPFD) of about 300–400 µmol m−2 s−1 at plant height. The photoperiod consisted of 16 h light and 8 h darkness in a 24 h diurnal cycle. Temperature was set to 24 °C/17 °C (day/night) during the first 84 days after sowing (DAS) and then to 26 °C/28 °C for the remaining 30 days of the life cycle. The relative humidity was controlled at 55% (day) and 65% (night). Seeds were germinated in 0.2 L of Favorit MP Godets substrate (Eriterre, Saint-André-de-Corcy) and were transferred between 12 and 20 DAS to 8 L of Favorit Argile TM + 20% perlite substrate (Eriterre, Saint-André-de-Corcy) supplemented with 50 mL of Osmocote Exact Hi.End 5-6M (15-9-12 + 2MgO + TE) fertilizer (Scotts, Écully). All plants were propagated by hand pollination.

Vector cloning

The integrative plasmid L1609 (Fig. 1) is based on the backbone of pSB11 (Ishida et al. 1996), from which a SapI site was removed. It contains between the T-DNA borders a rice codon-optimized Cas9 (Miao et al. 2013) driven by a synthetic maize ubiquitin promoter lacking several restriction sites, a rice U3 promoter separated from a shRNA (Shan et al. 2013) by two adjacent but otherwise unique SapI sites, unique EcoRV and I-CeuI sites, and a Basta® resistance cassette. The small plasmid L1611 (Fig. 1) contains a wheat U6 promoter followed by two adjacent SapI sites and a shRNA (Shan et al. 2013), the entire cassette being flanked by unique EcoRV and I-CeuI sites. Annealed oligonucleotides with SapI-compatible overhangs and corresponding to 20 nt-targeted sequences containing at their 5′ end an A in the case of the U3 promoter or a G in the case of the U6 promoter were cloned in L1609 and L1611, respectively. The U6-driven target cassette present in L1611 was subsequently excised with EcoRV and I-CeuI, and cloned into the L1609 derivative downstream of the U3-driven target cassette. The resulting plasmid was transferred to Agrobacterium tumefaciens strain LBA4404 (pSB1) and used for maize transformation. Alternatively, Gateway-compatible assemblies of two to four cassettes consisting each of a long or short maize U6 promoter, followed by a 20-nt target site starting with a G and a shRNA (Char et al. 2017) were entirely synthesized (GENEWIZ, New Jersey) and recombined into plasmid pGW-Cas9 (Char et al. 2017) containing between T-DNA borders a maize codon-optimized Cas9 driven by maize ubiquitin promoter and a Basta® resistance cassette conferring glufosinate-ammonium herbicide resistance.
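The 5′ nucleotide rule stated above (an A for targets cloned behind the U3 promoter, a G for targets cloned behind the U6 promoter) can be checked before ordering oligonucleotides. The following Python sketch is only an illustration: the function name is hypothetical, and the SapI-compatible overhangs are deliberately left out because their sequences are not given here.

```python
# Required 5' nucleotide of the 20-nt target for each promoter, as stated above.
REQUIRED_FIRST_BASE = {"U3": "A", "U6": "G"}

def compatible_promoters(target20):
    """List the promoters whose 5' nucleotide rule the target satisfies."""
    target20 = target20.upper()
    if len(target20) != 20 or set(target20) - set("ACGT"):
        raise ValueError("expected a 20-nt sequence over A, C, G, T")
    return [p for p, base in REQUIRED_FIRST_BASE.items() if target20[0] == base]

print(compatible_promoters("A" + "CGT" * 6 + "C"))   # target starting with A
```

A target starting with C or T satisfies neither rule and returns an empty list.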

20-nt target sequence choice

For the design of sgRNAs targeting specifically a single gene in the maize genome, the online tools CRISPR-P (http://crispr.hzau.edu.cn/CRISPR/) (Lei et al. 2014) and CRISPOR (http://crispor.tefor.net/) (Haeussler et al. 2016) were interrogated and targets at convenient positions with high scores in both tools were chosen. Since these tools are not readily suited to target several members of a gene family with a single sgRNA, we wrote custom Perl scripts to design sgRNAs directed against up to ten genes each. All candidate CRISPR–Cas9 targets were identified in the B73 maize reference genome sequence v3.26 (Schnable et al. 2009) using the following criteria: 23-mers ending with NGG, not containing more than 4 Ts in a row, and with no variant of the last 12 nt ending in NAG existing in the genome. Using Jellyfish v2.2.0 (Marçais and Kingsford 2011), we counted the number of occurrences in the genome of the last 15 nt of each candidate (excluding NGG), and kept only those occurring at most ten times. The resulting database (Online Resource 1 and https://flower.ens-lyon.fr/maize/crispr/) contained 15,715,633 23-nt sequences, targeting 19,024,477 loci. We queried it to identify targets in the genes we wanted to edit. In both cases, a design was retained only if the 20-nt target sequence and the PAM in the B73 v3.26 reference genome (Schnable et al. 2009) available in the design tools showed no polymorphism with the sequence of genotype A188 used for transformation.
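The candidate filters listed above can be sketched in a few lines of Python. This is not the Perl code used in the study: the function names are hypothetical, a naive in-memory k-mer counter over both strands stands in for Jellyfish (an assumption about how occurrences are counted), and the NAG-variant criterion is omitted.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]

def seed_counts(genome, k=15):
    """Count every k-mer on both strands (naive stand-in for Jellyfish)."""
    counts = Counter()
    for strand in (genome, revcomp(genome)):
        for i in range(len(strand) - k + 1):
            counts[strand[i:i + k]] += 1
    return counts

def candidate_targets(genome, max_seed_hits=10):
    """Yield (strand, 23-mer) passing the PAM, T-run and seed-count filters."""
    genome = genome.upper()
    counts = seed_counts(genome)
    for name, strand in (("+", genome), ("-", revcomp(genome))):
        for i in range(len(strand) - 22):
            site = strand[i:i + 23]
            if site[21:] != "GG":          # 23-mer must end with NGG
                continue
            if "TTTTT" in site:            # more than 4 Ts in a row
                continue
            seed = site[5:20]              # last 15 nt before the PAM
            if counts[seed] <= max_seed_hits:
                yield name, site

# Hypothetical 23-nt "genome" holding a single forward-strand candidate.
print(list(candidate_targets("ACGT" * 5 + "TGG")))
```

On a genome of maize size, the counting step would have to be delegated to a dedicated k-mer counter, as was done in the study.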

Identification of off-target loci

To identify potential off-target loci in the maize genome, the 23-nt sequence of each sgRNA was used as query in a WU-Blast search of the B73 maize reference genome sequence v3.26 with the very relaxed parameters “W = 1 M = 4 N = − 5 Q = 8 R = 7 gapX = 100 E = 1e7 V = 50 B = 1e6 filter = none kap pingpong”. Alignments were subsequently filtered with ad hoc scripts to keep those covering the whole sgRNA length, with at most three mismatches in the last 15 nt, if the NGG had a perfect match; in case the NGG was not conserved, we kept only alignments with a perfect match on the last 15 nt. Alignments were sorted by decreasing match quality, favoring those with the longest match on the 3′ region of the sgRNA, and manually examined (Online Resource 4 and https://flower.ens-lyon.fr/maize/crispr/). For experimental validation, off-target-specific PCR primers were designed (Online Resource 3) and used for PCR amplification and Sanger sequencing.
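The two retention rules applied to the alignments can be written as a short Python sketch. It assumes ungapped alignments that have already been extracted as 23-nt strings (20-nt target plus PAM); the function names are hypothetical and the code does not reproduce the ad hoc scripts used in the study.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def keep_off_target(guide23, site23, max_seed_mismatches=3):
    """Apply the two retention rules to an ungapped, full-length alignment."""
    guide23, site23 = guide23.upper(), site23.upper()
    if len(guide23) != 23 or len(site23) != 23:
        return False                        # alignment must cover all 23 nt
    seed_mm = mismatches(guide23[5:20], site23[5:20])   # last 15 nt before PAM
    pam_conserved = site23[21:] == "GG"                 # NGG present at the site
    if pam_conserved:
        return seed_mm <= max_seed_mismatches
    return seed_mm == 0                     # no NGG: the seed must match perfectly

guide = "ACGT" * 5 + "AGG"                  # hypothetical guide plus PAM
print(keep_off_target(guide, guide[:20] + "AAG"))
```

Retained sites would then be ranked by match quality before manual inspection, as described above.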

Maize transformation and screen for Cas9-free edited plants

Immature embryos of maize inbred line A188 were transformed with A. tumefaciens strain LBA4404 harboring pSB1 and the construct of interest according to a standard protocol (Ishida et al. 1996, 2007). T-DNA integrity was checked as described elsewhere (Gilles et al. 2017). Genome editing was evaluated on leaves of T0 plants, individually for each targeted gene by specific PCR amplification of the targeted region (see Online Resource 3 for primer sequences) followed by Sanger sequencing. Segregation of T-DNA in T1 plants was evaluated by PCR amplification of the Bar gene, checking the presence and quality of genomic DNA by PCR amplification of the GRMZM2G136559 control gene (see Online Resource 3 for primer sequences).

Results

Multi-sgRNA plasmids for single and multiple gene editing

To carry out single- or multiple-gene mutagenesis using CRISPR–Cas9 technology in maize, two types of vectors were used. The first type was designed in-house and will be named Reproduction et Développement des Plantes (RDP) vectors hereafter (Fig. 1). The final construct typically contains two guide RNAs and is built by combining derivatives of the initial plasmids L1609 and L1611 (Fig. 1) by restriction and ligation. L1609 is a binary vector containing a T-DNA suitable for Agrobacterium-mediated maize transformation, which encompasses a plant selection marker conferring resistance to the Basta® herbicide and a Cas9-coding sequence driven by the maize ubiquitin promoter, which is active in most plant tissues (Christensen and Quail 1996). The specific 20-nt sequence that will hybridize with the target site in the genome and thus guide the Cas9 complex to the gene(s) of interest is inserted between the Oryza sativa U3 (OsU3) promoter and the shRNA (Fig. 1). The other initial plasmid L1611 allows the cloning of a second 20-nt targeting sequence between the TaU6 (Triticum aestivum U6) promoter and a shRNA. The sub-cloning of this TaU6::sgRNA cassette into the modified L1609 plasmid leads to the generation of the final RDP vector with two sgRNAs (Fig. 1). The second type of CRISPR–Cas9 vector used was derived from the Gateway® compatible plasmid pGW-Cas9 developed by Iowa State University (Char et al. 2017) and will be referred to as Iowa vectors hereafter. Two to four sgRNA cassettes flanked by attL sites were entirely synthesized prior to recombination into pGW-Cas9.

Fig. 1

CRISPR–Cas9 cloning vectors. Cloning strategy for RDP vectors. The final RDP vectors contain two small guide RNAs (sgRNA1 and sgRNA2) and are generated by assembly of the two initial plasmids L1609 and L1611. First the 20 nt corresponding to the recognition sequences are synthesized as oligonucleotides with SapI-compatible ends and inserted between the U3 or U6 promoter and the scaffold RNA (shRNA) after SapI digestion in both plasmids, forming sgRNA1 and sgRNA2. Then the TaU6::sgRNA2 cassette is transferred by EcoRV/I-CeuI digestion into the plasmid already containing the OsU3::sgRNA1 cassette. BAR Basta® resistance gene, Cas9 rice codon-optimized Cas9 gene, LB T-DNA left border, OsU3 rice (O. sativa) U3 promoter, pActUbi maize ubiquitin promoter, pOsAct rice actin promoter, RB T-DNA right border, shRNA short hairpin RNA, sgRNA small guide RNA, TaU6 wheat (T. aestivum) U6 promoter, 20 nt recognition sequence of 20 nucleotides inserted before the shRNA

Both the RDP and Iowa vectors used in this study contain multi-guide RNAs, allowing the targeting of several genes with a single construct. We also created a database of all multi-target 20-nt sequences, each targeting up to ten loci in the genome with one sgRNA, which makes it possible, for example, to target paralogous genes (Online Resource 1). In our functional genetics approaches, we targeted the coding sequence to increase the likelihood of generating loss-of-function mutations. Four main strategies for sgRNA design were employed to achieve different types of gene knockout(s) (KO) (Fig. 2; Table 2): (1) targeting two unique, non-related genes with a single guide RNA each, (2) targeting a unique gene with two guide RNAs, (3) targeting paralogs with a single or multiple guide RNAs and (4) targeting a unique gene with four guide RNAs.

Fig. 2

Different approaches to generate single and multiple gene knockouts in maize. The first strategy consists of targeting two distinct genes with specific guide RNAs for each gene, the second of targeting a single gene with two guide RNAs, the third of targeting several paralogous genes with one or several guide RNAs, and the fourth of targeting a single gene with four guide RNAs

Different types of mutations are created using multi-guide RNAs strategies

A total of 20 genes were targeted with different RDP or Iowa vectors (Table 1). After stable transformation of maize immature embryos, DNA was extracted from young leaves of transgenic T0 plants to assess the type and frequency of the mutations generated. Based on PCR amplification of the target site, and subsequent Sanger sequencing, at least one mutant allele was obtained for 18 of the 20 genes. All edited alleles are summarized in Table 1.

Table 1 CRISPR–Cas9 alleles generated in 20 maize genes

Fig. 3

Scheme of an atypical mutant allele of ZmEsr1 (GRMZM2G046086). The intronless ZmEsr1 gene is represented by a square box with the open reading frame in blue and the UTRs in red. Numbering starts at the first nucleotide of the ATG start codon. The duplicated intergenic sequence is depicted in yellow. The 35-bp segment deleted in the mutant allele is indicated in dark blue. (Color figure online)

For genes targeted by a single guide RNA (strategies 1 and 3 in Fig. 2), a total of 56 mutations were generated in 13 genes (top section of Table 1). With the exception of one of the two guide RNAs targeting GRMZM2G352274, all guide RNAs gave rise to new alleles, ranging from 1 to 12 different alleles (in the case of GRMZM2G089517). In this context, it should be noted that the number of alleles does not reflect mutation efficiency, since transformation rates varied over time and not all transformation events were carried to the plantlet stage, and also because identical mutations were generated independently in different plants. The mutations generated were predominantly (82%, 46/56) small indels, defined as short (< 10 bp) insertions or deletions or mixtures of both (Table 1). As expected, the vast majority of these indels occurred 3 bp upstream of the PAM sequence, the position where the Cas9 nuclease cleaves double-stranded DNA (Zuo and Liu 2016). Less frequently (14%, 8/56), larger deletions (> 10 bp) were observed, the largest one reaching 136 bp (Table 1). Interestingly, the majority of these larger deletions concerned a single gene, GRMZM2G089517, in which six of the eight larger deletions were found. In addition, two substantial insertions (10 bp and 11 bp) of unrelated DNA occurred in this gene, both accompanied by the deletion of a few nucleotides, as well as three classical indels (Table 1). This atypical example suggests that a specific gene context may influence the type of mutations generated, possibly by favouring a particular repair mechanism. However, in the case of GRMZM2G089517 it was not possible to implicate a specific mechanism with certainty, since the start and end points were not shared between the large deletions and since a search for repeated nucleotides did not detect obvious micro-homologies in proximity to the cutting site. Last, two other types of mutations were observed only once.
The first, which again concerns the atypical GRMZM2G089517 gene, consists of a substitution of two nucleotides on either side of the PAM site, for which it is difficult to provide a mechanistic explanation (Table 1). The second atypical mutation was found in GRMZM2G046086, in which 35 bp next to the putative cutting site were substituted by an insertion of 62 bp (Fig. 3; Table 1). This insertion comprises an adenine nucleotide plus 61 nucleotides corresponding to a stretch of intergenic DNA region found 602 bp downstream of the putative cutting site (Fig. 3). Interestingly, this 61-bp intergenic sequence is still present at the original location in the two alleles of the T0 plant, indicating that it was duplicated to create this atypical mutation.

We next analysed the six genes that had been targeted by two guide RNAs concomitantly (strategy 2 in Fig. 2). The rationale behind this strategy was to increase the probability of success with a single construct, since a mutation at either target site would be sufficient for loss-of-function. In the ideal case, the two guide RNAs, spaced between 40 bp and 100 bp apart, would induce deletions of a predictable size that could be easily detected by simple PCR in agarose gels and avoid the Sanger sequencing step to detect and follow the mutant allele. A total of 27 mutations were generated in four of the six genes, whereas neither deletions between the two cleavage sites nor other mutations were obtained for GRMZM2G040095 and GRMZM2G035701. There were no obvious reasons for the two failures, since the sgRNA design followed the same rules as for the four successful constructs. More intriguingly, the two sgRNAs for GRMZM2G035701 (failure) were actually present on the same construct and in the same plants as the two sgRNAs for GRMZM2G149940 (success). The large majority of the mutations identified (78%, 21/27) did not involve a deletion between the two guide RNA targets, but were caused by indels or larger deletions at one (74%, 20/27) or both target sites (4%, 1/27) (Table 1). Clear preferences for one of the two target sites were noted in all four cases and likely reflect differences in mutation efficiency or target site accessibility. Only 22% (6/27) of mutations harboured deletions of the region located between the two target sites (Table 1). In only one case (GRMZM2G049141), the 100-bp deletion corresponded exactly to the zone between the two putative cleavage sites. Regarding the other five deletions, small indels at one or both target sites caused deletions that were either slightly smaller (GRMZM2G039538 and GRMZM2G363552) or slightly larger (GRMZM2G049141) than the expected size (Table 1).
In summary, it was possible to generate deletions in regions between two guide RNAs. However, the exact size of the deletion was variable and deletions between two target sites were less frequent than indels generated by the action of an individual guide RNA.
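The expected size of a deletion between two guide RNAs follows directly from the position of each break, 3 bp upstream of the PAM. A minimal Python sketch is given below; the function names and coordinates are hypothetical, and each site is described by the leftmost reference coordinate of its 23-nt sequence (target plus PAM) and by the strand carrying the target.

```python
def cut_index(site_start, strand):
    """0-based reference index of the first base to the right of the break.

    site_start is the leftmost reference coordinate of the 23-nt site
    (20-nt target plus PAM), whichever strand carries the target.
    """
    if strand == "+":
        return site_start + 17   # PAM at +20..+22, break 3 bp upstream of it
    if strand == "-":
        return site_start + 6    # PAM (read as CCN) at +0..+2, break 3 bp inside
    raise ValueError("strand must be '+' or '-'")

def expected_deletion(site_a, site_b):
    """Length in bp of the fragment between two Cas9 breaks."""
    return abs(cut_index(*site_a) - cut_index(*site_b))

def classify(observed_bp, expected_bp):
    """Compare an observed deletion with the size predicted from the breaks."""
    if observed_bp == expected_bp:
        return "exact"
    return "smaller" if observed_bp < expected_bp else "larger"

expected = expected_deletion((100, "+"), (200, "+"))
print(expected, classify(97, expected))
```

Comparing observed and expected sizes in this way separates clean excisions from deletions modified by additional indels at one or both sites.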

Last, a vector with four guide RNAs was designed to target a unique gene (strategy 4 in Fig. 2). Three guide RNAs gave rise to mutations in GRMZM2G471240, which were all of the indel type (Table 1). No deletions between the four target sites were observed.

Mutation efficiency

In total, 28 guide RNAs were expressed in plants, 20 using RDP vectors and 8 using Iowa vectors. Three guides targeted two genes in conserved regions. Among the 28 guide RNAs, 22 resulted in at least one mutation and 6 did not induce any sequence change in the analysed plants (Table 2). For RDP vectors, 17 guide RNAs induced at least one mutation and three did not generate a mutation. For Iowa vectors, the proportion of unsuccessful guides was higher (3/8) but this result should be interpreted with caution because considerably fewer transformation events were obtained when using Iowa vectors in our conditions. This was certainly due to the non-optimal combination of our Agrobacterium strain LBA4404 (pSB1) and the binary vector, and more precisely an incompatibility between the origins of replication of pSB1 and pGW-Cas9 (Char et al. 2017).

Table 2 Guide RNAs used and relationship with plant transformation events

Bi-allelic mutations, meaning that alleles on both the maternal and paternal chromosomes carried mutations, were detected in 19% (16/83) of the mutated plants, and more precisely in 18% (13/74) of the mutants obtained with RDP vectors and in 33% (3/9) of the mutants generated with Iowa vectors (Table 2).

Mutation efficiency was calculated as the number of transformation events harbouring at least one mutation as a proportion of all transformation events obtained for a given guide RNA (Table 2). Although this number may be somewhat influenced by differences in the accessibility of certain targets, for example, due to chromatin differences between centromeric and telomeric chromosome regions, or by competition between guide RNAs in the plants that produced more than one guide, it was clear that mutation efficiency was very variable despite similar rules for guide RNA design (Table 2). Concerning the promoter used to drive guide RNA expression in the RDP vectors, mutations were obtained using both the OsU3 and the TaU6 promoters. Averaging the percentages for each promoter, a higher overall mutation efficiency was observed with the TaU6 promoter (65%) compared to the OsU3 promoter (39%) (Table 2). Using the same approach, a slightly higher efficiency was noted when the 20-nt target and the NGG were chosen on the coding (+) strand (58%) compared to the non-coding (−) strand (48%) (Table 2). Finally, the mutation efficiency was not strongly correlated to the overall GC content of the 20-nt targeted sequence (r = 0.31, Table 2).
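The calculations described in this paragraph (efficiency per guide RNA, unweighted average per promoter, correlation with GC content) can be reproduced with a short Python sketch. All guide sequences and event counts below are hypothetical placeholders, not values from Table 2, and the helper names are illustrative.

```python
from math import sqrt

def efficiency(mutated_events, total_events):
    """Percentage of transformation events carrying at least one mutation."""
    return 100.0 * mutated_events / total_events

def mean(values):
    return sum(values) / len(values)

def gc_content(target20):
    """Percentage of G and C in the target sequence."""
    target20 = target20.upper()
    return 100.0 * sum(base in "GC" for base in target20) / len(target20)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Hypothetical guides: (promoter, 20-nt target, mutated events, total events)
guides = [
    ("OsU3", "ACGTACGTACGTACGTACGT", 2, 8),
    ("OsU3", "AGGCCGGCTAGCCGGATCGC", 3, 6),
    ("TaU6", "GATTATACATTAGCATATAT", 1, 4),
    ("TaU6", "GCCGCGGATCCGCGGCTAGC", 6, 8),
]
by_promoter = {}
for promoter, target, mutated, total in guides:
    by_promoter.setdefault(promoter, []).append(efficiency(mutated, total))
# Unweighted mean of the per-guide percentages, as described in the text.
averages = {p: mean(v) for p, v in by_promoter.items()}
r = pearson_r([gc_content(t) for _, t, _, _ in guides],
              [efficiency(m, n) for _, _, m, n in guides])
print(averages, round(r, 2))
```

Averaging percentages gives every guide RNA the same weight regardless of the number of transformation events behind it, which is one reason why such averages should be read with caution when event numbers are small.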

Although the sample number (three cases) in which one guide RNA was used to target two paralogous genes precluded a quantitative analysis, the mutation efficiency seemed to be in the same range for both target genes with 63%/88% for the first (GRMZM2G039538/GRMZM2G363552), 25%/25% for the second (GRMZM2G039538/GRMZM2G363552) and 75%/50% for the third guide RNA (GRMZM2G140302/GRMZM2G046086) (Table 2). Our results on the first two cases suggest that the difference in mutagenesis efficiency between two guide RNAs targeting the same gene was greater than the difference in mutagenesis efficiency between the two paralogs targeted by a single guide RNA.

Transmission of edited genes to the next generation

Mutations must be present in germinal cells to be passed on to the next generation. We, therefore, tested whether mutations detected in leaf material of T0 plants fulfilled this criterion. During the detection by PCR amplification and Sanger sequencing of leaf material of T0 transformation events, two categories of chromatograms indicative of editing were observed (Online Resource 2): (1) the first and most common category (85%) was characterised by a switch from a homogeneous chromatogram to two overlapping sequences with similar peak height (Online Resource 2a), indicating two alleles present in approximately the same proportion in the extracted DNA; (2) the second, less frequent category (15%) presented a main signal and a very weak overlapping signal in the chromatograms (Online Resource 2b), suggesting that the proportion of mutated DNA is very low compared to wild-type DNA. Without any exception, all mutations of the first category were systematically detected in the next generation. Transmission from the T0 to the T1 generation generally followed Mendelian segregation rules, suggesting that the edited alleles had been fixed and were present in all leaf cells and that the mutations had probably occurred early in the maize transformation process, likely during the callus formation step. It should be noted that all alleles presented in Table 1, including the alleles with multiple edits, were of this type. With regard to the second category, we never observed any transmission of the mutations to the next generation, suggesting that the mutations were present in only a few leaf cells and that the mutations had probably occurred during leaf development. These data indicate that, although chimeras may exist in maize, fully edited T0 plants are predominant and that the distinction between chimeric and fully edited T0 plants can be made on the basis of the Sanger chromatograms.
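How segregation was assessed is not detailed here. As an illustration only, agreement with a Mendelian ratio can be checked with a chi-square goodness-of-fit test, sketched below in Python under the assumption of a heterozygous T0 plant backcrossed to wild type (expected 1:1 ratio of mutant to wild-type T1 plants). The function names and counts are hypothetical.

```python
def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against a ratio."""
    total = sum(observed)
    scale = total / sum(expected_ratio)
    expected = [r * scale for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_5PCT_DF1 = 3.841   # chi-square critical value, 1 degree of freedom

def fits_ratio(observed, expected_ratio=(1, 1)):
    """True when a two-class segregation is compatible with the ratio at 5%."""
    return chi_square(observed, expected_ratio) < CRITICAL_5PCT_DF1

# Hypothetical T1 family: 12 plants carrying the edit, 8 wild type.
print(chi_square([12, 8], (1, 1)), fits_ratio([12, 8]))
```

For a self-pollinated heterozygote, the same function could be called with three genotype classes and the ratio (1, 2, 1), using the critical value for two degrees of freedom instead.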

Limitation of off-target effects

During targeted mutagenesis of a gene of interest, the presence of additional mutations elsewhere in the genome should ideally be minimised. To reduce unintended off-target effects at sites with no homology to the target site, primary transformants were backcrossed to the parent line A188 and only Cas9-free mutant T1 plants, in which the T-DNA had been segregated away based on a negative PCR assay for the BAR gene (Online Resource 3), were used for subsequent molecular or phenotypic characterisations. Furthermore, the use of at least two independent mutations (transformation events) and of several plants per mutation will make it possible to establish a clear link between the mutated gene and observed phenotypes.

To minimize off-target effects at sites with substantial homology to the target, a rigorous design of the 20-nt recognition sequence was put in place (see “Materials and methods”). Since the objective of obtaining knockouts left a lot of freedom as to the exact position of the targets in the coding sequence, targets with high similarities to sites elsewhere in the genome (less than three mismatches) were excluded from the design whenever possible. In fact, early work on the specificity of CRISPR–Cas9 had established that cleavage at target sites with more than two mismatches to the sgRNA was generally extremely rare (< 0.01%), although exceptions existed (Mali et al. 2013; Pattanayak et al. 2013). While the design of the present study did not consider bulges (Lin et al. 2014), targets with similarity to other sites followed by NAG rather than NGG PAM sites were also excluded. In the case of strategy 3 (Fig. 2) targeting two or more paralogous genes with a sgRNA, these rules were not applied to the paralog(s) but maintained for the rest of the genome. As a quality control of the selected 20-nt recognition sequences used in this study, a customized BLAST search was performed (see “Materials and methods” section). For each of the 28 targets the most likely off-target was selected based on the BLAST score and the position of the mismatches relative to the 7–12-bp seed sequence close to the PAM (Online Resource 4). Three of the putative off-target sites with three mismatches were chosen for experimental analysis of off-target mutations by specific PCR amplification followed by Sanger sequencing on T0 plants (Online Resource 3). No editing was detected at any of the three sites.

Creation of multiple mutants

One of the advantages of the CRISPR–Cas9 technology is that it allows the creation of multiple mutants in a single step, thereby avoiding time-consuming crosses and/or backcrosses. With regard to unlinked genes located on different chromosomes, three double mutants were produced in the T0 generation using a construct with two guide RNAs, one for each gene (strategy 1 in Fig. 2). They concerned members of the same gene family in the case of GRMZM2G157313/GRMZM2G014499 (two double mutants in four transformation events, Table 2) and GRMZM2G059165/GRMZM2G120085 (1/3), and true paralogs in the case of GRMZM2G039538/GRMZM2G363552 (5/8). More importantly, multiple mutants were also obtained in genes that were tightly linked on the same chromosome, and for which the production of double knockout mutants would have been difficult to achieve. Double mutants were identified for GRMZM2G089517/GRMZM2G352274 [separated by 75 kb on chromosome 5 (one mutant found out of 14 transformation events)] and GRMZM2G145466/GRMZM2G573952 [located within 53 kb on chromosome 7 (1 out of 2)]. Finally, we successfully knocked out three small (< 600 bp) paralogous genes that are situated in the same region of chromosome 1. These genetically strongly linked genes are ZmEsr1, ZmEsr2 and ZmEsr3 (GRMZM2G046086, GRMZM2G315601, GRMZM2G140302) (Opsahl-Ferstad et al. 1997). Since ZmEsr2 and ZmEsr1 are separated by only 29 kb, and ZmEsr1 and ZmEsr3 by only 13 kb, the production of triple knockout mutants underlines the power of CRISPR–Cas9 technology. Using the CRISPR–Cas9 strategy 3 illustrated in Fig. 2, a plant with a frame-shift mutation in each of the three ZmEsr genes was obtained. By simple self-pollination, we were able to generate T1 plants homozygous for the three mutated ZmEsr genes, which are now available for functional analysis. However, no large deletions between the cleavage sites in the linked genes were found, despite specific PCR reactions designed to detect them.

Discussion

The present study examined CRISPR–Cas9-mediated targeted mutagenesis in maize aimed at routine use for functional genetics studies. Analysing mutations in 20 genes in genome-edited maize plants, it was conducted at a larger scale than previous studies in maize, which either simply demonstrated the feasibility for a single gene or addressed a maximum of five genes (Liang et al. 2014; Xing et al. 2014). It also focused on regenerated plants rather than protoplasts or calli, systematically analysed offspring and is the first study to use the inbred line A188. The results indicate that CRISPR–Cas9 is a robust technology for gene knockout in maize, and can be used to generate various types of mutations with a high frequency of success. Furthermore, it allows the production of double and triple mutants in tightly linked genes.

Three types of mutations were observed in the 93 mutant maize plants analysed: indels, larger deletions and local chromosome rearrangements. The occurrence of larger chromosome rearrangements, such as those reported recently for mouse embryonic stem cells (Kosicki et al. 2018), cannot be excluded but would not be detected with our method. Indeed the detection method, based on PCR amplification and subsequent Sanger sequencing, can only detect mutations in which the two primer-binding sites on either side of the putative cleavage site are conserved in head to head orientation and remain at a distance allowing standard PCR amplification. Small indels as produced in the case of classical NHEJ repair (Ma et al. 2016; Bortesi et al. 2016) were, as expected, the most frequent outcome (80%, 74/93) and were documented for each of the 18 genes that were successfully mutagenized. They were generally located at, or close to, the putative cleavage site 3 bp upstream of the PAM. Larger deletions (> 10 bp) ranging from 11 to 136 bp were considerably less frequent (11%, 10/93) and concerned only 4/18 genes. Such deletions are thought to be generated by the MMEJ repair mechanism. Consistent with this, short (2 bp to 4 bp) microhomologies were present on both sides of the putative cleavage site in the wild-type sequence of GRMZM2G120085 (GC), GRMZM2G149940 (CCG) and GRMZM2G049141 (GACT), and the large deletions tended to correspond more or less precisely to recombination products between these direct repeats. In contrast, for GRMZM2G089517 larger deletions were more frequent than indels, the start and end points of the deletions were not conserved between events, and two other atypical mutations were obtained: a combination of a 7-bp deletion with an 11-bp insertion, and two point mutations flanking the PAM. The mechanism generating these atypical mutations remains unclear, although it is known that strand resection and random DNA synthesis can lead to unpredictable outcomes during MMEJ repair (Wang and Xu 2017; Sinha et al. 2017).
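The relationship between flanking microhomologies and MMEJ-type deletions can be made concrete with a small Python sketch. It searches a wild-type sequence for short direct repeats located on either side of a cleavage position and builds the deletion product that retains a single repeat copy. The sequence, the cut position, the search window and the function names are hypothetical and serve only to illustrate the principle; they do not reproduce the alleles reported here.

```python
def mmej_candidates(seq, cut, min_len=2, max_len=4, window=20):
    """Return (repeat, left_start, right_start, deletion_size) tuples.

    A candidate is a direct repeat with one copy ending at or before the
    cut and one copy starting at or after it, both within `window` bp.
    """
    seq = seq.upper()
    lo = max(0, cut - window)
    hi = min(len(seq), cut + window)
    found = []
    for k in range(min_len, max_len + 1):
        for i in range(lo, cut - k + 1):        # left copy, upstream of the cut
            motif = seq[i:i + k]
            for j in range(cut, hi - k + 1):    # right copy, downstream of the cut
                if seq[j:j + k] == motif:
                    found.append((motif, i, j, j - i))
    return found


def apply_mmej(seq, left_start, right_start):
    """Delete the interval between the two repeat copies, keeping one copy."""
    return seq[:left_start] + seq[right_start:]


# Hypothetical wild-type sequence with a GACT repeat on both sides of the cut.
WILD_TYPE = "TTTGACTCCCAAAGACTGGG"
CUT = 10
four_bp_repeats = mmej_candidates(WILD_TYPE, CUT, min_len=4, max_len=4)
product = apply_mmej(WILD_TYPE, 3, 13)
```

With shorter repeats (2 bp) many candidate pairs are usually found by chance, which is one reason why the presence of a microhomology alone does not prove that a given deletion arose by MMEJ.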

An unexpected allele was also detected for GRMZM2G046086 alongside five other classical indels. This allele consists of a 35-bp deletion accompanied by the insertion of a 61-bp DNA fragment copied from the intergenic region downstream of the gene (Fig. 3).

Importantly, defined deletions (6%, 6/93) of predetermined size and position were successfully provoked by the simultaneous action of two guide RNAs on target sites separated by 44 to 102 bp in a given gene. The precision of these deletion events was not perfect, since only one deletion was precisely of the expected size, whereas the other five contained indels of 1 bp or 2 bp at one or both ends of the deletion. In addition, this approach worked only for 3/7 targeted genes, and in the three successful cases indels at only one of the target sites were more frequent than the deletion. Since variations in target accessibility over such short distances in coding regions are unlikely, this suggests that very similar efficiency of the two guide RNAs is crucial for the successful generation of defined deletions. Differences in guide RNA efficiency may also partially explain why larger deletions involving target sites 13 to 75 kb apart in genetically linked paralogs were not detected, despite the fact that CRISPR–Cas9-mediated deletions of up to 120 kb have been documented in plants (Gantner et al. 2018).
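The expected size of a defined deletion follows directly from the two predicted cleavage positions, each located 3 bp upstream of its PAM. The following Python sketch shows this arithmetic under simple assumptions: coordinates are 0-based positions on the forward strand, `pam_start` is the leftmost base of the 3-nt PAM as it appears on that strand, and a cleavage position `c` denotes a cut between bases `c - 1` and `c`. The function names and example coordinates are hypothetical.

```python
def cut_position(pam_start: int, strand: str) -> int:
    """Predicted Cas9 cleavage position for a target on the given strand.

    '+' : protospacer lies upstream of the PAM, cut 3 bp before the PAM.
    '-' : the PAM appears as CCN upstream of the protospacer on the
          forward strand, cut 3 bp after the end of the PAM.
    """
    if strand == "+":
        return pam_start - 3
    if strand == "-":
        return pam_start + 3 + 3
    raise ValueError("strand must be '+' or '-'")


def expected_deletion(cut_a: int, cut_b: int) -> int:
    """Size in bp of the fragment excised between two cleavage positions."""
    return abs(cut_b - cut_a)


def is_precise(observed_size: int, cut_a: int, cut_b: int) -> bool:
    """True if the observed deletion has exactly the expected size."""
    return observed_size == expected_deletion(cut_a, cut_b)


# Hypothetical example: two forward-strand targets with PAMs at 100 and 150.
first_cut = cut_position(100, "+")
second_cut = cut_position(150, "+")
```

A deletion that deviates from `expected_deletion` by 1 bp or 2 bp would correspond to the imprecise events described above, in which a small indel accompanies the deletion at one or both junctions.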

The overall mutation efficiency (averaging the percentages for each guide RNA) of 53% was within the overall range (2–100%) of previous reports on targeted mutagenesis in maize, as was the 19% rate of biallelic mutations obtained (Liang et al. 2014; Xing et al. 2014; Lee et al. 2018). Since higher rates have been achieved in maize with the same basic elements (maize ubiquitin promoter, codon-optimized Cas9, cereal U3 or U6 promoters), the specific choices made during vector design, such as the choice of different versions of the ZmUbi promoter, the choice of the terminator, the position of promoter–Cas9 and Cas9–terminator junctions, as well as the presence of an NLS domain, of tags for immuno-detection or of introns in the Cas9-coding sequence, are possible parameters for optimisation. However, this suboptimal rate of biallelic mutations also has advantages in the context of functional genetics studies of genes involved in maize kernel development, since mutations could be lethal for the embryo and/or seedling in the homozygous state (Neuffer and Sheridan 1980; Doll et al. 2017). It is, therefore, preferable to generate heterozygous plants and to assess the (lethal) phenotype after self-pollination in segregating ears.
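Averaging the percentages obtained for each guide RNA is not the same as pooling all transformation events, because the former gives each guide equal weight regardless of the number of events. The short Python sketch below contrasts the two calculations. The first two count pairs (14/16 and 1/16) correspond to the two guide RNAs discussed for GRMZM2G352274, the third pair is hypothetical, and the result is therefore not meant to reproduce the 53% reported for the full data set.

```python
def per_guide_mean(counts):
    """Mean of per-guide mutation percentages (each guide weighted equally)."""
    percentages = [100.0 * mutated / total for mutated, total in counts]
    return sum(percentages) / len(percentages)


def pooled_rate(counts):
    """Percentage of mutated events after pooling all guides."""
    mutated = sum(m for m, _ in counts)
    total = sum(t for _, t in counts)
    return 100.0 * mutated / total


# (mutated events, transformation events) per guide RNA.
# The last pair is a made-up value added for illustration.
example_counts = [(14, 16), (1, 16), (2, 4)]

mean_of_percentages = per_guide_mean(example_counts)
pooled_percentage = pooled_rate(example_counts)
```

The two values diverge most when guides with few transformation events have extreme success rates, which is worth keeping in mind when comparing efficiencies between studies.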

More importantly, the mutation efficiency was very variable at different levels. First, 2 of the 20 genes could not be mutated at all, despite the use of two guide RNAs per gene and the generation of eight and two transformation events, respectively. Second, among the 18 genes successfully mutated, not all transformation events caused mutations. For example, in the case of GRMZM2G352274 only 1 of the 16 transformation events yielded a mutation. Third, in transformation events carrying novel mutations, not all guide RNAs present in the same plant induced mutations. The reasons for failure are likely linked either to the intrinsic quality of the sgRNA design or to the accessibility of the target sequence. Although the design of all sgRNAs followed the same rationale, the online and in-house tools used only ensure a relatively high minimum quality standard and do not exclude quality differences between the possible designs. The GC content of the binding site (Ren et al. 2014; Labuhn et al. 2018), the secondary structure of the sgRNA, and its capacity not only to guide but also to activate the nuclease activity of Cas9 are known to be important parameters (Liu et al. 2016). In this context, it is noteworthy that the GC content of both target sites in GRMZM2G035701 (failure) was relatively low (45%), whereas the GC content of the two sites in GRMZM2G149940 (targeted with success by the same construct in the same plants) was considerably higher (60% and 65%). The criteria for target site accessibility are less clear. Although Cas9 cleavage activity is not thought to be strongly affected by DNA CpG methylation (Hsu et al. 2013), it is generally accepted that the chromatin status of the target region influences the efficiency of CRISPR–Cas9 approaches, that DNase I hypersensitivity (DHS) is a good indicator for Cas9 binding (Wu et al. 2014) and that heterochromatin may be less accessible (Jensen et al. 2017). On the other hand, the accessibility of genes located in globally heterochromatic, centromeric regions of maize chromosomes to Cas9-mediated targeted mutagenesis has been demonstrated in protoplasts (Feng et al. 2016). In our study, the two recalcitrant genes GRMZM2G035701 and GRMZM2G040095 are located in gene-rich regions on the long arm of chromosome 8 and close to the end of chromosome 2, respectively. These regions do not present any obvious features explaining failure.
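The GC content of a 20-nt recognition sequence, one of the design parameters mentioned above, is straightforward to compute. The Python sketch below uses two invented sequences whose GC contents (45% and 65%) match values cited in the text; they are not the actual target sequences of the study, and the function name is an assumption.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(base in "GC" for base in seq)
    return 100.0 * gc / len(seq)


# Invented 20-nt sequences, for illustration only.
LOW_GC_TARGET = "ATGCATGCATGCATGCATAG"    # 9 of 20 bases are G or C
HIGH_GC_TARGET = "GCGCGCGCGCGCGATATATA"   # 13 of 20 bases are G or C
```

Such a function could be used to rank alternative target sites within a gene, bearing in mind that GC content is only one of several parameters influencing sgRNA efficiency.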

Differences in mutation efficiency between transformation events are expected, since the genomic environment is known to influence the expression level of transgenes, in the present case of the Cas9 and sgRNA genes. However, very low success rates, such as the single edit for GRMZM2G352274 in 16 transformation events, are difficult to explain by insufficient expression, in particular since the second guide RNA present in the same plants caused mutations in 14/16 events. In this as in other cases, the competition of guide RNAs of unequal quality, or differences in target gene accessibility, are more likely explanations for differences in successful mutagenesis than positional effects on transgene expression. Our study suggests that other parameters with a minor impact on mutation efficiency were the choice of the type III promoter with a preference for the TaU6 over the OsU3 promoter, and the choice of the DNA strand with mutagenesis improved by binding of the sgRNA to the template rather than non-template strand. This last observation is likely caused by a quicker release of the Cas9 from the template strand due to displacement by RNA polymerase II and faster repair of the DSB by the cellular machinery (Clarke et al. 2018). Overall, these results can be translated into five recommendations for gene knockout in maize: (1) the use of at least two guide RNAs per gene, (2) the generation of at least five transformation events, (3) the retargeting of recalcitrant genes with constructs targeting a single gene, (4) the use of maize or wheat U6 promoters, (5) the preferential use of target sequences on the coding strand.

Chimerism is an important issue in CRISPR–Cas9-mediated mutagenesis. In stably transformed plants constitutively expressing Cas9 and sgRNA genes, genome editing can occur at any time and in any number of cells during the life cycle of the plant. This raises the question of whether the mutations detected in the leaves or other organs of primary transformants will be present in germinal cells and thus transmitted to the offspring. Our results indicate that chimerism does occur, but that the majority of events detected in leaf material are fully edited and that sequencing chromatograms with overlapping sequencing peaks of equal height are predictive for transmission to the next generation. This is in agreement with earlier reports in maize (Liang et al. 2014; Xing et al. 2014) and seems to indicate that the majority of editing events occur very early on during the transformation of immature maize embryos, likely at the callus stage.

The ease of multiplexing is frequently cited as one of the major advantages of CRISPR–Cas9 technology over the use of other site-directed nucleases such as meganucleases, zinc finger nucleases or TALENs, and CRISPR–Cas9 constructs harbouring as many as 14 guide RNAs have been used successfully in Arabidopsis (Peterson et al. 2016). Three double mutants in gene family members residing on different chromosomes, two double mutants in paralogs separated by 53 kb or 75 kb, and a triple mutant in paralogs separated by 13 kb or 29 kb were generated in our study. These examples underline the power of CRISPR–Cas9 technology since the production of double or triple knockout mutants in tightly linked genes would have been nearly impossible to achieve by crossing of single mutants, and would have required the analysis of thousands of recombinants. Multiplexing is of particular interest in maize, which is an ancient tetraploid known to contain numerous functionally redundant paralogs, hampering functional analysis. As a result the production of multiple mutants by CRISPR–Cas9 will almost certainly become a prime tool for functional genomics studies in this species.
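The statement that combining tightly linked mutations by crossing would require the analysis of thousands of recombinants can be illustrated with a rough calculation. The Python sketch below assumes that the two single mutations are in repulsion in the F1 plant, that the recombination fraction of a short interval is approximately its genetic length in cM divided by 100, and that half of the recombinant gametes carry both mutations. The local recombination rate `cm_per_mb` is a free parameter, and the value of 1 cM/Mb used in the example is purely illustrative, since recombination rates vary widely along maize chromosomes.

```python
def recombination_fraction(distance_kb: float, cm_per_mb: float) -> float:
    """Approximate recombination fraction for a short physical interval."""
    centimorgans = distance_kb / 1000.0 * cm_per_mb
    return centimorgans / 100.0


def gametes_per_double_mutant(distance_kb: float, cm_per_mb: float) -> float:
    """Expected number of gametes screened per gamete carrying both mutations.

    In a repulsion-phase heterozygote, recombinant gametes occur at
    frequency r and half of them carry both mutant alleles.
    """
    r = recombination_fraction(distance_kb, cm_per_mb)
    return 2.0 / r


# Illustrative assumption: 1 cM per Mb in the region of interest.
ASSUMED_CM_PER_MB = 1.0
for_75_kb = gametes_per_double_mutant(75, ASSUMED_CM_PER_MB)
for_13_kb = gametes_per_double_mutant(13, ASSUMED_CM_PER_MB)
```

Under this assumption the expected screening effort lies in the range of thousands to more than ten thousand gametes per double mutant for intervals of 75 kb and 13 kb, respectively, and a lower local recombination rate would increase these numbers further.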

Author contribution statement

NMD, PMR and TW conceived and designed research; NMD, LMG, MFG, CR, YF, VMG, GG, and TW conducted experiments. MFG, CR and GG performed maize transformation. NMD prepared tables and figures. JJ performed bioinformatics analyses to produce Online Resource 1 and Online Resource 4. NMD, GI, PMR and TW wrote the manuscript. PMR and TW were involved in project management and obtained funding.