Introduction

Maize is a major cereal food crop worldwide, and most of its nutritive value is localized in the kernel. Historically, plant breeders and agronomists have increased the productivity of corn to keep pace with the demand for its traditional dietary uses. Over the past three decades, recombinant DNA technology has also been employed to increase yields by improving performance with respect to drought tolerance, pest resistance, and weed management (Kasuga et al. 1999; Kasuga et al. 2004; Funke et al. 2006; Morran et al. 2010b; Jouanin et al. 1998).

Recently, attention has turned to using plants as an alternate energy source to help supplement traditional fossil fuels and to provide a clean, indigenous, and renewable fuel source. Interest in biofuels has focused on the abundant and readily available starch obtained from the corn kernel, a precursor that can be converted easily to ethanol. Cornstarch accounts for the vast majority of biofuel in the USA today, and this alternate market has increased the demand for corn grain, such that biofuels now account for 40 % of corn production (http://www.ers.usda.gov/data-products/us-bioenergy-statistics.aspx).

Corn grain is a safe, inexpensive, and stable product. These intrinsic properties have prompted many additional applications, as well as the establishment of specialized processing methods to increase its functionality. Corn has been developed with altered kernel composition such as high lysine (Vasal 1994), high oil (Lambert 1994), and high protein (Dudley and Lambert 1992). These are exceptions, however, to the vast majority of the past work on corn, which focused on increasing yields without significantly changing the nature of the crop itself or altering its composition.

With the advent of recombinant DNA technology, commodity corn is now being used as a starting point to add completely new functionalities to the grain itself (Ramessar et al. 2008; Naqvi et al. 2011). Many of these new functionalities are conferred by the overexpression of specific proteins. Corn grain has key characteristics that favor protein overexpression, including high protein content (Shewry 2007), high levels of protease inhibitors (Habib and Fazili 2007), high carbohydrate content, and low water content, all of which aid accumulation of specific proteins in a stabilized form (Stoger et al. 2004; Lamphear et al. 2005).

New functionalities in corn grain can be achieved by adding specific recombinant proteins to exploit attributes for various outcomes. Some of these functions include (1) enhancing nutrition, by increasing the lysine content in corn seed using several methods, including the expression of lysine metabolic pathway genes, expression of high-lysine proteins, inhibiting lysine catabolic enzymes through RNA interference (RNAi) mechanisms, reducing lysine-poor zein protein levels, or combinations of these procedures (Frizzi et al. 2008; Houmard et al. 2007; Chiang et al. 2005); (2) expressing a protein that provides a high-intensity sweetener (e.g., brazzein) in the grain as a low-cost alternative to high-sugar snacks and cereals (Lamphear et al. 2005); and (3) expressing a vast array of industrial (Khan et al. 2013) and pharmaceutical proteins that promise to provide a low cost, animal-free source for applications in biofuels (Torney et al. 2007; Shetty et al. 2005), vaccines, and therapeutics (Daniell et al. 2001; Streatfield and Howard 2003b, 2003a; Ma et al. 2005; Ramessar et al. 2008; Boothe et al. 2010; Naqvi et al. 2011).

The common principle in these new applications is the reliance on the accumulation of specific proteins. This promise of increased functionality is only theoretical unless these proteins can accumulate at concentrations that are high enough to allow for economically viable products. Protein accumulation is inversely proportional to the cost of production and, therefore, one of the most critical factors leading to commercialization. Several reviews highlight a range of techniques to increase expression and accumulation of proteins in plants (Padh et al. 2010; Streatfield 2007; Mullis et al. 2012; Egelkrout et al. 2012; Hood et al. 2012; Table 3.1) including the various attributes that different host plants offer (Howard and Hood 2005b). This chapter focuses on strategies that have been used for the overproduction of recombinant proteins in maize grain.

Table 3.1 Promoters used for protein accumulation in various plant species. (Reproduced with permission from Egelkrout et al. 2012)

Protein Accumulation

The basic principles of protein accumulation can be accounted for by comparing the rate of recombinant protein expression to the rate of degradation. In practice, however, there are many reasons that make this much more complicated than a simple subtraction problem. Many of these factors have been described previously (Streatfield 2007; Egelkrout et al. 2012), and the intent of this chapter is not to repeat these general rules, but, instead, to focus on aspects specific to corn and to cite examples wherever possible.
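The comparison of expression rate and degradation rate can be made concrete with a minimal kinetic sketch. It assumes a constant synthesis rate and first-order degradation, which is a textbook simplification and not a model taken from the studies cited here. The function names and rate values are hypothetical and serve only as illustration.

```python
import math


def accumulated_protein(k_syn, k_deg, t):
    """Protein level at time t for dP/dt = k_syn - k_deg * P, with P(0) = 0.

    k_syn: synthesis rate (amount per unit time), assumed constant.
    k_deg: first-order degradation rate constant (per unit time).
    """
    if k_deg == 0:
        # Without degradation the protein simply accumulates linearly.
        return k_syn * t
    return (k_syn / k_deg) * (1.0 - math.exp(-k_deg * t))


def steady_state(k_syn, k_deg):
    """Plateau level reached when synthesis and degradation balance."""
    return k_syn / k_deg


# Hypothetical rates: halving degradation doubles the plateau,
# just as doubling synthesis would.
baseline = steady_state(k_syn=2.0, k_deg=0.5)
stabilized = steady_state(k_syn=2.0, k_deg=0.25)
```

In this simplified picture the plateau depends on the ratio of the two rates rather than on their difference, which is one reason strategies that stabilize a protein can matter as much as strategies that raise expression.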

Protein of Interest

A critical factor for accumulation of a protein in any host is the makeup of the protein itself. Although this holds for every expression system, corn kernels have shown advantages for the expression of otherwise recalcitrant proteins. One general class of proteins known for poor expression is the membrane proteins (Bernaudat et al. 2011). Membrane proteins are not only critical for cellular functions and cell recognition but are also of practical importance in some medically related products, such as subunit vaccines, and for structural analysis. Thus, they are a target for overexpression in many types of recombinant hosts (Mason et al. 2002; Bernaudat et al. 2011; Mus-Veteau 2010; D’Aoust et al. 2008; Ahmad et al. 2012).

While membrane proteins are not among the most highly expressed proteins in any system, they have accumulated much better in maize than when expressed in other recombinant hosts. An example is the hepatitis B surface protein, HBsAg, which has been commercialized as a subunit vaccine. HBsAg has been expressed in many recombinant systems including the yeasts Saccharomyces cerevisiae and Pichia pastoris, in cell cultures infected with recombinant baculovirus, vaccinia virus, and adenovirus (Cregg et al. 1987; Takehara et al. 1988; Davis et al. 1985; Mason et al. 1992), and in several plant hosts (Mekala et al. 2008; Guan et al. 2010; Pniewski 2013). One goal that has been undertaken to combat hepatitis is the development of an effective oral vaccine with this antigen. This could have dramatic outcomes, but a rate-limiting aspect has been the ability to express the antigen at the high concentrations required in an edible tissue for the oral vaccine to be administered in a food product. Expression levels differ by orders of magnitude among the plant systems, with the highest levels being reported in corn kernels (Hayden et al. 2012). This demonstrates the host advantage that corn can bring compared to some other plant tissues. Furthermore, this level was accomplished in non-optimized maize germ tissue, leaving great potential for even higher levels in the future (see discussion on optimization of germplasm).

The example above is dramatic for the high-level expression of a membrane-bound protein, but it is still at relatively low levels compared to results obtained with less refractory proteins. By contrast, thermostable proteins, such as cellulase and xylanase, have been shown, in general, to accumulate well in many plant systems (Herbers et al. 1995; Hyunjong et al. 2006; Xue et al. 2003; Jensen et al. 1996b; Ziegler et al. 2000a; Austin-Phillips et al. 1999b; Ruggiero et al. 2000). This generalization holds true for maize, and, as an example, the thermostable cellulase E1, an endo-β-1,4-glucanase, has been shown to accumulate at 0.13 % dry weight in grain (Hood et al. 2012), among the highest concentrations known to accumulate in any plant. Some representative examples of high levels of recombinant proteins are shown in Table 3.2. The values given represent expression based on whole kernels. However, the tissue specificity of the promoters implies that the embryo promoters provide a tenfold higher concentration of protein when the calculation is based solely on the germ tissue. Protein levels in the relatively small amount of pericarp tissue in the kernel were not quantified. However, high protein concentrations in the pericarp together with high expression levels could indicate significant accumulation.
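The tenfold figure follows from simple arithmetic, sketched below. The calculation assumes that all of the recombinant protein is confined to one tissue and uses the approximate embryo share of the kernel (about 10 %) given later in this chapter. The function name and the example concentration are hypothetical.

```python
def tissue_concentration(whole_kernel_conc, tissue_fraction):
    """Concentration within a single tissue, given the whole-kernel value.

    Assumes (for illustration only) that every unit of recombinant protein
    is located in that tissue, so the same amount of protein is divided by
    a smaller mass.
    """
    if not 0 < tissue_fraction <= 1:
        raise ValueError("tissue_fraction must be in the interval (0, 1]")
    return whole_kernel_conc / tissue_fraction


# Embryo taken as ~10 % of kernel mass; 0.1 is an arbitrary example value
# expressed on a whole-kernel basis.
germ_based = tissue_concentration(whole_kernel_conc=0.1, tissue_fraction=0.10)
fold_increase = germ_based / 0.1
```

If part of the protein actually resides in the endosperm or pericarp, the germ-based value would be correspondingly lower than this upper bound.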

Table 3.2 Examples of high-level protein accumulation in maize kernels

These examples illustrate not only that the nature of the protein of interest is critical in determining the expectations for overproduction but also the potential for high levels of accumulation, and the reason that the maize kernel is rapidly becoming a host of choice to overexpress many proteins (Ramessar et al. 2008; Naqvi et al. 2011). It is difficult to predict the specific reasons why some proteins have shown greater accumulation in maize grain because there are few studies by which direct comparisons can be made. The most likely reasons for high protein accumulation include an abundance of protease inhibitors, ample chaperones to ensure correct folding, high carbohydrate concentrations to stabilize protein, the large size of the kernel, and low water content, all of which have been discussed elsewhere (Streatfield 2007; Naqvi et al. 2011). From a pragmatic perspective, it is apparent that many proteins do express better in grain than in other systems. There are many specific strategies used to overexpress proteins, and the discussion below is focused on illustrating examples where specific strategies for maize grain have shown benefit. A partial listing of proteins produced in plants can be found in Khan et al. (2013; see Table 3.3).

Table 3.3 Partial list of industrial proteins expressed in plants. (Modified with permission from Khan et al. 2013)

Location, Location, Location

The real-estate mantra of location, location, location applies to accumulation of recombinant proteins in grain. With the aim to accumulate as much of the specific protein in the kernel as possible, the obvious choice is to obtain a promoter that would express in all tissues throughout the whole seed. If there is no reason to be concerned about toxicity in other plant tissues due to high expression (see discussion on protein toxicity, below), a strong constitutive promoter that expresses in all parts of the plant should work well. This strategy has been shown to be extremely efficient for the protein avidin when using the constitutive ubiquitin promoter, leading to some of the highest levels of expression reported in the kernel (Hood et al. 1997). Not all constitutive promoters are alike. Both CaMV and ubiquitin (Christensen and Quail 1996) promoters drive high expression in leaves, but very low levels of protein accumulate in the seed with the CaMV promoter (Stoger et al. 2005), while high levels were demonstrated in seed with the ubiquitin promoter (Witcher et al. 1998).

Accumulation in the kernel may be desired, but overexpression in other tissues may be detrimental to the plant (see discussion on protein toxicity, below). Most enzymes will significantly alter the metabolism of the cell when overexpressed. Therefore, it can be greatly advantageous, and in some cases essential, to have high expression in the kernel with little or no expression in other parts of the plant. Regarding the kernel, the endosperm accounts for the vast majority of the biomass, with the embryo (~ 10 %) and the pericarp or seed coat (~ 5 %) making up the remainder. In theory, a promoter could drive kernel-specific expression in all three of these tissues, but no natural promoter with this feature has been identified to date, nor has a synthetic one been created. This may be possible in the future, but at present one must rely on promoters that drive expression preferentially in one of these tissues.

At first glance, it would seem that the endosperm would be the best tissue for protein accumulation since it has the most biomass to store the protein. Strong endosperm-preferred promoters have been used and do show great utility (Schernthaner et al. 1988; Russell and Fromm 1997; Streatfield et al. 2004b). Interestingly, however, when the constitutive ubiquitin promoter was used, the majority of the recombinant protein accumulated in the embryo rather than the endosperm (Hood et al. 1997; Witcher et al. 1998; Zhong et al. 1999). One could argue that this is a specific feature of the ubiquitin promoter and would not hold true when strong endosperm promoters are compared to strong embryo promoters. However, the greatest accumulation of recombinant proteins in the seed, to date, has been achieved using embryo-preferred promoters (Stoger et al. 2005; Streatfield et al. 2010b; Egelkrout et al. 2012; Hood et al. 2012).

Promoters are not only responsible for tissue specificity; they are one of the most important factors driving the level of expression. A partial list of some maize promoters, along with other components that modulate expression, such as codon usage, terminator, and leader sequences, has been presented (Egelkrout et al. 2012; see Table 3.1). One aspect that modulates the levels of protein expression, and which is favored in monocotyledons compared to dicotyledons, is intron-mediated enhancement (IME). This phenomenon was first discovered in cultured maize cells (Callis et al. 1987). The first intron in many plant genes has been shown to increase accumulation up to tenfold through posttranscriptional mechanisms (Rose 2008). The enhancing effect of introns in plants was identified initially in Arabidopsis, but studies have shown that the first intron is the only one that shows this effect, and that no specific sequence appears to be responsible. Other researchers have found that certain introns function in monocotyledons, but not dicotyledons (Morita et al. 2012), although all introns that show the effect have the conserved motif “GATCTG.” The use of introns to provide an IME needs to be tested empirically.

Intracellular Targeting

Proteins within each tissue can be targeted to specific subcellular locations using well-characterized targeting sequences (Kermode 1996; Lau and Dale 2009). Chloroplasts in the leaves of plants have shown great potential for protein accumulation (Chebolu and Daniell 2009; De Marchis et al. 2012), but there are no functional chloroplasts in the kernel. While the cytoplasm would appear to have the advantage of a large volume for protein accumulation, this site has only provided modest expression levels at best (Hood et al. 2003). The most consistent intracellular targets for high-level expression in the seed have been the cell wall, vacuole, and endoplasmic reticulum. This was illustrated initially with laccase (Hood et al. 2003) and confirmed with several other proteins (Woodard et al. 2003; Clough et al. 2006; Hood et al. 2007). Each of these sites also permits glycosylation, which can be essential for correct folding and biological activity (Gomord et al. 2010; Solá and Griebenow 2010), or used to reduce clearance rates in pharmaceutical proteins (Doran 2000; Solá and Griebenow 2010).

However, in rare cases, such as when a protein of bacterial origin has an inadvertent glycosylation site in a particularly strategic position like the catalytic site, glycosylation can cause inactivation of the protein. The popular marker protein β-glucuronidase (GUS) is inactivated by glycosylation (Iturriaga et al. 1989; Farrell and Beachy 1990), which limits the native protein’s use as a marker when it is targeted to intracellular sites that glycosylate the protein. Thus, proteins targeted for expression should be scanned for potential sites of glycosylation.
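A first-pass scan of this kind can be sketched in a few lines. The example below searches an amino acid sequence for the canonical N-linked glycosylation sequon, N-X-S/T with X being any residue except proline. It is a minimal illustration and not a procedure described in the cited studies: the function name and test sequence are hypothetical, a sequon only marks a potential site, and O-linked glycosylation is not covered.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps the
# match zero-width after N, so overlapping sequons are all reported.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(protein_seq):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro).

    protein_seq: one-letter amino acid sequence; case is ignored.
    """
    seq = protein_seq.upper()
    return [match.start() + 1 for match in _SEQUON.finditer(seq)]


# Hypothetical peptide: N2 (N-A-S) and N9 (N-G-T) qualify; N5 is followed
# by proline and is therefore skipped.
sites = find_n_glyc_sequons("MNASNPTQNGT")
```

Positions flagged in this way would still need to be checked against the protein structure, since only sites near functionally important regions, such as the catalytic site, are likely to cause inactivation.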

Protein Toxicity

Many proteins possess biological activity that can interfere with metabolic processes in the host cell. This turns out to be one of the major limitations for high accumulation of many recombinant enzymes in foreign hosts. Even proteins that are not considered detrimental to metabolism can interfere when they accumulate at high concentrations. Some of the more obvious examples of proteins that can interfere with metabolism include proteases, glycosidases, phosphatases, and redox enzymes. Strategies to overexpress these proteins without causing toxicity have led to several options to sequester the activity of the protein and prevent it from interfering with the plant’s metabolism.

Avidin is a protein that binds tightly to biotin, an important vitamin and enzyme cofactor, and is an example of a protein that can cause toxicity by depleting biotin when accumulated at high concentrations in foreign host tissue. However, when sequestered in the apoplast, it can accumulate to high concentrations with few complications (Hood et al. 1997). At very high concentrations, however, it causes male sterility, so even this sequestration is not sufficient when a constitutive promoter is used. Another example of enzyme toxicity is illustrated by the protein laccase. In this case, free radicals are formed that, presumably, are detrimental when the enzyme is present at high concentrations. Protein accumulation was increased greatly by targeting the enzyme to the embryo, whose high oil and low water content retards radical formation (Galuszka et al. 2005; Riva 2006). Although embryo expression showed great promise, higher concentrations of laccase in seeds were inhibitory to germination. High-oil germplasm was used to overcome this damaging activity, with germination rates improving from 40 to 75 %. Furthermore, this germplasm also provided an increase in accumulation due to the increase in the ratio of germ size to kernel size (Hood et al. 2003; Hood et al. 2007).

Manganese peroxidase (MnP) is another example of an enzyme whose expression at high levels had a detrimental effect on the health of the plant. In particular, leaves and stems showed browning and compromised growth (Austin et al. 1995; Clough et al. 2005). Cofactor availability can be modulated in such cases to allow the expression of proteins that potentially interfere with cell metabolism, while limiting their activity (Hofrichter 2002). MnP was successfully accumulated in maize kernels by restricting expression to the seed. When the protein was subsequently extracted, there was only a low level of activity in the extract. However, when the cofactor, Mn, was added exogenously, protein activity was greatly increased, indicating that cofactor was required for optimal activity and was limiting in the plant (Clough et al. 2005). A similar situation was found to be the case for organophosphate hydrolase, which requires cobalt as its cofactor (Pinkerton 2004).

An alternate technology to accumulate enzymes that interfere with metabolism is to express the zymogen form of the enzyme that would be inactive in the plant but could be activated at a later time. Trypsin is an example of a protease that is very difficult to express at high levels in recombinant hosts because of its broad specificity to cleave proteins. However, expression was accomplished in maize kernels by expressing the zymogen (Woodard et al. 2003; Király et al. 2006). In addition to expressing the proenzyme trypsinogen, rather than the active enzyme, the protein was also targeted to the kernel where there is an abundant supply of protease inhibitors (Woodard et al. 2003). The combination of these strategies was needed to reach high levels. Other approaches to expressing zymogens may include intein technology which would allow for an inactive enzyme to accumulate in the plant tissue. Then, under the appropriate conditions, it would self-cleave to release the active protein (Raab 2010).

One tactic to limit toxicity in the plant is to use heat-activated enzymes. Many thermostable proteins only have activity at high temperatures not experienced during normal plant development. An example is a thermophilic cellulase, which would degrade the cell wall if it were active in the cell. At ambient temperatures, however, it is innocuous, and the enzyme can accumulate without any apparent effect on the plant (Ransom et al. 2007a; Biswas et al. 2006; Hood and Woodard 2002).

Another potential strategy to express a toxic protein is to place the gene under the control of a chemically inducible promoter, and to initiate expression shortly before harvest to moderate adverse effects on the host plant (Corrado and Karali 2009). Promoters have been used that are induced by physiological stress (Yi et al. 2010), or pathogen infection (Rana et al. 2012). This strategy was explored for enzymes such as cellulase (Lebel et al. 2005). While this method has considerable potential, it has provided only moderate levels of enzyme accumulation. Future efforts may require the use of a synthetic promoter that fuses high-expression promoters with inducible promoters.

Gene Silencing

A major concern limiting gene expression in plants has been the phenomenon known as gene silencing (Meister and Tuschl 2004; Moazed 2009; Huntzinger and Izaurralde 2011). This has not been a major problem in the case of seed-specific expression in maize. A lack of gene silencing effects may be due, in part, to the fact that the DNA sequence is known to play a large role, and the majority of gene-silencing events utilize the viral promoter, CaMV, which may be particularly prone to silencing. As noted earlier, seed-specific and endogenous promoters are used for high accumulation, which may alleviate much of the gene-silencing effects.

Multiple copies of the same gene can be introduced by the biolistic process, which can also jumble sequences when they are inserted. This was the case for aprotinin when expressed using a constitutive promoter. In some of these cases, variable levels of expression from the multiple copy inserts also indicated that gene silencing was occurring (Zhong et al. 1999). Increased protein accumulation was usually observed when multiple copies were inserted in a more precise manner using Agrobacterium-mediated transformation. However, in one case, using a gene for cellulase, there was evidence for lower expression when four identical copies of the gene and promoter were used, possibly due to recombination in the host (Egelkrout et al. 2013). Thus, copy number effects can be unpredictable and must be determined empirically.

Protein Stability

The ability to accumulate protein in a tissue is not only related to its expression but also to its degradation. The environment of the protein can be critical for this, and is presumably one of the main reasons different intracellular compartments can accumulate different amounts of the same protein. In the context of protein stability, it is pertinent to discuss posttranslational modifications. This begins with the presence of molecular chaperones and disulfide isomerase in maize seed to help fold the protein appropriately, since proteins that are inappropriately folded, or modified, may be targeted preferentially for degradation. Low proteolytic activity and desiccation of the seed also protect proteins from degradation (Naqvi et al. 2011). Proteolytic activity can be further minimized by removing known protease sites, or using plants expressing cathepsin D protease inhibitor. Protease inhibitors may serve a dual purpose by inhibiting the digestive proteases of insects that consume the seeds, as well as inhibiting endogenous proteases in the seed (Goulet et al. 2010; Schlüter et al. 2010).

Whole-Plant Genetic Strategies to Maximize Protein Concentrations in Seeds

Breeding and Selection

When molecular strategies for optimal protein expression in maize seed are satisfied, genetic means are then employed for increasing target protein accumulation. The transformation of foreign genes is normally not site specific in plant chromosomes, and, therefore, multiple high-expressing T1 lines from several independent events are usually screened to ensure recovery of high grain-yielding lines with high expression. One of the most interesting phenomena observed in the past several years is the ability to increase heterologous protein accumulation in grain through breeding and selection from plants derived from an initial transformation event. It is unclear what exact mechanisms are responsible or how applicable this is to other species, but, doubtless, it is a major strategy for increasing heterologous protein accumulation in maize seed.

When genes are transformed into corn, first-generation plants with the best recombinant protein levels are chosen for further breeding. Figure 3.1 illustrates the breeding scheme. As shown in Fig. 3.1a, 10–15 plants from the T1 generation representing several independent transgenic events from each transformation vector are propagated in the T2 generation. These plants are chosen because some of the seeds analyzed showed high expression (Fig. 3.1b). For example, plants CDN0201 and CDN0202 are better choices than CDN0303 and CDN0304 because each has seeds with very high expression levels, whereas CDN03 plants have much lower expression in their top seeds. Each T1 ear produces 20–50 seeds, in general. It was determined statistically that analyzing six individuals of that group of seeds would be representative of the range and variation of all seed from each plant. Thus, the remaining seed from each of these analyzed plants will reflect the same range and variation in expression as the six individuals analyzed. The “low-expressing” individuals in Fig. 3.1b (less than 2 % total soluble protein; TSP) represent background noise of null segregants. If single insertions are recovered, only one copy of the transgene is found on one chromosome without a duplicate on the paired chromosome. Therefore, when pollinated with a wildtype inbred plant, only half of the progeny will express the transgene. Thus, because T1 seeds segregate 1:1 for the transgene, when these seeds are planted, they must be screened for nulls so that only transgenic plants are propagated. Selection is accomplished by spraying plants with the herbicide, Liberty®, to which the transgenic plants are resistant. Transgenic plants remain green, while null segregants show extensive leaf damage or death. It is important in the early breeding generations to have more than one event represented because insertions can affect agronomic performance, including yield, in subsequent hybrids. When surviving plants are pollinated with either of two inbreds, they produce T2 generation seed. The two inbreds are the complementary parents of a high-producing hybrid, and both inbreds must carry the transgene for maximum protein production in grain.
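The 1:1 segregation described above can be reproduced by enumerating gametes, as in the following sketch. It assumes a single insertion inherited as one Mendelian locus, with "T" marking the transgene and "-" the empty locus; these symbols and the function names are hypothetical conventions for illustration. The last helper gives the chance that a six-seed sample contains at least one transgenic seed, which is a simple probability exercise and not the statistical analysis referred to in the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected progeny genotype frequencies for a single locus.

    Each parent is a two-character genotype such as "T-" (hemizygous),
    "--" (wildtype), or "TT" (homozygous). Every pairing of one allele
    from each parent is counted once, as in a Punnett square.
    """
    counts = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def transgenic_fraction(frequencies):
    """Share of progeny carrying at least one copy of the transgene."""
    return sum(f for genotype, f in frequencies.items() if "T" in genotype)


def prob_at_least_one(fraction_transgenic, sample_size):
    """Chance that a random sample includes one or more transgenic seeds."""
    return 1 - (1 - fraction_transgenic) ** sample_size


backcross = cross("T-", "--")   # hemizygous plant x wildtype inbred
selfed = cross("T-", "T-")      # hemizygous plant self-pollinated
```

Under these assumptions the backcross yields half transgenic and half null seed, whereas self-pollination yields three quarters transgenic seed, one third of which is homozygous.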

Fig. 3.1

Breeding scheme for selecting for higher target protein accumulation from first-generation independent transgenic events in corn. a. First-generation plants (T0) are regenerated from tissue culture. Each of the ten plants from each independent transgenic event is pollinated with an elite inbred in the glasshouse and seeds are collected. An average of 50 seeds per ear is recovered. Six individual seeds are analyzed singly for protein concentrations. The highest-expressing ears (10–15) are chosen from each vector, representing several events, and planted for continuing in the backcross program. T1 seed are planted and young plants screened for the transgenic trait by resistance to the herbicide, Liberty®. Half of the plants should be resistant. Some T1 plants are pollinated with the original inbred, and equal numbers are pollinated with an inbred that is compatible to produce a hybrid. This process is continued for six generations until sufficient elite germplasm is present in the transgenic line. b. Variability is observed in the single-seed analyses of T1 seed. Averages of seeds from T1 lines would mask the potential of the high-expressing lines. CDN02010 and CDN02020 are plants from a single event. CDN03030 and CDN03040 are plants from a second event. Seeds with values below 2 % TSP in these lines reflect background activity in the assay and are not transgenic. TSP total soluble protein

Each T2 ear recovered is analyzed individually using a random selection of 50 seeds. Each generation of plants produces ears with variable protein accumulation levels that cover a broad range of values (see Fig. 3.2). Although the amount of protein recovered per ear covers a broad range of values (Fig. 3.2a), the highest values in each generation increase (Fig. 3.2b; Hood et al. 2012). Additional seed from these highest-expressing ears is replanted the following season, screened for herbicide resistance, and crossed again to the elite inbred for the backcross program.

Fig. 3.2

a Range of values of recombinant protein accumulation in individual ears from a single backcross generation. Each bar is the value for a single 50-seed pool from an individual ear. The variation from 0.08 to 0.8 is tenfold. All ears are derived from a single transgenic event. b By planting only the highest-expressing ears from each generation, significantly higher expression levels can be achieved in subsequent generations, reaching equilibrium by generation T6

By the fourth or fifth generation, the breeding program selects one or two events for production. From the protein expression levels illustrated in Fig. 3.2, the top eight to ten ears would be chosen for replanting. Choices are also based on yield and field performance of the plants. Unfortunately, yield cannot be predicted before the hybrid lines are generated from the inbreds and grown for grain production as illustrated in Fig. 3.3. Thus, it is useful to have more than one event or line in the breeding program, even at this late stage of development. Six generations of inbred germplasm are generally used to move the transgenic event into elite lines. After the backcrossing is finished, the transgenic lines are self-pollinated twice to generate inbred lines that are homozygous for the transgenic trait.
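The expression-based part of this selection amounts to ranking ears by their pooled assay value and keeping the best few, as sketched below. The ear identifiers and values are hypothetical, and the sketch ranks on expression alone, whereas the actual choice also weighs yield and field performance as noted above.

```python
def top_ears(ear_values, n=10):
    """Return the n ear identifiers with the highest pooled expression.

    ear_values: mapping of ear identifier to the assay value measured on
    that ear's seed pool. Ears are returned best first.
    """
    ranked = sorted(ear_values.items(), key=lambda item: item[1], reverse=True)
    return [ear for ear, _value in ranked[:n]]


# Hypothetical pooled values for four ears from one backcross generation.
example_ears = {"ear_a": 0.08, "ear_b": 0.80, "ear_c": 0.35, "ear_d": 0.50}
selected = top_ears(example_ears, n=2)
```

Seed from the selected ears would then be replanted, screened for herbicide resistance, and crossed again, as described for the backcross program.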

Fig. 3.3

The breeding program is important to recover high-yielding plants for production in the field. Tissue culture-derived plants are grown in the glasshouse and pollinated separately with an inbred. In subsequent generations, the plants are pollinated with two inbreds to generate homozygous parent seed. The compatible parent inbreds are crossed to generate hybrid seed that then can be planted to produce grain for protein production

Observations commonly encountered in the breeding process include segregation of the Hi-II parental germplasm, high variability of expression in each ear, and a decrease in expression levels from the T1 to T2 generations. Thus, the highest-expressing seeds should be carefully selected for breeding in the T1 and T2 generations. The cellobiohydrolase I (an exocellulase) and E1 (an endocellulase) in Table 3.2 are examples that illustrate the result of moving from generation T1, the first-generation seed from the tissue culture-derived plants, to generation T2. T1 seed is analyzed singly, using six randomly chosen seeds from each recovered ear. As seen in this example, tremendous seed-to-seed variability is always observed in the first generation, presumably because of the hybrid transformation host Hi-II. Hi-II is a cross between A and B parents (Armstrong et al. 1991) that segregates in the ovules of first-generation reproduction. This segregating variability is compounded by pollination of the Hi-II ovules with an elite inbred to begin the movement of the transgene into production germplasm. The best T1 seed expression recovered from all T1 seed analyzed is illustrated in Table 3.4. However, T2 lines, in contrast to T1 lines, are screened using 50-seed pools from each ear, meaning that each sample comprises equal numbers of transgenic and null seeds, and that variably expressing seeds are mixed in this population. Thus, in T2, the recovered expression value often drops below the first-generation average seed values. Nevertheless, this result shows that improved protein accumulation is occurring because the average expression includes null seeds. Choosing the highest-expressing ears in T2 for replanting allows recovery of higher-expressing ears in subsequent generations. This strategy, while more complex than that used with many other plants, has been successful for more than 12 genes and, in each case, resulted in expression levels more than tenfold higher than the initial level in the T1 seed.
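The dilution effect of pooled T2 screening described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that null seeds contribute no recombinant protein and that the pool contains the equal numbers of transgenic and null seeds stated in the text. The function names and the numerical values are hypothetical and serve only as illustration; they are not taken from the tables.

```python
def pooled_expression(positive_mean, transgenic_fraction=0.5):
    """Expected value measured on a seed pool, assuming null seeds contribute zero."""
    if not 0.0 <= transgenic_fraction <= 1.0:
        raise ValueError("transgenic_fraction must lie between 0 and 1")
    return positive_mean * transgenic_fraction


def positive_mean_from_pool(pool_value, transgenic_fraction=0.5):
    """Back-calculate the mean of the transgenic seeds from a pooled measurement."""
    if not 0.0 < transgenic_fraction <= 1.0:
        raise ValueError("transgenic_fraction must be greater than 0 and at most 1")
    return pool_value / transgenic_fraction


# Hypothetical values in arbitrary expression units (not data from this chapter).
t1_positive_mean = 0.10   # average of positive T1 seeds, assayed singly
t2_pool_value = 0.08      # value measured on a 50-seed T2 pool

# The pool reads lower than the T1 positive mean, yet the transgenic seeds
# within it average twice the pooled value under the stated assumptions.
implied_t2_positive_mean = positive_mean_from_pool(t2_pool_value)
print(f"Implied mean of transgenic T2 seeds: {implied_t2_positive_mean:.2f}")
```

Under these assumptions, a pooled T2 value that falls below the T1 single-seed average can still correspond to higher accumulation in the transgenic seeds themselves, which is the point made in the text.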

Table 3.4 E1 and CBHI transgenic events and level of enzyme accumulation in the average of all positive T1 seeds, and highest T2 ear from each event. Six seeds were used separately for enzyme assay from each T1 transgenic plant, and 50-seed bulks were analyzed for T2 ears.

Germplasm Pools

Types of corn produced include sweet corn, popcorn, and dent corn, with various minor types such as waxy corn and colored corn. Dent corn has, by far, the largest acreage and is used for ethanol, animal feed, and processed corn products. A wide array of varieties and stocks of germplasm pools are available representing the genetic diversity of dent corn available for current breeding (Mikel and Dudley 2006; Mikel 2011), including Oh43, Lancaster, Oh07-Midland, Iodent, the commercial hybrid-derived Maiz Amargo, and Stiff Stalk varieties. Combining germplasm from different groups allows strong heterosis for commercial hybrids. B73, a Stiff Stalk variety, and Mo17, a Lancaster variety, are the most frequently used germplasm backgrounds for generating commercial hybrids. They are often crossed with other germplasm pools to create unique material that is used subsequently in commercial hybrids (Mikel and Dudley 2006). The take-home lesson is that corn germplasm is extremely diverse, and current hybrids have only begun to tap into the possibilities to enhance recombinant protein accumulation.

Specialized germplasm with specific characteristics that allow high protein accumulation is of interest for breeding programs. Examples of germplasm groups with valuable traits include high-oil phenotypes with large embryos, high-protein phenotypes with reduced endosperm volume (Dudley and Lambert 1969), and opaque-2 mutants with reduced zein (Puckett and Kriz 1991). Each of these genotypes has a mechanism that allows maximizing embryo-localized protein recovery on a weight basis (Hood and Howard 2009). Several recombinant proteins in maize, i.e., laccase, avidin, MnP, brazzein, aprotinin, and trypsinogen, were tested with these germplasm pools. All crosses yielded a significant increase in recombinant protein accumulation in either high-oil or opaque-2 backgrounds. When laccase lines were crossed to high-oil lines, improvements were seen in germination as well as protein accumulation (Hood et al. 2003). High oil also improves protein accumulation above what would be expected from the increase in germ size. The high-oil crosses could be particularly interesting from a production standpoint because they are commercial lines with high yields. Other specialized pools, e.g., high protein and opaques, may have limited utility because of lower yields from those lines. Nevertheless, as is true for elite germplasm, the possibilities are vast for genetic manipulation to maximize recovery of traits of interest.

The sequence of the B73 maize genome was published in 2009 (Schnable et al. 2009), providing a powerful tool for understanding much of the molecular and genetic variation among varieties and germplasm pools by providing a basis for comparison across genetic lines (Lai et al. 2010). Indeed, with the cost of DNA and RNA sequencing declining rapidly, detailed comparisons can be made among similar genetic lines to identify variations in coding loci, insertions and deletions, and single-nucleotide polymorphisms (SNPs), as well as low-sequence-diversity intervals (Lai et al. 2010). These comparisons can inform genome dominance in crosses and inheritance of variability that may be associated with particular traits of interest, such as high-protein accumulation in seed.

To date, the generational increases in protein accumulation have been determined empirically. High- and low-expressing lines in each generation are identified only by quantifying the recombinant protein in each ear recovered, often requiring as many as 3000–4000 analyses from a backcross nursery of 500 rows. Molecular markers that identify relevant loci could be used in earlier generations to select promising lines to continue breeding into elite or preferred specialty germplasm, potentially eliminating the time-consuming protein analysis on each progeny ear.

In an effort to identify the factors that contribute to the increase in protein accumulation during breeding and selection, transcriptome sequencing of high- and low-expressing lines was conducted. High and low lines recovered from the same generation were analyzed for differences in gene expression. Those differences would potentially be the basis for the genetic factors that determine the ability to increase gene expression and protein accumulation at each generation. Current transcriptome sequencing experiments have described embryos at 15, 21, and 27 days after pollination (DAP; Teoh et al. 2013). In these experiments, an unidentified storage protein gene in the cupin family is expressed at higher levels than globulin-1, the protein previously determined to be present at the highest concentrations in maturing embryos (Belanger and Kriz 1991). Data such as these could yield new regulatory sequences that could change the methods and level of recovery of recombinant proteins. Mining the genome will yield many new tools, but will require a great deal of effort to identify the genes or sequences of interest.
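A comparison of high- and low-expressing lines of the kind described above is often summarized as a log2 fold change per gene. The following is a minimal sketch of that calculation, not the analysis pipeline used in the cited experiments. It assumes normalized expression values are already available for each line at a single time point; the gene identifiers, values, and pseudocount are hypothetical. A real analysis would also use biological replicates and a statistical test.

```python
import math


def log2_fold_change(high, low, pseudocount=1.0):
    """log2 ratio of expression in the high line to the low line.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are not detected in one of the lines.
    """
    return math.log2((high + pseudocount) / (low + pseudocount))


def rank_genes(high_values, low_values, pseudocount=1.0):
    """Rank genes present in both lines by absolute log2 fold change, largest first."""
    shared = high_values.keys() & low_values.keys()
    scored = [
        (gene, log2_fold_change(high_values[gene], low_values[gene], pseudocount))
        for gene in shared
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical normalized expression values at one time point (e.g., 21 DAP).
high_line = {"gene_a": 799.0, "gene_b": 99.0, "gene_c": 49.0}
low_line = {"gene_a": 99.0, "gene_b": 99.0, "gene_c": 199.0}

ranked = rank_genes(high_line, low_line)
for gene, lfc in ranked:
    print(f"{gene}: log2 fold change = {lfc:+.2f}")
```

Genes at the top of such a ranking would be candidates for follow-up, whether they are higher or lower in the high-accumulation line.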

Additional studies of messenger RNA (mRNA) sequences between isogenic high- and low-protein accumulation lines from the same generation at 15, 21, and 27 DAP show some interesting differences in abscisic acid synthesis genes as well as increases in a number of unannotated genes. It is planned to continue this analysis to identify loci and alleles that account for the majority of the high-accumulation phenotype, similar to quantitative trait loci (QTLs), and also determine if SNPs can be associated with those loci. The SNPs would be convenient tools for early selection during breeding.

Containment Principles

Many proteins being expressed in maize are intended for industrial and pharmaceutical purposes. Additional regulatory requirements, above those for input traits, must be addressed to avoid intermixing with food/feed corn. Regulatory guidelines outline containment management practices to prevent the inadvertent introduction of these proteins into the food chain; these practices follow the same principles used for other food organisms (e.g., bacteria, yeast, and eggs) and have proven to be very effective. In addition, the United States Department of Agriculture (USDA) has added regulatory guidelines for containment management practices as they relate specifically to plants. Maize pollen is relatively heavy and neither survives long under desiccation nor travels far, so physical isolation is a viable strategy (Luna et al. 2001; Ma et al. 2004). Genetic strategies to prevent intermixing may be desirable to complement physical isolation (Lee and Natesan 2006; Al-Ahmad et al. 2004; Daniell 2002) to alleviate some of these onerous requirements and provide greater confidence to the public.

Male sterile corn is an obvious method to prevent inadvertent pollen transfer. Methods for this are well established using a cytoplasmic male sterility system (Dewey et al. 1987). In addition, other systems have been proposed that rely on the preferential expression of proteins in the anther and pollen that devitalize the pollen. Several methods have been described that allow for restoration of viable pollen (Schnable and Wise 1998; Weider et al. 2009). This has the added benefit of being linked to the foreign gene of interest and may be a useful tool in the future.

Another example of containment is to control germination. Systems, such as terminator technology and controlled germination, have been proposed that manipulate the germination of seeds (Lee and Natesan 2006; Schernthaner et al. 2003; Oliver and Hake 2012). These approaches could increase flexibility in production of selected products, but a practical system is not currently available.

One recommendation that often comes up in relation to genetically engineered (GE) plants that express pharmaceutical proteins, vaccines, or industrial enzymes, i.e., nonfood traits in a food crop, is having some visual marker that allows identification of the transgenic lines. For maize, the most obvious way to track a GE crop with proteins in the seed is to mark the seed coat with a color. An obvious choice for driving expression of a visual marker is the promoter for the extensin gene in maize because it is highly expressed in silk and pericarp (Hood et al. 1993). However, two series of experiments have subsequently failed to demonstrate that this promoter is active in pericarp, one using an 840 bp region upstream of the extensin gene, and the other using a 1978 bp region upstream of the extensin gene that contains several repeated regions that could account for differential expression in multiple tissues of this single-copy gene. An independently identified pericarp promoter actively promotes expression of beta-glucuronidase in pericarp tissues at relatively high levels. This promoter could be coupled with a reporter gene that would allow field identification of GE plants by cursory examination rather than by molecular analysis.

Reporter genes are needed in combination with seed coat-preferred promoters. For example, a fluorescent protein could be detected in the field or storage bin using a hand-held ultraviolet light source, although in bright sunlight the detection would be difficult. Alternatively, flavonoids, carotenoids, or xanthophylls could be used as long as they are active in the germplasm of interest. These genes often require activation loci which are not present in all germplasm sources, for example, the b1 locus in maize (Selinger and Chandler 2001).

Summary and Conclusions

Maize has been manipulated for centuries in order to improve its ability to provide a reliable supply of food and feed. This highly efficient production platform is now being developed as a source for industrial products, as well as for new uses that are continuing to emerge. The most common approach to increase the crop’s utility for new products relies on the high level of expression of novel proteins in the kernel. Maize has proven to be one of the most useful crops to meet this need for several reasons, including its low cost of production, its inherent safety as a food and feed product, its demonstrated ability to express novel genes at high concentrations, the diverse germplasm available to customize the novel protein expression, and its ability to integrate the novel proteins directly into food, feed, and industrial applications without the need for purification of the protein.

Genetic manipulation at both the molecular and whole-plant levels can help maximize protein accumulation. The technology is well suited for cost-effective production of large-volume, low-cost proteins and/or for avoiding human pathogens in the final product. Because of this potential, a number of studies are underway with the aim of producing new foods, feeds, vaccines, pharmaceuticals, and industrial products.

This potential for making new products has led researchers to investigate novel ways of increasing expression. The kernel has proven to be a very effective site for overaccumulation of proteins, aided by its inherent qualities: sequestration of active proteins in a tissue with relatively low metabolic activity, reduced concerns over gene silencing and proper folding, high levels of protease inhibitors to limit degradation, and multiple methods to restrict gene flow to address regulatory concerns. With these advantages, the maize seed will continue to be the system of choice for high-volume output traits until such time that a customized plant can be generated without the concern for food/feed intermixing (Howard and Hood 2005a).