11.1 Introduction

Soybean (Glycine max L. Merr.) is among the richest and cheapest annual commercial legumes (Fabaceae) and a major dual-use oilseed and protein crop that supplies a large share of global vegetable oil and protein (Broun et al. 1999; Kim et al. 2012). In 2018, soybean accounted for approximately 60% of the world’s oilseed production (http://soystats.com). Cultivated soybean seeds have the highest protein content among legumes (around 40%, compared with 20–30% in other legumes) and contain approximately 18–22% oil (Patil et al. 2018; Liu 1997). Soybean oil is a complex mixture of five fatty acids: palmitic acid (10%), stearic acid (4%), oleic acid (18%), linoleic acid (55%), and linolenic acid (13%). The oil also plays a role in the growth, development, and reproductive biology of the soybean crop (http://www.soystats.com; Lee et al. 2007; Mekhedov et al. 2000; Clemente and Cahoon 2009). Soybean oil is one of the key sources of edible vegetable oil and is used both in food products (for instance, margarine, cooking oils, salad dressings, and beverages) and in industrial applications (for instance, sizing for cloth, plastics, fire-extinguisher fluids, and soy-based biodiesel, whose production used more than 1 billion gallons of oil in 2011), and it serves as a bioenergy resource for sustainable development (Mekhedov et al. 2000; http://www.soystats.com). Lecithin, a soybean oil-based emulsifier and lubricant, finds applications ranging from pharmaceuticals to protective coatings (http://www.soystats.com). Growing health awareness and increased demand for soybean seed oil have drawn attention to the quality and content of the oil. Soybean oil quality depends on fatty acid composition, which influences the flavor, dietetic value, and stability of the oil. Unsaturated fatty acids help regulate the immune system, blood clotting, neurotransmission, cholesterol metabolism, and the organization of membrane phospholipids in the brain and retina.
However, unsaturated fatty acids are prone to oxidation, which causes off-flavors and reduces the shelf life of the oil (Smouse 1979; Mounts et al. 1988; Abedi and Sahari 2014). To address these needs, soybean breeding has made substantial progress in increasing yield. Progress in high-yielding germplasm has nevertheless been marginal because seed oil and protein contents are inversely associated. The crop needs an increased content of monounsaturated fatty acids (e.g., oleic acid) and a decreased content of polyunsaturated fatty acids (e.g., linoleic acid and linolenic acid). This would increase the oxidative stability of soybean oil and improve its fatty acid profile and total seed oil content, with increased health benefits for humans (Clemente and Cahoon 2009; Lee et al. 2012). The genetic variation for oil content is smaller than that for protein content, which makes it challenging to increase oil content genetically. Several induced genetic changes, involving the fatty acid desaturase gene GmFAD3 and the silencing of genes in fatty acid metabolic pathways, have improved oil composition and reduced the Gly m 4 (C6T3L5) allergen in soybean seed, yielding value-added soybeans (Nazrul et al. 2019).
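The composition figures quoted above can be grouped by degree of unsaturation to show why oxidative stability is a concern. The following is a minimal sketch, not part of the cited work: the percentages are those given in the text, the carbon:double-bond shorthand (16:0, 18:0, 18:1, 18:2, 18:3) is the standard notation for these five fatty acids, and the function names are illustrative.

```python
# Fatty acid composition of soybean oil as quoted in the text (% of total oil).
# Keys carry the standard "carbons:double bonds" shorthand in parentheses.
COMPOSITION = {
    "palmitic (16:0)": 10,
    "stearic (18:0)": 4,
    "oleic (18:1)": 18,
    "linoleic (18:2)": 55,
    "linolenic (18:3)": 13,
}


def double_bonds(name: str) -> int:
    """Read the number of double bonds from the shorthand inside the parentheses."""
    shorthand = name[name.index("(") + 1 : name.index(")")]
    return int(shorthand.split(":")[1])


def class_totals(composition: dict) -> dict:
    """Sum the percentages per class: saturated, mono- and polyunsaturated."""
    totals = {"saturated": 0, "monounsaturated": 0, "polyunsaturated": 0}
    for name, percent in composition.items():
        bonds = double_bonds(name)
        if bonds == 0:
            totals["saturated"] += percent
        elif bonds == 1:
            totals["monounsaturated"] += percent
        else:
            totals["polyunsaturated"] += percent
    return totals


totals = class_totals(COMPOSITION)
print(totals)
# {'saturated': 14, 'monounsaturated': 18, 'polyunsaturated': 68}
```

On these figures, roughly two thirds of conventional soybean oil is polyunsaturated, which is the oxidation-prone fraction that breeding and engineering aim to shift toward oleic acid.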

11.2 Biosynthetic Pathway of Lipids (TAG) in Seeds

Vegetable oil synthesis is a complex process. Oilseed plants are renewable sources of fatty acids, which are a major component of oil, and they accumulate these fatty acids in seeds in the form of triacylglycerols (TAGs) (Thelen and Ohlrogge 2002). De novo fatty acid (FA) biosynthesis takes place mainly in the stroma of chloroplasts; the fatty acids are then exported to the cytoplasm, where they enter two interconnected metabolic pathways, the acyl-CoA-dependent pathway and the acyl-CoA-independent pathway (Okuley et al. 1994; Ohlrogge et al. 1979; Reed et al. 2000; Kroon et al. 2006). The acyl-CoA-dependent pathway, or Kennedy pathway, draws on nascent acyl chains that are primed via acetyl-CoA carboxylase (ACCase), elongated via malonyl-CoA up to 18 carbons in length, and used as direct precursors (Kennedy 1961). Glycerol-3-phosphate acyltransferase (G3PAT) catalyzes the transfer of a fatty acid to glycerol-3-phosphate (G3P) to form lysophosphatidic acid (LPA). LPA is then acylated by cytosolic lysophosphatidic acid acyltransferase (LPAAT), which incorporates oleic acid at the sn-2 position to produce phosphatidic acid (PA). PA is dephosphorylated by phosphatidic acid phosphatase (PAP) to form diacylglycerol (DAG). The last step, DAG acylation, is catalyzed by the key rate-limiting enzyme diacylglycerol acyltransferase (DGAT), found in oil bodies or the endoplasmic reticulum (ER), which transfers an acyl group from acyl-CoA to the sn-3 position of sn-1,2-diacylglycerol to synthesize the neutral lipid triacylglycerol (TAG) in plant seeds (Kamisaka et al. 1997; Cagliari et al. 2010; Jako et al. 2001). TAG is important for oil formation in seeds and is synthesized, assembled, and accumulated mainly in oil bodies (Barthole et al. 2012).

The acyl-CoA-independent pathway uses phospholipid:diacylglycerol acyltransferase (PDAT) for the final acylation reaction. PDAT transfers one acyl group directly from phosphatidylcholine (PC) to DAG to produce TAG (Dahlqvist et al. 2000). Regulation of the genes encoding the complex enzyme machinery of the lipid biosynthetic pathway may affect oil content in seeds (Okuley et al. 1994; Reed et al. 2000). This complex genetic control of oil concentration and composition is challenging to study but offers prospects for increasing oil content and improving fatty acid composition (Liu et al. 2013).
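The sequence of acylation steps described in this section can be written down as a small data structure. This is only an illustrative sketch: the enzymes, substrates, and products are those named in the text, the acyl donors are the ones the text implies for each route, and the `Step` type and `trace` helper are hypothetical names introduced here.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str
    acyl_donor: Optional[str]  # None for the dephosphorylation step


# Acyl-CoA-dependent (Kennedy) route: G3P -> LPA -> PA -> DAG -> TAG
KENNEDY_PATHWAY = [
    Step("G3PAT", "G3P", "LPA", "acyl-CoA"),
    Step("LPAAT", "LPA", "PA", "acyl-CoA"),
    Step("PAP", "PA", "DAG", None),
    Step("DGAT", "DAG", "TAG", "acyl-CoA"),
]

# Acyl-CoA-independent route: same first three steps, final acylation by PDAT,
# which takes the acyl group from phosphatidylcholine (PC) instead of acyl-CoA.
PDAT_PATHWAY = KENNEDY_PATHWAY[:3] + [Step("PDAT", "DAG", "TAG", "PC")]


def trace(steps: List[Step], start: str) -> List[str]:
    """Follow the steps in order and return the chain of intermediates.

    Raises ValueError if a step's substrate is not the previous product.
    """
    current = start
    chain = [current]
    for step in steps:
        if step.substrate != current:
            raise ValueError(f"{step.enzyme} expects {step.substrate}, got {current}")
        current = step.product
        chain.append(current)
    return chain


print(" -> ".join(trace(KENNEDY_PATHWAY, "G3P")))
print("final acyl donors:", KENNEDY_PATHWAY[-1].acyl_donor, "vs", PDAT_PATHWAY[-1].acyl_donor)
```

Both routes share the intermediates up to DAG and differ only in the enzyme and acyl donor of the last step, which is why DGAT and PDAT are the usual engineering targets for TAG accumulation.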

11.3 Genomic Traits Linked to Soybean Seed Oil

Fat biosynthesis in soybean comprises the synthesis, termination, release, and desaturation of fatty acid chains and the synthesis of triacylglycerols (TAGs), along with the formation of polyunsaturated fatty acids and liposomes. The synthesized fat is deposited as glycerides and phospholipids and as TAG in seeds (Mekhedov et al. 2000). A number of lipid metabolism genes have been identified and verified (Barthole et al. 2012). Four types of DGAT occur in plants (Yen et al. 2008). The ten DGAT genes in soybean belong to the DGAT1, DGAT2, or DGAT3 subfamilies and are distributed on different chromosomes. DGAT1 is primarily present in germinated and mature seeds; DGAT2 occurs in nodules, leaves, flowers, green pods, and mature seeds; and DGAT3 is expressed in leaves, roots, and seeds (Liu et al. 2013). In transformed soybean seed cells, the expressed UrDGAT2A was localized to the endoplasmic reticulum and the oil body membrane (Lardizabal et al. 2008). Thus, DGAT expression is related to variation in seed oil content. Transgenic soybean hairy roots expressing GmDGAT2D synthesize TAG containing 18:1 or 18:2, whereas hairy roots expressing GmDGAT1A use 18:3 acyl-CoA for TAG biosynthesis (Chen et al. 2016). Genetic engineering, gene editing, or RNA interference (RNAi) of key enzyme-encoding genes can activate or deactivate enzyme activity and thereby improve oil quality. The first DGAT1 gene was cloned in mice (Cases et al. 1998) and then in Arabidopsis thaliana (Hobbs et al. 1999); DGAT1 genes were also cloned from Tropaeolum majus (Xu et al. 2008) and Ricinus communis (He et al. 2004), and DGAT2 genes from Mortierella ramanniana (Lardizabal et al. 2001) and Arabidopsis thaliana (Salanoubat et al. 2000). Genome-wide association studies (GWAS) have also discovered genes linked with oil content and composition (Cao et al. 2017; Zhang et al. 2018a).
The content and composition of soybean seed oil are regulated by multiple quantitative trait loci (QTLs)/genes and are also influenced by the environment (Burton 1987; Diers et al. 1992). More than 322 QTLs associated with seed oil and 228 QTLs associated with fatty acids, located on the 20 chromosomes, are recorded in the SoyBase database (Diers et al. 1992; Spencer et al. 2004; Shibata et al. 2008; Bachlava et al. 2009; Qi et al. 2011; Sun et al. 2011; Mao et al. 2013; Pathan et al. 2013; Ha et al. 2014; Kim et al. 2010; https://www.soybase.org). QTL regions at 1.64–2.09 Mb and 33.35–35.95 Mb on Chr. 20 are associated with oil content, and a QTL region at 44.58–48.58 Mb on Chr. 14 is associated with the linolenic acid content of seeds (Qi et al. 2011; Spencer et al. 2004; Bachlava et al. 2009; Csanadi et al. 2001; Wang et al. 2014; Han et al. 2015; Reinprecht et al. 2006; Patil et al. 2018; Xie et al. 2012). Based on high-density genetic maps, one QTL on Chr. 05 (qOil-5) for seed oil, two QTLs (qOil10–1 and qOil10–2) for oil content, and stable QTLs for oil content on Chr. 02 (qOil_02), Chr. 08 (qOil_08), Chr. 15 (qOil_15), and Chr. 20 (qOil_20) (3 K-SNP) were identified (Cao et al. 2017; Zhang et al. 2018b; Patil et al. 2018). Specific-locus amplified fragment (SLAF) markers detected 26 stable QTLs for five fatty acids (Li et al. 2017). Twenty-four stable QTLs for seed oil content and composition were identified in soybean by model-based composite interval mapping (CIM). Twenty-three of these overlapped with or were adjacent to previously reported QTLs, whereas one, qPA10_1 (5.94–9.98 Mb) on Chr. 10, is a novel locus for palmitic acid (Yao et al. 2020).
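Deciding whether a newly mapped QTL "overlaps with or is adjacent to" a previously reported one amounts to an interval comparison on the same chromosome. The sketch below illustrates that check under stated assumptions: it uses only the three physical regions quoted in the text (not the full SoyBase catalogue that the cited studies compared against), the adjacency tolerance `max_gap_mb` is a hypothetical parameter, and the `hypothetical_qtl` region is invented purely to show a positive match.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    name: str
    chrom: int
    start_mb: float
    end_mb: float


# Regions quoted in the text; a real comparison would use the full SoyBase list.
REPORTED = [
    Region("oil content, Chr. 20 (1.64-2.09 Mb)", 20, 1.64, 2.09),
    Region("oil content, Chr. 20 (33.35-35.95 Mb)", 20, 33.35, 35.95),
    Region("linolenic acid, Chr. 14 (44.58-48.58 Mb)", 14, 44.58, 48.58),
]


def overlaps_or_adjacent(a: Region, b: Region, max_gap_mb: float = 0.0) -> bool:
    """True if a and b lie on the same chromosome and their intervals overlap
    or are separated by no more than max_gap_mb."""
    if a.chrom != b.chrom:
        return False
    return a.start_mb <= b.end_mb + max_gap_mb and b.start_mb <= a.end_mb + max_gap_mb


def matches(candidate: Region, known: List[Region], max_gap_mb: float = 0.0) -> List[str]:
    """Names of the known regions that the candidate overlaps or adjoins."""
    return [k.name for k in known if overlaps_or_adjacent(candidate, k, max_gap_mb)]


qPA10_1 = Region("qPA10_1", 10, 5.94, 9.98)                    # from the text
hypothetical_qtl = Region("hypothetical_qtl", 20, 35.0, 36.5)  # invented example

print(matches(qPA10_1, REPORTED))           # no match among the regions listed here
print(matches(hypothetical_qtl, REPORTED))  # overlaps the second Chr. 20 region
```

A locus with no match against the reference set, such as qPA10_1 here, is the kind of candidate that gets reported as novel; the conclusion is only as good as the completeness of the reference set and the tolerance chosen.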

11.4 Soybean Genetic Improvements

Biotechnology offers novel tools for designing soybean plants with improved oil quality, either by directly modifying fatty acid biosynthesis or by producing novel fatty acids. Only a few success stories are available in which enzymes and substrate pools of the Kennedy pathway were modified with novel technologies to enhance the relative production of triacylglycerols (TAGs) in soybean seed oil. Fat synthesis involves the metabolism of sugars, pyruvate, and fatty acids, among other pathways. Seed oil content is regulated by fatty acid synthesis, lipid accumulation, and seed development (Bao and Ohlrogge 1999; Yun and Isleib 2000). Improving oil content depends principally on manipulating the fatty acid biosynthesis pathway. Fatty acid synthesis and TAG assembly are regulated at the transcriptional, post-transcriptional, and metabolic levels, but the regulatory networks remain uncharacterized. The transcription factors ABI3, LEC1, LEC2, Dof, WRI1, and FUS3 play vital roles in lipid biosynthesis (Libeisson et al. 2010).

Enhancing the expression of DGAT, which converts diacylglycerols (DAGs) to TAGs, improves both oil content and fatty acid composition in oilseed crops. Overexpression of the Umbelopsis ramanniana diacylglycerol acyltransferase (UrDGAT2) gene in soybean seeds yielded a new soybean variety with higher oil content (Lardizabal et al. 2008). DGAT1 overexpression promotes the accumulation of TAGs in Arabidopsis thaliana and Nicotiana (Bouvier-Navé et al. 2000). Bifunctional WS/DGAT genes and cytoplasmic peanut AhDGAT genes regulate lipid biosynthesis and accumulation. P24 oleosin isoform B, P24 oleosin isoform A, and two oleosin-5 proteins were upregulated in mature transgenic soybean plants. Overexpression of oleosin genes improved total fatty acid content (Shimada and Hara-Nishimura 2010). Transfer of GmDGAT1-2 into wild-type (WT) soybean (JACK) increased total fatty acid content and the proportion of 18:1 in the transgenic soybeans but lowered linoleic acid (18:2) relative to WT. In total, 436 differentially expressed proteins and 180 differentially expressed metabolites were reported between WT (JACK) and transgenic soybean pods (Xu et al. 2021). Transgenic soybeans overexpressing DGAT2 showed a 1.5% increase in total seed oil without a reduction in seed protein content or yield. Overexpression of the yeast sphingolipid compensation (SLC1) protein, which possesses lysophosphatidic acid acyltransferase (LPAT) activity, led to a 1.5% increase in oil content in soybean seeds and a 3.2% increase in somatic embryos. SLC1 converts lysophosphatidic acid to phosphatidic acid, which is a precursor of DAG (Rao and Hildebrand 2009).

Genetic engineering can enhance the seed oil content of a specific fatty acid or a class of fatty acids. Transgenic soybean with an oleic acid content of around 80% of total oil was produced by downregulating the FAD2 genes, which encode the enzymes that convert monounsaturated oleic acid to polyunsaturated linoleic acid (Kinney 1997). α-Linolenic acid (ALA) reduces the oxidative stability of oil, resulting in rancidity and reduced shelf life. Three desaturase genes (GmFAD3) contribute to ALA synthesis, and targeted silencing or downregulation of GmFAD3 resulted in low-linolenic soybeans (ALA content <2%) (Flores et al. 2008). High-oleic mutant soybean varieties with four times more oleic acid (OA, a precursor of linoleic acid, LA) were generated by gene editing with transcription activator-like effector nucleases (TALENs). The TALENs bind specific DNA loci in the FAD2-1A and FAD2-1B genes and create small deletions in their coding sequences, thereby knocking down the desaturases that convert OA to LA. The OA content of the resulting seeds increased up to ~80% (Haun et al. 2014; Mazur et al. 1999; Buhr et al. 2002). Overexpression of borage Δ6-desaturase, which converts LA and ALA to γ-linolenic acid (GLA) and stearidonic acid (SDA), respectively, increased the contents of GLA and SDA, fatty acids with pharmacological properties and nutritional value, in soybean (Sato et al. 2004). Soybean transgenic for borage Δ6-desaturase produced GLA at ~27% and SDA at ~3% of seed oil (Clemente and Cahoon 2009). Pyramiding borage Δ6-desaturase with Arabidopsis Δ15-desaturase, which converts LA to ALA, increased SDA levels to 21.6% in soybean lines (Eckert et al. 2006). SDA is a precursor of the long-chain polyunsaturated fatty acids eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) and has nutritional value for cardiovascular health in humans. EPA levels reached 20% and DHA levels up to 3% of total seed fatty acid content (Kinney et al. 2004).
Tocopherols (α, β, γ, and δ), collectively known as vitamin E, act as antioxidants and add to oil stability. Biotechnological work on soybean tocopherols has focused on increasing α-tocopherol content because it has the highest nutritional value. Upregulation of homogentisate phytyltransferase (HPT), which catalyzes the first step in tocopherol synthesis, increased total tocopherol levels only slightly in transgenic seeds (Savidge et al. 2002; Karunanandaa et al. 2005). Total tocopherol content increased 1.5-fold in soybean when HPT genes from Arabidopsis and Synechocystis were expressed under a strong seed-specific promoter (Karunanandaa et al. 2005). Expression of the bacterial chorismate mutase-prephenate dehydrogenase (TYRA) gene in soybean, together with several other enzymes under the control of seed-specific promoters, increased total tocopherol content more than tenfold. Overexpression of rice homogentisate geranylgeranyl transferase (HGGT), which is involved in the biosynthesis of tocopherols in monocots, increased antioxidant activity and total tocopherol content in soybean (Kim et al. 2011). Metabolic engineering by co-expression of Arabidopsis VTE3 and VTE4, which are responsible for the methylation of tocopherol, generated transgenic plants with α-tocopherol levels >90% of total tocopherol content. The total tocopherol level remained the same, but the shift to the α form yielded a fivefold increase in vitamin E activity in soybean oil (Van Eenennaam et al. 2003).
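
The fatty acid desaturation steps discussed earlier in this section (oleic to linoleic acid by FAD2, linoleic to α-linolenic acid by FAD3 or a Δ15-desaturase, and the Δ6-desaturase conversions to GLA and SDA) can be treated as a small reaction network. The sketch below is a qualitative illustration only: it treats each enzyme as simply active or inactive, ignores flux and partial knockdown, and the enzyme labels and the `reachable` function are names introduced here for the example.

```python
# Conversions named in the text, as (substrate, enzyme, product).
# "FAD3" stands for the Δ15 step (endogenous GmFAD3 or the Arabidopsis enzyme);
# "D6" stands for the borage Δ6-desaturase transgene.
CONVERSIONS = [
    ("OA", "FAD2", "LA"),
    ("LA", "FAD3", "ALA"),
    ("LA", "D6", "GLA"),
    ("ALA", "D6", "SDA"),
]


def reachable(active_enzymes: set, start: str = "OA") -> set:
    """Return the fatty acids that can be formed from `start` when only the
    enzymes in `active_enzymes` are functional (on/off model, no kinetics)."""
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for substrate, enzyme, product in CONVERSIONS:
            if substrate == current and enzyme in active_enzymes and product not in seen:
                seen.add(product)
                frontier.append(product)
    return seen


scenarios = {
    "wild type": {"FAD2", "FAD3"},
    "FAD2 knocked down (high oleic)": {"FAD3"},
    "FAD3 silenced (low linolenic)": {"FAD2"},
    "wild type + D6 transgene": {"FAD2", "FAD3", "D6"},
}

for label, enzymes in scenarios.items():
    print(f"{label}: {sorted(reachable(enzymes))}")
```

In this simplified view, switching off FAD2 leaves oleic acid as the end product, switching off FAD3 stops the chain at linoleic acid, and adding the Δ6-desaturase opens the branches to GLA and SDA, which mirrors the engineering strategies summarized above.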

The transcription factors GmNFYA, GmDof4, GmbZIP123, GmDof11, and GmMYB73 regulate lipid accumulation by binding directly to the promoters of lipid biosynthesis genes. Overexpression of these transcription factors significantly increased lipid accumulation in transgenic seeds (Lu et al. 2016; Wang et al. 2007; Song et al. 2013; Liu et al. 2014). Overexpression of GmNFYA enhanced seed oil content in transgenic plants (Lu et al. 2016). Overexpression of GmDof4 and GmDof11 improved lipid accumulation in Arabidopsis, microalgae, and rapeseed. In rapeseed, oleic acid increased, whereas linoleic acid and linolenic acid decreased. GmDof4 activated FAB2 expression by binding directly to its promoter, whereas GmDof11 directly inhibited FAD2 expression; both thus regulate genes associated with fatty acid biosynthesis (Wang et al. 2007). Overexpression of the GmbZIP123 transgene enhanced lipid content, enhanced the expression of sucrose transporter genes (SUC1 and SUC5) and cell-wall invertase genes (cwINV1, cwINV3, and cwINV6) by binding directly to their promoters, and increased the levels of glucose, fructose, and sucrose in Arabidopsis thaliana seeds. GmbZIP123 regulates lipid accumulation in soybean seeds by controlling seed sugar transport (Song et al. 2013). The MYB-type gene GmMYB73, which is involved in the upregulation of fatty acid accumulation, is differentially expressed in seeds at different developmental stages. GmMYB73 interacted with GL3 and EGL3 and suppressed GL2, a negative regulator of oil accumulation. GmMYB73 overexpression enhanced total lipid content, including linoleic acid (18:2) and linolenic acid (18:3), in transgenic hairy roots of soybean plants (Liu et al. 2014).

11.5 Conclusion

Soybean has the highest protein content (40%) among leguminous crop plants and contains approximately 18–22% oil, which is a complex mixture of oleic acid (18%), palmitic acid (10%), linoleic acid (55%), stearic acid (4%), and linolenic acid (13%). Enhancing the expression of DGAT (UrDGAT2, GmDGAT1-2, and DGAT2) and SLC1 improves both oil content and fatty acid composition in soybean. Only a few success stories of designer soybean crops with improved oil content and composition are available. Biotechnological approaches that manipulate homogentisate phytyltransferase (HPT), homogentisate geranylgeranyl transferase (HGGT), and TYRA have enhanced soybean tocopherols. Overexpression of borage Δ6-desaturase, alone or pyramided with Arabidopsis Δ15-desaturase, increased stearidonic acid and γ-linolenic acid in soybean. The transcription factors GmNFYA, GmDof11, GmDof4, GmbZIP123, and GmMYB73 considerably increased lipid accumulation in transgenic seeds.