Introduction

Citrus is one of the major crops worldwide and provides a variety of nutritional, health, and other benefits. While cultivation of citrus is limited to subtropical locales, many regions around the world produce citrus both for fresh fruit consumption and for processed products such as juice, fruit sections, natural flavors, cosmetics, cleaners, air fresheners, and more. More than 20 countries worldwide cultivate citrus. Brazil, India, China, Mexico, and the United States, in that order, are the top countries in acres of orange production (http://www.yara.us/agriculture/crops/citrus/key-facts/world-citrus-production/; https://apps.fas.usda.gov/psdonline/circulars/citrus.pdf), and China, the United States, Mexico, South Africa, and Turkey, in that order, lead in acres of grapefruit production (https://apps.fas.usda.gov/psdonline/circulars/citrus.pdf). In the U.S., citrus production is highest in Florida, followed by California, Arizona, and Texas. It has a commercial value of more than $3 billion annually (http://aggie-horticulture.tamu.edu/vegetable/guides/the-crops-of-texas/citrus-and-subtropical-tree-crops/; http://www.agmrc.org/commodities_products/fruits/citrus/citrus-profile/; http://www.freshfromflorida.com/Divisions-Offices/Marketing-and-Development/Education/For-Researchers/Florida-Agriculture-Overview-and-Statistics).

The importance of flavonoid glycosides in citrus has many components. One, of course, is the benefit of these compounds to the plants themselves with roles in protection against ultraviolet light damage and defense roles in interactions with herbivores and microbes being the top contenders (reviewed in Owens and McIntosh 2011). Flavonoid glycosides also play a large role in consumer acceptance and in the health benefits associated with citrus consumption by humans. Citrus is known for the accumulation of significant levels of flavanone and flavone diglycosides (Berhow et al. 1998; Peterson et al. 2006). For example, the flavanone diglycoside naringin is a major bitter compound found in grapefruit and pummelo (Berhow et al. 1998; Owens and McIntosh 2011 and refs therein). While some tonic flavor is expected by consumers of these fruits, picking or processing immature fruit can lead to a level that is unacceptable to the human palate. In oranges, the non-bitter flavanone diglycoside hesperidin accumulates to high levels and can result in the production of “cloud” in juice products which can affect consumer acceptance (Hendrickson and Kesterson 1964). Often, the key is knowing when to harvest, juice, and blend juices to obtain desired flavors (Rouseff 1980; Mansell et al. 1983; McIntosh 2000).

Because the flavonoids in citrus tissues are predominantly found in glycosylated form, it is logical for research on biosynthesis and metabolism of flavonoid glycosides in these species to include focus on the glycosyltransferase (GT) enzymes responsible for the addition of the sugar moieties. Biosynthesis of the flavonoid ring system in citrus has been recently reviewed (Owens and McIntosh 2011). Herein, a historical context on citrus glycosylation is provided. The GT enzymes characterized from citrus and their biochemical properties, expression patterns, levels of enzyme activity during development, and structure/function relationships are reviewed. Understanding the biochemical properties and regulation of GT enzymes is critical for informed assessment of either crop improvement or custom design of enzymes for production of desired compounds with nutritional and/or medicinal application. Citrus is an ideal model system for GT studies because the nature of the glycosylated flavonoids accumulated is different from most plants. This implies the presence of a different set of GTs with varying substrate specificities from those that have been characterized to date.

Biologically active flavonoids in citrus

Flavonoids and their glycosides are involved in a number of activities that are critical to plant survival. These include known functions in coloration as pigments/co-pigments, UV protection, signaling, male fertility in some species, as well as antimicrobial, antifungal, and antifeedant/feedant roles (reviewed in Winkel-Shirley 2001; Iwashina 2003). Specifically in citrus species, flavones, flavone glycosides, flavanone glycosides, and flavonol glycosides have been shown to act as attractants that induce egg laying by the butterfly species Papilio xuthus and Papilio protenor demetrius (reviewed in Toh et al. 2013). However, these flavonoids are only active in combination with other compounds such as adenosine, proline, stachydrine, and quinic acid. The methylated flavanones nobiletin and tangeretin were shown to have antifungal activity against the known plant pathogens F. moniliforme, S. rolfsii, and V. albo-atrum. In addition to nobiletin and tangeretin, demethylnobiletin, 5,4′-dihydroxy-6,7,8,3′-tetramethoxyflavone, and xanthomicrol were also isolated from citrus and were shown to inhibit D. tracheiphila, the causative agent of the citrus disease mal secco.

Elucidating the health benefits of consuming citrus flavonoid glycosides has been an active field of research for decades, having been previously reviewed (e.g. Benavente-García et al. 1997; Benavente-Garcia and Castillo 2008; Yamane and Kato 2012), and will only be briefly considered here. Flavonoid glycoside health benefits have been shown to include anti-inflammatory, anti-carcinogenic, anti-allergenic, cholesterol lowering, capillary strengthening, hypertension and diabetes reduction properties (e.g., Cody et al. 1986; Alam et al. 2014; Stoclet and Schini-Kerth 2011; Watson et al. 2014). In addition to their well-established roles in taste and marketability of citrus products, naringin and hesperidin have also been of particular interest for their pharmacological activities. Administering 500 mg of hesperidin for 3 weeks to patients with known vascular metabolic risk factors resulted in significant improvements in endothelial function (Rizza et al. 2011). Hesperidin and naringin have both been indicated in the reduction of cholesterol and triglyceride levels (Craig 1997; Jung et al. 2003), but these findings remain controversial as conflicting results in which no improvements were noted have also been reported (Demonty et al. 2010). An interaction between grapefruit products and the cholesterol lowering statin drugs that leads to increases in effective dosage is of particular interest to the general public. This interaction is known to be mediated by inhibition of the critical drug metabolizing P450 enzyme, CYP3A. Although naringin has inhibitory action against CYP3A in vitro and was initially thought to be the primary source of this effect, it has since been demonstrated that furanocoumarins (e.g. bergamottin, bergaptol and bergapten) are the primary causative agents in vivo (Paine et al. 2006).

Elucidation of citrus flavonoid glycosylation

Citrus GT enzymology

The earliest work on flavonoid glycosylation in citrus was motivated by a keen desire to elucidate the final steps in the biosynthesis of the bitter compound naringin (Fig. 1). While early studies focused on feeding experiments to elucidate synthesis of the flavanone ring structure (Fisher 1968), results published in the late 1980s and early 1990s made key contributions to understanding the glycosyltransferases involved. Complementary to this work was elucidation of other flavonoid glycosyltransferase activities (Fig. 2).

Fig. 1

Representative flavonoid structures and modifying sugar groups

Fig. 2

Citrus flavonoid glycoside biosynthetic pathway. Numbers refer to the following publications that were important in resolving the enzymology at each step of the pathway: (1) McIntosh and Mansell (1990); (2) McIntosh and Mansell (1990), McIntosh et al. (1990), Lewinsohn et al. (1989a, b), Berhow and Smolensky (1995); (3) Frydman et al. (2013), Ohashi et al. (2015); (4) Lewinsohn et al. (1989a, b), McIntosh and Mansell (1990), Bar-Peled et al. (1991), Frydman et al. (2004), Ohashi et al. (2015); (5) McIntosh et al. (1990), Berhow and Smolensky (1995); (6) Frydman et al. (2013); (7) Bar-Peled et al. (1991), Frydman et al. (2004); (8) Owens and McIntosh (2009); (9) Frydman et al. (2013); (10) Frydman et al. (2013), Ohashi et al. (2015)

In the 1980s there was significant interest in plant cell culture and its potential for biotechnological applications. It should therefore come as no surprise that several groups were working on the production of flavanone diglycosides in cell cultures and during plant regeneration (e.g., Lewinsohn et al. 1986; Barthe et al. 1987; Barthe et al. 1988; Mansell and McIntosh 1991). In 1986, Lewinsohn et al. demonstrated that Citrus paradisi cell cultures were capable of glucosylating exogenous naringenin and hesperitin, resulting in the production of prunin (naringenin-7-O-glucoside) and hesperitin-7-O-glucoside, respectively. No rhamnosylation of exogenous flavonoids was noted in this biotransformation study. In another study published in 1989, Lewinsohn et al. (1989a) tested cell cultures of sour orange (Citrus aurantium), a Citrus trifoliata hybrid, Citrus limon, and C. paradisi for the ability to biotransform aglycones and showed that all cultures could glucosylate exogenous flavanones at the 7-OH position. One sour orange culture further rhamnosylated prunin to form the rutinoside, narirutin (naringenin 1,6 rhamnoglucoside). Barthe et al. (1987) showed that levels of glycosylated flavanones dropped as C. paradisi (var. Duncan and Thompson) and Citrus sinensis (var. Parson Brown and Valencia) tissues underwent dedifferentiated growth in callus and suspension cultures. Upon transfer of cultures to regeneration medium, Duncan grapefruit showed an increase in endogenous levels of naringin in buds and a further increase in shoots (Barthe et al. 1987). For example, callus contained an average of 3.6 ppm naringin, regenerated buds contained an average of 148 ppm, and new regenerated shoots contained an average of 1051 ppm. This indicated that conditions leading to organogenesis also led to increased production of flavanone glycosides. The high sensitivity of the radioimmunoassay used for naringin and hesperidin (Jourdan et al. 1985; Barthe et al. 1987, 1988) allowed for their identification in callus cultures, while previous TLC methods were not able to detect these compounds in cell cultures (Lewinsohn et al. 1986, 1989a, b). The ability of detached young grapefruit to make naringin and prunin when fed 14C-acetate and 14C-phenylalanine demonstrated that both compounds could be synthesized in vitro (Berhow and Vandercook 1989). Immature peels synthesized only prunin.

The question of whether sugars were added in sequence or in diglycose form was resolved somewhat by the results discussed above as well as by independent demonstration of sequential addition of glucose molecules in the synthesis of flavonol triglucosides in pea and tulip (Jourdan and Mansell 1982; Kleinehollenhorst et al. 1982). In citrus, work with crude cell-free extracts from calamondin and pummelo young leaves and fruits showed chalcone synthase activity as well as GT and rhamnosyltransferase (RT) activities (Lewinsohn et al. 1989b). The RT activity was detected in crude tissue extracts using a coupled assay that relied on the conversion of UDP-Glc to UDP-Rhm.

The first work published on enriched enzyme preparations was done with glucosyltransferases isolated and partially purified from young grapefruit leaves (McIntosh and Mansell 1990). Glucosylating activities with naringenin chalcone, naringenin, hesperitin, kaempferol and quercetin were reported. In crude leaf extracts, subsequent rhamnosylation to naringin occurred. UDP-14C-Glc was used in the reactions and label was found in both the glucose and rhamnose moieties, suggesting that the crude extract also contained a UDP-Rhm synthase that converts UDP-Glc to UDP-Rhm. Further enrichment of the glucosyltransferase enzyme preparation resulted in loss/removal of the RT activity (McIntosh and Mansell 1990). This supported the hypothesis that separate enzymes carry out sequential addition of glucose and rhamnose in the production of naringin. Subsequent rigorous purification of the GT activities in young grapefruit leaves (ammonium sulfate fractionation followed by gel filtration, hydroxyapatite, UDP-GA agarose, Mono Q, and Mono P columns) resulted in partial characterization of flavonol GT and flavonoid GT activities and thorough characterization of a flavanone-specific GT activity (McIntosh et al. 1990), the latter having over 900-fold enrichment with naringenin as acceptor. The products of reactions with flavanones and flavones were 7-O-glucosides, and the product with naringenin chalcone was the 4′-O-glucoside. Details of properties of the enzymes can be found in the next section. Flavonoid GT activity was also found in extracts from C. limon leaves, where 2 peaks of activity were obtained from a Sephacryl S-200 column (Berhow and Smolensky 1995). Results showed that one peak glucosylated the flavones apigenin and chrysin as well as hesperitin at the 7-OH position, and the second peak glucosylated the flavonols kaempferol and morin (Berhow and Smolensky 1995). This work further characterized the hesperitin 7-GT activity and showed that levels of enzyme activity were highest in young leaves.

With respect to biosynthesis of bitter flavonoids in citrus, the 1,2 addition of rhamnose to the glucose moiety is critical. While use of coupled assays was important for establishing the role of an RT in naringin synthesis (e.g., McIntosh and Mansell 1990; Lewinsohn et al. 1989a, b), direct study of rhamnosyltransferases has been hampered by lack of commercial availability of UDP-Rhm. Bar-Peled et al. (1991) were able to synthesize sufficient UDP-14C-Rhm to allow for purification and partial characterization of a 1,2RT activity from pummelo that rhamnosylated 7-O-glucosides of naringenin, hesperitin, apigenin, and luteolin. Results with flavonol glucosides were not reported. A pummelo 1,2RT was characterized by transforming the gene into tobacco cells, which lack this native enzyme activity but produce UDP-Rhm (Frydman et al. 2004). Subsequently, a similar approach was used to examine the characteristics of a 1,6RT clone from C. sinensis (Frydman et al. 2013). This enzyme exhibited a somewhat promiscuous activity, rhamnosylating flavanone 7-glucosides, flavonol-7-O-glucosides, and flavonol-3-O-glucosides when transformed into BY2 tobacco cells. Cloning these RT enzymes into yeast (along with a UDP-Rhm synthase) to test the efficacy of this system for whole-cell catalysis (Ohashi et al. 2015) substantiated the substrate preferences of the RT enzymes from pummelo and orange. It is not clear at this time if efforts are underway to clone the 1,2RT and 1,6RT enzymes into yeast for heterologous expression and direct biochemical characterization.

GTs other than those involved in biosynthesis of bitter neohesperidosides and non-bitter rutinosides have also been characterized in citrus, including the flavone, flavanone, chalcone, and flavonol GTs already discussed above. Additionally, a flavonol-specific 3-O-GlcT from grapefruit (Cp3GT) has been cloned, heterologously expressed, purified, and chemically and biochemically characterized (Owens and McIntosh 2009). Details of enzyme biochemical properties are discussed further below. Other putative secondary product GT clones from grapefruit were heterologously expressed and the resultant proteins tested for activity. One showed low activity with flavonols and another showed activity with catechol (Devaiah et al. 2016).

Glycosyltransferase gene expression

Metabolism of flavonoid glycosides in citrus is dynamic and shows indication of tissue-specific and developmentally regulated gene expression. Levels of 7-O-GT activity were highest in very young leaf tissues in grapefruit and lemons (McIntosh and Mansell 1990; Berhow and Smolensky 1995). Studies of 1,2RT and 1,6RT expression and accumulation of neohesperidosides and rutinosides, respectively, showed correlation with highest expression in young leaves and fruits (Bar-Peled et al. 1993; Frydman et al. 2013). Expression of Cp3GT was studied in developing roots, stems, leaves, and in flowers (Daniel et al. 2011). Expression was higher in stage 3 (seedlings with first true leaves) and 4 (seedlings with second true leaves) roots and stems as compared to earlier stages and higher levels in younger leaves of stage 4 and 5 (1–4 year old trees) plants were seen. Thus, there is a difference in expression and activity of the enzymes involved in rhamnoglucoside production as compared to flavonol glucoside production. It is intriguing to consider that this may be due to the importance of bitter compound production to protect very young tissues and fruits from herbivory while flavonol glucosylation is increased during expansion and growth of tissues, especially those that would need protection from damaging UV radiation due to more exposure to direct sunlight. Another grapefruit clone showed trace activity with flavonols and also showed highest level of gene expression in stage 5 younger leaves and stage 3 stems with only a trace of expression in stage 3 and 4 roots (Daniel et al. 2011; Devaiah et al. 2016). A catechol-GT was expressed in leaves of all stages with highest expression in stage 5 young leaves and stage 4 roots (Daniel et al. 2011; Devaiah et al. 2016).

Bioinformatic tools in GT identification

Additional candidate flavonoid GTs may be identified from citrus genomes using the PSPG box signature motif, described in detail below, as a marker (https://www.citrusgenomedb.org/). The eight genomes posted to date include varieties of mandarin, pummelo, and orange. Additionally, candidates may be found in the HarvEst database (http://harvest.ucr.edu/) using similar techniques. While the latter represents contigs composed of hybrid sequences using information from ESTs published from a variety of citrus species, it has been used to successfully clone three putative GTs from grapefruit (Mallampalli 2009). None of the latter showed activity with flavonoid substrates. This serves to further illustrate that it is currently not possible to ascertain precise GT function simply from amino acid sequence. Direct testing of function, through either heterologous expression and screening of the protein for activity (e.g., Owens and McIntosh 2009) or cloning into plants not containing the gene and looking for altered flavonoid accumulation (Frydman et al. 2004, 2013), remains the most convincing approach.

Chemical and biochemical properties of citrus flavonoid glycosyltransferases

Mechanisms for glycosylation include addition to –OH groups (O-glycosides), C groups (C-glycosides) or N groups (N-glycosides). In citrus, O-glycosides predominate, with low levels of C-glycosides also having been reported (e.g., Gattuso et al. 2007; Gentili and Horowitz 1968). All citrus flavonoid GTs studied to date catalyze sugar addition to form O-glycosides. Characterization of citrus flavonoid GTs has been pursued to varying degrees, from studying straightforward reactions and substrate screening to more thorough characterization including determination of pH optima, thermal stability, reaction kinetics, inhibitors, activators, and more. Results to date are summarized in Tables 1, 2 and 3. Note that information from the early plant tissue culture biotransformation experiments discussed above is not included in the tables, but information from enzymes heterologously expressed in other organisms is included.

Table 1 Properties of citrus flavonoid 7-O-glycosyltransferases
Table 2 Properties of citrus flavonoid 3-O-glycosyltransferases
Table 3 Properties of citrus rhamnosyltransferases

While not all citrus GTs have been thoroughly characterized, available data do indicate that they share some properties with those characterized from other plant species. For example, all of the GTs for which size has been determined are within the typical 49–56 kDa range, reported pH optima range from 6.5 to 8.0, some divalent cations inhibit activity, and UDP is a competitive inhibitor (Tables 1, 2, 3). There is limited pI information available in the literature; however, the citrus GTs have pI values that are in the acidic range. Reports on substrate specificity are variable; however, enzymes that were highly enriched or homogeneous, such as the flavanone-specific GT (McIntosh et al. 1990) and the flavonol-specific GT (Owens and McIntosh 2009), tended to show high substrate and regiospecificity (Tables 1, 2). Exceptions are the 1,2RT and the 1,6RT (Table 3). The highly enriched 1,2RT would add rhamnose to flavanone and flavone glucosides (Bar-Peled et al. 1991). More recent testing of RT clone function in transgenic tobacco by feeding with flavonoid aglycones is complicated by the inherent flavonoid glucosyltransferase activities (and flavonoids) present in tobacco. For example, the 1,6RT appeared to be somewhat promiscuous, adding rhamnose to the 7-glucoside residue of flavanones, flavones, and flavonols as well as to 3-glucosides of flavonols (Frydman et al. 2013). In a transgenic cell culture system, it is not possible to conduct direct enzyme kinetics and thus determine substrate preference. In an effort to remove the complications of a plant background in heterologous expression, Ohashi et al. (2015) cloned the Citrus 1,2RT and 1,6RT genes as well as an Arabidopsis rhamnose synthase into yeast. The latter was done in order to synthesize the UDP-Rhm substrate required for the reactions. Analysis of substrate specificity of the 1,2RT and 1,6RT was performed with crude cell lysates, and the 1,6RT again appeared somewhat promiscuous with respect to substrate preference.

As interest in understanding the structure and function of flavonoid glycosyltransferases has grown, and as some of the citrus glycosyltransferases show significant specificity, they are good candidates for studies to elucidate potential parameters important for substrate and regiospecificity.

Structure and function of citrus and other flavonoid glycosyltransferases

Glycosyltransferase primary and secondary structure

Glycosyltransferases have been identified from all the known kingdoms of life (Gachon et al. 2005). The enzymes are classified based on a number of different properties such as if the sugar donor contains a nucleotide (Leloir) or not (non-Leloir), whether there is retention or inversion of the formed glycosidic bond at the anomeric carbon of the substrate (stereoselectivity), as well as evolutionary relationships among the enzymes. The prevailing bioinformatic classification system for all carbohydrate active enzymes is the CAZy database (http://www.cazy.org/; Lombard et al. 2014) that currently classifies GTs from all organisms into 97 different families. Families and subfamilies are defined by identity at the amino acid level and include at least a single biochemically analyzed member; however, the majority of sequences in the database represent uncharacterized proteins (Breton et al. 2012). All identified plant secondary product GTs to date group with glycosyltransferase family 1 in the database. Sequence homology among GTs is notoriously low, even among enzymes that have been established to share the same substrate, regio-, and/or stereospecificity. Aside from its role as a continually adapting classification database, CAZy also serves as a central repository for various other types of GT information.

Unfortunately, although there is a wealth of primary sequence information available for mostly uncharacterized GTs, bioinformatic means alone remain an unreliable method by which to deduce GT function. Even though the protein sequences of GTs differ greatly, the enzymes are typically very similar in terms of secondary and tertiary structure. Attempts at using mathematical modeling based on protein secondary structure have shown some promise in predicting substrate and regiospecificity of glucosyltransferases (Jackson et al. 2011; Knisley et al. 2009). However, these methods are hindered by the insufficient availability of biochemically characterized enzymes on which to develop training data sets and are currently not dependable predictors of enzyme function. The greatest recent advances in predicting GT function have been made in the structure–function analysis of glycosyltransferases employing tertiary structures derived from both crystallography and homology modelling. However, direct biochemical assay of enzyme activity remains the only reliable means to define precise GT activities.

Glycosyltransferase tertiary structure

The tertiary structures of most solved GTs conform to one of three folds known as GT-A, GT-B, and the most recently defined GT-C. However, in a few instances, rarer topologies have been observed that are placed in a fourth grouping typically called unknown/other folds. Although conforming to these folds is characteristic of GTs, enzymes of other function have also been known to adopt both the GT-A and GT-B folds (Lairson et al. 2008). Therefore, identification of these topologies alone is not sufficient to assign GT function.

All of the identified GT-C enzymes are integral membrane proteins with non-Leloir, inverting GT activity and many are involved in N-glycosylation of proteins (Liang et al. 2015). They typically require a divalent cation for activity and use a lipid-phosphate as the sugar donor. All GT-A and GT-B fold GTs are Leloir enzymes, with inverting and retaining enzymes having been identified in each group. Both the GT-A and GT-B folds contain two distinct super-secondary structural elements composed of alternating β strands and α helices with up to seven typically parallel β strands forming a β sheet which are known as β-α-β Rossmann-like domains. Rossmann folds are often associated with proteins that bind dinucleotides, such as UDP. The orientation of these two domains in relation to each other is a primary difference between the topographies. In GT-A glycosyltransferases, the two domains are abutting and produce an essentially continuous β sheet. GT-A proteins typically require a divalent cation that is predicted to interact with the nucleotide diphosphate (NDP) sugar donor during catalysis. A DXD amino acid signature sequence encoding a cation binding domain is often present (Gloster 2014).

In GT-B enzymes, the two Rossmann-like domains face each other as facilitated by a flexible linker region within the protein (Fig. 3). The enzyme active sites are located in the cleft formed between these domains. The N-terminal domain is associated with substrate specificity and contains sugar acceptor interacting residues, while the C-terminal domain is involved in NDP-sugar binding (Wang 2009). Sequence conservation among GT-B enzymes is higher near the C-terminus, consistent with the greater similarity among sugar donors than acceptor substrates. GT-B enzymes from all organisms contain a semi-conserved amino acid signature sequence within the C-terminal domain that represents the specific UDP-sugar binding domain. The UDP-GT signature is defined at PROSITE by accession PS00375, which has been updated from an initial 29 residue sequence to the current 45 amino acid pattern: [FW]-X(2)-[QL]-X(2)-[LIVMYA]-[LIMV]-X(4,6)-[LVGAC]-[LVFYAHM]-[LIVMF]-[STAGCM]-[HNQ]-[STAGC]-G-X(2)-[STAG]-X(3)-[STAGL]-[LIVMFA]-X(4,5)-[PQR]-[LIVMTA]-X(3)-[PA]-X(2,3)-[DES]-[QEHNR] (Mackenzie et al. 1997; Sigrist et al. 2012). This definition encompasses a more defined plant specific 44 residue signature sequence that is known as the plant secondary product glycosyltransferase (PSPG) box (Hughes and Hughes 1994). It is also commonly observed that the most C-terminal α helix of the structure crosses over the catalytic cleft and contributes to the formation of the N-terminal domain (Fig. 3).
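
The signature pattern quoted above can be translated into a regular expression for scanning candidate sequences. The following is a minimal Python sketch under stated assumptions, not the procedure used in any of the studies reviewed here: the helper names prosite_to_regex and find_signature are hypothetical, the wildcard X is treated as any single character, and the toy sequence is constructed only to satisfy the pattern and is not a real protein.

```python
import re

# UDP-GT signature as quoted in the text (PROSITE accession PS00375).
UDPGT_SIGNATURE = (
    "[FW]-X(2)-[QL]-X(2)-[LIVMYA]-[LIMV]-X(4,6)-[LVGAC]-[LVFYAHM]-"
    "[LIVMF]-[STAGCM]-[HNQ]-[STAGC]-G-X(2)-[STAG]-X(3)-[STAGL]-"
    "[LIVMFA]-X(4,5)-[PQR]-[LIVMTA]-X(3)-[PA]-X(2,3)-[DES]-[QEHNR]"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a dash-separated PROSITE-style pattern into a regex string."""
    parts = []
    for element in pattern.upper().split("-"):
        # X, X(n) or X(n,m): a run of arbitrary residues.
        wildcard = re.fullmatch(r"X(?:\((\d+)(?:,(\d+))?\))?", element)
        if wildcard:
            low, high = wildcard.group(1), wildcard.group(2)
            if low is None:
                parts.append(".")
            elif high is None:
                parts.append(".{%s}" % low)
            else:
                parts.append(".{%s,%s}" % (low, high))
        elif re.fullmatch(r"\[[A-Z]+\]|[A-Z]", element):
            # A residue class such as [FW] or a single fixed residue such as G.
            parts.append(element)
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
    return "".join(parts)


def find_signature(sequence: str, pattern: str = UDPGT_SIGNATURE):
    """Return (start, end) spans of non-overlapping matches (0-based, end-exclusive)."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.span() for m in regex.finditer(sequence.upper())]


# Toy sequence built element by element to satisfy the pattern; NOT a real protein.
toy_motif = ("F" "AA" "Q" "AA" "L" "L" "AAAA" "L" "L" "L" "S" "H" "S" "G"
             "AA" "S" "AAA" "S" "L" "AAAA" "P" "L" "AAA" "P" "AA" "D" "Q")
toy_sequence = "MKT" + toy_motif + "GGG"
print(find_signature(toy_sequence))
```

A match of this kind only flags a sequence as a candidate UDP-sugar binding protein. Consistent with the discussion of bioinformatic prediction above, it says nothing about acceptor substrate or regiospecificity, which still have to be determined by direct assay.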

Fig. 3

Homology model of Cp3GT as a representative GT-B fold enzyme. Red, gray, and yellow indicate the N-terminal, flexible linker, and C-terminal domains respectively. The PSPG box, which represents the UDP-sugar binding domain, is in black. His22 and Asp122 are the predicted catalytic residues. The N- and C-terminal residues are indicated at the left. (Color figure online)

Plant secondary product glycosyltransferase crystal structures

Crystal structures for 6 different plant glucosyltransferases with associated structural information are available at RCSB PDB (http://www.rcsb.org; Berman et al. 2000) and results are summarized in Table 4. A seventh crystallized plant GT protein, R. serpentina arbutin synthase, is listed at the CAZy database but detailed information about it does not yet appear to be publicly available. All plant GTs crystallized to date are Leloir enzymes with the GT-B fold and have an inverting mechanism of action. The inverting mechanism of action for the majority of GT-B enzymes involves a catalytic histidine residue that acts as a Brønsted base and an associated aspartate residue that contributes to charge stabilization in a manner analogous to the Ser-His-Asp triad of serine hydrolases (see Lairson et al. 2008 for a detailed review of GT catalytic mechanisms). The amino acids associated with these positions (c and f; Footnote 1) are conserved completely in all of the crystallized plant GTs (Fig. 4). Although an Asp (f) is conserved in UGT72B1, detailed structural analysis has demonstrated that it has a unique binding geometry in which Ser14 (a) serves in the charge stabilization role (Brazier-Hicks et al. 2007), demonstrating that some variation in catalytic interactions exists among GTs.

Table 4 Properties of crystallized plant glycosyltransferases
Fig. 4

Alignment of Cp3GT with all of the plant glycosyltransferases for which crystal structure data are currently available. Crystallized enzymes are referenced as named in the literature with important details for each listed in Table 4. Residues with a black background are identical while those with gray are highly conserved. Amino acids highlighted in red indicate that an interaction was identified within the crystallized protein. These associations are further refined by #, ^, +, * below the sequences which indicate catalytic, UDP, sugar, and acceptor interactions, respectively. Lower case letters below the sequences represent regions of interest as referenced in the text. (Color figure online)

All of the plant GTs crystallized to date have demonstrated enzyme activity using UDP-Glc as the sugar donor. Sugar donor specificity has been biochemically examined thoroughly for VvGT1 (Offen et al. 2006) and to a lesser extent with UGT85H2 and UGT71G1 (Shao et al. 2005; Li et al. 2007). In all cases, the GTs were shown to have an in vitro preference for UDP-Glc. Although attempts have been made at co-crystallization with UDP-Glc for UGT78K6, UGT72B1, and UGT78G1, enzymatic hydrolysis appears to have taken place during the process, resulting in loss of the glucose moiety. Similar results were observed with UDP-Gal in UGT71G1. Interestingly, in UGT72B1 a Tris molecule from the crystallization buffer was substituted in place of the donor sugar (Brazier-Hicks et al. 2007), which may suggest a structural basis for the reduction of enzyme activity observed with Tris buffer in Cp3GT (Owens and McIntosh 2009). However, co-crystallization was successfully achieved using UDP-Glc in UGT71G1 and with a non-transferable glucose analog, UDP-2-deoxy-2-fluoro-glucose, in UGT72B1 and VvGT1 (Offen et al. 2006; Brazier-Hicks et al. 2007). In the glucose containing structures, the last two residues of the PSPG box (x and y) interact with the sugar and are strongly conserved across all the enzymes. The terminal PSPG box residue (y) has been suggested as being indicative of sugar usage, with glutamine associated with glucose activity and histidine with galactose activity (Kubo et al. 2004; Cheng et al. 2014). Although mutagenesis from histidine to glutamine in a S. baicalensis galactosyltransferase was able to convey glucosyltransferase activity, the complementary mutation was not able to impart galactosyltransferase activity to a glucosyltransferase. Even though the nature of the residue at this location likely plays a part in sugar recognition, the idea that a single residue controls sugar specificity in GTs is unlikely to hold true in most cases. Further complicating the assignment of a distinct function to this residue, D367 at x in UGT78K6 forms a hydrogen bond with a substrate, delphinidin. A tryptophan near the center of the PSPG box (s) that interacts with glucose was identified from VvGT1 and UGT71G1, and was invariant across all structures except UGT78K6. Thr141 (g) from VvGT1 interacted with glucose, demonstrating that residues outside the PSPG box, and even the C terminal domain, also serve roles in sugar recognition.

For most structures in which UDP was included as a ligand, there is very little difference in overall structure between the bound and unbound forms of the enzymes (Offen et al. 2006; Hiromoto et al. 2013). Consistent with its role as the UDP-sugar binding domain, 8 of the 11 residues with UDP interactions are found within the PSPG box. The residues implicated in binding the ribose portion of the nucleotide are invariant across all the structures (q and v). The first two residues of the PSPG box have uracil binding activity, with no variation at o but less conservation at p, where C361 of UGT85H2 has a very different chemistry than the corresponding residue in the other enzymes. Ser308 of UGT78G1 (n), outside of the PSPG box, is also implicated in uracil binding. Most of the residues that interact with the phosphate groups of the nucleotide are found within the PSPG box (r, t, u, and w) and are highly conserved across all the enzymes, with the exception of G367 in Cp3GT (t). Outside of the PSPG box, residues at b and m also interact with phosphate, although much variation is observed at b.

Co-crystallized structures for xenobiotic, flavonol, and, most recently, anthocyanidin sugar acceptors have been solved (Table 4). No direct H bonding interactions were observed between UGT72B1 and the xenobiotic substrate trichlorophenol (Brazier-Hicks et al. 2007). Consistent with previous observations of GT-B fold glucosyltransferases, all but a single substrate interacting residue (D367 at x) are located in the N terminal domain. Sub-regions of substrate interacting residues are apparent but, as has been previously predicted, conservation is much lower than in the sugar donor domains. In both UGT78G1 and VvGT1, the orientation of flavonol binding in the acceptor pocket was quite similar. However, substrate binding in UGT78K6 was divergent in that the flavonol lies in the reverse direction within the pocket, which has implications for the regiospecificity of sugar transfer due to the steric protection of certain acceptor hydroxyl groups (Hiromoto et al. 2015). It was suggested that Asp181 (k) is a critical residue in determining the orientation of the substrate in the binding pocket. Even though there are considerable structural differences between flavonols and anthocyanidins, the substrate interacting residues are surprisingly the same within UGT78K6 except for an additional interaction with Asn137 (h).

Future directions for glycosyltransferase crystallography

While global similarities among structures of crystallized plant GTs are apparent, the nuances of structural changes that are necessary for specific function remain elusive. As additional crystal structures and mutational analysis results for critical residues of biochemically characterized proteins become available, it may become possible to strengthen the reliability of functional prediction/assignment. For example, it will be interesting to interpret ongoing mutational analyses of the flavonol-specific Cp3GT in relation to what has been observed with more promiscuous GTs. Because a similar situation exists between the substrate-specific 1,2RT and the more promiscuous 1,6RT, both of which are important in flavor chemistry as previously discussed, these findings have the potential to be useful in future biotechnology applications.

Concluding remarks and directions for future research

One recurring theme in plant GT research is the desire for better prediction of precise function of putative flavonoid GT enzymes identified through the myriad of gene, contig, and other databases. Toward that end, more information from direct biochemical testing of encoded protein activity will help identify parameters to receive greater weight in models and therefore help strengthen the confidence in predictions. This may seem an onerous task, but improvements in heterologous protein expression systems (e.g. Ohashi et al. 2015; Devaiah et al. 2016) and screening assays coupled with the many putative GTs annotated in citrus genome and EST databases provide significant opportunities.

Additionally, solving crystal structures of plant GTs with differing substrate, regio-, and/or stereo-specificity is important to further advance the field. For example, C. paradisi 3GT is strictly flavonol-specific (Owens and McIntosh 2009); the Clitoria ternatea 3GT strongly prefers anthocyanidins, with the most active tested flavonol having 8.2 % of the activity of the most active anthocyanidin (Hiromoto et al. 2015); and the more promiscuous Vitis vinifera 3GT prefers anthocyanidins but also glucosylates flavonols, with the most active tested flavonol having 48 % of the activity of the most active anthocyanidin (Ford et al. 1998). Obtaining a crystal structure for Cp3GT would complement information available for the other two enzymes and may further elucidate critical factors and aid in model refinement. This work is underway in our lab.