Introduction

Since their isolation from various plant tissues more than half a century ago, arabinogalactans (AGs) have been proposed to be linked to a large group of hydroxyproline-containing polypeptides known as AG proteins (AGPs) (Fincher et al. 1974, 1983). A typical AGP consists of a highly variable core protein enriched in hydroxyproline residues and decorated with various carbohydrate side chains; it is often anchored to the cell membrane by a glycosylphosphatidylinositol (GPI) lipid anchor, which is specified by a GPI anchor signal (Seifert and Roberts 2007; Nguema-Ona et al. 2013; Pereira et al. 2015). GPI-anchored AGPs can be released into the cell wall. Many AGPs are highly glycosylated, with more than 90% of their total molecular mass coming from glycan moieties consisting of (1 → 3)-β-galactan and (1 → 6)-β-linked galactan chains, which are thought to be important for the functional diversity of AGPs (Knoch et al. 2014).

Compared with many other protein families, only a handful of AGPs have been functionally characterised, mainly due to the complexity of their AG sugar chains and the heterogeneity of their core proteins. With the help of tools such as anti-AG chain-based immunomicroscopy, β-Yariv reagents, degradation enzymes that target specific parts of AG chains, chemical synthesis of specific structures of AG chains, and bioinformatics methods, the nature of AGPs is becoming increasingly clear. AGPs have crucial roles in multiple biological processes, including cell division, cellular communication, programmed cell death, embryogenesis, postembryonic pattern regulation, secondary wall deposition, organ abscission, plant–microbe interactions, plant growth, and reproductive processes (reviewed in Majewska-Sawka and Nothnagel 2000; Seifert and Roberts 2007; Ellis et al. 2010; Nguema-Ona et al. 2012, 2013; Pereira et al. 2015, 2016a). In reproductive processes, AGPs and their sugar chains are involved in gametophyte development and male–female interactions (Fig. 1).

Fig. 1. Arabinogalactan proteins (AGPs) involved in plant reproduction. AGPs and their functions are indicated (blue circles: male gametophyte development and function; orange circle: female gametophyte development; red circles: male–female communication). See the text for details of each function. AMOR is an AG sugar chain, for which the protein backbone is unknown.

In this review, we summarise recent progress in our understanding of the multiple functions of AGPs in plant reproduction, with a special focus on the analytical tools used to study AGPs, as well as the biosynthesis and functions of AG sugar chains.

Classification of AGPs

According to the structure of their core proteins, the AGP family can be generally categorised into two types: classical AGPs and non-classical AGPs (Showalter 2001). Classical AGPs often share a signal peptide at the N-terminus and always combine with a GPI anchor at the C-terminus; only the central domain consists of a high percentage of proline, alanine, serine, and threonine (PAST) residues (Schultz et al. 2000; Showalter 2001). Those with short mature protein backbones, usually around 10–13 amino acid residues, are separately designated as AG peptides (Schultz et al. 2000). Distinct from these two types of classical AGP, a third type is the lysine-rich AGP subfamily, which is characterised by a short lysine-rich region at the C-terminus (Sun et al. 2005).

Distinct from the classical AGPs, non-classical chimeric AGPs with different domains in their core proteins can be classified into three main subfamilies: fasciclin-like AGPs (FLAs), phytocyanin-like AGPs (PAGs), and xylogen-like AGPs (XYLPs) (Ma et al. 2017). FLAs contain at least one fasciclin-like domain, a secretion signal in the N-terminal, one or two AGP regions, and often a GPI anchor in the C-terminal (Johnson et al. 2003). Members of the PAG subfamily share a similar structure with phytocyanins, except for the presence of amino acids that bind copper in their plastocyanin-like domains (Mashiguchi et al. 2009; Cao et al. 2015). The XYLP subfamily members, originally isolated from differentiating xylem cells of in vitro Zinnia elegans cultures, contain AGP and non-specific lipid transfer protein domains and act as functional extracellular proteins in various plant tissues (Motose et al. 2004; Kobayashi et al. 2011). In addition, there are many other chimeric AGPs that cannot be grouped into these three subfamilies, which are designated as “other chimeric AGPs” (Ma et al. 2017). Apart from the classical and chimeric AGPs, those containing sequence characteristics of both AGPs and extensins (EXTs) are separately referred to as hybrid AGP/EXTs (HAEs) (Showalter et al. 2010). A recent study characterised 151 AGPs from the genome database of Arabidopsis thaliana, including 42 classical AGPs, 105 chimeric AGPs, and 4 HAEs (Ma et al. 2017; Fig. 2).

Fig. 2. Classification and expression of AGPs of Arabidopsis thaliana. Publicly available expression data of 130 of 151 AGP genes are shown as heat maps using Genevestigator (Hruz et al. 2008), and according to the classification of Ma et al. (2017). See the main text for details of AGP classification. The expression levels in the male reproductive organs/tissues are highlighted in blue, while those in female tissues are highlighted in orange.

AGPs are the jack of all trades of plant reproduction

AGP genes show differential expression in various plant tissues (Fig. 2). Most AGPs are expressed in plant reproductive tissues, and some classical and non-classical AGPs show predominantly high expression levels. For example, AGP6/11/22/23/24/40 and BCP1, which are classical AGPs, ENODL6/7 and BCB from the PAG subfamily, FLA3/14 from the FLA subfamily, members of XYLP, and other chimeric AGPs are highly expressed in male reproductive tissues, including the stamen and pollen. Meanwhile, FLA1/8/10/16, many PAG members such as ENODL1/11/12/13/14/15, the XYLP subfamily members XYP11, AT5G09370, and AT1G73560, and several other chimeric AGPs accumulate in the female parts of the pistil, carpel, stigma, ovary, and ovule (Fig. 2). The expression levels of several classical AGP members have been confirmed by AGP-specific promoters fused with fluorescent proteins or β-glucuronidase, and with fluorescence in situ hybridisation (Pereira et al. 2014). These expression patterns of AGPs are consistent with their roles in plant reproduction.

AGPs in gametophyte development

Land plants have a complex life history that encompasses haploid gametophytes and diploid sporophytes, both of which are strictly regulated by numerous molecular networks. In flowering plants, the male gametophyte (i.e. pollen grain) develops during two phases, microsporogenesis and microgametogenesis, in the stamen. Meanwhile, the female gametophyte (i.e. embryo sac) originating from the ovule experiences two similar phases, known as megasporogenesis and megagametogenesis (reviewed in Borg et al. 2009; Yang et al. 2010). In the following sections, we examine the various roles of AGP members in the development of male and female gametophytes in flowering plants.

In addition to studies using monoclonal antibodies and β-Yariv reagents, AGP mutants provide insights into the function of each AGP gene. AGP18, a gene encoding a lysine-rich classical AGP in A. thaliana, is indispensable for the development of female gametophytes (Acosta-Garcia and Vielle-Calzada 2004; Demesa-Arevalo and Vielle-Calzada 2013). For example, downregulation of AGP18 was associated with decreased fertility and aborted ovules, whereby functional megaspores could not undergo normal haploid mitosis and thus failed to initiate female gametogenesis (Acosta-Garcia and Vielle-Calzada 2004). Overexpression of AGP18 also reduced fertility due to abnormal maintenance of viable megaspores, indicating its function in megaspore selection (Demesa-Arevalo and Vielle-Calzada 2013).

AGPs are involved not only in female reproductive tissues; many have also been demonstrated to be essential for the differentiation of male parts. AGP6 and AGP11, two phylogenetically similar AGP genes specifically expressed in male reproductive tissues, are required for stamen and pollen grain development (Pereira et al. 2006; Coimbra et al. 2008; Levitin et al. 2008; Coimbra et al. 2009). In agp6 agp11 double mutants, a large proportion of pollen grains collapsed and pollen release failed, leading to reduced fertility (Levitin et al. 2008; Coimbra et al. 2009). A mutation in the AGP11 homolog BcMF8 in Brassica campestris also caused abnormal pollen shape and pollen tube growth, suggesting the functional conservation of this AGP branch within the family Brassicaceae (Lin et al. 2014).

Apart from the classical AGPs, chimeric AGPs of the FLA subfamily are also recruited during gametophyte development. FLA3, which encodes an FLA expressed in pollen grains and tubes, is involved in microspore development by affecting cellulose deposition (Li et al. 2010). When FLA3 expression was downregulated, male fertility was reduced, and many wrinkled and shrunken pollen grains were observed. In addition, half of the pollen grains showed defects in the pollen intine layer and were aborted during transition to the bicellular stage, indicating that FLA3 has a role in intine layer formation (Li et al. 2010). In the monocot rice, MICROSPORE AND TAPETUM REGULATOR1 (MTR1) encodes an FLA specifically expressed in male reproductive cells, and its mutant showed a complete male sterile phenotype, indicating that MTR1 is indispensable to tapetum and microspore development in rice (Tan et al. 2012).

AGPs in male–female interactions

Flowering plants (angiosperms) have evolved a set of mechanisms to guide pollen tubes precisely to enter the embryo sac and ultimately accomplish double fertilisation. It has been demonstrated that AGPs are widely involved in the interactions between the male and female parts during this process.

Pioneering studies were performed using Nicotiana species more than two decades ago. The transmitting-tract-specific (TTS) protein, an AGP from Nicotiana tabacum, stimulates pollen tubes to grow along the stylar transmitting tissue towards the ovary (Cheung et al. 1995; Wu et al. 1995). In the transmitting tissue, the extent of glycosylation of TTS proteins gradually increases from the stigmatic end to the ovarian end, which might provide positional cues for the pollen tube (Wu et al. 1995). When TTS expression is suppressed, the pollen tube growth rate is significantly reduced; this function is thought to be conserved at least within the genus Nicotiana (Cheung et al. 1995; Wu et al. 2000). Two other HAEs from tobacco, class III pistil EXT-like proteins (PELPIII) and the 120 kDa glycoprotein (120K), have been shown to be involved in interspecific incompatibility (II) and self-incompatibility (SI), respectively (Hancock et al. 2005; Eberle et al. 2013). PELPIII suppression leads to an increase in interspecific pollen tube growth, suggesting its role in the specific inhibition of pollen tubes in II (Eberle et al. 2013); meanwhile, downregulation of 120K results in failure of S-specific pollen rejection, favouring the idea that HAE-type AGPs play different roles in male–female interactions (Lind et al. 1994; Hancock et al. 2005). AGPs are also likely to have a role in acquisition of stigma receptivity in the apple flower (Losada and Herrero 2012).

Many recent studies have revealed the role of AGPs in male–female interactions in A. thaliana. A group of AGPs from the PAG subfamily, ENODL11/12/13/14/15, show high expression levels in the funiculus and ovules and redundantly control pollen tube reception (Hou et al. 2016). Single, double, and triple mutants of these ENODL genes did not exhibit any obvious phenotype; however, in the ovules of quintuple mutants, wild-type pollen tubes failed to release sperm cells, indicating that these genes function in coordinating male–female communication and facilitating double fertilisation (Hou et al. 2016). JAGGER encodes AGP4, which is specifically expressed in the transmitting tract, stigma, and integuments before fertilisation, and is responsible for the polytubey block (Pereira et al. 2016b). In a null mutant of JAGGER, the persistent synergid cell, which is normally disorganised after fertilisation, survived, leading to partial failure of the polytubey block (Pereira et al. 2016b).

AMOR has been identified as a bioactive AG sugar chain derived from the ovules of Torenia fournieri, which makes the pollen tube competent to respond to the ovular attraction signal (Mizukami et al. 2016). Parallels can be drawn between capacitation in animals and the activation of plant pollen tubes by pistil tissue, and AMOR was the first pistil molecule identified as responsible for this activation (Mizukami et al. 2016; Sankaranarayanan and Higashiyama 2018). AMOR activates the LURE (attractant peptides) signalling cascade in the pollen tube. The protein backbone of AMOR remains unknown. Interestingly, the terminal disaccharide structure, the β isomer of methyl-glucuronosyl galactose (4-Me-GlcA-β-1,6-Gal), has been shown to be both necessary and sufficient for AMOR activity (Mizukami et al. 2016). The strict structure–activity relationship of AMOR (Jiao et al. 2017) might imply the existence of a receptor in the pollen tube that can recognise its disaccharide structure.

Tools used to identify the roles of AGPs and their sugar chains

Recent progress regarding our understanding of AGPs and their sugar chains has relied both on classical methods and on the development of new methods. In this section, we summarise the tools used to study AGPs and their sugar chains.

In addition to the tools summarised below, molecular genetics (mutant analysis) and glycan structure analysis have been powerful in AGP research. As shown in the study of ENODLs (Hou et al. 2016), some AGPs are likely to act redundantly. Highly efficient CRISPR/Cas9-mediated gene knockout in Arabidopsis (e.g. Tsutsui and Higashiyama 2017) helps to explore the large AGP family. As examples of glycan structure analysis, Gane et al. (1995) analysed the structure of AG sugar chains from stigmas and styles of Nicotiana alata by nuclear magnetic resonance (NMR), and Tryfona et al. (2012) analysed the structure of AG sugar chains from leaves of Arabidopsis by enzyme degradation, mass spectrometry, and carbohydrate gel electrophoresis.

Monoclonal antibodies

The localisation of AG sugar chains in plant tissues can be visualised using monoclonal antibodies that detect different glycosidic AGP epitopes (Pennell and Roberts 1990; Pennell et al. 1991; Yates et al. 1996; Willats et al. 1998; Qin and Zhao 2006; Moller et al. 2008). More than two decades ago, the alteration in AGP levels detected by the MAC207 antibody was first proposed as a developmental switch occurring during primordia initiation in the stamen and carpel (Pennell and Roberts 1990). Many more monoclonal antibodies have since been used to characterise the distribution of AG sugar chains and other carbohydrates. For example, JIM4, JIM8, and JIM13 recognise an AG epitope similar to MAC207 (Pennell et al. 1991; Yates et al. 1996); LM6 and LM13 detect the location of arabinans and the pectin structure; and LM14 binds to the epitope of type II AG that may occur on AGPs (Willats et al. 1998; Moller et al. 2008). The immunolocalisation detected by these antibodies has been of great value for elucidating the functions of AGPs and their sugar chains, even in recent studies (Corral-Martinez et al. 2016; Da et al. 2017; Olmos et al. 2017; Suzuki et al. 2017). The reproductive organs of flowering plants show divergence in the distribution of specific sugar chains (e.g. Coimbra et al. 2007), implying that specific AG sugar chains might be involved in local cellular development and cell-to-cell communication. Epitopes of monoclonal antibodies have been examined with enzyme-linked immunosorbent assay (ELISA) using chemically defined sugar chains, and a microarray with various types of printed, synthesised sugar chains was developed recently to rapidly show the specificity of each monoclonal antibody (Ruprecht et al. 2017).

β-Yariv reagents

Yariv reagents are a group of synthetic phenylglycosides that were initially used for the purification of sugar-binding proteins and were later demonstrated to selectively bind to AGPs depending on the glycosyl residue (Yariv et al. 1967; Kitazawa et al. 2013). For example, β-Yariv reagents bind to the β-1,3-galactan main chain (Fig. 3), a structure conserved among AGPs (Kitazawa et al. 2013). These reagents not only visualise the distribution of AGPs, but also perturb their biological functions, which has led to the wide application of β-Yariv reagents to study the functions of AGPs during plant development and growth.

Fig. 3. AG sugar chain biosynthesis. A typical AG sugar chain structure (e.g. Tryfona et al. 2012) is shown with glycosylation enzymes. See the main text for details of the enzymes. The indications of KNS4/UPEX1, GALT29A, GALT31A, RAY1, FUT4, and FUT6 for some residues are omitted. GALT31A has been suggested to work cooperatively with GALT29A by forming a protein complex (Dilokpimol et al. 2014).

In mature ovary cryosections of T. fournieri, the β-1,3-galactan of AG is abundant in the ovule and cell layer of the placenta surface, and AG accumulation could only be detected by β-Yariv reagents rather than α-Yariv reagents, indicating a high specificity of this chemical compound (Mizukami et al. 2016). In Z. elegans, the function of xylogen, an AGP required for xylem differentiation, was impaired in the presence of β-Yariv but not α-Yariv reagents (Motose et al. 2004). In N. tabacum, the size of prefertilised ovules significantly decreased with increasing concentrations of β-Yariv, but not α-Yariv reagents, in the culture medium (Qin and Zhao 2006). Overall, studies based on β-Yariv reagents have improved our knowledge of the functions of AGPs.

Degradation enzymes of AG sugar chains

Many glycoside hydrolases of microbial and plant origin have been identified (reviewed in Knoch et al. 2014). These enzymes are important for the metabolism of AG sugar chains, including AG chain turnover. In addition, these enzymes are useful tools for studying the function of AG sugar chains. To examine the function of the sugar moiety of AGPs, acid treatment to remove the sugar moiety and conventional glycosidase treatment, which is not specific to AG sugar chains, have been used. However, it is debated whether acid treatment can damage protein backbones and whether glycosidase treatment, with a wide spectrum of activity, can affect non-AGP glycoproteins in the fraction. Using degradation enzymes targeted to specific structures of AG sugar chains, the function of AG sugar chains can be examined in a more reliable manner. Moreover, it is possible to characterise the critical structures of AG sugar chains with respect to their function. For example, AMOR in T. fournieri maintained its activity even when the β-1,6-galactan side chain (Fig. 3) was digested by endo-β-1,6-galactanase and α-L-arabinofuranosidase, but lost its activity when treated with β-glucuronidase, which removed the terminal β-glucuronosyl and 4-O-methyl-glucuronosyl residues (for details of these enzyme treatments, see also Kotake et al. 2004; Konishi et al. 2008; Takata et al. 2010). This led to identification of the terminal disaccharide structure, 4-Me-GlcA-β-1,6-Gal, as the structure responsible for AMOR activity (Mizukami et al. 2016).

Degradation enzymes are also useful for glycan structure analysis to release oligosaccharides from AG sugar chains (e.g. Tryfona et al. 2012) and to quantify sugars in AG fractions (e.g. Mizukami et al. 2016).

Chemical synthesis of AG sugar chains

Chemical synthesis of specific components of AG sugar chains is a powerful method to obtain large amounts of pure molecules. Chemically synthesised AG sugar chains have been used to elucidate epitopes of monoclonal antibodies (e.g. by ELISA and the aforementioned microarray). Chemical synthesis was also sufficiently powerful to show that 4-Me-GlcA-β-1,6-Gal has specific AMOR activity (Mizukami et al. 2016; Jiao et al. 2017), whereas the α isomer of the disaccharide showed less activity (approx. 1/100), and a disaccharide without the methyl group showed no AMOR activity. Chemically synthesised disaccharide AMOR is now commercially available from a Japanese company (Tokyo Chemical Industry Co Ltd). Chemical synthesis should be sufficiently powerful to search for other structures responsible for the bioactivity of AG sugar chains, although chemical synthesis of long and branched AG sugar chains, like those of native AGPs, is difficult.

Omics approaches to identify AGPs

The omics era has yielded substantial increases in big data in biology. Moreover, bioinformatics approaches have proven able to efficiently mine AGP sequences from published genomic and transcriptomic databases (Schultz et al. 2002; Showalter et al. 2010; Johnson et al. 2017a, b; Ma et al. 2017). Developed using a Perl script, the BIO OHIO software program, which is based on the proportion of PAST in each protein and specific amino acid motifs associated with known hydroxyproline-rich glycoproteins (HRGPs), has successfully identified 85 AGPs, moderately glycosylated EXTs, 18 lightly glycosylated proline-rich proteins, and other HRGPs from the A. thaliana genome (Schultz et al. 2002; Showalter et al. 2010). Based on the original version, an improved version of BIO OHIO (ver. 2.0) was released to screen for HRGPs from the complete genome of Populus trichocarpa (Showalter et al. 2016).

Based on BIO OHIO ver. 2.0, a small Python script, “Finding-AGP,” has successfully identified AGP gene family members from various lineages of the plant kingdom (Ma et al. 2017). The Finding-AGP program can not only discover AGPs with large proportions of PAST, but can also identify chimeric AGPs with smaller PAST percentages, covering more than half of all AGPs (Ma et al. 2017). However, more accurate, manual curation of the output results is still required, which would be time-consuming when dealing with large datasets. More recently, an automated motif and amino acid bias (MAAB) pipeline was introduced to classify HRGP sequences and has been applied to the 1000 Plants project, which includes sequencing data for over 1,000 plant species (Johnson et al. 2017a, b).
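The composition-based screening described above can be illustrated with a minimal sketch. It is not the BIO OHIO or Finding-AGP code; the function names and the 50% cut-off are assumptions chosen for illustration, and real pipelines also check signal peptides, GPI anchor signals, and sequence motifs before manual curation.

```python
# Illustrative sketch only: NOT the BIO OHIO or Finding-AGP implementation.
# The function names and the default threshold are assumptions for this example.

PAST = frozenset("PAST")  # proline, alanine, serine, threonine


def past_fraction(seq: str) -> float:
    """Return the fraction of residues in seq that are Pro, Ala, Ser, or Thr."""
    seq = seq.strip().upper()
    if not seq:
        return 0.0
    return sum(residue in PAST for residue in seq) / len(seq)


def is_candidate_agp(seq: str, threshold: float = 0.5) -> bool:
    """Flag a protein sequence whose PAST fraction reaches the (assumed) threshold."""
    return past_fraction(seq) >= threshold


if __name__ == "__main__":
    # Hypothetical toy sequences, not real AGP backbones.
    for name, seq in [("toy_rich", "APAPSPTPAPSK"), ("toy_poor", "MKKLLEEGGDKP")]:
        print(name, round(past_fraction(seq), 2), is_candidate_agp(seq))
```

Lowering the threshold would admit chimeric AGPs with smaller PAST percentages, at the cost of more false positives that then require manual curation.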

Biosynthesis of AG sugar chains

Post-translational attachment of AG sugar chains to the protein backbone occurs in the secretory pathway of cells. O-glycosylation is the predominant form of AGP glycosylation. Hydroxylated prolines (Hyps) in repetitive dipeptide motifs (e.g. alanine–proline, serine–proline, threonine–proline, and valine–proline) are first glycosylated with a galactose to form Gal-β-1,4-Hyp. This initiation is critical for the successful elongation of AG sugar chains. The enzymes responsible for this initiation, hydroxyproline O-galactosyltransferases (HPGTs), and their genes have been identified (Basu et al. 2013; Ogawa-Ohnishi and Matsubayashi 2015). HPGT1, HPGT2, and HPGT3 of A. thaliana are three potent HPGTs, and a triple loss-of-function mutant of these genes showed considerable reductions in β-Yariv-precipitated AGPs (Ogawa-Ohnishi and Matsubayashi 2015; Fig. 3). In this triple mutant, many plant growth and development processes were defective, including plant reproduction, suggesting that AG sugar chains are involved in various plant growth and development processes.
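As a rough illustration of the dipeptide motifs mentioned above, the following sketch lists the positions of Ala-Pro, Ser-Pro, Thr-Pro, and Val-Pro dipeptides in a protein sequence. The function name and the toy sequence are hypothetical. The scan only marks candidate sites: whether a given proline is actually hydroxylated and galactosylated cannot be inferred from sequence alone.

```python
import re

# Dipeptides named in the text: Ala-Pro, Ser-Pro, Thr-Pro, Val-Pro.
# These motifs cannot overlap each other, so a plain (non-lookahead) scan finds all of them.
DIPEPTIDE = re.compile(r"[ASTV]P")


def candidate_glycomotif_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start index, dipeptide) for each X-Pro motif in seq.

    Hypothetical helper for illustration; it does not predict hydroxylation.
    """
    return [(m.start(), m.group()) for m in DIPEPTIDE.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MAPSPTPVPK"  # hypothetical toy sequence
    for position, motif in candidate_glycomotif_sites(toy):
        print(position, motif)
```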

Other glycosyltransferases required for the step-wise elongation of AG chains have been identified, including KAONASHI4 (KNS4)/UNEVEN PATTERN OF EXINE1 (UPEX1) as a β-(1,3)-galactosyltransferase for main chains (Suzuki et al. 2017), GALT29A and GALT31A as β-(1,6)-galactosyltransferases for side chains (Dilokpimol et al. 2014), FUT4 and FUT6 as α-1,2-fucosyltransferases (Wu et al. 2010), and RAY1 as an arabinofuranosyltransferase (Gille et al. 2013) (Fig. 3; reviewed in Knoch et al. 2014). A loss-of-function mutant of AtGALT31A showed defects in embryogenesis (Geshi et al. 2013). Related to AMOR in T. fournieri, the β-glucuronosyltransferases AtGlcAT14A, B, and C, which add a glucuronic acid to galactan, have been identified in A. thaliana (Knoch et al. 2013; Dilokpimol and Geshi 2014). Detailed analysis of mutants defective in AtGlcAT14 genes is of interest to examine how these genes and the β-glucuronosyl residues of AG chains are involved in plant reproduction in A. thaliana. The identification of AMOR and the differential localisation of specific structures of AG sugar chains suggest that the functions of AGPs and their sugar chains are locally controlled by these AG biosynthesis enzymes.

Conclusions

From gametogenesis to fertilisation, AGPs are recruited into crosstalk between male and female parts, and between sporophyte and gametophyte generations. AGPs possibly act as plant wall components, calcium capacitors, and nutritional or signalling molecules (Tan et al. 2013; Lamport et al. 2018). However, it remains unknown how these AGPs and their sugar chains function during plant reproduction processes, which is of great academic interest.

In future research, elucidation of the mechanisms of AGPs, and especially their interactors, will require a combination of multidisciplinary tools, as partly summarised here. Decoding the precise structure, and dissecting the functions, of AG sugar chains linked to the core proteins will be challenging. However, achieving an integrated understanding of the roles of AGPs and their sugar chains will help unveil the mysteries of plant reproductive processes and could ultimately lead to agricultural advances.