Introduction

Wheat (Triticum aestivum L.) is one of the three major grain crops and about 35% of the population in the world uses wheat as a staple food. Meanwhile, wheat is also an important source of plant protein for humans. Wheat flour can be widely used to make bread, noodles, biscuits, cakes, steamed bread, and other foods. Two kinds of major storage proteins gliadins and glutenins are present in the wheat endosperm, which differ in functional property and solubility in different solvents. Glutenins consist of high- and low-molecular-weight glutenin subunits (HMW-GS and LMW-GS), dissolve in dilute acid or dilute alkali, and endow dough viscoelasticity, while gliadins dissolve in 70–90% alcohol and confer wheat dough extensibility (Payne 1987; Wrigley 1996).

HMW-GS are encoded by the Glu-1 loci located on the long arm of chromosomes 1A, 1B and 1D of the first partial homologous group of wheat (Lawrence and Shepherd 1981). Each locus consists of two closely linked genes, encoding a larger x-type subunit with molecular weight 80–88 kDa and a smaller y-type subunit with molecular weight 67–73 kDa), depending on the composition of the motifs in the repeating region and the cysteine content (Shewry et al. 2003). Theoretically, hexaploid wheat can express 6 HMW-GS (1Ax, 1Ay, 1Bx, 1By, 1Dx, and 1Dy), but usually 3 (1Bx, 1Dx and 1Dy) to 5 (1Ax, 1Bx, 1By, 1Dx, and 1Dy) genes express due to the gene silencing (Yang et al. 2006). In general, the 1Bx, 1Dx, and 1Dy encoded genes all have expression activity, while the 1Ax and 1By encoded genes are sometimes silent in hexaploid wheat, especially the 1Ay subunit (Jiang et al. 2009). The allelic variations at Glu-1 loci are closely associated with gluten quality. For example, 1Dx5 + 1Dy10 (Glu-D1d), 1Bx7 + 1By8 (Glu-B1b), and 1Bx17 + 1By18 (Glu-B1i) subunits have higher quality score and confer superior dough strength and breadmaking quality, while 1Dx2 + 1Dy12 (Glu-D1a), 1Bx20 (Glu-B1e), and 1Bx7 + 1By9 (Glu-B1c) subunits are related to poor quality performance (Payne 1987; Shewry et al. 1992; Yan et al. 2009; Jiang et al. 2019; Guo et al. 2019). At present, more than 30 alleles at Glu-1 loci are identified, and most of these genes have been cloned and well characterized (Payne and Lawrence 1983). However, the allelic variation of HMW-GS genes in common wheat is limited and the identified alleles related to superior gluten quality still lack (Li et al. 2007). Thus, it is highly important to discover and clone new candidate Glu-1 genes in the related species for further wheat quality improvement (Liang et al. 2015).

Aegilops, the closest related genus of wheat, is an important genus of gramineous plants, mainly distributed in Europe and Asia, and contains a large number of valuable genetic resources (Ma et al. 2013). Aegilops consists of 22 species, including 7 genomes of S, M, C, U, N, D, and T, of which the S genome contains S (Ae. speltoides, which is considered to be the donor of the common wheat B genome). Sb (Ae. bicornis), Sl (Ae. longissima), Ss (Ae. searsii), and Ssh (Ae. sharonensis) genomes. As a diploid species of Aegilops, Ae. longissima (SlSl, 2n = 2x = 14) has rich genetic resources for wheat improvement such as powdery mildew resistance gene Pm13 (Alberto et al. 2003), eyespot resistance genes (Sheng and Murray 2013), and drought resistance (Zhou et al. 2016). In addition, Ae. longissima contains rich trace elements such as iron, zinc, copper, manganese, calcium, magnesium, and potassium, and the content of trace elements in the hybrid progeny was promoted through crossing with common wheat (Neelam et al. 2013). Ae. longissima also has abundant glutenin allelic variations and special HMW-GS that have shown potential values for wheat quality improvement (Wang et al. 2013; Garg et al. 2014; Zhu et al. 2015).

In this study, we aim to investigate the allelic variations at Glu-1 in Ae. longissima accessions and identify new 1Sl-encoded HMW-GS by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Then, allelic-specific polymerase chain reaction (AS-PCR) was used to clone their encoded genes. Ultimately, the complete sequences of 5 x-type and 5 y-type HMW glutenin subunits were obtained from Ae. longissima. The molecular characterization of these subunit genes showed special structural features and have potential application values for wheat quality improvement.

Materials and methods

Plant materials

Six Ae. longissima accessions kindly provided by Dr. Hsam from Plant Breeding Institute, Technical University of Munich, Germany, were used as materials, including PI542196, PI604108, PI604129-1, PI604130, PI604136-1, and PI60414. Chinese Spring (CS) with HMW-GS (null, 1Bx7 + 1By8, and 1Dx2 + 1Dy12), and CS-Ae. longissima 1Sl/1B substitution line CS-1Sl(1B) (null, 1Slx2.3* + Sly16*, 1Dx2 + 1Dy12) (Wang et al. 2013) were used as the control for HMW-GS identification.

HMW-GS extraction and SDS-PAGE

Seed glutenin extraction was according to the previously described method (Mackie et al. 1996). The albumins, globulins, and gliadins were first extracted and excluded from 15 mg seeds using 75 μL distilled water, 75 μL dilute salt solutions, and 120 μL 30% ethanol, respectively. Then, glutenins were extracted using commonly used glutenin extraction buffer (50% isopropanol, 80 μL Tris–HCl, pH 8.0) with 64.83 μL dithiothreitol (DTT) and 1.4% 4-vinylpyridine (v/v). Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was performed with Bio-Rad PRO-TEAN II XL equipment in 12% gel and electrophoresed at 15 mA for 2 h based on the previously described method (Yan et al. 2003a). The new HMW-GS identified in Ae. longissima were named according to the accession number and subunit types (x- and y-types).

DNA extraction and PCR amplification

Genomic DNA was isolated from seedling leaves at three-leaf stage with CTAB protocol following the procedure of Yan et al. (2004) with slight modifications. The primers for allelic-specific polymerase chain reaction (AS-PCR) were designed based on the previously cloned sequences (Wan et al. 2002), and AS-PCR amplification included an initial denaturation step at 95 °C for 3 min, followed by denaturation at 95 °C for 15 s, anneal at 58 °C for 15 s, extension at 72 °C for 2 min 30 s in total of 32 cycles, and a final extension step at 72 °C for 10 min.

Molecular cloning and sequencing

The PCR fragments of the expected size were purified from the gel using the Gel Extraction Kit (Omega). Purified products were ligated into CE-Vector (Vazyme, Nanjing, China) and transformed into Trans-2-blue receptor cells (TransGen, Beijing, China). DNA sequencing from three random positive clones of each PCR fragment was carried out by TaKaRa Biotechnology, Dalian, China.

Sequence alignment, SNP, and InDel identification

Multiple sequence comparison of HMW-GS nucleotide and protein sequences were completed by Bioedit 7.0.1.1. Based on sequence alignment, single-nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) were identified with software Bioedit 7.0.

MALDI-TOF/TOF–MS

To verify the authenticity of the cloned HMW-GS genes, the corresponding subunits were collected from SDS-PAGE gels and digested by trypsin. The digested protein subunits were further identified using matrix-assisted laser desorption ionization time-of-flight/time-of-flight mass spectrometry (MALDI-TOF/TOF–MS) based on the previously described method (Ge et al. 2012).

Construction of phylogenetic tree and estimation of divergence time

Clustal W program and MEGA 7.0 were used to construct phylogenetic tree and the estimation of the divergence times among HMW-GS genes was based on Gaut et al. (1996). The Clustal W program was used to make a multiple alignment with homologous nucleotide sequences; the alignment file was converted to software MEGA 7.0 by the complete coding regions of HMW-GS genes with bootstrap values 1000. The divergence times of HMW-GS genes were estimated using MEGA 7.0 with the evolution rate as 6.5 × 10−9 substitution/site year according to Allaby et al. (1999).

Results

Identification of novel HMW-GS in Ae. longissima L.

Six Ae. longissima accessions were consecutively reproduced and saved in our laboratory, which showed higher purity and genetic stability. SDS-PAGE analysis showed extensive allelic variations in HMW-GS compositions among 6 accessions, and 13 new HMW-GS were identified and named: including PI542196 (1Sl2x + 1Sl2y), PI604108 (1Sl6x + 1Sl6y1 + 1Sl6y2), PI604129 (1Sl16x + 1Sl16y), PI604130 (1Sl17x + 1Sl17x), PI604136 (1Sl23x + 1Sl23y), and PI604142 (1Sl25x + 1Sl25y) as indicated in Fig. 1. Compared to the HMW-GS compositions in CS and CS-1Sl(1B), the new subunits identified in Ae. longissima accessions had significantly slower electrophoretic mobilities and higher molecular weight than 1Bx7 + 1By9 subunits of CS. Particularly, six x-type and six y-type subunits moved clearly slower than the greater x-type and y-type subunits 1Slx2.3* + 1Sly16* from CS-1Sl(1B) substitution line, respectively. These special HMW-GS could serve as new genetic resources for wheat quality improvement.

Fig. 1
figure 1

Identification of new HMW-GS in Ae. longissima L. by SDS-PAGE. Lines 1–8: China Spring (CS), PI542196, PI604108, PI604129, PI604130, PI604136, PI604142, and CS-1Sl/1B)

Cloning and characterization of ten novel HMW-GS genes from Ae. longissima L.

The specific primers SxF/SxR and SyF/SyR were designed (Table 1) and used for amplify 1Sl-encoded x-type and y-type new subunit genes, respectively. AS-PCR results showed that the specific amplified fragments with about 2400–2800 bp and 2200–2300 bp were obtained (Fig. 2), which were corresponding to the sizes of x-type and y-type subunit genes, respectively. After collection, cloning, and sequencing of the specific amplification fragments, the complete open reading frames (ORFs) of five x-type (1S12x, 1S116x, 1S117x, 1S123x, and 1S125x) and five y-type (1S12y, 1S16y1, 1S116y, 1S117y, and 1S123y) HMW-GS genes were obtained. The full length of five x-type subunit genes ranged from 2469 to 2829 bp, encoding 821–941 amino acid (aa) residues, while five y-type subunit genes had 2250–2316 bp that encoded 749–771 aa residues. Website predictions (https://web.expasy.org/compute_pi/) showed that the deduced molecular weight of these encoding sequences were 87,475–100,010 Da for x-type subunits and 80,280–82,761 Da for y-type subunits, much higher than 1Bx7 (82,865 Da) and 1By8 (77,343 Da) subunits from common wheat (Table 2). Among them, the deduced molecular weight of the x-type (1S12x, 1S116x, and 1S125x) and all y-type subunits were also remarkably higher than those of previously characterized 1Slx2.3* (97,797 Da) and 1Sly16* (78,165 Da) subunits (Wang et al. 2013), respectively. All cloned ten novel HMW-GS genes were deposited in GenBank with the accession numbers from MH699857 to MH699866.

Table 1 The sequences of primers used to amplify HMW-GS genes
Fig. 2
figure 2

AS-PCR amplification of x-type and y-type HMW-GS genes from 1Sl genome of Ae. longissima L. with primers SxF + SxR and SyF + SyR. a PCR amplification of x-type HMW-GS from 1S genome of Ae. longissima L., 1, 2: 1S12x; 3, 4: 1S116x; 5, 6: 1S117x; 7, 8: 1S123x; 9, 10: 1S125x; M: DNA makers. b PCR amplification of y-type HMW-GS from 1S genome of Ae. longissima L., 1, 2: 1S12y; 3, 4: 1S16y1; 5, 6: 1S116y; 7, 8: 1S117y; 9, 10: 1S123y; M: DNA markers

Table 2 Comparison of the mature protein sequences of new novel HMW-GS

Multiple sequence alignment of the deduced amino acid sequences using BioEdit7.0 showed that all ten new subunits had typical structural features of HMW-GS, including the signal peptide (21 aa), conserved N-terminal (81–87 aa), central repeat region (582–792 aa), and C-terminus (42 aa) (Figs. 3, 4). In addition, each x-type subunit contained four conserved cysteines (three at the N-terminus and one at the C-terminus), while the y-type subunits had seven conserved cysteines (five at the N-terminus, one at the repetitive domain, and one at the C-terminus) as other HMW-GS from bread wheat and related species (Figs. 34; Table 2). The amino acid sequence repeat regions showed that five x-type new subunits contained a longer repetitive domain, with 8–11 nonapeptides (consensus GYYPTSPQQ and GYYPTSLQQ), 23–31 hexapeptides (consensus PGQGQQ and SGQGQQ), and 33–41 tripeptides (consensus GQQ). Obviously, the number of tripeptides in five x-type new subunits was much higher than 1Slx2.3* and 1Bx7 (Table 2). The amino acid sequence alignment indicated high similarity of these x-type subunits with 1Slx2.3* and 1Bx13 (Fig. 3), and the major differences occurred at the central repeat region. In particular, 1Sl2x, 1Sl16x, 1Sl25x, and 1Slx2.3* had 84 and 20 aa insertion at the position 669–752 and 778–797, respectively, which resulted in the length increase of the central repeat region. The five y-type new subunits also had a longer central repeat region compared to those from bread wheat (Fig. 4). Their repeat region length ranged from 582 aa to 604 aa, and contained 20–22 hexapeptides (consensus PGQGQQ) and 11–12 nonapeptides (consensus GYYPTSP/LQQ or GHYPASQQQ). The nonapeptide repeats were much higher than 1Sly16* (8) and 1By8 (4). Both 1Sl6y1 and 1Sl23y subunits had one 15 aa insertion at the position 454–468. Five x-type and five y-type subunits had 280–332 (34.1–35.4%) and 240–250 (32.0–32.6%) glutamine residues, respectively. Although they had a similar glutamine content with 1Slx2.3* (35.0%) + 1Sly16* (31.9%) and 1Bx7 (33.1%) + 1By8 (33.2%), two x-type subunits 1Sl2x (332) and 1Sl25x (331) as well as all y-type subunits contained more glutamine residues than 1Slx2.3* (329) + 1Sly16* (239) and 1Bx7 (261) + 1By8 (239) (Table 2).

Fig. 3
figure 3

Mutliple sequence alignment of the deduced amino acid sequences of 1Slx and 1Bx types HMW-GS and five newly cloned subunit genes in this work in Ae. longissima. The red box indicates the cysteine residues

Fig. 4
figure 4

Multiple sequence alignment of the derived amino acid sequences of 1Sly and 1By types HMW-GS and five newly cloned subunit genes in this work in Ae. longissima. The red box indicates the cysteine residues

Analysis of SNPs and InDels present in the cloned new HMW-GS genes

The coding sequences of five x-type and five y-type subunit genes were aligned with other 22 x-type and 18 y-type HMW-GS genes from common wheat and related species, respectively. The SNP and InDel variations present in these HMW-GS genes were identified and the results are listed in Tables 3 and 4. In general, all subunit genes showed extensive allelic variations. Five x-type genes had 64 SNPs at different positions, and the number of SNP per gene was between 12 and 15, of which 31 (48.4%) were resulted from the base transitions of G-A or C-T (Table 3). Five y-type genes contained 79 SNPs at different positions, and the number of SNP per gene was between 12 and 28, and 25 (31.64%) SNPs were resulted from the base transversions of C-A (Table 4). Moreover, more than half of the detected SNPs in five x-type genes and five y-type genes were nonsynonymous. In addition, two deletions at 96–105 and 3082–3089 in 1S116x, a deletion at 1355–1364 in both 1S16y1 and 1S123y, and one deletion at 2579–2591 in 1S12y were detected.

Table 3 The positions of SNPs identified in five new x-type HMW-GS genes
Table 4 The positions of SNPs identified in five new y-type HMW-GS genes

Tandem MS analysis and verification of the cloned HMW-GS genes from Ae. longissima L.

To verify the authenticity of the cloned HMW-GS new genes, we selected four subunits (1Sl17x, 1Sl25x, 1Sl6y1, and 1Sl17y) for further tandem mass spectrometry analysis. The corresponding subunits were collected from SDS-PAGE gel (Fig. 1) and digested by trypsin, and then used for MALDI-TOF/TOF–MS identification. The results showed that 2–6 protein peptides with 11–91 amino acid residues were well matched to their corresponding sequences specifically present in four subunits with high protein score CI % (Table 5). This further confirmed the reliability of the cloned genes.

Table 5 HMW-GS from Ae. longissima L. identified by MALDI-TOF/TOF–MS

Phylogenetic analysis

The complete encoding sequences of the cloned 10 new HMW-GS genes from Ae. longissima, and 10 x-type and 10 y-type subunit genes from common wheat and related species were used to construct neighbor-joining phylogenetic tree, including MF805726 (1Slx2.3*), JN001483 (1Slx2.9), X61009 (1Ax1), M22208 (1Ax2*), X13927 (1Bx7), KF733216 (1Bx14), AB263219 (1Bx17), AJ437000 (1Bx20), X03346 (1Dx2), X12928 (1Dx5), JN408502 (1Sly16*), JN001484 (1Sly2.3), FJ404595 (1Ay), AY245797 (1By8), X61026 (1By9), KF733215 (1By15), EF540765 (1By16), KF430649 (1By18), X12929 (1Dy10), and BK006459 (1Dy12). The results showed that the x-type and y-type HMW-GS genes in the phylogenetic tree were clearly separated into two branches (Fig. 5). Similar to the previous report (Wang et al. 2013), the x-type and y-type subunit genes from S1 genome showed closer relationships with 1Dx2 and 1Dx5 and 1Dy10 and 1Dy12 subunit genes from bread wheat, respectively.

Fig. 5
figure 5

The phylogenetic tree of 30 HMW-GS based on complete amino acid sequences. 1S12x* and 1S116y* is cloned from the S genome of Chinese Spring chromosome substitution line CS-1Sl/1B, in which the 1B chromosome was substituted by 1Sl from Ae. longissima

As shown in Tables S1 and S2, the calculated divergence times between five x-type and y-type HMW-GS genes isolated from Ae. longissima and those from A, B and D genomes of Triticum species ranged from 2.10 to 10.00 million years ago (MYA), from Aegilops species ranged from 1.50 to 7.48 MYA, and from Ae. longissima ranged from 0.99 to 6.51 MYA.

Discussion

It is well known that HMW glutenin subunits play an important role in the structure and properties of gluten, as well as the wheat processing quality. Considerable work demonstrated that the variations at Glu-1 loci in common hexaploid wheat are limited (Li et al. 2007), and only a few of HMW-GS genes have positive effects on gluten quality, although numerous subunits are identified (Payne et al. 1981). However, higher diversity in HMW-GS compositions in the wheat-related species is widely present (Ciaffi et al. 1993; Yan et al. 2003b). The Aegilops genus has rich storage protein gene resources (Liu et al. 2003; Yan et al. 2008), and particularly, some potential superior HMW-GS from Ae. longissima were identified (Wang et al. 2013; Garg et al. 2014; Zhu et al. 2015). In this work, we identified and cloned ten new HMW-GS genes in Ae. longissima, and they showed special molecular structural features with higher molecular weight and more glutamine residues that may confer superior gluten quality. Thus, the Sl genome has extensive allelic variations at Glu-1 loci and contains potential gene resources for wheat quality improvement.

In theory, high glutamine content can stabilize the polymeric structure of glutenin by forming more hydrogen bonds (Bekes et al. 1995), and therefore, the newly identified HMW-GS rich in glutamines may have a positive effect on dough strength. Meanwhile, these subunits contained more repetitive domains, leading to a higher molecular mass. In general, longer and more regular repeating units have a positive effect on the elasticity and viscosity of the dough through intermolecular interactions (Masci et al. 2000). The insertion of the intermediate repeat region fragments directly affects the function of glutenins (Hassani et al. 2005). Compared with other HMW-GS, 1Sl2x, 1Sl16x, and 1Sl25x have 84 and 20 aa insertions at positions 669–752 and 778–797, respectively (Table 3), and both 1Sl6y1 and 1Sl23y have one 15 aa insertion at the position 454–468 (Table 4), significantly increasing the length of the repeat region. In addition, the consensus of tripeptide and hexapeptide in repetitive domains has effects on wheat quality (Masci et al. 2000). Three new x-type subunits (1Sl2x, 1Sl16x, and 1Sl25x) have more tripeptide motifs in repetitive domain than 1Bx7 and 1Slx2.3*, while five y-type subunits had much more nonapeptide repeats than 1Sly16* and 1By8 (Table 2). D’Ovidio and Anderson (1994) reported that y-type HMW-GS plays the most important role in various major factors affecting flour technology. Thus, the molecular characterization of these special HMW-GS may improve the formation of gluten macropolymers and breadmaking quality, and have potential application values during wheat quality breeding.

Molecular markers based on SNPs have shown to be a fast and effectively for identifying HMW-GS genes in wheat breeding (Rafalski 2002). To date, various molecular markers for HMW-GS genes have been developed, which provide a powerful tool for wheat quality improvement (Ma et al. 2003; Liang et al. 2015). The HMW-GS genes identified and cloned from Ae. longissima showed abundant SNP and InDel variations (Tables 3, 4), which could facilitate to develop molecular markers used for marker-assisted selection in the early generations during quality improvement programs. HMW-GS genes can be successfully transferred to wheat by transgenic technology (Liu et al. 2011). Recently, wheat genetic transformation has made a greater progress (Wang et al. 2017), which can greatly shorten the breeding period. Therefore, the newly cloned special HMW-GS genes in this study are expected to be used for wheat quality improvement by combining with effective genetic transformation and marker-assisted selection.

Conclusion

In this study, ten special HMW-GS in Ae. longissima were identified and their complete encoding genes were cloned by AS-PCR. Molecular characterization showed that all subunits contained a longer central repetitive domain with more glutamine repeats and glutamine residues than those from common wheat, which could benefit the formation of superior breadmaking quality. The encoding sequences of these new genes had abundant SNP and InDel variations that could facilitate to develop specific molecular markers for gluten quality improvement. The 1Sl-encoded HMW-GS showed closer relationships with those at Glu-D1 locus of bread wheat as revealed by phylogenetic analysis, and they diverged from Triticum species at about 2.10-10.00 MYA. These novel HMW-GS genes have potential application values for breadmaking quality improvement of wheat through molecular marker development and marker-assisted selection as well as direct genetic transformation.