Introduction

The high-molecular weight glutenin subunits (HMW-GS) are a set of well-conserved endosperm proteins synthesized in the grain of wheat and related grasses (Lawrence and Shepherd 1981; Shewry et al. 2003). In hexaploid wheat, they are encoded by the Glu-1 homoeoloci located on the long arms of chromosomes 1A, 1B and 1D, with each locus comprising a pair of tightly linked genes encoding the x-type (Glu-1-1) and the y-type (Glu-1-2) subunits (Lawrence and Shepherd 1981; Payne 1987). Qualitative and quantitative variation in the HMW-GS is associated with 45–70% of the variation in bread-making performance of European wheat, even though they only represent about 10% of grain protein (Branlard and Dardevet 1985; Payne et al. 1987, 1988). Because of their importance for wheat quality improvement, a substantial number of Glu-1 genes have been cloned (Forde et al. 1985; Sugiyama et al. 1985; Thompson et al. 1985; Halford et al. 1987; Anderson and Greene 1989; Anderson et al. 1989; Halford et al. 1992). Sequence analysis of these genes has shown that each contains a long repetitive region flanked by two highly conserved terminal non-repetitive domains. The repetitive region includes tripeptides, hexapeptides and nonapeptides, with the tripeptide motif restricted to the x-type subunits.

Although HMW-GS are clearly important for the determination of end-use quality, the number of high quality alleles is rather limited within the bread wheat genepool. Thus, some effort has been made to transfer alleles from related species, using either a wide crossing (Zhou et al. 1995) or a somatic hybridization approach. Our focus has been to take advantage of the latter route. We have so far succeeded in fusing the protoplasts of the bread wheat cultivar Jinan 177 (JN177) with UV-irradiated protoplasts of tall wheatgrass (Agropyron elongatum (Host) Nevski [Thinopyrum ponticum]) (Liu SW et al. 2007; Liu H et al. 2009). We have also attempted symmetric somatic hybridization, in which the tall wheatgrass protoplasts were not UV irradiated. Regenerant plants of this latter protoplast fusion resembled the tall wheatgrass parent, but inherited several introgression segments from wheat (Cui et al. 2009). Selections CU and XI (each derived from a single fusion cell) were particularly fertile. Here, we have investigated whether any novel HMW-GS alleles are present in these somatic hybrid regenerants.

Materials and methods

Plant materials

The plant material used in these experiments consisted of the tall wheatgrass and bread wheat biparents of the somatic fusion, and five R3 (third generation following the regeneration of the primary somatic hybrid) lines: R3CU1, R3CU2, R3CU3, R3XI1 and R3XI2. Karyotypic analysis indicated that the chromosome number of the in vitro cultured tall wheatgrass cells ranged from 60 to 70, while about 80% of R3CU1–R3CU3 and R3XI1–R3XI2 cells carried 66–70 chromosomes. Cui et al. (2009) showed, by a combination of cytological and marker analyses, that wheat chromosome segments were introgressed into A. elongatum chromosomes in the genomes of R1–R3 regenerants. Both JN177 and the introgression lines were grown in a greenhouse separated from other wheat cultivars to avoid uncontrolled outcrossing.

SDS-PAGE analysis of HMW-GS

The HMW-GS content of JN177 was obtained by SDS-PAGE analysis (Feng et al. 2004) of a crude protein extract of embryo-less half grains, while those of tall wheatgrass and the introgression lines were obtained from extracts of the whole seed, as described by Mackie et al. (1996).

Cloning and characterization of HMW-GS genes

Genomic DNA was extracted from the introgression line seedlings using the CTAB method (Murray and Thompson 1980). As the HMW-GS genes are intronless, genomic DNA was used as a PCR template to amplify the entire coding region. A pair of degenerate primers (P1: 5′-ATGGCTAAGCGGc/tTa/gGTCCTCTTTG and P2: 5′-CTATCACTGGCTa/gGCCGACAATGCG) was designed on the basis of published DNA sequences. P1 includes the HMW-GS start codon, and P2 includes the two conserved tandem stop codons. PCR amplification employed a high fidelity LA Taq polymerase (TaKaRa Biotechnology, Dalian, China) with the GC buffer provided for GC-rich templates. The amplification profile consisted of a denaturation step (95°C/5 min), followed by 28 cycles of 94°C/40 s and 68°C/4 min, and ending with an extension step (72°C/7 min). The amplicon was recovered from a 1% agarose gel, cloned into the pMD18-T vector (TaKaRa Biotechnology, Dalian, China), and transformed into E. coli DH10B competent cells. Sequencing was performed commercially (Invitrogen, Shanghai, China). Both amplification and cloning were repeated at least three times to minimize the risk of amplification and/or sequencing errors. Sequence analyses were carried out using the MEGA software package v3.1 (Kumar et al. 2004) along with standard programs available from NCBI (http://www.ncbi.nlm.nih.gov/Tools/) and EBI (http://www.ebi.ac.uk/Tools/sequence.html).
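The degenerate positions in P1 and P2 (written as c/t and a/g) mean that each primer is in practice a small pool of oligonucleotides. The following minimal Python sketch enumerates that pool from the notation used above. The function name expand_degenerate and the parsing approach are illustrative assumptions and were not part of the original analysis.

```python
import re
from itertools import product

# Primer sequences exactly as written in the text; "x/y" marks a degenerate position.
P1 = "ATGGCTAAGCGGc/tTa/gGTCCTCTTTG"
P2 = "CTATCACTGGCTa/gGCCGACAATGCG"


def expand_degenerate(primer):
    """Return every concrete oligonucleotide encoded by a primer in 'x/y' notation.

    Hypothetical helper for illustration only.
    """
    # Each token is either a single base or a set of alternatives such as "c/t".
    tokens = re.findall(r"[A-Za-z](?:/[A-Za-z])*", primer)
    choices = [token.upper().split("/") for token in tokens]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, primer in (("P1", P1), ("P2", P2)):
        variants = expand_degenerate(primer)
        print(f"{name}: {len(variants)} variants")
        for variant in variants:
            print("  5'-" + variant + "-3'")
```

Under this reading, P1 carries two degenerate positions and P2 one, so the pools contain four and two oligonucleotides, respectively.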

Bacterial expression of HMW-GS sequences

To express the mature introgression line HMW-GS proteins in E. coli, two sets of PCR primers (PF/PR1 and PF/PR2) were designed to amplify the sequences while excluding their signal peptides, and at the same time introducing cloning sites. The sequence of PF was 5′-ACCCATATGGAAGGTGAGGCCTCT-3′, that of PR1 was 5′-CTAGAATTCCTATCACTGGCTGGCCGA-3′ (for Hy4, Hy3, Hy7, Hy8) and that of PR2 was 5′-CTAGAATTCCTATCACTGGCTAGCCGA-3′ (for Hy6). PF contains an NdeI site and both PR1 and PR2 an EcoRI site. The amplicons were cloned into the expression vector pET-24a (Novagen, Shanghai, China), and transformed into E. coli BL21 (DE3) pLysS competent cells (Promega, Shanghai, China). Heterologous expression was induced using standard methods (Sambrook et al. 1989) and proteins were extracted by dissolving cells in SDS-PAGE sample buffer for SDS-PAGE analysis (Wan et al. 2002).
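As a sanity check on the primer design described above, the sketch below locates the NdeI (CATATG) and EcoRI (GAATTC) recognition sequences in PF, PR1 and PR2, and confirms that the reverse primers encode the two tandem stop codons immediately upstream of the EcoRI site on the sense strand. The helper names are illustrative assumptions. The recognition sequences are the standard ones for these two enzymes.

```python
# Primer sequences as given in the text (5' to 3').
PF = "ACCCATATGGAAGGTGAGGCCTCT"
PR1 = "CTAGAATTCCTATCACTGGCTGGCCGA"
PR2 = "CTAGAATTCCTATCACTGGCTAGCCGA"

# Standard recognition sequences of the two restriction enzymes.
NDEI = "CATATG"
ECORI = "GAATTC"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def site_positions(primer, site):
    """Return the 0-based start positions of every occurrence of site in primer."""
    width = len(site)
    return [i for i in range(len(primer) - width + 1) if primer[i:i + width] == site]


if __name__ == "__main__":
    print("NdeI in PF at:", site_positions(PF, NDEI))
    for name, primer in (("PR1", PR1), ("PR2", PR2)):
        sense = reverse_complement(primer)
        print(f"EcoRI in {name} at:", site_positions(primer, ECORI))
        # On the sense strand the tandem stops TGA TAG should precede the EcoRI site.
        print(f"{name} sense strand: {sense}; tandem stops present:",
              "TGATAG" + ECORI in sense)
```

PR1 and PR2 differ at a single base, which corresponds to the a/g degenerate position of the cloning primer P2.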

Results

The HMW-GS content of the biparents and the introgression lines

The HMW-GS content of JN177 is 1Bx7.1 + 1By9.1; 1Dx2.1 + 1Dy12.1, while that of the tall wheatgrass consists of nine distinct subunits. The HMW-GS composition of the five introgression lines R3CU1–R3CU3 and R3XI1–R3XI2 was overall very similar to that of the tall wheatgrass, although a small number of novel subunits could be identified (Fig. 1). A gel separation of the amplicons derived from each of the five introgression lines is shown in Fig. 2. After restriction enzyme digestion mapping and terminal DNA sequencing, we confirmed that at least 14 distinct sequences had been amplified from the introgression lines (designated Hx1–Hx6 and Hy1–Hy8, according to their type and length).

Fig. 1

HMW-GS profiles of some bread wheat/tall wheatgrass symmetric somatic hybridization derivatives and their parents. Lanes 1–5 R3CU1-R3CU3 and R3XI1-R3XI2 progeny, lane 6 tall wheatgrass, lane 7 JN177

Fig. 2

PCR amplification of HMW-GS coding sequences from some bread wheat/tall wheatgrass symmetric somatic hybridization derivatives and their parents. M Lambda DNA digested by EcoRI + HindIII, lane 1 JN177, lane 2 tall wheatgrass, lanes 3–7 R3CU1–R3CU3 and R3XI1–R3XI2 hybrid spike lines, respectively

Expression of the HMW-GS alleles in bacterial cells

Five of the cloned sequences with intact ORFs were successfully expressed in E. coli, namely pET-Hy3, pET-Hy4, pET-Hy6, pET-Hy7 and pET-Hy8. The SDS-PAGE mobility of four of these (Hy4, Hy6, Hy3 and Hy8) was similar to that of equivalent subunits extracted from tall wheatgrass seeds, but there was no match between the protein directed by pET-Hy7 and any tall wheatgrass seed-extracted protein (Fig. 3).

Fig. 3

SDS-PAGE analysis of heterologously expressed HMW-GS proteins. M protein molecular weight marker, lane 1 JN177, lane 2 tall wheatgrass, lane 3 bacteria harboring pET-Hy4 without IPTG induction, lanes 4–8 expression of the modified sequences of Hy4, Hy6, Hy3, Hy7 and Hy8. The HMW-GS gene-directed proteins induced by IPTG are indicated by arrows

Characteristics of the deduced amino acid sequences of HMW-GS alleles

The deduced peptide sequences of the 14 HMW-GS genes shared the expected primary structure. Each consisted of a 21-residue signal peptide, a conserved N-terminal region, a central repetitive domain and a conserved C-terminal region. The N-terminal regions of five of the eight y-type subunits comprised 105 residues, while those of Hy1, Hy4 and Hy5 comprised 104, 76 and 59 residues, respectively (Table 1). The N-terminal region of Hy1 lacked a glutamine residue when compared with Hy2, Hy3, Hy6, Hy7 and Hy8. This glutamine residue is also present in all the known x-type subunits. The conserved C-terminal regions of all 14 subunits comprised 42 residues, and their central repetitive regions included both hexapeptide and nonapeptide motifs; the six x-type subunits also contained the diagnostic GQQ tripeptide motif. Differences between these subunits and those already known in wheat lie mostly in single residue substitutions and in the insertion/deletion of repeat motifs in the central repetitive region. The deduced peptide lengths of these subunits varied from 817 (Hx1) to 295 (Hy7 and Hy8) residues (Table 1). Hy7 and Hy8 are also among the smallest known HMW-GS.

Table 1 Sequence characteristics of the HMW-GS genes isolated from a set of bread wheat/tall wheatgrass symmetric somatic hybridization derivatives

Relationships between HMW-GS sequences

A phylogenetic tree was assembled from the alignment of the full-length nucleotide sequences of the 14 HMW-GS genes and the HMW-GS genes from JN177 and tall wheatgrass (Liu SW et al. 2007, 2008) (Fig. 4). As expected, the y-type genes were separated from the x-type ones. The eight y-type and the six x-type sequences each clustered into three clades. The Hy1 sequence resembled that of tall wheatgrass Aey6, and was distantly related to the remaining seven y-type sequences, while Hy6 was more similar to Aey10 than to any of the other introgression line alleles. The other six y-type alleles of the introgression lines fell into three subgroups. Hy2 and Hy3 shared a close relationship with Aey8, while Hy7 was similar to Hy8, as was Hy4 to Hy5. Of the six x-type alleles, Hx1 was similar to Aex2, and Hx5 was similar to Aex5 and quite closely related to Hx2, Hx3 and Hx4, while Hx6 was an outlier within the x-type clade.

Fig. 4

Phylogenetic analysis of the HMW-GS sequences in some bread wheat/tall wheatgrass symmetric somatic hybridization derivatives and their parents. The phylogenetic tree was constructed according to the full-length DNA sequences using the MEGA software package v3.1. Hy1–Hy8 and Hx1–Hx6 came from the somatic hybridization derivatives; Aey1–Aey10 and Aex1–Aex5 came from tall wheatgrass; 1Bx7.1, 1By9.1, 1Dx2.1 and 1Dy12.1 came from JN177

Discussion

Asymmetric somatic hybridization between bread wheat and UV-irradiated tall wheatgrass is known to generate wheat-like introgression lines (Xia et al. 2003; Wang et al. 2005), among which a good deal of allelic variation for the HMW-GS genes has been identified (Liu SW et al. 2007; Liu H et al. 2009). Symmetric somatic hybridization involving the same biparents has produced fertile regenerants which more closely resemble tall wheatgrass in phenotype, but whose genomes still contain some introgressed wheat segments (Cui et al. 2009). We have thus obtained a contrasting, tall wheatgrass-like type of wheat/tall wheatgrass introgression line, which makes it possible to compare the variation of HMW-GS among introgression lines produced by symmetric and asymmetric somatic hybridization.

The Hx5 and Hy1 sequences each differed by only a small number of single nucleotides from a tall wheatgrass sequence (Aex5 and Aey6, respectively), and they had no close match with any of the HMW-GS sequences present in the other parent, JN177. Thus, it is likely that both were inherited from the tall wheatgrass parent, acquiring some point mutations as a result of the somatic hybridization process. Similarly, Hy2 and Hy3 resembled Aey8, except for the presence of additional repeat motifs and a few single-nucleotide polymorphisms. Hx1 and Hy6 resembled Aex2 and Aey10, respectively. When compared with Aex2, Hx1 gained three additional repeats but lost one, while, compared to Aey10, Hy6 gained one and lost two (Fig. 5). These four Glu-1 alleles may therefore have been derived via slippage of their corresponding parental gene during replication. The origin of the other six novel HMW-GS sequences of the introgression lines may have remained unidentified because A. elongatum is a cross-pollinating species carrying a large number of Glu-1 alleles, of which we have so far isolated only a limited number (Liu SW et al. 2008).

Fig. 5

Comparisons of the primary structure of HMW-GS sequences extracted from some bread wheat/tall wheatgrass symmetric somatic hybridization derivatives and their parents. a Hy2 and Hy3 versus Aey8, b Hy6 versus Aey10, c Hx1 versus Aex2

The addition or deletion of repeat motifs is thought to be an effective source of variation (Wells 1996), while Anderson and Greene (1989) have proposed that the evolution of HMW-GS genes proceeds via a combination of single base changes, deletions or additions within a repeat, single repeat changes and deletions or duplications of blocks of repeats. The formation of some of the novel alleles in the hybrids fits the mechanism proposed by Anderson and Greene (1989). Thus, the forms of novel HMW-GS alleles generated in these introgression lines are consistent with those of naturally emerging ones, although the process of their formation appears to be accelerated by the somatic hybridization procedure.

Both the present symmetric hybridization experiments and those based on asymmetric hybridization (Liu SW et al. 2007; Liu H et al. 2009) have produced regenerants carrying a number of novel HMW-GS alleles. Although the regenerants from JN177 callus have also been shown to produce novel HMW-GS alleles, the somaclonal mutation rate is much lower (Feng et al. 2004). Although some of the somatic hybridization-induced alleles may have arisen through somaclonal variation, it seems likely that many resulted from an interaction between the biparental genomes and/or the process of protoplast fusion itself; in the case of the asymmetric hybridization products, an additional source of variation is provided by the pre-fusion UV-irradiation treatment (Liu H et al. 2009). The analysis of certain newly synthesized alloploids has shown that when two genomes are united in a single nucleus, some instability ensues, which results in the elimination of genomic DNA sequences, the alteration of cytosine methylation patterns, and the reactivation of retrotransposons (Shaked et al. 2001; Ozkan et al. 2001; Madlung et al. 2002, 2005). Similar instability is, therefore, not unexpected in a somatic hybrid, and this has been demonstrated in wheat/tall wheatgrass combinations in the form of variation at microsatellite sequences, the elimination of DNA sequences, changes in the pattern of cytosine methylation and silencing or activation of homoeologous alleles (unpublished). The wide hybridization of different genomes might trigger a genomic shock that leads to these responses, which fits McClintock’s view of the genomic shock response as one that “initiates a highly programmed sequence of events within the cell that serves to cushion the effect of the shock” (McClintock 1984). Therefore, the response to genomic shock triggered by the merger and interaction of the biparental genomes might be mainly responsible for the sequence variation in the introgression lines.

In conclusion, we have shown here that variation in the HMW-GS sequences can be induced by symmetric as well as asymmetric somatic hybridization. Some of the novel alleles may make a positive contribution to wheat end-use quality, and the introgression lines should also be useful for investigating genome variation and evolution.