Introduction

Puroindolines are amphipathic proteins of ca. 13,000 Da, and share homology with grain softness protein (GSP), purothionins, lipid transfer proteins, and other members of the prolamin super-family of proteins (Shewry and Halford 2002). Puroindolines are present throughout the Triticeae tribe of the Poaceae (Gramineae), including wheat (Triticum spp.), rye (Secale spp.), barley (Hordeum spp.), and the wild relatives of wheat (Aegilops spp. and Triticum spp.). In Triticum spp., puroindolines exist as two expressed genes, Puroindoline a and Puroindoline b, on the distal end of the short arm of chromosome 5 (5DS). An exception to this general situation lies with the tetraploid (AABB) wheats (T. turgidum), which include cultivated durum (ssp. durum). Apparently, during the allotetraploidization forming T. dicoccoides, the wild ancestor of cultivated durum, both the A- and B-genome Puroindoline loci were eliminated (Chantret et al. 2005). Consequently, hexaploid wheat, T. aestivum, possesses puroindoline a and b on the D-genome (contributed by Ae. tauschii during the allohexaploidization of this taxon), but lacks homoeologous loci on the A- and B-genomes. In rye (Secale cereale), the homologous gene is referred to as Secaloindoline; the number of loci is somewhat in question (e.g., Simeone and Lafiandra 2005). In barley (Hordeum vulgare), three genes have been identified (Hordoindoline a, Hordoindoline b1, and Hordoindoline b2) (Darlington et al. 2001). Puroindoline-like genes are also present in the Aveneae tribe, where they are referred to as Avenoindolines (Gautier et al. 2000).

Extensive genetic surveys, cytological analysis, and transformation experiments in wheat and rice (Oryza sativa) have demonstrated that puroindoline a and b act to create soft kernel texture (also referred to as “grain hardness”) (Bhave and Morris 2008a,b, Morris 2002, Morris and Bhave 2008). The presence of both Puroindoline a-D1a (Pina-D1a) and Puroindoline b-D1a (Pinb-D1a) is associated with soft kernel texture in soft wheat (T. aestivum) varieties. The absence of Pina and Pinb in durum wheat is associated with very hard kernel texture. Research has shown that the two puroindolines act in concert to effect kernel softness, and that the absence of one, or a perturbation in the primary structure (due to SNPs) of either, results in a “hard” kernel texture that is intermediate between soft wheat and durum.

The first documented mutation in T. aestivum was a SNP which conferred a glycine to serine change in Pinb (Giroux and Morris 1997). Subsequent research focused on hard kernel varieties that did not possess this mutation; this work led to the discovery of a second mutation, a large deletion in Pina (Giroux and Morris 1998). Since these reports, a number of additional mutations have been found and all are associated with an increase in kernel hardness (Bhave and Morris 2008a). To date, it seems clear that these are all post-hexaploidization mutations to the original Pina and Pinb contributed by Ae. tauschii. Due to the evolutionary bottleneck that occurred during the allohexaploidization event(s), T. aestivum is limited in the amount of variation present in Pina and Pinb. However, Ae. tauschii appears to be a rich source of variation in the puroindoline genes. Direct access to this gene pool is difficult but is facilitated through the use of synthetic hexaploids. In this regard, Gedye et al. (2004) surveyed a large number of synthetic hexaploids from CIMMYT and found that various puroindoline sequences were associated with softer kernel texture.

The evolution and duplication of the puroindoline genes presents an interesting system. Based on the analysis of Massa and Morris (2006), Gsp-1 appears to be the most ancient of the Pina, Pinb, and Gsp-1 genes. At the Hardness locus on 5DS, all three genes are localized within 39 kb of each other in the order Gsp-1, Pina, and Pinb (Chantret et al. 2005). Additional puroindoline-related gene duplication events have apparently occurred. Chantret et al. (2005) reported the existence of two other related gene sequences, Pinb-relic and PseudoPinb. Pogna et al. (2002) and Gazza et al. (2006) have provided intriguing reports of puroindoline genes detected in a few related durum wheat varieties, and a Pinb-relic in tetraploid Aegilops ventricosa.

Recently, Wilkinson et al. (2008) reported three new puroindoline gene-like sequences in hexaploid wheat. The genes shared close homology with Pinb and hence were designated Pinb “variants”. Pinb variant 1 was mapped through linkage analysis to the long arm of chromosome 7A, and the locus was designated Pinb-A2. These authors conclude that, “it is probable that all three variants in bread wheat are encoded by the same Pinb-2 locus on 7A”.

In the present report, we provide evidence to show that the three variants reported by Wilkinson et al. (2008) exist, in part, as a homoeologous series, wherein variant 1 (Pinb-2v1) resides on the long arm of chromosome 7D (7DL), variant 2 (Pinb-2v2) resides on 7BL, and variant 3 resides on 7B; variant 3 may be allelic to variant 2. Finally, we report the discovery of a fourth variant (Pinb-2v4) which resides on 7AL.

Materials and methods

Flanking genomic sequences of Pinb variant genes were isolated using PCR. Genomic sequences of the Gsp-1, Pina-D1a and Pinb-D1a genes from T. aestivum, and the Hinb-1 and Hinb-2 genes from H. vulgare were compared, and conserved regions flanking the coding sequences were used to design degenerate primers. Four different primer combinations were employed. Upstream flanking sequence was obtained using either pin5tail1S1 or pin5tail2S1 in conjunction with primer pinb-2vU reverse (Table 1). Downstream flanking sequence was obtained using the primer pinb-2vU forward in conjunction with either pin3tail1A1 or pin3tail2A1. Genomic DNA from Winsome, a Pacific Northwest hard white spring wheat cultivar (PI613177) was used as template. PCR conditions are listed below. The mixed PCR products were cloned using the pGEM-T system (Promega) according to the manufacturer’s protocols. Approximately 24 individual clones from each primer combination-reaction were sequenced.
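
As an illustration of what a degenerate primer represents, the following minimal sketch expands a primer written with IUPAC ambiguity codes into the pool of concrete oligonucleotides it encodes. The example primer is hypothetical and is not one of the primers in Table 1; the function names are likewise introduced here only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the primer pool."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


# Hypothetical primer for illustration only (NOT taken from Table 1).
example_primer = "ATGRCYTT"
pool = expand_degenerate(example_primer)
print(degeneracy(example_primer), pool)
```

A higher degeneracy widens the range of templates that can be amplified but dilutes each individual oligonucleotide in the reaction, which is one reason degenerate primers are usually placed in well-conserved flanking regions.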

Table 1 PCR primers used in generating Puroindoline b-2 variant gene sequences in wheat

Seed of genetic stocks was obtained from B.S. Gill and J. Raupp of the Wheat Genetic and Genomic Resources Center, Manhattan, KS. Seed of Winsome was obtained from John Burns, Washington State University Uniform Cereal Variety Testing Program.

Genomic DNA of genetic stocks was extracted from leaf tissue of individual seedlings or from individual dry mature half kernels according to Chen et al. (2006). PCR primer sequences were designed using Primer 5.0 software. Reactions were performed in 25 µL volume containing 100 ng of genomic DNA, 10 pmol of each primer, 250 µM of each dNTP, 1× Taq DNA polymerase reaction buffer containing 1.5 mM of MgCl2 and 0.5 unit of Taq DNA polymerase (Promega, Madison, WI). The cycling conditions were 94°C for 5 min followed by 35 cycles of 94°C for 50 s, 45–65°C for 50 s (primer-specific annealing temperatures; see Table 1), 72°C for 1 min, followed by a final 10-min extension at 72°C. An aliquot (8 µL) of the PCR products was analyzed on 1.5% (w/v) agarose gels, stained with ethidium bromide, and visualized with UV light. Amplified fragments were purified by ExoSAP-IT according to the manufacturer’s instructions. Sequencing reactions were performed with the Big Dye Terminator Version 3.1 Cycle Sequencing Kit; sequencing was conducted with an Applied Biosystems 3100 Genetic Analyzer (PerkinElmer Applied Biosystems Division, Foster City, CA). Multiple sequence alignments were analyzed with DNAMAN Version 6.0 and graphic data were analyzed with FinchTV Version 1.4.0.
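
The reaction set-up and cycling program above can be checked with simple arithmetic. The sketch below uses only the values stated in the text; ramp times between temperatures depend on the instrument and are ignored, so the total is a lower bound on run time. All names are illustrative.

```python
# Values from the cycling program described in the text; ramp times are ignored.
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 35
CYCLE_STEPS_S = {"denature_94C": 50, "anneal_45_to_65C": 50, "extend_72C": 60}
FINAL_EXTENSION_S = 10 * 60


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


def concentration_uM(amount_pmol: float, volume_uL: float) -> float:
    """1 pmol per µL equals 1 µmol per L, so pmol / µL gives µM directly."""
    return amount_pmol / volume_uL


def concentration_ng_per_uL(mass_ng: float, volume_uL: float) -> float:
    """Template concentration in the final reaction volume."""
    return mass_ng / volume_uL


hold_s = total_hold_time_s()
print(f"programmed hold time: {hold_s} s ({hold_s / 60:.1f} min)")
print(f"each primer: {concentration_uM(10, 25)} µM")
print(f"template DNA: {concentration_ng_per_uL(100, 25)} ng/µL")
```

Under these assumptions each primer is present at 0.4 µM and the template at 4 ng/µL, and the program holds for 6,500 s (about 108 min) in total.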

Results

Cloning and sequence of Puroindoline b-2 variant genes

Conserved sequences flanking puroindoline-like genes from wheat and barley were used to design various combinations of primers aimed at generating amplicons that contained the coding sequence of Pinb-2 and variable amounts of additional 5′ and 3′ sequence (Table 1). Forty-three individual clones were sequenced and found to match perfectly the coding sequence of Pinb-2 variants 1, 2 and 3 reported by Wilkinson et al. (2008). Pinb-2v1 was represented by 17 clones, Pinb-2v2 by 10 clones, and Pinb-2v3 by 13 clones. These clones also provided approximately 150 bp of additional 5′ flanking sequence. The coding sequence of each of Pinb-2v1, Pinb-2v2 and Pinb-2v3 is 453 bp (NB: Wilkinson et al. 2008 incorrectly lists the NCBI accession identifiers for Pinb-2v2 and Pinb-2v3; their correct accession identifiers are AM944732 and AM944733, respectively). Our extended sequences are archived in the NCBI as GQ496616 (Pinb-2v1), GQ496617 (Pinb-2v2) and GQ496618 (Pinb-2v3).

Figure 1 presents an alignment of Pinb-2v1, Pinb-2v2, and Pinb-2v3 compared with Pinb-D1a. All four genes showed very high homology. For the best alignment, two single-base gaps were introduced into all three of the Pinb-2 genes compared to Pinb-D1a in the sequence 5′ to the ATG initiation site (gaps at −141 and −84; Fig. 1). Additionally, Pinb-2v1 and Pinb-2v2 exhibited an 11-base deletion compared to Pinb-D1a and Pinb-2v3 (−65 to −76, Fig. 1). The only other indel was a 6-bp insertion in the coding sequence of all the Pinb-2 variants near the 3′ terminus, relative to Pinb-D1a (427–432; Fig. 1).

Fig. 1

DNA sequence alignment of Puroindoline b-2 gene variants Pinb-2v1, Pinb-2v2, Pinb-2v3, and Pinb-2v4 from wheat (T. aestivum). Pinb-2 variants are aligned with Pinb-D1a. Polymorphic sites are shaded in gray, hyphens indicate gaps introduced to improve alignment, gene-specific primer sites are indicated with underlining, numbering is based on the ATG start site as “1”, deduced translation ends with TGA at position 451

Discovery of Puroindoline b-2 variant 4

As mentioned above, degenerate primers located in the 3′ flanking region were used in combination with pinb-2vU forward (Table 1). PCR products from these reactions were cloned and sequenced. Analysis revealed an extremely unequal distribution of sequences in which variants 1, 2 and 3 were not represented. Perhaps this was due to the lack of sequence conservation in this area for these genes. Of the 41 clones sequenced, 22 were identical to Pina-D1 and 13 were Pinb-D1. The remaining six clones carried a novel variant sequence. The validity of this sequence, which was designated Pinb-2v4 (variant 4), was confirmed by an additional round of degenerate PCR and cloning. The Pinb-2v4 sequence is archived in the NCBI as GQ496619.

Alignment with the other Pinb genes showed that it shared high homology and was most closely related to Pinb-2v1 (Table 2). Due to the primer strategy, an additional 44 bp of sequence was obtained extending 3′ past the TGA stop codon (Fig. 1). Pinb-2v4 sequence was not obtained in the previous PCR and cloning experiments using the 5′ extension primers pin5tail1S1 and pin5tail2S1 (data not shown), possibly the result of polymorphism between variants 1, 2 and 3 versus variant 4 in this region. The 3′ extension showed high homology with Pinb-D1a, differing at only eight sites; a single-base indel was introduced in Pinb-D1a for best alignment.

Table 2 Homology matrix of four Puroindoline b-2 variant sequences and Pinb-D1a (percentages)

The sequences of all four variants from position 22–453 (Fig. 1) were evaluated for similarity; Pinb-D1a was also included for comparison. Pinb-2v1 and Pinb-2v4 exhibited the highest level of sequence homology, 95% (Table 2). All four puroindoline b variant sequences were at least 91% homologous among themselves. Consistent with this similarity, all four were 72–73% homologous to Pinb-D1a (Table 2).
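
A pairwise homology matrix of the kind summarized in Table 2 can be sketched as percent identity over aligned positions. The example below is a minimal illustration, not the DNAMAN procedure used here: the gap treatment (a gap opposite a base counts as a mismatch, and columns gapped in both sequences are skipped) is an assumption, and the three short sequences are invented toy data rather than Pinb sequences.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identical columns between two aligned, equal-length sequences.

    Assumed gap handling: columns where both sequences have '-' are skipped;
    a gap opposite a base is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared


# Toy aligned sequences for illustration only (not real Pinb data).
toy_alignment = {
    "seqA": "ATGAAGACCTTA",
    "seqB": "ATGAAGACCTTC",
    "seqC": "ATGAAAAC-TTA",
}

identity_matrix = {
    (n1, n2): percent_identity(toy_alignment[n1], toy_alignment[n2])
    for n1, n2 in combinations(toy_alignment, 2)
}

for (n1, n2), value in identity_matrix.items():
    print(f"{n1} vs {n2}: {value:.1f}%")
```

Different gap conventions can shift the reported percentages by a point or two, so values from different programs are not always directly comparable.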

Sequence comparisons at the amino acid level

The PINB-2 variants shared many notable features with PINB-D1a (Fig. 2). Notably, the cysteine backbone comprising 10 residues was perfectly conserved among the four translated protein sequences and PINB-D1a. The tryptophan motif, which is WRWWKWWK in PINA-D1a and WPTKWWK in PINB-D1a, is reduced to only KWWK in the four PINB-2 variants. Interestingly, the KWWK is conserved throughout all translated proteins. At the C-terminus, none of the PINB-2 variants exhibited the W which terminates PINB-D1a. The PINB-2 variants ended with either the conserved GYY or GYYY. Two single-residue gaps were introduced into PINB-D1a at 143 and 146 to achieve best alignment with the PINB-2 variants. Among the four variants, 21 sites were polymorphic and 129 were conserved.

Fig. 2

Deduced amino acid sequence alignment of Puroindoline b-2 gene variants Pinb-2v1, Pinb-2v2, Pinb-2v3 and Pinb-2v4 from wheat (T. aestivum). PINB-2 variants are aligned with PINB-D1a. Polymorphic sites are shaded, hyphens indicate gaps introduced to improve alignment

Haplotyping varieties for Puroindoline b-2 variant genes

Based on the sequences obtained from the genomic PCR clones from Winsome (Fig. 1), gene-specific PCR primers were designed for all four variants (Table 1; Fig. 1). These gene-specific primers were used to haplotype cvs. Chinese Spring, Cheyenne, Recital, Wichita and Winsome. Results showed that Chinese Spring was Pinb-2v1/Pinb-2v2/Pinb-2v4, as was Recital. Cheyenne and Wichita were Pinb-2v1/Pinb-2v3/Pinb-2v4. Twelve individual seeds of Winsome were found to be uniformly Pinb-2v1/Pinb-2v3/Pinb-2v4. This result was not fully consistent with the results obtained from cloning, in which Pinb-2v2 was also recovered from Winsome. In summary, among this small population of varieties, all possessed Pinb-2v1 and Pinb-2v4, whereas Pinb-2v2 and Pinb-2v3 were variable, and possibly allelic.
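
The haplotype summary can be expressed as simple set operations. The sketch below encodes the gene-specific PCR results stated in the text and derives which variants are shared by all five cultivars, which are variable, and whether Pinb-2v2 and Pinb-2v3 ever occur together. Mutual exclusivity in a sample this small is consistent with allelism but does not demonstrate it.

```python
# Haplotypes from gene-specific PCR, as reported in the text.
CS_TYPE = frozenset({"Pinb-2v1", "Pinb-2v2", "Pinb-2v4"})
CNN_TYPE = frozenset({"Pinb-2v1", "Pinb-2v3", "Pinb-2v4"})

haplotypes = {
    "Chinese Spring": CS_TYPE,
    "Recital": CS_TYPE,
    "Cheyenne": CNN_TYPE,
    "Wichita": CNN_TYPE,
    "Winsome": CNN_TYPE,
}

# Variants present in every cultivar versus those present in only some.
shared = frozenset.intersection(*haplotypes.values())
variable = frozenset.union(*haplotypes.values()) - shared

# Do Pinb-2v2 and Pinb-2v3 ever appear in the same cultivar?
co_occurring = any(
    {"Pinb-2v2", "Pinb-2v3"} <= genes for genes in haplotypes.values()
)

print("shared:", sorted(shared))
print("variable:", sorted(variable))
print("Pinb-2v2 and Pinb-2v3 co-occur:", co_occurring)
```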

Physical mapping of Puroindoline b-2 variant genes

All 38 ditelosomic aneuploids available in the Chinese Spring background were evaluated using PCR and the gene-specific primers (Table 1). Results indicated that in 35 of the lines, PCR product was produced using gene-specific primers for Pinb-2v1, Pinb-2v2, and Pinb-2v4, the same as in Chinese Spring (Table 3). In each of the remaining three lines, the gene-specific primers failed to produce one product. In Chinese Spring ditelosomic 7AS (CS Dt-7AS), which lacks the long arm of 7A, no product was produced using the gene-specific primers for Pinb-2v4. In CS Dt-7BS, no product was produced for Pinb-2v2, and in CS Dt-7DS, no product was produced for Pinb-2v1. In summary, these results indicate that the physical location of Pinb-2v1 is 7DL, that of Pinb-2v2 is 7BL, and that of Pinb-2v4 is 7AL. No information could be gained from this set of aneuploids regarding the location of Pinb-2v3 since this sequence was not present in Chinese Spring.
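
The deletion-mapping logic can be made explicit with a short sketch. A ditelosomic line retains only one arm of a given chromosome, so a gene that fails to amplify in that line (but amplifies in euploid Chinese Spring) is assigned to the missing arm. The negative results below are those stated in the text; the function name is illustrative.

```python
def missing_arm(retained_arm: str) -> str:
    """Map the arm retained in a ditelosomic line to the arm that is absent.

    Example: a Dt-7AS line retains 7AS, so it lacks 7AL.
    """
    chromosome, arm = retained_arm[:-1], retained_arm[-1]
    return chromosome + {"S": "L", "L": "S"}[arm]


# Ditelosomic lines in which one gene-specific PCR gave no product (from the text).
# Keys are the retained arm; values are the gene that failed to amplify.
no_product = {
    "7AS": "Pinb-2v4",
    "7BS": "Pinb-2v2",
    "7DS": "Pinb-2v1",
}

# A gene absent from a line is inferred to reside on the arm that line lacks.
inferred_location = {
    gene: missing_arm(retained) for retained, gene in no_product.items()
}

for gene in sorted(inferred_location):
    print(gene, "->", inferred_location[gene])
```

The inference holds only if the stock identities are correct and the primers are specific, which is why the results obtained with the universal primers and the nullisomic–tetrasomic lines serve as independent checks.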

Table 3 Puroindoline b-2 variant gene amplification in various genetic stocks of T. aestivum

PCR amplification and sequencing using the ‘universal’ primers pinb-2vU indicated that 35 CS ditelosomic lines had the same mixed PCR products as did CS, whereas CS Dt-7DS had a PCR product of mixed Pinb-2v2 and Pinb-2v4 sequences, CS Dt-7BS had a PCR product of mixed Pinb-2v1 and Pinb-2v4 sequences, and CS Dt-7AS had a PCR product of mixed Pinb-2v1 and Pinb-2v2 (data not shown). These results further support that Pinb-2v1 is located on 7DL, Pinb-2v2 on 7BL, and Pinb-2v4 on 7AL in Chinese Spring.

All of the chromosome group 7 nullisomic–tetrasomic lines of Chinese Spring were analyzed using the gene-specific primers. Results showed that Pinb-2v1 was present in N7A-T7B, N7A-T7D, N7B-T7A, and N7B-T7D, but absent in N7D-T7A and N7D-T7B (Table 3). Pinb-2v2 was present in N7A-T7B, N7A-T7D, N7D-T7A, and N7D-T7B, but absent in N7B-T7A and N7B-T7D. Pinb-2v4 was present in N7B-T7A, N7B-T7D, N7D-T7A, and N7D-T7B, but absent in N7A-T7B and N7A-T7D. In summary, the results support that Pinb-2v1 is on 7D, Pinb-2v2 on 7B, and Pinb-2v4 on 7A. Moreover, sequencing PCR product of all group 7 nullisomic–tetrasomic lines of Chinese Spring produced using the ‘universal’ primers Pinb-2vU indicated that N7A-T7B and N7A-T7D produced mixed amplification of Pinb-2v1 and Pinb-2v2, N7B-T7A and N7B-T7D produced mixed amplification of Pinb-2v1 and Pinb-2v4, and N7D-T7A and N7D-T7B produced mixed amplification of Pinb-2v2 and Pinb-2v4 (data not shown).
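
The same reasoning applies to the nullisomic–tetrasomic lines: a gene should be absent exactly from the lines that are nullisomic for the chromosome carrying it. The sketch below encodes the presence data stated in the text and infers the chromosome for each variant; it raises an error if the absences do not point to a single chromosome. Names are illustrative.

```python
ALL_LINES = {"N7A-T7B", "N7A-T7D", "N7B-T7A", "N7B-T7D", "N7D-T7A", "N7D-T7B"}

# Lines in which each gene-specific PCR product was present (from the text).
present = {
    "Pinb-2v1": {"N7A-T7B", "N7A-T7D", "N7B-T7A", "N7B-T7D"},
    "Pinb-2v2": {"N7A-T7B", "N7A-T7D", "N7D-T7A", "N7D-T7B"},
    "Pinb-2v4": {"N7B-T7A", "N7B-T7D", "N7D-T7A", "N7D-T7B"},
}


def nullisomic_chromosome(line: str) -> str:
    """Extract the missing chromosome from a line name, e.g. 'N7D-T7A' -> '7D'."""
    return line.split("-")[0][1:]


def infer_chromosome(gene: str) -> str:
    """Assign a gene to the chromosome missing from every line that lacks it."""
    absent_lines = ALL_LINES - present[gene]
    chromosomes = {nullisomic_chromosome(line) for line in absent_lines}
    if len(chromosomes) != 1:
        raise ValueError(f"absences for {gene} do not point to one chromosome")
    return chromosomes.pop()


assignments = {gene: infer_chromosome(gene) for gene in present}
for gene in sorted(assignments):
    print(gene, "->", assignments[gene])
```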

A total of 20 disomic substitution lines of Chinese Spring with substituted Cheyenne chromosomes were evaluated using PCR and the gene-specific primers. Results indicated that Pinb-2v1 and Pinb-2v4 were present in all 20 disomic substitution lines, whereas Pinb-2v3 was present only in CS-CNN DS7B. Pinb-2v2 was produced from 19 CS-Cheyenne disomic substitution lines but not CS-CNN DS7B. These results support the varietal analysis that placed Pinb-2v3 on 7B of Cheyenne and Pinb-2v2 on 7B of Chinese Spring.

Discussion

Research on puroindolines and closely related genes has burgeoned over the last decade since the discovery of the first two hardness mutations, one in Pinb (Giroux and Morris 1997) and one in Pina (Giroux and Morris 1998; Bhave and Morris 2008a,b; Morris 2002; Morris and Bhave 2008). In addition to the discovery of new alleles at the Pina and Pinb loci in several taxa, a few reports have identified the existence of Puroindoline-related gene duplications. These include Pinb-relic and PseudoPinb (Chantret et al. 2005; Pogna et al. 2002; Gazza et al. 2006), and more recently a report of three new Pinb-like variants by Wilkinson et al. (2008). Our present study corroborates the existence of these three variants, and extends their findings by providing additional 5′ and 3′ sequence. Additionally, the discovery of a fourth Pinb variant is reported. All four variant sequences are 91–95% homologous, and all four are about 73% homologous to Pinb-D1a (Table 2). It will be of interest to re-examine this sequence homology as more 5′ and 3′ sequence is obtained.

Wilkinson et al. (2008) described sequence polymorphism for variant 1 which was used to genetically map its location by close linkage with Xwmc116 to chromosome 7AL. They placed the Puroindoline-b2 locus (which they designated Pinb-A2) on 7AL in all three of their doubled-haploid mapping populations. Our physical mapping placed variant 1 (Pinb-2v1) on 7DL. The reason for this discrepancy is not currently known. Also, based on an autoradiogram posted on the GrainGenes web site (UMV040BE590621), the authors concluded that, “it is probable that all three variants in bread wheat are encoded by the same Pinb-2 locus on 7A”. Our results would indicate otherwise, as Pinb-2v2 was consistently physically mapped to 7BL in Chinese Spring (and was absent in Cheyenne and Wichita); Pinb-2v3 was mapped to 7B of Cheyenne and was present in Wichita and Winsome (Table 3). Finally, our research identified a new fourth variant, Pinb-2v4, which was localized to 7AL. At this juncture, one model would describe these variants as forming, in part, a homoeologous series. However, there is a question as to whether Pinb-2v2 and Pinb-2v3 are paralogs or alleles of a single locus. Since we observed (and recovered as clones) both sequences from the variety Winsome, our first postulation was that they were not allelic but paralogous. However, re-examination of 12 Winsome seeds produced an identical Pinb-2v1/Pinb-2v3/Pinb-2v4 haplotype in every case. From this, we surmise that the original Winsome seed stock may have been heterogeneous. Consequently, we suggest that Pinb-2v2 and Pinb-2v3 may indeed be allelic. Additional research will be required to resolve this question. Wilkinson et al. (2008) reported variant 1 to be present in all the bread and durum wheat varieties examined. Our results could potentially contradict the presence in durum, since we localized Pinb-2v1 to 7DL, which would not be present in durum; this research remains to be conducted.
Of note, there is some issue with the correct identity of the CS chromosome 7D ditelosomic stocks (Friebe et al. 1996). At the outset of the present research, the identifiers for CS Dt 7DL and CS Dt 7DS were reversed and consequently incorrect. The identity of the present stocks has been verified, and the designations are correct as they appear here.

In conclusion, puroindolines continue to be a topic of considerable interest, primarily for their evocative biological roles including their mysterious ability to soften cereal grains. The discovery of additional expressed, functional puroindoline genes may provide breeders and cereal chemists additional means of manipulating end-use quality.