Introduction

Rice plants, including the cultivated Oryza sativa and the wild Oryza rufipogon, are well adapted to semiaquatic conditions through a high tolerance to anaerobiosis (Sauter 2000). This adaptation is accomplished by avoiding acidification of the cytoplasm, which leads to cell death (Menegus et al. 1991). Plant species with low tolerance to anaerobiosis, such as maize, initially use lactic fermentation for glucose metabolism under anaerobiosis. The accumulation of lactic acid produced in this pathway acidifies the cytoplasm (Menegus et al. 1989). Instead of lactic fermentation, rice mainly uses alcoholic fermentation for glucose metabolism under anaerobiosis (Menegus et al. 1991).

Alcohol dehydrogenase is one of the key enzymes in the alcoholic fermentation pathway and catalyzes the reversible interconversion of ethanol and acetaldehyde (Sauter 2000). O. sativa has two alcohol dehydrogenase loci (Adh1 and Adh2), which are located on chromosome 11. The physical distance between the two Adh regions is about 29 kbp (Tarchini et al. 2000). The ADH1 and ADH2 of O. sativa are mainly expressed in the leaf and root tissues, respectively (Xie and Wu 1989). The function of the ADH1 has been studied by using a mutant of O. sativa lacking ADH1 activity (Matsumura et al. 1998). The mutant could not recover plant growth after long-term anaerobic stress, implying that the ADH1 is important for the adaptation to anaerobiosis.

Recently, DNA polymorphism in the Adh1 region of O. rufipogon has been studied to elucidate the maintenance mechanism of DNA polymorphism in natural populations of O. rufipogon (Yoshida et al. 2004). Nucleotide diversity (π, Nei and Tajima 1981) in the entire region of the Adh1 in O. rufipogon was 0.0020 (π at silent sites: 0.0025), which is one of the lowest values in the Adh regions of the plant species studied so far (Gaut and Clegg 1993a, b; Innan et al. 1996; Miyashita et al. 1996; Cummings and Clegg 1998; Lin et al. 2002; Chiang et al. 2003). The coding region of the Adh1 of O. rufipogon can be classified into three domains: catalytic domain 1 (CD1), co-enzyme-binding domain (CBD) and catalytic domain 2 (CD2). The CD1 had a lower level of polymorphism (π=0.0009) than the other domains. Tests of neutrality for the CD1 indicated significantly negative deviation from the neutral mutation model. These results suggest that purifying selection operates on the CD1, reducing the level of DNA polymorphism in the Adh1 region of O. rufipogon.

However, the low level of DNA polymorphism in the Adh1 region could also be related to small effective population size. So far, DNA polymorphism in the Adh regions has also been analyzed for five grass species that are intolerant of submergence in water. Among these five species, nucleotide diversity (π) for the Adh regions in the three selfing species Hordeum vulgare, Pennisetum glaucum and Miscanthus sinensis is 0.0018, 0.0020 and 0.0062, respectively (Gaut and Clegg 1993a; Cummings and Clegg 1998; Chiang et al. 2003). On the other hand, the outcrossing grass species Zea mays and M. condensatus have nucleotide diversities of 0.0174 and 0.0197, respectively (Gaut and Clegg 1993b; Chiang et al. 2003). Since selfing species have a smaller effective population size than outcrossing species (Wright 1938), the lower levels of DNA polymorphism in the selfing species might be due to smaller effective population size. O. rufipogon has a breeding system of partial selfing and also propagates clonally (Morishima et al. 1961; Xie et al. 2001). We therefore could not reject the possibility that the low level of DNA polymorphism in the Adh1 region of O. rufipogon was caused by small effective population size. Since small effective population size would influence the level of polymorphism over the entire genome, other nuclear regions of O. rufipogon should be studied to examine this possibility.

Interspecific comparisons of the Adh1 region between O. rufipogon and its related species showed that the amino acid sequences were conserved among the A genome species of Oryza (Yoshida et al. 2004). On the other hand, the level of replacement divergence between O. rufipogon and O. australiensis (E genome) was high. One of the replacement substitutions in the CD1 caused a physicochemical change of amino acid, according to Miyata et al. (1979), which might influence the ADH1 activity. O. rufipogon is found in watery environments, while O. australiensis is found on the edge of ditches and does not seem to be deeply submerged in water even in the rainy season (Morishima 2002). The replacement substitutions between O. rufipogon and O. australiensis might be related to adaptive change in the ADH1, reflecting differences in the environments where the Oryza species encounter anaerobiosis.

In this report, we analyzed nucleotide variations in the Adh2 regions of O. rufipogon and its related species (including the cultivated rice). The first purpose is to clarify the maintenance mechanisms of DNA polymorphism in the Adh2 region. We compared levels of DNA polymorphism between the Adh1 and Adh2 regions to examine the possible influence of small effective population size on the low level of nucleotide variation detected in the Adh1 region. The second purpose is to examine whether replacement substitutions in the Adh2 region are involved in the adaptation to anaerobic environments, as suggested for the Adh1 region. We compared the level of replacement divergence of the Adh2 region between O. rufipogon and the other A genome species with that between O. rufipogon and O. australiensis. The third purpose is to elucidate epistatic interaction between the Adh1 and Adh2 regions of O. rufipogon. It is possible that two adjacent Adh genes could be coregulated for anaerobic adaptation. Epistatic interaction could cause linkage disequilibrium (LD) between loci in natural populations (Kimura 1956). We tested LD between the two Adh regions in O. rufipogon. In addition, O. rufipogon is considered to be the wild ancestor of O. sativa (Oka and Chang 1959). O. sativa Japonica and Indica are assumed to have originated from different strains of O. rufipogon (Second 1982; Ishii et al. 1988). The study of DNA polymorphism in the Adh2 region of O. rufipogon, O. sativa Japonica and Indica would contribute to clarifying the phylogenetic relationship among these taxa.

Materials and methods

Plant materials

Twenty strains of the Asian wild rice species O. rufipogon, four of its related species and five strains of the cultivated rice O. sativa were used for analyzing DNA polymorphism in the Adh2 region (Table 1). Eight strains of O. rufipogon were examined for allozyme variation of ADH. Seeds and DNAs are maintained in the Laboratory of Plant Breeding, Faculty of Agriculture, Kobe University. They were originally provided by the National Institute of Genetics (Japan), Shizuoka University (Japan), and the International Rice Research Institute (Philippines). The seeds were sterilized with benomyl (DuPont, Wilmington, DE, USA) at 28°C overnight and germinated in the dark at 28°C. The plants of each accession used for DNA extraction were grown in a pot at 28°C under 14-h light conditions. The plants used for the allozyme experiment were grown in a greenhouse during August 2002 in Kyoto, Japan.

Table 1 Plant materials

DNA extraction, PCR amplification and sequencing

Total DNA was extracted from leaves and stems with a modification of the CTAB method (Weising et al. 1991) and used for PCR amplification of the Adh2 region. The primers used for O. rufipogon, O. sativa Japonica, O. sativa Indica, O. barthii and O. meridionalis were ORADH2-1: 5′-TCCTCCTTGTCTTCACTCTG-3′ and ORADH2-101: 5′-GCCACAATGCTGACAATAAA-3′, which are located in the 5′ and 3′ flanking regions, respectively. With these primers, a 3.2-kbp region of the Adh2 was amplified. The primers used for O. glumaepatula were ORADH2-3.1: 5′-ATGGCGACAGCCGGGAAGGT-3′, which is located in exon 1, and ORADH2-101. With these primers, a 2.8-kbp region of the Adh2 was amplified. The primers used for O. australiensis were ORADH2-3.1 and ORADH2-103: 5′-CGTCCCCTTGAGCGTCTTCT-3′. A 2.3-kbp region of the Adh2 from exons 1–9 was obtained, which lacks a part of exon 9 and 10. These four primers were designed using the published sequence of the Adh2 region of O. sativa (GenBank accession AF172282). Taq polymerase (Roche Applied Science) was used for the PCR. The PCR products were cloned into the plasmid pUC118 (TaKaRa), which was used as the template for the sequencing reaction. The sequencing reaction was conducted with the Thermo Sequenase fluorescent-labeled cycle-sequencing kit with 7-deaza-dGTP (Amersham/Pharmacia Biotech, Piscataway, NJ, USA). Sequences were determined with a Pharmacia ALFred sequencer. We mixed three plasmid clones at almost the same molarity to eliminate PCR artifacts. Sequencing primers were designed at about 500-bp intervals. Newly determined sequences were deposited in the DDBJ databank under accession numbers AB208516-AB208542.

Allozyme of ADH

A 12.5% starch gel was prepared, and a slit was made near the cathodal end of the gel. Leaves and roots at 1 month after germination were ground separately in extraction buffer grade I+. The buffer grade I+ is composed of 0.1 M Tris–HCl pH 7.5, 5% sucrose (w/v), 10 mM diethyldithiocarbamate, 21 mM mercaptoethanol (0.15% v/v), 0.2% bovine serum albumin (w/v) and 5% PVP-40 (w/v). A piece of filter paper was used to absorb the extract and was inserted in the slit. Electrophoresis was conducted for 5 h at 160 V and 4°C. After electrophoresis, the starch gel was stained for ADH as described by Glaszmann et al. (1988). The ADH bands were identified on the basis of the report by Xie and Wu (1989). The fast band is the ADH2 homodimer, the middle band the ADH1-ADH2 heterodimer and the slow band the ADH1 homodimer.

Data analysis

DnaSP program version 3.50 (Rozas and Rozas 1999) was used to analyze intra- and interspecific DNA variations. Nucleotide diversity (π) and θ (4Neμ; Watterson 1975) were estimated after removing indels. The tests of Tajima (Tajima 1989) and Fu and Li (Fu and Li 1993) were conducted to investigate departure from neutrality. Genetic distance (K) between O. rufipogon and its related species was calculated by the Jukes and Cantor method (Jukes and Cantor 1969). Genetic distance for the 5′ flanking region of the Adh2 between O. rufipogon and O. glumaepatula could not be estimated because sequence information on the 5′ flanking region of O. glumaepatula was missing. Likewise, we could not estimate genetic distance for the flanking and CD2 regions of Adh2 between O. rufipogon and O. australiensis, because we could not determine these sequences of O. australiensis. MEGA program version 2.1 (Kumar et al. 2002) was used to construct a neighbor-joining (NJ) tree. A maximum parsimony (MP) tree was constructed with a heuristic search using PAUP 3.1.1 (Swofford 1993). Informative indels were included as a fifth base. A heuristic search was also performed to estimate the number of replacement substitutions on each of the tree branches of O. rufipogon and O. australiensis. Homology plot analysis between the Adh1 and Adh2 regions of O. rufipogon was conducted by using EMBOSS GUI v 1.12 dottup (Rice et al. 2000). Intra- and interlocus LDs for polymorphic variations detected in the Adh1 and Adh2 regions of the 17 strains of O. rufipogon analyzed for both regions were examined by the χ² test implemented in the DnaSP program. For the χ² tests, we included indels as DNA variations irrespective of their length, but SSR (simple sequence repeat) polymorphisms were excluded from the LD analysis.
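The two diversity estimators named above can be illustrated with a minimal sketch. It is not the DnaSP implementation: it assumes an already aligned set of equal-length sequences, treats any column containing a gap as an indel to be removed, and uses a toy alignment and hypothetical function names chosen only for illustration.

```python
from itertools import combinations

def drop_gap_columns(seqs, gap="-"):
    """Remove every alignment column that contains a gap (indel) character."""
    kept = [col for col in zip(*seqs) if gap not in col]
    return ["".join(chars) for chars in zip(*kept)]

def segregating_sites(seqs):
    """Count alignment columns that show more than one nucleotide state."""
    return sum(1 for col in zip(*seqs) if len(set(col)) > 1)

def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences per site."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

def watterson_theta(seqs):
    """theta per site: S / (a1 * L), with a1 = sum of 1/i for i = 1..n-1."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a1 * len(seqs[0]))

# Toy alignment (hypothetical data, not from this study); column 6 holds an indel.
raw = ["ACGTA-CGTAC", "ACGTAACGTAT", "ACGAAACGTAC", "ACGTAACGTAC"]
aligned = drop_gap_columns(raw)
pi = nucleotide_diversity(aligned)
theta = watterson_theta(aligned)
print(f"pi = {pi:.4f}, theta = {theta:.4f}")
```

Because π weights variants by their frequency while θ depends only on the number of segregating sites, comparing the two is the basis of the neutrality tests used later in the paper.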

Results

Polymorphic sites in the Adh2 region of O. rufipogon

In the 3.3-kbp Adh2 region of O. rufipogon, 152 nucleotide variations were detected (95 sites and 57 indels) (Fig. 1). In the 5′ flanking region, there were 31 nucleotide variations (13 sites and 18 indels), none of which were located in the TATA box or putative regulatory elements, involved in the adjustment of Adh2 expression in anaerobic conditions (Xie and Wu 1989). This result suggests that the nucleotide variations in the 5′ flanking region do not change the transcriptional regulation of the Adh2 gene of O. rufipogon. In the coding region, there were 24 polymorphic sites (16 synonymous and 8 replacement) and no indel variations, of which 9 polymorphic sites (8 synonymous and 1 replacement) were found more than once in the samples. The nonsingleton replacement (CTG (Val) and CTC (Leu)) at the position 1566 (CD1) did not cause drastic physicochemical change of amino acid, according to Miyata et al. (1979). The singleton replacements at the positions 1540 (CD1), 1654 (CD1), 2322 (CBD) and 2499 (CBD) had amino acid distances ranging between 2.37 and 3.06 (Miyata et al. 1979), which indicates distinct physicochemical changes without causing the structural disruption of the protein. The other three replacements did not cause any physicochemical change or structural disruption. Electrophoresis of ADH protein detected no changes in the mobility among strains, which have these singleton replacement sites (data not shown).

Fig. 1

Summary of DNA variations in the 3.3-kb Adh2 region of O. rufipogon and O. sativa Japonica and Indica. The structure of the Adh2 region from sites 0 to 3,325 is shown at the center of the figure. The black, white and gray boxes indicate the exons coding the catalytic domain 1, the co-enzyme-binding domain and the catalytic domain 2, respectively. Singleton and nonsingleton nucleotide sites in O. rufipogon are shown as vertical bars. Open circles show replacement polymorphisms. Triangles on the top of the vertical bars show indels. DNA variations are summarized at the bottom of the figure, where dots indicate variations identical to the W1976 sequence. The d, m, s and r are abbreviations for indel, microsatellite, synonymous and replacement polymorphism, respectively. + and − indicate insertions and deletions, respectively. An, Tn and Gn indicate that adenine, thymine and guanine are repeated n times, respectively. mn indicates a microsatellite with n repeats. A indicates the annual type of O. rufipogon

Pattern of nucleotide polymorphism in the Adh2 region of O. rufipogon

A dimorphic pattern was detected in the Adh2 region of O. rufipogon (Fig. 1). From the distribution of polymorphic variations, seven distinct sequence types can be defined. The Adh2 region can be divided into seven blocks on the basis of the partition pattern of the strains, although the boundary between blocks 6 and 7 was not clear because the dimorphic pattern there was indistinct. There was no correlation between these sequence types and their geographic origins, implying that the dimorphic pattern was not caused by geographic isolation among local populations of O. rufipogon. No association between the dimorphic pattern and the life form of O. rufipogon was detected either (Fig. 1). For example, the LV61 and KA strains of O. rufipogon were both annual types, yet they belonged to sequence types 1 and 7, respectively, which were the most divergent among the seven sequence types (Fig. 1).

Since sequence types 1 and 7 were the most divergent, these two sequence types could be ancestral. The other sequence types could be recombinants between sequence types 1 and 7. Assuming two double recombinations in the history of the Adh2 region in O. rufipogon, the other five sequence types could be produced by five intragenic recombinations (Fig. 2a). The times of the recombination events were estimated by the method of Innan et al. (1996) on the basis of the number of nucleotide differences at silent sites (including synonymous sites) between the recombinants and the ancestral sequence types. Given a synonymous substitution rate of 6.5 × 10⁻⁹ per site per year for grass Adh genes (Gaut et al. 1996), these recombination events were estimated to have occurred 0.14–0.81 MYA (Fig. 2b). Since the two ancestral sequence types were estimated to have diverged 1.73 MYA, the intragenic recombinations in the Adh2 region of O. rufipogon would have occurred relatively recently.
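The link between silent-site distance and time can be sketched under a simple molecular-clock assumption, T = K / (2μ). This is an illustration only: the exact estimator of Innan et al. (1996) is not reproduced here, the factor of two (both lineages accumulating substitutions) is an assumption, and the function names are hypothetical. Only the rate and the dates are taken from the text.

```python
# Synonymous substitution rate per site per year, as cited in the text
# (Gaut et al. 1996).
SILENT_RATE = 6.5e-9

def time_since_split_mya(k_silent, rate=SILENT_RATE):
    """Time (million years) implied by a silent-site distance K,
    assuming T = K / (2 * rate); the factor 2 is an assumption."""
    if k_silent < 0 or rate <= 0:
        raise ValueError("distance must be >= 0 and rate must be > 0")
    return k_silent / (2.0 * rate) / 1e6

def expected_distance(time_mya, rate=SILENT_RATE):
    """Inverse relation: silent-site distance expected after time_mya."""
    return 2.0 * rate * time_mya * 1e6

# Distances that would correspond to the dates reported in the text
# under this simple clock (derived values, not measurements).
for label, t in [("oldest recombination", 0.81),
                 ("youngest recombination", 0.14),
                 ("split of types 1 and 7", 1.73)]:
    print(f"{label}: T = {t} MYA -> K = {expected_distance(t):.5f}")
```

Under these assumptions, a divergence of 1.73 MYA corresponds to a silent-site distance of roughly 0.022 between the two ancestral sequence types.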

Fig. 2

a Diagram of recombination events for the five recombinant sequence types detected in this study. b Estimated recombination time. Dashed lines connect the recombinant to the parental sequence types

Level of nucleotide polymorphism in the Adh2 region of O. rufipogon

The level of nucleotide polymorphism in the Adh2 region of O. rufipogon was estimated (Table 2). The levels of polymorphism (π and θ) in the entire region of the Adh2 were 0.008 and 0.009, respectively, which are higher than those of the Adh1 of O. rufipogon (π=0.002 and θ=0.003: Yoshida et al. 2004). Polymorphism was high in the 5′ flanking region and introns of the Adh2. In the coding region, synonymous sites had a high level of variation, whereas the level of variation at replacement sites was as low as that of the Adh1 region of O. rufipogon (ibid.). These results indicate that the difference in the level of nucleotide polymorphism between the Adh1 and Adh2 regions was caused by the difference at silent sites.

Table 2 Summary of DNA variation in the Adh2 of O. rufipogon and its related species

The level of polymorphism for sequence type 1 was estimated (Table 3). We could not estimate the level of polymorphism for sequence type 7, which was detected only once in the sample. The level of polymorphism was low over all the functionally different regions. Nucleotide diversity was particularly low at replacement sites.

Table 3 Summary of DNA variation for the sequence type 1 of O. rufipogon and its related species

Each block (Fig. 1) could be classified into partition A identical to the sequence type 1 and partition B identical to the sequence type 7. Average silent nucleotide diversity (π) for the partitions A and B was 0.004 and 0.003, respectively (Fig. 3). These values were smaller than that (π=0.011) at silent sites of the entire Adh2 region (Table 2). Taken together, these results indicate that the high level of polymorphism at the silent sites of the entire Adh2 region was caused by nucleotide differences between the two divergent sequence types.

Fig. 3

Summary of nucleotide diversity in each block. The sequences in each block are classified into the partition (A) identical to sequence type 1 and the partition (B) identical to sequence type 7

Difference in the level of nucleotide polymorphism among the ADH2 domains

We estimated the level of nucleotide polymorphism for subcoding region corresponding to each of the three domains (CD1, CBD and CD2) of the ADH2 in O. rufipogon (Table 4). The highest nucleotide diversity was detected in the CD1, while nucleotide diversity in the CBD and CD2 was low. This result is different from that of the ADH1, where the CD1 had a lower diversity than the other domains (Yoshida et al. 2004). When only the sequence type 1 was analyzed, all the domains had a low level of polymorphism (data not shown). This result indicates that the high nucleotide diversity in the CD1 was mainly due to the difference between the two divergent sequence types.

Table 4 Summary of DNA variation in each domain of the Adh2 of O. rufipogon and its related species

Tests of neutrality for nucleotide polymorphism in the Adh2 region of O. rufipogon

Tests of Tajima and Fu and Li were conducted to examine the neutrality of nucleotide polymorphism for the functionally different regions of the Adh2 (Table 2). Fu and Li’s D* test gave a significantly negative value for replacement sites. This result was consistent with the low level of polymorphism at the replacement sites of the Adh2 region (Table 2), indicating that purifying selection operates on the replacement sites of the Adh2 region.
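For readers unfamiliar with the statistic, Tajima's D can be computed from three summary values: the sample size n, the number of segregating sites S, and the mean number of pairwise differences k. The following is a minimal sketch of the standard formula (Tajima 1989), not the DnaSP code; the function name and the example values are illustrative and are not data from this study. Significance would still need to be judged against Tajima's critical values, which are not shown.

```python
import math

def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S and mean
    pairwise differences k (all for the same set of sites)."""
    if n < 3 or seg_sites < 1:
        raise ValueError("need n >= 3 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Numerator: difference between the two estimates of theta.
    numerator = mean_pairwise_diff - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return numerator / math.sqrt(variance)

# Illustrative values only: 10 sequences, 16 segregating sites,
# mean of 3.888889 pairwise differences.
d_example = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d_example:.3f}")
```

A negative D arises when rare variants are in excess, so that S/a1 exceeds k; this is the pattern expected when slightly deleterious replacement changes are held at low frequency by purifying selection, although population expansion can produce it as well.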

When the neutrality tests were conducted only for the sequence type 1, significantly negative values were detected in the coding region (Table 3). The Tajima’s D and Fu and Li’s D* were also largely negative for replacement sites, although the tests did not give significant values. Considering the lowest nucleotide diversity at the replacement sites, these negative values for the coding region and replacement sites of the sequence type 1 could be explained by purifying selection.

When these tests were applied to each of the three domains, Tajima’s D value was significantly negative for the CBD (Table 4). Both tests gave significantly negative values for replacement sites in the CBD. Considering the low level of polymorphism in the CBD (Table 4), these results indicate that purifying selection operates on the CBD. When the tests of neutrality were conducted for each of the three domains of the sequence type 1, no significant value was obtained (data not shown). This is probably due to lack of the power in the tests, because the number of segregating sites was small.

Phylogenetic relationship between O. rufipogon and its related species

To clarify the phylogenetic relationship between O. rufipogon and its related species, an NJ tree was constructed based on the nucleotide variations in the entire region of the Adh2 (Fig. 4). O. rufipogon was separated from O. barthii, O. meridionalis and the E-genome species O. australiensis. In particular, O. australiensis was highly diverged from O. rufipogon. Sequence type 1 of O. rufipogon was grouped with O. glumaepatula, O. sativa Japonica and Indica. An MP tree was also constructed. The topology of the MP tree was almost the same as that of the NJ tree (data not shown).

Fig. 4

Neighbor-joining tree for 20 strains of O. rufipogon and its related species using genetic distance calculated by Jukes and Cantor method on the basis of DNA variation in the entire region of Adh2. Bootstrap probabilities >60% from 1,000 replications are shown. The scale bar of genetic distance is shown at the bottom of the tree. Strains of O. rufipogon also show their geographic origins. The number on the right side of the figure indicates the sequence type of O. rufipogon

As observed for the Adh1 region (Yoshida et al. 2004), O. sativa Japonica and Indica were not clearly separated into different clusters in the tree (Fig. 4). One of the O. sativa Japonica strains (Nourin 22) was closer to O. sativa Indica (435 and IR36) than to the other Japonica strains (YT1A and Nipponbare), which were close to a Myanmar strain of O. rufipogon (YG2A). Two strains of O. sativa Indica (435 and IR36) and an Indian strain of O. rufipogon (W120) formed a single cluster, but the bootstrap probability did not support this clustering. From this analysis, it is difficult to draw a conclusion about the place of origin of O. sativa.

A difference specific to O. sativa Japonica and Indica was detected in the region corresponding to block 1 of O. rufipogon (Fig. 1). The nucleotide sequence in all the blocks of O. sativa Indica was similar to that of sequence type 1 of O. rufipogon. On the other hand, the sequence in block 1 of O. sativa Japonica was identical to that of sequence type 7, and the sequence in the other blocks was largely identical to that of sequence type 1. This observation implies that the Adh2 region of O. sativa Japonica is a recombinant between sequence types 1 and 7. The time of the recombination event was estimated to be 0.10 MYA, given a synonymous substitution rate of 6.5 × 10⁻⁹ per site per year for grass Adh genes (Gaut et al. 1996). This recombination event therefore occurred before the domestication of O. sativa, which is estimated at 0.01 MYA (Wasano 1995).

Nucleotide divergence in the Adh2 region between O. rufipogon and its related species

Level of divergence (K) between O. rufipogon and its related species was estimated for the functionally different regions of the Adh2 (Table 2). In all the regions, O. australiensis is highly diverged from the other species. In the coding region, the level of divergence at replacement sites was lower than that at synonymous sites. Particularly, the level of replacement divergence was constant over the A genome species. When level of divergence between each of the sequence types 1 and 7 of O. rufipogon and its related species was estimated, the replacement divergence between O. rufipogon and A genome species was also at a low level (Table 3). These results indicate that the amino acid sequences of the ADH2 were conserved among the A genome species.
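The Jukes and Cantor distance used for K can be sketched as follows. The correction K = −(3/4) ln(1 − 4p/3) is the standard formula, where p is the proportion of differing sites; the function names, the gap handling and the two short sequences are assumptions made for illustration and are not the DnaSP implementation or data from this study.

```python
import math

def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, skipping positions with a gap."""
    compared = [(a, b) for a, b in zip(seq_a, seq_b)
                if a != gap and b != gap]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance K = -(3/4) * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical pair of aligned sequences differing at 1 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGTACGTAT")
k = jukes_cantor(p)
print(f"p = {p:.3f}, K = {k:.4f}")
```

The correction matters little at the low divergence levels seen among the A genome species, but it grows with p, so it affects the comparison with the more distant O. australiensis more strongly.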

The level of divergence was estimated for the three domains of the ADH2 (Table 4). When O. rufipogon and O. australiensis were compared, genetic distance at synonymous sites of the CBD was larger than that of the CD1. The distance at replacement sites of the CBD was slightly larger than that of the CD1 between O. rufipogon and O. australiensis. A similar result was obtained when each of the divergent sequence types of O. rufipogon was compared with O. australiensis (data not shown). These results contrast with the observations for the Adh1 region, in which the levels of synonymous and replacement divergence between O. rufipogon and O. australiensis were higher in the CD1 than in the CBD (Yoshida et al. 2004). In other words, the more diverged domain differs between the two Adh genes.

Heuristic search for replacement substitutions between O. rufipogon and O. australiensis

To determine the branches on which the observed replacement substitutions in the CD1 and CBD of the ADH2 occurred, a heuristic search was conducted for O. rufipogon and O. australiensis (Table 5). When H. vulgare was used as the outgroup species, the numbers of replacement substitutions on the branches of O. rufipogon and O. australiensis were six and one, respectively. When Z. mays was used, five and two replacement substitutions were detected on the branches of O. rufipogon and O. australiensis, respectively. These results indicate that the number of replacement substitutions on the branch of O. rufipogon was larger than that on the branch of O. australiensis. However, the χ² tests did not support the difference statistically (Z. mays as outgroup: χ² = 1.29, df = 1, NS; H. vulgare as outgroup: χ² = 3.57, df = 1, NS). This may be because the number of substitutions was small.
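The reported χ² values can be reproduced with a goodness-of-fit test against equal numbers of substitutions on the two branches. The text does not state the expected values, so the equal-expectation null used below is an assumption, although it does recover the reported statistics. The function name is hypothetical, and 3.841 is the standard 5% critical value for one degree of freedom.

```python
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, alpha = 0.05, df = 1

def chi_square_equal(counts):
    """Goodness-of-fit chi-square against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((obs - expected) ** 2 / expected for obs in counts)

# Replacement substitutions on the (O. rufipogon, O. australiensis)
# branches, as reported in the text for each outgroup.
branch_counts = {"H. vulgare": (6, 1), "Z. mays": (5, 2)}

results = {}
for outgroup, counts in branch_counts.items():
    chi2 = chi_square_equal(counts)
    results[outgroup] = chi2
    verdict = "significant" if chi2 > CRITICAL_5PCT_DF1 else "NS"
    print(f"{outgroup}: chi2 = {chi2:.2f} (df = 1), {verdict}")
```

With only seven substitutions in total, even a six-to-one split falls short of the 5% threshold, which is consistent with the authors' remark that the small number of substitutions limits the test.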

Table 5 Summary of amino acid substitution between O. rufipogon and O. australiensis

Two of the replacement substitutions on the branch of O. rufipogon were detected in the CD1 (Table 5). One of the replacement substitutions was located at the 103rd amino acid position of the ADH2. This amino acid position is involved in the formation of a lobe that binds the second zinc atom of the subunit of ADH (Eklund et al. 1976). The second zinc atom is essential for the catalytic activity. The other was located at the 130th amino acid position, which is not involved in the secondary structure of the CD1. The amino acid distance between O. rufipogon and O. australiensis at these positions was more than one, according to Miyata et al. (1979). Therefore, both replacement substitutions caused distinct physicochemical changes of amino acid. Considering the amino acid position of these substitutions, the replacement substitution at the 103rd position might be related to the activity of the ADH2. Three replacement substitutions (Table 5) in the α helix of the CBD of the ADH2 did not cause any distinct physicochemical changes of amino acid.

On the other hand, for the Adh1 region, the number of replacement substitutions on the branch of O. rufipogon was not high and depended on the outgroup species used in the heuristic search (Table 5). Two replacement substitutions in the CD1 caused physicochemical changes of amino acid. One of these substitutions occurred on the branch of O. rufipogon, irrespective of the choice between Z. mays and H. vulgare as the outgroup species. Possible reasons why the replacement substitutions with physicochemical changes of amino acid in the ADH1 and ADH2 occurred more often on the branch of O. rufipogon are considered in the Discussion.

Divergence between the Adh1 and Adh2 regions of O. rufipogon

Homology plot analysis between the investigated regions of the Adh1 and Adh2 of O. rufipogon was conducted to examine gene conversion between these regions. Homology was detected only in the exons (data not shown). When the Adh1 region and each of the two divergent sequence types of the Adh2 region were compared, homology was detected again only in the exons. These results indicate no gene conversion between the two Adh regions in O. rufipogon.

Since only the exon sequences of the Adh1 and Adh2 regions can be aligned, an NJ tree based on the nucleotide variations in the exons was constructed to clarify the phylogenetic relationship of these loci between O. rufipogon and other members of the grass family (data not shown). The two Adh genes of O. rufipogon were clearly divided into different clades, within which those of other grass species clustered separately. These observations agree with the idea that the Adh1 and Adh2 duplicated before the divergence of the grass family (Gaut et al. 1999).

Linkage disequilibrium in the Adh1 and Adh2 regions of O. rufipogon

Intra- and interlocus LD was examined by χ2 tests to investigate the possibility of epistasis within and between the Adh1 and Adh2 regions of O. rufipogon (Fig. 5). The number of informative polymorphic variations including indels was 14 and 68 in the Adh1 and Adh2 regions, respectively. The number of tests within the Adh1 region was 91, of which 26 (28.6%) gave significant results. In the Adh2 region, 1,785 of the 2,278 tests (78.4%) were significant. When only the strains of O. rufipogon belonging to the sequence type 1 of the Adh2 were used in the analysis, the number of informative polymorphic variations was five and three in the Adh1 and Adh2 regions, respectively. Five of the ten tests were significant within the Adh1 region, whereas significance was not detected within the Adh2 region. These results indicate that the large number of significant pairwise comparisons within the Adh2 region was caused by the dimorphic pattern.
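A pairwise χ² test of LD between two biallelic sites can be sketched as below. It uses the standard relation χ² = n·r², where r² = D² / (pA(1 − pA)pB(1 − pB)) and D is the difference between the observed haplotype frequency and the product of the allele frequencies. This is an illustration under stated assumptions (haploid sequences, exactly two states per site, no multiple-test correction), not the DnaSP procedure; the function name and the toy sites are hypothetical.

```python
def ld_chi_square(site_a, site_b):
    """Return (r2, chi2) for two biallelic sites scored on the same
    sequences; chi2 = n * r2 has one degree of freedom."""
    if len(site_a) != len(site_b):
        raise ValueError("sites must be scored on the same sequences")
    n = len(site_a)
    a_ref, b_ref = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(x == a_ref for x in site_a) / n
    p_b = sum(y == b_ref for y in site_b) / n
    p_ab = sum(x == a_ref and y == b_ref
               for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom
    return r2, n * r2

# Toy data: eight sequences forming two fully associated haplotypes,
# as in a dimorphic pattern, and four sequences with independent sites.
r2_full, chi2_full = ld_chi_square("AAAATTTT", "CCCCGGGG")
r2_none, chi2_none = ld_chi_square("AATT", "CGCG")
print(f"complete LD: r2 = {r2_full}, chi2 = {chi2_full}")
print(f"no LD:       r2 = {r2_none}, chi2 = {chi2_none}")
```

When a sample consists of two divergent sequence types, most pairs of sites that distinguish the types resemble the first toy case, which is why a dimorphic pattern by itself can generate a large proportion of significant pairwise tests.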

Fig. 5

Summary of linkage disequilibrium within and between the Adh1 and Adh2 regions of O. rufipogon. Significance detected by the χ² test is shown. The test result for the nine strains of O. rufipogon belonging to sequence type 1 of the Adh2 is shown at the upper right. The numbers indicate positions in the Adh1 and Adh2 regions. Black lines connect to the positions of the segregating sites in the gene structure

The number of tests between loci was 952, of which 156 (16.4%) were significant. Among these significant comparisons, 56 (35.9%) were detected between polymorphic variations in the introns. This percent of significant pairs in the introns was higher than that within each region (Adh1: 26.9%, Adh2: 34.2%). Even when the strains of the sequence type 1 were used, four comparisons in the noncoding region were significant. This result contrasted with the observation that no significance was detected within the Adh2 region. These interlocus LDs might be caused by epistatic interaction between the two Adh genes.
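The numbers of pairwise tests follow directly from the numbers of informative variations: m(m − 1)/2 comparisons within a region with m variations, and m1 × m2 comparisons between regions. The short check below uses only counts stated in the text; the function names are hypothetical.

```python
from math import comb

def ld_test_counts(n_sites_a, n_sites_b):
    """Number of pairwise LD tests within each region and between them."""
    return {
        "within_a": comb(n_sites_a, 2),
        "within_b": comb(n_sites_b, 2),
        "between": n_sites_a * n_sites_b,
    }

def percent(part, whole):
    """Percentage of tests that were significant."""
    return 100.0 * part / whole

# 14 and 68 informative variations in the Adh1 and Adh2 regions.
all_strains = ld_test_counts(14, 68)
# 5 and 3 informative variations among the sequence type 1 strains.
type1_only = ld_test_counts(5, 3)

print(all_strains, type1_only)
print(f"Adh1: {percent(26, all_strains['within_a']):.1f}% significant")
print(f"Adh2: {percent(1785, all_strains['within_b']):.1f}% significant")
print(f"between: {percent(156, all_strains['between']):.1f}% significant")
```

With this many comparisons, some significant results are expected by chance alone at a 5% threshold, which is worth keeping in mind when interpreting the 16.4% of interlocus tests that were significant.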

Discussion

The level and pattern of polymorphism in the Adh2 region of O. rufipogon

The level of polymorphism (π) in the entire region of the Adh2 of O. rufipogon was 0.008, which is higher than that of the Adh1 of O. rufipogon. High nucleotide diversity was detected at silent sites, while replacement diversity was low. When polymorphism for sequence type 1 was analyzed, low nucleotide diversity was detected even at silent sites. This result indicates that the high level of silent polymorphism was caused by the dimorphic pattern. On the other hand, the low level of variation at the replacement sites could be explained by purifying selection. The neutrality test of Fu and Li for the replacement sites indicated a significantly negative deviation from the neutral mutation model. When the tests of Tajima and of Fu and Li were conducted for each domain of the ADH2, significantly negative values were detected at replacement sites of the CBD. These results are indicative of purifying selection on the replacement sites, especially those of the CBD.

Comparison of the level and pattern of polymorphism between the Adh1 and Adh2 regions of O. rufipogon

A large difference in nucleotide diversity (π) between the Adh1 and Adh2 regions was detected at silent sites of the entire region (Adh1: 0.003, Adh2: 0.011). Considering that there was no dimorphic pattern in the Adh1 region (Yoshida et al. 2004), it is clear that the difference in nucleotide diversity between the two Adh genes was related to the dimorphic pattern of the Adh2 region.

Taking into account the low level of polymorphism and the significantly negative values in the neutrality test of Fu and Li at replacement sites of the Adh1 and Adh2 regions, it could be concluded that purifying selection operates on both enzymes. Since ADH1 and ADH2 were induced under anaerobic conditions in leaf and root, respectively (Xie and Wu 1989), each enzyme would contribute to tolerance to anaerobiosis in its respective tissue.

The low level of polymorphism in the Adh1 region of O. rufipogon could also be explained by a small effective population size (Yoshida et al. 2004). Under this hypothesis, the Adh2 region is also expected to have a low level of variation. Contrary to this expectation, the nucleotide diversity in the Adh2 region was higher than that of the Adh1 region. However, when the level of polymorphism was estimated for sequence type 1 alone, low nucleotide diversity was detected over all the functionally different regions of the Adh2. Therefore, from this study, it is not possible to conclude whether the low level of polymorphism in the Adh1 region of O. rufipogon is due to a small effective population size. To examine this possibility, nucleotide polymorphism at other loci of O. rufipogon needs to be analyzed.

Divergence between O. rufipogon and its related species

As shown for the Adh1 region (Yoshida et al. 2004), interspecific comparisons of the Adh2 region revealed that amino acid sequences were conserved among the A genome species, whereas the E genome species O. australiensis was highly diverged from O. rufipogon. The divergence between O. rufipogon and O. australiensis was high in the CD1 and CBD of the ADH2. The heuristic search showed that replacement substitutions on the branch of O. rufipogon largely contributed to the high level of replacement divergence in the two domains between O. rufipogon and O. australiensis. In particular, replacement substitutions in the CD1 caused distinct physicochemical changes in amino acids. Similarly, replacement substitutions on the branch of O. rufipogon were detected in the CD1 of the ADH1. One of these substitutions caused a physicochemical change in the amino acid. The habitat of O. rufipogon is more deeply submerged in water than that of O. australiensis (Morishima 2002), while Z. mays and H. vulgare are adapted to dry conditions (Sauter 2000). Considering the different habitats of the four grass species, the replacement substitutions with physicochemical changes in amino acids in the ADH1 and ADH2 on the branch of O. rufipogon may be related to adaptation to anaerobic environments.

Linkage disequilibrium between the Adh1 and Adh2 regions of O. rufipogon

This study showed that there were significant LDs between the two Adh regions of O. rufipogon. Significant LDs were detected even when only the strains of O. rufipogon belonging to sequence type 1 were used in the analysis. The two Adh regions are located at the same position in the genetic map of O. sativa, and the physical distance between them is only about 29 kbp (Tarchini et al. 2000). These observations suggest that the physical linkage between these regions could be responsible for the observed LDs. However, the percentage of significant pairs in the introns between the two Adh genes was higher than that within each gene. It is known that intron sequences in actin of O. sativa and Adh1 of Z. mays regulate gene expression (Koziel et al. 1996). Introns 1, 2 and 6 in the Adh1 of Z. mays increase gene expression in transient assays (Callis et al. 1987; Mascarenhas et al. 1990), although the molecular mechanism has not been fully identified (Clancy and Hannah 2002). If the introns of the two Adh genes of O. rufipogon synergistically regulate the expression of both genes, epistasis would be an attractive candidate to explain the significant interlocus LDs in the introns. In the future, it will be necessary to demonstrate experimentally that intron sequences of the Adh1 and Adh2 are involved in synergistic regulation of the expression of the two Adh genes.

The origin of O. sativa Japonica and Indica

The study of nucleotide variation in the Adh2 region of O. rufipogon, O. sativa Japonica and Indica showed that the sequence type of O. sativa Japonica was a recombinant between sequence types 1 and 7 of O. rufipogon. The sequence of O. sativa Indica was similar to that of sequence type 1. These results support the hypothesis that O. sativa Japonica and Indica originated from different O. rufipogon ancestors (Second 1982; Ishii et al. 1988). The estimated recombination time indicates that the ancestors of O. sativa Japonica and Indica separated at least 0.1 MYA. Considering that the perennial type of O. rufipogon has a higher outcrossing rate than the annual type (Morishima et al. 1984), the observed trace of recombination in the Adh2 region of O. sativa Japonica also supports the idea that the ancestor of O. sativa Japonica is the perennial type of O. rufipogon (Morishima et al. 1984; Cheng et al. 2003).