Introduction

Weedy rice, also called red rice because of its red pericarp, has been reported in almost all rice-growing regions of the world, and in many of them it has caused serious damage to rice production (Pantone et al. 1992; Ottis et al. 2005; Londo and Schaal 2007). In China, records of weedy rice can be found in ancient literature dating back at least 2,000 years (You 1990). Until the 1980s, weedy rice was kept at a low level by the labor-intensive rice transplanting system. Over the past two decades, however, with the spread of direct seeding technology and herbicides, weedy rice has resurged in rice fields, especially in northeast China and in Jiangsu, Guangdong and Hainan provinces (Xu and Wu 1996; Ma et al. 2005; Zhong et al. 2006).

The origin of weedy rice remains controversial. At present, four hypotheses have been suggested: (a) adaptation of wild rice to cultivated fields in order to survive human disturbance, including some individuals that retain wild characters; (b) hybridization between Oryza sativa L. ssp. indica Kato and O. sativa L. ssp. japonica Kato; (c) hybridization between cultivated rice and wild rice; (d) de-domestication of cultivated rice to a wild form (Baker 1974; Londo and Schaal 2007). Each hypothesis is supported by certain lines of evidence. Bhutan weedy rice, which shares distinct characteristics with both Indica- and Japonica-type rice, was thought to have originated from hybridization of Indica and Japonica rice on the basis of certain nuclear and cytoplasmic genome markers (Ishikawa et al. 2005). Ludao, a weedy rice occurring in Jiangsu province of China, may have originated from the hybridization of ancient Japonica rice and wild rice, as suggested by the sterility gene that it shares with wild rice (Jing et al. 2007; Zhu et al. 2005). In Malaysia, weedy rice flourished with the spread of volunteer seeding and was thought to have evolved from cultivated rice (Bakar et al. 2000). Research on the genetic diversity of weedy rice in China’s Liaoning province revealed that weedy rice has a closer relationship to local cultivated rice than to wild rice or to cultivated rice collected from other provinces and other areas of Asia (Cao et al. 2006). Recently it was suggested that US weedy rice originated from Asia; the four hypotheses outlined above were supported by the population structure of weedy red rice sampled from the USA, using both DNA sequence data from a neutral nuclear locus and microsatellite genotype data (Londo and Schaal 2007). From these studies, we may deduce that the weedy rice of different regions or biotypes may have originated in different ways.

Although Liaoning’s weedy rice has been shown to have originated from de-domestication, the origin of weedy rice with Indica rice characteristics in Guangdong is not clear. Indica weedy rice was not included as a comparison in the study by Cao et al. (2006). Hence, the possibility that weedy rice originated from hybridization of Indica and Japonica rice cannot be easily eliminated. Further research is also necessary because previous studies on the genetic diversity of weedy rice in Liaoning province have yielded two different results (Cao et al. 2006; Yu et al. 2005). Because of the climate, only O. sativa ssp. japonica is suited to Liaoning province, while farmers in Guangdong province primarily cultivate O. sativa ssp. indica. In addition, Guangdong province is an important distribution area of Oryza rufipogon Griff., while Liaoning province has no acknowledged record of wild rice. Many researchers question whether weedy rice in Guangdong has any relationship to wild rice. Liaoning and Guangdong provinces are among the most important rice production regions in China, and in both of them weedy rice has caused extensive damage to rice yield and commercial quality.

Thus, to validate and further explore the origin of different weedy rice biotypes, we selected two biotypes of weedy rice collected from Liaoning and Guangdong provinces, located in northeastern and southern China, respectively. The weedy rice grain prevalent in Guangdong province is long and thin, with Indica rice characteristics. In contrast, the weedy rice grain prevalent in Liaoning province is short and round, with Japonica rice characteristics. In this paper, we studied the genetic diversity, genetic structure and genetic relationships of the two biotypes of weedy rice from the two provinces. We also compared these parameters among weedy rice, cultivated rice and O. rufipogon in order to explore the origins of the different biotypes of weedy rice in Liaoning and Guangdong provinces of China.

Materials and methods

Samples

Weedy rice populations were collected from four sites in each of Liaoning and Guangdong provinces (Fig. 1). For each population, seeds were sampled from 13 to 35 individuals growing more than 10 m apart in the same field. As references, we used seven Asian weedy rice accessions (one from Korea; Ludao from Jiangsu province, China; one from Hainan province, China; one from Vietnam; and three from Malaysia); six common wild rice (Oryza rufipogon) accessions from Jiangxi, Guangxi, Hunan and Fujian provinces in China; and 24 rice cultivars from the primary rice-growing regions of China (Tables 1, 2).

Fig. 1 Sampling locations of weedy rice (Oryza sativa f. spontanea) populations in Liaoning and Guangdong provinces of China (PJ: Panjin; SY: Shenyang; TL: Tieling; DD: Dandong; ZQ: Zhaoqing; MM: Maoming; ZJ: Zhanjiang; YJ: Yangjiang)

Table 1 Weedy rice from Liaoning and Guangdong provinces
Table 2 Samples of cultivated rice (Oryza sativa), common wild rice (Oryza rufipogon) and Asian weedy rice (Oryza sativa f. spontanea Rosch.) used in the SSR analysis

In May 2007, these materials were planted in the paddy field of Jiangpu Farm of Nanjing Agricultural University, Jiangsu province, China. Seeds from each plant were sown separately. One month later, seedlings were transplanted to small plots in a randomized design. One month after transplanting, young leaves were sampled from a single individual for DNA extraction. The soil in the paddy field was a sticky loam (pH 7.1, organic matter content 6.7 g kg−1, total nitrogen 1.1 g kg−1).

DNA isolation

Five grams of fresh leaves were taken from each individual. Leaves were ground in liquid nitrogen, and total genomic DNA was extracted according to the standard procedure with minor modifications (Lu and Zheng 1992). Twenty milliliters of preheated (65°C) extraction buffer (100 mM Tris–HCl pH 8.0, 50 mM EDTA pH 8.0, 500 mM NaCl, 1.5% SDS) was added to a 50 mL centrifuge tube containing the tissue sample. After incubation at 65°C for 30 min with periodic gentle mixing, the tubes were transferred to an inverse shaker (60 min−1) at room temperature. We then added 20 mL chloroform containing 10–20% ethanol (V/V) and 4% iso-pentanol (V/V). After mixing for 20 min, the tubes were centrifuged at 3,000 rpm for 15 min at room temperature. The supernatant was transferred into another 50 mL tube, 0.8 vol. of pre-cooled isopropanol (kept at 4°C) was added, and the sample was mixed by inversion. Samples were kept at room temperature for 30 min or until a fibrous precipitate was seen floating at the top. The precipitate was removed and transferred to a new tube containing 3 mL of TE (10 mM Tris–HCl, 1 mM EDTA, pH 8.0). We added 1/10 vol. of 3 M sodium acetate (pH 5.2), mixed, and then added 2 vol. of pre-chilled ethanol (kept at −20°C). The precipitated DNA was then recovered and dried at room temperature. DNA pellets were resuspended in 500 μL of TE and quantified by agarose gel electrophoresis with ethidium bromide staining, using a DNA concentration standard as the reference. DNA samples were adjusted to a concentration of 10–20 ng μL−1.

Microsatellite marker analysis

A total of 69 microsatellite loci spread throughout the genome were screened for amplification, polymorphism, and reproducibility. Twenty microsatellite loci were selected, with at least one representative locus per chromosome (Table 3). In addition, one mitochondrial SSR was selected by screening a group of ten mitochondrial loci. PCR followed the manufacturer's protocol with minor modifications and was performed in an automated PCR apparatus (BioRAD PTC300). Each reaction used 10 μL of reaction mixture (Mg2+-free buffer, 250 μmol dNTP, 2.0 mM MgCl2, and 0.1 U Taq polymerase). The PCR program comprised the following steps: 94°C (3 min); 30 cycles of 94°C (40 s), 55°C (40 s) and 72°C (40 s); a final extension at 72°C for 8 min; and a hold at 4°C.
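As a quick check on the cycling program described above, the following minimal Python sketch encodes the stated steps and sums the programmed hold times. The variable and function names are illustrative only, and ramp times between temperatures (not given in the text) are ignored, so the real run time on the instrument would be longer.

```python
# Thermal-cycling program as stated in the text.
# Assumption: ramp times and the final 4 °C hold are not counted.
INITIAL_DENATURATION_S = 3 * 60      # 94 °C for 3 min
CYCLES = 30
CYCLE_STEPS_S = [
    ("denature_94C", 40),
    ("anneal_55C", 40),
    ("extend_72C", 40),
]
FINAL_EXTENSION_S = 8 * 60           # 72 °C for 8 min


def program_hold_time_s():
    """Return the sum of programmed hold times in seconds."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS_S)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_s = program_hold_time_s()
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```

The programmed holds add up to 71 min per run under these assumptions.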

Table 3 SSR primer pairs used for DNA amplification in this study

Electrophoresis and silver staining

PCR products were separated on 6% polyacrylamide gels. Each lane was loaded with 3 μL of amplification product mixed with 2 μL of 6× loading buffer (2.5 g L−1 xylene cyanol FF, 2.5 g L−1 bromophenol blue, 400 g L−1 sucrose). After the gels were stained with silver nitrate (0.1%) and air-dried, they were scored for the presence or absence of clearly distinguishable, repeatable bands.

Data analysis

The amplified bands were scored as “1” if present and “0” if absent, and the resulting binary data matrices were used to estimate genetic diversity parameters. POPGENE 1.31 (Yeh et al. 1999) was used to determine Nei’s gene diversity, Shannon’s information index, total diversity, diversity within populations, diversity among populations, levels of gene flow among populations, and Nei’s (1978) unbiased genetic distance, $GD_{ij} = -\ln\left[\sum p_i p_j / \left(\sum p_i^2 \sum p_j^2\right)^{1/2}\right]$, among pairs of populations (Nei 1978), where $p_i$ and $p_j$ are the frequencies in populations i and j. The resulting genetic distance matrix was used to construct a dendrogram by the UPGMA (Unweighted Pair Group Method with Arithmetic mean) method in NTSYSpc version 2.0 (Rohlf 1998).
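The distance and clustering steps described above can be sketched in plain Python. This is an illustrative sketch, not the POPGENE or NTSYSpc implementation: it evaluates the distance formula as printed in the text (without any sample-size correction that POPGENE may apply) and runs a basic UPGMA on the resulting matrix. The population names and band frequencies are invented for the example.

```python
import math


def nei_distance(p_x, p_y):
    """GD = -ln( sum(px*py) / sqrt(sum(px^2) * sum(py^2)) ), summed over bands."""
    jxy = sum(a * b for a, b in zip(p_x, p_y))
    jx = sum(a * a for a in p_x)
    jy = sum(b * b for b in p_y)
    return -math.log(jxy / math.sqrt(jx * jy))


def upgma(names, matrix):
    """Cluster by UPGMA; matrix[i][j] is the distance between names[i] and names[j].

    Returns (tree, heights): a nested tuple of names and the merge heights.
    """
    clusters = [(name, 1) for name in names]      # (subtree, number of leaves)
    d = [row[:] for row in matrix]
    heights = []
    while len(clusters) > 1:
        n = len(clusters)
        # pick the closest pair of current clusters
        i, j = min(((a, b) for a in range(n) for b in range(a + 1, n)),
                   key=lambda ab: d[ab[0]][ab[1]])
        heights.append(d[i][j] / 2)
        (ti, ni), (tj, nj) = clusters[i], clusters[j]
        keep = [k for k in range(n) if k not in (i, j)]
        # size-weighted average distance from the merged cluster to the others
        new_row = [(ni * d[i][k] + nj * d[j][k]) / (ni + nj) for k in keep]
        d = [[d[a][b] for b in keep] for a in keep]
        for row, value in zip(d, new_row):
            row.append(value)
        d.append(new_row + [0.0])
        clusters = [clusters[k] for k in keep] + [((ti, tj), ni + nj)]
    return clusters[0][0], heights


# Hypothetical band frequencies for three populations (illustration only).
freqs = {
    "weedy_A": [0.9, 0.1, 0.8, 0.2],
    "cultivar_A": [0.8, 0.2, 0.9, 0.1],
    "wild": [0.3, 0.7, 0.4, 0.6],
}
names = list(freqs)
dist = [[nei_distance(freqs[a], freqs[b]) if a != b else 0.0 for b in names]
        for a in names]
tree, heights = upgma(names, dist)
print(tree)
```

With these made-up frequencies, the two populations with similar band profiles are merged first and the third joins last, which is the pattern a dendrogram would display.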

To further assess the genetic relationships among all samples (weedy rice, cultivated rice and wild rice), principal component analysis (PCA) was conducted on the numerically coded SSR bands. PCA examines a dissimilarity matrix of pairwise differences between samples and uses eigenvalue analysis to condense the variation between samples into a limited number of dimensions. The largest share of the variation is plotted on the first axis, and each additional dimension explains a successively smaller share. PCA was performed with MINITAB 15.0 (Minitab Inc., State College, PA, USA).
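The idea behind the PCA step can be illustrated with a short, dependency-free Python sketch. It is a stand-in for the MINITAB analysis, not a reproduction of it: it assumes a covariance-based PCA on a binary band matrix and extracts only the first principal component by power iteration. The toy matrix and all names are hypothetical.

```python
import math


def pca_first_component(rows, iterations=500):
    """Return (loadings, scores, explained_fraction) for the first component.

    rows: list of equal-length lists of 0/1 band scores, one list per individual.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]   # centered data
    cov = [[sum(x[k][a] * x[k][b] for k in range(n)) / (n - 1)
            for b in range(m)] for a in range(m)]
    # Power iteration; the uneven start vector avoids an accidental
    # orthogonal start in simple symmetric cases (not guaranteed in general).
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(m))
                     for a in range(m))
    total_variance = sum(cov[a][a] for a in range(m))
    scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
    return v, scores, eigenvalue / total_variance


# Toy band matrix: three individuals from each of two hypothetical groups.
TOY_BANDS = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
loadings, scores, explained = pca_first_component(TOY_BANDS)
print([round(s, 2) for s in scores], round(explained, 3))
```

In this toy case the two groups fall on opposite sides of the first axis, which is the kind of separation the scatter plot of the first two components is meant to reveal. A full analysis would also extract the second component, for example with a numerical library.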

Results

Genetic diversity

Microsatellite markers were used to assess genetic diversity among the accessions in this study. Twenty-one pairs of SSR primers that showed allelic polymorphism across the full sample set were selected from the 79 pairs screened. A total of 63 alleles were scored across the entire data set, with 2 to 6 alleles per locus and an average of 2.86. The highest number of alleles was scored at locus RM252 (six alleles). The average number of alleles detected in a single individual was 23.6 (range 20–28). Wild rice from Fujian province produced the maximum number of alleles.

Considerable genetic diversity was found across the weedy rice populations as a whole: Nei’s gene diversity was h = 0.317, and Shannon’s information index for the two groups of weedy rice combined was I = 0.462. In contrast to this total diversity, the weedy rice groups from Liaoning and Guangdong each had relatively low diversity (h = 0.086 and 0.160, and I = 0.140 and 0.253, respectively). All four populations in Liaoning province showed relatively low diversity, with the highest diversity in the Tieling population. The populations in Guangdong exhibited relatively high diversity, except for the Yangjiang population (Table 4).
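To illustrate why pooled diversity can be much higher than the diversity within each group, the following Python sketch computes Nei's gene diversity and Shannon's information index from a binary band matrix. It is a simplified sketch: it assumes each band behaves as a two-state locus and uses the band frequency directly, which is not necessarily how POPGENE treats the data. The toy matrices are invented.

```python
import math


def band_frequencies(matrix):
    """Frequency of band presence (score 1) for each band across individuals."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]


def nei_gene_diversity(p):
    """h = 1 - p^2 - (1 - p)^2 for a two-state locus with frequency p."""
    return 1.0 - p * p - (1.0 - p) ** 2


def shannon_index(p):
    """I = -sum(f * ln f) over the two states; a state with f = 0 contributes 0."""
    return -sum(f * math.log(f) for f in (p, 1.0 - p) if f > 0.0)


def mean_diversity(matrix):
    """Return (mean h, mean I) averaged over all bands."""
    freqs = band_frequencies(matrix)
    h = sum(nei_gene_diversity(p) for p in freqs) / len(freqs)
    i = sum(shannon_index(p) for p in freqs) / len(freqs)
    return h, i


# Two hypothetical groups, each internally uniform but different from each other.
group_a = [[1, 0], [1, 0]]
group_b = [[0, 1], [0, 1]]
print(mean_diversity(group_a))            # no variation within group A
print(mean_diversity(group_b))            # no variation within group B
print(mean_diversity(group_a + group_b))  # pooling the groups reveals variation
```

Each toy group has zero diversity on its own, yet the pooled sample does not, which mirrors the contrast between the per-province and overall values reported above.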

Table 4 Genetic diversity of weedy rice populations from different regions in Liaoning and Guangdong provinces, estimated based on polymorphisms of 21 SSR loci

Differentiation of weedy rice populations

The genetic differentiation coefficient Gcs indicated that 62.3% of the total genetic variation was attributable to differences between the two groups studied (Table 5). The estimated gene flow between the two groups, Nm (Gcs), was 0.302, well below 1.0. Gene flow among populations within the two groups was relatively high (Nm (Gst) = 0.79).

Table 5 Genetic diversity and differentiation between weedy rice of Liaoning and Guangdong provinces, estimated by POPGENE (version 1.31)

Within each province (Table 5), the genetic diversity within populations (Hs) accounted for a large proportion of the total diversity (Ht). The gene flow estimates Nm (0.822 and 0.776) indicated high gene flow among populations within Liaoning and Guangdong, respectively. Correspondingly, Gst was low (0.378 and 0.392, respectively).
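The relationship between the differentiation coefficients and the gene flow estimates reported above can be checked with a short Python sketch. It assumes the commonly used estimator Nm = 0.5(1 − Gst)/Gst, with Gst = (Ht − Hs)/Ht; the text does not state the formula explicitly, so this is an assumption, although it reproduces the reported values to within rounding. The function names are illustrative.

```python
def gst(ht, hs):
    """Coefficient of differentiation from total (Ht) and within-population (Hs) diversity."""
    return (ht - hs) / ht


def gene_flow(g):
    """Assumed estimator: Nm = 0.5 * (1 - G) / G, for G = Gst or Gcs."""
    return 0.5 * (1.0 - g) / g


# Differentiation values reported in the text.
reported = {
    "between groups (Gcs)": 0.623,
    "within Liaoning (Gst)": 0.378,
    "within Guangdong (Gst)": 0.392,
}
for label, g in reported.items():
    print(f"{label}: G = {g:.3f}, Nm = {gene_flow(g):.3f}")
```

Under this estimator, Nm falls below 1 whenever the differentiation coefficient exceeds one third, so all three comparisons give Nm < 1, with the between-group value clearly the lowest.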

Genetic relationships of weedy rice populations

To assess the genetic relationships among weedy rice, cultivated rice, and common wild rice (O. rufipogon) in China, pairwise genetic distances between populations were calculated (Table 6). Weedy rice populations were genetically closer to rice varieties from the region where they were collected than to cultivated or wild rice from other regions.

Table 6 Genetic distance (Nei 1978) between weedy rice, cultivated rice and wild rice

A UPGMA dendrogram based on Nei’s genetic distance was constructed (Fig. 2) to show the genetic relationships among the various groups: the two weedy rice groups from Liaoning and Guangdong, the cultivated rice from the sampled fields, other Asian weedy rice, cultivated Indica (15) and Japonica (11) rice planted in the main rice-producing regions of China, and wild rice (O. rufipogon). All samples clustered into two main groups. Group I contained the weedy rice populations and cultivated rice collected from Guangdong province. Indica rice from other provinces of China also clustered in this group, together with the weedy rice from Malaysia and Vietnam. Group I was denoted the ‘Indica’-like group because it included all Indica rice accessions. Group II contained the weedy rice populations collected from Liaoning province and the cultivated rice collected from the same fields. Japonica rice from other provinces of China also appeared in this group, as did wild rice. Group II was denoted the ‘Japonica’-like group because it included all Japonica rice accessions. Within the two main groups, subgroups could be distinguished. In Group I, the weedy rice of Guangdong province and the cultivated rice collected from the same fields formed a clade, with relatively short distances between one another and varying distances to other cultivated rice. Cultivated rice from different provinces showed relatively large distances from one another. In Group II, weedy rice and cultivated rice from Liaoning province, except for Tieling cultivated rice, also clustered into an independent clade that was distant from the Japonica rice of other provinces and from wild rice from throughout China. Wild rice in this group formed a relatively independent clade that was distant from Japonica rice from throughout China, and especially from the cultivated rice and weedy rice of Liaoning province.

Fig. 2 UPGMA dendrogram constructed by NTSYSpc (version 2.0), based on Nei’s unbiased genetic distance among different populations of weedy rice, cultivated rice from Guangdong and Liaoning provinces, cultivated rice from other provinces in China, Asian weedy rice and some types of wild rice from China. The genetic distance (Nei 1978) was calculated by POPGENE 1.31 from the band information of 21 SSR loci. The two groups of weedy rice (Liaoning and Guangdong) cluster in two independent clades, each with the cultivated rice collected from the same sample fields. Nomenclature: each weedy rice population is represented by its sample site (e.g., YangJiang, Korea). Cultivated rice and wild rice are represented by the abbreviation of the location followed by ‘R’ or ‘W’ (e.g., cultivated rice from ZhaoQing is ZhaoQ.R and wild rice from FuJian is FuJ.W). The exception is Guangde rice, which is represented by Guangde.R to distinguish it from Guangdong rice, which is represented by Guangd.R

PCA, which can illustrate the genetic relationships of weedy and cultivated rice at the level of individuals, was conducted on the SSR data matrix of the 21 loci for all individuals (Fig. 3). The first and second principal components accounted for about 44% of the total genetic variation. The scatter plot of the two components showed a clear genetic relationship among weedy rice, cultivated rice, and common wild rice. Weedy rice individuals from Liaoning occupied a limited space together with the cultivated rice from the same fields, separated from most Japonica cultivars from other provinces in China. The weedy rice individuals from Guangdong occupied a separate, broader space. Several Indica cultivars from other provinces in China overlapped with the weedy rice from Guangdong, as did the cultivated rice from the Guangdong sample fields. Common wild rice was located between the Liaoning and Guangdong groups and was scattered over a large area. In addition, weedy rice from Korea was positioned close to the weedy rice of Liaoning, whereas weedy rice from Malaysia was positioned close to the weedy rice of Guangdong. Nevertheless, weedy rice from Vietnam, which is geographically closer to Guangdong province, was genetically distant from the weedy rice of Guangdong.

Fig. 3 Two-dimensional PCA display of the distribution of weedy rice, cultivated rice, and wild rice individuals. Weedy rice and cultivated rice collected from the same field concentrated in a limited space, overlapping slightly with cultivated rice from other provinces in China. Wild rice is represented between the two groups of weedy rice from the Liaoning and Guangdong provinces. Ind-CHN and Jap-CHN represent Indica rice and Japonica rice, respectively, from other provinces in China; wild-CHN denotes the wild rice of China

Discussion

The genetic diversity of weedy rice was evaluated by SSR markers. These markers are robust, co-dominant, characterized by high allelic variation, and widely distributed throughout the Oryza genome (Panaud et al. 1996). Within each of the two provinces, however, weedy rice exhibited relatively low genetic diversity. The weedy rice of Liaoning province exhibited a low level of diversity similar to that described by Yu et al. (2005), who reported Nei’s diversity h = 0.053. However, the level of diversity obtained in our study was much lower than that reported by Cao et al. (2006), who reported h = 0.313. Nonetheless, the uneven distribution of diversity among populations in our study is similar to that reported by Cao et al. The discrepancies may originate from differences in sample size, sampling sites and the SSR markers selected. The overall genetic diversity of weedy rice across the two provinces was relatively high, which may be due to the distinct ecotypes of weedy rice found in each of the two provinces. Similarly, Korean weedy rice (including 24 long- and short-grain strains) showed high polymorphism between the two grain types, whereas lower polymorphism was found within each grain type (Cho et al. 1995).

Within each of the two groups of weedy rice, we found low levels of genetic differentiation. The low genetic differentiation (Gst) among populations was probably caused mainly by the high level of gene flow, but it could also be explained by the weedy rice strains having originated from the same progenitor relatively recently. However, compared with out-crossing plants (biennial wormwood, Gst = 0.22) and species with sexual reproduction (Gst = 0.225) (Mengistu et al. 2004), the Gst of weedy rice was still high, owing to its inbreeding reproductive system.

Both the cluster analysis and the PCA of all individuals (including weedy rice strains, locally cultivated rice, cultivated Indica and Japonica rice from other provinces, wild rice, and weedy rice from Korea, Vietnam and Malaysia) gave similar results: weedy rice populations in the two provinces had very close genetic relationships with cultivated rice from the same field. The genetic distance between weedy rice and locally cultivated rice was shorter than that between weedy rice and cultivated rice from other provinces, and weedy rice was more distant still from wild rice.

Here, the weedy rice populations of the two provinces clustered with Indica and Japonica rice, respectively, which suggests that weedy rice in Liaoning and Guangdong provinces of China probably originated directly from local cultivated rice, possibly through reversion mutation or de-domestication (Cao et al. 2006).

Wild rice is not native to Liaoning province, so it is widely accepted that Liaoning weedy rice has no direct relationship with wild rice. Guangdong province, by contrast, is an area where common wild rice is abundant; nevertheless, the weedy rice occurring there exhibited only a weak relationship with wild rice, and the two fell into different clades in the cluster dendrogram. These results suggest that weedy rice in the two provinces did not evolve from wild rice. The cluster dendrogram also agreed with previous reports that wild rice distributed throughout China is Japonica-like (Wang 1996; Li et al. 2006; Chen et al. 2008).

Gene flow through pollen transfer between weedy rice and local cultivated rice might contribute to the close genetic relationship between them (Lu and Snow 2005). However, the cluster analysis did not always place weedy rice closest to the cultivated rice from the same field (e.g., the Panjin and Tieling weedy and cultivated rice in Liaoning, and the Zhanjiang weedy and cultivated rice in Guangdong) (Fig. 2). Gene flow may also contribute to these relationships. Tieling weedy rice, for example, was not most closely related to the cultivated rice from the same field. Weedy rice may therefore not always originate directly from the cultivated rice varieties grown in the same field. One possible reason is that weedy rice takes time to evolve from cultivated rice, and a given cultivar may have been grown for too short a period to allow de-domestication. Moreover, some types of rice cultivars may de-domesticate into weedy rice more easily than others. Further investigation in this area is required.

Our results revealed no clear type intermediate between Indica and Japonica. We therefore suggest that weedy rice in Liaoning and Guangdong provinces of China probably did not evolve from a hybrid between Indica and Japonica rice.

Interestingly, earlier research on weedy rice around the world had already provided some support for our conclusion that weedy rice may have originated from local cultivated rice. Ishikawa et al. (2005) found that in Bhutan, weedy rice and local cultivated rice shared a similar allelic pattern of microsatellite markers, although only four pairs of SSRs were involved in the analysis. In another study (Suh et al. 1997), about 150 weedy rice individuals were collected from different countries, primarily Korea. The PCA of all materials based on several traits showed that weedy rice had a close relationship to local cultivated rice, similar to our result. Similar to weedy rice in the Asian region (Tang and Morishima 1988), weedy rice in the Mediterranean region (where Japonica rice was traditionally grown) was most closely related to Japonica rice. Furthermore, in Brazil, where Indica rice is typically grown, the weedy rice was closely related to Indica varieties (Ferrero 2001). It may be concluded that the de-domestication of local cultivated rice into weedy rice has occurred frequently in rice-growing regions.

The origin of weedy rice is intimately linked with rice breeding and cultivation. The adoption of simplified rice cultivation techniques such as direct seeding may create an environment suitable for the evolution and survival of weedy rice. In such a field environment, mutations of cultivated rice may accumulate gradually during the transformation to weedy rice, and weedy rice populations may eventually evolve from these mutants. It is accepted that mutation and evolution of cultivated rice toward weedy rice may occur in any rice-growing area. However, only when the field environment is suitable to preserve those mutants will weedy rice evolve from cultivated rice. Hence, the control of weedy rice should include efforts to create field environments that do not preserve cultivated rice mutants.