Introduction

Oil camellia (Camellia spp., Theaceae), an evergreen shrub or tree native to China (Mondal 2011), has been widely cultivated in tropical and sub-tropical regions of Asia (Chen et al. 2010) for its valuable seed oil, which has several favorable characteristics, including abundant vitamins and unsaturated fatty acids (Long and Wang 2008; Wang et al. 2008; He et al. 2011), good storage stability, and health benefits (Zhong et al. 2001). The oil quality is comparable to that of olive oil (Long and Wang 2008). Camellia oil has also been used medicinally to treat intestinal disorders and burn injuries (Huang et al. 2005, 2007; Chen et al. 2007). Because of its high nutritional value, camellia oil is regarded as one of the four best edible oils produced by woody plants (Chen 2006). In China, there are over 3 million ha of oil camellia plantations, and oil production is around 260,000 t per year (The State Forestry Administration of China 2009). To date, more than 100 elite oil camellia cultivars have been authorized for promotion by local and national forestry administrations in China (The State Forestry Administration of China 2009). The effective use of these elite resources is critical to the continuing success of the camellia oil industry. Therefore, it is essential to develop a rapid and reliable toolkit for oil camellia cultivar identification.

Traditionally, cultivar identification of oil camellia has relied mainly on phenotypic characteristics, such as the morphology and color of leaves, flowers, and fruits (Que and Zhang 2008). However, phenotypic distinctions among cultivars are often ambiguous and unreliable because they vary with developmental stage and environment (Huang et al. 2006; Que and Zhang 2008). DNA fingerprinting based on various molecular markers has proven to be an efficient and promising means of distinguishing cultivars (Morell et al. 1995; Korir et al. 2013). Over the past decades, a variety of DNA markers have been used for cultivar identification of oil camellia, including randomly amplified polymorphic DNA (RAPD, Zhang 2002; Chen et al. 2005), inter simple sequence repeat (ISSR, Zhang et al. 2007; Wen et al. 2008; Yu et al. 2013), and sequence-related amplified polymorphism (SRAP, Lin et al. 2010; Sun et al. 2014) markers. These molecular markers provide a more definitive means of identifying oil camellia cultivars than traditional phenotypic distinction.

Recently, thousands of simple sequence repeat (SSR) markers were developed and characterized for oil camellia (Wen et al. 2012; Shi et al. 2013; Xia et al. 2014). SSR markers possess several advantages over other molecular markers, including co-dominance, high polymorphism, and good reproducibility (Varshney et al. 2005; Oliveira et al. 2010; Kaur et al. 2015). Furthermore, microsatellite markers are species-specific and are therefore less affected by cross-contamination from non-target organisms (Selkoe and Toonen 2006), such as plant endosymbiotic bacteria and other pathogenic microorganisms. The genetic markers generated by species-specific SSR primers are desirable genetic tools for distinctness, uniformity, and stability (DUS) testing in the protection of new plant varieties (UPOV International Union for the Protection of New Varieties of Plants 2010). SSRs have been successfully applied for cultivar identification in a wide range of plant species, such as cereals, vegetables, fruits, and oilseeds (Wünsch and Hormaza 2002; Korir et al. 2013; Kaur et al. 2015) and have been identified as one of the most powerful marker systems for plant variety characterization.

Although large SSR primer databases have been developed for camellia species (Wen et al. 2012; Shi et al. 2013), their utility in cultivar identification has not been determined. In the present study, we aim to (1) develop a reliable genetic toolkit for oil camellia cultivar identification, (2) construct DNA fingerprints for the elite cultivars of oil camellia, and (3) examine whether there are synonyms in the authorized oil camellia cultivars.

Results and discussion

Determination of a core SSR primer set

The selection of a set of highly polymorphic core SSR primers is a crucial step for DNA fingerprinting in cultivar identification (UPOV International Union for the Protection of New Varieties of Plants 2010). A large number of SSR primers have so far been developed for oil camellia (Wen et al. 2012; Shi et al. 2013; Xia et al. 2014). Using a subset of six oil camellia cultivars from different origins, we screened a total of 109 primer pairs by denaturing polyacrylamide gel electrophoresis and selected ten SSR primers that generated highly polymorphic and clear fingerprinting profiles (Fig. 1). The primer sequences, along with information on the SSR motif and melting temperature (Tm), are provided in Table 1. Subsequently, DNA fingerprints for 128 elite oil camellia cultivars were generated with the selected primer pairs on an ABI 3130 DNA Analyzer (Applied Biosystems, USA). Fingerprints generated by these primers exhibited perfect reproducibility under the highly stringent amplification conditions used in this study. The number of distinct fingerprinting peaks revealed by each primer varied from 9 to 83, with an average of 42.4, and the sizes of these peaks ranged from 91 to 344 bp (Table 2). SSRs with polymorphism information content (PIC) values >0.5 are normally considered highly informative markers (Botstein et al. 1980). The PIC values of these primers ranged from 0.696 to 0.973, with an average of 0.906, which is much higher than 0.5 (Table 2). The high number of fingerprinting peaks observed in this study might be due to the following reasons: (1) highly polymorphic primers were selected for fingerprinting the oil camellia cultivars; (2) camellia trees possess a gigantic genome of about 4.0 Gb (Junichi et al. 2006), so the primers might have multiple target copies in the genome; and (3) the ploidy level of camellia trees varies from diploid to octoploid (Huang et al. 1984; Li 2001; Zhuang and Dong 1984), which greatly increases the number of fingerprinting peaks revealed by these SSR primers.

Fig. 1 Representative polyacrylamide gel electrophoresis profiles of PCR products amplified using the four SSR primer pairs

Table 1 Core SSR primer sequences
Table 2 Summary of the genotyping information for each SSR primer

According to the UPOV guidelines (UPOV International Union for the Protection of New Varieties of Plants 2010) for DNA fingerprinting using molecular markers for the protection of new plant varieties, only markers with distinct PCR bands, high reproducibility, and reasonable polymorphism should be selected for this purpose. A core SSR primer set for cultivar identification has been selected in many economically important plants, such as maize (Wang et al. 2010, 2011), cotton (Pan et al. 2008), and Brassica (Wei et al. 2012); however, such a marker toolkit is not presently available for oil camellia cultivar identification. The SSR primer set screened in this study provides a useful genetic toolkit for the identification and protection of the elite oil camellia cultivars.

DNA fingerprinting of the elite oil camellia cultivars

The DNA fingerprints of 128 elite oil camellia cultivars were generated using the established marker toolkit (Fig. 2). The complete DNA fingerprinting profiles are available at http://115.29.234.170/Database/Camellia_fingerprinting/. The digital conversions of these profiles, based on the allele sizes, are displayed in Supplementary Table S2. The number of distinct genotypes identified by each primer pair ranged from 20 (NJFUC601) to 107 (NJFUC57, NJFUC243 and NJFUC833), and the majority of cultivars (124) could be distinguished by a subset of six SSR primers, including NJFUC57, NJFUC243, NJFUC273, NJFUC787, and NJFUC833 (Table 2 and Supplementary Table S3).
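The counting described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each cultivar's fingerprint is stored as a mapping from primer name to a tuple of allele sizes; the cultivar names (`cvA`, `cvB`, `cvC`), primer names (`SSR1`, `SSR2`), and allele sizes are hypothetical and are not taken from the Supplementary Tables.

```python
from collections import defaultdict

def distinct_genotypes(profiles, primer):
    """Count the distinct genotypes observed at one primer across all cultivars."""
    return len({profile[primer] for profile in profiles.values()})

def indistinguishable_groups(profiles, primers):
    """Group cultivars that share the same genotype at every listed primer."""
    groups = defaultdict(list)
    for cultivar, profile in profiles.items():
        key = tuple(profile[primer] for primer in primers)
        groups[key].append(cultivar)
    # Only groups with two or more cultivars remain unresolved by this primer subset.
    return [sorted(members) for members in groups.values() if len(members) > 1]

# Hypothetical fingerprints: primer name -> tuple of allele sizes (bp).
profiles = {
    "cvA": {"SSR1": (101, 105), "SSR2": (200, 204)},
    "cvB": {"SSR1": (101, 105), "SSR2": (200, 204)},
    "cvC": {"SSR1": (103,), "SSR2": (200, 204)},
}

n_ssr1 = distinct_genotypes(profiles, "SSR1")
unresolved = indistinguishable_groups(profiles, ["SSR1", "SSR2"])
```

In this toy data set, `cvA` and `cvB` remain indistinguishable across both primers, which is the kind of pair that would then be examined as a candidate synonym.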

Fig. 2 Electropherograms of DNA fingerprinting generated by two representative SSR primers on an ABI 3130 DNA Analyzer. a, b Primer NJFUC273 amplified in cultivars ‘AH19’ and ‘Eyou 54’; c, d Primer NJFUC600 amplified in cultivars ‘Feng 586’ and ‘Xing 48’

Based on the genotypes identified by each primer pair, we calculated the genotype frequencies at each SSR locus in the 128 oil camellia cultivars (Supplementary Table S3). The genotype frequency at each locus ranged from 0.008 to 0.579 (Supplementary Table S4). Under the assumption that the genotype frequencies at different loci are independent of each other, the probability of matching genotypes across all ten loci by chance alone can be calculated as \( P=\prod_{i=1}^{10} G_{fi} \), where \( G_{fi} \) is the genotype frequency of a particular cultivar revealed by the ith SSR primer. The by-chance-alone matching probabilities of genotypes across all ten loci were smaller than 1.95E-15 (Supplementary Table S4), which means that two cultivars with identical genotypes across all ten SSR loci are most likely synonyms or somatic mutants of one another. Among the examined cultivars, ‘Changlin 40’ and ‘Fuyang 40’ were found to have identical genotypes across all ten SSR loci; thus, these two cultivars should be synonyms or somatic mutants. According to the registration files of these two cultivars, there were no records indicating that they were somatic mutants of each other. We planted these two cultivars in our greenhouse, and no phenotypic distinctions have been noticed over a 3-year observation period thus far. Taken together, ‘Changlin 40’ and ‘Fuyang 40’ were identified as synonyms. In the past decade, oil camellia plantations have expanded rapidly in southern China, and seedlings of oil camellia have been traded and introduced frequently across different areas. Synonymy among cultivars is a major source of confusion in managing the elite cultivars of economic plants, including oil camellia. Although the synonymy of these two cultivars is not fully certain, our data indicate a high possibility of synonyms among the authorized oil camellia cultivars.
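As a worked illustration of the matching probability above, the following Python sketch multiplies per-locus genotype frequencies under the same independence assumption. The function name `match_probability` and the example frequencies are hypothetical; they are not values from Supplementary Table S4.

```python
from math import prod

def match_probability(genotype_freqs):
    """Chance probability of a full multilocus match, assuming independent loci.

    genotype_freqs holds one genotype frequency per SSR locus for a given cultivar.
    """
    if not genotype_freqs:
        raise ValueError("at least one locus is required")
    for g in genotype_freqs:
        if not 0.0 < g <= 1.0:
            raise ValueError(f"genotype frequency out of range: {g}")
    return prod(genotype_freqs)

# Hypothetical cultivar whose genotype has a frequency of 0.05 at each of ten loci.
example_freqs = [0.05] * 10
p_match = match_probability(example_freqs)  # 0.05 ** 10, roughly 9.8e-14
```

Because the product shrinks geometrically with the number of loci, even moderately common genotypes at each locus yield a negligible chance of a complete match across ten loci.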

DNA fingerprinting has been applied for cultivar identification in various crop species, including cereals, vegetables, fruits, oilseeds, and nuts (Korir et al. 2013). However, strictly speaking, these techniques were not used for “identification” but for “distinction of a limited number of cultivars” (Kunihisa et al. 2009), since in most cases they were applied to a limited number of defined cultivars. For managing new plant varieties, DNA fingerprints need to be established for all of the authorized cultivars of a particular species; on that basis, the probability that a test cultivar completely matches a cultivar on the authorized list can be calculated, and thus synonyms can be avoided when authorizing new plant varieties. Kunihisa et al. (2009) identified 117 strawberry cultivars with a probability of >99.9 % with 16 cleavage amplified polymorphic sequence (CAPS) markers. In this study, we established a DNA fingerprint database for 128 elite oil camellia cultivars with highly variable SSR markers, which provides a desirable fingerprinting reference for managing and monitoring oil camellia seedling trading activities.

The dendrogram of oil camellia cultivars

A dendrogram elucidating the genetic relationships among the 128 oil camellia cultivars was constructed from the fingerprinting data using UPGMA cluster analysis. The pairwise similarity coefficients between cultivars ranged from 0.788 to 1.0, with an average of 0.851. As mentioned above, cultivars ‘Changlin 40’ and ‘Fuyang 40’ were found to share a complete similarity of 1.0 (Fig. 3); together with other facts on file, this indicates that they are very likely synonyms. On the dendrogram, these two cultivars were found to be closely related to ‘Changlin 21’, indicating that ‘Changlin 40’ might be the senior synonym. It is noteworthy that most of the similarity coefficients were greater than 0.80 (98.8 % of the pairwise comparisons), indicating a narrow genetic base of these oil camellia cultivars. Oil camellia is recorded to have a cultivation history of over 2300 years in China (Xing et al. 2012). The narrow genetic base might be due to the selection pressure of domestication. Clonal selection and hybrid breeding have been the most common strategies in oil camellia breeding programs. Although conventional breeding is successful for plant variety improvement, it also leads to a narrowing of the genetic base (Mukhopadhyay and Mondal 2014). On the dendrogram, we also noticed that cultivars from a particular location were commonly scattered among different clades. Oil camellia has a long cultivation history and is frequently introduced and transported from one place to another. Moreover, the derived dendrogram depends on the data set and algorithm used. Thus, the dendrogram constructed in this study might not reflect the original geographic origins of these cultivars.
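To make the clustering step concrete, the following Python sketch implements a plain UPGMA procedure on a small, hypothetical similarity matrix. It is not the PHYLIP NEIGHBOR implementation used in this study; the cultivar labels (`cv1` to `cv3`), the similarity values, and the conversion of similarity to distance as 1 − similarity are assumptions made for illustration only.

```python
from itertools import combinations

def upgma(names, dist):
    """Return UPGMA merges as (cluster_a, cluster_b, node_height) tuples.

    dist maps frozenset({name_a, name_b}) to the pairwise distance.
    """
    # Each cluster is a tuple of leaf names; start from singletons.
    clusters = [(n,) for n in names]
    d = {frozenset((a, b)): dist[frozenset((a[0], b[0]))]
         for a, b in combinations(clusters, 2)}
    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2.0
        merged = a + b
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # Size-weighted mean equals the unweighted mean over all leaf pairs.
            d[frozenset((merged, c))] = (
                len(a) * d[frozenset((a, c))] + len(b) * d[frozenset((b, c))]
            ) / (len(a) + len(b))
        clusters.append(merged)
        merges.append((a, b, height))
    return merges

# Hypothetical similarities; cv1 and cv2 mimic a pair of candidate synonyms.
similarity = {("cv1", "cv2"): 1.0, ("cv1", "cv3"): 0.8, ("cv2", "cv3"): 0.8}
distance = {frozenset(pair): 1.0 - s for pair, s in similarity.items()}
merges = upgma(["cv1", "cv2", "cv3"], distance)
```

In this toy example the two identical profiles are joined first at height 0, which mirrors how a pair with a similarity of 1.0 appears at the tips of the dendrogram.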

Fig. 3 The dendrogram of the 128 oil camellia cultivars built based on the fingerprinting data

Conclusion

In this study, we constructed the DNA fingerprinting profiles of 128 elite oil camellia cultivars with a set of highly polymorphic SSR primer pairs. This database supplies a useful reference for managing and monitoring seedling trading activities. With the established marker toolkit, we found a high possibility of synonyms among the authorized elite cultivars of oil camellia. This study provides a reliable and efficient genetic toolkit for the characterization of new varieties of oil camellia and also gives an example that demonstrates the necessity of introducing DNA fingerprinting in the management and protection of new varieties of plants.

Materials and methods

Plant material and DNA extraction

A total of 128 elite camellia cultivars were collected from ten provinces (Anhui, Fujian, Guangxi, Guizhou, Hubei, Hunan, Jiangxi, Shanxi, Sichuan, and Zhejiang) in China through the administration of the Southern Forest Tree Seeds Inspection Center of China. These cultivars were maintained in the greenhouse on the campus of Nanjing Forestry University. Detailed information on these cultivars is listed in Supplementary Table S1.

Genomic DNA was extracted from young leaves of each cultivar using a DNeasy Plant Mini kit (Qiagen, Valencia, CA). Qualitative and quantitative evaluations of the extracted DNA were performed using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE) and agarose gel electrophoresis.

Core SSR primer screening

We synthesized a total of 109 SSR primer pairs that showed polymorphisms among different camellia cultivars, as reported by Shi et al. (2013). These primers were screened by amplification of DNA extracted from six oil camellia cultivars from different origins. PCR was carried out in an ABI-9700 Thermocycler (Applied Biosystems, Foster City, CA) in a total volume of 15 μL, containing 25 ng DNA template, 20 ng of each primer, 1.5 μL 10× PCR buffer (Mg2+ plus), 0.15 μL BSA (10 mM), 0.4 μL dNTP (2.5 mM), and 1 U Taq DNA polymerase (Lifefeng, Shanghai, China). Amplifications were performed using a touchdown program with an initial denaturation at 94 °C for 4 min; then 94 °C for 30 s, 59 °C for 30 s, and 72 °C for 30 s, followed by 9 cycles having annealing temperatures decreasing by 1 °C per cycle; and 25 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s, followed by a final extension at 72 °C for 7 min. The PCR amplification was repeated three times for each primer pair. Amplified products were separated on 8 % denaturing polyacrylamide gels (acrylamide/bis-acrylamide = 39:1) and visualized by silver staining.
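The annealing schedule of the touchdown program can be written out explicitly. The Python sketch below assumes one reading of the description above: a first cycle at 59 °C, then nine cycles stepping down by 1 °C each (58 °C to 50 °C), then 25 cycles at 50 °C. The function name and parameter names are hypothetical.

```python
def touchdown_annealing(start=59, step=1, touchdown_cycles=9,
                        plateau=50, plateau_cycles=25):
    """Return the annealing temperature (in degrees C) of every PCR cycle, in order."""
    temps = [start]  # first cycle at the starting annealing temperature
    # Touchdown phase: lower the annealing temperature by `step` each cycle.
    temps += [start - step * k for k in range(1, touchdown_cycles + 1)]
    # Plateau phase: remaining cycles at a fixed annealing temperature.
    temps += [plateau] * plateau_cycles
    return temps

schedule = touchdown_annealing()
total_cycles = len(schedule)
```

Under this reading, the touchdown phase ends at the same 50 °C used for the plateau cycles, so there is no jump in annealing temperature between the two phases.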

SSR primers for cultivar identification were selected using the following criteria: (1) successful amplification in all samples, (2) amplification of a clear and reproducible banding pattern, and (3) a number of alleles ≥3 or a polymorphism information content (PIC) value ≥0.65.
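The three criteria above amount to a simple filter, sketched below in Python. The record type `PrimerScreen`, its field names, and the example primers (`P1` to `P3`) are hypothetical; only the thresholds (≥3 alleles or PIC ≥0.65) come from the text.

```python
from dataclasses import dataclass

@dataclass
class PrimerScreen:
    name: str
    amplified_in_all: bool      # criterion 1: amplification in all samples
    clear_reproducible: bool    # criterion 2: clear, reproducible banding pattern
    n_alleles: int
    pic: float

def passes_screen(p: PrimerScreen) -> bool:
    """Apply criteria 1 to 3; criterion 3 is satisfied by either threshold."""
    informative = p.n_alleles >= 3 or p.pic >= 0.65
    return p.amplified_in_all and p.clear_reproducible and informative

# Hypothetical screening results for three primer pairs.
candidates = [
    PrimerScreen("P1", True, True, 5, 0.80),
    PrimerScreen("P2", True, False, 6, 0.85),   # fails criterion 2
    PrimerScreen("P3", True, True, 2, 0.40),    # fails criterion 3
]
selected = [p.name for p in candidates if passes_screen(p)]
```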

DNA fingerprinting

The selected SSR primers were labeled with 6-carboxyfluorescein (FAM) for DNA fingerprinting of the 128 elite oil camellia cultivars. PCR amplification was carried out as described above. All of the fluorescent PCR amplicons were analyzed on an ABI 3130 DNA Analyzer (Applied Biosystems) with GeneScan™ 500 ROX™ (Applied Biosystems) as the internal size standard. Allele scoring was performed using GeneMapper v 3.7 (Applied Biosystems).

Data analysis

PIC was calculated for each core primer pair using the formula (Keim et al. 1992): \( \mathrm{PIC}=1-\sum_{i=1}^{n} P_i^2 \), where n is the total number of alleles detected for an SSR marker, and \( P_i \) is the frequency of the ith allele. Genetic relationships among the collected cultivars were analyzed using a UPGMA algorithm with the NEIGHBOR program in PHYLIP (V3.5c, Felsenstein 1989), and a dendrogram was constructed using the MEGA6 program (Tamura et al. 2013).
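The PIC formula above can be checked with a short Python sketch. It assumes that allele frequencies are estimated from a flat list of observed allele calls at one marker; the function name `pic` and the example allele sizes are hypothetical.

```python
from collections import Counter

def pic(allele_calls):
    """PIC = 1 - sum of squared allele frequencies for one SSR marker."""
    counts = Counter(allele_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls supplied")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

# Hypothetical allele sizes (bp): frequencies are 0.5, 0.25, and 0.25.
example_calls = [150, 150, 152, 154]
example_pic = pic(example_calls)  # 1 - (0.25 + 0.0625 + 0.0625) = 0.625
```

A marker with a single allele gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows, which is why highly polymorphic primers score close to 1.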

Data archiving statement

The DNA fingerprint data generated in this study were deposited at the website http://115.29.234.170/Database/Camellia_fingerprinting/. The accession numbers will be supplied once available.