Introduction

Depending on the classification followed, the genus Citrus is one of 33 (Swingle and Reece 1967) or 26 (Mabberley 2008a, b) genera in the subfamily Aurantioideae of the family Rutaceae. Citrus is divided into two subgenera: the commonly cultivated types are placed in the subgenus Citrus, whereas species of the subgenus Papeda do not bear edible fruit (Moore 2001).

The taxonomy of Citrus is complex and the precise number of natural species is unclear. Until the mid-1970s, Citrus taxonomy was based solely on morphological and geographical data, leading to two widely used classification systems. The Swingle system (Swingle and Reece 1967) is relatively simple, containing 16 species, whereas the Tanaka taxonomy recognizes up to 162 species (Tanaka 1977). This lack of agreement reflects differing opinions on the degree of difference that justifies species status; Tanaka split the genus Citrus into many small groups. There is no definitive work on Citrus taxonomy, and many scientists use a system intermediate between these two (Hodgson 1965, 1967: 36 species; Singh and Nath 1969: 31 species).

Scora (1975) and Barrett and Rhodes (1976) suggested that there are only three (or four) ‘basic’ true species within the subgenus Citrus (as defined by Swingle): citron (C. medica), mandarin (C. reticulata), and pummelo (C. maxima), and sometimes lime (now generally considered a hybrid, C. × aurantiifolia). Other cultivated Citrus taxa within the subgenus Citrus are believed to be hybrids derived from these true species, species of the subgenus Papeda, or closely related “genera”. Interestingly, the earliest plant taxonomists also believed that there were only a few valid species in the subgenus Citrus (two species and three varieties in Linnaeus 1753; three species in Hooker 1875), and this idea has gained scientific support in recent years from molecular data (Federici et al. 1998; Nicolosi et al. 2000). For instance, morphological and molecular studies have indicated that lime (C. × aurantiifolia) and lemon (C. × limon) arose from interspecific crosses with C. medica (citron) as one of the parent species (Scora 1975; Barrett and Rhodes 1976; Federici et al. 1998; Nicolosi et al. 2000).

The study of genetic variability and relationships among cultivated taxa is complicated by several factors, such as the high frequency of bud mutation and nucellar embryony, a long history of cultivation and wide cross-compatibility, all of which lead to taxonomic ambiguities (Nicolosi et al. 2000; Moore 2001). Spontaneous or artificial hybridization and sport formation have probably played an important role in the origin of many cultivated Citrus taxa. The wide human-induced dispersion of Citrus across geographic boundaries has further facilitated “intergeneric” and intrageneric crossing, leading to an immense variety of morphological forms. Interesting hybrids or sports can easily be vegetatively propagated, leading to a contrast between high levels of morphological (mainly agronomic) trait diversity and low levels of genetic variability within taxa, as described for clementine (C. reticulata Blanco; Bretó et al. 2001), lemon (C. × limon (L.) Osbeck; Gulsen and Roose 2001) and trifoliate orange (Poncirus trifoliata; Fang et al. 1997).

Understanding taxonomy, phylogenetic relationships and genetic variability in Citrus is critical for determining genetic relationships, characterizing germplasm, establishing breeding programs and registering new cultivars. Vietnam, located in the South East Asian center of origin of Citrus (Webber 1967; Scora 1975), is a center of biodiversity for wild and cultivated Citrus accessions (Tanaka 1954). Citrus has always been one of Vietnam’s most popular fruit crops and, as a consequence, it is grown widely from the North to the South of Vietnam, leading to a high abundance of Citrus genetic resources.

In this study, the phylogenetic relationships among 61 Citrus accessions collected in Vietnam were analysed using internal transcribed spacer (ITS) sequences of the rDNA, and compared with data from 8 accessions present in the NCBI database.

Materials and methods

Plant material

Table 1 gives an overview of the analysed accessions, their origin and institute codes. Table 2 provides the scientific names of all taxa used in this manuscript, corresponding vernacular names and their names as put forward by Mabberley (2008a, b).

Table 1 Overview of analyzed taxa, accession names, isolate codes, GenBank accession numbers, and their origin
Table 2 Scientific names of all taxa used in this manuscript, corresponding vernacular names and their name as put forward by Mabberley (2008a, b)

In total, 51 accessions belonging to Citrus subgenus Citrus were collected: 1 C. × paradisi (grapefruit: G), 11 C. maxima (pummelo: P), 5 C. medica (citron: C), 3 C. × aurantiifolia (lime: L), 1 C. × limon (lemon: LI), 12 C. × sinensis (sweet orange: O), 1 C. × aurantium (sour orange: OS), 15 C. reticulata (mandarin: M), 1 C. reticulata ‘Clementine’ (Clementine: M), 1 C. × nobilis (king mandarin: MK). Furthermore, the following 10 accessions, which have been described as belonging to closely related genera or subgenera, were included in the sample set: 5 specimens of C. hystrix (kaffir lime: KL), belonging to the subgenus Papeda, 1 Poncirus trifoliata accession (PT), 3 Fortunella accessions (Fortunella japonica and Fortunella margarita: F) and 1 Murraya paniculata specimen.

In addition to the 61 specimens collected in Vietnam, ITS sequence data from 6 Citrus accessions and from 2 accessions of the related genus Murraya (Murraya paniculata and Murraya koenigii) were available in the NCBI database.

DNA isolation

Total cellular DNA was isolated as described by Rogers and Bendich (1988) with minor modifications. Young leaves from fully expanded and mature plants were collected and maintained at low temperature in polyethylene bags. In the laboratory, the leaves were washed in distilled water and 70% ethanol, then ground using a Retsch MM 200 mixer mill. Each sample was suspended in 1.0 ml of DNA extraction buffer. After incubation at 65°C for 30 min with occasional vigorous shaking, the samples were centrifuged at 13,000g for 10 min. The supernatant was collected and an equal volume (about 700 μl) of isopropanol was added. The samples were mixed and placed on ice (or at −20°C) for 2 h. The samples were then centrifuged at 13,000g for 10 min and the supernatant was discarded. After addition of 400 μl of TE buffer and 5 μl RNase, the samples were incubated at 37°C for 20 min. Next, 400 μl of CTAB buffer was added and the samples were transferred to a water bath at 65°C for 15 min. Afterwards, an equal volume of isoamyl alcohol:chloroform (24:1) was added, and the samples were centrifuged at 13,000g. Two volumes of 96% ethanol were added to the aqueous (upper) phase. After incubation at room temperature for 5 min, the samples were centrifuged for 5 min at 10,000g. The pellet was then washed twice with 70% ethanol. The DNA was resuspended in 200 μl TE buffer with a short incubation at 37°C. DNA samples were stored at −20°C.

DNA extraction, PCR and sequencing

PCR amplification of the ITS region, including the 5.8S rDNA region, was performed using primers ITS1 and ITS4 (ITS1: 5′ TCCGTAGGTGAACCTGCGG 3′; ITS4: 5′ TCCTCCGCTTATTGATATGC 3′) as described by White et al. (1990), on a Perkin Elmer 9700 thermal cycler (Applied Biosystems). Final reaction volumes of 25 μl each contained 50 ng genomic DNA, 0.5 μM of each primer, 0.2 mM dNTPs, 0.5 U Taq DNA polymerase (Fermentas), 1× PCR buffer supplied by the manufacturer and about 2.5 mM MgCl2. The amplification programme consisted of predenaturation at 94°C for 90 s; 30 cycles at 95°C for 50 s, 55°C for 70 s and 72°C for 90 s; a final incubation at 72°C for 3 min; 1 min at 30°C; and a final hold at 4°C. MgCl2 concentration and annealing temperature had to be optimized for some of the samples to obtain good amplification.
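The primer sequences and cycling programme quoted above lend themselves to a quick sanity check. The sketch below computes the GC fraction and a rule-of-thumb melting temperature for each primer, and the programmed run time. The Wallace rule used for the melting temperature is an assumption of this example, not a calculation reported in the study, and the function names are hypothetical.

```python
# Rule-of-thumb checks for the two ITS primers and the cycling programme
# quoted in the text. The Wallace rule, Tm = 2*(A+T) + 4*(G+C), is an
# assumption of this sketch; the study does not report how (or whether)
# primer melting temperatures were estimated.

PRIMERS = {
    "ITS1": "TCCGTAGGTGAACCTGCGG",
    "ITS4": "TCCTCCGCTTATTGATATGC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (degrees C) of a short oligo."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def program_seconds(cycles: int = 30) -> int:
    """Programmed time in seconds, ignoring ramping and the 4 C hold."""
    predenaturation = 90          # 94 C for 90 s
    per_cycle = 50 + 70 + 90      # denaturation, annealing, extension
    final_steps = 3 * 60 + 60     # 72 C for 3 min, then 30 C for 1 min
    return predenaturation + cycles * per_cycle + final_steps


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
print("programme:", program_seconds() / 60, "min")
```

Estimates of this kind are only a starting point; as noted above, the annealing temperature still had to be optimized empirically for some samples.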

PCR products were purified with the PureLink™ PCR Purification kit (Invitrogen). Purified fragments were directly sequenced with the PCR primers using the ABI prism BigDye™ Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) on an automated sequencer (ABI prism 3130, Applied Biosystems).

Phylogenetic analyses

In addition to the 61 new sequences obtained from Vietnamese accessions in the current analysis, ITS sequence data from 6 Citrus accessions and from 2 accessions of the related genus Murraya (Murraya paniculata and Murraya koenigii) were included in the dataset.

Sequences were aligned using ClustalX (Thompson et al. 1997) followed by manual adjustments using BioEdit 7.0.5.3. Phylogenetic analyses were carried out using PAUP* v4.0b10 (Swofford 2002) and MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). Parsimony analyses were performed using the heuristic search option with random sequence addition (100 random replicates) and TBR branch-swapping. All characters were weighted equally, and gaps were treated as missing data. Constant and uninformative characters were removed from the data matrix. Consistency index (CI), retention index (RI) and rescaled consistency index (RC) were calculated. Support for the different clades was tested by bootstrap analysis (100 replicates using heuristic search, simple sequence addition). Bayesian inference was run for 3,000,000 generations, and the first 100,000 generations were discarded as burn-in.
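The bootstrap step resamples alignment columns with replacement and repeats the tree search on each pseudo-replicate. The sketch below illustrates only the resampling part on a small hypothetical alignment; it is not the PAUP* implementation, and the function name and toy data are assumptions made for illustration.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment: list of equal-length aligned sequences (strings).
    Columns are drawn with replacement, so the replicate keeps the same
    number of taxa and positions as the original matrix.
    """
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


# Hypothetical four-taxon alignment, used only to show the mechanics.
toy = ["ACGTAC", "ACGTTC", "ACCTTC", "GCCTTA"]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_columns(toy, rng) for _ in range(100)]
print(replicates[0])
```

In a full analysis, a tree search would be run on each of the 100 replicates and the proportion of replicates recovering a clade would be reported as its bootstrap support.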

Results

Sequence data from the ITS region of the rDNA were analysed for 69 accessions, including 8 sequences obtained from the NCBI database. The alignment of all sequences comprised 703 positions (including gaps), of which 298 were variable and 126 were parsimony-informative.
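For readers unfamiliar with these categories, a site is parsimony-informative when at least two character states each occur in at least two sequences. The sketch below classifies alignment columns accordingly, treating gaps as missing data as in the parsimony analysis. The toy alignment, the function names and the choice to treat "N" and "?" as missing are assumptions of this example.

```python
from collections import Counter

# Symbols ignored when scoring a column. Treating "?" and "N" like gaps
# is an assumption of this sketch.
MISSING = set("-?N")


def classify_column(column: str) -> str:
    """Classify one alignment column.

    Returns 'constant', 'variable_uninformative' or 'informative'.
    """
    counts = Counter(c for c in column.upper() if c not in MISSING)
    if len(counts) <= 1:
        return "constant"
    # Informative: at least two states, each present in >= 2 sequences.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    if shared_states >= 2:
        return "informative"
    return "variable_uninformative"


def summarize(alignment):
    """Count column classes over a list of aligned sequences."""
    columns = ("".join(col) for col in zip(*alignment))
    return Counter(classify_column(col) for col in columns)


# Hypothetical alignment, not data from the study.
toy = [
    "ACGTA-",
    "ACGTTC",
    "ATCTTC",
    "GTCTAA",
]
summary = summarize(toy)
variable_total = summary["informative"] + summary["variable_uninformative"]
print(dict(summary), "variable:", variable_total)
```

Under this scheme the number of variable positions is the sum of the informative and the variable-but-uninformative columns, which mirrors how the 298 variable and 126 parsimony-informative positions are reported above.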

Parsimony analysis produced 14,778 equally parsimonious trees of 286 steps with a consistency index (CI) of 0.6538, a retention index (RI) of 0.8350, and a rescaled consistency index (RC) of 0.5460. A majority-rule consensus tree was constructed from these trees as shown in Fig. 1. The bootstrap values (100 replicates) are shown on each branch. The tree reconstructed by Bayesian analysis is shown in Fig. 2. The tree obtained is very similar to that of the parsimony method, although the separation of the subclusters is less obvious in the Bayesian tree.
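The three indices are related: CI is the minimum possible number of steps divided by the observed tree length, RI compares the observed length with the maximum and minimum possible lengths, and RC is their product. The sketch below applies these standard definitions. Only the tree length of 286 comes from the text; the minimum and maximum step counts (187 and 787) are hypothetical values, chosen solely because they reproduce the reported, rounded indices.

```python
def consistency_index(min_steps: int, tree_length: int) -> float:
    """CI = minimum possible steps / observed tree length."""
    return min_steps / tree_length


def retention_index(min_steps: int, max_steps: int, tree_length: int) -> float:
    """RI = (max steps - observed length) / (max steps - min steps)."""
    return (max_steps - tree_length) / (max_steps - min_steps)


def rescaled_consistency(ci: float, ri: float) -> float:
    """RC = CI * RI."""
    return ci * ri


TREE_LENGTH = 286  # reported in the text
MIN_STEPS = 187    # hypothetical: not reported, back-calculated from CI
MAX_STEPS = 787    # hypothetical: not reported, back-calculated from RI

ci = consistency_index(MIN_STEPS, TREE_LENGTH)
ri = retention_index(MIN_STEPS, MAX_STEPS, TREE_LENGTH)
rc = rescaled_consistency(ci, ri)
print(round(ci, 4), round(ri, 4), round(rc, 4))
```

Multiplying the rounded published values directly (0.6538 × 0.8350) gives about 0.5459, which agrees with the reported RC of 0.5460 to within rounding.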

Fig. 1 Majority-rule consensus tree of 14,778 most parsimonious trees of 69 Citrus accessions based on ITS of the rDNA sequence data. Tree length = 286; consistency index (CI) = 0.6538; retention index (RI) = 0.8350; rescaled consistency index (RC) = 0.5460. Bootstrap values above 40% are given on the nodes. The tree is rooted with the three Murraya accessions

Fig. 2 Bayesian phylogenetic tree of 69 Citrus accessions based on ITS of the rDNA sequence data. The posterior probability is given on each node. The tree is rooted with the three Murraya accessions. The scale bar represents branch length (number of substitutions/site)

The phylogenetic trees based on both maximum parsimony and Bayesian analyses show a clear separation between the three ‘basic’ species as proposed by Scora (1975) and Barrett and Rhodes (1976). Cluster 1 contains all C. maxima accessions together with C. × paradisi and one C. × aurantiifolia accession (1a), as well as Poncirus trifoliata and some of the C. × sinensis genotypes (1b). Cluster 2 combines C. medica with all other C. × aurantiifolia accessions and C. × limon (2a), and C. hystrix (2b). C. reticulata and C. × aurantium are grouped together with most of the C. × sinensis accessions (3a). Fortunella japonica and Fortunella margarita are in 3b. Pairwise sequence divergence ranged from 0 (between multiple accessions) to 0.215 (between Murraya koenigii and Cmedica1) with an average of 0.038. Within Citrus, average pairwise sequence divergence was 0.030, with a maximum of 0.143 (Cmedica1 vs. 043Xadoai). No sequence divergence was found among a few mandarin accessions from cluster 3a, or among some oranges from the same cluster. Three C. medica accessions (cluster 2a) also revealed 100% ITS sequence identity. Furthermore, seven C. maxima members of cluster 1 have identical ITS sequences. Interestingly, the ITS sequence of the grapefruit accession was 100% identical to that of these C. maxima accessions.
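Pairwise divergence values such as those above can be obtained from an alignment by comparing sequences site by site. The text does not state which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance with gapped sites skipped. The sequences and names are hypothetical.

```python
from itertools import combinations

GAP = set("-?")  # sites with these symbols are skipped (an assumption)


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among sites present in both sequences."""
    pairs = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if x not in GAP and y not in GAP
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergence_summary(seqs: dict):
    """Return (min, max, mean) p-distance over all pairs of sequences."""
    dists = [p_distance(seqs[i], seqs[j]) for i, j in combinations(seqs, 2)]
    return min(dists), max(dists), sum(dists) / len(dists)


# Hypothetical aligned sequences, not data from the study.
toy = {
    "seq_a": "ACGTACGTAC",
    "seq_b": "ACGTACGTAC",
    "seq_c": "ACGTTCGTAA",
    "seq_d": "AC-TTCGAAA",
}
low, high, mean = divergence_summary(toy)
print(low, round(high, 3), round(mean, 3))
```

A minimum of 0 in such a summary corresponds to pairs of accessions with identical sequences, as reported here for several mandarin, orange, citron and pummelo accessions.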

Discussion

In this study sequence data from ITS of the rDNA of 69 accessions from the genus Citrus and related genera were obtained and their evolution was investigated using maximum parsimony and Bayesian analyses. In contrast to previous studies the current phylogenetic analysis includes a larger number of closely related accessions of a few closely related species instead of one (or a few) specimens from different genera, which allows us to investigate relationships at a lower taxonomic level and to investigate evolutionary divergence within taxa.

The separation of the three ‘true’ Citrus species (C. medica, C. maxima and C. reticulata) is confirmed by their grouping in three different clusters in our analysis of ITS rDNA sequences.

As isozyme and morphological data suggested before (Barrett and Rhodes 1976; Torres et al. 1978) and could be expected based on vegetative propagation of cultivars, ITS-data reveal a close evolutionary relationship among the analysed C. maxima accessions. Furthermore, our data confirm a close evolutionary relationship between C. maxima, C. × paradisi and, although more distantly, some sweet orange accessions (C. × sinensis). cpDNA analysis of these three species has also shown these species to be very closely related (Nicolosi et al. 2000; Kyndt et al., unpublished data).

Grapefruit (C. × paradisi) has been proposed to be of hybrid origin, with pummelo as mother and sweet orange as father (Gmitter 1995; Moore 2001) and subsequent backcrossing with pummelo (Fang and Roose 1997; Herrero et al. 1996; Pang et al. 2007; Mabberley 2008a, b). This hypothesis is supported by our ITS sequence data, since the grapefruit accession shows 100% ITS sequence identity with some pummelo accessions.

The mandarin (C. reticulata) cluster is not well resolved, as also seen in other molecular data analyses (Federici et al. 1998; Barkley et al. 2006). Sour oranges (C. × aurantium) and most of the sweet oranges (C. × sinensis) cluster among the mandarins, confirming the mandarins as one of their parental species (Barrett and Rhodes 1976) as suggested by previous molecular data (Nicolosi et al. 2000; Barkley et al. 2006; Pang et al. 2007).

Sweet orange (C. × sinensis) is thought to be a natural hybrid with predominantly C. reticulata and some C. maxima traits (Scora 1975; Barrett and Rhodes 1976). Molecular data have already confirmed that the chloroplast genome of sweet orange is derived from pummelo (Green et al. 1986; Nicolosi et al. 2000; Barkley et al. 2006; Kyndt et al., unpublished data). This ITS sequence analysis and the chloroplast PCR-RFLP study of Jena et al. (2009) suggest that C. × sinensis has a polyphyletic origin. While some Vietnamese sweet orange genotypes are closely related to pummelo, others are grouped with mandarin. Most probably C. × sinensis originated from one or a few hybridization events between pummelo as maternal parent and mandarin as paternal parent, followed by backcrosses with one of these parents. It should be noted that some well-established C. × sinensis cultivars (‘Xadoai’ and ‘Valencia’) are highly supported in the phylogenetic trees and are found within the mandarin group.

Citrus medica is grouped with the proposed hybrid species C. × limon (lemon) and C. × aurantiifolia (lime), which is consistent with the fact that the citron has been proposed to be the paternal ancestor of several hybrids in Citrus (Federici et al. 1998; Nicolosi et al. 2000). While Barrett and Rhodes (1976) proposed that lime arose from a trihybrid “intergeneric” cross involving C. medica, C. maxima and a “Microcitrus” species, RAPD and SCAR markers (Nicolosi et al. 2000) suggested that limes resulted from a cross between C. micrantha (subgenus Papeda) and C. medica (male parent). Isozyme analyses (Torres et al. 1978) found low diversity between seven cultivars of lime (C. × aurantiifolia) and this was suggested to be due to its apomictic perpetuation (Barrett and Rhodes 1976). However, this is contradicted by the current study, where one lime accession clusters with pummelo and all others with citron, suggesting C. maxima and C. medica as parent species of C. × aurantiifolia.

Lemons (C. × limon) are thought to be natural hybrids of C. medica and lime (Scora 1975; Barrett and Rhodes 1976) or a hybrid of citron and sour orange (Gulsen and Roose 2001). Lemon is clustered within the citron-lime group (cluster 2a in Fig. 1), suggesting that lime is indeed one of the ancestors of lemon (Scora 1975; Barrett and Rhodes 1976). No relationship with sour orange is seen in our study. Although our ITS data confirm a close relationship between C. × limon and C. medica, and thereby confirm the involvement of the latter species in its hybrid origin, no clear-cut conclusions can be drawn about the other hypothetical ancestor species of this hybrid taxon.

ITS data show a close evolutionary relationship between Fortunella and Citrus spp., although their morphology is very different. This observation agrees with previous molecular studies (Green et al. 1986; Pang et al. 2007), where some analyses even showed a nested clustering of Fortunella in Citrus (Herrero et al. 1996; Federici et al. 1998; Nicolosi et al. 2000; Pang et al. 2003; Barkley et al. 2006). In our analysis, too, Fortunella spp. are clustered within Citrus, close to the C. reticulata group, confirming their recent reclassification as Citrus japonica (Mabberley 2008a, b). The same is true for Poncirus trifoliata, which is clustered within Citrus, and is now called Citrus trifoliata (Mabberley 2008a, b).

While Herrero et al. (1996) and Federici et al. (1998) find C. hystrix (kaffir lime, subgenus Papeda) clustered with C. maxima, Nicolosi et al. (2000) and our ITS data suggest that C. hystrix is closer to C. medica, C. × limon and C. × aurantiifolia. Generally, these observations demonstrate that the subdivision into the subgenera Citrus and Papeda, as proposed by Swingle and Reece (1967) on the basis of the abundant presence of acrid oil in the fruit and the very broadly winged petioles in subgenus Papeda, is not confirmed by molecular data. Based on all observations we can hypothesize that C. hystrix is a probable (grand)parent or sister species of C. × aurantiifolia or C. maxima and subsequently diverged independently from the subgenus Citrus.