Introduction

The papaya tree cultivated in Chile belongs to the Caricaceae, a small family of dicotyledons that contains herbaceous and shrubby plants. It has been reclassified as Vasconcellea pubescens (Vasconcellea cundinamarcensis (Linden) Badillo, Carica pubescens Lenné et C. Koch) (Badillo 2000). Commonly the Vasconcellea species are called “highland papayas” because of the botanical similarity with Carica papaya L., and also because they are native to the high regions of the Ecuadorian Andes (National Research Council 1989; Badillo 2000; Van Droogenbroeck et al. 2002; Morales et al. 2004).

Although documented information about the introduction of V. pubescens into Chile is scarce, its presence seems to be ancient. Campos (1998) points out that this species would have been introduced into the north of Chile by migrations of pre-Columbian peoples, before the Spanish arrived there in 1535. Likewise, how this species reached the southern region of Chile is not documented, but it is believed that the seeds came from the northern Chilean plantations.

In Chile, V. pubescens fruits are mainly used to produce preserved fruit and small quantities of juice, jam and processed sweets. However, an additional feature of this species is its ability to produce latex with a high level of papain, an important and valuable proteolytic enzyme with industrial, pharmaceutical and research uses (Kiger 1986a, b).

Currently, only 225 hectares of V. pubescens are cultivated in Chile, 94% of them concentrated between 30° and 33° latitude south. Most orchards in this area are relatively large (10–40 ha) and maintained under standard fruit management (terracing, fertigation, pest management, pruning, etc.), supplying fruit for both the local and the international market. This species is also cultivated in the south of Chile (35°–38° latitude south), but in smaller orchards, mostly maintained as home gardens (ODEPA-CIREN 2005).

V. pubescens is an open-pollinated species in which it is normally possible to distinguish male, female and hermaphrodite plants (Storey 1976). Growers have traditionally produced their own planting material from their own orchards (Kiger 1986a): they choose the best plants, collect their fruits and recover the seeds to produce seedlings. The particular circumstances under which this species has evolved in Chile must have affected its genetic diversity and genetic structure.

Inter Simple Sequence Repeat (ISSR) markers are a very useful tool for studies of genetic diversity (Wolfe et al. 1998). They use microsatellite sequences as primers to generate multi-band patterns. Primer design requires no previous knowledge of the genome, which makes the technique attractive, quick and economical (Zietkiewicz et al. 1994). Additionally, given the abundance of microsatellite sequences, it is possible to analyze a large number of loci, which gives a high chance of finding polymorphisms even among closely related genotypes (Arnau et al. 2003; Carrasco et al. 2007).

In this study we evaluate the level and organization of the genetic diversity in V. pubescens cultivated in Chile, in order to establish a base line to assist future conservation and breeding programmes of the Vasconcellea genus.

Material and methods

Plant material

Young leaves were collected from trees growing in all the cultivated areas of Chile, from 29°58′30.1″ to 38°06′47.9″ (Table 1), comprising 333 adult plants. The collected samples were frozen in liquid nitrogen and stored at −20°C until DNA extraction.

Table 1 Genetic diversity in Vasconcellea pubescens as determined by Inter Simple Sequence Repeats (ISSR) markers

Sample collection

Sample size varied from 10 to 20 individuals depending on the number and size of the orchards found in each study area (see Table 1). Orchards, and samples within orchards, were chosen at random with respect to morphological characteristics and position within the area. Samples collected from growers that exchange planting material were treated as the same geographic sub-group for molecular analysis. In this way, the 26 orchards and home gardens sampled in this study were grouped into 5 groups for the Southern area and 5 groups for the Northern area.

Samples collected from Lipimavida, Iloca, Curanipe, Cobquecura and Contulmo-Lleu-Lleu (between 34°50′48.9″ S and 38°06′47.9″ S, Fig. 1; Table 1) were considered as belonging to the Southern area. In these locations all orchards were small home gardens (30–100 plants per orchard). Samples collected from La Serena, Valle del Elqui, Ovalle, Huentelauquen and La Ligua-Quillota (between 29°58′30.1″ S and 32°23′35.5″ S, Fig. 1; Table 1) were considered as belonging to the Northern area. In this case the plant material was collected from commercial orchards (10–40 ha).

Fig. 1 Geographical locations (latitude and longitude) of the 10 V. pubescens orchards sampled in this study and their genetic structure. Orchards from La Ligua to La Serena were considered as the Northern area, while orchards from Lipimavida to Lleu-Lleu were considered as the Southern area. Coloured circles are a graphic representation of the genetic groups (8) obtained by Structure and Bayesian Analysis of Population Structure (BAPS). Populations with the same colour represent similar genetic groups (Cobquecura and Lleu-Lleu; Curanipe and Lipimavida)

DNA extraction and PCR amplification

DNA extraction and PCR amplification were carried out according to Carrasco et al. (2007). A set of 36 primers (set ISSR 100/8, Biotechnology Laboratory, University of British Columbia, Vancouver) was tested. The basic characteristics of the primers that gave positive results are shown in Table 2. All PCR products were checked by electrophoresis on 2% agarose gels, run in 1× TAE buffer and visualized by ultraviolet fluorescence after staining with ethidium bromide (0.25 μl·ml−1 staining solution).

Table 2 Characteristics of the seven ISSR primers used for the analysis of V. pubescens

ISSR data analysis

From the agarose gels, each generated band was considered as a locus and was scored as present (1) or absent (0) for each individual. For all statistical analyses, the samples were recorded according to geographical origin (based on GPS positions). The POPGENE version 1.31 software (Yeh et al. 1999) was used to calculate the percentage of polymorphic ISSR loci (P), Nei's gene diversity (h; Nei 1973) and Shannon's index (I; Shannon 1948). Nei's gene diversity is a simple measure of genetic variability and is defined as $h = 1 - \sum x_i^2$, where (in the case of dominant molecular markers) $x_i$ is the population frequency of the ith allele (1 or 0) at a given locus (Nei 1978). Shannon's index is defined as $I = -\sum p_i \ln(p_i)$, where $p_i$ is the proportion of the ith allele in the population (Lewontin 1972). For each locus Nei's index produces values from 0 to 0.5 and Shannon's index produces values ranging between 0 and 0.73 on a natural logarithm scale (Lowe et al. 2004).
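The definitions of P, h and I above can be illustrated with a minimal Python sketch. It assumes that the band presence and absence frequencies are used directly as the two allele frequencies at each locus, which may differ from how POPGENE estimates allele frequencies for dominant markers. The function names and the toy matrix are hypothetical and are not data from this study.

```python
import math

def locus_diversity(bands):
    """Return (h, I) for one locus scored as a sequence of 0/1 band states."""
    n = len(bands)
    present = sum(bands) / n
    freqs = (present, 1.0 - present)  # assumed allele frequencies: state 1 and state 0
    h = 1.0 - sum(x * x for x in freqs)                       # h = 1 - sum(x_i^2)
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)   # I = -sum(p_i ln p_i)
    return h, shannon

def summarize(matrix):
    """Summarize a binary matrix (rows = individuals, columns = loci)."""
    loci = list(zip(*matrix))
    per_locus = [locus_diversity(col) for col in loci]
    # A locus is counted as polymorphic when both states occur in the sample.
    polymorphic = sum(1 for col in loci if 0 < sum(col) < len(col))
    return {
        "P": 100.0 * polymorphic / len(loci),
        "h": sum(h for h, _ in per_locus) / len(loci),
        "I": sum(i for _, i in per_locus) / len(loci),
    }

# Hypothetical toy data: 4 individuals scored at 3 loci.
toy = [[1, 0, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0]]
result = summarize(toy)
print(result)
```

Here h and I are averaged over all loci, including monomorphic ones, so a data set can combine a moderate percentage of polymorphic loci with low mean diversity when most polymorphic bands are rare or nearly fixed.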

The hierarchical distribution of ISSR variation was analyzed by AMOVA as implemented in the software GENALEX v. 6 (Peakall and Smouse 2006). To estimate population genetic differentiation from binary data, this software calculates the parameter Φpt, an analogue of the commonly used Fst. Φpt is calculated from a matrix of Euclidean distances, unlike Fst, which is an allele-frequency-based measure (Peakall and Smouse 2006). The statistical significance of Φpt was evaluated using null distributions generated by random permutation of individuals (1,000 permutations).
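As an illustration of how a distance-based Φpt and its permutation test work, the sketch below implements a generic single-level AMOVA on squared Euclidean distances between binary profiles. It is a simplified stand-in under stated assumptions, not GENALEX's own code; the function names, the toy profiles and the group labels are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two 0/1 profiles (number of mismatched bands)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def phi_pt(profiles, labels):
    """Single-level AMOVA estimate of PhiPT from binary profiles and group labels."""
    n = len(profiles)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    k = len(groups)

    def ss(indices):
        # Sum of squared pairwise distances divided by the number of individuals.
        total = sum(sq_dist(profiles[i], profiles[j])
                    for pos, i in enumerate(indices) for j in indices[pos + 1:])
        return total / len(indices)

    ss_total = ss(list(range(n)))
    ss_within = sum(ss(members) for members in groups.values())
    ms_among = (ss_total - ss_within) / (k - 1)
    ms_within = ss_within / (n - k)
    # Average group size corrected for unequal sampling.
    n0 = (n - sum(len(m) ** 2 for m in groups.values()) / n) / (k - 1)
    var_among = (ms_among - ms_within) / n0
    denom = var_among + ms_within
    return var_among / denom if denom else 0.0

def permutation_p(profiles, labels, n_perm=1000, seed=0):
    """Observed PhiPT and its p-value from random reassignment of individuals to groups."""
    rng = random.Random(seed)
    observed = phi_pt(profiles, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_pt(profiles, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: two groups of three individuals scored at two loci.
profiles = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
labels = ["A", "A", "A", "B", "B", "B"]
observed, p_value = permutation_p(profiles, labels)
print(observed, p_value)
```

Shuffling the group labels keeps the distance matrix fixed and only breaks the link between individuals and groups, which is what makes the permuted values a null distribution for Φpt.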

Population structure

Methods such as AMOVA have the disadvantage that populations must be defined a priori, so the significance of the values may change depending on the a priori grouping. In contrast, Bayesian methods infer populations from the allele frequencies. The individuals in the sample are assigned probabilistically to populations, or jointly to two or more populations if their genotypes indicate that they are genetically mixed (admixture) (Pritchard et al. 2000; Pritchard and Wen 2003). Two Bayesian methods were used.

In the first method, population structure within the V. pubescens samples was inferred using the Bayesian model-based clustering algorithm implemented in the computer programme Structure (version 2.1) (Pritchard and Wen 2003). To run the software, a number K of genetic clusters, each characterized by a matrix of allele frequencies at each locus, is first assumed. Then, for each individual, the proportion of its genome derived from each genetic cluster (proportion of ancestry) is estimated. Ten independent runs of the algorithm, assuming values of K from 1 to 10, were performed with 600,000 Markov chain Monte Carlo (MCMC) repetitions and a burn-in period of 60,000, assuming population admixture. The posterior probability (probability of K given the data) was then calculated for each value of K from the mean estimated log-likelihood, in order to choose the optimal K. The proportion of ancestry in a given cluster was calculated as an assignment rate (q); in general, a level of 0.8 is accepted as a correct assignment to a single cluster.

The algorithm used by Structure may be poorly suited for inferring the number of genetic clusters in a data set that shows isolation by distance (Pritchard and Wen 2003). For this reason, the second method was the Bayesian clustering method described in Corander et al. (2003), implemented in the software BAPS (version 4.14). In contrast with Structure, it uses stochastic optimization to infer the genetic structure and it can use a spatial model that takes into account individual geo-referenced multilocus genotypes to assign a biologically relevant structure, thereby increasing the power to detect the underlying population structure correctly. Ten independent repetitions for each K from 1 to 10 were carried out.
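The two post-processing steps described for Structure, choosing K from the mean log-likelihoods and applying the 0.8 assignment threshold, can be sketched as follows. This is a minimal illustration that assumes a uniform prior over K and treats the mean log-likelihood of each K as an estimate of ln Pr(data | K). The function names and all numbers are hypothetical and are not output from this study.

```python
import math

def posterior_over_k(mean_log_likelihood):
    """Approximate Pr(K | data) from mean log-likelihoods, assuming a uniform prior on K."""
    top = max(mean_log_likelihood.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {k: math.exp(v - top) for k, v in mean_log_likelihood.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def assign(q_row, threshold=0.8):
    """Return the index of the cluster whose ancestry q meets the threshold, else None."""
    best = max(range(len(q_row)), key=lambda c: q_row[c])
    return best if q_row[best] >= threshold else None

# Hypothetical mean log-likelihoods for K = 1..4.
mean_lnp = {1: -5200.0, 2: -5050.0, 3: -5010.0, 4: -5030.0}
posterior = posterior_over_k(mean_lnp)
best_k = max(posterior, key=posterior.get)

# Hypothetical ancestry proportions for two individuals.
clear_case = assign([0.05, 0.92, 0.03])   # one cluster reaches the 0.8 threshold
admixed_case = assign([0.55, 0.45, 0.0])  # no cluster reaches it, treated as admixed
print(best_k, clear_case, admixed_case)
```

Because log-likelihood differences are exponentiated, even modest differences between values of K concentrate almost all of the posterior probability on a single K.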

Results

For V. pubescens cultivated in Chile, the seven ISSR primers studied in 333 samples collected along a geographic gradient yielded a total of 114 bands (Table 2). Of the 114 bands recorded, 63 proved to be polymorphic (P = 55.3%). At the species level, the genetic diversity was rather low (h = 0.01 ± 6.80188E-05, Shannon's index I = 0.16 ± 0.000148).

Similar results were observed at the geographic group level (Table 1) where the genetic diversity was low for every parameter measured, independent of the number of trees sampled per location.

Interestingly, despite the low genetic diversity found in cultivated V. pubescens in Chile, it showed a clear structure (Fig. 2). When all the samples were analysed together, most of the genetic diversity was found within groups (65%, Fig. 2a). Nevertheless, the genetic differentiation between groups was high and significant according to the AMOVA (Φpt = 0.35, P < 0.001, Fig. 2a). Differentiation was even higher when the Northern area was analysed alone (Φpt = 0.40, P < 0.001, Fig. 2c), whereas when only the Southern area was analysed, Φpt fell to 0.18 (P < 0.001, Fig. 2b), indicating greater genetic similarity among the Southern groups.
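The 65% within-group share and Φpt = 0.35 for the pooled analysis describe the same partition of variance. As a worked relation, assuming the usual single-level AMOVA definition (the symbols for the among-group and within-group variance components are introduced here only for illustration):

```latex
\Phi_{pt} = \frac{\sigma^2_{\mathrm{among}}}{\sigma^2_{\mathrm{among}} + \sigma^2_{\mathrm{within}}},
\qquad
\frac{\sigma^2_{\mathrm{within}}}{\sigma^2_{\mathrm{among}} + \sigma^2_{\mathrm{within}}}
  = 1 - \Phi_{pt} = 1 - 0.35 = 0.65
```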

Fig. 2 Summary of genetic differentiation (Φpt, an analogue of Fst) among and within groups as determined by AMOVA, considering all samples together (a) and the samples from the South (b) and the North (c) separately (1,000 permutations). P < 0.001

For all the populations, both Structure and BAPS identified K = 8 as the value giving the best average assignment rate (q > 0.8). This means that it is possible to distinguish 8 genetically different groups, five of them located in the North and three in the South (Fig. 1). In agreement with the AMOVA results, the individuals within the Southern area appear to be more similar to each other than those in the North. For example, the individuals from Buchipureo and Lleu-Lleu were genetically similar and therefore clustered together (Fig. 1). A similar situation was observed between Curanipe and Lipimavida, although these locations are 120 km apart (Fig. 1). Iloca proved to be different from the other two Southern groups. The individuals from the Northern area, on the other hand, were genetically dissimilar and clustered into five different groups.

Discussion

This is the first report of investigations into the organization of the genetic diversity in a cultivated species of the Vasconcellea genus. Previously, Kim et al. (2002) and Saxena et al. (2005) studied C. papaya cultivars and breeding lines in order to establish the genetic relationships among specific samples and to provide information for germplasm management, improvement and cultivar protection.

The genetic diversity of V. pubescens was remarkably low (h = 0.01, I = 0.16) compared with the average diversity reported for plants when using dominant markers (h = 0.19–0.23, Nybom 2004). At the same time, the percentage of polymorphic loci (55.3%) was similar to that found in other plant studies (P = 51–91%, Nybom 2004). This reflects an interesting situation in which high polymorphism does not necessarily generate high gene diversity.

Many factors, such as the breeding system, seed and pollen dispersal, plant longevity and agricultural practices, influence genetic diversity, including the proportion of variation distributed within and between populations (Hamrick and Godt 1989, 1996a, b). Open-pollinated or outcrossing species show a higher proportion of genetic diversity within populations and lower levels of genetic differentiation between populations (Hamrick and Godt 1996b).

The genetic differentiation between groups is in accordance with that observed in plants using dominant markers (Fst = 0.34–0.35, Nybom 2004), as well as with that of clonally propagated and autogamous plant species (Fst = 0.43, Morjan and Rieseberg 2004).

The higher genetic similarity among the Southern groups could be due to the active exchange of plant material among growers, since in these latitudes growers usually buy and share fruits with their neighbours. Iloca is a special case because it is the largest area cultivated in the south and from discussions with the growers it is clear that they frequently include seeds from the Northern area.

Compared with those of the Southern area, the orchards from the Northern area were genetically more distinct from each other, regardless of their geographic distance. Growers in this area have larger orchards and are much more competitive. From discussions with them, they appear to exchange plant material rarely or never.

The level of genetic diversity and the genetic structure observed for V. pubescens cultivated in Chile may also be explained by founder events and extended selection. This species is an introduced crop in Chile, and a reduction of genetic diversity by founder events, compared with the center of origin in the highlands of the Ecuadorian Andes, could be expected. In the future it will be interesting to carry out an extensive study including samples from the center of origin of the species in order to verify its genetic diversity patterns. A second founder event can explain the lower genetic diversity in the Southern populations compared with the Northern populations, given the introduction or migration of a few seeds from the Northern genetic stock and the small extent of the plantations, which could reduce the chances of crossing and hybridization.

Growers tend to select their own best trees to collect fruit and produce plants. The effects of selection on cultivated plants have been documented extensively (Tang and Knapp 2003; Hollingsworth et al. 2005; Hyten et al. 2006). Selection acts like a severe bottleneck, removing most of the genetic variation from the target loci and other associated loci, if not from the whole genome (Wright et al. 2005).

In addition, the agricultural practices applied by growers generate high levels of inbreeding. Like Carica papaya, V. pubescens is a trioecious (it is possible to find male, female and hermaphrodite plants), self-compatible and open-pollinated species (Kiger 1986a; Yagnam 1993). Currently, the plantation systems used by farmers establish three plants per planting hole in the field and later eliminate the male plants. Such practices stimulate crossing among related individuals and increase the degree of selfing. Thus, the current genetic status of the papaya resources present in Chile can be viewed as the result of many years of cultivation that have reduced the genetic diversity but generated a high level of genetic structure.

From a breeding perspective, it would be advisable to exploit the genetic resources efficiently by incorporating all the available genetic diversity into one gene pool, and thus to start a national breeding programme by crossing material from the different cultivated areas.