1 Introduction

Honeybee diversity has been extensively surveyed across its wide distributional range in Africa, Europe, and the Middle East using a multitude of markers ranging from the early morphological studies (Ruttner 1988) to the more recent analysis of single nucleotide polymorphisms (Whitfield et al. 2006). These surveys have provided a largely concordant grouping of diversity into four major evolutionary lineages, whose designations are roughly coincident with their geographical ranges, namely: the African (A), the Middle Eastern (O), the eastern European (C), and the western European (M). Current distribution of these lineages is largely parapatric with contact zones identified in Italy (Franck et al. 2000), Turkey (Kandemir et al. 2006), Libya (Shaibi et al. 2009a), and in the Iberian Peninsula (Smith et al. 1991; Franck et al. 1998; Garnery et al. 1998; Arias et al. 2006; Miguel et al. 2007; Cánovas et al. 2008). Special attention has been paid to the Iberian contact zone, which has been examined with biparental and, more extensively, with maternal markers (Smith et al. 1991; Franck et al. 1998; Garnery et al. 1998; Arias et al. 2006; Miguel et al. 2007, 2011; Cánovas et al. 2008, 2011).

The Iberian contact zone, formed by lineage A in the south and M in the north, was first reported by RFLP analysis of the entire mtDNA molecule in samples collected across the Mediterranean coast of Spain (Smith et al. 1991). Since then, multiple surveys of Spanish populations, particularly with the popular DraI test (Garnery et al. 1993), not only confirmed and refined the shape of the cline but also revealed that Iberian diversity patterns were in fact more complex than anticipated earlier (Franck et al. 1998; Garnery et al. 1998; De la Rúa et al. 2002; Miguel et al. 2007; Cánovas et al. 2008). Indeed, there are more haplotypes of lineage A and M ancestry in the Iberian Peninsula than in the remaining extensive territory where both lineages occur (Franck et al. 1998, 2001; Miguel et al. 2007; Cánovas et al. 2008; Pinto et al. 2012). While this observation has supported a consensual view of an Iberian glacial refuge for the M lineage (Miguel et al. 2007), the origin of African-derived patterns exhibited by Iberian honeybees is hotly debated (Franck et al. 1998; Cánovas et al. 2008).

Despite a shared African maternal ancestry, Iberian honeybees are highly differentiated from their African counterparts (Garnery et al. 1995; Franck et al. 1998, 2001). Northern African populations are dominated by haplotypes of sub-lineage AII ancestry, whereas Iberian populations harbor predominantly sub-lineage AI haplotypes, although haplotypes AII are also present but at low frequencies (Franck et al. 2001; Cánovas et al. 2008). However, contrasting with sub-Saharan African populations, similarly dominated by AI haplotypes, the most abundant AI haplotype (A2) in the Iberian Peninsula is rarely found in the African continent (Franck et al. 2001; Cánovas et al. 2008). To complicate matters further, northern Ibero-Atlantic populations descend from sub-lineage AIII, which is virtually absent in the African continent (Franck et al. 1998, 2001), but frequently observed in the Atlantic archipelagos (De la Rúa et al. 1998, 2001a, 2006). In summary, the Iberian Peninsula has seemingly been a stage for extensive interaction among honeybees of multiple origins.

While honeybee diversity patterns of the Mediterranean-influenced Iberian territory have been thoroughly documented, the Atlantic side has remained largely undersampled. To fill the gap, we performed a fine-scale maternal survey of Portuguese populations. The results presented here show a strong north–south partitioning of sub-lineage AIII and sub-lineage AI haplotypes suggesting that maternal composition of Ibero-Atlantic populations is distinct from that of the Ibero-Mediterranean populations. This remarkable differentiation and unique variation indicate that the Atlantic side of the Iberian Peninsula is an important reservoir of Apis mellifera iberiensis diversity.

2 Materials and methods

2.1 Samples and DNA extraction

Between 2008 and 2010, adult honeybee workers were sampled from 786 apiaries, located in every district of continental Portugal, and grouped into 18 populations (Table I and Figure 1). Samples were transported to the laboratory in absolute ethanol and stored at −20 °C until molecular analysis. Each analyzed individual represents a single colony and apiary.

Table I Distribution of haplotypes (number/frequency) and diversity measures (number of effective haplotypes; number of private haplotypes; and unbiased genetic diversity) per population sampled in mainland Portugal. Populations are ordered from north to south (see Figure 1 for population locations). Haplotypes were previously defined by Franck et al. (2001), De la Rúa et al. (2005), and Pinto et al. (2012). Haplotypes were grouped by African sub-lineages (AI, AII, AIII) and European lineages (M, western European; C, eastern European), as in Franck et al. (2001).
Figure 1.

Synthetic map of Portugal illustrating the geographic pattern of genetic diversity obtained from MDS analysis. The map was generated by interpolating the first dimension of the MDS analysis using FST values obtained with haplotypic frequency data (a) and lineage and sub-lineage frequency data (b). In (a), darker tones represent lower values along the dimension 1 axis of the MDS analysis, whereas in (b), they represent higher values. Dots indicate the centroid of each sampled population.

2.2 Molecular analysis

Total DNA was extracted from one leg using a 10 % Chelex® protocol (Walsh et al. 1991). Haplotypes were identified using the DraI test (Garnery et al. 1993), which consists of PCR amplification of the intergenic tRNAleu-cox2 region followed by digestion with the enzyme DraI (Promega®). Amplification was performed using the primers E2 and H2 and the PCR reaction and conditions detailed by Garnery et al. (1993), with minor modifications. PCR products were digested with DraI, following the manufacturer's recommendations, and then separated and sized on a 4 % wide range/standard 3:1 agarose gel (Sigma-Aldrich®). The band patterns were examined and the haplotypes were identified following the complete set of restriction maps and restriction fragment sizes reported to date (Franck et al. 2001; De la Rúa et al. 2005; Collet et al. 2006; Pinto et al. 2012). Each haplotype was assigned to a lineage or to an African sub-lineage, as determined by absence of the P element (lineage C), the form of the P element (P in lineage M; P0 in sub-lineages AI and AII; P1 in sub-lineage AIII), DraI cut sites, and the number of repeats of the Q element (Garnery et al. 1993; De la Rúa et al. 1998; Franck et al. 1998, 2001; Rortais et al. 2011; Pinto et al. 2012).
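The first step of the assignment rule described above, which relies only on the P element, can be sketched as follows. This is an illustrative sketch, not the procedure used in the study: the function name assign_lineage and the string codes are hypothetical, and the separation of AI from AII (which also depends on DraI cut sites and Q-element repeats) is deliberately left unresolved because the text does not detail it.

```python
from typing import Optional

# Hypothetical lookup: form of the P element -> lineage or sub-lineage group,
# following the rule stated in the text (P in M; P0 in AI and AII; P1 in AIII).
P_ELEMENT_TO_GROUP = {"P": "M", "P0": "AI/AII", "P1": "AIII"}


def assign_lineage(p_element: Optional[str]) -> str:
    """Return the lineage group implied by the P element alone.

    p_element is None when the P element is absent (lineage C),
    otherwise one of "P", "P0", or "P1".
    """
    if p_element is None:
        return "C"  # absence of the P element indicates lineage C
    if p_element not in P_ELEMENT_TO_GROUP:
        raise ValueError(f"unknown P element form: {p_element!r}")
    # "AI/AII" is intentionally ambiguous: resolving it needs DraI cut sites
    # and Q-element repeat counts, which this sketch does not model.
    return P_ELEMENT_TO_GROUP[p_element]
```

For example, assign_lineage("P1") returns "AIII", whereas assign_lineage("P0") returns the unresolved group "AI/AII".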

2.3 Data analysis

Diversity measures were obtained with GenAlEx 6.4 (Peakall and Smouse 2006) and FST values with Arlequin 3.5.1.2 (Excoffier et al. 2005), with significance tested by 10,100 permutations. The FST matrix was summarized in two major dimensions using multidimensional scaling (MDS) analysis, as implemented by Statistica 7.0 (StatSoft, Inc. [Tulsa, OK] 1999). Using the centroid of the sampled areas as the geographic coordinates for each of the 18 locations, the values obtained in the first MDS dimension were spatially interpolated through ordinary kriging, based on an omnidirectional spherical variogram model, to generate a synthetic map. Kriging was preceded by variography to analyze and explore the spatial autocorrelation of the variables, model parameterization, and error analysis. The relationships among the African haplotypes were established using the reduced median–median joining (RM-MJ) approach (Bandelt et al. 1999) implemented in Network 4.6.1.0 (Fluxus Engineering, Clare, UK; http://www.fluxus-engineering.com). A binary matrix was built using 16 characters including presence/absence of P1 and Q elements, large indels (>11 bp), and DraI cut sites. The network was constructed using the following settings: r = 2, ε = 10, the P1 element was upweighted, and the two most variable characters were downweighted. Where applicable, statistical significance levels were adjusted for multiple comparisons using the sequential Bonferroni procedure to correct for type I error (Rice 1989).
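The sequential Bonferroni procedure mentioned above can be illustrated with a short sketch. It assumes the usual step-down form of the test, in which the i-th smallest of k P values is compared with alpha / (k - i + 1) and testing stops at the first failure; the function name and the example P values are hypothetical and are not taken from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant.

    P values are visited from smallest to largest; the one at rank i
    (1-based) is compared with alpha / (k - i + 1). Once a P value fails,
    it and all larger P values are declared non-significant.
    """
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])  # indices by ascending P
    significant = [False] * k
    for rank, idx in enumerate(order):  # rank is 0-based, so divisor is k - rank
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # step-down rule: stop at the first non-significant test
    return significant


# Hypothetical P values for four comparisons (illustration only)
example = sequential_bonferroni([0.001, 0.04, 0.03, 0.20])
```

With 18 populations there are 18 × 17 / 2 = 153 pairwise comparisons, so under this scheme the smallest P value in a full pairwise table would be tested against 0.05 / 153.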

3 Results

A total of 32 haplotypes were observed among the 786 colonies sampled in this study (Table I). As expected, most (28) haplotypes were of African ancestry. Only three haplotypes (M4, M6, M7) belonged to the western European lineage and one (C1) to the eastern European lineage. The frequency distribution of the 32 haplotypes shows that 98 % (772) of the colonies were of African ancestry, with A1 being the most common and widespread haplotype (159 colonies, 20.2 %). Just three African haplotypes accounted for 397 colonies (50.5 % of the total): A1 (159 colonies, 20.2 %), A16 (122 colonies, 15.5 %), and A2 (116 colonies, 14.8 %). Most haplotypes (25) exhibited frequencies lower than 5 %. Among those were the M- and C-lineage haplotypes, with frequencies of 1.3 % (10 colonies) and 0.5 % (four colonies), respectively (Table I).

Haplotype distribution per population revealed a dramatic change in maternal composition from north to south (Table I). Haplotypes A16 and A11 (sub-lineage AIII) were the most frequent in the northernmost populations (e.g., Viana, Braga, Vila Real, Bragança). However, they were virtually absent in the southern populations (e.g., Évora, Setúbal, Beja, Faro), where A1 and A2 (sub-lineage AI) became dominant. Non-African haplotypes were present at very low frequencies in only eight populations, mostly located in the northern half of the country. Bragança was the location with the greatest number of non-African haplotypes, with one sample carrying an eastern European haplotype (C1) and four samples carrying three western European haplotypes (one M4, one M6, two M7).

The diversity measures (Table I) showed that the population of Castelo (C.) Branco, located in the central inland region of Portugal (Figure 1), exhibited the greatest number of haplotypes (17 total, 11 effective, and 4 private) and the highest genetic diversity (0.92). The least diverse population (0.55) was located in the south (Beja), with only two effective haplotypes and no private ones. The FST values (Online resource 1) revealed that the highest differentiation was observed between Vila Real and Beja (0.359, P < 0.000) and the lowest between the population pairs Bragança/Viseu, Portalegre/Setúbal, and Braga/Vila Real (0.007, P > 0.05). The geographical pattern of genetic differentiation, represented in the synthetic map (Figure 1a), shows a north–south gradient that peaks in the south, with the populations of Beja and Faro standing out as the most differentiated (lightest shade).
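The three diversity measures reported in Table I can be illustrated with a minimal sketch. It assumes the standard haploid definitions, namely the effective number of haplotypes 1 / Σp² and the unbiased diversity (n / (n − 1)) × (1 − Σp²), where p is the frequency of each haplotype and n the sample size; the text does not spell out the formulas, so this is an assumption. The function names and the sample data are hypothetical.

```python
from collections import Counter


def diversity_measures(haplotypes):
    """Compute haplotype counts and diversity for one population sample.

    haplotypes is a list with one haplotype label per colony.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two colonies are needed")
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())  # sum of squared frequencies
    return {
        "n_haplotypes": len(counts),
        "n_effective": 1.0 / sum_p2,
        "unbiased_diversity": (n / (n - 1)) * (1.0 - sum_p2),
    }


def private_haplotypes(populations):
    """Map each population name to the set of haplotypes found only there."""
    seen_in = {}
    for name, haps in populations.items():
        for hap in set(haps):
            seen_in.setdefault(hap, set()).add(name)
    return {
        name: {hap for hap in set(haps) if seen_in[hap] == {name}}
        for name, haps in populations.items()
    }


# Hypothetical samples (not data from this study)
north = ["A16"] * 5 + ["A11"] * 5
south = ["A1"] * 6 + ["A2"] * 3 + ["A16"]
stats_north = diversity_measures(north)
private = private_haplotypes({"north": north, "south": south})
```

In this toy example the northern sample holds two equally frequent haplotypes, so its effective number of haplotypes equals the observed number; uneven frequencies, as in the southern sample, push the effective number below the observed count.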

An RM-MJ network revealed a highly complex distribution of haplotypes forming two main groups (Figure 2). The more complex right-hand side group is composed of sub-lineage AIII haplotypes, with A16 and A11 being the most interior ones. The left-hand side joins haplotypes belonging to both sub-lineages AI and AII. Sub-lineage AII haplotypes form a group which is connected to A2. Except for A28 and A46′, all singletons are of sub-lineage AIII ancestry.

Figure 2.

RM-MJ network relating the PCR-RFLP characters of the tRNAleu-cox2 intergenic region observed in populations sampled in mainland Portugal. Haplotypes were previously defined by Franck et al. (2001), De la Rúa et al. (2005), and Pinto et al. (2012). Hypothetical (unsampled or extinct) haplotypes are symbolized as filled squares. The cross lines along branches represent mutational steps between nodes (dashed lines are indels, whereas full lines are losses/gains of a DraI cut site). The sizes of the circles are proportional to the haplotype frequencies. The different tones represent African sub-lineages, as defined by Franck et al. (2001).

The frequency distribution of the African haplotypes grouped by sub-lineage (Table I) revealed that AI was the most frequent with 49 % (377) of the colonies, followed by AIII with 34 % (263). Sub-lineage AII was detected in only 17 % (132) of the colonies. The geographical distribution of sub-lineages (Table I) shows a sharp north–south trend. The Atlantic sub-lineage AIII was predominant in the northern populations, contrasting with those from the south, where colonies were mostly of sub-lineage AI ancestry. Sub-lineage AII was present everywhere, but at residual levels in some northern populations. The FST values obtained with data grouped by lineage and sub-lineage were much higher (Online resource 1), but, as in the previous comparisons, Beja was the most differentiated population. The synthetic map again reveals a strong north–south structure (Figure 1b). In accordance with Figure 1a, the population of Leiria is genetically closer to the southern populations. The demarcation of Coimbra from the more inland populations of C. Branco and Guarda is roughly coincident with the highest Portuguese mountain range.

4 Discussion

This fine-scale maternal survey of Portuguese populations enabled a refinement of the Iberian cline reported by previous studies (Smith et al. 1991; Garnery et al. 1995, 1998; Franck et al. 1998; Arias et al. 2006; Miguel et al. 2007, 2011; Cánovas et al. 2008). As expected from the clinal pattern documented in those studies, most Portuguese populations were nearly pure African (98.2 %). Only 1.3 % of the colonies harbored a haplotype of lineage M ancestry. The spatial representation of this variation shows a well-defined north–south gradient (Figure 1a, b), suggesting that contemporary human-assisted dispersal (through, for example, queen trading and migratory beekeeping) has had a negligible impact on population sub-structure in Portugal. Similarly to Spanish populations (Franck et al. 1998; De la Rúa et al. 2005; Miguel et al. 2007), Portuguese populations harbor a low frequency of C-lineage haplotypes, reflecting beekeepers' preference for local honeybees.

Two hypotheses have been proposed for the origin of the Iberian contact zone: ancient multiple range expansions (Cánovas et al. 2008) and human-mediated introductions during Muslim invasions (Franck et al. 1998). Iberian honeybees exhibit unparalleled patterns of maternal diversity (further magnified by the unique composition of the Atlantic populations) and are profoundly differentiated from northern African populations. Contrasting with northern African populations, which harbor mostly A8 and A9 haplotypes (Garnery et al. 1995; Franck et al. 2001), Portuguese populations are dominated by A1, A16, and A2 haplotypes and Spanish populations by A2, A1, and A3 (Cánovas et al. 2008). Haplotypes A8 and A9 are present in Portugal and Spain, although at low frequencies, and A2, A3, and A16 are rare or have not been sampled in North Africa (Garnery et al. 1995; Franck et al. 2001). Accordingly, the high levels of diversity and complexity of Iberian honeybees revealed in this and other studies, together with the profound differentiation from northern African populations (Garnery et al. 1995; Franck et al. 1998, 2001), are more compatible with the hypothesis of ancient invasions from Africa.

Regardless of the timing or mode of expansion into Iberia, Apis mellifera intermissa has been assumed to be the source of African genes in A. m. iberiensis (Smith et al. 1991). However, given the complex Iberian patterns and the profound divergence between both subspecies, it is possible that multiple expansion waves of multiple origins occurred a long time ago. We speculate that a component of the variation that we see today in A. m. iberiensis has its roots in an ancient expansion of Apis mellifera sahariensis. This subspecies may have been formerly distributed in a much wider area (Ruttner 1988), a hypothesis that is supported by a recent morphometric study that found more similarities between Libyan honeybees and A. m. sahariensis than with the geographically closer Apis mellifera lamarckii and A. m. intermissa (Shaibi et al. 2009b), although sequencing data of the tRNAleu-cox2 intergenic region identified those populations as being mostly of O lineage ancestry (Shaibi et al. 2009a).

While morphometry recognizes A. m. sahariensis as a distinct unit (Ruttner 1988), the tRNAleu-cox2 intergenic region does not produce subspecies-specific haplotypes (Franck et al. 2001). Nonetheless, haplotypic frequency distributions vary between A. m. intermissa and A. m. sahariensis (Franck et al. 2001), providing some level of maternal differentiation. Haplotype A1, for example, is rare in northern Moroccan populations of A. m. intermissa and highly frequent in the more southern populations of A. m. sahariensis (Garnery et al. 1995; Franck et al. 2001). The A1 trend was confirmed by another study (De la Rúa et al. 2007), which also reported the presence of haplotype A15 in a coastal Moroccan population of A. m. sahariensis. This was an interesting finding, as A15 and A1 are among the most prevalent haplotypes in the Canaries and Madeira, further reinforcing the hypothesis of an ancient colonization from Africa into both archipelagos (De la Rúa et al. 1998, 2001a, 2006). Moreover, it suggests that sub-lineage AIII haplotypes might be a relic of a period that predates the contraction of A. m. sahariensis. Alternatively, sub-lineage AIII may have evolved on the Atlantic coast of the Iberian Peninsula and been introduced elsewhere by humans. While this is a plausible hypothesis, the high frequency of haplotype A1 in Portugal (Table I) is better explained by an African source population other than A. m. intermissa. Another compelling indication of a possible ancient invasion by A. m. sahariensis is provided by sequence data of an mtDNA coding region (Arias and Sheppard 1996). Interestingly, in this phylogenetic study, the sequences of A. m. iberiensis (collected in Portugal) were more similar to those of A. m. sahariensis than to those of A. m. intermissa.

The fact that A. m. sahariensis is now a relic subspecies (Ruttner 1988) complicates reconstruction of a putative invasion of the Iberian Peninsula. Whether sub-lineage AIII haplotypes were abundant on the Atlantic coast of Africa and preferentially occupied the Atlantic coast of the Iberian Peninsula during a range expansion is uncertain. Whether haplotype A2 was abundant in North Africa and was introduced into the Iberian Peninsula during a colonization wave that favored a Mediterranean route is also uncertain. Haplotype A2 is by far the most abundant and widespread haplotype in Spain (Franck et al. 1998; Miguel et al. 2007; Cánovas et al. 2008). It is also common in mainland Portugal and in the Mediterranean islands of Ibiza (De la Rúa et al. 2001b) and Sicily (Franck et al. 2001). In the remaining territory, it is rare and exhibits a disjunct distribution. It has been reported at low frequencies in Madeira (De la Rúa et al. 2006), southern Morocco (Franck et al. 2001), Algeria (Chahbar et al. 2012), and as far away as Syria (Alburaki et al. 2011). While the current distribution of A2 could lead to the assumption of an Iberian Mediterranean origin and ensuing human-assisted introduction elsewhere, its central position in the network and multiple connections with the other haplotypes (Figure 2) indicate that A2 is an old haplotype. Therefore, it is more likely that A2 arrived in the Iberian Peninsula a long time ago and that its current low frequency and disjunct distribution in Africa and the Middle East resulted from a contraction process, although some level of human-assisted introduction cannot be excluded.

This study shows that diversity patterns exhibited by Ibero-Atlantic honeybee populations are very distinct from those of their Ibero-Mediterranean and African counterparts. It is a remarkable fact that in this corner of the extensive area occupied by the African lineage, three well-represented sub-lineages came together, with two of them (AIII and AI) forming a stepped north–south gradient. While the encounter of genetically different populations would yield a pattern consistent with a stepped gradient, it is expected that the high dispersal abilities and gene flow typical of honeybees would rapidly erase the signal of such a historical event, especially at this small geographical scale. The alternative explanation for the observed cline is selection. Mitochondrial DNA has long been considered a neutral molecule. However, evidence is accumulating that it is subject to selection pressures (reviewed by Rand 2001; Ballard and Whitlock 2004). In honeybees, selection has been invoked to explain maintenance of the cline formed by western European and African lineages in the Iberian Peninsula (Franck et al. 1998; Cánovas et al. 2008) and the loss of European haplotypes in Africanized honeybee populations due to negative heterosis in hybrids (reviewed by Schneider et al. 2004). Although we do not have empirical support or indirect evidence from sequence variation analyses, we suggest that selection might have had a major role in shaping the well-defined Ibero-Atlantic maternal cline observed here, a hypothesis that certainly deserves further attention.

Above, we postulate that haplotypes of sub-lineage AIII ancestry arrived in the Iberian Peninsula through an ancient range expansion from Africa. However, given the high number of sub-lineage AIII haplotypes that have only been reported for Portugal, we cannot exclude the hypothesis that diversification might have occurred in Iberia following colonization. Considering the high mutation rate for the tRNAleu-cox2 intergenic region, especially for sequences with more than two Q elements (Franck et al. 1998), and assuming a long time since the African invasion, this is a plausible scenario. This study suggests that the Atlantic side of the Iberian Peninsula has been a place of contact of haplotypes of multiple origins and possibly a place of maternal diversification. Therefore, we may claim that the Atlantic side of the Iberian Peninsula has been not only a melting pot, but also a hotspot of African maternal diversity. Protecting this legacy of a past history shared with Africa is of unquestionable importance, especially in a context of rapid environmental change, with the Iberian Peninsula turning warmer and drier.