Geographic clines in genetic polymorphisms

We can recognize spatial variations in the phenotypic traits of various organisms in nature. Among the spatial variations in traits, continuous spatial gradients are best described in terms of geographic gradients, which are also called clines (Haldane 1948; Endler 1977; Futuyma 2009). Traditionally, a cline in measurable traits is used as a phenotypic marker, and it is a natural model system for quantifying the effects of selection and stochastic events and their relative importance (Endler 1973; Cook et al. 1986; Huey et al. 2000; Schemske and Bierzychudek 2007; Saccheri et al. 2008; Brakefield and de Jong 2011). Moreover, geographic clines in phenotypic traits provide key insights into the evolutionary forces that lead to allopatric speciation in nature because geographic variation from one end of a region to the other can bridge early and late stages of speciation (Endler 1977; McLean and Stuart-Fox 2014). Thus, the underlying mechanisms for establishing clines and the evolutionary consequences of clines remain key topics in evolutionary biology.

Geographic clines are observed in both quantitative and qualitative traits (Hoffmann and Weeks 2006; Futuyma 2009). In the former case, the average value of the focal trait in a population changes smoothly with the environmental gradient. Bergmann’s cline is the best-known geographical pattern, in which body size is larger in cold regions than in warm ones (Mayr 1963; Meiri et al. 2007). This pattern is interpreted as an adaptation to thermal conditions, i.e., large-bodied species, which have a lower surface-to-volume ratio, are favored in colder climates because of reduced heat loss. Body-size clines in ectotherms, which are called converse Bergmann clines, are also suggested to be an adaptation to gradual changes in abiotic environments such as the length of the season for activity and/or development (Blanckenhorn and Demont 2004). Although some body-size clines are the results of a plastic response to environmental gradients (Blanckenhorn and Demont 2004), the establishment of a smooth cline in polygenic traits is mainly explained by local adaptation along an environmental gradient. In Drosophila, clines in polygenic traits (e.g., the size and shape of the body) can be established in nature within a couple of decades via a continuum of rapid local adaptations (Huey et al. 2000). Notably, these studies suggest that each population is ideally occupied by individuals with the most adaptive trait.

Geographic clines are also observed in polymorphisms of alleles, genotypes, or qualitative phenotypes (Haldane 1948; Slatkin 1973; Endler 1977). Clines in morph/allele frequencies should not be confused with those in quantitative traits, because local adaptation by selection (especially negative selection) alone cannot explain the coexistence of multiple morphs or alleles within a population (Endler 1977). Instead, clines in morph/allele frequencies are an outcome of complex combinations of selection and historical/ongoing stochastic events.

Geographic clines in qualitative traits, i.e., morph frequencies, along environmental gradients have been reported for many species, including birds (Itoh 1991), insects (Komai et al. 1950; Cook et al. 1986; Hammers and Van Gossum 2008; Cooper 2010; Gosden et al. 2011; Cook and Saccheri 2012), and plants (Schemske and Bierzychudek 2007; Hodgins and Barrett 2008). Various underlying mechanisms for establishing geographic clines in polymorphisms have been discussed, at least in theory. However, few experimental studies have confirmed the underlying mechanisms of geographic clines in morph frequencies in nature, probably because of the lack of understanding of such mechanisms or the lack of comprehensive tests. The present situation may lead to misunderstanding of the underlying mechanisms for establishing geographic clines in polymorphisms and misestimation of the strength of selection and stochastic factors in natural systems.

Thus, I present a general review of the underlying mechanisms for establishing geographic clines in polymorphisms, and a case study using the female dimorphic damselfly Ischnura senegalensis to illustrate a strategy that confirms the underlying mechanism of a geographic cline in morph frequencies (Takahashi et al. 2010, 2011, 2014a). This review may help to address the geographic clines in other polymorphic systems, thereby contributing to a comprehensive understanding of the establishment of geographic clines in qualitative traits and thus their evolutionary and ecological consequences in nature (Hugall and Stuart-Fox 2012; Takahashi et al. 2014b; McLean and Stuart-Fox 2014).

Mechanisms that generate geographic clines in polymorphisms

Geographic clines can be established by the combination of two antagonistic evolutionary forces: selection that generates spatial differentiation in morph frequencies, and selection or stochastic factors that lead to the coexistence of genetic variation within a population and thus to the homogenization of morph frequencies among populations (Endler 1977). First, I provide the details of these evolutionary forces and then describe how the combination of these two antagonistic evolutionary forces establishes geographic clines in morph frequencies in nature.

Selection to generate spatial differentiation

Selection that induces spatial variation in morph frequencies includes two types of divergent selection, which are derived from gene-by-environment interactions and from secondary contact between different morphs. In the former case, the fitness advantage of each morph changes differentially across the environmental gradient and reverses across the intersection of the fitness functions, where each phenotype has equal fitness (Slatkin 1973; Endler 1977). This type of selection generates spatial differentiation in the morph frequencies on either side of the intersection.

In the latter case, spatial differentiation is generated by inter-morph interactions after secondary contact between two historically allopatric populations with different alleles/morphs (Whibley et al. 2006). When two populations come into secondary contact without any reproductive barriers, the boundary populations comprise two morphs; however, this continuous variation in morph frequency will be transient because one morph will outcompete the other unless the fitness of the morphs is exactly equal over time (Ford 1945). In contrast, if two populations come into secondary contact with reproductive barriers and the rare morphs tend to fail to reproduce in the area dominated by different morphs (i.e., positive frequency-dependent selection), inter-morph interactions constrain the admixture of the two populations (Mallet and Barton 1989). Thus, gene-by-social environment interactions induced by secondary contact can generate spatial variation in the morph frequency under overall divergent selection.

Some ecologists mistakenly believe that the morph frequency changes gradually when the relative fitness of two morphs changes gradually along the environmental continuum. However, this is not true because a single morph with the highest fitness should dominate in each population over an evolutionary time scale (Endler 1977; Takahashi et al. 2011). This means that divergent selection derived from gene-by-environment interactions, or from secondary contact with positive frequency-dependent selection, by itself leads to a stepwise pattern in the morph frequency across the balancing point of the fitness functions. Evolutionary forces that allow the coexistence of multiple morphs within a population are therefore required to maintain a stable cline in morph frequency.
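The stepwise outcome can be made explicit with a short derivation. This is a minimal sketch under assumptions that are not taken from the cited studies: two haploid morphs A and B, constant fitnesses w_A(x) and w_B(x) at spatial position x, and no migration, balancing selection, or drift. The symbols p_t (frequency of morph A in generation t), w_A, and w_B are introduced here for illustration only.

```latex
% Illustrative assumptions: haploid morphs, constant fitnesses w_A(x), w_B(x),
% no migration, no balancing selection, no drift.
\begin{align}
p_{t+1} &= \frac{p_t\, w_A(x)}{p_t\, w_A(x) + (1 - p_t)\, w_B(x)}, \\
\frac{p_{t+1}}{1 - p_{t+1}} &= \frac{w_A(x)}{w_B(x)} \cdot \frac{p_t}{1 - p_t}, \\
\frac{p_t}{1 - p_t} &= \left( \frac{w_A(x)}{w_B(x)} \right)^{t} \frac{p_0}{1 - p_0}.
\end{align}
```

For any starting frequency between 0 and 1, the odds of morph A grow without bound where w_A(x) > w_B(x) and shrink to zero where w_A(x) < w_B(x), however small the fitness difference. Under these assumptions a gradual change in relative fitness therefore produces a step in morph frequency at the point where w_A(x) = w_B(x), not a gradual cline.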

Factors leading to the coexistence of multiple morphs

The factors that homogenize differentiation among populations comprise gene flow among populations and balancing selection within a population. Gene flow among populations across the balancing point of the fitness functions or a hybrid boundary can lead to multiple morphs within a population (Slatkin 1973). Note that because gene flow occurs among adjacent populations, the contribution of gene flow to the maintenance of multiple morphs is typically restricted to the area around the intersection of the fitness functions in species with limited gene flow (Endler 1973). The second mechanism is based on balancing selection, including negative frequency-dependent selection and overdominant selection. Since balancing selection can lead to the coexistence of multiple morphs within a population even if each morph has a different reproductive success, multiple morphs will coexist in a certain geographic range (Endler 1977).

Mechanisms and structures of geographic clines

All smooth geographic clines, except transient ones, should be established by a combination of divergent selection and homogenizing forces. As shown in Fig. 1, the underlying mechanisms that establish clines in the morph frequency can be classified into three categories as follows: a combination of divergent selection derived from gene-by-environment interaction and gene flow among populations (i.e., selection–migration balance; Fig. 1a), a combination of divergent selection derived from secondary contact and gene flow among populations (selection–migration balance; Fig. 1b), and a combination of divergent selection derived from gene-by-environment interactions and balancing selection within each population (i.e., selection–selection balance; Fig. 1c) (Endler 1977; McLean and Stuart-Fox 2014). Irrespective of the mechanism, the structure (i.e., shape and width) of the cline is determined by the relative strength of each evolutionary force, but the mechanisms have different genetic and ecological features as discussed below (summarized in Table 1).

Fig. 1 Mechanisms establishing a smooth geographic cline in morph frequency. A cline is generated by (a) a combination of gene-by-environment interaction (G × E) along an environmental continuum and gene flow among populations, (b) a combination of secondary contact and gene flow among populations, or (c) a combination of gene-by-environment interaction along an environmental continuum and balancing selection within each population. Left: fitness functions of the black and white morphs predicted by gene-by-environment interaction or secondary contact. Right: spatial variation of the equilibrium morph frequency in the absence or presence of homogenizing forces derived from gene flow or balancing selection. Pie charts indicate the equilibrium frequency of the black and white morphs in each population. Divergent selection (gene-by-environment interaction or secondary contact) does not generate smooth clinal variation, but maintains a stepwise pattern in morph frequency. Gene flow and balancing selection contribute to maintaining smooth clinal variation along spatial position under divergent selection.

Table 1 Mechanisms and structures of geographic clines

When the combination of divergent selection derived from gene-by-environment interactions and gene flow among populations establishes a geographic cline in morph frequency, the relative fitness of each morph and the morph frequency change with the environmental gradient (Fig. 1a). In addition, adjacent populations should have a similar population genetic structure due to migration among populations, thereby exhibiting an isolation-by-distance pattern in neutral genes. Note that the clines established by gene flow are expected to be steep since gene flow occurs among adjacent populations (Endler 1973). The most famous case of this kind of cline is the peppered moth Biston betularia, in which the relative fitness of the melanic morph changes with the spatial gradient of background color and the steep cline is maintained by gene flow among habitats. Importantly, under this process, the morph frequency cline should extend beyond the balancing point of the fitness functions, because populations on each side of the point must maintain and supply morphs/alleles that are maladaptive on the opposite side. In the Australian grasshopper Phaulacridium vittatum, the frequency of the striped morph changes smoothly with latitude at the geographic scale, but this increase in frequency is restricted to the range from 10 to 45 % due to the geographic limitation of its distribution range (Dearn 1981). In this case, gene flow cannot explain the establishment of the cline due to the lack of source populations of striped morphs, which may be more adaptive than other morphs at the opposite side of the balancing point of the fitness functions (Table 1).

When the combination of divergent selection derived from secondary contact and gene flow establishes a geographic cline, the morph frequencies do not always correlate with environmental factors, except the social environment (Fig. 1b). In these cases, historically allopatric populations with different alleles admix around the hybrid zone; thus, an isolation-by-distance pattern should be found along the clinal variation. In addition, positive frequency-dependent selection should be detected within each polymorphic population. The clines induced by this process are expected to be steep due to the limitation of gene flow. One potential case is the flower color polymorphism in Antirrhinum spp., in which a steep cline was observed at the hybrid zone (Whibley et al. 2006).

Finally, when the combination of divergent selection derived from gene-by-environment interactions and balancing selection within each population establishes a geographic cline, the relative fitness of each morph and the morph frequency change with the environmental gradient (Fig. 1c). The population structure need not exhibit an isolation-by-distance pattern in neutral genes because this process does not require any gene flow among populations. Theoretically, the combination of divergent and balancing selection is a feasible explanation for the maintenance of large-scale geographic clines (Endler 1973).
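The contrast between a narrow cline under selection–migration balance (Fig. 1a) and a wide cline under selection–selection balance (Fig. 1c) can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the cited studies: two haploid morphs, linear fitness functions that cross at the middle of the gradient, stepping-stone migration between adjacent populations, and linear negative frequency dependence. The function names and the parameters `s`, `m`, and `f`, as well as all numerical values, are hypothetical. The secondary-contact scenario (Fig. 1b) is not modelled here.

```python
def simulate_cline(n_pops=21, s=0.2, m=0.0, f=0.0, generations=5000):
    """Return the frequency of morph A in each population along a gradient.

    Illustrative parameters (assumptions, not estimates from any study):
      s: slope of divergent selection along the gradient (G x E)
      m: fraction of a population received from each neighbour per generation
      f: strength of negative frequency-dependent selection
    """
    # Spatial positions from 0 to 1; the fitness functions cross at x = 0.5.
    xs = [i / (n_pops - 1) for i in range(n_pops)]
    p = [0.5] * n_pops
    for _ in range(generations):
        # Selection within each population (haploid recursion).
        after_sel = []
        for x, pi in zip(xs, p):
            w_a = 1.0 + s * (x - 0.5) - f * (pi - 0.5)
            w_b = 1.0 - s * (x - 0.5) + f * (pi - 0.5)
            after_sel.append(pi * w_a / (pi * w_a + (1.0 - pi) * w_b))
        # Stepping-stone migration; edge populations have a reflecting boundary.
        p = []
        for i, pi in enumerate(after_sel):
            left = after_sel[i - 1] if i > 0 else pi
            right = after_sel[i + 1] if i < n_pops - 1 else pi
            p.append((1.0 - 2.0 * m) * pi + m * left + m * right)
    return p


def polymorphic_count(freqs, lo=0.05, hi=0.95):
    """Count populations in which both morphs are reasonably common."""
    return sum(1 for q in freqs if lo < q < hi)


if __name__ == "__main__":
    scenarios = {
        "divergent selection only": simulate_cline(m=0.0, f=0.0),
        "selection-migration balance": simulate_cline(m=0.1, f=0.0),
        "selection-selection balance": simulate_cline(m=0.0, f=0.4),
    }
    for name, freqs in scenarios.items():
        print(name, polymorphic_count(freqs), "of", len(freqs), "populations polymorphic")
```

In this toy model, divergent selection alone leaves only the population at the exact crossing point polymorphic, gene flow widens the polymorphic zone to a limited band around that point, and negative frequency-dependent selection keeps every population polymorphic without any migration. The absolute widths depend entirely on the assumed parameter values.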

Testing the mechanisms underlying geographic clines

To elucidate the mechanisms that underlie the establishment of clines in each polymorphic system, we need to quantify the contributions of different types of selection (i.e., divergent, positive frequency-dependent, and balancing selection) and historical/ongoing stochastic events (i.e., secondary contact and ongoing gene flow) as well as the geographic structures of the cline (i.e., the width, shape, and correlation with environmental factors). Here I consider a series of studies that illustrate a geographic cline in nature using the female-polymorphic damselfly I. senegalensis. In damselflies, female-limited color polymorphisms are widespread, and it has been suggested that they are female evolutionary responses to sexual harassment by conspicuous males (Takahashi and Watanabe 2010). Typically, females have two or more morphs: one “andromorph” that exhibits a male-like color pattern and one or two “gynomorph(s)” that express color patterns different from those of males (Takahashi and Watanabe 2011). The development of female morphs is governed by a single autosomal di- or triallelic locus, with expression limited to females. Large-scale clinal variations in morph frequencies have been reported in several damselfly species (Sánchez-Guillén et al. 2005; Van Gossum et al. 2007; Hammers and Van Gossum 2008; Cooper 2010; Gosden et al. 2011), which typically exhibit clinal variations along temperature gradients.

Spatial structure of clines

In I. senegalensis, a smooth cline in female morph frequency was observed with latitude in Japan. The frequency of andromorphs in each local population ranged from 0 % (south) to 80 % (north), and it was 50 % at a latitude of 36°N. The cline was wider than 2000 km (23–37°N) (Takahashi et al. 2011, 2014a). The morph frequency correlated with a principal component characterized by temperature and solar radiation rather than that characterized by precipitation, thereby suggesting that the morph frequency within each local population was determined by temperature. Further analysis suggested that the morph frequency correlated more strongly with temperature in winter than with temperature in spring and summer, both of which are flight seasons, indicating that the environmental gradient contributed to the establishment of the morph frequency cline in this species, especially during the larval stages (Takahashi et al. 2014a).

Evidence for gene-by-environment interactions

The geographic pattern of the reproductive potential of each morph was estimated based on body size, which correlates strongly with the number of ovarioles, assuming there is no difference in survivorship between morphs. The reproductive potential of the two morphs produced different curves, which crossed at approximately 36°N (Takahashi et al. 2011). The reproductive potential of gynomorphs was higher in the south and lower in the north compared with andromorphs, suggesting differential reaction norms between morphs (gene-by-environment interactions). These gene-by-environment interactions should lead to divergent selection that generates spatial variation in the morph frequency on either side of the intersection of the fitness functions based on reproductive potential (approximately 36°N in this case). The latitude at which the morph frequency reached a 1:1 ratio in the cline (36°N) agreed with that estimated from the functions of reproductive potential (Takahashi et al. 2011), thereby suggesting that divergent selection determined by reproductive potential resulted in the spatial variation in morph frequency. However, under divergent selection, because the morph with higher fitness will replace the other morph within each population, the morph frequency should exhibit a threshold pattern in which one morph shifts to the other at the balancing point of the fitness functions in the absence of homogenizing forces that are antagonistic to local selection, such as balancing selection or gene flow among populations.

Detecting homogenizing forces

To detect balancing selection acting on the female morphs of I. senegalensis, the daily reproductive success of each morph was estimated in several natural populations with variable morph frequencies. The reproductive success of each morph was inversely proportional to its own frequency in the population, and the two morphs had approximately equal reproductive success when their frequencies were similar, thereby suggesting negative frequency-dependent selection (Takahashi et al. 2010). Positive frequency-dependent sexual harassment derived from search image formation by males could underlie the negative frequency dependence in female reproductive success (Takahashi and Watanabe 2008, 2009). Since relative reproductive success changes with female morph frequency even in a single population, female reproductive success is suggested to be determined by negative frequency-dependent selection rather than other processes, such as environmental heterogeneity or gene-by-environment interaction (Takahashi et al. 2010). Long-term field surveys showed that the morph frequencies within a population oscillated with a period of two generations, which agreed well with the prediction of a mathematical model assuming simple negative frequency-dependent selection (Takahashi et al. 2010). These results confirmed that negative frequency-dependent selection contributes strongly to the maintenance of genetic polymorphisms in this species in nature. Negative frequency-dependent selection may be a potential factor that makes a cline smooth along an environmental gradient because multiple morphs will coexist in a certain geographic range where the difference in fitness between morphs is not so high. Note that this result does not exclude the effect of gene flow on the establishment of geographic clines. In general, however, the structure of a morph frequency cline established by balancing selection may not be affected by gene flow among populations, since symmetric gene flow from populations on both sides does not affect the equilibrium frequency in a population (Endler 1973).

Evidence from molecular analyses

Molecular analyses based on genetic markers can be used to examine the presence of gene flow, historical events, and selection. The present population structure is affected by current gene flow and historical events; thus, population structure analyses based on neutral genetic markers can detect ongoing gene flow as well as past secondary contact. In addition, insights into the presence of divergent selection, balancing selection, and stochastic events can be obtained by comparing the degree of genetic differentiation (e.g., FST) at a focal locus with the degree of differentiation in neutral markers (Gillespie and Oxford 1998; Croucher et al. 2010). The strength of the correlation between the FST values for neutral loci and those for the focal loci can indicate the degree to which stochastic effects contribute to the population divergence of the loci (Runemark et al. 2010). In addition, comparing the population pairwise FST values for neutral loci with those for loci that are suspected to be subject to selection can detect the effects of selection and stochastic factors on population divergence in natural populations (McKay and Latta 2002). In particular, divergent selection, balancing selection, or no selection can be identified if the FST (focal)/FST (neutral) ratio is greater than, smaller than, or equal to one, respectively (Takahashi et al. 2014a). Note that, in analyses using multi-locus or genome-wide FST values, we potentially have to consider, in addition to selective factors, the non-selective mechanisms that maintain population differentiation after secondary contact between genetically different strains. For instance, genome regions with restricted recombination tend to maintain greater population differentiation (i.e., higher FST values) than the rest of the genome, and thus show a steeper cline across the hybrid zone even if they are selectively neutral (Nachman and Payseur 2012). This means that the degree of population differentiation and the structure of clinal variation are altered by differences in the rate of recombination related to genomic structure, such as chromosomal inversions. The effects of such non-selective mechanisms are potentially important for interpreting the mechanism responsible for the observed clines.
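The ratio criterion can be sketched in a few lines. The example below is illustrative only: it computes FST for a single biallelic locus from the allele frequencies of two equally weighted populations using expected heterozygosities, and then compares a focal value with a neutral value. The function names and the `tolerance` band around a ratio of one are assumptions introduced here; actual studies rely on multi-locus estimators and statistical tests, not on a fixed threshold.

```python
def fst_biallelic(p1, p2):
    """F_ST for one biallelic locus in two equally weighted populations.

    Uses F_ST = (H_T - H_S) / H_T with expected heterozygosities; this is a
    textbook simplification, not the estimator used in the cited studies.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                              # total
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0     # within
    if h_t == 0.0:
        return 0.0  # locus is monomorphic in the pooled sample
    return (h_t - h_s) / h_t


def classify_selection(fst_focal, fst_neutral, tolerance=0.2):
    """Classify a focal locus by its F_ST ratio to neutral markers.

    `tolerance` is a hypothetical band around 1 treated as 'no selection'.
    """
    if fst_neutral <= 0.0:
        raise ValueError("neutral F_ST must be positive to form a ratio")
    ratio = fst_focal / fst_neutral
    if ratio > 1.0 + tolerance:
        return "divergent"
    if ratio < 1.0 - tolerance:
        return "balancing"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical allele frequencies at a focal locus in two populations.
    focal = fst_biallelic(0.2, 0.8)
    print(round(focal, 2), classify_selection(focal, fst_neutral=0.05))
```

Applying such a comparison separately to population pairs at short and long geographic distances is one way to ask whether the inferred type of selection depends on spatial scale.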

For comparison in I. senegalensis, the genotype frequencies at the color locus in each population were estimated based on the phenotypic frequency data obtained from the wild populations (Takahashi et al. 2014a). In I. senegalensis, the two morphs are controlled by two alleles at a single autosomal locus with sex-limited expression (Takahashi 2011). The andromorphic (d) allele is recessive to the gynomorphic (D) allele as in the case of other female dimorphic damselflies (Johnson 1964, 1966). The allelic frequencies in each population were estimated based on the color morph frequencies and then were used to calculate the genotypic frequencies of DD, Dd, and dd.
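The estimation step can be illustrated as follows. This sketch assumes Hardy–Weinberg proportions at the color locus and that the morph frequency observed in females reflects the genotype frequencies at the autosomal locus; neither assumption is stated explicitly above, and the function name is hypothetical. Because the andromorphic allele d is recessive, andromorphs are dd homozygotes, so the frequency of d is the square root of the andromorph frequency.

```python
import math


def genotype_frequencies(andromorph_freq):
    """Estimate DD, Dd, and dd frequencies from the andromorph frequency.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch) and a
    recessive andromorph allele d, so that freq(dd) = andromorph_freq.
    """
    if not 0.0 <= andromorph_freq <= 1.0:
        raise ValueError("andromorph frequency must lie between 0 and 1")
    q = math.sqrt(andromorph_freq)  # frequency of the recessive d allele
    p = 1.0 - q                     # frequency of the dominant D allele
    return {"DD": p * p, "Dd": 2.0 * p * q, "dd": q * q}


if __name__ == "__main__":
    # An andromorph frequency of 25 % implies q = 0.5 under these assumptions.
    print(genotype_frequencies(0.25))
```

The resulting genotype frequencies can then serve as input for differentiation measures at the color locus that are compared with those at neutral markers.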

In this system, the isolation-by-distance pattern was unclear based on the microsatellite markers and mtDNA, thereby suggesting a weak contribution of gene flow and/or secondary contact between genetically different populations to the geographic structure of the species. The correlation between the degree of differentiation in the set of microsatellite loci and that based on the color locus was not significant, which indicates that the geographic divergence in the morph frequency cannot be explained by stochastic factors (Takahashi et al. 2014a). In contrast, two antagonistic selective factors that act on the color locus, i.e., balancing selection and divergent selection, were detected by comparing the population pairwise FST values for neutral loci with those for the color locus while taking the geographical distance between populations into account. That is, divergent selection was detected at a large geographic scale, and balancing selection was detected at a small geographic scale (Takahashi et al. 2014a). This suggests that two antagonistic selection factors act simultaneously on the color locus, with divergent selection and balancing selection acting predominantly at large and small geographic scales, respectively.

Conclusion

Geographic clines in morph/allele frequencies that are assumed to be established by migration–selection balance or selection–selection balance have been reported widely (Dearn 1981; Hoekstra et al. 2004; Hoffmann and Weeks 2006; Schemske and Bierzychudek 2007; Hodgins and Barrett 2008; Piel et al. 2010; Gosden et al. 2011; Brakefield and de Jong 2011). The geographic structures of these clines give an indication of the mechanism that underlies the cline, but the structure itself cannot confirm the mechanism because several mechanisms can generate clines in nature. Some studies have suggested divergent selection by detecting gene-by-environment interactions (Komai et al. 1950; Hoffmann and Weeks 2006; Cooper 2010), which can explain the mechanism that generates spatial variation in morph frequencies but cannot explain the mechanism that maintains the smoothness of the cline, because the establishment of a smooth cline requires one or more homogenizing forces. In practice, the interpretation of the establishment of several large-scale geographic clines in morph frequencies relies implicitly and solely on gene flow among populations (Dearn 1981), but the adequacy of this scenario has not been demonstrated. Evidence of secondary contact also cannot explain the establishment of stable clines in nature without mechanisms that restrict homogenization by gene flow (Whibley et al. 2006). In addition, evidence of selection and gene flow based on molecular analyses cannot identify the mechanism of selection without ecological surveys. Therefore, fragmentary evidence is not sufficient to demonstrate the underlying mechanisms of a cline.

In I. senegalensis, a large-scale geographic cline was detected, which implies that balancing selection rather than gene flow contributed to the establishment of the cline in the morph frequency. The strong correlation between morph frequency and temperature suggests the presence of gene-by-environment interactions as a cause of divergent selection rather than secondary contact with positive frequency-dependent selection. These results suggest that the large-scale morph frequency cline in I. senegalensis was established by a combination of divergent selection derived from gene-by-environment interactions and balancing selection, i.e., selection–selection balance. Fitness analyses and molecular experiments supported this scenario: divergent selection derived from gene-by-environment interactions and negative frequency-dependent selection were detected using multiple methods, and the contribution of gene flow to the geographic pattern in morph frequencies was not supported. These results provide empirical support for the establishment of a large-scale geographic cline in morph frequency by selection–selection balance in nature. Various mechanisms for establishing spatial variations in morph frequency should be addressed using the comprehensive approach presented here.