13.1 Introduction

Over 150 years ago, Henry Walter Bates published his first observations of butterfly diversity in the New World Tropics (Bates 1862). Exploring deep within the Amazon basin for over a decade, Bates documented extraordinary cases of mimicry in the vivid wing patterns of distantly related butterfly species. His writings provided Darwin with some of the most visually stunning examples of evolution by natural selection and the best evidence of a link between natural selection and speciation. Thanks to Bates and Fritz Müller, who arrived in Brazil a few years after Bates, butterflies, arguably more than any other group, contributed to the early establishment and acceptance of evolutionary theory (Carroll et al. 2009). Research on butterflies continues to be as relevant today as it was 150 years ago. An active research community now applies modern technologies to most areas of ecology and evolution, ranging from the molecular details of vision to the analysis of human impact on biodiversity (Boggs et al. 2003; Kotiaho et al. 2005; Briscoe et al. 2010). This includes a vibrant genomics research community working on a number of different species, including passion-vine butterflies (Heliconius spp., Heliconius Genome Consortium 2012), monarchs (Danaus plexippus, Zhan et al. 2011), swallowtails (Papilio spp., O’Neil et al. 2010), the Glanville fritillary (Melitaea cinxia, Hanski 2011), Bicyclus anynana (Brakefield et al. 2009), and Lycaeides (Gompert et al. 2012). The past several years have seen remarkable progress in the development of genomic resources in these species, culminating in the publication of the first two butterfly reference genomes (Zhan et al. 2011; Heliconius Genome Consortium 2012), with a number of additional genomes scheduled to be released in the coming year.

In this review, we examine emerging ecological and evolutionary genomic research addressing adaptation and speciation in Heliconius butterflies. Genomic research is ongoing in several butterfly species, but genomic studies on Heliconius are arguably further developed. Research on Heliconius provides a foundation to discuss larger issues relating to the nature of adaptive differences and the formation of new species. We begin with an overview of the Heliconius radiation – a radiation that has produced an extraordinary evolutionary continuum composed of divergent races and species at different stages of speciation (Mallet et al. 1998; Mallet 2008; Merrill et al. 2011a). Using this continuum as a backdrop, we review recent progress to (i) identify functional variation in the group and reconstruct the history of adaptive alleles, (ii) understand the importance of hybridization in speciation, and (iii) explore the genomic architecture that allows speciation to proceed in the face of gene flow. We conclude with a discussion of how to move beyond patterns of genomic variation to gain a deeper understanding of the processes that drive divergence and speciation in nature.

13.2 The Heliconius Radiation: A Primer

The butterfly subtribe Heliconiina (Lepidoptera: Nymphalidae: Heliconiinae) is restricted to the New World tropics and has a host of life history, ecological, and phenotypic traits that have long fascinated biologists and naturalists. Heliconiines get their common name, passion-vine butterfly, from their strong association with the passion flower family (Passifloraceae). Passion vines are protected by a diverse arsenal of cyanogenic compounds that are likely a by-product of an evolutionary arms race with heliconiines (Spencer 1988). Heliconiines have adopted this defensive tactic – evolving the ability to make and, in some cases, sequester cyanogenic glycosides (Engler et al. 2000; Engler-Chaouat and Gilbert 2007). These compounds render the bearer highly distasteful, and avian predators quickly learn to associate a wing color pattern with unpalatability (Chai 1986).

Within the subtribe, the genus Heliconius is characterized by an ecological shift to pollen feeding. Unlike most butterflies, which feed only on fluids (e.g. nectar, decomposing animals and fruits, and dung), Heliconius actively collect and process pollen (Gilbert 1972). The origin of pollen feeding is associated with subtle changes in morphology (Gilbert 1972; Krenn and Penz 1998; Eberhard et al. 2009), coupled with more profound changes in a host of life history and behavioral traits. In particular, the transition to pollen feeding is hypothesized to be important in the butterflies’ ability to synthesize toxic compounds and to enable a very long adult life – one of the longest recorded for a butterfly (Gilbert 1972). Pollen feeding is also associated with a rapid increase in brain size (Sivinski 1989), the evolution of slower and more maneuverable flight, the development of large eyes with accentuated ultraviolet color vision (Briscoe et al. 2010), and the evolution of a suite of complex behaviors, including trap-line feeding, gregarious roosting, and elaborate mating strategies (Brown 1981).

The Heliconius genus is best known for the extraordinary mimicry-related wing pattern diversity seen among its 43 species (Fig. 13.1). With over 400 distinct color pattern varieties, the group represents one of the most striking adaptive radiations in the animal kingdom. Repeated convergent and divergent evolution creates a colorful tapestry where distantly related species often look identical and closely related species or races can look strikingly different (Fig. 13.1). The complexity of this tapestry is exemplified by the parallel radiations of H. erato and H. melpomene, which, although phylogenetically distant and unable to hybridize, have converged on 25 different mimetic color patterns across the Neotropics (Fig. 13.2). Most of the diversity in these two species can be partitioned into two major phenotypic groups: (i) “postman” phenotypes, which have red on their forewing and either possess or lack a yellow hindwing bar; and (ii) “rayed” phenotypes, which have a yellow forewing band, a red patch on the proximal area of the forewing, and red rays on the hindwing. Variations on these themes generate the abundant pattern diversity we see in nature (Fig. 13.2b). In addition to the postman and rayed phenotypes, there are also a number of tiger-striped Heliconius. For example, H. numata shows numerous sympatric color pattern races that are likely a result of strong selection pressure to mimic distantly related Melinaea species (Nymphalidae: Ithomiinae) (Brown and Benson 1974; Joron et al. 1999), which can vary dramatically in abundance over small spatial and temporal scales. Geographic variation and convergent evolution are common across Heliconius, and the wing patterns of most species converge onto a handful of common color patterns, so-called mimicry rings, which coexist locally (Mallet and Gilbert 1995). This convergence between species led to the original hypothesis of mimicry (Bates 1862), and Heliconius is now a classic example of Müllerian mimicry, where distantly related, but similarly distasteful, species converge on the same warningly colored pattern.

Fig. 13.1

Nature’s palette – the Heliconius radiation. This representation of the major Heliconius clades incorporates over half of the 43 different Heliconius species and a large portion of wing pattern variation. Note the reuse of phenotypes across the phylogeny and the extreme phenotypic variation that exists within and between closely related species. The origins of several major ecological and behavioral innovations are indicated, including pollen feeding, pupal mating, and gregarious larvae. Pupal mating is the reproductive behavior where males search and guard pupae and mate with females as they eclose (Deinert et al. 1994). Pollen feeding and pupal mating likely evolved only once in the radiation. In contrast, gregarious aggregations of larvae evolved several times including the lineage containing the primitive Heliconius. Branch length does not necessarily reflect divergence time (Beltrán et al. 2007)

Fig. 13.2

Variations on a theme – parallel color pattern radiations in Heliconius erato and Heliconius melpomene. (a) Geographic distributions of the major phenotypes of H. erato and H. melpomene, showing the “rayed” phenotype throughout the Amazon basin and disjunct populations of the two variations of the “postman” phenotype. (b) Each row shows some of the color pattern divergence within the two concordant radiations, classified by the two major phenotypes (“postman” and “rayed”). Comparing color pattern phenotypes between the two species reveals the convergent color pattern evolution

Divergence in wing color pattern is also associated with speciation due to the dual role of color pattern in signaling to predators and in mate selection. For example, H. melpomene and H. cydno are very closely related, broadly sympatric, and occasionally hybridize in nature. In this case, speciation is associated with, and reinforced by, divergence in mimetic color patterns. Heliconius melpomene is generally black with red and yellow markings and mimics H. erato (Flanagan et al. 2004) (Fig. 13.2); whereas, H. cydno is black with white or yellow markings and typically mimics H. sapho and H. eleuchia (see Fig. 13.1). Mate recognition involves visual attraction of males to females, which leads to strong color-pattern based assortative mating and disruptive sexual selection against hybrids (Jiggins et al. 2001b; Naisbit et al. 2001). Furthermore, there is ecological post-mating isolation that results from increased predation on hybrids due to their non-mimetic wing patterns (Merrill et al. 2012). Species boundaries are often associated with mimetic color pattern shifts, highlighting the pervasive role that color pattern evolution plays in reproductive isolation across the radiation (Mallet et al. 1998).

13.3 The Genetics of Color Pattern in Heliconius

Given the visually dynamic nature and clear importance of coloration in the Heliconius radiation, much effort has focused on characterizing the genetic basis of wing pattern variation in the group. The spectrum of pattern diversity in Heliconius is largely controlled by variation at a handful of loci of large phenotypic effect (Sheppard et al. 1985; Joron et al. 2006, 2011; Reed et al. 2011; Martin et al. 2012; Papa et al. 2013). These loci are phylogenetically conserved “hotspots” of phenotypic evolution, responsible for color pattern evolution in both convergent and divergent Heliconius species. Allelic variation at these loci produces complex phenotypic changes that show spatial variation across wing surfaces (Fig. 13.3). The identification and characterization of these color pattern alleles has been a major step forward in understanding the evolution and development of lepidopteran wing patterns and provides a unique glimpse into the developmental genetic architecture of pattern evolution in nature.

Fig. 13.3

Hotspots of phenotypic evolution. Although more than two dozen loci have been described, color pattern variation across Heliconius is largely regulated by four major effect loci – two color “switch” loci and two melanin “shutter” loci. These loci interact to create the natural diversity of Heliconius color patterns. The “red switch” controls various red elements across the wing surfaces and was originally described as multiple loci (Sheppard et al. 1985). The “yellow switch” changes forewing and hindwing pattern elements between white and yellow. The “forewing melanin shutter” controls the distribution of black/melanic wing scale cells on the forewing to either expose or cover white or yellow pattern elements and generates variation in forewing band shape, size and position. This locus was also originally described as at least five distinct loci in different Heliconius species. Finally, the “global melanin shutter” is similar to the forewing melanin shutter, but it acts more broadly across the wing. It was similarly described as a number of distinct loci in different Heliconius species. Allelic variation at this locus can have multiple phenotypic effects across the wing including: (i) creating the white fringes on the forewing and hindwing, (ii) causing the presence or absence of the yellow hindwing bar, and (iii) changing the shape and color of the forewing band in some species, including H. himera and H. heurippa. In addition, allelic variation at this locus, in the form of a series of localized inversions, controls all color and pattern variation in H. numata (see Joron et al. 2011)

The nomenclature surrounding the genetics of color pattern variation in Heliconius that has developed over the past 50 years is complicated; however, most described color pattern variation is due to the combined effects of four major loci. All four loci contain functional variation that affects distinct color pattern elements. For simplicity, we will refer to them as the “red switch”, the “yellow switch”, the “forewing melanin shutter”, and the “global melanin shutter” (Fig. 13.3). The red switch controls the presence of several discrete red elements, including the forewing band, the proximal forewing “dennis” patch, and the hindwing rays (Fig. 13.3). The yellow switch causes a loss of yellow ommochrome pigments across both wing surfaces, resulting in a switch from yellow to white pattern elements (Linares 1997; Naisbit et al. 2003) (Fig. 13.3). This locus has largely been described from crosses of H. cydno, but probably underlies pattern differences in races of H. sapho and H. eleuchia as well. The two “shutters” (sensu Gilbert 2003) modulate the complex distribution of black/melanin across both wing surfaces. The forewing melanin shutter acts predominately in the central portion of the forewing, generating variation in the shape, position, and size of the forewing band (Fig. 13.3). The global melanin shutter affects the distribution of melanin across both wing surfaces to create variation in a number of red, yellow, and white pattern elements (Fig. 13.3). Due to the broad phenotypic effects of these color pattern loci, most were originally described as supergenes (Mallet 1989), or clusters of linked genes that facilitate co-segregation of adaptive variation (sensu Mather 1950).

The above synthesis is an oversimplification in several ways. First, there are other loci that have moderate size effects that contribute to pattern variation in important ways (Papa et al. 2013). This includes completely unexplored loci that alter the nanostructures of wing scale surfaces, causing light to scatter, which results in iridescent wing surfaces. Second, all of the major loci have pleiotropic effects that extend beyond the neat categorizations described above. Furthermore, most interact epistatically with each other to generate additional pattern diversity. For example, in H. melpomene the global melanin shutter interacts with the red switch and the forewing melanin shutter to finely control the color, size, and position of the forewing band. Finally, the magnitude of color pattern variation generated by allelic differences within these loci can be extreme. This is best exemplified in H. numata, where the entire spectrum of wing pattern variation is controlled by variation at a single locus – the global melanin shutter (Fig. 13.3) (Joron et al. 2006).

13.4 Genomic Divergence Across the Speciation Continuum

A major strength of Heliconius as a genomic model system is that the radiation has produced a continuum of divergent races and species, which provide an excellent opportunity to study taxa at different stages of speciation (Fig. 13.4). At one end of the continuum, many divergent color pattern races freely hybridize, showing only weak reproductive isolation from each other. These racial boundaries are maintained primarily by selection on wing color patterns, with free gene flow across the rest of the genome. This heterogeneity in gene flow across the genome is ideal for identifying functional variation, as the genomes of divergent, hybridizing races should only differ at regions responsible for phenotypic differences. For some pairs of Heliconius taxa, speciation has progressed further. For example, the hybrid zone between H. erato and H. himera in Ecuador is characterized by the evolution of strong pre-mating isolation through assortative mating, but there is no evidence for hybrid inviability or sterility (McMillan et al. 1997). Both prezygotic and postzygotic isolation have been shown to contribute to isolation in other Heliconius hybrid zones, including the Colombian hybrid zone between H. e. venus and H. e. chestertonii (Arias et al. 2008). Species pairs such as these permit analysis of the earliest stages of speciation, where speciation has begun, but reproductive isolation is not complete, with hybrids making up a small proportion of the population in narrow areas of overlap (McMillan et al. 1997; Arias et al. 2008). Further along the continuum, there are many closely related, sympatric species where hybridization occurs, but it is rare (Mallet et al. 2007; Mallet 2009). For example, H. melpomene and H. cydno are broadly sympatric in Central America and northern South America, coexisting as a result of several ecological differences, including their mimetic association, host plant association, and habitat preference (Naisbit 2001). 
There are also more distantly related species in sympatry, such as H. melpomene and H. elevatus, where the occurrence of very rare hybridization has facilitated adaptive introgression of color pattern alleles and the spread of mimetic patterns across the genus (Heliconius Genome Consortium 2012). Finally, there are non-hybridizing species, such as H. erato and H. melpomene, which are ecologically and behaviorally distinct, yet share identical mimetic wing patterns. These species provide a comparative framework for exploring the repeatability of evolution and for gaining a more general understanding of how genomic changes influence developmental pathways, phenotype, and ultimately fitness.

Fig. 13.4

Across the speciation continuum. The Heliconius radiation provides an exceptional opportunity to study taxa at different stages of speciation. The continuum, from freely hybridizing populations to non-hybridizing species, provides insight into a variety of key areas of research related to speciation. The genomes of freely hybridizing races should diverge only at genomic regions driving phenotypic divergence, allowing identification of important functional variation. Species pairs at the earliest stages of speciation give insights into how reproductive isolation is established. As reproductive isolation increases, but low levels of hybridization remain, patterns of divergence across the genome reflect the complex interplay between evolutionary forces and genome structure. Studying mimetic species pairs that do not hybridize gives insights into how genomes diverge with complete reproductive isolation and how distantly related species can converge on nearly identical phenotypes

13.4.1 Identifying Functional Variation

Decades of research in Heliconius have shown that wing color patterns are under strong natural selection (Benson 1972; Mallet and Barton 1989; Mallet et al. 1990; Kapan 2001). In one of the best studied Heliconius hybrid zones, the transition between divergent postman and rayed color pattern races of H. erato in northeastern Peru is sharp and occurs across a narrow 10 km transect (Fig. 13.5a). Strong natural selection on color pattern was demonstrated experimentally on either side of the hybrid zone by releasing individuals with the postman pattern within the rayed population, and vice versa, and estimating survivorship (Mallet and Barton 1989). On both sides of the hybrid zone, recapture rates were significantly lower for butterflies with the foreign color pattern, yielding an estimated overall selection coefficient of 0.52. This estimate was comparable to indirect measures of selection based on fitting linkage disequilibria and cline theory models to extensive hybrid zone data (Mallet et al. 1990).
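
As a rough illustration of how a release-recapture experiment translates into a selection coefficient, the following Python sketch computes s as one minus the ratio of the foreign to the native recapture rate. This is a deliberate simplification: the function name and the counts are hypothetical and are not data from Mallet and Barton (1989), whose analysis of survivorship was more detailed than a single ratio.

```python
def selection_coefficient(native_released, native_recaptured,
                          foreign_released, foreign_recaptured):
    """Estimate s = 1 - (foreign recapture rate / native recapture rate).

    Simplifying assumption: recapture rate is treated as a direct proxy
    for survival, and both groups are equally easy to recapture if alive.
    """
    if native_released <= 0 or foreign_released <= 0:
        raise ValueError("release counts must be positive")
    native_rate = native_recaptured / native_released
    foreign_rate = foreign_recaptured / foreign_released
    if native_rate == 0:
        raise ValueError("native recapture rate is zero; s is undefined")
    return 1.0 - foreign_rate / native_rate


# Hypothetical counts for illustration only (not data from the study).
s = selection_coefficient(native_released=100, native_recaptured=50,
                          foreign_released=100, foreign_recaptured=24)
print(round(s, 2))
```

Under these made-up counts the foreign pattern is recaptured about half as often as the native pattern, which is the kind of contrast that a selection coefficient near 0.5 describes.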

Fig. 13.5

Linking genotype to phenotype to fitness in Heliconius. (a) Allele frequency variation at the red switch locus across the Peruvian hybrid zone reflects strong natural selection on color pattern. (b) optix is the only gene in the red switch locus showing strong and significant differential expression between the red and yellow forewing bands of divergent red phenotypes. (c) In situ hybridizations show that spatial expression of optix exactly prefigures red color pattern elements in pupal wings 60 hours after pupation. (d) Sliding window genomic divergence between two major H. erato phenotypes (“postman” and “rayed”) from three hybrid zones across the red switch locus. There are two peaks of divergence, one near optix and one at a 65 kb region 3’ of optix, indicating potentially functional regions. The 65 kb peak also contains numerous SNPs perfectly associated with phenotype. The divergent peak stands in marked contrast to the lack of differentiation at regions unlinked to color pattern (\( \widehat{\theta}_s = -0.07 \)) (Figure modified from Mallet and Barton 1989; Reed et al. 2011; Supple et al. 2013)

Recent research has focused on identifying and understanding the architecture of color pattern loci in order to connect genotype to phenotype and fitness. Using a combination of traditional genetic and emerging genomic approaches, the genomic regions containing the four major color pattern loci have been localized to small intervals and, in two cases, very narrow genomic regions, with specific genes being strongly implicated (Joron et al. 2006; Counterman et al. 2010; Baxter et al. 2010; Ferguson et al. 2010; Reed et al. 2011; Joron et al. 2011; Martin et al. 2012; Heliconius Genome Consortium 2012; Supple et al. 2013). The most progress has been made in understanding red color pattern variation, with research on the red switch locus serving as an exemplar of how to identify functionally important variation. Allelic variation at the red switch controls multiple distinct red color pattern elements that vary between the two divergent color pattern phenotypes (Fig. 13.3). The genomic interval containing this switch was localized to a 400 kb region on chromosome 18 that contained more than a dozen predicted genes (Counterman et al. 2010). Microarray expression studies, using probes tiled across this region, identified optix, a homeobox transcription factor, as the only gene that was consistently differentially expressed between divergent color pattern races – showing high upregulation in regions of the pupal wing fated to become red in the adult wing (Fig. 13.5b). Furthermore, beginning at approximately 60 hours after pupation, optix expression perfectly prefigures red pattern elements (Fig. 13.5c) – even reflecting subtle pattern differences between co-mimics (Reed et al. 2011). The optix amino acid sequence is highly conserved within Heliconius, which suggests that the control of red pattern variation is due to allelic variation in cis-regulatory elements (Hines et al. 2011; Reed et al. 2011).

The prediction that cis-regulatory variation modulates red color pattern variation is supported by population genomic analyses of individuals collected across narrow hybrid zones between divergent color pattern phenotypes. These hybrid zones contain individuals representing many generations of recombination, and phenotypically distinct individuals collected within them differ only at genomic regions responsible for those differences (Counterman et al. 2010; Baxter et al. 2010; Nadeau et al. 2012). Analysis of three replicated hybrid zones between postman and rayed H. erato races shows a sharp peak of genomic divergence in a region approximately 100 kb 3′ of optix, in a “gene desert” containing no predicted genes and showing no transcriptional activity. The peak of divergence is approximately 65 kb wide and contains numerous SNPs perfectly associated with color pattern across the broader H. erato pattern radiation (Fig. 13.5d) (Supple et al. 2013).
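
The idea of a sliding-window divergence scan can be sketched in a few lines of Python. The example below is only a minimal illustration under stated assumptions: it uses Nei's F_ST = (H_T - H_S) / H_T combined across SNPs as a ratio of sums, which is a simpler estimator than the one used in the studies cited above, and the function name, SNP positions, and allele frequencies are all hypothetical.

```python
def window_fst(positions, freqs_a, freqs_b, window, step):
    """Return (window_start, F_ST) pairs for two populations.

    positions: SNP coordinates (bp); freqs_a / freqs_b: frequency of the
    same allele in population A and B at each SNP. F_ST per window is
    sum(H_T - H_S) / sum(H_T) over the SNPs in [start, start + window).
    Windows with no polymorphic SNPs are skipped.
    """
    results = []
    if not positions:
        return results
    start, last = min(positions), max(positions)
    while start <= last:
        num = den = 0.0
        for pos, pa, pb in zip(positions, freqs_a, freqs_b):
            if start <= pos < start + window:
                p_bar = (pa + pb) / 2
                h_t = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
                h_s = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2   # mean within-population
                num += h_t - h_s
                den += h_t
        if den > 0:
            results.append((start, num / den))
        start += step
    return results


# Hypothetical SNPs: two undifferentiated sites, then two fixed differences.
positions = [100, 200, 1100, 1200]
postman = [0.5, 0.5, 1.0, 1.0]
rayed = [0.5, 0.5, 0.0, 0.0]
divergence = window_fst(positions, postman, rayed, window=1000, step=1000)
print(divergence)
```

In this toy input the first window shows no differentiation and the second is fully differentiated, which mimics in miniature the contrast between the genomic background and a divergence peak.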

Relative to other regions of the genome, there is extensive linkage disequilibrium (LD), reduced heterozygosity, reduced nucleotide diversity, and high levels of population differentiation across the 65 kb peak (Supple et al. 2013), all hallmarks of a history of strong selection. A compelling hypothesis is that this region contains a series of modular enhancers that regulate the spatial expression of optix, similar to the architecture recently described for genes responsible for morphological variation in melanin and trichome patterns in Drosophila (Bickel et al. 2011; Frankel et al. 2011). This model is consistent with phenotypic recombinants occasionally collected in postman/rayed hybrid zones that disassociate the proximal red patch on the forewing from the red hindwing rays, which are red color pattern elements that typically occur together (Mallet 1989). In fact, in both co-mimics, H. erato and H. melpomene, there is a single race found in a narrow geographic region in the Guianas that has the red forewing patch, but lacks the hindwing rays. Additionally, one of the polymorphic forms of H. timareta in Ecuador also has a recombinant phenotype – showing hindwing rays, but no red forewing patch (Mallet 1999).
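
For readers less familiar with LD statistics, the sketch below computes the standard pairwise measure r² between two biallelic sites from phased haplotypes. It is a minimal illustration only: the function name and the haplotypes are hypothetical, and real analyses typically work from genotype data with dedicated population genetics software.

```python
def r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    with alleles coded 0 or 1. Uses D = p_AB - p_A * p_B and
    r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical haplotypes: alleles always travel together (complete LD) ...
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# ... versus all four combinations equally common (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(linked), r_squared(unlinked))
```

Values near 1 across many SNP pairs in a region, against a background near 0, are the kind of signal described above for the 65 kb peak.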

The genome scans described above are exceptionally powerful for localizing functional regions. However, the ability to more finely characterize these regions and to identify functional elements or functional changes using population genomic approaches will ultimately depend on what is causing the strong LD seen across the region. One possibility is that the region contains an inversion that suppresses recombination and locks loci into specific allelic combinations. Inversions have been shown to be important in maintaining a series of major effect color alleles in H. numata (Joron et al. 2011). However, we see no evidence for inversions in the red switch locus in H. erato and H. melpomene paired-end sequence data (Supple et al. 2013), nor were inversions evident in analysis of different color pattern races of H. melpomene (Nadeau et al. 2012). Moreover, fine scale analysis of haplotype structure across this region suggests that recombination occurs (Supple et al. 2013). Given the lack of evidence of an inversion and the presence of recombinant individuals, it is most likely that the observed LD is a result of strong natural selection. Strong selection can establish extensive LD among loci, even among unlinked color loci (Mallet et al. 1990). In this case, the presence of recombination raises the possibility that more extensive population and taxon sampling will further refine the boundaries of the functional elements. However, in order to fully understand how variation in this region modulates pattern evolution, population genomic approaches must be coupled with other strategies, including exploring protein-DNA interactions with DNA footprinting (Cai and Huang 2012) and confirming functional mutations with transgenic manipulations (Merlin et al. 2013; Cong et al. 2013).

There has been similar progress in identifying a candidate gene for the forewing melanin shutter. Linkage mapping, gene expression analysis, and pharmacological treatments all indicate that the WntA ligand modulates variation of the forewing band (Martin et al. 2012). The spatial pattern of WntA expression corresponds to the black forewing pattern in multiple species across the Heliconius radiation. As with optix, the WntA protein is highly conserved and variation in cis-regulatory regions is likely responsible for pattern diversity. WntA is a signaling ligand that creates a morphogen gradient across the developing wing tissue and is the type of molecule predicted to underlie pattern formation in theoretical models of wing color pattern development (Kondo and Miura 2010; Nijhout 1991). WntA is expressed earlier in pattern formation than optix and may act as a negative regulator of optix, with the interaction between the two genes being responsible for establishing black versus non-black wing pattern boundaries. This is only the second report of a morphogen involved in pattern generation (see Werner et al. 2010), but it is the first that directly links change in a patterning molecule to the evolution of a highly variable trait with clear adaptive significance (Martin et al. 2012).

The yellow switch and the global melanin shutter have similarly been positionally cloned (Joron et al. 2006; M. Kronforst, pers. comm.). Nonetheless, specific candidate genes and/or functional elements have yet to be identified. The global melanin shutter, in particular, has proven difficult to characterize. It was the first color pattern locus to be positionally cloned and it has the broadest range of phenotypic effects of any of the Heliconius color loci. Moreover, this region has recently been shown to underlie pattern change in several Lepidoptera species, including eyespot size in Bicyclus (Saenko et al. 2010) and the classic case of industrial melanism in the British peppered moth, Biston betularia (van’t Hof et al. 2011), underscoring the flexibility and broad evolutionary importance of the Heliconius patterning loci. In Heliconius, this locus has been hypothesized to be a “supergene” composed of a number of distinct co-adapted protein-coding loci. Support for this comes from recent work on H. numata that showed that allelic variants at this locus are actually a set of different chromosomal inversions across a region containing at least 18 genes (Joron et al. 2011). However, ongoing expression and genome scan studies in H. erato and H. melpomene indicate that, similar to the red switch and the forewing melanin shutter, only a single protein-coding region at this locus may underlie the global variation in melanin pattern across Heliconius wings (C. Jiggins, pers. comm.).

13.4.2 The Origins of Novel Phenotypes

Natural selection can explain why wing patterns of different Heliconius species should converge – strong selection against rare color patterns promotes mimicry (Müller 1879). Natural selection can also explain the maintenance of existing wing pattern diversity – strong frequency-dependent selection removes non-mimetic individuals, creating sharp transition zones between divergent phenotypes (Mallet et al. 1998). However, natural selection cannot easily explain the origin and spread of new phenotypes in Heliconius. This is a complex issue at the core of the mimicry paradox – the frequency-dependent selection that stabilizes existing patterns is the same force that eliminates novel forms, yet pattern divergence is rampant (Mallet and Gilbert 1995; Turner and Mallet 1996; Joron and Mallet 1998). To begin to explain this paradox, we first need to understand the evolutionary dynamics of the genomic regions that cause pattern change. An essential first question to ask is whether novel phenotypes arise once and spread within and between species, or whether there are multiple, independent origins of the same phenotype. It has only been with the identification of the regions that modulate phenotypic differences that we can begin to address this question. The answer seems to be a bit of both – phenotypic evolution within races and species with even low levels of hybridization occurs by sharing uniquely derived color pattern alleles; while convergent phenotypes evolve independently in more distantly related co-mimics.
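
The logic of the mimicry paradox can be made concrete with a toy model of positive frequency-dependent selection. The Python sketch below is an assumption-laden illustration, not a model taken from the studies cited: it treats two morphs in a haploid population, assumes each morph's fitness declines linearly with the frequency of the other morph, and uses a hypothetical selection strength.

```python
def next_frequency(p, s):
    """One generation of positive frequency-dependent selection.

    p: frequency of the novel morph; s: maximum selective cost (0 <= s < 1).
    Assumed linear fitnesses: a morph is penalized in proportion to how
    rare it is, so w_novel = 1 - s(1 - p) and w_common = 1 - s p.
    """
    w_novel = 1 - s * (1 - p)
    w_common = 1 - s * p
    mean_w = p * w_novel + (1 - p) * w_common
    return p * w_novel / mean_w


def simulate(p0, s, generations):
    """Iterate next_frequency from starting frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s)
    return p


# Hypothetical parameters: a rare novel morph versus one already in the majority.
rare_outcome = simulate(p0=0.05, s=0.5, generations=50)
majority_outcome = simulate(p0=0.60, s=0.5, generations=50)
print(rare_outcome, majority_outcome)
```

In this toy model a novel pattern that starts rare is driven toward loss, whereas one that starts above the unstable midpoint goes toward fixation. The same force both stabilizes common patterns and removes new ones, which is why the origin of novel patterns needs an additional explanation.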

Analyses of the genomic region responsible for color pattern diversity support a single origin for major red color pattern phenotypes within species. For both H. erato and H. melpomene, variation around the red switch locus sorts by color pattern phenotype (Hines et al. 2011; Supple et al. 2013). In both species, individuals possessing a rayed phenotype cluster together to the exclusion of individuals possessing the postman phenotype (Fig. 13.6). Rayed patterns are found in the Amazon basin and are co-mimetic with several other Heliconius species and day flying moths; whereas, the postman phenotypes are largely unique to H. erato and H. melpomene and are found in multiple disjunct regions around the periphery of the Amazon and in Central America (Fig. 13.2). The pattern of genetic variation around the red switch locus supports the hypothesis that one rayed phenotype evolved in each species and spread quickly, fragmenting the geographic distribution of the older postman phenotypes. This phylogenetic signal is restricted to a region containing the 65 kb divergence peak identified in the hybrid zone comparisons (Fig. 13.5d). As you move away from this region, the phylogenetic signal increasingly reflects a history of recent gene flow, with variation clustering by geographic proximity and biogeographic boundaries, regardless of color pattern (Fig. 13.6). This pattern of clustering by geography is the same pattern that is observed across regions unlinked to color pattern, which previously led to the incorrect conclusion that similar color pattern phenotypes evolved multiple times within each species (Brower 1994; Flanagan et al. 2004; Quek et al. 2010). This discordance demonstrates how inferences drawn from a specific subset of the genome can be misleading (Hines et al. 2011).

Fig. 13.6

The origins of an adaptive radiation – phylogenetic analysis of the red switch locus in H. erato. Phylogenetic analysis of optimal topological partitions highlights a region around the gene optix as clustering samples by phenotype, rather than geographic proximity. The tree topologies are shown, with phenotypes represented by branch color and geographic regions by terminal node color. Trees were generated from SNPs determined by aligning short sequencing reads to a reference genome. The grey scale bar is colored by the general history inferred. Around optix, samples are clustered by phenotype (black bar), while the farthest partition from optix clusters by geography (light grey bar), and the regions in between are intermediate, clustering by a mix of geography and phenotype (dark grey bar). Gene annotations, with the gene optix indicated, are shown below (Figure modified from Supple et al. 2013).

Mimetic convergence between distantly related species, in contrast, likely occurs by independent evolution. For example, population genetic and phylogenetic analyses of the H. erato and H. melpomene radiations, using variation within color pattern intervals, consistently cluster individuals by species designation, which is similar to the groupings obtained at loci unlinked to color pattern. Thus, although the same genomic region regulates mimetic color pattern variation, the changes responsible for mimetic convergence likely arose independently in the two species. The independent origin of similar color patterns in H. erato and H. melpomene is perhaps not unexpected, given that the two species diverged from each other over 15 million years ago (Pohl et al. 2009) and do not hybridize.

It is less clear how often more closely related species “borrow” color pattern alleles to acquire a mimetic wing color pattern. A cursory review of a phylogeny of the Heliconius radiation shows how frequently similar wing patterns are shared by species across the tree (Fig. 13.1). For example, within the melpomene/cydno/silvaniform (MCS) clade, the rayed and postman phenotypes occur within three of the four major lineages (Fig. 13.1). These species are all closely related and many are known to hybridize in the wild (Mallet et al. 2007) and in greenhouses (Gilbert 2003). These observations led to a proposed model whereby Heliconius mimicry evolved by repeated interspecific transfer of color patterning alleles (Gilbert 2003). Adaptive introgression, which is the spread of beneficial variation through interspecific hybridization, may have provided the genetic raw material for both accelerated adaptation and speciation due to the dual role of color patterns in mimicry and mating behavior.

Compelling evidence for hybridization and introgression, particularly around color pattern loci, comes from analyses of closely related Heliconius species that share similar mimetic patterns. Phylogenetic analysis of genetic variation from the region identified as crucial to red color pattern differences clusters populations and species by red pattern phenotype across species boundaries, rather than by phylogenetic relationship (Heliconius Genome Consortium 2012; Pardo-Diaz et al. 2012) (Fig. 13.7). This clustering by phenotype includes H. elevatus, the only rayed species in the more distantly related silvaniform clade, whose members usually have orange, black, and yellow tiger patterns and are known to hybridize, albeit rarely, with H. melpomene. Additional support comes from genome-wide tests that attempt to distinguish shared ancestral polymorphism from shared polymorphism resulting from recent gene flow (Green et al. 2010; Durand et al. 2011). This analysis shows significantly more shared polymorphisms between sympatric co-mimics than would be expected from random sorting of ancestral polymorphisms, with a particularly strong signal at known color pattern loci. Although this test has been shown to be biased by population structure and to be sensitive to genotyping errors (Durand et al. 2011), the pattern of shared polymorphisms combined with the traditional phylogenetic analyses paints a dramatic picture of the adaptive spread of color pattern alleles across species boundaries.
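The logic of such a test can be illustrated with a short sketch. We assume here that the test in question is the D statistic (often called the ABBA-BABA test) described by Green et al. (2010) and Durand et al. (2011), computed from derived-allele frequencies in four taxa related as (((P1, P2), P3), outgroup). The function name and the toy frequencies are hypothetical, and no significance test (for example a block jackknife) is included.

```python
from typing import Sequence


def patterson_d(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], p_out: Sequence[float]) -> float:
    """Return the D statistic from per-site derived-allele frequencies.

    Taxa are assumed to be related as (((P1, P2), P3), outgroup).
    D > 0 indicates an excess of derived alleles shared by P2 and P3,
    D < 0 an excess shared by P1 and P3, and D near 0 is what random
    sorting of ancestral polymorphism alone would be expected to give.
    """
    abba = 0.0  # derived allele shared by P2 and P3
    baba = 0.0  # derived allele shared by P1 and P3
    for f1, f2, f3, fo in zip(p1, p2, p3, p_out):
        abba += (1 - f1) * f2 * f3 * (1 - fo)
        baba += f1 * (1 - f2) * f3 * (1 - fo)
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


# Toy data (hypothetical): two ABBA sites and one BABA site.
d = patterson_d(p1=[0, 0, 1], p2=[1, 1, 0], p3=[1, 1, 1], p_out=[0, 0, 0])
```

In a comparison of sympatric co-mimics, P2 and P3 would be the sympatric pair and P1 an allopatric relative of P2. The caveats about population structure and genotyping error noted above apply equally to this sketch.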

Fig. 13.7

Evolution by adaptive introgression. Phylogenetic analysis across the red switch locus shows introgression between sympatric, mimetic species in the genomic region believed to control red color pattern variation. A single 50 kb region clusters all rayed samples together, including H. elevatus, a proposed hybrid species. Windows farthest from this region generate the expected species tree (Figure modified from Heliconius Genome Consortium 2012).

13.4.3 Wing Color Patterns as a “Magic Trait” Promoting Speciation

As we are beginning to understand the role that adaptive introgression plays in the spread of mimetic color pattern alleles, it is becoming equally clear that divergent color pattern alleles in Heliconius likewise play a profound role in speciation. Disruptive selection on an ecological trait, such as Heliconius wing patterns, imposes a barrier to gene flow (Schluter 2009; Nosil 2012). We see this clearly in the signature of differentiation across hybrid zones between color pattern races of H. erato (Fig. 13.5d). However, in the absence of strong assortative mating, even with this barrier to gene flow, intermediate phenotypes will be continually produced and recombination will prevent speciation. This antagonism is the principal reason why the idea of speciation with gene flow remains extremely controversial (Felsenstein 1981). One way around this difficulty is if the trait under disruptive selection also contributes to nonrandom mating. In this case, there is a clear path to speciation (Dieckmann and Doebeli 1999; Gavrilets 2005) and such traits have become known as “magic traits”. Rather than the term “magic trait”, which implies a trait encoded by a “magic gene”, a better term would be “multiple-effect trait”. A multiple-effect trait is simply a trait that has multiple functions – it is under disruptive selection and contributes to non-random mating. This definition is of more value because it does not presuppose any particular underlying genetic architecture (cf. Servedio et al. 2011).

The wing patterns of Heliconius provide one of the best experimental systems to study how “magic” or “multiple-effect” traits can generate biodiversity. Research on how these traits can promote speciation is most advanced in H. melpomene and H. cydno, where experimental manipulation demonstrates the importance of wing color patterns in both natural and sexual selection (Naisbit et al. 2001; Jiggins et al. 2001b; Merrill et al. 2011b, 2012). Recent field and cage experiments demonstrate that wing color patterns are under disruptive natural selection – F1 hybrids, whose wing color patterns show an intermediate forewing phenotype, were attacked more frequently than either parental species (Merrill et al. 2012). Mate choice experiments demonstrate both color pattern based assortative mating between the two species and disruptive sexual selection against hybrids (Jiggins et al. 2001b; Naisbit et al. 2001; Merrill et al. 2011b). In addition, assortative mating is much higher in populations where H. melpomene and H. cydno are sympatric, as compared to populations where H. melpomene does not encounter H. cydno. This is consistent with the expectation of the reinforcement hypothesis – selection against hybrids leads to the evolution of stronger pre-mating isolation (Jiggins et al. 2001b). It is important to point out that, in addition to color pattern based mating, there are other forms of isolation between the pair, including host plant preferences, microhabitat usage, and sterility barriers. The sterility barriers occur because the F1 offspring follow Haldane’s rule – the homogametic males are fertile and the heterogametic females are sterile. However, the strength of selection against hybrids that results from female sterility is only about as strong as mimicry selection and not nearly as strong as pre-mating isolation due to assortative mating (Naisbit et al. 2002).

In Heliconius, ongoing research is beginning to make the connection between multiple-effect traits and the loci that underlie these traits. A series of recent studies has demonstrated that the loci causing color pattern differences and the loci responsible for color pattern based mating preference are physically linked in Heliconius (Kronforst et al. 2006a; Merrill et al. 2011b). Physical linkage between a color pattern locus and a male mating preference was demonstrated first in H. pachinus and H. cydno galantus – mating was strongly assortative by white versus yellow color and variation in male mating preference mapped to the yellow switch locus (Kronforst et al. 2006a). A similar association was demonstrated between H. cydno and H. melpomene. In this species pair, male approach and courtship behavior was also highly assortative by coloration and strong male preference for red pattern mapped to the red switch locus (Merrill et al. 2011b). This is a very intriguing finding given that the gene that controls the distribution of red on a Heliconius wing, optix, also plays a role in compound eye development (Seimiya and Gehring 2000). This raises the possibility of a direct link between the perception and transmission of color pattern cues. Ongoing research, including experiments to create introgression lines that differ primarily around the regions responsible for red pattern variation, seeks to better understand the nature of the observed association and its role in promoting the radiation of Heliconius butterflies.

In addition to facilitating speciation by divergent natural selection, physical linkage between color pattern traits and the mating preference for those traits can promote the formation of new species through hybridization (Arnold 2006). Hybrid speciation results when hybridization produces novel genotypes that are reproductively isolated from the parental species. In this regard, hybrid genotypes that confer an ecological advantage and influence assortative mating (sensu Smith 1966) could quickly result in the origin of a novel hybrid population that is reproductively isolated from the parental species. This process has been termed hybrid trait speciation (Jiggins et al. 2008). Although hybrid speciation is thought to be rare in animals, hybrid trait speciation may provide a route for hybridization to play a role in animal diversification.

Heliconius provides some of the most compelling examples of hybrid trait speciation in the animal kingdom (Mavárez et al. 2006; Salazar et al. 2010; Heliconius Genome Consortium 2012). For example, evidence from a number of independent datasets suggests that Heliconius heurippa arose through hybrid speciation: (i) the observation of regions in Venezuela where hybrids between the proposed parental species commonly occur (Mavárez et al. 2006), (ii) laboratory crosses demonstrating a clear path to the H. heurippa phenotype (Mavárez et al. 2006), (iii) molecular genetic analysis showing that the H. heurippa genome is a mosaic of pieces from the parental species (Salazar et al. 2010), and (iv) mate choice experiments demonstrating incipient reproductive isolation from the parental species via assortative mating (Mavárez et al. 2006; Melo et al. 2009). Heliconius elevatus is another interesting example of a putative hybrid species, as speciation potentially involves both color pattern mimicry and mate choice (Heliconius Genome Consortium 2012). The hypothesis is that hybridization between H. pardalinus and H. melpomene resulted in adaptive introgression of color pattern alleles, followed by reproductive isolation due to assortative mating on wing color pattern. Genomic data strongly support adaptive introgression of the H. melpomene rayed color pattern allele into a H. pardalinus genome (Heliconius Genome Consortium 2012). The prediction of reproductive isolation secondary to adaptive introgression remains untested. It is predicted that the new rayed H. pardalinus population became reproductively isolated from other H. pardalinus and H. melpomene due to assortative mating based on color pattern and perhaps other signals, such as short-range chemical cues (Estrada and Jiggins 2008), resulting in a new species – H. elevatus.
Although the genomic, ecological, and behavioral evidence for these examples is impressive, further studies are needed as alternative speciation scenarios have been proposed for these species that do not invoke introgression and hybridization (see Brower 2013). A whole genome perspective should help shed light on this debate, but it requires a more fundamental understanding of how genomes change during speciation and what signature hybridization and introgression would leave on expected patterns of genomic divergence.

13.4.4 Genomic Heterogeneity at the Species Boundary

Although hybridization between closely related species is common in nature (Mallet 2005; Rieseberg 2009), the idea that divergence and speciation can occur in the face of ongoing gene flow remains contentious. Nonetheless, over the last decade the debate has largely shifted from questions about the geographic context of speciation towards gaining an understanding of the processes and mechanisms that can generate biodiversity in the face of gene flow (Nosil 2012). More recently, next-generation sequencing technologies have matured to permit a whole-genome perspective on divergence during speciation. These data promise to advance our understanding of the origins of reproductive isolation by moving research towards the processes that shape patterns of divergence across whole genomes (Feder et al. 2012).

With the publication of the first Heliconius genome (Heliconius Genome Consortium 2012), a number of studies have used whole genome resequencing to explore the genomic landscape of divergence and speciation along an evolutionary continuum of hybridizing taxa. These studies have focused on the melpomene/cydno/silvaniform clade and use the H. melpomene genome as a reference to layer resequencing data and to characterize individual variation across the genome (Kronforst et al. 2013; Martin et al. 2013). These studies join several recent studies in other organisms (Hohenlohe et al. 2010; Lawniczak et al. 2010; Ellegren et al. 2012; Gagnaire et al. 2013) to provide the first full genome perspectives on speciation.

Genomic analyses across the Heliconius speciation continuum highlight some characteristics that are emerging from these early speciation genomics studies. At the early stages in the speciation continuum, recently diverged populations freely hybridize but show strong divergence at regions of the genome known to be under strong selection. The result is differentiation that is restricted to few areas of the genome. For example, hybridizing races within both H. erato and H. melpomene showed clear regions of divergence around known mimicry loci, with little divergence evident elsewhere in the genome (Figs. 13.5d and 13.8). It is notable just how restricted differentiation is, even under conditions of very strong natural selection when patterns of divergence are expected to extend well beyond functional sites. Despite strong frequency dependent selection on color pattern, genomic divergence is limited to sharp and narrow peaks tightly linked to color pattern loci. These divergence peaks have long tails that extend about 1 Mb, but differentiation is only slightly above background levels. This observation is interesting given that differences in a number of important ecological traits, including host plant preference and larval survival, map to color pattern regions (Merrill et al. 2013).
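To make the idea of narrow divergence peaks concrete, the following is a minimal sketch of a windowed scan of differentiation along a scaffold. The choice of statistic is an assumption on our part: the sketch uses Hudson's estimator of FST computed as a ratio of sums within each window, which is one common option and not necessarily the measure used in the studies cited above. The function name, window size, and toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def windowed_fst(positions: Sequence[int], p1: Sequence[float],
                 p2: Sequence[float], n1: int, n2: int,
                 window: int = 10_000) -> Dict[int, float]:
    """Hudson's FST per non-overlapping window, keyed by window start.

    p1 and p2 are allele frequencies of the same allele in two populations;
    n1 and n2 are the numbers of sampled chromosomes (assumed constant
    across sites here, which real data rarely satisfy).
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for pos, a, b in zip(positions, p1, p2):
        w = pos // window
        # Per-site numerator and denominator, with the sample-size correction.
        num[w] += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        den[w] += a * (1 - b) + b * (1 - a)
    # Ratio of sums per window; windows without variable sites are skipped.
    return {w * window: num[w] / den[w] for w in sorted(num) if den[w] > 0}


# Toy data (hypothetical): two fixed differences, then one undifferentiated site.
fst = windowed_fst(positions=[100, 200, 15_000],
                   p1=[1.0, 0.0, 0.5], p2=[0.0, 1.0, 0.5],
                   n1=11, n2=11)
```

Slightly negative values, as in the second window of the toy data, are a normal property of this estimator when populations are undifferentiated. A peak tightly linked to a color pattern locus would appear as a few adjacent windows with values far above the genomic background.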

Fig. 13.8

Genomic architecture of speciation – empirical data. The empirical divergence data from the melpomene/cydno clade, with arrows pointing to known color pattern loci. Hybridizing races of H. melpomene show islands of divergence at the color pattern loci. Comparisons between closely related, sympatric species show heterogeneous patterns of divergence across the whole genome. Allopatric species pairs also show high heterogeneity, but with elevated divergence across the genome (Figure modified from Martin et al. 2013).

As speciation progresses, selection, genetic hitchhiking, and the accumulation of neutral mutations result in highly heterogeneous patterns of divergence across the genome. This pattern of heterogeneous divergence is evident across the continuum from incipient species with pre-mating isolation, to species with strong pre-mating and post-mating isolation. For example, H. cydno and H. pachinus are two closely related species that differ in color pattern and show strong color pattern based assortative mating (Kronforst et al. 2006a), yet hybridize occasionally in narrow regions of overlap in Costa Rica (Kronforst et al. 2006b). In addition to peaks of divergence at known color pattern loci, there are over a dozen regions of the genome that are more divergent than expected by chance and may harbor previously unidentified ecologically important variation. The heterogeneous patterns of divergence persist as the evolutionary distance increases to closely related species with stronger reproductive isolation, such as H. melpomene versus H. timareta and H. cydno (Martin et al. 2013) (Fig. 13.8).

Heterogeneity is emerging as a common feature of genomic divergence in a number of recent studies. For example, studies of differentiation in Anopheles mosquitoes, Ficedula flycatchers, and Coregonus whitefish also showed highly heterogeneous patterns of differentiation (Lawniczak et al. 2010; Ellegren et al. 2012; Gagnaire et al. 2013). In Ficedula and Coregonus, the patterns are thought to be the result of recent admixture following allopatric divergence. In contrast, the Heliconius and Anopheles patterns are thought to have emerged as a result of speciation without periods of allopatry. Both speciation with gene flow and allopatric divergence with secondary contact can generate genomic heterogeneity. However, finer analysis of the patterns should allow one to distinguish the two scenarios. A commonly used measure of the extent and timing of gene flow is the number and distribution of shared polymorphisms between species. The basic principle is that (i) longer periods of gene flow should result in a greater proportion of shared polymorphic alleles between older species, (ii) recent gene flow will reduce differentiation and increase the proportion of shared alleles among sympatric populations, as compared to allopatric populations, and (iii) recent gene flow should initially result in strong linkage disequilibrium between shared alleles at linked sites, which would break down over time if gene flow were ongoing for longer periods. Various methods for comparing the numbers and distribution of shared polymorphisms are being developed and have recently been used to study the role of gene flow during speciation in the handful of organisms with population genomic data available (Kulathinal et al. 2009; Ellegren et al. 2012), including humans and Neanderthals (Green et al. 2010).
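Principle (ii) can be sketched in a few lines: classify each biallelic site by whether it is polymorphic in both populations, in one only, or fixed for different alleles, and compare the shared fraction between a sympatric and an allopatric pair. This is only an illustration of the counting step under simplifying assumptions (known allele frequencies, no sampling error, no linkage); the function names and toy frequencies are hypothetical and do not come from the studies cited above.

```python
from collections import Counter
from typing import Sequence


def classify_site(pa: float, pb: float) -> str:
    """Classify one biallelic site from the allele frequency in two populations."""
    seg_a = 0.0 < pa < 1.0  # polymorphic in population A
    seg_b = 0.0 < pb < 1.0  # polymorphic in population B
    if seg_a and seg_b:
        return "shared"
    if seg_a:
        return "private_a"
    if seg_b:
        return "private_b"
    return "fixed_difference" if pa != pb else "monomorphic"


def shared_fraction(freqs_a: Sequence[float], freqs_b: Sequence[float]) -> float:
    """Fraction of variable sites that are polymorphic in both populations."""
    counts = Counter(classify_site(a, b) for a, b in zip(freqs_a, freqs_b))
    variable = sum(n for kind, n in counts.items() if kind != "monomorphic")
    return counts["shared"] / variable if variable else 0.0


# Toy data (hypothetical): one focal population compared with a sympatric
# and an allopatric population of a second species.
focal = [0.5, 0.2, 1.0, 0.0, 0.3]
sympatric = [0.4, 0.1, 0.0, 0.0, 0.0]
allopatric = [0.0, 1.0, 0.0, 0.0, 0.6]
frac_sympatric = shared_fraction(focal, sympatric)
frac_allopatric = shared_fraction(focal, allopatric)
```

A higher shared fraction in the sympatric comparison is the pattern expected under recent gene flow, although shared ancestral polymorphism can produce the same signal, which is why formal tests are needed.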

The melpomene/cydno/silvaniform clade provides an ideal opportunity to explore the genomics of speciation with gene flow versus allopatric divergence with secondary contact in a rich comparative framework. This is because the clade includes many allopatric and sympatric/parapatric races and species with varying degrees of known phylogenetic relatedness that can be used to compare patterns of shared polymorphisms and test different speciation models. For example, the observed increase of shared polymorphisms across the genome at increasing phylogenetic depths is suggestive of a long history of gene flow during speciation (Martin et al. 2013). Interestingly, a similar conclusion was reached using a completely different approach that involved modeling introgression rates in a community assessment of genomic differentiation among Costa Rican Heliconius species (Kronforst et al. 2013). However, extreme caution in interpreting these patterns is warranted. High heterogeneity in genomic divergence is indicative of the complex interplay between a diverse array of ecological and demographic factors, including selection, gene flow, and population history, as well as intrinsic genomic features such as variation in recombination rate.

With these new data, we are beginning to appreciate the complexity and the challenges of identifying genomic regions responsible for adaptive divergence and reproductive isolation and understanding how they affect genome-wide patterns of divergence throughout the speciation process. This is a serious challenge, yet systems with replicated examples of adaptation or speciation, such as Heliconius, sticklebacks, and whitefish, can be extremely powerful for inferring the functional importance of regions of divergence and understanding the history of gene flow between species.

13.5 From Patterns to Process

Moving forward requires a much better understanding of how genomes diverge. As genomic technologies advance, empirical descriptions of genomic divergence will be layered onto one another to describe how the genomes of species change through space and time. The accumulating genomic data are already revealing extremely heterogeneous patterns of divergence that result from complex interactions between selection and gene flow. The research is quickly transitioning to identifying systems that have the most promise to provide insights into the process of genomic divergence. In this respect, new model systems, such as Heliconius, which (i) have replicated examples of adaptation, (ii) are composed of taxa representing distinct stages of the speciation process, and (iii) have traits that are known to contribute to adaptation and speciation, will provide an important framework to determine the processes that drive ecological divergence and speciation from the patterns of genomic divergence.

Genomic data in Heliconius highlight the ability to identify the genomic regions that are known targets of selection and show how divergence around these regions changes when populations are increasingly isolated from each other. However, divergent races and incipient Heliconius species differ by more than wing color patterns. They show differences in mating preference (McMillan et al. 1997; Jiggins et al. 2001b; Chamberlain et al. 2009; Merrill et al. 2011a), hybrid sterility (Jiggins et al. 2001a; Naisbit et al. 2002), host plant choice (Brown 1981; Estrada and Jiggins 2002), and physiology (Davison et al. 1999) – all of which may play key roles in speciation. Leveraging the extraordinary radiation for broader insights into the origins of diversity requires that we better utilize genomic datasets. We need to identify regions under divergent selection and understand how they contribute to differences in survival or otherwise cause a reduction in gene flow between incipient forms. This challenge is not unique to Heliconius – the overarching goal of ecological and speciation genomic research is to identify and characterize regions of the genome under divergent selection and to understand what role they play in speciation.

To reach this goal, we need new theory that will “transform current predictions concerning genetic divergence into more dynamic recreations of how genomic differentiation unfolds through time during speciation” (Feder et al. 2012). Presently, the analysis of genomic landscapes is largely descriptive. Formal models that explain how selection and genetic hitchhiking can drive patterns of genomic divergence are beginning to emerge (Smadja et al. 2008; Feder and Nosil 2010; Feder et al. 2012), but presently there is no standardized procedure to rigorously delimit the shape, size, and distribution of divergent regions of the genomes, and more importantly, to model how they change through time (Feder et al. 2012). Even among Heliconius studies, different strategies were used to identify the location and size of divergent regions and to estimate the degree and timing of gene flow. Without common tools and practices, it becomes very difficult to compare patterns of genomic divergence and to identify general patterns emerging from genome-wide studies of speciation. However, just as new and better datasets will be generated, new and better theories will be developed. The field needs to (i) establish better understandings of the genomic architectures of the traits under divergent selection and influencing reproductive isolation, including the number of loci, their effect size on isolation, and their relative contribution in the speciation process; (ii) investigate how mutation and recombination rates vary locally across the genome and between populations, particularly for those regions of the genome that influence isolation; and (iii) develop increasingly complex models of speciation history and understand how heterogeneous patterns of divergence evolve as species diverge with and without gene flow.
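The lack of a standard procedure is easy to appreciate from even the simplest possible convention for delimiting divergent regions, sketched here: flag windows whose differentiation exceeds a fixed threshold and merge adjacent flagged windows. This is a deliberately naive illustration, not the procedure used in any of the studies discussed; the function name, the threshold, and the toy values are hypothetical, and both the number and the size of the regions it reports depend directly on the arbitrary window size and cutoff.

```python
from typing import List, Sequence, Tuple


def divergent_regions(window_starts: Sequence[int], values: Sequence[float],
                      window_size: int, threshold: float) -> List[Tuple[int, int]]:
    """Merge adjacent above-threshold windows into (start, end) regions.

    Windows are assumed to be sorted, non-overlapping, and of equal size.
    """
    regions: List[Tuple[int, int]] = []
    for start, value in zip(window_starts, values):
        if value < threshold:
            continue
        end = start + window_size
        if regions and regions[-1][1] == start:
            # Directly adjacent to the previous outlier window: extend it.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions


# Toy data (hypothetical): six 10 bp windows of a differentiation statistic.
regions = divergent_regions(window_starts=[0, 10, 20, 30, 40, 50],
                            values=[0.1, 0.8, 0.9, 0.2, 0.7, 0.1],
                            window_size=10, threshold=0.5)
```

Lowering the threshold to 0.2 in this toy example would fuse the two regions into one, which illustrates why comparisons across studies that use different cutoffs are difficult.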

When studying adaptation and speciation, we speculate on the specific historical events that generated the extant genomic patterns that we observe. As such, we have to be very careful to temper our enthusiasm (Nielsen 2009; Barrett and Hoekstra 2011). Molecular “spandrels” (sensu Gould and Lewontin 1979) abound in the genomes of all organisms and establishing direct links between genotype, phenotype, form, and fitness requires integrated datasets. Identifying highly divergent regions of the genome is a starting point for building an integrative understanding of the nature of variation between taxa. For some species, experiments can be designed that measure the success of individuals under specific ecological conditions. In these cases, researchers can actually follow genomic change forward in time. Experimental genomics is moving beyond the laboratory to directly testing hypotheses about how selection causes genome-wide change (Barrett et al. 2008) and provides a powerful approach that can be used in a number of emerging model systems (Barrett and Hoekstra 2011). For other groups, a combination of traditional genetic and functional genomic approaches, coupled with functional manipulation experiments, remains the best strategy to identify functionally important variation. In either case, the combination of technological and analytical advances ensures that genomic exploration will continue to transform our understanding of the origins of biodiversity.