Introduction

The Malawi Rift is approximately 900 km long, stretching from the Rungwe volcanic province in the north to the Urema graben in the south (Delvaux, 1995). The rift contains Lake Malawi, currently 570 km long and 60 km wide on average, with a maximum depth of ~700 m. The lake currently has three significant basins: a deep ‘Central’ basin near Nkhata Bay in the northwest of the lake at ~700 m, a slightly shallower ‘Northern’ basin in the northeast at ~550 m (Scholz et al., 2007), and a shallow southern basin to the south. Dating of volcanic material from the Rungwe volcanic province in the north of the rift basin indicates that rifting that formed the lake began 5.45–8.60 Ma (Ebinger et al., 1989, 1993). The earliest evidence of aquatic habitat is the presence of basalt pebbles on fluviatile sediment dated to 6.27 Ma, reported in Delvaux (1995). The earliest conclusively lacustrine fossils are from the Chiwondo Beds, dated to between 4 and 1.5 Ma (Bromage et al., 1995; Schrenk et al., 1995; Kullmer, 2008). These beds are most renowned for their hominid remains (Bromage et al., 1995), but they also provide an important record of aquatic ecosystems. Taxa represented in the fossil record include hippopotamus, Nile crocodile, fish (e.g. lungfish, bagrid catfish, claroteid catfish and clariid catfish; Stewart & Murray, 2013) and aquatic invertebrates (e.g. gastropods Bellamya and Melanoides; Schrenk et al., 1995). The fauna is thus largely characteristic of a shallow lacustrine habitat with swampy margins, and it has been directly compared with that of the present-day shallow water Lake Malombe (Schrenk et al., 1995). This early Lake Malawi has been termed palaeolake Mwenirondo (Van Damme & Pickford, 2003), and the earliest cichlids from the catchment have also been reported from these fossil beds (Murray, 2001; Stewart & Murray, 2013).

Lake Malawi now contains one of the richest freshwater faunas on earth. The most striking evolutionary lineage within the lake is that of the Astatotilapia-derived haplochromine cichlids that contains at least 392 described species (www.malawicichlids.com, accessed 1 January 2014), but possibly as many as 835 species (Konings, 2007; Genner & Turner, 2012). This haplochromine flock shows extreme diversity in body morphology, trophic characteristics (e.g. tooth and jaw form; Fryer, 1959), and male breeding colours (Konings, 2007). Most species exhibit habitat specialisations, for example for rock, sand, macrophyte beds, open water, or the extreme depths (Ribbink et al., 1983; Genner & Turner, 2005; Konings, 2007). Given the absence of a comprehensive fossil record for this radiating haplochromine flock, the timescale of adaptive evolution is the topic of some controversy (Danley et al., 2012). It has been proposed that the basal divergences took place in the region of 0.57–1.64 Ma (Sturmbauer et al., 2001; Koblmüller et al., 2008), based partly on presumed links between lake level changes and aspects of lineage divergence within the flock. Other evidence from fossil- and biogeographically calibrated phylogenies suggests that basal divergence took place in the region of 2.3–4.6 Ma (Genner et al., 2007a), approximating to the age of the earliest known aquatic fauna (4–1.5 Ma; Schrenk et al., 1995; Sandrock et al., 2007). Despite this lack of consensus regarding the timing of basal divergence of radiating Lake Malawi haplochromines, it is generally accepted that it took place one million years ago or earlier (Danley et al., 2012). By comparison, the history of individual species and populations within the flock is much less clear.
A better understanding of past changes to populations, when linked to data on palaeoclimatic reconstructions, could help us to determine whether species first arose in conditions more similar to the modern lake, or instead in very different circumstances.

The evidence of lake level changes over most of the history of Lake Malawi is steadily improving. Delvaux (1995) reported a major regression event starting some 1.6 Ma, and ending 0.57–1 Ma, followed by a lake level rise. Delvaux also reported a second regression event lasting from 420,000 years ago to between 250,000 and 120,000 years ago. Sediment cores have given strong evidence of changes within the last 160,000 years, including major low stands during “megadroughts” ~150,000 years ago, ~130,000 years ago, and from 110,000 to 90,000 years ago (Lyons et al., 2011). Subsequently, the lake level showed an overall rise, reaching levels just short of the present day by 50,000 years ago (Scholz et al., 2007; Lyons et al., 2011). The major low stands resolved through these sediment cores suggest that Lake Malawi has undergone major changes in water chemistry and the structure of benthic habitat. During the last major low stand, the lake would have been shallow (150 m deep), turbid, well-mixed, alkaline, and saline. It also seems likely that the littoral rock and sand habitat would have been significantly depleted when compared to present-day conditions (Cohen et al., 2007). Importantly, sediment cores show that haplochromine cichlid fossils have been present throughout the last ~130,000 years (Reinthal et al., 2011), but unfortunately do not provide detailed resolution of changes to the fish assemblage during that period.

Given evidence that habitat preferences and philopatry of species strongly influence dispersal ability in African lake cichlids (e.g. Pereyra et al., 2004), it seems plausible that the effects of these lake level changes on cichlid populations will depend on the habitat preferences of species, their ability to respond to ecological opportunity, and the size of the habitat and the extent of change in the habitat. In Lake Tanganyika rock cichlids, for example, it has been shown that lake level rises have coincided with periods of strong population growth as new shallow water habitats opened up over the last 135,000–60,000 years (Koblmüller et al., 2011; Nevado et al., 2013). By contrast, populations next to deep waters were relatively stable, with populations presumably moving with changing lake level. Such vertical movements are, however, less likely to have enabled the persistence of populations in shallower Lake Malawi. During the low stands, rock cichlids of the ‘mbuna’ grouping (Table 1) that are most abundant in shallow littoral habitats will have lost considerable amounts of suitable habitat (Cohen et al., 2007), perhaps persisting only on wave-washed outcrops. Shallow-water ‘utaka’ cichlids preferring sandy bottoms or sheltered weedy bays may potentially have thrived. Open water predators such as Diplotaxodon and Rhamphochromis that appear to show little intraspecific genetic structuring across Lake Malawi (Shaw et al., 2000; Genner et al., 2008, 2009) may have persisted in the residual lake, or become isolated in peripheral lakes that were periodically isolated and reconnected (Genner et al., 2007b). Such habitat changes, together with increasing turbidity, may have driven a substantial loss of diversity, even driving interspecific introgression to form a hybrid swarm (Seehausen, 2004).
By contrast, the subsequent rise in lake level may have provided considerable new ecological opportunity through new habitat and resources, and increased water clarity may have provided opportunity for visually mediated sexual selection, suggested to promote evolutionary diversification in cichlids (Seehausen et al., 2008).

Table 1 Population genetic diversity and estimated times of population expansions

There is some evidence that the most recent rise in lake level over the last 90,000 years led to the establishment and expansion of allopatric species of rock-restricted mbuna cichlids such as Metriaclima (Genner et al., 2010). However, it is unknown if such patterns occur in other species groups. In this study, we explored the historical population demography of 28 species of Lake Malawi cichlids. Specifically, we reconstructed changes in effective population sizes using a Bayesian coalescent approach applied to mtDNA control region sequence data, employing a strict molecular clock derived from published cichlid biogeographic calibrations. This enabled us to identify the timing of establishment and expansion of populations from species within multiple genera with a range of habitat preferences. We discuss these results in light of available information on lake level fluctuations.

Materials and methods

DNA sequence data

Mitochondrial control region data were largely obtained from GenBank. Published papers associated with the data are Genner et al. (2007b, c, 2010) and Anseeuw et al. (2008, 2011). Populations were only included if a minimum of 14 sequences were available. All populations included were sampled from the main body of Lake Malawi, except Rhamphochromis sp. ‘chilingali’ from the nearby satellite Lake Chilingali. Data from four species of Diplotaxodon were newly generated (D. aeneus, D. greenwoodi, D. sp. ‘holochromis’, D. sp. ‘limnothrissa black dorsal’; Table 1). Samples for these species were collected, preserved and their mtDNA sequenced following Genner et al. (2010). DNA was extracted using a Wizard DNA extraction kit (Promega, Madison WI, USA), and a ~900 base pair section of the mtDNA control region was amplified using the forward primer HapThr-2+4 (5′-CCT ACT CCC AAA GCT AGG ATC-3′) and the reverse primer Fish12s (5′-TGC GGA GAC TTG CAT GTG TAA G-3′) employed by Joyce et al. (2005). PCRs were performed in 25-µl reactions including 1-µl genomic DNA, 2.5-µl 10× PCR buffer, 2.5-µl dNTPs (1-mM each dNTP), 1-µl each primer (10-mM stock), 1-µl MgCl2 (25-mM stock), 0.5-units Taq and 14.9-µl double-distilled water. PCR conditions were as follows: 1 min at 95°C; then 34 cycles of 95°C for 30 s, 50°C for 30 s and 72°C for 1 min, followed by 72°C for 5 min. After clean-up, PCR product was directly sequenced by Macrogen (Seoul, Korea) using an ABI sequencer and the forward primer. New sequences generated have GenBank Accession numbers KJ696605 to KJ696683.

Bayesian skyline plots and timescales of population expansion

Only data from populations where 14 or more individuals were sequenced were included in the analyses. This resulted in 46 populations from 28 species, with an average of 19.5 sequences per population (range 14–39). Sequences were aligned using DAMBE (Xia & Xie, 2001) and alignments checked by eye. Alignments ranged in size from 384 to 482 bp (Table 1). Populations were analysed independently, and thus alignment lengths differed among populations and species depending on data availability (Table 1). Temporal changes in effective female population size were estimated using the coalescent Bayesian Skyline Plot approach in BEAST v1.4.8, for each population independently (Drummond et al., 2005). Each analysis was run for 25 million steps using the HKY + Γ substitution model identified as the most appropriate in test runs using Modeltest (Posada & Crandall, 1998). We also used a strict molecular clock and a coalescent Bayesian skyline tree prior with the default 10 grouped coalescent intervals. Operators were set to auto-optimise, and parameters were logged every 1,000 iterations. We used a published biogeographic calibration of 0.0324 changes per site per million years (SE 0.0139) for the cichlid control region, based on a series of 10 biogeographic calibrations described in Genner et al. (2010). All other parameters were left at their defaults.
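The role of the clock calibration can be illustrated with a minimal sketch. Under a strict clock, a node height expressed in substitutions per site is divided by the rate to give time. The function names below are hypothetical, and the ±1 SE range is only a rough illustration of sensitivity to the calibration; it is not a description of how BEAST treats rate uncertainty.

```python
# Calibration quoted in the text: changes per site per million years.
RATE_PER_SITE_PER_MYR = 0.0324
RATE_SE = 0.0139


def substitutions_to_years(subs_per_site, rate=RATE_PER_SITE_PER_MYR):
    """Convert a height in substitutions/site to years under a strict clock."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return subs_per_site / rate * 1e6


def years_range(subs_per_site, rate=RATE_PER_SITE_PER_MYR, se=RATE_SE):
    """Illustrative (younger, older) bounds using rate + SE and rate - SE."""
    if rate - se <= 0:
        raise ValueError("rate minus SE must be positive")
    younger = substitutions_to_years(subs_per_site, rate + se)
    older = substitutions_to_years(subs_per_site, rate - se)
    return younger, older


if __name__ == "__main__":
    height = 0.00324  # hypothetical node height, substitutions per site
    print(substitutions_to_years(height))
    print(years_range(height))
```

Because the standard error is large relative to the rate, the bounds are wide, which is worth bearing in mind when comparing expansion times with dated lake level events.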

Results were visualised using Tracer v1.4 (Drummond et al., 2005). Convergence of chains to a stationary distribution was confirmed by visual inspection of plotted posterior estimates. Bayesian skyline plots were plotted as time versus median Neτ for each of the 100 equally distributed time intervals in the output, where Neτ is the product of effective population size and generation time (Ho & Shapiro, 2011). In this study, both mutation rate and generation time were measured in years × 10⁻⁶. Bayesian skyline plots tended to indicate a pattern of population expansion after a period of stasis or decline. The point in time at which the expansion started was quantitatively estimated as the first time interval from which there was continuous positive expansion towards the present effective population size. The time interval during which the population increase was the greatest was also identified, and is here referred to as the time of peak population expansion. Bayesian skyline plots provide an output that is the product of effective population size and generation time. Here, we assume that generation times of Lake Malawi cichlids show no systematic variation, and refer to this output as ‘effective population size’.
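The two summary statistics defined above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the skyline output is available as two equal-length lists (time before present in ascending order, and the median of the product of effective population size and generation time at each time point), it restricts the search for peak growth to the continuous expansion phase, and it reports the peak at the midpoint of the interval. The function name is hypothetical.

```python
def expansion_times(times, medians):
    """Return (start_of_expansion, peak_expansion) in the units of `times`.

    times   -- time before present, ascending; times[0] is the present
    medians -- median skyline value at each time point
    Returns (None, None) if there is no increase towards the present.
    """
    if len(times) != len(medians) or len(times) < 2:
        raise ValueError("need two equal-length series with >= 2 points")

    # Walk back from the present for as long as each older point is
    # strictly smaller, i.e. growth is continuous towards the present.
    start_idx = 0
    while (start_idx + 1 < len(medians)
           and medians[start_idx + 1] < medians[start_idx]):
        start_idx += 1
    if start_idx == 0:
        return None, None

    # Increase across each interval inside the expansion phase.
    growth = [medians[i] - medians[i + 1] for i in range(start_idx)]
    peak_i = max(range(start_idx), key=lambda i: growth[i])
    peak_time = (times[peak_i] + times[peak_i + 1]) / 2
    return times[start_idx], peak_time


if __name__ == "__main__":
    # Hypothetical skyline: stasis, then expansion towards the present.
    t = [0, 10000, 20000, 30000, 40000, 50000]
    m = [5.0, 4.0, 1.5, 1.0, 1.0, 1.2]
    print(expansion_times(t, m))
```

With equally spaced intervals, as in the skyline output described above, the absolute increase per interval and the rate of increase identify the same interval.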

Estimates of genetic diversity

Sequences of all analysed individuals (898 in total) were aligned, and trimmed to a standard 320 bp alignment. To account for differences in sample size, we calculated rarefied haplotype richness, to a sample size of 14, using the software Primer 6 (Primer-E Ltd, Plymouth, UK).
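For readers unfamiliar with rarefaction, a minimal sketch of the standard Hurlbert expectation is given below. It computes the expected number of distinct haplotypes in a random subsample of n sequences drawn without replacement. Whether Primer 6 computes the statistic in exactly this way is an assumption, and the function name and example labels are hypothetical.

```python
from collections import Counter
from math import comb


def rarefied_richness(haplotypes, n=14):
    """Expected number of distinct haplotypes in a subsample of size n.

    haplotypes -- iterable of haplotype labels, one per sequenced individual
    n          -- subsample size (14 matches the minimum sample size used here)
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    if n < 1 or n > total:
        raise ValueError("n must be between 1 and the number of sequences")
    denom = comb(total, n)
    # For each haplotype, 1 minus the probability that it is absent
    # from the subsample; math.comb returns 0 when k > n.
    return sum(1 - comb(total - c, n) / denom for c in counts.values())


if __name__ == "__main__":
    # Hypothetical population of 20 sequences with four haplotypes.
    sample = ["h1"] * 10 + ["h2"] * 6 + ["h3"] * 3 + ["h4"]
    print(rarefied_richness(sample, n=14))
```

Rarefying to the smallest sample size makes richness comparable across populations that differ in the number of sequences available.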

Results

Bayesian skyline reconstructions provided evidence of an increase in effective population size towards the present day in all studied populations. There was generally high consistency in the timing of this expansion between populations of the same species (Fig. 1; Table 1). There was no significant overall association between the number of sequences within a population and the time of the start of population expansion (r = 0.039, P = 0.795) or the time of maximum population growth (r = 0.069, P = 0.647). Similarly, there were no associations between the size of the sequence alignment and the time of the start of population expansion (r = 0.208, P = 0.165) or the time of maximum population growth (r = 0.283, P = 0.0560). It was notable that several populations showed evidence of population increase after long-term stasis or population decline (Fig. 1). The timescales of start of expansion ranged between 16 Ka and 1.26 Ma, while peak time of expansions ranged between 0.3 and 424 Ka (Table 1; Figs. 1, 2).
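The association tests reported above can be sketched as follows, assuming they are Pearson product-moment correlations across the 46 populations (the text does not name the coefficient). The sketch computes r and the corresponding t statistic with n − 2 degrees of freedom; the P value would then be read from a t distribution, which is omitted here to keep the example free of external libraries. Function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with >= 3 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Alignment length versus time of maximum growth, as reported above.
    print(t_statistic(0.283, 46))
```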

Fig. 1 a–h Bayesian skyline plots derived from mtDNA control region haplotypes, where Neτ represents a product of female effective population size (Ne) and generation time (τ, years × 10⁻⁶). Note that the y-axis range differs among panels

Fig. 2 Associations between estimates of genetic diversity (rarefied haplotype richness) and population expansion times in the Lake Malawi haplochromine radiation. Neτ represents a product of effective population size (Ne) and generation time (τ, years × 10⁻⁶)

Populations of the typically rock-associated ‘mbuna’ group showed an average start of population expansion 45 Ka, and peak population expansion 9 Ka (Table 1). Populations within the four species of the ‘utaka’ group represented by Copadichromis and Mchenga showed an average start of population expansion 123 Ka, and average time of peak population expansion 17 Ka (Table 1). By contrast, the average start of population expansion for offshore cichlid species of the genus Diplotaxodon was 266 Ka, with the average time of peak population expansion at 98 Ka. Similarly, species of the offshore Rhamphochromis also showed relatively ancient expansions, starting on average 435 Ka, and peaking 128 Ka (Table 1). Rhamphochromis sp. ‘chilingali’, the only peripheral lake population within the genus that has been reported to date, exhibited the most recent peak rate of population expansion in the genus at 20 Ka (Fig. 1; Table 1).

The start of population expansion, the peak times of population expansion and current effective population size were strongly associated with rarefied haplotype diversity (Fig. 2). Populations of Metriaclima showed the smallest effective population sizes, and the lowest genetic diversity. The most substantial effective population sizes and genetic diversity were present in Diplotaxodon and Rhamphochromis. The isolated R. sp. ‘chilingali’ had the smallest current effective population size of any taxon in the genus (Fig. 1; Table 1).

Discussion

Our results revealed a striking contrast between the estimated age of the Lake Malawi haplochromine radiation of 2.3–4.6 million years, and timing of expansions of the current populations within it. Indeed, all studied taxa have undergone population expansion within the last 500 Ka, and most possess diversity that originated within the last 90,000 years. This is consistent with a model where the major lineages that comprise the flock are ancient, but where most extant populations are very young, and indeed in many cases originated since the major lake level rise. This suggests that the diversity of Lake Malawi has not been steadily accumulating, but instead has been subject to sharp increases and declines linked to major environmental change. Pre-existing genetic diversity within species of littoral habitat specialists is likely to have been partitioned among allopatric populations when lake levels rose. By contrast, habitat loss during lake level falls may have driven local extinction or genetic homogenization of populations of these littoral species.

Populations of offshore species in Rhamphochromis and Diplotaxodon were typically older than populations of the inshore genera found around rock, sand, or shallow benthic habitats, namely Copadichromis, Mchenga and Metriaclima. This suggests that species present in offshore habitats have persisted through major lake level changes without substantial loss of genetic diversity. By contrast, those species that require inshore rock or soft benthic habitats for at least part of their life history have been strongly affected by the lake level changes. Either the existing species persisted in refugia in low abundance during the low stands, or alternatively speciation has taken place within the inshore lineages in response to rising water levels. Notably, allopatric populations within the mbuna rock cichlid genus Metriaclima often show differences in male breeding colours (Allender et al., 2003; Konings, 2007) and genetic differences linked to a lack of dispersal across habitat discontinuities such as sand and deep water (Smith & Kornfield, 2002; van Oppen et al., 1998), and exhibit partial assortative mating in laboratory free mate choice trials driven by direct female mate choice (Knight et al., 1998; Knight & Turner, 2004). This is consistent with them representing at least incipient species, with the potential to become fully reproductively isolated. Equivalent differences in male breeding colour are present among populations of the ‘utaka’ group of species studied here. Breeding populations of Copadichromis show local adaptation in morphology and body size, and extremely limited gene flow among some populations (Anseeuw et al., 2008, 2011). Thus, it seems plausible that the lake level rise not only provided new habitat, but also opportunities for speciation in mbuna and other shallow water genera.
Smaller lake level changes may also have promoted evolution of new phenotypes in newly isolated lakes, that perhaps resembled peripheral lakes that surround Lake Malawi today, including Lake Chilingali (Genner et al., 2007b) and Lake Kingiri (Delalande et al., 2008). The presence of at least two putatively endemic haplochromine species in Lake Chilingali (Genner et al., 2007b; Tyers et al., 2014) suggests that speciation in the Malawi flock was promoted, at least partly, through isolation of populations within peripheral water bodies.

Research on Lake Tanganyika rock cichlids has revealed strikingly similar reconstructions to those of the current study, with rapid increases in effective population size of inshore rock-associated cichlid populations in response to rising lake levels following Pleistocene megadroughts (Koblmüller et al., 2011; Nevado et al., 2013). Thus, the picture emerging from comparing sediment core data and reconstructions of effective population size of African Great Lake cichlids in general is one of temporally dynamic environments, where both the limnological parameters and the species assemblages present have changed dramatically, and likely repeatedly, over the history of the lakes. Notably, in both Lake Malawi and Lake Tanganyika offshore species have undergone only limited speciation when compared with inshore species groups. In Lake Malawi, for example, the Diplotaxodon (including the closely related Pallidochromis) and Rhamphochromis groups each contain only 15 species. This contrasts greatly with estimates of species richness of the mbuna group (over 320 species; Konings, 2007) and a ‘benthic’ group that includes the utaka (over 470 species; Konings, 2007). Many inshore species in both Lake Malawi and Lake Tanganyika show strong preferences for specific depths and substrates; thus, it is possible that the evolution of these preferences has contributed to speciation and adaptive evolution when coupled with the ecological opportunity provided by major lake level changes. Such diversification may have taken place following widespread extinction events and/or interspecific hybridization driven by habitat loss and increased turbidity characteristic of low lake stands (Cohen et al., 2007).

Widespread introgression may partially explain the high proportions of the nuclear genome shared among species in the Lake Malawi flock (Albertson et al., 1999; Won et al., 2006; Loh et al., 2013), and there is evidence of both ancient (Joyce et al., 2011; Genner & Turner, 2012) and recent introgression (Zidana et al., 2009; Mims et al., 2010) among species of Lake Malawi haplochromines. In principle, given that genetic diversity is strongly correlated with the time of population expansion in these cichlids (Fig. 2), populations whose diversity has been inflated by introgression could be younger than indicated by Bayesian skyline plots. Such ongoing hybridization is most likely to be prevalent among species with recent common ancestry sharing overlapping habitat preferences, particularly sympatric species belonging to the same genus. Our results tend to indicate that populations from the same genus show similar trends in population growth irrespective of geographic overlap, suggesting that contemporary hybridization may have only minimal impact on observed patterns. This could partly be related to the low likelihood of encountering introgressed mitochondrial DNA alleles following an individual hybridization event where populations are large (Ricklefs & Bermingham, 2004). However, more ancient hybridization during low stands could in principle have homogenised the genetic diversity more widely both within and among genera, contributing to the extensive incomplete lineage sorting of mtDNA sequences commonly observed within studies of Lake Malawi cichlids (Genner et al., 2007b, c).

Concluding remarks

The evidence from this study suggests that components of the Lake Malawi species flock differ substantially in the time of population expansion, in part due to contrasts among species in habitat preferences. We also observed strong associations between the effective population size, the time of population expansion and the genetic diversity within populations. Such correlations between genetic diversity and effective population size are perhaps to be expected, since offshore taxa have larger populations than range-restricted inshore taxa and also exhibit extensive intraspecific spatial genetic homogeneity among populations. Correlations between genetic diversity and the time of population expansion also indicate that populations require time to accumulate this genetic diversity, either through new mutation, or potentially through introgression with sympatric species (Seehausen, 2004). Future studies, potentially accounting for known issues of time-dependency in rates of molecular evolution (Ho et al., 2011), and reconstructing temporal changes in population sizes using multiple genes (Heled & Drummond, 2010) or whole genome data (Li & Durbin, 2011), are likely to add further detail about the evolutionary dynamics of Lake Malawi’s haplochromine cichlids.