Introduction

In the Indo-Pacific, the Caribbean region, West Africa and Central America, insular river systems (young oligotrophic rivers) are subject to extreme climatic and hydrological seasonal variation. They are inhabited by gobioids with a fascinating life cycle adapted to the ecological conditions prevailing in these distinctive habitats. Whether endemic or more broadly distributed, these species spawn in freshwater; the free embryos drift downstream to the sea, where they undergo a planktonic phase, and then return upstream to the rivers, using an impressive rock-climbing ability, to grow and reproduce (Keith 2003; Keith et al. 2008). This type of life cycle is called amphidromous (McDowall 2007, 2009a), and it is a successful adaptation to the colonisation of new and remote islands (McDowall 2009b). The exact details of their biological cycle, as well as the factors leading to such extreme adaptation in these gobies, are poorly known, despite the fact that they are the biggest contributors to the diversity of fish communities in the Indo-Pacific and the Caribbean insular river systems and have the highest levels of endemism (Radtke and Kinzie 1996; Keith et al. 2000; Keith et al. 2004; Watson et al. 2002, 2007; Lord and Keith 2006; Keith and Lord 2010a). Moreover, at certain times of the year, the biomass of larvae migrating upstream is so great that they are an important source of food for local human populations in some island archipelagos (Bell 1999; Keith et al. 2006a). However, the harvesting of this food resource is highly unsustainable, on account of the complexity of the species’ life cycle and the hydrological specificities of these islands (Keith 2003; Lord and Keith 2008; Valade et al. 2009). Understanding the evolution and dispersal of these groups is crucial for their management.

Among amphidromous gobies, the subfamily Sicydiinae Bleeker, 1874 comprises eight described genera and approximately 80–90 species (Keith 2003; Pezold et al. 2006; Froese et al. 2009, FishBase, www.fishbase.org, version 12/2009; Keith and Lord 2010b). Many Sicydiinae species are rare, with a very restricted distribution. Fifty-seven of the species were discovered and described in the last 20 years, and about half of them are known from only a few specimens. The Sicydiinae genera are Stiphodon Weber 1895; Sicyopus Gill 1863; Lentipes Günther 1861; Cotylopus Guichenot 1864; Sicyopterus Gill 1860; Sicydium Valenciennes 1837; Akihito Watson, Keith and Marquet 2007; and Parasicydium Risch 1980.

All of these genera have specific distributions (Fig. 1). Some have a restricted range: Cotylopus is present only in the West Indian Ocean (Keith et al. 2005b); Parasicydium is restricted to Western Africa (Pezold et al. 2006) and Akihito, a recently described genus (Watson et al. 2007; Keith et al. 2007b), appears to be restricted to the Western Pacific Ocean. The other genera are more widely distributed: Sicydium is present throughout the Caribbean, Central America and West Africa (Pezold et al. 2006); Sicyopterus is distributed in the Indo-Pacific from the Western Indian Ocean to the Eastern Pacific (Keith et al. 2005a); Stiphodon, Sicyopus, and Lentipes can be found from the Eastern Indian Ocean to the Eastern Pacific (Watson et al. 2001; Watson et al. 2002; Keith and Marquet 2007; Keith et al. 2007a). It has been shown that all Sicydiinae species are amphidromous (see Keith and Lord 2010b). Given the specific biological cycle of amphidromous gobies, the marine larval phase is the key element for species dispersal and colonisation of remote islands. The strength and direction of past and present marine currents, as well as the duration of this phase, could influence the dispersal ability and therefore the distribution area of genera and species (Planes 1993; Gaither et al. 2010) and the evolution of the subfamily. Because of the particular life cycle of Sicydiinae, their importance in Indo-Pacific and Caribbean island biodiversity and their important role in local fisheries (Bell 1999), it is crucial to understand the phylogenetic relationships among the genera to evaluate the age of the different lineages and the history of the colonisation of the Indo-Pacific and Caribbean islands. A reliable phylogeny for the group would help in setting evolutionary hypotheses and establishing a timeline for speciation and biogeographical events. Yet little is known about the exact relationships, as there are few molecular data, and no previous wide-scale study has been performed on this group.

Fig. 1 Distribution range of the Sicydiinae genera

The aim of this study is to analyse for the first time a wide sampling of Sicydiinae genera using mitochondrial 16S rDNA and COI coding-gene partial sequences, as well as nuclear rhodopsin coding-gene partial sequences, and to discuss the results in the light of amphidromy, dispersal and marine larval duration. Phylogenetic analysis and molecular dating provide improved knowledge and understanding of the evolutionary history of this group of gobiids. We discuss the contrasting distributions of the related genera and species throughout the Caribbean, Indian and Pacific Oceans and the means used for the colonisation of these oceans.

Materials and methods

A total of 57 specimens were analysed. Fifty of them are Sicydiinae from 19 islands. Of the eight known genera, we included seven, each represented by several species and specimens: six Stiphodon species (14 specimens), four Sicyopus species (three Sicyopus (Smilosicyopus) species and one Sicyopus (Sicyopus) species, totalling nine specimens), two Lentipes species (six specimens), two Cotylopus species (four specimens), two Sicyopterus species (six specimens), two Sicydium species (five specimens) and two Akihito species (five specimens). We also included one new genus not yet described (Nov. Gen.) (Table 1). Several non-monophyletic outgroups were included. First, Awaous (Gobiidae: Gobionellinae) was selected as it has been identified as the sister group to Sicydiinae in several previous studies (Birdsong et al. 1988; Nelson 2006; Thacker 2009). Protogobius and Rhyacichthys (Rhyacichthyidae) were included as a second outgroup, as these genera are thought to have diverged first within Gobioidei (Hoese and Gill 1993; Thacker 2003); they are therefore distant enough that we can be certain they do not form a monophyletic group with Awaous.

Table 1 Sampling table organised according to taxonomy and origin

The samples were collected between 2001 and 2008 from the Mascarene Islands (Reunion and Mauritius), Comoros (Mohéli) and Madagascar in the Indian Ocean; from the Ryukyu Islands (Japan), Vanuatu (Ambae, Santo, Efate, Gaua and Malakula), New Caledonia (Grande Terre and Bélep), Futuna, Samoa and French Polynesia (Society Islands (Moorea), Marquesas (Tahuata) and Australs (Tubuai, Rurutu and Rapa)) in the Pacific Ocean; and from Guadeloupe in the Caribbean. Species were morphologically identified and compared to specimens and type specimens of the collection held by the National Museum of Natural History, Paris (France). The detailed sampling is given in Table 1.

DNA extraction and PCR

Fin clips were preserved and stored in 95% ethanol. Total DNA was extracted following the protocol described by Winnepenninckx et al. (1993). Three molecular markers were amplified: two mitochondrial (partial cytochrome oxidase I -COI- and partial 16S rDNA) and one nuclear (partial rhodopsin retrogene -Rh-). The primers used were FishF1-5′TCAACCAACCACAAAGACATTGGCAC3′ and FishR1-5′ACTTCAGGGTGACCGAAGAATCAGAA3′ (Ward et al. 2005) for COI, L2510-5′CGCCTGTTTACCAAAAACAT3′ and H3084-5′AGATAGAAACTGACCTGGAT3′ (Palumbi 1996a) for 16S, and RhF193-5′CNTATGAATAYCCTCAGTACTACC3′ and RhR1039-5′TGCTTGTTCATGCAGATGTAGA3′ (Chen et al. 2003) for rhodopsin. For 12 of the 50 Sicydiinae specimens, a fragment of the cytochrome b gene (cytb) was amplified using primers CytbF217-5′TCGAAAYATACATGCCAATGG3′ and CytbR1043-5′GAAGTAYAGGAAGGAYGCAATTT3′ to help with the mutation rate calibration. All PCRs were performed in a 25 μl volume of 5% DMSO, 5 μg of bovine serum albumin, 300 μM of each dNTP, 0.3 μM of Taq DNA polymerase from Qiagen, 2.5 μl of the corresponding buffer and 1.7 pM of each of the two primers. For all four markers, after a 2-min denaturation at 94°C, the PCR ran on Biometra thermocyclers for 45–55 cycles of 25 s at 94°C, 25 s at 52°C and 1 min at 72°C, with a terminal elongation of 3 min. Purification and sequencing of the PCR products were performed at the Genoscope (http://www.genoscope.cns.fr/) using the same primers. All sequences were obtained in both directions and checked manually against their chromatograms using Sequencher (Gene Codes Corporation). They were aligned by hand using BioEdit (Hall 1999) with the criteria listed by Barriel (1994).
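Several of the primers listed above contain IUPAC ambiguity codes (N, Y), so each is in practice a mixture of sequences. The following Python sketch, which is illustrative and not part of the original protocol, counts how many unambiguous sequences each primer represents. The function names are hypothetical; the primer sequences are copied from the text.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "FishF1": "TCAACCAACCACAAAGACATTGGCAC",
    "FishR1": "ACTTCAGGGTGACCGAAGAATCAGAA",
    "L2510": "CGCCTGTTTACCAAAAACAT",
    "H3084": "AGATAGAAACTGACCTGGAT",
    "RhF193": "CNTATGAATAYCCTCAGTACTACC",
    "RhR1039": "TGCTTGTTCATGCAGATGTAGA",
    "CytbF217": "TCGAAAYATACATGCCAATGG",
    "CytbR1043": "GAAGTAYAGGAAGGAYGCAATTT",
}


def degeneracy(primer):
    """Number of distinct unambiguous sequences encoded by a primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # 1 for A/C/G/T, more for ambiguity codes
    return count


def summarise(primers):
    """Map each primer name to (length in nucleotides, degeneracy)."""
    return {name: (len(seq), degeneracy(seq)) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, (length, deg) in summarise(PRIMERS).items():
        print(f"{name}: {length} nt, degeneracy {deg}")
```

Under this count, the COI and 16S primers are non-degenerate, whereas RhF193 (one N and one Y) stands for eight sequences.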

Phylogenetic analyses

For each gene, the best-fitting model of nucleotide evolution was selected using MrModeltest 2.3 (Nylander 2004). Trees were inferred only from the Rh, COI and 16S data sets, as the CytB encoding gene was drastically under-sampled. For both single-marker data sets and the combined data set, maximum parsimony (MP), maximum likelihood (ML) and Bayesian analyses were performed using, respectively, PAUP* version 4.0B10 (Swofford 2002), Treefinder (Jobb et al. 2004) and BEAST (Drummond and Rambaut 2007). MP analysis was performed using a heuristic search with 1,000 iterations of random addition of sequences and branch rearrangement by tree bisection-reconnection (TBR), with gaps coded as missing. ML analyses were run with default parameters, and best-fitting models were set separately for each data set. The robustness of nodes was estimated from 1,000 bootstrap (BP) replicates for ML and 10,000 for MP (Felsenstein and Kishino 1993).

Bayesian analyses were run on the Cornell University web server (http://cbsuapps.tc.cornell.edu). The Yule model, which assumes a constant speciation rate among lineages, was used as a prior for the speciation processes, and an uncorrelated lognormal prior was used to relax the molecular clock hypothesis. According to results from MrModeltest 2.3, exponential priors were used instead of Jeffreys priors for very low substitution rates (<10⁻³). Other priors were left at their default values, except for those involved in the molecular dating (see below). Four Markov chains were run over 10,000,000 and 20,000,000 generations for single and combined data sets, respectively, with a sampling frequency of 1,000 generations. Results were examined with Tracer v1.4.1 (Rambaut and Drummond 2007) to assess the effective sample sizes (ESS) and the convergence among Markov chains. The four runs from each analysis were then combined using LogCombiner v1.4.8, with a burn-in of 2,000,000 for each run and a sample frequency of 4,000. The final tree was produced by determining a consensus among the combined sets of accepted trees (burn in: 10,000) with the “maximum clade credibility” option and the mean node height using TreeAnnotator v1.4.8. The majority rule consensus tree was drawn using FigTree v1.1.2 (Rambaut 2008).
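As a rough check on the chain settings described above, the following Python sketch estimates how many trees remain after the LogCombiner step. It assumes that the burn-in of 2,000,000 is counted in generations, that the four chains are the four runs being combined, and that the initial state-0 sample is ignored, so actual counts may differ by one per run. The function name is hypothetical.

```python
def retained_samples(chain_length, burn_in, resample_every, runs):
    """Approximate number of trees kept after burn-in removal and thinning.

    chain_length, burn_in and resample_every are all in generations.
    """
    per_run = (chain_length - burn_in) // resample_every
    return per_run * runs


# Settings taken from the text: 4 runs, burn-in 2,000,000, resampled every 4,000.
SINGLE = retained_samples(10_000_000, 2_000_000, 4_000, runs=4)
COMBINED = retained_samples(20_000_000, 2_000_000, 4_000, runs=4)

if __name__ == "__main__":
    print("single-marker analyses:", SINGLE, "trees")
    print("combined analysis:", COMBINED, "trees")
```

Under these assumptions, about 8,000 trees are retained for each single-marker analysis and about 18,000 for the combined analysis.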

In order to propose biogeographical hypotheses, ancestral localities were reconstructed using the maximum likelihood method implemented in Mesquite (Maddison and Maddison 2009). The geographic distribution of species was discretized into three states, with respect to major biogeographical entities (Pacific Ocean, Indian Ocean, Caribbean Sea). Locations of outgroups were coded as missing data, so as not to influence the reconstruction within the ingroup. The evolution of habitat use was set under a Markov model with three states and one transition parameter (Lewis 2001). To take the tree uncertainties into account, ancestral states were reconstructed for all Bayesian trees retained from the combined analysis, and their mean likelihood was then plotted on the maximum clade credibility tree.
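The three-state, one-parameter Markov model mentioned above can be illustrated with its closed-form transition probabilities. The Python sketch below assumes that the single parameter is the instantaneous rate of change between any two different states; how Mesquite scales this parameter internally is not stated in the text, and the function name is hypothetical.

```python
import math


def mk1_transition_matrix(n_states, rate, time):
    """Transition probabilities for a one-parameter Markov (Mk1) model.

    Every change between two different states occurs at the same
    instantaneous rate, so only two distinct probabilities exist:
    staying in the same state or moving to any one other state.
    """
    decay = math.exp(-n_states * rate * time)
    stay = 1.0 / n_states + (n_states - 1) / n_states * decay
    move = 1.0 / n_states - decay / n_states
    return [[stay if i == j else move for j in range(n_states)]
            for i in range(n_states)]


if __name__ == "__main__":
    # Three states: Pacific Ocean, Indian Ocean, Caribbean Sea.
    # The rate and branch length are arbitrary illustrative values.
    for row in mk1_transition_matrix(3, rate=0.1, time=5.0):
        print([round(p, 3) for p in row])
```

Each row sums to one; for very short branches the matrix approaches the identity, and for very long branches every entry approaches 1/3.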

Molecular clock calibration

Priors used to calibrate the uncorrelated molecular clocks include data on both mutation rates and biogeographical events. In Gobiidae, the mean mutation rate for cytochrome b ranges from 1.93 to 2.17% per Myr (Rocha et al. 2005). As this marker was not included in our phylogenetic analyses, we estimated a prior for the mean mutation rate across the three genes analysed in this study. The nucleotide evolution models selected using MrModeltest 2.3 were first used to calculate pairwise genetic distances within each data set. Pairwise mutation rates for COI, 16S and Rh were then estimated from a linear relationship between the COI, 16S or Rh branch length, the corresponding CytB branch length and the upper or lower value of the CytB mutation rate as calculated by Rocha et al. (2005). The upper and lower means of the pairwise mutation rates were then calculated for each data set, and their average was used as a uniform prior for the branch rate mean (UCLD). The ‘meanRate’ parameter itself was left under a non-informative lognormal prior.
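The rate transfer from CytB to the other markers can be sketched as follows. This is only an illustration under a simple proportionality assumption (marker rate = CytB rate × marker distance / CytB distance for the same pair of specimens); the exact form of the linear relationship used in the study is not given in the text. The function names are hypothetical and the pairwise distances are invented for the example.

```python
# CytB rate bounds in % per Myr, from the text (Rocha et al. 2005).
CYTB_RATE_BOUNDS = (1.93, 2.17)


def scaled_rate(marker_distance, cytb_distance, cytb_rate):
    """Rate for a marker, assuming it scales with the distance ratio."""
    return cytb_rate * marker_distance / cytb_distance


def mean_rate_bounds(pairs, cytb_bounds=CYTB_RATE_BOUNDS):
    """Lower and upper mean pairwise rates for one marker.

    pairs: (marker_distance, cytb_distance) tuples, one per specimen pair.
    """
    pairs = list(pairs)
    lower_rate, upper_rate = cytb_bounds
    lower = sum(scaled_rate(m, c, lower_rate) for m, c in pairs) / len(pairs)
    upper = sum(scaled_rate(m, c, upper_rate) for m, c in pairs) / len(pairs)
    return lower, upper


# Hypothetical pairwise distances, made up purely for illustration.
EXAMPLE_PAIRS = [(0.05, 0.10), (0.02, 0.04), (0.03, 0.06)]

if __name__ == "__main__":
    low, high = mean_rate_bounds(EXAMPLE_PAIRS)
    print(f"illustrative marker rate: {low:.3f}-{high:.3f} % per Myr")
```

In this invented example the marker evolves half as fast as CytB, so the resulting bounds are half the CytB bounds.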

Because of a lack of fossils for the Sicydiinae, emergence ages of some archipelagos where endemic species occur were selected as “geological” constraints. Using current high island ages to date speciation makes the assumption that the currently observed geographic range of an endemic species is the same as its past range.

We have chosen two islands with endemic species of Sicydiinae in areas where there are no older islands, and where all known endemic species are included in the sampling. The emergence times of Ambae (Vanuatu) and Futuna—dated, respectively, at 5 Myr (Keith et al. 2009) and 3.5 Myr—were used to calibrate our relaxed molecular clock. We assume that the diversification of Akihito vanuatu (endemic to Ambae) and Akihito futuna (endemic to Futuna) is not older than the emergence of the younger of the two islands (Futuna). The same hypothesis is used for the divergence of Sicyopus (Smilosicyopus) sasali (endemic to Futuna) and the other Sicyopus (Smilosicyopus), as well as for the divergence of Stiphodon rubromaculatus (endemic to Futuna) and the other Stiphodon.

To provide a more realistic assessment of the uncertainty associated with the divergence of the species, normally distributed priors were used for the divergence of species according to the island emergence dates. Indeed, the divergence of an endemic species from its sister species probably could not have happened before the emergence of the island where it is endemic, but could have happened at any time between this emergence and the present time. The time constraints for the divergence of these species were therefore calibrated using the interval [0–3.5] Myr as lower and upper bounds.

The two Cotylopus species were used as an additional calibration point. Keith et al. (2005b), using molecular data, estimated the divergence between Cotylopus acutipinnis and Cotylopus rubripinnis at 4.66 ± 2.53 Myr. This divergence estimate is consistent with the biogeographical history of the Mascarene Islands and Comoros. We therefore used a normally distributed prior to calibrate the divergence between these two Cotylopus species, using the interval [2.13–7.19] Myr as lower and upper bounds.
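The calibration intervals quoted above follow from simple arithmetic, reproduced in the Python sketch below. The names are hypothetical, and the sketch only derives the bounds; it makes no assumption about how the normal priors were parameterised in BEAST.

```python
def bounds_from_estimate(mean, half_width):
    """Lower and upper bounds (Myr) for an estimate given as mean ± half_width."""
    return (round(mean - half_width, 2), round(mean + half_width, 2))


def bounds_from_island_age(emergence_age):
    """Divergence of an island endemic: between the present and emergence."""
    return (0.0, emergence_age)


CALIBRATIONS = {
    # Futuna emerged 3.5 Myr ago (the younger of the two calibration islands).
    "Futuna endemics": bounds_from_island_age(3.5),
    # Cotylopus acutipinnis / C. rubripinnis: 4.66 ± 2.53 Myr (Keith et al. 2005b).
    "Cotylopus split": bounds_from_estimate(4.66, 2.53),
}

if __name__ == "__main__":
    for name, (low, high) in CALIBRATIONS.items():
        print(f"{name}: [{low}-{high}] Myr")
```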

Results

The COI data set was 679 base pairs (bp) long (232 variable sites and 225 parsimony-informative sites). The 16S data set was 596 bp long (123 variable sites and 118 parsimony-informative sites). The Rh data set was 836 bp long (164 variable sites and 154 parsimony-informative sites). For the 12 specimens sequenced for the CytB marker, we obtained a 605 bp data set (216 variable sites and 178 parsimony-informative sites). Based on the Akaike information criterion (AIC), a GTR + I + G model was selected as the best-fitting model for the sequence data of all three markers.
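To compare the markers on a common scale, the site counts above can be expressed as percentages of alignment length. The Python sketch below uses only the figures reported in the text; the function names are hypothetical, and the combined length assumes a simple concatenation of the three markers without trimming.

```python
# (alignment length in bp, variable sites, parsimony-informative sites)
DATASETS = {
    "COI": (679, 232, 225),
    "16S": (596, 123, 118),
    "Rh": (836, 164, 154),
    "CytB": (605, 216, 178),
}


def site_percentages(length, variable, informative):
    """Percentages of variable and parsimony-informative sites."""
    return (round(100 * variable / length, 1),
            round(100 * informative / length, 1))


def combined_length(markers=("COI", "16S", "Rh")):
    """Length of the concatenated alignment for the chosen markers."""
    return sum(DATASETS[m][0] for m in markers)


if __name__ == "__main__":
    for marker, counts in DATASETS.items():
        var_pct, inf_pct = site_percentages(*counts)
        print(f"{marker}: {var_pct}% variable, {inf_pct}% informative")
    print("combined COI + 16S + Rh:", combined_length(), "bp")
```

On these figures the two protein-coding mitochondrial markers (COI and CytB) are the most variable, with roughly a third of their sites variable, against about a fifth for 16S and Rh.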

Phylogenetic analysis

For each data set, MP, ML and BA recovered the same supported clades. With each method (MP, ML and BA), there were no supported incongruences between the topologies inferred from the single-gene data sets (figures not shown). A number of relationships are unambiguously supported by bootstrap and/or posterior probability values (Fig. 2). Clade A comprises the two rhyacichthyids, Protogobius attiti and Rhyacichthys guilberti. Clade B includes only the Awaous specimens. The monophyly of the Sicydiinae is always recovered.

Fig. 2 Phylogram inferred from Bayesian analysis of the combined molecular data (COI, 16S and Rh). Numbers on the nodes are posterior probability (PP) values, followed by bootstrap (BP) values for maximum parsimony analysis and then bootstrap values for maximum likelihood analysis. When PP and BP were less than 60%, the value was represented by “- -”. The black bar indicates occurrence in the Pacific Ocean, the light grey bar indicates occurrence in the Indo-Pacific Oceans, the dark grey bar indicates occurrence in the Indian Ocean and the white bar indicates occurrence in the Caribbean. The scale is given in substitutions per site

The subfamily is split into five clades. The relationships among these clades differ from one data set to another, and they are not supported in any of the analyses by bootstrap or posterior probabilities. We therefore consider that the relationships among these clades are not resolved for the time being.

Well-supported clades C (Cotylopus) and F (Stiphodon) group all members of a single genus together. Clade F is structured into two well-supported subclades.

Clade G is also mono-generic and well supported, but it contains only the species of Sicyopus belonging to the subgenus Smilosicyopus. The remaining species from the same genus, Sicyopus zosterophorum, is robustly included in clade E. Sicyopus is the only non-monophyletic genus in the present study.

The last two clades (E and D) group members from several genera. Clade D is composed of two monophyletic taxa, Sicyopterus and Sicydium. Clade E includes members of four genera: Akihito, Lentipes, a new genus from the Indian Ocean not yet described (Nov. Gen.), and Sicyopus (Sicyopus), which is not included in clade G with its congeners. The relative position of the four genera in clade E is not consistent from one data set to another and is poorly supported. However, all these genera are monophyletic and some are well supported.

Biogeographical hypotheses

The reconstructed ancestral localities (Fig. 3) show that the origin of the Sicydiinae subfamily was probably in the Pacific Ocean. Indeed, the percentage at the Sicydiinae node is 77.5% for a Pacific origin against 7.4% for the Indian Ocean and 15.1% for the Caribbean. Nearly all the clades within Sicydiinae also show a Pacific origin. Clade D (Sicyopterus/Sicydium) also shows a Pacific origin (Pacific: 77.6%; Indian Ocean: 21.3%; Caribbean: 1.1%), which is of particular interest in the discussion of Sicydiinae biogeography, dispersal and evolutionary hypotheses.

Fig. 3 Reconstructed ancestral localities using the maximum likelihood method implemented in Mesquite (Maddison and Maddison 2009). The geographic distribution of species was discretized into three states, with respect to major biogeographical entities (Western Pacific, Indian Ocean, Caribbean). Locations of outgroups were coded as missing data

Dating

The mutation rates estimated using the range of the CytB mutation rate reported by Rocha et al. (2005) were 1.81–2.04, 0.5–0.56 and 0.45–0.5% per Myr for COI, 16S and Rh, respectively. Our estimate for the COI mutation rate was consistent with literature data (Bowen et al. 2001; Muss et al. 2001). Divergence times are plotted on the chronogram (Fig. 4).

Fig. 4 Chronogram inferred from the Bayesian analysis, with the time scale estimated using a Bayesian relaxed molecular clock approach. The black bar indicates occurrence in the Pacific Ocean, the light grey bar indicates occurrence in the Indo-Pacific Oceans, the dark grey bar indicates occurrence in the Indian Ocean and the white bar indicates occurrence in the Caribbean

The times estimated for the divergence of the five clades within Sicydiinae are very close to one another, ranging from 6.7 Myrs for the divergence of Cotylopus to 4.84 Myrs for the divergence of Sicyopus (Smilosicyopus) and Stiphodon, and the confidence intervals for these estimates overlap considerably. Within clade D, the divergence between the genera Sicyopterus and Sicydium is estimated at 4.07 Myrs (95% HPD: 2.53–5.58 Myrs).

The divergences for the three clades (E, F and G) occurring only in the Pacific Ocean are very close to one another, at 5.33 Myrs ago (95% HPD interval, 4.07–6.6 Myrs) for E from F and G, and at 4.84 Myrs (95% HPD interval, 3.56–6.25 Myrs) for F from G.
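The overlap between the credibility intervals reported above can be made explicit with a small helper. The Python sketch below is illustrative only: the function name is hypothetical and the interval values are those quoted in the text.

```python
def interval_overlap(a, b):
    """Shared part of two closed intervals, or None if they are disjoint."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


# 95% HPD intervals in Myrs, from the text.
E_FROM_FG = (4.07, 6.6)            # clade E from clades F and G
F_FROM_G = (3.56, 6.25)            # clade F from clade G
SICYOPTERUS_SICYDIUM = (2.53, 5.58)  # split within clade D

if __name__ == "__main__":
    print("E/FG vs F/G:", interval_overlap(E_FROM_FG, F_FROM_G))
    print("F/G vs clade D split:", interval_overlap(F_FROM_G, SICYOPTERUS_SICYDIUM))
```

The interval for the F/G split shares the range 4.07–6.25 Myrs with that for the divergence of clade E, which is most of its width.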

Discussion

Sicydiinae phylogeny

The present molecular study includes all but one of the Sicydiinae genera, and it supports the monophyly of this subfamily. Among the Gobiidae, the Sicydiinae subfamily has never been completely studied using molecular analyses (Akihito 1986; Akihito et al. 2000; Wang et al. 2001). The present study is the first one to include such a large number of Sicydiinae representatives (species and genera), drawing the first conclusions on this subfamily’s organisation. Several authors working on morphological characters (Sakai and Nakamura 1979; Harrison 1993; Pezold 1993; Parenti and Maciolek 1993; Parenti and Thomas 1998; Watson et al. 2007) indicated that the buccal morphology and the specificities of pelvic fins of Sicydiinae provide valuable characters to assess this group’s phylogeny. Harrison (1989), using these morphological characters, supported monophyly of the subfamily, but he examined only three genera (Stiphodon, Sicyopterus and Sicydium). Parenti and Thomas (1998) examined the morphology of two additional genera (Lentipes and Sicyopus) and also concluded that Sicydiinae are a monophyletic group.

Our molecular study confirms these works based on morphology and splits the subfamily into five clades.

Cotylopus clade (Clade C)

Cotylopus is the only genus in the Sicydiinae group that is strictly endemic to the Indian Ocean, on islands of the Mascarene and Comoros archipelagos (Keith et al. 2005b; Keith et al. 2006b). Cotylopus has been confused with Sicyopterus in the past, but our results confirm the conclusions of Watson (1995) and Keith et al. (2005a), who elevated Cotylopus to a valid and monophyletic genus.

Sicyopterus/Sicydium clade (Clade D)

Our molecular data confirm the monophyly of the genus Sicyopterus, as shown by Keith et al. (2005a) and Berrebi et al. (2006), and uncover a sister-group relationship with Sicydium. Sicydium and Sicyopterus share two putative synapomorphies: a short, blunt ascending process of the premaxillae and a unique oculoscapular canal pore pattern (Pezold 1993; Parenti and Maciolek 1993). Parenti and Maciolek (1993) considered an inclusive monophyletic group comprising Stiphodon, Sicyopterus and Sicydium. However, our study places Stiphodon closer to clade G and clade E, although this is not well supported and not recovered in all separate analyses; therefore, more data are needed to choose between the two proposed topologies.

Stiphodon clade (Clade F)

Stiphodon species are unique among the Sicydiinae in having three anal pterygiophores prior to the first haemal spine (Birdsong et al. 1988); in all other Sicydiinae genera, there are only two. Our molecular results support the monophyly of the genus Stiphodon and divide it into two clades. The first comprises Stiphodon species with mainly 13–14 pectoral fin rays and a small adult size (generally less than 4 cm standard length); in our study it includes S. sapphirinus, S. rutilaureus, S. hydoreibatus and S. rubromaculatus and is referred to here as the “sapphirinus group” (Watson et al. 2005; Keith and Marquet 2007; Keith et al. 2007a). The second comprises Stiphodon species with mainly 14–16 pectoral fin rays and a large adult size (generally from 4 to 7 cm standard length); in our study it includes S. elegans and S. atratus and is referred to here as the “elegans group”. As Stiphodon is probably the most diverse Sicydiinae genus (more than 25 species), the exact composition of these clades must be confirmed by further studies including more species.

The case of the Sicyopus (Smilosicyopus) and Sicyopus (Sicyopus) clades (Clade G and part of Clade E)

Our study splits the genus Sicyopus into two distinct clades, Sicyopus (Smilosicyopus) (clade G) and Sicyopus (Sicyopus) (included in clade E). Parenti and Maciolek (1993) proposed that Sicyopus is the most plesiomorphic sicydiine genus and considered that the monophyly of Sicyopus was doubtful; our molecular analysis is in agreement with their results based on morphology.

Watson (1999) defined two main subgenera of Sicyopus, Smilosicyopus and Sicyopus, based mainly on dental characteristics found in both jaws. Our study reflects this taxonomic subdivision. We therefore propose a nomenclatural change for the genus Sicyopus. As Sicyopus (Sicyopus) zosterophorum, Bleeker, 1857, is the type species for the genus Sicyopus, the clade Sicyopus (Sicyopus) is maintained in the genus Sicyopus. Based on our molecular results and on Watson (1999), we herein elevate to genus level the subgenus Smilosicyopus Watson 1999, type species Smilosicyopus leprurus Sakai and Nakamura 1979, thus adding a new genus to the Sicydiinae subfamily.

In our molecular analysis, whatever the method used (Bayesian, MP, and ML), Stiphodon and Smilosicyopus are always sister taxa.

Lentipes/Sicyopus (Sicyopus)/Akihito/Nov. gen. (Clade E)

Our molecular study shows a sister-group relationship between Lentipes, Sicyopus, the undescribed new genus (Nov. gen.) and Akihito. In this clade, from a morphological point of view, Akihito appears closest to Sicyopus. More specimens of the new genus (Nov. gen.) are needed to define its relationships within this clade.

The MP, ML and Bayesian analyses recover the same supported clades, but our molecular results lack resolution, preventing us from drawing conclusions as to the position of the five clades relative to each other. For example, the Cotylopus clade is placed at the base of the subfamily in ML and Bayesian analyses, but in MP it is at the base of clade E in a sister-group relationship with Lentipes. Neither position is supported by high posterior probability or bootstrap values. However, the morphological characters tend to support the MP results. Indeed, Cotylopus shares with Lentipes numerous morphological and osteological characteristics (Watson 1995; Watson et al. 2002). Our molecular analysis does not allow us to conclude as to the true position of Cotylopus, and further studies, using more molecular markers, are needed. However, the presence of short branch lengths between the five clades, the differences in topologies depending on methods and markers and the lack of support at the base of the Sicydiinae might indicate that the basal diversification of the subfamily happened within a short amount of time. This is also congruent with the close divergence dates and overlapping confidence intervals estimated for these nodes. If this is the case, finding support for one hypothesis or the other might prove very difficult.

Biogeography of Sicydiinae

Amphidromy, dispersal and larval duration

All Sicydiinae genera have a specific distribution. Some have a very restricted range and some are more or less widely distributed (Fig. 1). Amphidromy is an adaptation to the colonisation of insular and isolated environments and to the dispersion to and from such environments (McDowall 2007). Therefore, amphidromy and pelagic larval duration (PLD) are major elements to understand the biogeography and phylogeny of Sicydiinae. Arai et al. (2001) and McDowall (2007) suggested that the oceanic life stages and the larval duration of Sicydiinae, and other amphidromous gobies, could be regarded as the key element in explaining their dispersal abilities and distribution. However, few studies have investigated the relationship between the migration strategies of Sicydiinae with prolonged oceanic life stages and their distribution (Iida et al. 2009).

Lord et al. (2010), working on Sicyopterus species, have shown that the mean marine larval duration is significantly longer for the widespread species, Sicyopterus lagocephalus, (around 130 days) than for species with a much more restricted range (around 80 days). For Sicyopterus japonicus, a species that ranges from Taiwan to southern Japan, the larval duration has been estimated to be 176 days (Shen and Tzeng 2002). For the other Sicydiinae studied, the mean larval duration is nearly 86 days for Lentipes concolor (Radtke et al. 1988), and nearly 100 days for Stiphodon percnopterygionus (Yamasaki et al. 2007) and Cotylopus acutipinnis (Lord et al. 2010). In other words, all Sicydiinae species that have been studied have a significantly longer PLD than the 30–50 days usually reported for reef fish (Wellington and Victor 1989; Wilson and McCormick 1999) and the 20–50 days typically reported for marine gobies (Shafer 1998). The longer marine larval duration for amphidromous species may be a developmental adaptation to their special and complex life cycle that requires them to complete the two migrations to and from marine environments. In fact, the long duration might be advantageous to locate isolated freshwater settlement sites (Radtke et al. 1988; Murphy and Cowan 2007) and colonise new and remote islands (Lord et al. 2010). These abilities could have allowed some Sicydiinae species to reach other oceans over evolutionary times.

Larval phase duration could therefore partly explain the evolution of Sicydiines and the distribution range observed for each genus. A shorter marine phase would suggest limited dispersal abilities, hence a restricted distribution range. Species with a longer marine phase (like Sicyopterus) would be able to delay recruitment (Victor 1986; Keith et al. 2008) and could therefore colonise geographically distant islands, gradually extending their distribution from island to island (Keith et al. 2005a; Keith et al. 2005b).

For eels, the shorter larval phase is known to be the ancestral state (tropical species), whereas the temperate species present longer, more derived larval phases (Kuroki et al. 2007). This plasticity in the duration of the larval phase might correspond to the expression of selected strategies, which are defined as genetically determined life histories or behaviours (Robinet et al. 2007). For tropical Sicydiinae gobies, there might be genetically determined behaviour as well, explaining their distribution range. Our molecular dating results show that the most ancient genera (Sicyopterus, Sicydium) are the ones with the largest distribution area (Sicyopterus) or which dispersed far from the Pacific (Sicydium, Cotylopus). Inversely, the most recent genera (Akihito, Lentipes, Sicyopus) are those with a more restricted distribution area (Fig. 1).

The variability of the duration of the marine larval phase could be one factor explaining the species’ distribution by favouring or limiting their dispersal, and its study is of great interest to explain Sicydiinae biogeography and phylogeny; but dispersal also depends on currents and regional biogeography. Indeed, as suggested by Lord et al. (2010), length of larval duration is not sufficient to explain species distribution. There are undoubtedly other factors of biogeographical (variation of sea level and terrestrial barriers), physical (currents used) and ecological (behaviour and swimming depth of larvae) origin that weigh on species distribution range.

The case of the Sicyopterus/Sicydium clade (Clade D)

The dates found with our molecular clock analysis concerning the divergence of Sicyopterus species are congruent with the distribution range of the genus and with the findings of Keith et al. (2005a).

To understand the evolutionary history of the Sicydiinae, one of the most interesting clades is that of Sicyopterus and Sicydium (Clade D). Sicyopterus (the genus which includes the species with the longest PLD) occurs across some 18,000 km of the Indo-Pacific, from the western Indian Ocean (with Madagascar and the Comoros Islands) to the eastern Pacific Ocean (including the Austral Islands in French Polynesia and Hawaii) (Watson et al. 2000; Keith et al. 2005a). Sicydium is found throughout western and eastern Middle America, extending from Mexico to Peru along the Pacific coast, from Mexico to Venezuela along the Atlantic coast, on the islands of the Greater and Lesser Antilles in the Caribbean Sea and in West Africa (mainly in the volcanic islands of the Gulf of Guinea) (Watson 2000; Pezold et al. 2006). Our study shows a sister-group relationship between these genera, with a divergence time estimated at between 2.53 and 5.58 Myrs. To understand the dispersal of the common ancestor, it is necessary to explain the zoogeography of the distribution pattern observed in these two genera.

Our work shows that the origin of the Sicydiinae subfamily was probably in the Pacific Ocean (Fig. 3). As with many other taxonomic groups (see Springer 1982; Rosen 1988; Wallace et al. 1989; Parenti 1991; Mukai 1992; Pandolfi 1992; Stoddart 1992; Veron 1995; Planes and Galzin 1997…) the Sicydiinae subfamily might have originated in the western Pacific as discussed by Parenti (1991), as 90% of the species inhabit this area, while the other 10% are from western and eastern Middle America and West Africa (Keith and Lord 2010b). This area is also supposed to be the centre of origin for the ancestor of Sicyopterus (Keith et al. 2005a; Berrebi et al. 2006). The phylogenetic tree obtained here and the apparent Pacific origin of the ancestor of the Sicydium/Sicyopterus species raised the issue of the dispersal route. It is impossible to rule out the possibility of changes in the world distribution of Sicydiinae due to extinctions and global climatic changes in the past. But if the current distribution (Fig. 1) reflects the past distribution, and as it is supposed that amphidromy was never lost in Sicydiinae (McDowall 2009b), three possible routes of entry into the Atlantic Ocean based on zoogeography, paleogeography and paleo-circulation for the ancestor of the Sicyopterus/Sicydium clade can be discussed: (1) the Panama route, (2) the Tethys Sea route and (3) the Cape of Good Hope route.

Two further elements need to be discussed first. The presence of Sicydium species along the eastern margin of the Pacific (Central America and the north of South America), shared with the Caribbean Sea, as for some other fish species (Parenti 1991; Rocha et al. 2005), suggests that the complete expansion of this genus occurred before the completion of the Isthmus of Panama (3–4 Myrs). Additionally, the presence of Sicydium species in a small area along the western margin of Africa should be considered, as well as the presence of Sicyopterus species in the eastern Pacific as far as Hawaii and the east of French Polynesia, but not in the most eastern part (western part of Central America). This Pacific “gap” (the east Pacific barrier (EPB)) between two genera is often observed in marine species (Victor and Wellington 2000).

In the Panama route hypothesis, the ancestors would have reached the western part of Central America and the north of South America from the centre of the Pacific, crossed the Isthmus of Panama before its completion and reached the Caribbean Sea. But this does not concur with our knowledge of the north and south equatorial currents, which flowed at this period from east to west with no possibility of crossing the isthmus (Iturralde-Vinent and Mac Phee 1999). The crossing of the Pacific Ocean from the western to eastern area was also highly improbable because of the EPB and due to the sheer size of the Pacific Ocean, which was bigger than at present. The absence of islands with freshwater in this part of the Ocean is another element against this hypothesis. Even if the PLD may be long, such islands are necessary for amphidromous species, as they would need spawning grounds in order to reach the eastern part of the Pacific step-by-step. Aoyama et al. (2001) have discussed this for the Anguilla species and also conclude that this route was improbable.

The Tethys Sea separated Laurasia from Gondwana during the Mesozoic to the beginning of the Tertiary period (200–30 Myrs) (Aoyama et al. 2001). This could have provided a circumglobal oceanic pathway for dispersion. Nevertheless, according to our study, the Sicydiinae appear to have diversified between 5.9 and 12.1 Myrs ago, so the ancestors of clade D could not have entered the Atlantic before the Tethys flow became intermittent because of the closure of the Tethys Sea (in the Oligocene, c. 20–30 Myrs) (Tsukamoto and Aoyama 1998).

Although the possibility of changes in worldwide distribution cannot be ruled out due to, for example, past extinctions, the most likely dispersal route still appears to have been the Cape of Good Hope route (Fig. 5). In this hypothesis, the clade’s ancestor would have reached the western part of Africa around 6–4 Myrs ago, as a first step in the dispersal from the Pacific to the Atlantic via the Indian Ocean going towards Central America and the Caribbean later, and crossing the Isthmus of Panama before its completion. The existence of West African Sicydium species, i.e. at an intermediate geographic location between the South Indian Ocean and the Caribbean, would appear to support the hypothesis for this route. Subsequent faunal exchanges of tropical organisms between the Indian and Atlantic oceans through the Cape of Good Hope were strongly curtailed after the late Pliocene (2.5 Myrs) by the establishment of the Benguela current. This current forms a cold water barrier along the Atlantic Coast of Southern Africa (Rocha et al. 2005) and prevents dispersal. After the appearance of this current, the migration route was cut.
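The timing part of the argument for the three routes can be sketched as a simple comparison of ages. The minimal Python example below uses only the dates quoted in the text (closure of the Tethys Sea at c. 20–30 Myrs, completion of the Isthmus of Panama at 3–4 Myrs, establishment of the Benguela current at 2.5 Myrs, and diversification of the Sicydiinae between 5.9 and 12.1 Myrs). The names `GATEWAY_CLOSURE_MYR` and `timing_allows` are hypothetical, introduced here for illustration only. The check is a necessary condition, not a sufficient one: it says nothing about currents, the EPB or the availability of islands, which are the grounds on which the Panama route is rejected above.

```python
# Ages are in Myrs before present; a larger number means further in the past.
# All values are the ones quoted in the text; names are illustrative assumptions.

# (oldest, youngest) estimate of the date at which each passage closed.
GATEWAY_CLOSURE_MYR = {
    "Tethys Sea": (30.0, 20.0),         # closure in the Oligocene
    "Panama": (4.0, 3.0),               # completion of the Isthmus of Panama
    "Cape of Good Hope": (2.5, 2.5),    # establishment of the Benguela current
}

# Oldest age at which Sicydiinae (and hence clade D ancestors) could have existed.
SICYDIINAE_OLDEST_MYR = 12.1


def timing_allows(lineage_oldest_myr: float, closure_youngest_myr: float) -> bool:
    """Return True if the lineage could have existed while the passage was open.

    The lineage exists at all ages younger than its origin, and the passage is
    open at all ages older than its closure, so the two periods overlap only
    if the lineage is at least as old as the closure.
    """
    return lineage_oldest_myr >= closure_youngest_myr


feasible = {
    name: timing_allows(SICYDIINAE_OLDEST_MYR, closure[1])
    for name, closure in GATEWAY_CLOSURE_MYR.items()
}

for name, ok in feasible.items():
    print(f"{name}: {'compatible with' if ok else 'excluded by'} the dating")
```

On dating alone, only the Tethys Sea route is excluded; the choice between the Panama and Cape of Good Hope routes rests on the current and habitat arguments developed in the text.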

Fig. 5 Hypothetical migration route via the Cape of Good Hope for the Sicyopterus/Sicydium clade

Based on our timing of the divergences and our reconstructed ancestral localities, we propose the following evolutionary scenario for the speciation and dispersal of freshwater gobies of the Sicyopterus and Sicydium clades. At the Miocene—Pliocene boundary, the ancestral population originated in the western Pacific and split into two groups. The Sicyopterus group dispersed in the Indo-Pacific. The future Sicydium group dispersed westward throughout the Indian Ocean with the equatorial current along Madagascar, Comoros, Mascarenes and the eastern margin of South Africa, crossing the Cape of Good Hope with the Agulhas current and reaching the volcanic islands of the Gulf of Guinea in West Africa by the South Atlantic current before the appearance of the Benguela current. The volcanic islands of Bioko, Principe, Sao Tomé and Annobon constitute the oceanic sector of the ‘Cameroon line’ and formed during the Miocene (Harrison 1993). They are almost the only available tropical islands in this area that offer the distinctive habitats necessary for Sicydiinae. At a later point in time, some Sicydium reached the Caribbean Sea with the south equatorial current. Among those, some dispersed later through the Isthmus of Panama to colonise the western coast of Central America and could not cross the Pacific from east to west partly due to the size of the Pacific Ocean and/or the total absence of islands in this part of the Ocean (EPB).

Since the late Eocene (35–32 Myrs), the course of westward flowing currents in the mid-Atlantic and Caribbean area has been greatly affected by the appearance and disappearance of various land barriers. Thus, the Circumtropical Current, which originally passed into the western Pacific, was temporarily disrupted in the latest Eocene and early Oligocene and was permanently disrupted by the completion of the Isthmus of Panama in the late Pliocene (Iturralde-Vinent and Mac Phee 1999). Once the closure of the Panamanian waterway was complete in the Pliocene (3–4 Myrs), the Circumtropical Current was replaced by the Caribbean Current, fed directly by the Atlantic Equatorial Current. Surface currents are now mostly directed towards the northwest (Iturralde-Vinent and Mac Phee 1999).

The case of the other Sicydiinae genera

In the Pacific Ocean, the Sicydiinae have largely diversified, and there are now many genera (6) and species (60–70), similarly to the diversification observed in this region for other faunistic groups. Indeed, the Pacific area contains thousands of marine species, which make this ecosystem the most diverse among marine habitats. Such a high diversity is not uniformly distributed but shows a gradient of species richness that extends from the highly diverse centre of the Indonesian-Philippines area to the surrounding archipelagos lying further East (Briggs 1974). The reasons for the present-day distribution are complex and are the subject of fascinating studies, many of which are fairly recent and ongoing (Planes and Galzin 1997; Keith et al. 2005a).

In the Pacific Ocean, the divergence dates found by our study are congruent with the biodiversity of each genus of Sicydiinae, and with a gradient of species richness (Fig. 1). Sicyopterus and Stiphodon, which separated from their sister genera around 4 Myrs ago, comprise nearly 25 species each, more than the other, more recent (1–3 Myrs) genera, which have between 2 and 12 species. We have not been able to resolve the position of each clade within the Sicydiinae subfamily; adding molecular markers may help resolve the topology within this subfamily.

The various hypotheses put forward to explain biogeographical patterns in the Pacific Ocean (Rosen 1988) break down into three major theories (Palumbi 1996b; Planes and Galzin 1997): centre of origin, centre of accumulation and centre of overlap. Each suggests a mechanism that could have led to the high diversity in the Indonesian-Philippines area. Many studies on corals, seagrasses, molluscs and fish (Springer 1982; Rosen 1988; Wallace et al. 1989; Mukai 1992; Pandolfi 1992; Stoddart 1992; Veron 1995; Planes and Galzin 1997; Keith et al. 2005a) have investigated species’ distribution and support one or the other of the theories. More studies on Sicydiinae will have to be undertaken in order to support one of these theories for this subfamily.

Conclusion

The present analysis of all but one of the known Sicydiinae genera is the first work of this kind carried out on this group. It is of major importance for our understanding of the origin and the diversification of the Sicydiinae subfamily, which has colonised all the insular systems of the Indo-Pacific and Caribbean.

The outgroups used in the current study do not challenge the monophyly of the Sicydiinae, now composed of 10 genera. The monophyly of Sicydiinae, their diversity as well as the existence of a pelagic larval duration which is about twice as long as for reef fish (Lord et al. 2010) provide us with the opportunity to explore the aptitude of this group to disperse widely, a phenomenon which probably occurred several times during the group’s history.

As a result of the molecular dating performed here, a hypothesis concerning the dispersal route for the Sicyopterus/Sicydium clade may be put forward: the common ancestor would have appeared in the Pacific Ocean, dispersing first to the Indian Ocean and then to the Atlantic Ocean via the Cape of Good Hope.

This study needs to be developed further in order to confirm the phylogenetic relationships, reconstruct the history of the group and support our preliminary hypothesis concerning the dispersal route. West African species of Sicydium, and the only genus not included in this study, Parasicydium, need to be sampled and included. Moreover, more molecular markers are needed to explore the relationships between the clades defined in the present study, as these deeper relationships are neither consistently recovered nor well supported. However, the clades themselves give important information, improving our understanding of Sicydiinae evolution and history.