Introduction

The Mugilidae family, commonly referred to as grey mullets, is distributed worldwide, inhabiting coastal and brackish waters of all temperate and tropical regions of the globe. According to Nelson (1994), the family was previously classified in the order Perciformes but is now considered the sole representative of the order Mugiliformes. In the latest taxonomic revision made by Thomson (1997), the Mugilidae family seems to include 14 genera and a total of 64 valid species, most of them classified in the genera Mugil and Liza. Despite all these major revisions (Nelson 1994; Thomson 1981, 1997), the systematic status of some species and genera within the family is still confused (Rossi et al. 1998).

Six species of the Mugilidae family can be found in the Mediterranean Sea: Mugil cephalus, Chelon labrosus, Oedalechilus labeo, Liza saliens, Liza aurata, and Liza ramada (Tortonese 1975). Three other species have been recently reported in the same region: Liza carinata, which invaded the eastern Mediterranean from the Red Sea through the Suez Canal (Thomson 1997); Mugil soiuy, which was initially introduced into the Azov Sea and has since been reported in the Black Sea and more recently in the Aegean Sea (Kaya et al. 1998); and finally Liza abu, which migrated into the Mediterranean through channels from the Orontes River (Yalcin-Ozdilek 2004).

Elucidation of the taxonomic status of the Mugilidae family has been attempted at various levels by several investigators (e.g., Schultz 1946; Trewavas and Ingham 1972; Thomson 1981, 1997; Harrison and Howes 1991). These studies were traditionally based on the use of morphological characters such as the pharyngobranchial organ or the presence of an adipose eyelid, among other features, as key characters. The results obtained were in most cases controversial, failing to provide any conclusive answers. In fact, most members of the family display a general morphological uniformity, which as a consequence restricts the number of suitable characters that can be used to answer phylogenetic questions unambiguously. As a result, the phylogenetic status of the Mugilidae family remains particularly obscure, especially at the interspecific level (Stiassny 1993; Rossi et al. 1998).

More recently the phylogenetic relationships of grey mullets have been investigated with the use of nonmorphological characters, employing biochemical and nucleic acid markers. In addition, Delgado et al. (1992), Rossi et al. (1996, 1997, 2000), Gornung et al. (2001, 2004), and Nirchio et al. (2003) analyzed several mullet species cytogenetically to detect karyotypic markers that could be integrated into a consistent evolutionary pattern and thus provide useful phylogenetic information. Allozymic and mtDNA studies have also been conducted to provide more evidence regarding the phylogenetic relationships of these species. Those different approaches agree to some extent, especially regarding the position of M. cephalus with respect to the other species, but there are important issues that need further clarification. Studies based on allozyme electrophoresis of Mediterranean mullets (Autem and Bonhomme 1980; Papasotiropoulos et al. 2001; Rossi et al. 2004; Turan et al. 2005) provided conflicting results not only among themselves but also compared with other studies based on mtDNA data (Caldara et al. 1996; Papasotiropoulos et al. 2002; Murgia et al. 2002; Rossi et al. 2004). The major issue concerns the phylogenetic relationships among the Liza species and particularly the existence of a common Liza-Chelon clade, which if proven will bring into question the monophyletic origin of the genus Liza. The systematic classification of these two genera has been the subject of a long-running debate. Schultz (1946) did not recognize the genus Liza and included within Chelon all the species later reported as Liza by Thomson (1997), who considered Chelon a valid genus, derived from Liza.

In the current study three congeneric species (L. saliens, L. aurata, L. ramada) and two noncongeneric species (M. cephalus and C. labrosus) of Mediterranean grey mullets were investigated using partial sequences of three mtDNA segments. Mitochondrial DNA markers have been successfully used to decipher evolutionary relationships at multiple taxonomic levels among different organisms (Stepien and Kocher 1997). With the use of mtDNA sequence analysis, we aimed to shed more light on the evolutionary history of the Mugilidae family and more specifically on the existing debate regarding the phylogenetic relationships among the Chelon and Liza species. Furthermore, we aimed to compare our data with those of others and to evaluate the effect of different data sets or different methodologies on the same problems.

Materials and Methods

Sampling and DNA Sequencing

Four to six specimens from each of the five species of the Mugilidae family were collected from the Messolongi Lagoon in Greece. The liver from each individual was dissected, placed in liquid nitrogen, and stored at −80°C. Total DNA was isolated according to the protocol included in the DNeasy Tissue kit (Qiagen, Valencia) and examined through agarose gel electrophoresis.

Fragments of the three mtDNA genes were amplified by PCR using a PTC-100 (MJ-Research) thermocycler. PCR reactions were carried out in 50 μL volumes (2 U Promega Taq polymerase, 5 μL 10× Promega PCR buffer, 0.2–0.5 mM dNTP, 1.5 mM MgCl2, approximately 100–200 ng template DNA, and 15 pmoles of each primer, filled to 50 μL with water). For the PCR amplifications of the three mtDNA segments, three different sets of primers were used. For the 12s rRNA as well as the 16s rRNA genes we used the universal primers 12SAL-12SBH and 16SARL-SBRH, respectively, described by Palumbi et al. (1991). For the CO I segment we used the primers described by Normark et al. (1991). The thermocycler program for each mtDNA segment was as follows: one preliminary denaturation step at 95°C for 3 min followed by 35 PCR cycles, each one with strand denaturation at 94°C (1 min), annealing at 57°C (12s rRNA) or 51°C (16s rRNA and CO I) for 1 min, and primer extension at 72°C (1.5 min). A final extension step was performed at 72°C for 5 min. The PCR products were purified using the QIAquick PCR purification kit (Qiagen) according to the supplier’s protocol. Double-strand DNA sequencing reactions were prepared with the Sequenase kit (Version 2.0 U.S. Biochemical), and sequencing was performed using the same primers as the PCR reactions. For verification purposes, some samples were also sequenced by MWG Biotech (Germany).

The sequences obtained have been deposited in GenBank with accession nos. EF 437062 to EF 437095. Amino acid CO I sequences were translated from nucleotide sequences applying the vertebrate mitochondrial genetic code.
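The translation step can be illustrated with a minimal Python sketch. It assumes the vertebrate mitochondrial code is NCBI translation table 2, which differs from the standard code at four codons (TGA codes for Trp, ATA for Met, and AGA and AGG are stops). The function name and the toy sequence are hypothetical and are not taken from the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Vertebrate mitochondrial code (NCBI translation table 2): four codons are reassigned.
VERT_MITO = dict(STANDARD)
VERT_MITO.update({"TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"})


def translate_vert_mito(seq, frame=0):
    """Translate a nucleotide string; codons with gaps or ambiguity codes become 'X'."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(VERT_MITO.get(codon, "X") for codon in codons)


# Toy sequence (hypothetical, not from the study): ATG ATA TGA AGA
print(translate_vert_mito("ATGATATGAAGA"))
```

Under the standard code the same toy sequence would read Met-Ile-stop-Arg, which is why the choice of genetic code matters when counting amino acid replacements in CO I.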

Data and Phylogenetic Analysis

The same type of analysis was applied to all sets of sequences. Multiple sequence alignment was performed with the Clustal W suite (Thompson et al. 1994) included in BioEdit version 5.0.9 (Hall 1999). Gaps were not taken into account for the statistical analysis. The numbers of transitions (TIs) and transversions (TVs) occurring among each pairwise combination of individual sequences, including the outgroup, were plotted against pairwise (p) genetic distance in order to evaluate possible mutation saturation (following Lydeard et al. 1996). To evaluate the homogeneity of nucleotide composition, we calculated the base compositional bias according to Irwin et al. (1991). Values of C (compositional bias) were estimated for the two rRNA genes separately as well as for each codon position (first, second, and third) of CO I. The nonsynonymous vs. synonymous substitution ratio for the CO I gene was compared within the five Mugilid species studied, and the amino acid differences per site (KA) were plotted against the synonymous differences per site (KS).
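As an illustration of the compositional bias statistic, the sketch below assumes the commonly cited form of the Irwin et al. (1991) index, C = (2/3) Σ |c_i − 0.25|, where c_i is the frequency of base i. C is 0 for equal base frequencies and 1 when a single base is used exclusively. The function names are hypothetical, and the sketch is not the code used in the study.

```python
from collections import Counter


def compositional_bias(seq):
    """Return C = (2/3) * sum(|c_i - 0.25|) over A, C, G, T, ignoring gaps and ambiguity codes."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (2.0 / 3.0) * sum(abs(counts[base] / total - 0.25) for base in "ACGT")


def codon_position_bias(coding_seq):
    """Return C for the first, second, and third codon positions of an in-frame sequence."""
    return tuple(compositional_bias(coding_seq[pos::3]) for pos in range(3))


# Toy in-frame sequence (hypothetical): every codon is ACG,
# so each codon position uses a single base and has maximal bias.
print(codon_position_bias("ACGACGACGACG"))
```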

To determine whether we could use the combined data sets in the phylogenetic analysis, we performed a partition homogeneity test using PAUP* version 4.0b10 (Swofford 2002). Pairwise genetic distances were estimated using MEGA (Kumar et al. 2004) and the Kimura 2-parameter model (Kimura 1980) for each mtDNA segment as well as for the combined data set.
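For reference, the Kimura 2-parameter distance between two aligned sequences is d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites showing transitional and transversional differences. The sketch below is a minimal pairwise implementation under the assumption that sites with gaps or ambiguity codes are dropped for that pair. It is not MEGA's code, and the function name and toy sequences are hypothetical.

```python
import math

PURINES = frozenset("AG")


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (pairwise deletion)
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical): one transition in ten sites.
print(round(100 * k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 2))
```

The corrected distance always exceeds the uncorrected p-distance for the same pair, because it accounts for multiple substitutions at the same site.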

The phylogenetic analyses used were neighbor-joining and Bayesian inference. Neighbor-joining trees (Saitou and Nei 1987) were generated using MEGA version 3.1, and confidence in the nodes was evaluated by 10,000 bootstrap pseudoreplicates (Felsenstein 1985). A Bayesian analysis was performed with the program MrBayes 3.1 (Ronquist and Huelsenbeck 2003) using the partitioned data set; in each gene partition, the parameters of the substitution model suggested by Modeltest 3.7 (Posada and Crandall 1998) were applied according to the Akaike Information Criterion (Akaike 1974). The number of generations was set to 1 × 10^6. The average standard deviation of split frequencies of the two simultaneous and independent runs performed by MrBayes 3.1 reached stationarity well before 2 × 10^5 generations. A tree was sampled every hundredth generation; consequently, the summaries of the Bayesian inference relied on 20,000 samples (from 2 runs). From each run 7,501 samples were used, and 2,499 were discarded as the burn-in phase. From the remaining 15,002 trees, a consensus tree was constructed. Support of the nodes was assessed with the posterior probabilities of reconstructed clades as estimated by MrBayes 3.1 (Ronquist and Huelsenbeck 2003).
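The bootstrap procedure of Felsenstein (1985) builds each pseudoreplicate by resampling alignment columns with replacement, so that every replicate has the same length as the original alignment. The sketch below shows only this resampling step, with the tree building applied to each replicate omitted. It is an illustration under these assumptions rather than MEGA's implementation, and the taxon names and sequences are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement; alignment maps taxon name -> aligned sequence."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("sequences must be aligned to equal length")
    # Draw the same column indices for every taxon so that columns stay intact.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Toy alignment (hypothetical sequences).
toy = {"taxon_A": "ACGTACGTAC", "taxon_B": "ACGTTCGTAC", "taxon_C": "ACCTTCGAAC"}
rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy, rng) for _ in range(100)]
print(len(replicates), len(replicates[0]["taxon_A"]))
```

In a full analysis a tree would be inferred from each replicate, and the bootstrap value of a node would be the percentage of replicate trees that contain it.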

For the construction of the phylogenetic trees, sequences of Gambusia affinis (accession no. AP004422) were used as outgroups to root the trees.

Results

The sequencing of the three mtDNA genes studied produced an alignment of 367 bp for 12s rRNA and fragments of 586 bp and 478 bp for the 16s rRNA and CO I genes, respectively. As a result, a combined data set of 1,431 nucleotide sites was obtained, including alignment gaps that nevertheless were omitted from the phylogenetic analysis. The number of variable sites ranged from 56 (15.3%) for 12s rRNA and 78 (13.3%) for 16s rRNA to 119 (24.9%) for CO I. As expected, most of the CO I variable sites occurred at the third codon position (113 vs. 0 at the second and 6 at the first). In three instances the nucleotide substitution led to amino acid replacement. No nucleotide compositional bias was evident in either 12s or 16s rRNA genes. In contrast, a slight bias toward G at the first codon position and, more significantly, toward T at the second and C at the third codon position was observed in the CO I gene. In addition, an anti-G bias was also detected at the second and third codon positions, a general feature of the mtDNA genes encoded on the H strand. As a result, the compositional bias was high at the second and third codon positions (0.245 and 0.238, respectively) but considerably lower at the first codon position (0.1) (data not shown).

In total, from the combined data set, 253 sites out of 1,431 varied among the different species. Among the taxa examined, small size differences were revealed in both 12s and 16s rRNA genes, which were probably due to small deletions or insertions. The most apparent one was observed in the 16s rRNA gene, where three small insertions were detected in M. cephalus, resulting in a larger fragment, 586 bp instead of 564 bp in the other taxa.
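The site counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the per-segment counts and alignment lengths given in the text; the variable names are arbitrary.

```python
# (variable sites, aligned length in bp) for each mtDNA segment, as reported in the text.
segments = {"12s rRNA": (56, 367), "16s rRNA": (78, 586), "CO I": (119, 478)}

percent_variable = {
    name: round(100 * variable / length, 1)
    for name, (variable, length) in segments.items()
}
total_variable = sum(variable for variable, _ in segments.values())
total_sites = sum(length for _, length in segments.values())

print(percent_variable)
print(total_variable, total_sites)
```

The per-segment percentages and the combined totals (253 variable sites out of 1,431) agree with the values stated in the text.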

Among the five species studied, 10 different haplotypes were revealed for 12s rRNA, 11 for 16s rRNA, and 13 for CO I genes. The sequences from all haplotypes have been deposited in GenBank (EF 437062 to EF 437095).

The scatter plots of transitions and transversions against genetic distances for each pairwise comparison revealed that in the 12s rRNA gene, transitions started to become saturated when sequence divergence was near 12%; for CO I, saturation occurred when divergence was about 17%. There was no saturation detected for the 16s rRNA gene. As deduced from the sequence analysis, the vast majority of nucleotide substitutions occurred between M. cephalus and the other species in all three genes studied. In the 16s rRNA gene, the lowest ratio of transitions to transversions was found between M. cephalus and C. labrosus (1.5), and the highest was between L. aurata and L. saliens (15). In CO I the lowest ratio was found between M. cephalus and L. aurata (2.36) and the highest between L. saliens and L. ramada (6.16). Surprisingly, the same situation did not occur in the 12s rRNA gene, which had the lowest ratio between C. labrosus and L. aurata (1.1) and the highest between C. labrosus and L. ramada (5.4).
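The points of such a saturation plot can be generated by counting transitions and transversions for every pair of sequences and pairing the counts with the uncorrected p-distance. The following sketch assumes that sites with gaps or ambiguity codes are skipped for a given pair. The function names, taxon labels, and sequences are hypothetical, and plotting is left out.

```python
from itertools import combinations

PURINES = frozenset("AG")


def ti_tv_p(seq1, seq2):
    """Return (transitions, transversions, p-distance) for two aligned sequences."""
    ti = tv = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            ti += 1  # change within purines or within pyrimidines
        else:
            tv += 1  # change between a purine and a pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    return ti, tv, (ti + tv) / compared


def saturation_points(alignment):
    """One (name1, name2, ti, tv, p) tuple per pair of sequences in the alignment."""
    return [
        (n1, n2) + ti_tv_p(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    ]


# Toy alignment (hypothetical sequences).
toy = {"hap_1": "AACCGGTT", "hap_2": "GACTGCTA", "hap_3": "AACCGGTA"}
for point in saturation_points(toy):
    print(point)
```

Saturation shows up when the transition counts stop increasing with p-distance while transversions continue to accumulate; the transition/transversion ratio for a pair is simply ti / tv when tv is nonzero.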

The rate of synonymous vs. nonsynonymous substitutions within the Mugilidae taxa showed a linear relationship with a high correlation coefficient (0.97).

For each mtDNA segment analyzed, the genetic distances based on the Kimura 2-parameter model among the species studied are presented in Tables 1, 2, and 3. The highest genetic divergence values among the three mtDNA segments studied were observed in CO I. In the 12s rRNA gene the highest divergence value was observed between M. cephalus A and L. aurata A (14.9%), and the lowest was between C. labrosus A and L. saliens A (1.11%). In the 16s rRNA gene the highest divergence value was observed between M. cephalus C and L. aurata B (13.38%), and the lowest was between L. saliens A and L. ramada A (1.26%). Finally, in CO I the highest value was observed between M. cephalus C and L. aurata B (21.98%), and the lowest was between C. labrosus A and L. aurata D (7.79%). Since the results of the partition homogeneity test allowed us to combine the data sets, we calculated the genetic distances based on the combined data (Table 4); G. affinis was not taken into account for the calculation of distances. The divergence values estimated among the five species ranged from 3.9% to 16.25%. The lowest values observed were between L. aurata and C. labrosus haplotypes; the highest values were between M. cephalus and L. aurata. Within the genus Liza the values were about 4.9%.

Table 1 Kimura 2-parameter (Kimura 1980) distances (×100) calculated for 12s rRNA segment among haplotypes of five species
Table 2 Kimura 2-parameter (Kimura 1980) distances (×100) calculated for 16s rRNA segment among haplotypes of five species
Table 3 Kimura 2-parameter (Kimura 1980) distances (×100) calculated for CO I segment among haplotypes of five species
Table 4 Kimura 2-parameter (Kimura 1980) distances (×100) calculated (net between-group averages) for the combined data set among the five species

The best-fit models selected by Modeltest 3.7 (Posada and Crandall 1998) were K81uf + I for the 12s rRNA gene (base frequencies: A 0.3126, C 0.2540, G 0.2017, T 0.2316; proportion of invariable sites 0.6715, and equal rates for all sites), GTR + I for 16s rRNA (A 0.2976, C 0.2593, G 0.2119, T 0.2312; proportion of invariable sites 0.4676, and equal rates for all sites), and HKY + I + G for CO I (A 0.2493, C 0.2927, G 0.1771, T 0.2809, ti/tv ratio = 5.7603; proportion of invariable sites 0.5205, and gamma distribution shape parameter = 0.7071).

The trees drawn by both the neighbor-joining and Bayesian analyses exhibited exactly the same topology (Fig. 1). According to this topology M. cephalus forms the most distinct clade, emphasized also by the length of the branch. The remaining taxa are all grouped in a single cluster, L. saliens being the sister group of the two other Liza species and C. labrosus. Both phylogenetic reconstructions suggested that L. ramada forms a cluster with L. aurata and C. labrosus, which are grouped together, being the closest taxa. This clustering brings into question the monophyletic origin of the genus Liza. Note that all specimens always clustered with the typical haplotype of their species. The suggested topology is supported by high nodal support values, especially the posterior probabilities displayed in the Bayesian tree.

Fig. 1 Neighbor-joining and Bayesian inference tree (50% majority rule consensus tree) for the five species studied. The two trees exhibit the same topology. Numbers (NJ/BI) indicate the percentage of 10,000 bootstrap replicates at each node in the majority rule consensus tree

Discussion

This paper investigates phylogenetic relationships among five species of the Mugilidae family based on partial sequencing of the mitochondrial 12s rRNA, 16s rRNA, and CO I genes. Sequence analysis of the 12s rRNA gene revealed 10 different haplotypes among the five species; 11 haplotypes were found for 16s rRNA and 13 for CO I. The proportion of conserved sites among the three DNA segments was 84.6% for 12s rRNA, 86.5% for 16s rRNA, and 75.1% for CO I. This probably reflects the slightly slower evolutionary rate of the two rRNA genes compared to CO I, as the selective pressure exerted on them has conserved more of their sequence among the five species (Brown et al. 1982). We have not detected any nucleotide compositional bias in either the 12s or 16s rRNA gene. There was significant bias, however, in the second and third codon position of the CO I gene. Compositional bias could have two origins, selection or mutation pressure (Sueoka 1988). A consequence independent of the origin of the bias is that the transition/transversion ratio will vary at different positions due to differences in the selection pressure (Irwin et al. 1991). As deduced from the sequence analysis, the vast majority of the nucleotide substitutions occurred between M. cephalus and all the other species in all three genes studied. In both the 16s rRNA and CO I genes, the lowest ratio of transitions to transversions was found between M. cephalus and the other species studied. It is generally believed that the ratio of transitional mutations to transversions in the mtDNA molecule decreases toward higher taxonomic levels because undetectable multiple transitional events at more variable sites accumulate as a function of time since divergence (Brown et al. 1982). 
The generally low ratio of transitions/transversions found for CO I (especially at the third codon position) might suggest a relative saturation of the CO I sites, as was also shown in the scatter plot of transitions and pairwise genetic distance. This situation could affect the accuracy of the analysis used. In order to examine this possibility we performed an additional phylogenetic analysis based on the CO I segment by excluding those sites. The results obtained were similar to those described below.

Another notable outcome of this study concerns the tempo of amino acid replacement in the CO I region. The sequence, which spanned 159 amino acid residues, was highly conserved among the five species. The only amino acid replacements occurred in M. cephalus (2) and in C. labrosus haplotype b (1). This is in accordance with previous studies (Kocher et al. 1989) that inferred an exceptionally low rate of amino acid replacement in the fish lineage, only about a fifth as rapid as in mammals or birds. In congruence with the above, the rate of synonymous vs. nonsynonymous substitutions within the Mugilidae taxa (calculated to test whether these species are subject to different selective constraints) showed a linear relationship among them, suggesting that differential selective pressures acting on CO I of the grey mullets considered in this work should be excluded (Adachi et al. 1993).

The levels of divergence estimated among the five species using the combined data sets ranged from 3.9% to 16.25%. These values are in general agreement with those reported by Billington and Hebert (1991), as well as with those proposed by Gonzalez-Villasenor and Powers (1990) for marine species. Moreover, the level of nucleotide divergence observed among the three Liza species (∼4.9%) is in congruence with that proposed by Avise et al. (1987) and Moritz et al. (1987) among congeneric species.

The highest degree of genetic divergence was estimated between M. cephalus and all the other species. This could be the result of the faster substitution rate observed in this species, and it could be explained as a combined effect of nucleotide bias (especially on the third codon position of CO I) and saturation of signal (Martin 1995). This observation is in agreement with previous studies by our group (Papasotiropoulos et al. 2001, 2002) using allozyme and PCR-RFLP analysis, as well as with the studies performed by Caldara et al. (1996), Murgia et al. (2002), Rossi et al. (2004), and Turan et al. (2005), among others, supporting the hypothesis that M. cephalus is the most genetically distinct species belonging to a different lineage that excludes the genera Liza and Chelon. This hypothesis is also supported by chromosome studies by Cataudella et al. (1974) and more recently by Rossi et al. (1997) and Gornung et al. (2001), who stated that the M. cephalus karyotype is considered closest to the karyotype described by Ohno (1974) as ancestral for all teleosts. The karyoevolutive pattern proposed by the above-mentioned investigators suggests that the karyotypes of the species belonging to the genera Liza and Chelon might have derived through a translocation event from an ancestral karyotype similar to the one found in M. cephalus.

The large genetic divergence observed between M. cephalus and the other grey mullets is in contrast with their high morphological similarity, although such contradictions are often present in the literature (Patterson et al. 1993). The lack of parallel evolution between morphology and some portions of DNA has already been reported for other groups of fish (e.g., Meyer et al. 1990) and might be explained by differences in the selective constraints operating on these two characters (Caldara et al. 1996). At the same time, we did not detect any appreciable degree of genetic differentiation between Liza and Chelon species. This was not supported by our previous study (Papasotiropoulos et al. 2001) or by Autem and Bonhomme (1980) based on allozymic data. Other studies, however, based on mtDNA and allozyme data (e.g., Caldara et al. 1996; Rossi et al. 2004; Turan et al. 2005) seem to agree with our current findings, leaving room for reconsidering the modern systematic classification of those species.

The issue of whether to combine data sets in phylogenetic analysis remains a subject of much debate, and so far no clear consensus has emerged. In most cases, combined analyses are more likely to recover a phylogenetic tree close or identical to the true tree, if for no other reason than the amount of information available to infer a phylogenetic tree is maximized (Smith et al. 2003). In addition, all the genes used here were mitochondrial, and as such they presumably share the same evolutionary history. This was also confirmed by the partition homogeneity test (Farris et al. 1995) performed using PAUP, which showed that a similar phylogenetic signal was inferred among the three data sets used.

The N-J method and the Bayesian analysis used for the reconstruction of phylogeny produced identical results, both agreeing on a single tree. This agreement suggests that our data may be relatively free of positively misleading systematic biases and increases our confidence in the recovered topology (Funk et al. 1995). In addition, the high nodal support values (bootstrap values in the N-J tree and, especially, posterior probabilities in the Bayesian tree) provide good support at all nodes. Hillis and Bull (1993) report that bootstrap values of at least 70% reflect a probability of 95% that the node is real. Both the N-J and Bayesian topologies agree that M. cephalus falls into a completely separate phylogenetic branch, being the sister group to all other Mediterranean species studied here. This is in agreement with our previous studies based on allozyme and PCR-RFLP data (Papasotiropoulos et al. 2001, 2002), as well as with the phylogenetic trees produced by other investigators (e.g., Caldara et al. 1996; Murgia et al. 2002). Moreover, Turan et al. (2005), who studied another Mugil species, M. soiuy, reported that these two species are clustered together in a separate lineage, supporting the monophyletic status of this genus, clearly isolated from the other genera.

As regards the Liza-Chelon clade, the sub-branching pattern within this clade remains problematic. As in our previous study (Papasotiropoulos et al. 2002), there appears again a reduced interspecific differentiation between Chelon and Liza species, which is now leading to the grouping of C. labrosus with L. aurata. On the other hand, L. ramada seems to be the sister group to the C. labrosus-L. aurata lineage, while L. saliens is more distant from those three and closer to M. cephalus. This new suggested topology, with the formation of a C. labrosus-L. aurata cluster (supported by high bootstrap values), is different from that supported by our previous study. This difference, as well as any others observed between the two studies, could be simply attributed to the different methodologies followed and consequently to the greater resolution provided by nucleotide sequence compared to RFLP analysis.

In addition, discrepancies in the phylogenetic relationships within the Liza-Chelon clade between our study and those of Caldara et al. (1996) and Turan et al. (2005) may be due to the different genetic systems used (mtDNA vs. allozymes), to the different mtDNA segments studied, or to geographic differentiation of the species under study. It is also worth mentioning at this point that Rossi et al. (2004), who studied a small portion of the 16s rRNA gene, reached the same conclusions that we did regarding the phylogeny of grey mullets.

This study and most of the previous ones (with the exception of those by Autem and Bonhomme 1980 and Papasotiropoulos et al. 2001, which were based on allozyme data) are consistent in not identifying the three Mediterranean Liza species as a monophyletic group. In the same manner, Harrison and Howes (1991), who analyzed the pharyngobranchial organ of Mugilid species, stated that Liza could be a nonmonophyletic assemblage. Moreover, recent cytogenetic evidence provided by Rossi et al. (1997) and Gornung et al. (2001, 2004) underlines the high cytoevolutive proximity between the genera Liza and Chelon, suggesting also that L. saliens may be the most primitive species in the Liza-Chelon series, having also played a role in the lineage split of the genus Liza.

In this work we surveyed 1,431 bp from three different mtDNA portions, which is considered a fragment large enough to resolve phylogenies. We conclude from our data that M. cephalus seems to be the most divergent species and probably is the common ancestor to all the species studied here, which is congruent with the results obtained in previous studies performed by our group and by other investigators. In addition, there are strong indications that the genus Liza might not be of monophyletic origin and that the genus Chelon could be included, resulting in a single taxon, as it was before Thomson’s (1997) major revision. According to Senou et al. (1996), the synonymization of these two genera could be considered, but as Rossi et al. (2004) point out, there are more than 20 species classified in the genus Liza, as well as the remaining species of the genus Chelon, which have not yet been included in any phylogenetic study. Thus, any conclusions would be subject to dispute. Therefore, more studies are needed that integrate not only more species but also different genetic data (nuclear vs. mtDNA), in order to determine whether these two genera can be synonymized or represent different clades.