INTRODUCTION

The family of true flounders, Pleuronectidae, which is the main subject of this article, is one of the largest in the order Pleuronectiformes and includes 59 nominal species of right-sided flounders distributed in marine waters of the Northern Hemisphere [1, 2]. In their analysis, J. Cooper and F. Chapleau [1] treated Pleuronectidae as a monophyletic taxon on the basis of ten morphological synapomorphies. This important result is generally consistent with the branch topology of the family established in several molecular phylogenetic studies [3–10]. According to [1], the family includes the subfamilies Hippoglossinae, Eopsettinae, Lyopsettinae, Hippoglossoidinae, and Pleuronectinae, whose genera usually comprise species of high commercial value (for example, species of the genus of halibut-like flounders, Hippoglossoides). Given the fishery importance of these and other flounders and the need to manage such valuable resources, both the precise identification of specimens and species within a genus and the systematic relationships among taxa across the family are very important.

Taxonomic studies of the Pleuronectidae have traditionally been based on morphological characters, as follows from the above paragraph. However, clear evidence of character homology is often lacking even at low taxonomic levels (within a genus), so the taxonomic and phylogenetic relationships postulated for many groups of flounders are not always convincing when they rest on morphology alone. Several classifications of flounders have been proposed by different authors [1, 11–13]. The phylogenetic relationships of flounders inferred from morphological and from molecular genetic data also disagree in places [1, 4, 14, 15]. The development of new DNA-based nuclear and mitochondrial markers makes it possible to better identify morphologically similar fish species [16], including many species of flounders. It is therefore relevant to search for new, or known but insufficiently developed, molecular markers for reconstructing gene trees, as well as combined or species phylogenetic trees, for flounders of the Pleuronectidae family.

Taking the above into account, in this study we present a comparative analysis of partial nucleotide sequences (hereinafter referred to as sequences) of the 16S rRNA gene for 14 species of Pleuronectidae. This marker has not previously been used on such a scale for flounders. Our aims were to assess the success of taxonomic identification of specimens and to establish phylogenetic and taxonomic relationships in this family. The novelty of the study lies in the fact that the works cited above did not examine the systematics of this group on the basis of 16S rRNA. Accordingly, in this article we evaluate the potential of this marker, on a sufficient number of specimens, for taxonomic and evolutionary genetic studies of flounders of the Russian Federation.

MATERIALS AND METHODS

A total of 62 sequences of 16S rRNA and, additionally, 24 sequences of the genes 16S rRNA, Co-1, and Cyt-b for 14 species belonging to seven genera of the Pleuronectidae family were analyzed. Latin names are given in accordance with the classification in [1]. Samples (2–5 specimens of muscle tissue fixed in 95% ethanol) were taken from the existing collection of the Laboratory of Molecular Systematics; the voucher specimens of the fish are kept at the museum of the A.V. Zhirmunsky National Scientific Center of Marine Biology. DNA was isolated using commercial kits (DNA Extran-2, Syntol, Russia).

A fragment of the 16S rRNA gene was amplified by polymerase chain reaction (PCR) using primers 16Sbr-H and 16Sar-L. PCR was carried out in a 25-μL reaction volume containing the following: distilled deionized water, 17.8 μL; dNTP (ZAO Evrogen, Moscow, Russia), 0.5 μL; 5× Buffer (Evrogen), 5 μL; primers at a concentration of 10 μM/μL, 0.3 μL each; Taq polymerase, 0.1 μL. The following thermal program was used: denaturation at 93°C for 1 min, annealing at 55°C for 1 min, and elongation at 72°C for 1 min, for 33 cycles. To determine the nucleotide sequences, PCR products (DNA specimens) were subjected to cycle sequencing using the BrightDye Terminator Cycle Sequencing Kit according to the following program: denaturation at 96°C for 10 s, annealing at 45°C for 10 s, and elongation at 60°C for 2 min.
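As a quick arithmetic check on the reaction mix above, the listed component volumes can be summed and compared with the stated 25 μL. The sketch below does only that. The dictionary keys and variable names are illustrative, and the text does not say what makes up any remainder (template DNA is one plausible reading, but that is an assumption).

```python
from decimal import Decimal

# Component volumes in microlitres, exactly as listed in the text.
components = {
    "water": Decimal("17.8"),
    "dNTP": Decimal("0.5"),
    "5x buffer": Decimal("5"),
    "primer 16Sbr-H": Decimal("0.3"),
    "primer 16Sar-L": Decimal("0.3"),
    "Taq polymerase": Decimal("0.1"),
}

total_volume = Decimal("25")           # stated reaction volume
listed = sum(components.values())      # volume accounted for by the list
remainder = total_volume - listed      # not itemised in the text

# Final buffer strength: a 5x stock diluted into the full reaction volume.
buffer_final_x = Decimal("5") * components["5x buffer"] / total_volume

print(f"listed: {listed} uL, remainder: {remainder} uL, buffer: {buffer_final_x}x")
```

With these figures the listed components total 24.0 μL, leaving 1.0 μL not itemised, and the 5× buffer is diluted to 1×.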

Sequences of both DNA strands of the 16S rRNA gene were obtained for each DNA specimen. These were then assembled into a consensus sequence for each specimen (individual) using the Geneious software package (free trial version) [17].
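The idea of combining the two strand reads into one consensus can be illustrated with a deliberately simplified sketch. It is not the Geneious procedure: it assumes the forward and reverse reads have already been trimmed to exactly the same region, takes no account of base quality, and marks any disagreement as N. The function names are hypothetical.

```python
# Complement table for unambiguous bases plus N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def simple_consensus(forward, reverse):
    """Combine a forward read with a reverse-strand read of the same region.

    Assumption: both reads cover the same positions, so after reverse
    complementing the reverse read they can be compared site by site.
    Sites where the two strands disagree are reported as 'N'.
    """
    rc = reverse_complement(reverse)
    fwd = forward.upper()
    if len(fwd) != len(rc):
        raise ValueError("reads must cover the same region (equal length)")
    return "".join(f if f == r else "N" for f, r in zip(fwd, rc))


if __name__ == "__main__":
    print(simple_consensus("ACGTTG", "CAACGT"))  # strands agree
    print(simple_consensus("ACGTTG", "CAACGA"))  # one conflicting site
```

A real assembly also has to handle reads of different length and quality, which is the reason the ends near the primers are often trimmed.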

Since the length of the obtained sequences varied considerably, from 355 base pairs (bp) to 642 bp, two sets of sequences were compiled for more accurate further analysis. One set contained all the sequences obtained, while the other included only the longest sequences. As a result, set 1 contained 62 sequences and set 2 contained 27 sequences. After alignment and removal of gaps (indels), the sequence length in the two sets was 291 and 617 bp, respectively.

Sequence alignment for all taxa was performed in the MEGA-X software package (http://megasoftware.net/) [18] using its integrated ClustalW module [19]. The gap opening and gap extension penalties were set at 15.0 and 5.0, respectively (default values were used for the other alignment settings). After the first alignment step, large gaps were removed manually, and the final alignment in the second step was done with reduced penalties (5.0 and 0.5, respectively). All remaining gaps were then again removed manually.
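The gap removal step can be pictured as dropping alignment columns. The sketch below assumes that "removal of gaps" means deleting every column in which at least one sequence has a gap character (complete deletion); the article does not spell this out, and the function name and the toy input are hypothetical.

```python
def strip_gap_columns(aligned, gap_chars="-"):
    """Remove every alignment column that contains a gap in any sequence.

    `aligned` maps a sequence name to its aligned string; all strings
    must have the same length.
    """
    names = list(aligned)
    seqs = [aligned[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # Keep only the columns in which no sequence carries a gap character.
    kept = [col for col in zip(*seqs) if not any(c in gap_chars for c in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


if __name__ == "__main__":
    toy = {"specimen_a": "AC-GT", "specimen_b": "ACTG-"}
    print(strip_gap_columns(toy))
```

Under complete deletion, the final alignment length is set by the columns shared by all sequences, which is why a set containing the shortest reads ends up shorter than a set of only the longest ones.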

To increase the information content, the analysis included, in addition to the 16S rRNA gene, sequences of the genes Co-1 and Cyt-b previously used in [10]. Taken together, these data constituted the third set of sequences, comprising 24 specimens with a total length of 2926 bp.

For further analysis of the sequences and construction of gene trees, the optimal model of nucleotide substitution was determined for each set of sequences. The best-fitting evolutionary model was selected using the corresponding module of the MEGA program. For the set of 16S rRNA with short sequences (291 bp), the best model was K2P + G (Kimura two-parameter model with gamma-distributed substitution rates) [20]; for the set of this gene with long sequences (617 bp), the best model was JC + G (Jukes–Cantor model with gamma-distributed substitution rates) [21]; for the set of sequences of three genes, the best model was HKY + G (Hasegawa–Kishino–Yano model with gamma-distributed substitution rates) [22].
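To make the difference between two of the named models concrete, the following sketch computes pairwise distances under the plain Jukes–Cantor and Kimura two-parameter formulas. It is an illustration only: the gamma correction (+G) selected in the study is not included, sites with ambiguous characters are simply skipped, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_fractions(s1, s2):
    """Return (P, Q): proportions of transitional and transversional sites."""
    transitions = transversions = sites = 0
    for a, b in zip(s1.upper(), s2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions / sites, transversions / sites


def jc_distance(s1, s2):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3), p = P + Q."""
    p_trans, q_transv = transition_transversion_fractions(s1, s2)
    p = p_trans + q_transv
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(s1, s2):
    """Kimura 2-parameter distance: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    p_trans, q_transv = transition_transversion_fractions(s1, s2)
    return -0.5 * math.log(1 - 2 * p_trans - q_transv) - 0.25 * math.log(1 - 2 * q_transv)


if __name__ == "__main__":
    a, b = "AAAAAAAAAA", "AAAAAAAAAG"  # one transition in ten sites
    print(jc_distance(a, b), k2p_distance(a, b))
```

The two formulas coincide only when transitions and transversions occur in the proportions that the Jukes–Cantor model expects; K2P treats the two classes separately.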

Gene trees were constructed using four reconstruction methods: Bayesian Analysis (BA), Maximum Likelihood (ML), Neighbor Joining (NJ), and Minimum Evolution (ME). They were implemented in MrBayes 3.2.7 (http://nbisweden.github.io/MrBayes/download.html) [23, 24] and MEGA-X [18]. The BA tree reconstruction was run for one million generations (n = 10⁶). Node support in the other three reconstructions (ML, NJ, and ME) was assessed with k = 1000 bootstrap replicates.
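The bootstrap step mentioned above rests on resampling alignment columns with replacement. The sketch below shows only that resampling, not tree building, and is not the MEGA-X implementation; the function name, the seed argument, and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignments(aligned, replicates=1000, seed=0):
    """Yield pseudo-replicate alignments by resampling columns with replacement.

    `aligned` maps sequence names to aligned strings of equal length.
    Each replicate has the same length as the original alignment, and
    every column is copied whole, so homologous sites stay together.
    """
    rng = random.Random(seed)
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(aligned[name]) != length for name in names):
        raise ValueError("aligned sequences must all have the same length")
    for _ in range(replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(aligned[name][c] for c in columns) for name in names}


if __name__ == "__main__":
    toy = {"specimen_a": "ACGT", "specimen_b": "ACGA"}
    for replicate in bootstrap_alignments(toy, replicates=3, seed=1):
        print(replicate)
```

A tree is then built from each replicate, and the support for a node is the percentage of replicate trees in which that grouping appears.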

Trees were rooted using the sequence of Platichthys stellatus as the outgroup; according to data for the complete mitogenome (mtDNA), this species was previously assigned to an external branch in the Pleuronectidae family [10]. Phylogenetic trees were visualized and, if necessary, edited using the FigTree software [25] and MEGA-X [18].

All 16S rRNA sequences obtained were deposited in GenBank (https://www.ncbi.nlm.nih.gov/); some previously unpublished sequences of Co-1 and Cyt-b are also included in the article (Table 1).

Table 1. List of species and their GenBank accession numbers

Statistical analysis of the nucleotide composition was performed using the MEGA-X software package. In addition, one-way analysis of variance (ANOVA) of the nucleotide composition was conducted separately for each gene using the Statistica 6 software package [29].

RESULTS

The analysis was based on the sequences of the three data sets described above. The full length of the segment of gene 16S rRNA (full-length sequences “from primer to primer”) is 596–631 bp. The GenBank accession numbers of the full-length regions of gene 16S rRNA are as follows: MN888911, MN888895, MN888908, MN888877, MN888901, MN888894, MN888893, MN888892, MN888917, MN888916, MN888915, MN888903, MN888924, MN888918, MN888912, MN888904, MN888905, MN888868, MN888873, MN888898, MN888899, MN888902, MN888927, MN888909, MN888907, MN888884, MN888883, MN888876.

However, not all of the sequences obtained reached full length. The differences in sequence length are due to poor-quality sequencing of some specimens, which led to more extensive trimming of the regions near the primers when the consensus sequences were formed. Such sequencing problems may be explained by the fact that some of the tissue specimens were stored for several years before analysis. Nevertheless, short fragments are not necessarily unsuitable for assessing variability in closely related taxa and for comparing their degree of similarity or difference in gene tree reconstruction. For this reason, the material was divided for analysis, as explained above, into two sets: all sequences at a short common length (set 1) and only the longest sequences (set 2). In accordance with the outlined approaches (see the Materials and Methods section), four types of trees were constructed: BA, ML, NJ, and ME.

Analysis of All Sequences of 16S rRNA

Figure 1 shows a rooted ML tree derived from the set of 16S rRNA gene sequences 291 bp in length. Node support values are listed in the following order: BA/ME/NJ/ML.

Fig. 1. Rooted gene tree showing phylogenetic relationships based on 62 short nucleotide sequences of 16S rRNA. The topology is presented on the basis of ML reconstruction. The nodes give support values for four tree reconstruction methods in the following order: BA/ME/NJ/ML. For the BA tree, posterior probabilities (%, n = 10⁶ generations) are shown, and bootstrap supports are given for the other three reconstructions (k = 1000 replicates).

The branch with specimens of Limanda sakhalinensis fell within the subfamily Hippoglossoidinae, forming a separate, topologically unresolved branch (node) together with Cleisthenes pinetorum of the same subfamily. A separate, also unresolved branch was formed by representatives of three nominal species of halibut-like flounders of the genus Hippoglossoides. For Lepidopsetta mochigarei, a representative of the tribe Microstomini, the topology of all five corresponding branches on the tree is unresolved. A separate node on the tree is formed by representatives of the Pacific halibut Hippoglossus stenolepis from the subfamily Hippoglossinae.

Analysis of Longer Sequences of 16S rRNA

Figure 2 shows a rooted NJ tree derived from the set of 16S rRNA gene sequences 617 bp in length. The Limanda sakhalinensis branch is included in the subfamily Hippoglossoidinae and is located in the same cluster as Cleisthenes pinetorum. The sequence representing the subfamily Hippoglossinae forms a separate node. Lepidopsetta mochigarei, as in the short set, forms an unresolved node, but topologically it belongs to the tribe Microstomini.

Fig. 2. Rooted NJ tree topology showing phylogenetic relationships based on data from 27 sequences of a region of gene 16S rRNA. The nodes give the support values for the BA tree (n = 10⁶ generations) and the three other reconstructions in the following general order: BA/ME/ML/NJ (k = 1000 bootstrap replicates for the last three reconstruction methods).

Analysis of the Reconstruction of Gene Trees from the Combined Sequences of Three Genes

For this analysis, the aligned sequences of the 16S rRNA gene region were analyzed jointly with those of the genes Co-1 and Cyt-b. The sequences were concatenated using a dedicated utility of the MEGA-X software package and then subjected to further analysis. According to the data obtained, the branches of both Limanda sakhalinensis and Cleisthenes pinetorum are included in the subfamily Hippoglossoidinae (Fig. 3).
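Concatenation of per-gene alignments into one combined matrix can be sketched as follows. This is not the MEGA-X utility: the function name, the data layout, and the rule of keeping only specimens present for every gene are assumptions made for illustration.

```python
def concatenate_genes(gene_alignments, gene_order):
    """Join per-gene alignments end to end for specimens shared by all genes.

    `gene_alignments` maps a gene name to {specimen: aligned sequence}.
    Returns the combined alignment and a list of (gene, start, end)
    partitions with 1-based, inclusive coordinates.
    """
    shared = set.intersection(*(set(gene_alignments[g]) for g in gene_order))
    partitions = []
    start = 1
    for gene in gene_order:
        lengths = {len(gene_alignments[gene][s]) for s in shared}
        if len(lengths) > 1:
            raise ValueError(f"sequences of {gene} are not aligned to one length")
        length = lengths.pop() if lengths else 0
        partitions.append((gene, start, start + length - 1))
        start += length
    combined = {
        s: "".join(gene_alignments[g][s] for g in gene_order) for s in sorted(shared)
    }
    return combined, partitions


if __name__ == "__main__":
    toy = {
        "16S": {"sp1": "ACGT", "sp2": "ACGA", "sp3": "AAAA"},
        "Co-1": {"sp1": "GG", "sp2": "GC"},
        "Cyt-b": {"sp1": "TTT", "sp2": "TTA"},
    }
    print(concatenate_genes(toy, ["16S", "Co-1", "Cyt-b"]))
```

Keeping the partition coordinates is useful when a later step needs to treat the rRNA gene and the protein-coding genes separately.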

Fig. 3. Rooted ML tree showing phylogenetic relationships based on data from 24 concatenated sequences of genes 16S rRNA, Co-1, and Cyt-b. The nodes give the support values for the BA tree (n = 10⁶ generations) and for the three other reconstructions in the following general order: BA/NJ/ME/ML (k = 1000 bootstrap replicates for the last three reconstruction methods).

Nucleotide Composition

The ratio of pyrimidines (T, C) to purines (A, G) in genes 16S rRNA, Co-1, and Cyt-b deviated from 50 : 50 (Appendix, Fig. 4). In sequences of 16S rRNA, the difference between pyrimidines and purines is not large, but the nucleotide composition is heterogeneous overall, with a predominance of C and A nucleotides (Fig. 4a). For Co-1 and Cyt-b, the ratio of pyrimidines to purines deviates statistically significantly, with a predominance of pyrimidines (Figs. 4b, 4c).

Fig. 4. Average values of the composition (%) of four nucleotides in the 24 studied sequences of genes 16S rRNA (a), Co-1 (b), and Cyt-b (c), according to the results of one-way analysis of variance (ANOVA). The vertical lines represent the 95% confidence interval.

ANOVA for each gene showed that the differences among the four nucleotides were statistically significant: for 16S rRNA, F = 2147.9, d.f. = 3; 92, P < 0.0001; for Co-1, F = 3673.3, d.f. = 3; 92, P < 0.0001; for Cyt-b, F = 4320.7, d.f. = 3; 92, P < 0.0001. The ratio (T + C) : (A + G) for 16S rRNA, Co-1, and Cyt-b was 45.4 : 54.6, 56.5 : 43.5, and 61.1 : 38.9 (%), respectively.
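The quantities reported here can be reproduced in outline with a short sketch: per-sequence nucleotide percentages, the pyrimidine-to-purine split, and a one-way ANOVA F statistic across the four nucleotide groups. It is not the Statistica procedure, and the function names are hypothetical. With 24 sequences and four nucleotides, the degrees of freedom are 4 − 1 = 3 and 96 − 4 = 92, which matches the reported d.f. = 3; 92.

```python
def composition(seq):
    """Percentage of A, C, G, and T among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    return {b: 100.0 * bases.count(b) / len(bases) for b in "ACGT"}


def pyrimidine_purine_split(seq):
    """Return ((T + C) %, (A + G) %) for one sequence."""
    comp = composition(seq)
    return comp["T"] + comp["C"], comp["A"] + comp["G"]


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def nucleotide_anova(sequences):
    """One-way ANOVA with the four nucleotides as groups, one value per sequence."""
    comps = [composition(s) for s in sequences]
    return one_way_anova([[c[b] for c in comps] for b in "ACGT"])


if __name__ == "__main__":
    print(pyrimidine_purine_split("AACCGT"))
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Note that this design treats the four percentages from one sequence as independent observations, although they sum to 100; the sketch simply mirrors the grouping implied by the reported degrees of freedom.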

DISCUSSION

As noted in the Introduction, the largest subfamily in the family is Pleuronectinae. This subfamily is represented by two tribes, Microstomini and Pleuronectini. According to the 16S rRNA data obtained, they do not form monophyletic branches (see Figs. 1, 2). Thus, the taxonomy at the subfamily level needs further clarification. In particular, to obtain a stronger phylogenetic signal, the number of both nuclear and mitochondrial markers in the study should be increased. This should help reduce the number of topologically unresolved nodes in the resulting trees.

As noted above, J. Cooper and F. Chapleau [1], in their revision of this family based on traditional morphological characters, substantiated that Pleuronectidae is a monophyletic group. The monophyly of flounders established by the classical approach agrees in many cases with molecular phylogenetic reconstructions of this family based on such markers as 12S rRNA and 16S rRNA, the genes Co-1 and Cyt-b [4–6, 8, 9], and the complete mitogenome [10].

In the most representative data set of this work, the subfamily Hippoglossoidinae was represented by two of its three genera, Cleisthenes (C. pinetorum) and Hippoglossoides (H. dubius Schmidt, 1904, H. elassodon Jordan & Gilbert, 1880, H. robustus Gill & Townsend, 1897) (Fig. 3). Species of the genus Hippoglossoides form a mixed cluster on the BA tree (Fig. 3). On this basis, it can be assumed that the two taxa H. elassodon and H. robustus are synonyms of a single species. The synonymy of H. elassodon and H. robustus has already been proposed on the basis of morphological and molecular phylogenetic data [4, 27–31]. By the principle of priority, the species-rank taxon H. elassodon Jordan & Gilbert, 1880 can be accepted as valid, and H. robustus Gill & Townsend, 1897 can be considered its junior synonym. K.A. Vinnikov et al. [31], in their combined analysis (morphology + genetics), likewise propose this synonymy. However, the synonymy has never been formally adopted: in the databases, these two taxa still appear as separate species. One of the aims of this article is to draw attention to this issue so that it can finally be resolved in a dedicated publication.

The data for the genus Limanda deserve a separate discussion. As noted in the Results and shown in Figs. 1–3, sequences of Limanda sakhalinensis fall within the branch of the subfamily Hippoglossoidinae. In the comparative anatomical study by J. Cooper and F. Chapleau [1], monophyly of this genus was not confirmed. In our study, as in previous molecular phylogenetic studies [4, 5, 10], Limanda sakhalinensis Hubbs, 1915 was included in the subfamily Hippoglossoidinae. Considering all these data, it is appropriate to recommend revising the position of Limanda sakhalinensis Hubbs, 1915 and moving it to the genus Hippoglossoides, under the name Hippoglossoides (Limanda) sakhalinensis, within the subfamily Hippoglossoidinae. Accordingly, the morphology, species characteristics, and diagnostic keys need to be revised, which is planned as a separate study.

According to our data, the genus Lepidopsetta (L. mochigarei) was included in the tribe Pleuronectini, subfamily Pleuronectinae, while in [1] this genus was considered exclusively within the tribe Microstomini of the subfamily Pleuronectinae. In molecular phylogenetic studies based on Co-1 and Cyt-b [4, 5, 10], the genus Lepidopsetta was considered within the tribe Pleuronectini. Thus, it is preferable to consider the genus Lepidopsetta as part of the tribe Pleuronectini. However, given the weak topological signal for this branch from the 16S rRNA marker in this work, the question requires further refinement using a larger number of genes.

Deviation of the ratio (T + C) : (A + G) from equality is well described in the literature for many protein-coding genes [4, 32]. The presented analysis (Fig. 4, Appendix) shows that the bias in the ratio of purines to pyrimidines for the genes Co-1 and Cyt-b differs significantly from that for the 16S rRNA gene. Apparently, the shift in nucleotide composition detected for the protein-coding genes studied in this work reflects the hydrophobic properties of the proteins they encode [33]. Identifying the cause of the heterogeneity of nucleotide composition in the 16S rRNA sequences requires further research.