Introduction

Solea senegalensis is a flatfish species belonging to the order Pleuronectiformes, which comprises about 570 species. It has been identified as a target species for diversification in marine aquaculture because of its growth rate and flesh quality (Imsland et al. 2004). Although significant advances have been achieved in recent years in the procedures for larval rearing and ongrowing, the sexual dysfunction of males reared in captivity remains a major bottleneck for the expansion of the aquaculture industry, limiting the establishment of commercial breeding programs (Guzmán et al. 2009).

Until now, among fish species, master sex-determining genes have been described only in Oryzias latipes (in which the male-determining gene is DMY, also known as Dmrt1bY) and in Oncorhynchus mykiss (in which the sex-determining gene is sdY); in flatfish they are still unknown. The transcription factors of the doublesex and mab-3-related (dmrt) family are involved in gonad development and share a common DNA-binding domain (the DM domain). These factors show little DNA sequence conservation and are responsible for sexual dimorphism in diverse organisms (Kopp 2012). In particular, the dmrt1 gene is considered the first conserved gene in the sex determination/sexual differentiation cascade among several phyla (Marchand et al. 2000).

The genetic determination of sex is often associated with the formation of sex chromosomes. These chromosomes are usually genetically degenerate and have a high content of repetitive DNA, which makes it difficult to analyze both their genetic content and their gene organization (Cioffi et al. 2010). In teleosts, both heterogametic systems, the XX/XY and ZZ/ZW types, have been reported. Notably, these sex chromosomes are mostly homomorphic, without morphological differentiation, which could explain the existence of different sex chromosome systems even among closely related species (Mank et al. 2006; Mank and Avise 2009). From the genetic perspective, fish provide a paradigmatic example because their sex determination mechanisms range from environmental to various modes of genetic determination. The evolutionary significance of this remarkable plasticity is unknown (Heule et al. 2014).

Cynoglossus semilaevis is a flatfish species closely related to Senegalese sole whose genome has been recently described (Chen et al. 2014). This species possesses heteromorphic sex chromosomes with a ZZ/ZW sex determination system (Chen et al. 2007). In contrast, S. senegalensis lacks heteromorphic sex chromosomes and a putative XX/XY determination system has been proposed (Molina-Luzón et al. 2014). Interestingly, in Scophthalmus rhombus, an ancient XX/XY system that changed to a ZZ/ZW mechanism in S. maximus has been reported (Haffray et al. 2009; Taboada et al. 2014a). Although a major sex-determining region has been described in S. maximus, and several candidate genes related to sex determination and gonad differentiation have been mapped close to that region (Viñas et al. 2012), no heteromorphisms of sex chromosomes have been found in that species. In all these flatfish species, both genetic and environmental influences on sex determination have been reported.

In aquaculture, it is essential to unravel the sex-determining mechanisms, particularly for those species with sexual dimorphism in traits such as growth. This is the case for some flatfish species such as Hippoglossus hippoglossus, C. semilaevis, and S. senegalensis, in which females grow faster than males (Tvedt et al. 2006; Shao et al. 2010; Sánchez et al. 2010). Strategies to produce all-female stocks are based on a clear and independent identification of genetic and phenotypic sexes (Chen et al. 2009). Therefore, the identification of genetic markers linked to sex determination has become a major issue for producing monosex stocks.

Integrated genetic maps are a powerful tool for both genetic and evolutionary studies. Maps bring together data from gene sequencing and physical mapping on chromosomes. At present, bacterial artificial chromosomes (BACs) are the main tool for building physical maps and analyzing gene synteny, due to their stability and simplicity of handling (Cação et al. 2013). In recent years, improvements in the resolution and accuracy of the fluorescence in situ hybridization (FISH) technique have converted this technique into an indispensable tool for filling the gaps in genome sequencing projects (Gan et al. 2012). Moreover, it has established itself as a method for assembling high-resolution physical maps (Greulich-Bode et al. 2008). Together, BAC-FISH and next-generation sequencing (NGS) represent an efficient approach for anchoring genomic and linkage sequence data onto physical chromosomes (García-Cegarra et al. 2013).

The aim of this paper is to provide new information about the chromosome structure and arrangement of certain genes involved in sex determination and sexual differentiation processes in S. senegalensis, which may be of relevance for improving the commercial production of this species. Moreover, the results of this study, together with those previously published (Ponce et al. 2011; García-Cegarra et al. 2013), will enable us to build an integrated and updated genetic map. This information could not only help us to understand developmental and evolutionary mechanisms in vertebrates but also contribute to improving the production of target species for aquaculture.

Material and methods

BAC library

A BAC library was constructed using S. senegalensis larvae before mouth opening (3 days after hatching) as starting material. Larvae were washed with DEPC water, frozen in liquid nitrogen, and kept at −80 °C until use. High molecular weight genomic DNA was isolated, digested with Bam HI, cloned into the CopyControl pCC1 BAC (Epicentre Biotechnologies, Madison, USA), and transformed into the host cell DH10B (Invitrogen, Life Technologies, Carlsbad, California, USA). The final BAC library comprised 29,184 positive clones distributed in 384-well plates (76 plates in total). Approximately 99.99 % of the clones contained nuclear DNA inserts (average size, 285 kb).
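The library figures above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that every well of the 76 plates holds one clone and that all inserts have the reported mean size. The genome size needed to express coverage is not given in this paper, so it is left as a user-supplied argument (`genome_size_mb` is a hypothetical parameter name).

```python
# Figures taken from the library description above.
N_PLATES = 76
WELLS_PER_PLATE = 384
MEAN_INSERT_KB = 285


def library_size(n_plates=N_PLATES, wells=WELLS_PER_PLATE):
    """Number of clones, assuming every well is occupied by one clone."""
    return n_plates * wells


def total_insert_mb(n_clones, mean_insert_kb=MEAN_INSERT_KB):
    """Total cloned DNA in megabases (1 Mb = 1000 kb)."""
    return n_clones * mean_insert_kb / 1000


def fold_coverage(n_clones, genome_size_mb, mean_insert_kb=MEAN_INSERT_KB):
    """Approximate genome coverage; genome_size_mb is supplied by the user."""
    return total_insert_mb(n_clones, mean_insert_kb) / genome_size_mb


clones = library_size()
print(clones)                   # 76 plates x 384 wells
print(total_insert_mb(clones))  # total cloned DNA in Mb
```

The product of plates and wells reproduces the 29,184 clones reported for the library.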

PCR screening of the S. senegalensis BAC library

To find and isolate BAC clones bearing targeted gene sequences, the 4D-PCR method was carried out (Asakawa et al. 1997). Briefly, plates were pooled in four dimensions that were used as template DNA. The first and second dimensions identified the plate in which the targeted BAC clone was located. The third and fourth dimensions provided the information about the well coordinates.
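The decoding step of a four-dimensional pooling scheme can be sketched as follows. This is a minimal illustration, not the layout used by Asakawa et al. (1997) or in this study: it assumes that the 76 plates are arranged in a hypothetical 9 × 9 grid (so that dimensions 1 and 2 together give the plate number), and that dimensions 3 and 4 correspond to the 16 rows (A to P) and 24 columns of a 384-well plate. The function name `locate_clone` and the constant `GRID` are invented for the example.

```python
import string

N_PLATES = 76                        # from the library description
ROWS = string.ascii_uppercase[:16]   # rows A-P of a 384-well plate
N_COLS = 24                          # columns 1-24 of a 384-well plate
GRID = 9                             # hypothetical: plates laid out in a 9 x 9 grid


def locate_clone(dim1, dim2, dim3, dim4):
    """Turn the four positive pool indices (0-based) into (plate, well).

    dim1, dim2 -> plate (grid row and grid column of the plate)
    dim3, dim4 -> well row and well column within that plate
    """
    if not (0 <= dim1 < GRID and 0 <= dim2 < GRID):
        raise ValueError("plate pool index out of range")
    plate = dim1 * GRID + dim2 + 1
    if plate > N_PLATES:
        raise ValueError("pool combination does not map to an existing plate")
    if not (0 <= dim3 < len(ROWS) and 0 <= dim4 < N_COLS):
        raise ValueError("well pool index out of range")
    return plate, f"{ROWS[dim3]}{dim4 + 1}"


print(locate_clone(1, 1, 14, 19))
```

With this assumed layout, four PCR-positive pools are enough to single out one well among the 29,184 in the library, which is the point of pooling in several dimensions.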

The following sex-determining candidate genes were chosen: members of the Sry-related high mobility group box (sox) family, particularly sox3, sox6, sox8, and sox9; members of the dmrt family, particularly dmrt1, dmrt2, and dmrt4; cytochrome P450 aromatase 19a (cyp 19a1a); anti-Mullerian hormone (amh); follicle-stimulating hormone (fshb); luteinizing hormone (lh); nanos 3 (nanos3); and ATP-dependent RNA helicase DEAD box protein 4 (vasa). Specific primers (see Supplementary Material 1) were designed using template sequences from the SoleaDB (Benzekri et al. 2014) and from orthologous sequences of different fish species available in the ENSEMBL database. The fshb primers and PCR conditions were the same as those described in García-Cegarra et al. (2013).

BAC clone sequencing and bioinformatic analysis

BAC clones identified by the 4D-PCR method were checked by PCR using specific gene primers, followed by Sanger sequencing. Validated clones were sequenced by 454 Roche Technology. BAC clones were isolated using the Large-Construct Kit (Qiagen, Hilden, Germany), then digested and separated with the restriction endonucleases Hae II and Rsa I. The fragments generated were ligated to AP11/12 adapters using T4 DNA ligase, and were pre-amplified using the single primer AP11 and the Elongase Enzyme Mix (Invitrogen Life Technologies, Carlsbad, California, USA), according to the supplier’s recommendations. Pre-amplified products were purified, cloned, and sequenced by the same procedure. The sequencing quality was assessed by the analysis of various parameters, such as the number of reads, the average size of reads, the total length sequenced, the total number of contigs assembled, and the number of large contigs (more than 500 bp in length). The level of Escherichia coli contamination and the N50 value were also evaluated. Contig N50 is a weighted median statistic such that 50 % of the entire assembly is contained in contigs or scaffolds equal to or larger than this value.
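The N50 definition given above can be written as a short function. The sketch follows the usual convention (sort contigs from longest to shortest and return the length at which the running total first reaches half of the assembly); the contig lengths in the example are invented for illustration and are not data from this study.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least 50 % of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Invented example: total = 32 bp, the two 8-bp contigs already cover half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```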

The functional and structural annotations of the gene sequences identified in each BAC were carried out in a semi-automated process. Protein and EST sequences from S. senegalensis and related species were compared. The homologous sequences obtained were used to get the best predictions for gene annotation. Finally, all available information was used to create plausible models and, when possible, functional information was added. Using the Apollo genome editor (Lewis et al. 2002), Signal map software (Roche Applied Science, Penzberg, Germany), and Geneious basic 5.6.5 (http://www.geneious.com/), the results were individually completed and adjusted in the final editing step of the annotation.

Cross-species genome comparisons were carried out at two levels. At the first level, a micro-synteny study was performed using the Genomicus (Louis et al. 2015) platform, which takes the genome information from the ENSEMBL database. For this micro-synteny analysis, the species Gasterosteus aculeatus was used as reference genome, with the exception of BAC2K18, for which the reference species selected was Danio rerio, because the lhb gene could not be found in G. aculeatus. The order of the contigs within each BAC of S. senegalensis was estimated using the information provided by the Genomicus program. In the schematic figures of the Genomicus program, the blocks that appear colorless correspond to genes that are not represented in the reference species (G. aculeatus) in the analyzed region. In addition, an orthology comparison was performed between C. semilaevis and S. senegalensis using MAFFT alignment (Katoh and Toh 2008) between protein sequences of the candidate genes, in order to confirm that ortholog genes were compared. At the second level, a synteny analysis was performed using the Circos software (Krzywinski et al. 2009); the Circos program provides an efficient and scalable way to illustrate relationships between genomic positions, and the elements of the image allow the rearrangement to be easily understood. The Circos diagram facilitates the visualization of genome similarity between two species. Thus, the thinner the lines that appear in the diagram, the more chromosome rearrangements that have occurred between the species and, consequently, the greater the genetic distance between the species. The species used in this analysis were available at ENSEMBL: Tetraodon nigroviridis, D. rerio, O. latipes and G. aculeatus. In order to compare S. senegalensis with the closely related C. semilaevis, a cytogenetic map of this species was produced with the Map Viewer tool, using public genome data available at NCBI.

mFISH analysis

Chromosome preparations

Chromosome preparations were made from S. senegalensis larvae (age 1–3 days after hatching). The specimens were pre-treated with 0.02 % colchicine for 3 h to accumulate a larger number of metaphase cells. They were then subjected to hypotonic shock with KCl (0.4 %), and finally fixed in a freshly prepared solution of absolute ethanol-acetic acid (3:1) (Carnoy solution). Larvae were homogenized in Carnoy, and the preparations were then dropped onto wet slides and placed on a hot plate with damp paper to create the necessary moisture for a good spread of the chromosomes.

FISH probes

To prepare FISH probes, BAC clones were grown on LB containing chloramphenicol at 37 °C overnight. BAC-DNA was extracted using the BACMAX DNA purification kit (Epicentre Biotechnologies, Madison, USA), following the manufacturer’s instructions. The insert was extracted by digestion with Eco RI and analyzed by agarose gel electrophoresis (0.8 %). The probes were amplified by DOP-PCR and then labeled by a conventional PCR using four different fluorochromes, i.e. Texas Red (Life Technologies, Carlsbad, California, USA), Spectrum Orange, Fluorescein isothiocyanate (FITC) (Abbott Molecular/ENZO, Illinois, USA), and diethylaminocoumarin (DEAC) (Vysis, Downers Grove, USA), using the protocol described in Liehr (2009). Finally, the probes were precipitated using a protocol with NaAc and ethanol. In addition to the BAC clones considered in this study, the BAC clones studied in previous works (Ponce et al. 2011; García-Cegarra et al. 2013) were also included as FISH probes.

Hybridization and post-hybridization washes

For hybridization, chromosome preparations were pre-treated with pepsin solution at 37 °C and fixed with paraformaldehyde solution. Finally, the preparations were dehydrated with an ethanol series of 70, 90, and 100 %, and air-dried before hybridization. Hybridization was carried out by denaturation of the probes and chromosome preparations in parallel, following the protocol described by Liehr (2009) with some modifications. These modifications involved the labeling of the probes by DOP-PCR instead of nick translation, and the use of sonicated genomic DNA from S. senegalensis as blocking DNA instead of the human-COT1 (Invitrogen, Life Technologies, Carlsbad, California, USA).

The post-hybridization treatment consisted of serial washes of SSC, Tween20 (Panreac, Barcelona, Spain), and PBS. The preparations were then dehydrated with ethanol and counterstained with antifade-DAPI solution (VectorLabs, Burlingame, California, USA). Hybridization images were obtained with a digital CCD camera (Olympus DP70) coupled to a fluorescence microscope (Olympus BX51 and/or Zeiss Axioplan using software of MetaSystems, Altlussheim, Germany).

Phylogenetic analysis

The protein sequences of ten candidate genes (amh, cyp19a1a, dmrt2, dmrt3, dmrt4, lhb, nanos3, sox3, sox6, and vasa) were concatenated to carry out the phylogenetic analysis. Twenty-one vertebrate species were included to generate the phylogenetic tree, including S. senegalensis (Supplementary Material 2). Additionally, the arthropod Drosophila melanogaster was included to root the tree. The sequence alignment was performed with the MAFFT tool (Katoh and Toh 2008) using an iterative method. The PhyML 3.0 program (Guindon et al. 2010) was used to determine the best-fit phylogenetic model and then to run the model. The best-fit model was JTT, considering a proportion of invariable sites (+I), gamma distribution (+G), and heterogeneous frequencies (+F). The statistic used for model selection was the Akaike information criterion (AIC), the value of which was 239,159.76, and the log-likelihood (lnL) was −119,521.53. Branch support was tested by the fast likelihood-based method using aLRT SH-like (Anisimova et al. 2011). Finally, the tree was edited in the MEGA6 program (Tamura et al. 2013).
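The concatenation step can be illustrated with a small sketch. It assumes that each gene has already been aligned separately (for example with MAFFT) and is available as a mapping from species name to aligned sequence; the function name `concatenate_alignments` and the toy sequences are hypothetical. Padding a species that lacks a gene with gaps is an assumption of this sketch, since the text does not state how missing sequences were handled.

```python
def concatenate_alignments(alignments, gene_order):
    """Join per-gene protein alignments into one supermatrix.

    alignments: dict mapping gene name -> {species: aligned sequence}
    gene_order: list of gene names giving the concatenation order
    Species missing from a gene are padded with gaps (assumption).
    """
    species = sorted({sp for aln in alignments.values() for sp in aln})
    parts = {sp: [] for sp in species}
    for gene in gene_order:
        aln = alignments[gene]
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned")
        width = widths.pop()
        for sp in species:
            parts[sp].append(aln.get(sp, "-" * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy input, not real sequences.
toy = {
    "amh": {"species_a": "MK-", "species_b": "MKL"},
    "sox3": {"species_a": "AA"},
}
print(concatenate_alignments(toy, ["amh", "sox3"]))
```

Keeping the gene order explicit makes it possible to record partition boundaries if a per-gene model is wanted later.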

Results

Sequence and micro-synteny analysis

A total of 93 genes were annotated on 13 BAC clones (Table 1). The candidate genes were detected in 11 of the 13 BAC clones. The dmrt1 gene (BAC11O20) and the sox8 gene (BAC10K23) were partially sequenced by Sanger technology (acc. no. KT724725 and KT724726, respectively). Assembled BAC contigs were deposited in the GenBank database (NCBI) under accession numbers AC270096 to AC270104 and AC270124 to AC270125. The complete names of the annotated genes can be found in Supplementary Material 3. Contamination with E. coli was less than 4 %, indicating satisfactory BAC isolation and library preparation (Supplementary Material 4). Only one BAC (BAC6P22) showed a higher contamination level (16 %); for this clone, only 11 of the 63 assembled contigs were considered for annotation, on the basis of their similarity to eukaryotic sequences. The N50 values ranged from 2505 to 47,966 bp (mean 23,201 ± 13,858). The results showed that dmrt2 and dmrt3 were co-localized within the same BAC (BAC16E16), although assembled in different contigs, whereas dmrt1 was found within BAC11O20 (Table 1). The dmrt4 gene (also named dmrta1) was isolated from a different BAC (BAC21O23), and two genes (dmrt4 and fabp2) were annotated in the only useful contig obtained (Table 1). The sequencing results indicated that the sox8 and sox9 genes are not linked in S. senegalensis.

Table 1 Name of the BACs studied, candidate gene used for 4D-PCR, and genes annotated within each BAC (see Supplementary Material 1 for full name of the genes)

The micro-synteny analysis showed that most, but not all, of the candidate genes had a similar genomic organization in S. senegalensis and other teleosts (Supplementary Material 5). However, differences were observed in some cases in the regions upstream and/or downstream of the candidate gene, especially between distantly related species. For the dmrt family, micro-synteny analysis confirmed that the dmrt1-dmrt3-dmrt2 organization was preserved among teleosts, including the closely related species C. semilaevis, in which these genes were located on the Z sex chromosome (Supplementary Material 6). The micro-synteny analysis also showed that the sox8 and sox9 genes were linked and located near each other in all teleost species considered (Supplementary Material 5 and 6). However, sox3 and sox6 were linked neither to each other nor to the sox8/sox9 genes in the species analyzed, including C. semilaevis (Supplementary Material 5 and 6). Moreover, sox6 has two paralogous copies in some of the species analyzed, with inversions in several species, such as D. rerio, Astyanax mexicanus, and O. latipes.

The region surrounding the amh gene was highly conserved in all teleosts (Supplementary Material 5). However, for the nanos3 BAC clone, the results show that the region is more conserved among the species of more recent appearance than among those species considered more ancient, i.e., the region is less conserved in D. rerio, A. mexicanus, and Latimeria chalumnae.

mFISH analysis of BACs

The chromosome mapping of the BAC clones listed in Table 1 is shown in Fig. 1, Table 2, and Supplementary Material 7. All metaphases analyzed showed 21 pairs of chromosomes, which corresponds to the expected karyotype of S. senegalensis (Vega et al. 2002).

Fig. 1

mFISH of the BACs isolated in the library that contain the following candidate genes: a amh (green), dmrt2 (pink), and sox3 (blue); b sox9 (green), nos3 (orange), vasa (pink), and dmrt4 (blue); c sox6 (green), fshb (orange), sox8 (pink), and cyp19a1a (blue); d dmrt2 (green), nos3 (orange), and cyp19a1a (blue); e sox8 (green), dmrt1 (orange), and aqp3 (pink); f sox6 (green), sox9 (orange), sox8 (pink), and sox3 (blue); g dmrt2 (green), and dmrt1 (pink); h sox9 (green), nos3 (orange), vasa (pink), and thrb (blue). In those cases in which two or more probes are co-localized in one chromosome, a diagrammatic representation is included

Table 2 Number of FISH signals and localization of BAC clones onto Solea senegalensis chromosomes

The mFISH technique located the 13 BAC clones on 10 different chromosome pairs. Nevertheless, additional secondary signals (four or six) were detected in some BAC hybridizations (Table 2). Overall, nine BAC clones produced a single signal, three were localized on two chromosome pairs, and just one on three pairs. The lhb BAC clone produced multiple signals and could not be assigned to a specific chromosome pair.

Results showed that some BAC clones were co-localized in the same chromosome. The BAC containing dmrt2 and dmrt3 genes produced signals on three chromosome pairs: the main signal was on the largest metacentric chromosome pair and co-hybridized with the dmrt1 BAC clone (Fig. 1g); a second signal was localized on a subtelocentric chromosome pair and co-hybridized with both aqp3 and sox8 BAC clones (Fig. 1e); and the third signal was on an acrocentric chromosome pair. The gene sox9 hybridized on two chromosome pairs: one signal was on a metacentric pair and co-hybridized with the dmrt4, nanos3, and thrb BAC clones (Fig. 1b, h) and the other signal was on an acrocentric chromosome pair and co-hybridized with the thraa BAC clone. Finally, the fshb BAC clone co-hybridized with the cyp19a1a and sox6 BAC clones (Fig. 1c).

Integrated genetic map

The integrated genetic map is shown in Fig. 2, which summarizes the cytogenetic map, sequence distances in base pairs, and the annotation results. This integrated map shows that the largest metacentric chromosome contains the three important genes for sex determination, i.e., dmrt1, dmrt2, and dmrt3. The second metacentric pair, however, is the chromosome for which the most information has been obtained. In this chromosome up to four BAC clones were co-localized by mFISH, more than 24 genes were annotated and, in some cases, the physical distance could be determined. Other chromosomes with high information density were the first, second, and fourth subtelocentric pairs, together with the two acrocentric chromosome pairs that bear the sox6 and sox9 genes.

Fig. 2

Integration of cytogenetic and physical maps of S. senegalensis. Cytogenetic results are shown in boxes within the chromosome diagram; the red box refers to the results obtained by Ponce et al. (2011), blue boxes those by García-Cegarra et al. (2013) and black boxes those by this study. Underlined BAC names indicate the main FISH signal of those BACs with more than one signal. The sequencing result of each BAC is shown in parenthesis, and the physical distance between genes is represented in bp units. When two contiguous genes come from different contigs, the physical distance cannot be shown. Asterisk indicates that contigs cannot be ordered by micro-synteny

Comparative mapping

The comparative mapping revealed that rearrangements were more common between D. rerio and S. senegalensis than between any other combination of species (Figs. 3, 4, 5, and 6). Conversely, chromosome gene arrangements were highly conserved between G. aculeatus and S. senegalensis. Alignment of protein sequences of the candidate genes between S. senegalensis and C. semilaevis (Supplementary Material 8) confirmed that the same orthologous gene copies were compared between these two species. Sequence comparisons revealed a large conserved region of the nanos3 BAC clone in chromosome 9 of G. aculeatus (Fig. 3). Moreover, several regions of the aqp3 BAC, the dmrt2/dmrt3 BAC, the amh BAC, and the sox9 BAC clones were localized in chromosome 8. However, although the majority of the genes within the sox9 BAC clone were present in chromosome 8, the sox9 gene itself could not be identified there. Two other candidate genes, dmrt4 and sox3, were co-localized in chromosome 7.

Fig. 3

Circos analysis in the species G. aculeatus. On the left side, the distribution of the BAC clones of S. senegalensis can be observed. Indicated within each BAC are the genes found by annotation, and the corresponding localizations in the G. aculeatus genome are denoted by crossing lines. The chromosomes of that genome are represented on the right side of the figure. BAC clones analyzed are those given in Table 1, in addition to the fshb-bearing BAC obtained from García-Cegarra et al. (2013)

Fig. 4

Circos analysis in the species D. rerio. On the left side, the distribution of the BAC clones of S. senegalensis can be observed. Indicated within each BAC are the genes found by annotation, and the corresponding localizations in the D. rerio genome are denoted by crossing lines. The chromosomes of that genome are represented on the right side of the figure. BAC clones analyzed are those given in Table 1, in addition to the fshb-bearing BAC obtained from García-Cegarra et al. (2013)

Fig. 5

Circos analysis in the species T. nigroviridis. On the left side, the distribution of the BAC clones of S. senegalensis can be observed. Indicated within each BAC are the genes found by annotation, and the corresponding localizations in the T. nigroviridis genome are denoted by crossing lines. The chromosomes of that genome are represented on the right side of the figure. BAC clones analyzed are those given in Table 1, in addition to the fshb-bearing BAC obtained from García-Cegarra et al. (2013)

Fig. 6

Circos analysis in the species O. latipes. On the left side, the distribution of the BAC clones of S. senegalensis can be observed. Indicated within each BAC are the genes found by annotation, and the corresponding localizations in the O. latipes genome are denoted by crossing lines. The chromosomes of that genome are represented on the right side of the figure. BAC clones analyzed are those given in Table 1, in addition to the fshb-bearing BAC obtained from García-Cegarra et al. (2013)

The comparison with D. rerio showed more gene re-arrangements, based on the lower number and smaller size of conserved regions (Fig. 4). Again, the largest conserved region was observed in the nanos3 BAC clone, which is localized in chromosome 1 of D. rerio. The aqp3 and vasa BAC clones were partially co-localized in chromosome 6. The fshb and sox6 BAC clones were also partially co-localized in chromosome 7. The genes within the sox9 BAC clone were distributed in seven different chromosomes in D. rerio, thus showing large gene re-arrangements. The sox9 gene was co-localized with the vasa gene in chromosome 10; however, this co-localization is not found in S. senegalensis.

Concerning the comparison with T. nigroviridis (Fig. 5), the nanos3-bearing BAC also presents the largest conserved region and it is localized on chromosome 18. Partial co-localizations which involve several candidate genes were detected including amh and sox3 in chromosome 1, vasa and sox9 in chromosome 3, and cyp19a1a, fshb, and sox6 in chromosome 5.

In the comparison between O. latipes and S. senegalensis (Fig. 6), the largest conserved region was again the nanos3-bearing BAC, although the sox3-bearing BAC was also highly conserved. A partial co-localization of the dmrt2, sox9, and amh BAC clones was observed in chromosome 4. In addition, the two gonadotropin genes were found in the same chromosome (chr. 15). The most surprising finding is that the dmrt2 and dmrt3 genes are not co-localized: instead, the dmrt3 and dmrt4 genes were both found in chromosome 18.

The comparative analysis shows that all the genes identified in the BAC clones were distributed in 9 chromosomes in S. senegalensis, whereas they were distributed in a total of 14, 19, 16, and 17 chromosomes in G. aculeatus, D. rerio, T. nigroviridis, and O. latipes, respectively (Figs. 3, 4, 5, and 6, Table 3). This increasing number of BAC-bearing chromosomes is associated with the increasing chromosome number of the species analyzed (Table 3, column 1). If only the 12 candidate genes are considered, this trend disappears, since all the species show fewer candidate gene-bearing chromosomes. Of these species, S. senegalensis and T. nigroviridis show the smallest number of sex gene-bearing chromosomes (Table 3, column 2), thus indicating a greater specialization of chromosomes. Moreover, T. nigroviridis and O. latipes are the two species that have the most chromosomes with more than one candidate gene in them (Table 3, column 3).
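The two quantities compared here (the number of chromosomes bearing candidate genes, and the number of chromosomes bearing more than one of them) can be tallied from a simple gene-to-chromosome mapping. The sketch below is illustrative only: the function name `chromosome_summary` is hypothetical, and the toy input uses the partial co-localizations described in the text for T. nigroviridis rather than the full data behind Table 3.

```python
from collections import Counter


def chromosome_summary(gene_to_chrom):
    """Return (number of gene-bearing chromosomes,
    sorted list of chromosomes carrying more than one gene)."""
    counts = Counter(gene_to_chrom.values())
    shared = sorted(chrom for chrom, n in counts.items() if n > 1)
    return len(counts), shared


# Toy input based on the T. nigroviridis co-localizations mentioned in the text.
tetraodon_example = {
    "amh": 1, "sox3": 1,
    "vasa": 3, "sox9": 3,
    "cyp19a1a": 5, "fshb": 5, "sox6": 5,
}
print(chromosome_summary(tetraodon_example))
```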

Table 3 Distribution of candidate genes among the chromosomes of different species

Phylogenetic analysis

The JTT phylogenetic tree obtained using the sequences of 10 concatenated genes showed good resolution and robust branch support (Fig. 7). The phylogeny clearly separated the two classes included, i.e., Sarcopterygii and Actinopterygii. Mammals are together in the same clade, and the coelacanth (L. chalumnae) is found clustered apart from the remaining Sarcopterygii species (tetrapods). The ray-finned fish species are grouped together and, among the clades, modern fishes clustered together with a clear separation between fishes belonging to the Otomorpha cohort (A. mexicanus and D. rerio) and the Euteleosteomorpha cohort (the remaining ray-finned fishes).

Fig. 7

Phylogenetic tree made from ten candidate genes concatenated (amh, cyp19a1a, dmrt2, dmrt3, dmrt4, lhb, nanos3, sox3, sox6, vasa) (see Supplementary Material 2 for accession numbers)

Discussion

The dmrt genes encode transcription factors belonging to the DM domain gene family that are associated with sex determination and differentiation. The gene cluster dmrt1-dmrt3-dmrt2 appears widely conserved in vertebrates, including teleosts (El-Mogharbel et al. 2007; Brunner et al. 2001; Sheng et al. 2014; Chen et al. 2014). Although our micro-synteny analysis could not clearly establish this cluster, probably because of non-overlapping BAC clones, the three genes were found by the mFISH technique to be co-localized and near each other in the same chromosome (Fig. 1g). The distance observed between the FISH signals produced by the dmrt2/dmrt3-containing BAC and the signal produced by the dmrt1-BAC clone could be due to some chromosomal rearrangement. The evolution of these genes is still not clear because of the lack of data in basal metazoans. However, using the available whole-genome sequences, it can be deduced that the DM domain probably arose during early metazoan evolution, after the divergence of the choanoflagellates, and that the domain subsequently expanded in the metazoan lineage (Bellefroid et al. 2013), more precisely during the interval between Trichoplax and eumetazoans (Wexler et al. 2014). The cluster dmrt1-dmrt3-dmrt2 was found in a linkage group (LG) different from that of dmrt4 in G. aculeatus and Lepisosteus oculatus (Supplementary Material 5). This situation has also been observed in D. rerio and O. latipes (Kondo et al. 2002; Woods et al. 2000). However, in humans, the dmrt1, dmrt2, dmrt3, and dmrt4 genes are linked in chromosome 9 (Kondo et al. 2002). It has been hypothesized that these four dmrt genes could have arisen after several rounds of tandem duplications; hence, they indicate an ancestral origin (Kondo et al. 2002). Curiously, the dmrt1-dmrt3-dmrt2 cluster has been located in the Z sex chromosome of the closely related flatfish C. semilaevis (Chen et al. 2014). In S. senegalensis, the dmrt cluster also appears linked to a histone cluster (Supplementary Material 9). The location of multi-gene families in sex chromosomes has also been reported in some other species (Utsunomia et al. 2014) and, indeed, the 18S rDNA has accumulated mainly in the X chromosome of the fish species Hoplias malabaricus (Cioffi et al. 2010), and even among karyomorphs of the same species (Bertollo et al. 1997).

The evolution of sex chromosomes generally involves the accumulation of repetitive elements by different strategies that lead, in most cases, to heteromorphism. Kejnovsky et al. (2009) proposed the accumulation of repetitive DNA as the initial mechanism involved in the evolution of sex chromosomes. A high density of GATA-motif repeats has been reported in the W chromosome of the female snake Elaphe radiata (Jones and Singh 1985) and such repeats have enabled the sex chromosomes in the guppy fish to be identified (Nanda et al. 1990). A high concentration of (GATA)n repeats at a specific chromosome pair was described in the toadfish Halobatrachus didactylus (Merlo et al. 2007). However, previous studies did not find an accumulation of those sequences in any of the chromosomes of S. senegalensis (Cross et al. 2006). This is not surprising considering the small size of the species’ genome, since it has been postulated that repeated sequences are less frequent in species with a compact genome than in species with a larger genome. Therefore, the quantity of repeated sequences might not be enough to be detected by FISH.

Flatfish genomes are compact, and the number of chromosomes ranges from 21 pairs in Soleidae and Cynoglossidae to 24 in Pleuronectidae and Paralichthyidae (Cerdà and Manchado 2013). We hypothesize that a Robertsonian fusion between two acrocentric chromosomes could have occurred during the evolution of Pleuronectiformes, giving rise to a large metacentric chromosome in S. senegalensis. The cluster dmrt1-dmrt3-dmrt2 might have been located in one of those acrocentric chromosomes, linked to a histone gene family. Although interstitial telomeric (TTAGGG)n sequences were not found on the metacentric chromosome, its origin in a Robertsonian fusion cannot be excluded because the loss of telomeric sequences can occur after such rearrangements (Cross et al. 2006). These rearrangements could explain the differences in sex-determination systems among closely related species, since they have been proposed as a major driving force for speciation (Heule et al. 2014; Ser et al. 2010). In S. maximus and C. semilaevis, a ZZ/ZW system for sex determination has been described (Hu et al. 2014). In contrast, an XX/XY system was proposed in S. senegalensis (Molina-Luzón et al. 2014). A recent theoretical model raises the possibility of transitions between the XX/XY and ZZ/ZW systems and environmental sex determination, and some species such as Xiphophorus maculatus are at an intermediate stage with both ZW and XY systems occurring in the same population (Pennell et al. 2015).

The sox8 and sox9 genes are important in fish reproduction: sox8 is involved in Sertoli cell development and in spermatogenesis (O’Bryan et al. 2008), whereas sox9 is initially expressed on the lateral side of the bi-potential genital ridge and up-regulated in the Sertoli cell precursors in the XY male gonad, immediately after the onset of SRY gene expression (Chaboissier et al. 2004). The linkage between sox8 and sox9 genes appeared to be highly conserved across teleosts including C. semilaevis. However, these two genes are located separately in the chromosomes of S. senegalensis (Fig. 1f) and S. maximus (Viñas et al. 2012), which could be a derived situation within the Pleuronectiformes group. The sox8 and sox9 genes belong to the same sox subgroup, i.e., soxE, which accounts for three sox genes that arose from tandem duplications (Heenan et al. 2015). Thus, it is possible that originally, these two genes were linked, although some re-arrangements occurred during Pleuronectiformes evolution. Moreover, sox3 and sox6 also appeared distributed separately in the genome, similar to S. maximus (Viñas et al. 2012). The sox3, sox6, and sox8/sox9 genes belong to different subgroups of sox genes (soxB1, soxD, and soxE, respectively) that seem to have arisen by whole-genome duplications (WGDs) followed by sub- and neo-functionalization events (Heenan et al. 2015). Studies in humans and mice with sox3 suggest a role in the central nervous system and during development (Cheah and Thomas 2015). However, in the fish Oryzias dancena, sox3 has been associated with male sex differentiation (Takehana et al. 2014). Furthermore, sox6 could be involved in the maturation of sperm in vertebrates (Hagiwara 2011) and, in O. mykiss, sox6 is only expressed in the testis, although it is not the primary sex-determining gene (Alfaqih et al. 2009).

Intriguingly, the secondary hybridization signals found in the sox6, sox9, dmrt2, and dmrt4 BAC clones suggest some gene duplications dispersed in the genome. Genes involved in transcription and signaling cascades, as well as those encoding proteins with more than average protein–protein interactions, are examples of genes over-retained after WGD (Hufton et al. 2009). Indeed, teleost fishes have undergone a third round of genome duplication, termed the teleost-specific whole-genome duplication (TS-WGD) (Glasauer and Neuhauss 2014), which could explain the presence of the secondary signals found in most fish species. Genes acting on essential metabolic pathways are present in the four BAC clones previously mentioned as having more than two FISH signals. Indeed, sox6, sox9, and dmrt2 present paralog sequences in the majority of the species analyzed by micro-synteny, and the secondary weaker signals might correspond to similar fragments in duplicated regions.

The parts of the genome surrounding both the amh and nos3 genes are fully conserved in teleosts; a conserved group of genes could indicate a functional cluster (Overbeek et al. 1999). Paibomesai et al. (2010) studied the genes surrounding the amh gene in several fish species, as well as in humans and mice, and identified functional clusters associated with sexual maturation and cell cycling. The amh gene is an example of the first class of cluster, since it is a member of the transforming growth factor-beta gene family, which mediates male sexual differentiation and participates in the development and maintenance of the male and female gonads (Durlinger et al. 2002a, b). A study of four nanos genes determined that nanos3 was conserved in terms of expression and synteny (Aoki et al. 2009), which could also indicate the existence of a functional cluster in the area surrounding the nanos3 gene. Both the nanos and vasa families are involved in the specification of primordial germ cells (PGCs) in sexual reproduction (Cho et al. 2014), but no linkage between these two genes was observed among the species analyzed in this study. In S. senegalensis, four vasa transcripts have been described, two of them with ovary-specific expression (Pacchiarini et al. 2013).

The linkages among the fshb, lhb, and cyp19a1a genes are not fully conserved in teleosts, and the linkage between fshb and cyp19a1a was observed only in S. senegalensis and T. nigroviridis (Figs. 1c and 5). It is well known that T. nigroviridis has a very compact genome (a genome with a similar quantity of genes but of a smaller DNA size), and that property has also been described in flatfish (Zaucker et al. 2014). From an evolutionary point of view, the S. senegalensis and T. nigroviridis genomes could have evolved in a similar way to optimize the expression of genes with similar functions, such as fshb and cyp19a1a. On the other hand, the linkage between sox6 and cyp19a1a appears to be common in flatfish, since it has been observed in S. senegalensis (Fig. 1c), C. semilaevis (Supplementary Material 6), and S. maximus (Viñas et al. 2012), as well as in the pufferfish T. nigroviridis (Fig. 5). However, more species need to be analyzed to conclude definitively that this linkage is an ancestral condition in Pleuronectiformes.

Previous cytogenetic studies have also been undertaken in S. senegalensis to complete the genetic knowledge and the karyotype of this species (Vega et al. 2002; Manchado et al. 2006; Cross et al. 2006). Ponce et al. (2011) were the first to use the BAC-FISH technique in S. senegalensis, localizing the BAC containing the lysozyme gene. In other work, a preliminary BAC-based cytogenetic map of S. senegalensis was presented with 11 chromosomal markers, which mapped onto 13 chromosomes (García-Cegarra et al. 2013). This study extends those previous works and localizes BAC clones on 15 (out of 21) chromosome pairs (Fig. 2). This is an important point because the karyotype of S. senegalensis contains 12 pairs of acrocentric chromosomes that are difficult to distinguish because of their similar size. The FISH technique has also been used in the cytogenetic characterization of other flatfish species, such as S. maximus; in addition, this technique has been used to consolidate the linkage map produced for this species, thus helping to bring together several previously established linkage groups (Taboada et al. 2014b).

Several authors have reported that the Pleuronectiformes group evolved from Perciformes fishes (Ivankov et al. 2008; Flores and Martínez 2013), and this could be reflected in the very close relationship between S. senegalensis and G. aculeatus compared with the other species considered in the comparative analysis. Using a concatenated protein sequence made the result more robust, since this approach gives a more accurate tree (Gadagkar et al. 2005). However, a problem could arise from uncertain orthology or hidden paralogy (Thiergart et al. 2014), so special care must be taken in selecting the orthologs. It has been proposed that sex determination signals and mechanisms evolve so rapidly that the master gene rarely stays at the top of the sex determination cascade for very long, although the rest of the genes acting further down in the network are more highly conserved (Heule et al. 2014). The phylogenetic analysis with ten concatenated sex-related genes indicates the possible existence of a similar network among closely related species; this would give important clues regarding the genes studied in this paper that are involved in sex determination, sexual differentiation, and reproduction. Moreover, the phylogenetic analysis agrees with the comparative mapping, in which D. rerio was the species most distant from S. senegalensis. Finally, the phylogenetic analysis supports the relationships previously established between Sarcopterygii and Actinopterygii, and among fish species (Betancur et al. 2013).

Conclusions

The present work builds on previous research to complete the karyotype characterization of S. senegalensis, focusing on BACs containing genes associated with sex determination and differentiation and using the mFISH technique. The co-localizations of candidate genes and the synteny studies point to the largest metacentric chromosome of S. senegalensis as a sex proto-chromosome. The integrated genetic map showed 15 chromosome pairs out of 21 with at least one BAC. This result is important for distinguishing those chromosome pairs of S. senegalensis that are similar in shape and size.