Introduction

The class Bivalvia is represented by a large number of marine and freshwater mollusks found in aquatic habitats throughout the world. Despite their significant ecological and commercial value, genomic studies on these organisms are scarce, and information is still limited to a few species. Genome sequencing projects have been performed only for the pearl oyster, Pinctada fucata (Takeuchi et al. 2012); the Pacific oyster, Crassostrea gigas (Zhang et al. 2012); and the filter-feeding mussel, Mytilus galloprovincialis (Murgarella et al. 2016).

A significant fraction of all eukaryotic genomes is made of repetitive sequences. Among them are satellite DNA (satDNA) sequences, organized in arrays of tandemly repeated monomers (Miklos 1985). Donax trunculus is an example of a bivalve species harbouring a number of different satDNAs: DTE (Plohl and Cornudella 1996), DTHS1, DTHS2, DTHS3, DTHS4 (Plohl and Cornudella 1997), DTF1 (Petrović and Plohl 2005), DTF2 (Petrović et al. 2009) and BIV160 (Plohl et al. 2010). At the same time, some satDNAs can be shared among closely related species (phBglII400, Passamonti et al. 1998), or even among distantly related ones (BIV160, Plohl et al. 2010). DTHS3 satDNA was originally discovered in D. trunculus after HindIII digestion of genomic DNA (Plohl and Cornudella 1997). The restriction enzyme digestion generated a regular ladder of genomic fragments, a feature typical of sequences repeated in tandem, and DTHS3 was therefore characterized as a satDNA. It has a 145-bp-long monomer sequence that holds several copies of the GGTCA pentanucleotide motif and comprises 0.035% of the D. trunculus genome. The presence of this satDNA in the species Ruditapes decussatus and R. philippinarum has recently been reported by Šatović et al. (2016).
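As an illustration of how copies of a short motif such as GGTCA can be tallied within a monomer, a minimal Python sketch follows. It is not the procedure used in the cited studies. The function names and the toy sequence are hypothetical (the toy string is not a DTHS3 monomer), and counting the motif on both strands is an assumption, since the text does not state whether reverse-complement copies were considered.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_motif(seq: str, motif: str = "GGTCA") -> dict:
    """Count a motif on the given strand and on the opposite strand.

    Assumption: both orientations are of interest; the source does not say.
    """
    seq = seq.upper()
    motif = motif.upper()
    return {
        "forward": count_overlapping(seq, motif),
        "reverse": count_overlapping(seq, reverse_complement(motif)),
    }


# Hypothetical toy sequence, used only to exercise the functions.
toy_sequence = "ATGGTCATTTGACCAAGGTCAC"
motif_counts = count_motif(toy_sequence)
print(motif_counts)
```

Applied to a real 145-bp monomer, the same call would return the number of GGTCA copies in each orientation.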

Here, we report the presence of DTHS3 satDNA in nine additional bivalve species, further expand its characterization, and estimate its age to a minimum of 516 Ma.

Materials and methods

Genomic DNAs of D. trunculus, R. decussatus and R. philippinarum were partially digested and used for preparation of genomic libraries, followed by generation of small local databases enriched in repetitive DNA sequences (details in Šatović and Plohl 2013; Šatović et al. 2016). This methodology followed the approach of Biscotti et al. (2007), which relies on the fact that, during hybridization, nucleotide sequences present in the genome in multiple copies give more intense hybridization signals than single-copy ones. Repetitive sequences belonging to DTHS3 satDNA were previously deposited in GenBank in the course of more detailed studies on repetitive sequences in these organisms (Šatović and Plohl 2013; Šatović et al. 2016). Here, a GenBank search showed the presence of DTHS3 in additional bivalve species. Sequences used in this study have the following accession numbers: KU682294, KU682297, KU682299, AM852203.1, AM871347.1, AM851753.1, KU682284, KU682285, X94611.1, X94540.1, X94542.1, KU682312, KU682313, KU682310, AM872828.1, AM874475.1, AF108933.1, FY999295.1, FY999259.1, FY999264.1, GR210966.1, GR211262.1, HO206039.1, AB231865.1, EU054059.1, GO309863.1, GO308956.1, GO309030.1, GO308318.1, GO310525.1, AM853355.1, AM853502.1, EY436206.1, EY436734.1, EY435449.1, EY436295.1, EY436878.1, EY434278.1, EY436850.1, EY433829.1, EY435144.1, EY435174.1, EY433716.1, EY435319.1, MWPT01000253.1 and 413002.

The DTHS3 monomer frame was determined in relation to the monomers previously reported by Plohl and Cornudella (1997). For phylogenetic analysis, monomers were aligned using the MUSCLE algorithm implemented in MEGA v. 5.05 (Tamura et al. 2011). The same program was used to determine the best substitution models and to generate maximum likelihood trees; nodal support was obtained by bootstrapping with 500 replicates. The phylogenetic tree showing the relationships between species was built from cytochrome oxidase subunit 1 (COI) gene sequences with the following accession numbers: AB076943.1, AB076949.1, EU484436.1, FJ434678.1, FJ434679.1, JX051541.1, KC429143.1, KP099052.1, KU252878.1, KU905884.1, KU905937.1, KU906094.1. The remaining sequence alignments and sequence analyses were performed with Geneious 9.1.7 (Biomatters, Auckland, New Zealand).
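The bootstrap used for nodal support rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate. The Python sketch below illustrates only the resampling step, under stated assumptions. It is not MEGA's implementation; the function name, the seed and the toy alignment are hypothetical, and tree inference for each replicate is deliberately left out.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment maps sequence names to aligned strings of equal length.
    Columns are drawn with replacement, so the replicate has the same
    dimensions as the input but a reshuffled, partly duplicated column set.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; real input would be the aligned monomers.
toy_alignment = {
    "monomer_1": "ACGTAC-TGA",
    "monomer_2": "ACGAACGTGA",
    "monomer_3": "ACGTTCGTCA",
}

rng = random.Random(1)  # fixed seed only to make the sketch reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(500)]
print(len(replicates), replicates[0])
```

In a full analysis, a tree would be inferred from each of the 500 replicates, and the support for a node would be the percentage of replicate trees in which that grouping appears.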

Results

For a more detailed investigation of DTHS3 satDNA, this sequence was extracted from local databases obtained earlier for three bivalve genomes (Šatović and Plohl 2013) and complemented with the results of a GenBank search using this sequence as a query. The search yielded 93 monomers across 12 bivalve species: D. trunculus, R. decussatus, R. philippinarum, Sinonovacula constricta, Mercenaria mercenaria, Meretrix lusoria, M. meretrix, Spisula solidissima, Crassostrea gigas, C. virginica, Mytilus galloprovincialis and Dreissena bugensis; their classification is provided in table 1. In the analysed set of sequences, the longest DTHS3 arrays were found in S. solidissima, containing six and eight complete monomers. In other species, up to four complete monomers were found repeated in tandem, which does not exclude the possibility that longer arrays exist in their genomes.

Table 1 Classification of the 12 bivalve species used in this study according to the World Register of Marine Species (http://www.marinespecies.org/index.php).

The first report of the DTHS3 satellite in D. trunculus was given by Plohl and Cornudella (1997), yielding three complete monomer sequences and one truncated monomer sequence. Our local database search determined its presence in two more genomic clones, 84_19 and DTC32Alu, the latter harbouring an array of four monomers repeated in tandem. Complete monomers obtained from this species exhibit nucleotide sequence similarity between 83.2 and 98.7%.
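Similarity ranges of this kind can be obtained by computing percent identity for every pair of aligned monomers and reporting the minimum and maximum. The Python sketch below shows one way to do so. It is an illustration rather than the Geneious procedure; the function names and toy sequences are hypothetical, and the gap treatment (columns gapped in both sequences are skipped, a gap against a base counts as a mismatch) is an assumption, since the text does not specify it.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are ignored,
    and a gap opposite a base is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similarity_range(aligned: list) -> tuple:
    """Return (lowest, highest) pairwise percent identity in a set."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return min(values), max(values)


# Hypothetical toy monomers, already aligned.
toy_monomers = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACG-AT"]
lowest, highest = similarity_range(toy_monomers)
print(f"similarity between {lowest:.1f} and {highest:.1f}%")
```

A different gap convention would shift the reported percentages slightly, so the convention should be stated whenever such ranges are compared between studies.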

For R. decussatus, the local database search found this sequence in five genomic clones. In the case of D12 (KU682294), the monomer was interrupted by insertion of the SINE element RUDI (Luchetti et al. 2016), which left the rest of the monomer sequence intact. In addition, we found three sequences in the EST database, and altogether eight complete monomers were extracted. Similarity between monomers from this species varies between 61.5 and 96.6%.

In R. philippinarum, DTHS3 was found in two genomic clones and two more sequences that belong to the EST database, yielding altogether 10 complete monomers. Overall monomer similarity varied between 61 and 95.2%.

EST database search in the case of S. constricta yielded six sequences from which 11 complete monomers were extracted, showing similarity between 55.1 and 85.5%, while for M. meretrix, four EST database hits have yielded seven DTHS3 monomers, with sequence similarity between 60.0 and 76.5%.

In M. mercenaria, only one fragment was found, holding one truncated and two complete DTHS3 monomers, the latter sharing 83.5% similarity. Here, the repeats are followed by a microsatellite sequence, as in DTC84 (Šatović and Plohl 2013) and pearl mobile elements (Gaffney et al. 2003).

Four M. lusoria cDNA fragments harbour DTHS3 sequence, from which nine complete highly similar monomers were extracted (86.9–100%). As three of four cDNA sequences show high overall sequence similarity, it is possible that they are transcripts from the same genomic location.

GenBank database search yielded two genomic fragments of S. solidissima containing arrays of this satellite DNA. Similarity among complete monomers is 69.2–97.9%. In the case of AB231865.1 genomic fragment, DTHS3 array is again followed by a microsatellite sequence.

For D. bugensis, DTHS3 sequence was found within 12 fragments, giving 15 monomers whose nucleotide similarity varied between 57.5 and 100%.

Four C. gigas cDNA fragments contained the DTHS3 sequence, from which four monomers were extracted, exhibiting 55.1–78.9% similarity to each other.

An interesting organization of DTHS3 repeats is found in C. virginica and M. galloprovincialis. One scaffold from the C. virginica WGS database holds two DTHS3 repeats followed by a microsatellite, while an M. galloprovincialis WGS contig harbours three monomers in tandem, also followed by a microsatellite. The genomic fragments from the two species are similar in the monomers and the microsatellite, as well as in the 505 bp preceding the repeats. Altogether, they exhibit 96.5% identity over 836 bp, not taking into consideration the central repeat of the M. galloprovincialis array, which holds a 19-bp insertion. In addition, beyond the 836-bp fragment, C. virginica harbours another monomer and a half, positioned 335 bp downstream of the end of the microsatellite segment. Within each species, monomers derived from C. virginica exhibit 73.8–86.9% similarity, while those derived from M. galloprovincialis exhibit 78.0–86.2% similarity. A BLAST search against the genome of C. gigas, a species closely related to C. virginica, did not detect the 836-bp sequence segment present in C. virginica and M. galloprovincialis.

As mentioned, DTHS3 monomers in M. mercenaria and S. solidissima are also followed by a microsatellite, but strong homology in the sequences preceding and following the repeats is observed only in C. virginica and M. galloprovincialis.

Figure 1 (a) Maximum likelihood bootstrap consensus tree built on 93 DTHS3 monomers from 12 bivalve species. (b) Maximum likelihood phylogenetic tree based on COI sequences showing relationships between species.

Phylogenetic analysis revealed two patterns. The first is species-specific clustering of monomers, as seen for S. solidissima, M. mercenaria, D. trunculus and S. constricta. The second pattern groups monomers derived from different species, as observed for R. decussatus and C. gigas, for C. virginica and M. galloprovincialis, and for R. philippinarum, M. meretrix and M. lusoria (figure 1a). D. bugensis monomers partially group together, and at the same time cluster with those from C. virginica and M. galloprovincialis. Monomers derived from closely related species can be found intermingled, as for M. lusoria and M. meretrix. At the same time, monomers derived from R. decussatus and R. philippinarum group closer to those of more distantly related species than to each other. In that respect, monomers derived from R. decussatus and C. gigas cluster together, although these species belong to two different subclasses (table 1; figure 1b). The same holds for monomers of C. virginica and M. galloprovincialis, which belong to different orders, with the exception of the monomer located outside the 836-bp fragment of C. virginica, which groups closer to monomers from D. bugensis.

Discussion

Satellite repeats belonging to the DTHS3 family have been detected in 12 species belonging to two subclasses of the class Bivalvia, Heterodonta and Pteriomorphia. As the estimated time of diversification for Pteriomorphia is 516 Ma (Bieler et al. 2014), the age of this satDNA is very close to that of BIV160 satDNA, which is estimated to have appeared 540 Ma and represents the oldest satDNA described so far (Plohl et al. 2010). One of the potential routes of genomic dispersal of BIV160 sequences was proposed to be linked to miniature inverted-repeat transposable elements (MITEs), based on sequence similarity with the pearl family of elements found in oysters (Gaffney et al. 2003). These elements are composed of a central region holding a variable number of monomer-like repeats followed by a microsatellite segment, while the whole structure is flanked by sequences holding short inverted repeats. The same structural organization has been found in the DTC84 element of D. trunculus (Šatović and Plohl 2013). Similarly structured tandem repeat-carrying elements also exist in evolutionarily distant organisms such as Drosophila, and are altogether categorized as nonautonomous eukaryotic rolling-circle transposable elements of the Helitron/Helentron superfamily (Thomas and Pritham 2015).

In this study, DTHS3 repeats in M. mercenaria, S. solidissima, C. virginica and M. galloprovincialis have been found in arrays that are followed by a microsatellite sequence, opening the possibility that these repeats can also be found within structures similar to that of Helitron/Helentron mobile elements (Gaffney et al. 2003; Šatović and Plohl 2013).

The results suggest that DTHS3 behaves quite similarly to BIV160 satDNA (Plohl et al. 2010) in estimated age, sequence conservation, and potential to generate novel species-specific subfamilies. For both satDNAs, the results indicate the presence of species-specific subfamilies, together with interspecifically intermingled monomer variants (figure 1). Most probably, diversification of species-specific variants from a common ancestral sequence occurred following or in the course of speciation (Plohl et al. 2010).

The existence of the 836-bp-long, extremely similar fragments in C. virginica and M. galloprovincialis is intriguing, as the same sequence was not found in C. gigas, a species closely related to C. virginica. There are three possible explanations for this observation: the sequence existed in the common Pteriomorphia ancestor but was lost over time from the genome of C. gigas; the sequence is present in the C. gigas genome but was missed in the genome sequencing process; or a horizontal transfer event occurred. In addition, the discrepancy between sequence grouping and known mollusk systematics (table 1; figure 1b) might indicate horizontal transfer as a possible way by which DTHS3 monomers became shared between R. decussatus and C. gigas, distant species that belong to different subclasses. The overlap in their geographical distribution along the British Isles, the Iberian Peninsula and the Mediterranean (information retrieved from http://www.fao.org/home/en/) and the ancient divergence time between the two species (516 Ma, Bieler et al. 2014) would favour this hypothesis.

In conclusion, DTHS3 is a satDNA family that is widespread among bivalve species. It has persisted throughout long evolutionary periods, and is probably one representative of the ancient repetitive sequences that were present in a common ancestor of bivalve species. Thus, diversification of DTHS3, together with different potential ways of its propagation, has influenced and contributed to the structuring of bivalve genomes on a large evolutionary scale.