Introduction

Metazoan mitochondrial DNA (mtDNA) is nearly always a closed circular molecule, except in some cnidarians (Bridge et al. 1992). It typically contains the same 37 genes, encoding 13 proteins of the respiratory chain [cytochrome c oxidase subunits I–III (cox1–cox3), apocytochrome b (cob), ATP synthase subunits 6 and 8 (atp6 and atp8), and NADH dehydrogenase subunits 1–6 and 4L (nad1–6 and nad4L)], two ribosomal RNAs [small and large subunit ribosomal RNA (rrnS and rrnL)], and 22 transfer RNAs. Although there are exceptions, most metazoan mtDNAs range in size from 14 to 17 kb. Typically, there are few intergenic nucleotides except for a single large non-coding region termed the control region, which is generally thought to contain elements that control the initiation of replication and transcription (Shadel and Clayton 1997). Size variation in metazoan mtDNA is usually due to differences in the length of the non-coding regions, which occasionally contain repeated elements (Rigaa et al. 1995) or pseudogenes (Arndt and Smith 1998; Mueller and Boore 2005). The “universal” genetic code has been modified in many animal lineages, including the use of alternative start codons and abbreviated stop codons (Ojala et al. 1981; Wolstenholme 1992).

In contrast to other metazoan phyla, Mollusca, the second largest animal phylum, exhibits much variation in the features of its mitochondrial genomes. Some bivalve lineages (the seven families Mytilidae, Unionidae, Margaritiferidae, Hyriidae, Donacidae, Solenidae, and Veneridae) have an unusual mode of mtDNA inheritance, termed doubly uniparental inheritance (DUI) (Theologidis et al. 2008). Some pulmonate gastropods have unusual tRNAs lacking the T-stem or the D-stem, similar to nematode mitochondrial tRNAs (Yamazaki et al. 1997). The atp6 and atp8 genes are separated in the scaphopods and in two groups of gastropods (Patellogastropoda and Heterobranchia). This is unusual because the atp6–atp8 cluster is common to most animal mitochondrial genomes, often with overlapping reading frames (Boore 1999). Most bivalves lack the atp8 gene; to date it has been found only in Hiatella arctica and Lampsilis ornata. In addition, several duplicated genes have been observed in the mtDNAs of the cephalopods Watasenia scintillans and Todarodes pacificus (Akasaki et al. 2006). The mtDNA gene arrangement is considered to be relatively conserved within each metazoan phylum. However, molluscs are the largest exception to this rule. Most molluscan mitochondrial genomes reported so far have different gene arrangements, and there are large differences in mitochondrial gene arrangement within each class. The bivalves and the scaphopods show the greatest amount of mitochondrial gene rearrangement, although this degree of variation may be an artifact of taxonomic sampling bias among the currently available taxa.

Molluscan mtDNAs have been studied far less than those of vertebrates or arthropods. In this study, the mitochondrial genome features of two scallops, Argopecten irradians and Chlamys farreri, are described in detail. In addition, comparisons of genomic characteristics among bivalves are presented.

Materials and Methods

Sample Collection and DNA Extraction

The specimens of A. irradians and C. farreri were obtained from the Nanshan seafood market (Qingdao, Shandong province, China). The adductor muscles were dissected and frozen at −80°C. The total genomic DNA of each scallop was isolated with the Wizard® genomic DNA purification kit (Promega) following the manufacturer’s protocol and stored at −20°C.

Long PCR Amplification and DNA Sequencing

The whole mitochondrial genomes of A. irradians and C. farreri were amplified with a long PCR technique (Cheng et al. 1994). Oligonucleotide primers were designed based on the partial mtDNA sequences (cox1, cox3, rrnL, cob for A. irradians from EST data; rrnS, rrnL for C. farreri from GenBank). Three pairs of primers: A-cox1-cox3-F (5′-TTC TAC TTC ACA GCG GTT ACT ATG T-3′), A-cox1-cox3-R (5′-GGA GAA ATG GCC TAA TCG AG-3′); A-cox3-rrnL-F (5′-CGT TAG GGT TGG TAA TAA TCT TAG GCG GAT GT-3′), A-cox3-rrnL-R (5′-TAC AGA ACC TCT GAA GCC AAA TGT CCT TTC GT-3′); A-rrnL-cob-F (5′-ATT GTA GGC CCT GTG AAT GGT TTG ACG AGT TT-3′), A-rrnL-cob-R (5′-AGT ATC CTC CAA GTG CTG CAT AAT CCC AAC TA-3′) for A. irradians, and two pairs: C-rrnL-rrnS-F (5′-CCT ACT TTG GGA CAG ATT CTA CAG-3′), C-rrnL-rrnS-R (5′-GCG ACT GCT GGC ACC TGG TTG GAC-3′); C-rrnS-rrnL-F (5′-TAA TTG ATC CAT ACC TCC AGT G-3′), C-rrnS-rrnL-R (5′-TCA GGA TAC CCA GAG CCA ACA T-3′) for C. farreri were designed to amplify the whole mitochondrial genomes. PCR products were sequenced either by primer walking or by first cloning them into pUC18, and then sequencing positive clones using M13 universal primers.

The PCR reactions were performed with a Mastercycler gradient machine (Eppendorf). The cycling program consisted of an initial denaturation step at 94°C for 2 min, followed by 35 cycles of denaturation at 94°C for 20 s, annealing at 52–58°C for 1 min, and elongation at 68 or 72°C for 5 or 16 min, depending on the expected length of the PCR products. The process was completed with a final elongation at 72°C for 10 min. The reaction volume was 25 μl, containing 0.5 μl dNTP mix (10 mM each), 1 μl each primer (5 μM), 2.5 μl 10× LA PCR buffer (Mg2+ plus, Takara), 0.2 μl LA Taq DNA polymerase (5 U/μl, Takara), 0.5 μl DNA template (50 ng/μl), and 19.3 μl sterile deionized water. PCR products were directly purified with a MultiScreen-PCR96 Filter Plate (Millipore) and sequenced with an ABI 3730xl DNA Analyzer.
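As a quick consistency check on the reaction mix described above, the following minimal Python sketch sums the listed volumes and derives the final concentrations. The variable and function names are illustrative only and are not part of the original protocol.

```python
# Volumes (in microlitres) as listed in the text for one 25 ul reaction.
components_ul = {
    "dNTP mix (10 mM each)": 0.5,
    "forward primer (5 uM)": 1.0,
    "reverse primer (5 uM)": 1.0,
    "10x LA PCR buffer": 2.5,
    "LA Taq polymerase (5 U/ul)": 0.2,
    "DNA template (50 ng/ul)": 0.5,
    "sterile deionized water": 19.3,
}

# Round to one decimal to absorb floating-point noise in the sum.
total_ul = round(sum(components_ul.values()), 1)


def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_volume_ul


dntp_mM = final_concentration(10, 0.5, total_ul)    # each dNTP, in mM
primer_uM = final_concentration(5, 1.0, total_ul)   # each primer, in uM
buffer_x = final_concentration(10, 2.5, total_ul)   # buffer strength, in x
taq_units = 5 * 0.2                                 # enzyme units per reaction
template_ng = 50 * 0.5                              # template mass per reaction

print(total_ul, dntp_mM, primer_uM, buffer_x, taq_units, template_ng)
```

Under these figures the components add up to the stated 25 μl, with each dNTP at 0.2 mM, each primer at 0.2 μM, and 25 ng of template per reaction.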

Gene Annotation and Sequence Analysis

The raw sequencing reads were first processed using Phred with a quality score threshold of 20 and assembled in Phrap with default parameters (Ewing et al. 1998; Ewing and Green 1998). All assemblies and sequence quality were then verified manually using Consed to remove misassemblies (Gordon et al. 1998). Protein-coding genes and rRNA genes were identified with DOGMA (Wyman et al. 2004) and BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST). The boundaries of each gene were determined by multiple alignment with other published bivalve mitochondrial sequences. The majority of tRNA genes were identified by tRNAscan-SE 1.21 (Lowe and Eddy 1997) and DOGMA. Additional tRNAs were identified by inspecting sequences for tRNA-like secondary structures and anticodons. The two mtDNA sequences were scanned for potential tandem repeats with Tandem Repeats Finder 4.0 (Benson 1999). The sequences have been deposited in the GenBank database under accession numbers DQ665851 (A. irradians) and EF473269 (C. farreri).
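For readers unfamiliar with Phred scores, the sketch below illustrates what a quality value of 20 means, using the standard definition Q = −10·log10(P), where P is the estimated probability that a base call is wrong. The helper names and the example quality values are hypothetical; the text does not specify how the threshold was applied inside the Phred/Phrap pipeline, so the flagging step is only one plausible reading.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10)


def low_quality_positions(quals, min_q=20):
    """Return indices of bases whose quality falls below the threshold.

    Flagging bases in this way is an assumption made for illustration;
    the actual trimming behaviour depends on the software settings used.
    """
    return [i for i, q in enumerate(quals) if q < min_q]


# Hypothetical per-base quality values for a short read.
example_quals = [35, 12, 20, 8, 41]
print(phred_to_error_prob(20))
print(low_quality_positions(example_quals))
```

A score of 20 therefore corresponds to an expected error rate of about one miscalled base in 100.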

Results and Discussion

Genome Composition

The genome composition and gene arrangement of A. irradians and C. farreri are summarized in Fig. 1 and Tables 1, 2. The complete mitochondrial genome of A. irradians is 16,211 nts in length, and the nearly complete mitochondrial genome of C. farreri is 20,789 nts in length (a possible GC-rich domain within the largest non-coding region could not be fully sequenced). The lengths of the two scallop mitochondrial genomes are within the range of genome sizes of previously sequenced molluscan mtDNAs. The size of molluscan mitochondrial genomes varies dramatically, ranging from 13,670 nts in the snail Biomphalaria glabrata (DeJong et al. 2004) to 40,725 nts in the sea scallop P. magellanicus (Smith and Snyder 2007). All genes are encoded on the same strand in the two newly sequenced mitochondrial genomes. Both genomes contain 35 genes, including 12 protein-coding genes, 2 ribosomal RNAs, and 21 transfer RNAs. In contrast to the typical animal mitochondrial genome, both lack the protein-coding gene atp8 and two trnSs, but carry an additional copy of trnF (A. irradians) or trnM (C. farreri) (Fig. 1; Tables 1, 2).

Fig. 1

Mitochondrial gene maps of the scallops Argopecten irradians and Chlamys farreri. All 35 genes are on the same DNA strand. Genes for proteins and rRNAs are shown with standard abbreviations. Genes for tRNAs are designated by a single letter for the corresponding amino acid, with the two leucine tRNAs, the two phenylalanine tRNAs (in A. irradians), and the two methionine tRNAs (in C. farreri) differentiated by numerals as identified in Figs. 3, 4. “NCR” indicates the largest non-coding regions. Scaling is only approximate

Table 1 Profile of the mitochondrial genome of Argopecten irradians
Table 2 Profile of the mitochondrial genome of Chlamys farreri

The mitochondrial genome of A. irradians has an overall A + T content of 57.3%, and its coding region is 14,631 nts in length, accounting for 90.3% of the whole genome. The C. farreri mtDNA has an A + T content of 58.7%, which is slightly higher than those of the other two scallops P. magellanicus (55.7%) and M. yessoensis (55.2%) (Table 3). The coding region of C. farreri mtDNA is 15,090 nts, accounting for 72.6% of the whole genome. In contrast to the genome of A. irradians, which has six overlapping regions between genes, there are no overlapping genes in the genome of C. farreri (Tables 1, 2).
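The percentages above follow directly from the reported lengths. A minimal Python sketch of the two calculations is given below; the function names and the toy sequence are illustrative, and only the genome and coding-region lengths are taken from the text.

```python
def at_content(seq):
    """Percentage of A and T among the unambiguous bases of a non-empty sequence."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 100 * at / (at + gc)


def percent_of_genome(part_nt, genome_nt):
    """Share of the genome occupied by a region, rounded to one decimal place."""
    return round(100 * part_nt / genome_nt, 1)


# Lengths reported in the text (nts).
airr_genome, airr_coding = 16211, 14631
cfar_genome, cfar_coding = 20789, 15090

print(percent_of_genome(airr_coding, airr_genome))   # A. irradians coding share
print(percent_of_genome(cfar_coding, cfar_genome))   # C. farreri coding share
print(airr_genome - airr_coding, cfar_genome - cfar_coding)  # unassigned nts
print(at_content("AATTGC"))  # toy sequence, not real data
```

The differences between genome and coding lengths (1,580 and 5,699 nts) match the totals of unassigned nucleotides given in the Non-Coding Regions section.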

Table 3 Genomic characteristics of bivalve mitochondrial genomes

Gene Arrangement

In contrast to Arthropoda, Mollusca, the second largest animal phylum, displays an extraordinary amount of variation in gene arrangement. Surprisingly, the gene arrangements of A. irradians, C. farreri, and P. magellanicus, which belong to the same family (Pectinidae), are quite different. The molluscan gene arrangement map reveals many interesting features (Fig. 2). The chiton K. tunicata is the only sequenced representative of the Polyplacophora, an early diverging class within Mollusca; its gene order may represent the ancestral molluscan pattern of gene arrangement. The gene arrangement of K. tunicata differs from that of Octopus vulgaris only by the inversion of trnP and the translocation of trnD. Gene orders of other cephalopods resemble that of O. vulgaris, with several tRNA translocations and the exchange of large gene blocks (Akasaki et al. 2006). Additionally, the gene arrangement of K. tunicata differs from that of Haliotis rubra only by the inversion of trnP and the transposition of four additional tRNAs. The gene orders of caenogastropods closely resemble that of H. rubra, whereas the gene arrangements of opisthobranchs and pulmonates are very similar to each other but highly divergent from those of vetigastropods and caenogastropods (Grande et al. 2008). It is thus clear that vetigastropods retain the presumed ancestral gene arrangement and that caenogastropods show minor rearrangements relative to the ancestral pattern, whereas the remaining gastropods (Patellogastropoda) have undergone dramatic gene rearrangements.

In Scaphopoda, the mitochondrial genomes of only two species, Graptacme eborea and Siphonodentalium lobatum, have been sequenced so far. The gene orders of the two species appear almost completely rearranged. Here, G. eborea was chosen as the representative of this class for comparative analysis. If the tRNA genes are excluded, G. eborea and K. tunicata share only two gene blocks, nad6-nad1 and nad5-nad4-nad4L. More extensive taxonomic sampling is needed to further study gene arrangement in Scaphopoda.

To date, all available bivalve mitochondrial genomes belong to three subclasses: Palaeoheterodonta, Heterodonta, and Pteriomorphia. The gene orders of bivalves are the most highly rearranged among all classes of Mollusca. The gene order of the freshwater mussel Lampsilis ornata (Palaeoheterodonta) is nearly identical to that of Inversidens japanensis, a species of the same family, except for the translocations of several tRNAs and of the protein-coding genes nad2 and nad3. Additionally, the L. ornata mitochondrial genome contains the protein-coding gene atp8, which is absent from the I. japanensis mitochondrial genome (data not shown). The remaining bivalves with sequenced mitochondrial genomes are all marine species, in which the gene orders are dramatically rearranged. Ruditapes philippinarum (Heterodonta) does not share any gene block with L. ornata. In Pteriomorphia, the mitochondrial genomes of nine species belonging to three groups (Mytiloida, Ostreoida, and Pectinoida) have been sequenced. Few gene blocks are shared between pairs of them. Three mussels within the genus Mytilus share the same gene arrangement except for an additional trnQ in M. trossulus. Even two oysters belonging to the same genus differ by many tRNA translocations.

Here, we focus on the gene arrangements of four scallops in the family Pectinidae. Among the four scallops, the gene arrangement of M. yessoensis closely resembles that of C. farreri. They share three large gene blocks: nad4L-nad6-L2-cob, cox3-K-F-Q-E-atp6-cox2-nad2-T-P-I-L1-M-nad3-nad4, and N-nad1-R-rrnL-M. The M. yessoensis genome still has a gap between trnV and trnN, and the block trnH-W-Y-G observed in C. farreri was not detected in the available M. yessoensis sequence, so it is possible that this block is located, or rearranged, in the unsequenced region. Therefore, it is likely that the gene arrangements in the two scallops are the same except for translocations of rrnS, nad5, and several tRNAs. The similarity in mtDNA gene arrangement between M. yessoensis and C. farreri implies that they are very closely related lineages. The gene arrangements of M. yessoensis and C. farreri are very different from those of the other two scallops, although all of them belong to the same family. The tRNAs are more variable because their secondary structure allows them to translocate more frequently (Boore and Brown 1994). Even after excluding the tRNAs from the comparison, the genomes of C. farreri and A. irradians show only three small identical gene blocks, nad6-L2-cob, nad3-nad4, and nad1-R-rrnL, while the genomes of A. irradians and P. magellanicus share only one gene block, L2-cob-cox2. The gene blocks nad6-L2-cob-cox2 and nad1-R-rrnL may have been inherited from a common ancestor of the four scallops, with the former block reduced to nad6-L2-cob in M. yessoensis and C. farreri and to L2-cob-cox2 in P. magellanicus, with the apomorphic loss of the block nad1-R-rrnL. Comparison of the gene arrangements demonstrates that the four scallops share few identical gene blocks although they are closely related. This feature is seldom observed in Metazoa, even in other molluscan classes.
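The notion of a shared gene block used in this comparison can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the two gene orders below are invented toy examples (not the real genomes), genes are assumed to have unique labels (for example M1 and M2 for duplicated tRNAs), all genes are assumed to lie on the same strand, and the circular genome is treated as a linear list, so a block spanning the arbitrary start point would be missed.

```python
def shared_blocks(order_a, order_b, min_len=2):
    """Return maximal runs of genes that are adjacent and in the same order in both lists."""
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks = []
    i = 0
    while i < len(order_a):
        j = i
        # Extend the run while the next gene in A is also the next gene in B.
        while (
            j + 1 < len(order_a)
            and order_a[j] in pos_b
            and order_a[j + 1] in pos_b
            and pos_b[order_a[j + 1]] == pos_b[order_a[j]] + 1
        ):
            j += 1
        if j - i + 1 >= min_len:
            blocks.append(order_a[i : j + 1])
        i = j + 1
    return blocks


# Hypothetical gene orders, for illustration only.
toy_a = ["cox1", "nad6", "L2", "cob", "cox2", "nad3", "nad4"]
toy_b = ["nad3", "nad4", "rrnS", "nad6", "L2", "cob", "cox1"]

for block in shared_blocks(toy_a, toy_b):
    print("-".join(block))
```

Excluding tRNAs from a comparison amounts to filtering them out of both lists before calling the function, which can merge blocks that were previously interrupted by a translocated tRNA.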

Fig. 2

The gene arrangement map of molluscan mitochondrial genomes. All genes are transcribed from left-to-right except those indicated by underlining, which are transcribed from right to left. The bars show identical gene blocks. Arrows denote gene translocations while the circling arrows indicate inversions. Gray boxes indicate the remaining gap in the genomes. The non-coding regions are not presented. Gene segments are not drawn to scale. Asterisk denotes that the genome contains the split rrnL and duplicated rrnS according to Milbury and Gaffney (2005)

Protein-Coding Genes

Of the 13 typical protein-coding genes (cox1–cox3, nad1–nad6, nad4L, cob, atp6, and atp8), twelve were identified. An atp8 coding sequence was not identified in either mitochondrial genome. All genes are transcribed from the same strand. The same features have been observed in all other published marine bivalve genomes except for H. arctica, in which an atp8 gene was reported. Thus, the location of all genes on the same strand and the absence of atp8 are among the most important characteristics of marine bivalve mitochondrial genomes.

Mitochondrial genomes often use a variety of non-standard initiation codons (Wolstenholme 1992). In some cases, the initiation codon was difficult to identify unambiguously because several alternatives were present in the region inferred to represent the start of the coding sequence. In the A. irradians mitochondrial genome, four genes (cox2, cox3, nad1, and atp6) start with GTG and two genes (nad2 and cob) use TTG as the start codon. Of the remaining six genes, three (cox1, nad3, and nad4L) use the standard start codon ATA and the other three (nad4, nad5, and nad6) use the standard start codon ATG. All genes except nad4L, which has an incomplete stop codon, have a complete termination codon, either TAA (cox1, nad4, and atp6) or TAG (cox2–cox3, nad1–nad3, nad5–nad6, and cob) (Table 1). In C. farreri mtDNA, three genes (nad2, nad6, and nad4L) initiate with ATA and three others (cox1, cox3, and atp6) with ATG. Of the remaining six genes, three (cox2, nad1, and nad3) start with GTG and the other three (nad4, nad5, and cob) start with TTG. Of the 12 genes, seven (nad1–6 and atp6) end with the stop codon TAG, three (cox1, cox3, and nad4L) end with TAA, and the remaining two (cox2 and cob) end with the incomplete stop codons TA and T, respectively (Table 2). Such incomplete stop codons are common among animal mitochondrial genes, and it has been shown that the complete TAA stop codon is created by posttranscriptional polyadenylation (Ojala et al. 1981).

There are a total of 3,681 codons, excluding stop codons, in the protein-coding genes of A. irradians. The overall A + T content of the protein-coding regions is 57.0%; the A + T content is 52.7% at the first codon positions and 58.5% and 59.6% at the second and third codon positions, respectively. The C. farreri mtDNA encodes a total of 3,737 amino acids in all protein-coding genes, excluding stop codons. The overall A + T content of all protein-coding regions is 58.9%, and the A + T content at the three codon positions is 55.2%, 58.4%, and 63.1%, respectively. In both genomes, the A + T content of the third codon positions is clearly higher than that of the first and second codon positions. The genomic characteristics of the 14 bivalve sequences published so far are presented in Table 3.
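The per-position A + T values reported above can be computed by splitting each in-frame coding sequence into its first, second, and third codon positions. The following Python sketch shows one way to do this; the function name and the nine-base toy sequence are illustrative and do not come from the scallop genomes.

```python
def at_by_codon_position(cds):
    """Return the A + T percentage at codon positions 1, 2, and 3.

    Assumes the sequence starts in frame and contains at least one codon;
    trailing bases that do not form a full codon (for example an
    incomplete stop codon) are ignored.
    """
    cds = cds.upper()
    cds = cds[: len(cds) - len(cds) % 3]
    percentages = []
    for offset in range(3):
        bases = cds[offset::3]  # every third base, starting at this position
        at = bases.count("A") + bases.count("T")
        percentages.append(100 * at / len(bases))
    return percentages


# Toy in-frame sequence: codons ATG, GCC, TTA.
print([round(p, 1) for p in at_by_codon_position("ATGGCCTTA")])
```

For a whole genome, the concatenated coding sequences of all protein-coding genes (with stop codons removed) would be passed to the function.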

Transfer RNA Genes

The tRNA genes are usually poorly conserved in molluscs. Both mitochondrial genomes encode 21 tRNA genes, ranging in size from 62 (trnF1) to 71 nts (trnI) in A. irradians and from 64 (trnL2) to 73 nts (trnM2) in C. farreri, all of which can be folded into typical clover-leaf secondary structures (Figs. 3, 4). Compared with the standard complement of 22 tRNA genes, the two trnSs were not detected in either mtDNA, but an additional copy of trnF and of trnM was identified in A. irradians and C. farreri, respectively. Among the tRNA genes, trnS is the least conserved in molluscs. Only one trnS has been identified in R. philippinarum and in Lottia digitalis among the published molluscan mtDNAs. In K. tunicata, an additional copy of trnF was reported as well (Boore and Brown 1994), and there are four trnFs in P. magellanicus. A second trnM is also present in the mussels M. edulis, M. galloprovincialis, and M. trossulus (Hoffmann et al. 1992; Mizi et al. 2005; Breton et al. 2006), the clams R. philippinarum, H. arctica, and A. tuberculata (Dreyer and Steiner 2006), and the oyster C. virginica (Milbury and Gaffney 2005). In P. magellanicus mtDNA, there are as many as 10 trnMs.

Fig. 3

The potential secondary structures of the 21 inferred tRNAs of Argopecten irradians mtDNA. Nomenclature for portions of tRNA structure is shown for tRNA (V). Codons recognized are shown for the pairs of Leucine. The duplication of Phenylalanine is named F1 and F2, respectively

Fig. 4

The potential secondary structures of the 21 inferred tRNAs of Chlamys farreri mtDNA. Codons recognized are shown for the pairs of leucine. The duplication of methionine is named M1 and M2, respectively

The anticodon usage of the two scallops is congruent with that of the corresponding tRNA genes of other molluscs, with one exception. The trnW gene of the two scallops has a CCA anticodon, whereas it is TCA in other molluscan mtDNAs. This difference corresponds to the wobble (third codon) position. The anticodon of the additional trnF in A. irradians, AAA, is the same as that of the additional trnF reported in K. tunicata (Boore and Brown 1994). Both trnMs in C. farreri have the anticodon CAU, whereas the two trnMs in the mussel Mytilus have the anticodons CAU and UAU.

Ribosomal RNA Genes

Both the small and the large ribosomal RNA genes of A. irradians and C. farreri were identified by comparison with other known molluscan ribosomal RNA genes, especially those of P. magellanicus. Although putative gene boundaries for the two rRNA genes have been found, they cannot be precisely determined until transcript mapping is carried out. The lengths of rrnS and rrnL in A. irradians mtDNA are 904 and 1,292 nts, and their A + T contents are 56.8% and 59.5%, respectively. In C. farreri mtDNA, rrnS and rrnL are 953 and 1,479 nts long, with overall A + T contents of 52.4% and 58.3%, respectively. The lengths of rrnS and rrnL are similar to those of most bivalves, but rrnS in both scallops is clearly shorter than that of R. philippinarum (1,249 nts). The rrnL of C. farreri is the largest reported so far among bivalves (Table 3).

Non-Coding Regions

As in most bivalves, the A. irradians and C. farreri mtDNAs contain a large number of unassigned nucleotides. There are as many as 24 non-coding regions, totaling 1,580 nts, throughout the A. irradians mitochondrial genome. The largest non-coding region, located between the nad4 and trnF2 genes, accounts for 65.7% of all unassigned nucleotides. This region has an A + T content of 63.6%, which is higher than the average of the whole genome (57.3%). In C. farreri mtDNA, the non-coding regions total 5,699 nts, distributed among 36 regions and accounting for 27.4% of the whole genome. The largest non-coding region, located between the nad4 and nad1 genes, is 3,859 nts, or 68.2% of all unassigned nucleotides. It has an A + T content of 63.7%, which is also higher than the average of the whole genome (58.7%).

Metazoan mtDNAs usually have lengthy non-coding regions that vary in size (Jacobs et al. 1988; Boyce et al. 1989). In most metazoan mtDNAs, the largest non-coding region is thought to contain the signals for replication and transcription, and hence is referred to as the control region (Wolstenholme 1992). The non-coding region has an increased A + T composition, which is one characteristic typically used to identify origins of replication. Unlike in vertebrates, the control region in invertebrates is not well characterized and lacks the discrete, conserved sequence blocks used for identification (Hoffmann et al. 1992). Multiple copies of the control region have also been documented in vertebrates (Kumazawa et al. 1998), which may further confound the identification of the origin of replication in less well-studied organisms such as many invertebrate taxa. All bivalve sequences examined for this study had multiple non-coding regions throughout their genomes. Because gene order is highly rearranged in bivalves, the largest non-coding region is not conserved at the same location among bivalve mtDNAs. There is no obvious conservation of either nucleotide identities or potential secondary structures among the bivalve non-coding regions. Tandem repeats are common within the control region of animal mtDNAs (Lunt et al. 1998). They often form stable secondary structures and play an important role in the early stages of replication and transcription (Wilkinson and Chapman 1991; Arnason and Rand 1992). Tandem repeat units within non-coding regions have also been found extensively in molluscs such as K. tunicata, O. vulgaris, L. bleekeri, N. macromphalus, H. rubra, and A. tuberculata (Boore and Brown 1994; Yokobori et al. 2004; Tomita et al. 2002; Boore 2006; Maynard et al. 2005; Dreyer and Steiner 2006). In this study, the largest non-coding region in C. farreri mtDNA contains four tandem repeat regions: a 696 nts fragment composed of 9.7 nearly identical copies of a 72 nts motif, a 145 nts fragment composed of 2.1 copies of a 67 nts motif, and two 137 nts fragments each consisting of 2.1 copies of a 65 nts motif.
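Fractional copy numbers such as 9.7 arise because the last repeat unit is usually only partly present. The Python sketch below illustrates the idea with a deliberately naive exact-match counter. It is not the algorithm of Tandem Repeats Finder, which tolerates mismatches and indels, so its reported copy numbers need not equal a simple length ratio. The function name and the toy sequence are hypothetical.

```python
def tandem_copy_number(seq, start, period):
    """Count exact tandem copies of the motif seq[start:start + period].

    The repeat is extended base by base for as long as each base equals
    the base one period earlier, so a partial final copy contributes a
    fractional amount. Assumes start + period <= len(seq).
    """
    end = start + period
    while end < len(seq) and seq[end] == seq[end - period]:
        end += 1
    return (end - start) / period


# Toy sequence: two full copies of ACGT followed by half a copy (AC).
toy = "GG" + "ACGT" * 2 + "AC" + "TTT"
print(tandem_copy_number(toy, start=2, period=4))

# Simple length ratio for the largest repeat region reported in the text.
print(round(696 / 72, 1))
```

For the 696 nts fragment with a 72 nts motif, the simple ratio is consistent with the reported 9.7 copies.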