Introduction

The high information content of mitochondrial genomes has proved very useful in phylogenetic analyses of higher-ranked taxa (Boore, 1999). In animals the mitochondrial genome is typically a single circular duplex molecule, about 16 kb in size, that contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes, and one AT-rich noncoding part, the control region (Wolstenholme, 1992). Nucleotide or amino acid sequences, mitochondrial gene order, transfer RNA secondary structure, and mitochondrial deviations from the universal genetic code have all been used as characters for phylogenetic analyses. Sequence data from single genes or the mitochondrial control region have also proved useful in studies of population genetics and speciation events.

The phylogeny of Crustacea has long been controversial (for review, see Richter, 2002). Even the monophyly of this group has been questioned. In recent years molecular phylogenetic approaches have intensified this debate, as analyses of different molecular data sets (18S rRNA, elongation factors, and mitochondrial genes) have led to diverse views of crustacean phylogeny. A few examples of conflicting results from molecular analyses illustrate the present situation.

In a combined approach (8 partial gene sequences and morphology; Giribet et al., 2001), almost all Crustacea form a monophyletic group with Malacostraca and Copepoda combined in one clade, while Branchiopoda, Remipedia, and Cephalocarida are grouped together as a sistergroup to that branch (in addition, an odd clade consisting of barnacles, a dipluran insect, and Drosophila is found between the remainder of Crustacea and Hexapoda). Several studies using complete mitochondrial genomes suggested a sistergroup relation between Malacostraca and Hexapoda (Wilson et al., 2000; Hwang et al., 2001), but in most analyses only decapod and branchiopod species represented the Crustacea. Recently additional mitochondrial genomes of species from Remipedia, Cephalocarida, Branchiura, and Cirripedia were published (Lavrov et al., 2004). Using tRNA translocation data, these authors combined Cirripedia, Cephalocarida, Branchiura, and the formerly enigmatic Pentastomida in one clade. In the same study phylogenetic analysis using sequence information supported monophyly of Malacostraca and Branchiopoda, but failed to support monophyly of Maxillopoda (Cirripedia and Copepoda-Branchiura-Pentastomida as separate lineages) and Hexapoda (with insects and collembolans as separate lineages). All in all, the interrelations between the above-mentioned taxa remain poorly resolved. Hexapod polyphyly was proposed previously by Nardi et al. (2003), who also used mitochondrial genome data. Finally, a study of combined 18S and 28S rRNA (Mallatt et al., 2004) found a monophyletic Hexapoda as a sistergroup to copepods, and branchiopods as the next closest relatives of that clade. These 3 taxa in turn form the sistergroup to Malacostraca. Other crustacean species in this study were taken from Branchiura and Cirripedia, while Remipedia and Cephalocarida were not represented.

The phylogenetic position of Stomatopoda among Malacostraca is also under discussion. Schram (1986) as well as Richter and Scholtz (2001) proposed Stomatopods as a sistergroup to all other Eumalacostraca. Schram and Hof (1998) and Wills (1998) found that Stomatopoda was a sistergroup to Eucarida, while Peracarida and Syncarida formed more basal branches in Eumalacostraca. To date phylogenetic analyses of arthropod or crustacean relations with mitochondrial genomes have included malacostracan species only from the derived clade Decapoda. Recently the nearly complete mitochondrial genome of a euphausiacean species was added (Machida et al., 2004), but phylogenetic analysis was not included in this study. To sequence mitochondrial genomes of malacostracan species, we developed a versatile polymerase chain reaction (PCR) primer set for short fragments of 6 mitochondrial genes (Table 1). For better taxon sampling among Malacostraca, we sequenced the nearly complete mitochondrial genome of Pseudosquilla ciliata (Stomatopoda). We then used the concatenated sequence data of 12 protein-coding genes for phylogenetic analyses of eumalacostracan and pancrustacean interrelations in order to obtain better information about the phylogenetic positions of Stomatopoda and Malacostraca, respectively.

Table 1 PCR Primers Used to Amplify Crustacean Mitochondrial Gene Fragments

Materials and Methods

DNA Isolation and PCR

A specimen of Pseudosquilla ciliata was obtained from a commercial source. A single pleopod was used for total genomic DNA extraction, using the DNeasy Tissue Kit (Qiagen) and following the manufacturer’s protocol.

Initially 6 partial mitochondrial sequences (cox1, cox3, nd4, nd5, 16S, and 12S) were determined with PCR primer pairs specially designed for this purpose (Table 1) by looking for conserved regions of mitochondrial genes in other crustacean sequences. PCR primers were purchased from Metabion. PCR was performed on Mastercycler and Mastercycler Gradient (Eppendorf) using the Eppendorf HotMasterTaq kit. Reaction volumes of 50 μl were set up in the following manner: 42 μl sterilized distilled water, 5 μl 10× reaction buffer, 1 μl dNTP mix (Eppendorf), 1 μl primer mix (10 μM each), 1 μl DNA template, 0.2 μl (1 U) HotMasterTaq polymerase. The cycling protocol included an initial denaturation step (94°C, 2 minutes), 40 cycles of denaturation (94°C, 30 seconds), annealing (1 minute; see Table 1 for annealing temperature of each primer), and extension (68°C, 90 seconds), and a final extension step (68°C, 1 minute). After agarose (0.9%) gel separation and ethidium bromide staining, PCR products were inspected under UV transillumination. PCR products were purified using the PCR purification kit (Qiagen) or, if necessary, the gel extraction kit (Qiagen). PCR products were subsequently sequenced (see below).
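As a rough illustration of the reaction setup described above, the following Python sketch scales the per-reaction volumes to a master mix for several reactions. The component volumes are taken from the text; the function name, the 10% pipetting overage, and the choice to keep template DNA out of the master mix are assumptions made only for this example. The listed components add up to 50.2 μl, so the stated 50 μl is best read as a nominal volume.

```python
# Per-reaction volumes in microlitres, as listed in the text.
PER_REACTION_UL = {
    "sterile distilled water": 42.0,
    "10x reaction buffer": 5.0,
    "dNTP mix": 1.0,
    "primer mix (10 uM each)": 1.0,
    "DNA template": 1.0,
    "HotMasterTaq polymerase": 0.2,
}


def master_mix(n_reactions, overage=0.10, exclude=("DNA template",)):
    """Scale per-reaction volumes to a master mix for n_reactions.

    overage and exclude are illustrative assumptions, not part of the
    published protocol: template is usually added to each tube separately.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name not in exclude
    }


# Sum of the listed components for a single reaction.
total_per_reaction = round(sum(PER_REACTION_UL.values()), 1)

# Master mix for 8 reactions with 10% overage.
mix = master_mix(8)
for component, volume in mix.items():
    print(f"{component}: {volume} ul")
```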

In a second step the sequences obtained in the first step were used to design 5 additional PCR primer pairs bridging the gaps between them. PCR was performed as described above, except that an extension time of 7 minutes was used. PCR products were inspected and purified as described above.

Sequencing and Sequence Analysis

All sequencing reactions were run on Mastercycler and Mastercycler Gradient using the CEQ DTCS Kit following manufacturer’s protocols. Initially PCR primers were used, and subsequently newly designed internal primers were used until completion of sequences (primer walking). Separation was done on a CEQ 8000 (Beckman Coulter); sequencing results were primarily analyzed using CEQ software.

To determine gene identity, BLAST searches on NCBI Blast Entrez databases were performed. Because they cannot be determined from sequence information alone, the boundaries of ribosomal RNA genes were assumed to extend to the boundaries of flanking genes. The start codon of each protein-coding gene was presumed to be the nearest in-frame start codon around the beginning of the sequence alignment with homologous genes of other malacostracan species. Most of the tRNAs were identified using tRNAscan-SE 1.21 (Lowe and Eddy, 1997) and DOGMA (Wyman et al., 2004); the others were found by visual inspection of the suspected regions. Transfer RNA identities were specified by their anticodon sequence.
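The start-codon rule described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' procedure: the function name, the search window, and the toy sequence are hypothetical, and the set of accepted start codons is an assumption (initiation codons commonly listed for the invertebrate mitochondrial code), passed as a parameter so that it can be changed.

```python
# Assumed initiation codons for invertebrate mitochondria; adjust as needed.
START_CODONS = frozenset({"ATG", "ATA", "ATT", "ATC", "GTG", "TTG"})


def nearest_inframe_start(seq, expected_start, window=30, starts=START_CODONS):
    """Return the 0-based position of the in-frame start codon closest to
    expected_start, or None if none lies within +/- window nucleotides.

    expected_start is the position where the alignment with homologous
    genes begins; only codons in the same reading frame are examined.
    On a tie, the upstream candidate is kept.
    """
    seq = seq.upper()
    steps = window // 3  # whole codons, so every candidate stays in frame
    best = None
    for k in range(-steps, steps + 1):
        pos = expected_start + 3 * k
        if pos < 0 or pos + 3 > len(seq):
            continue
        if seq[pos:pos + 3] in starts:
            if best is None or abs(pos - expected_start) < abs(best - expected_start):
                best = pos
    return best


# Toy example: the homology-based alignment is assumed to begin at position 9.
toy = "GGGCCCATGAAACCCGGGTTT"
print(nearest_inframe_start(toy, expected_start=9))
```

A noncanonical initiation codon would only be found by this sketch if it were added to the `starts` set explicitly.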

Methods of Phylogenetic Inference

A concatenated data set of individual amino acid alignments from 12 protein-coding genes was used for phylogenetic analysis (only atp8 was excluded, because of its ambiguous alignment). The alignment was done with CLUSTAL X (Version 1.81; Jeanmougin et al., 1998). Ambiguously aligned regions were omitted using GENEBLOCKS software (Version 0.91b; Castresana, 2000) with user-defined settings. Parameters in analysis of Eumalacostraca (10 taxa) were as follows: minimum number of sequences for a conserved position, 8; minimum number of sequences for a flanking position, 8; maximum number of contiguous nonconserved positions, 3; minimum length of a block, 5; and recovered amino acids, 2309. Parameters in analysis of Pancrustacea (21 taxa) were 17, 17, 3, 5, and 1899, respectively.
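The concatenation step mentioned above can be illustrated with a short Python sketch. It assumes that each gene has already been aligned separately and is held as a mapping from taxon name to aligned sequence; the function name, the gap-padding of missing taxa, and the toy data are assumptions for illustration and do not reproduce the software used in the study.

```python
def concatenate_alignments(gene_alignments, gap="-"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Genes are joined in insertion order. A taxon missing from a gene is
    padded with gap characters so that all rows keep the same length.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    rows = {t: [] for t in taxa}
    for gene, aln in gene_alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} differ in length")
        width = lengths.pop()
        for t in taxa:
            rows[t].append(aln.get(t, gap * width))
    return {t: "".join(parts) for t, parts in rows.items()}


# Toy data with hypothetical taxon labels and made-up residues.
toy_genes = {
    "cox1": {"taxonA": "MKW-L", "taxonB": "MKWAL"},
    "cox3": {"taxonA": "TPF", "taxonB": "TPY", "taxonC": "SPF"},
}
supermatrix = concatenate_alignments(toy_genes)
for taxon, sequence in supermatrix.items():
    print(taxon, sequence)
```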

Phylogenetic analyses were done with two different taxon sets. One included the majority of malacostracan species with published mitochondrial genomes and was used to address the question of stomatopod relations among Eumalacostraca. The other included representatives from all crustacean subtaxa and was used to evaluate the position of Malacostraca among Pancrustacea.

Species names and accession numbers are listed in Table 2. Phylogenies were inferred by the following methods. (1) Distance analysis used logdet distances for amino acids under the minimum evolution criterion implemented in the DAMBE package (Version 4.2.13; Xia and Xie, 2001), with 100 bootstrap replicates. (2) Maximum parsimony analysis used PROTPARS from the PHYLIP package (Version 3.62; Felsenstein, 1989). Bootstrap replicates (100) were created and analyzed using the SEQBOOT and CONSENSE tools. (3) Maximum likelihood analysis used PROTML from the same package, with the JTT model and gamma distribution with invariant sites. (4) Bayesian analysis was performed with MRBAYES (Version 3.04; Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003). One million generations were computed with 4 parallel chains. The model was set to JTT, with gamma distribution and invariant sites. We preferred the JTT model over mtRev because the latter was created with mitochondrial data exclusively from vertebrate taxa, while we found better performance with arthropod data using JTT.
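The bootstrap replicates used in the distance and parsimony analyses rest on resampling alignment columns with replacement. The following Python sketch shows that principle only; it is not the SEQBOOT implementation, and the function names, the fixed random seed, and the toy alignment are assumptions for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement.

    alignment maps taxon -> aligned sequence; all sequences must have the
    same length. The same sampled columns are used for every taxon, so
    each column of the replicate is a whole column of the original.
    """
    taxa = list(alignment)
    n_sites = len(alignment[taxa[0]])
    if any(len(alignment[t]) != n_sites for t in taxa):
        raise ValueError("sequences differ in length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {t: "".join(alignment[t][c] for c in columns) for t in taxa}


def bootstrap_replicates(alignment, n=100, seed=0):
    """Return n bootstrap replicates; seed makes the run reproducible."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]


# Toy alignment with made-up residues.
toy_alignment = {"taxonA": "MKWALT", "taxonB": "MKWSLT", "taxonC": "MRWALS"}
replicates = bootstrap_replicates(toy_alignment, n=100)
print(len(replicates), "replicates of length", len(replicates[0]["taxonA"]))
```

Each replicate would then be analyzed separately and the resulting trees summarized in a consensus, with the bootstrap value of a branch being the percentage of replicate trees that contain it.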

Table 2 Taxa and Accession Numbers of Mitogenomic Sequences Used in Phylogenetic Analyses

Results and Discussion

Genome Organization

Long PCR with primer pairs designed for the gap between 12S rRNA and cox1 yielded only a PCR fragment spanning tRNA-Gln to cox1. Several attempts to bridge the gap between 12S rRNA and tRNA-Gln failed, so we present here an incomplete mitochondrial genome, missing a part of 12S rRNA, the mitochondrial control region, and tRNA-Ile. The mitochondrial genome of P. ciliata includes 13 protein-coding and 2 rRNA genes, as found in most other metazoans. Furthermore, we were able to detect 21 of the 22 tRNA genes usually present in metazoans (Table 3). Gene overlaps exist at 12 gene boundaries, extending up to 7 nucleotides (between atp8/atp6 and nad4/nad4L). The genome arrangement of P. ciliata is identical to that of Panulirus japonicus, Penaeus monodon, and Marsupenaeus japonicus (Malacostraca; Wilson et al., 2000; Yamauchi et al., 2002, 2005), Daphnia pulex (Branchiopoda; Crease, 1999), and various insects. This arrangement is thought to be the pancrustacean ground pattern, which differs in the position of tRNA-Leu(UUR) from the putative ground pattern of Euarthropoda. Other malacostracan representatives show deviations from the pancrustacean ground pattern: 2 tRNA translocations are reported in Euphausia superba (Machida et al., 2004); one is reported in Portunus trituberculatus (Yamauchi et al., 2003); and Pagurus longicarpus (Hickerson and Cunningham, 2000) and Cherax destructor (Miller et al., 2004) show several unique major rearrangements. Further rearrangements in decapod species are reported without complete mitogenomic data by Morrison et al. (2002).

Table 3 Organization of the Mitochondrial Genome of P. ciliata

Protein-Coding Genes

The AT content of the protein-coding genes is 67.6% (A, 28.1%; C, 15.7%; G, 16.7%; T, 39.5%), which is within the range of other malacostracan species: the minimum is found in Cherax destructor, 60.0% (Miller et al., 2004), and the maximum in Penaeus monodon, 69.3% (Wilson et al., 2000).
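For readers who wish to repeat this kind of calculation, base composition and AT content can be computed as in the following Python sketch. The function names and the toy sequence are assumptions for illustration; the reported percentages are taken from the text and are used only to check that A and T add up to the stated AT content.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G, and T among unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous nucleotides in sequence")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(seq):
    """Return the combined percentage of A and T."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


# Consistency check on the percentages reported in the text.
reported = {"A": 28.1, "C": 15.7, "G": 16.7, "T": 39.5}
reported_at = round(reported["A"] + reported["T"], 1)
print("AT content from reported values:", reported_at)

# Toy sequence (made up) to show the functions in use.
print(round(at_content("AATTGCNN"), 1))
```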

Most protein-coding genes begin with the usual mitochondrial start codons, but cox1 is an exception in that its putative start codon is ACG (coding for Thr). Except for P. trituberculatus (Yamauchi et al., 2003), all other malacostracan species with determined mitochondrial genomes show this start codon at cox1, whereas other crustacean species have the usual start codons. This may be an apomorphic character for Eumalacostraca or Malacostraca. Three of the 13 protein-coding genes show incomplete stop codons. Incomplete stop codons are often described in other animal mitochondrial genomes and are hypothesized to be completed by polyadenylation after cleavage of messenger RNA from the polycistronic transcript (Ojala et al., 1981).

Phylogenetic Analysis

All phylogenetic analyses were performed with concatenated amino acid sequences from 12 protein-coding genes. Ambiguously aligned portions were removed using GENEBLOCKS software (Castresana, 2000) for maximum objectivity. Our phylogenetic analysis of malacostracan mitochondrial genomes (Figure 1) shows that P. ciliata splits off from the base of Eumalacostraca. Caridoida and Decapoda show good support in Bayesian inference, but not in distance and maximum parsimony analyses. Owing to the lack of mitogenomic data for Syncarida and Peracarida, no decision between the conflicting morphology-based hypotheses (Schram, 1986; Wills, 1997; Richter and Scholtz, 2001) is possible.

Fig. 1

Maximum likelihood tree of Eumalacostraca based on concatenated amino acid alignments of 12 protein-coding genes (2854 amino acids). Numbers above branches, from top to bottom: Bayesian posterior probabilities (%), maximum parsimony bootstrap values (%, out of 100 trees), and minimum evolution bootstrap values (%, 100 trees).

Because of the higher divergence of sequences in the alignment of pancrustacean species, we removed ambiguously aligned positions more strictly. In this way an alignment of 1899 amino acids was recovered and used for phylogenetic analysis. Only Malacostraca were recovered with highly significant support in the minimum evolution, maximum parsimony, and Bayesian inference analyses alike (Figure 2). In other cases only Bayesian inference led to highly significant support of nodes (0.99–1.00). These are Branchiopoda, a branch combining Branchiopoda and Malacostraca, and a branch combining Branchiura, Pentastomida, Copepoda, Ostracoda, and Cephalocarida. The latter grouping corresponds to the interpretation of tRNA translocation data by Lavrov et al. (2004), except that those authors also included Cirripedia in that group, whereas in our analysis Cirripedia fall outside it.

Fig. 2

Maximum likelihood tree of Pancrustacea based on concatenated amino acid alignments of 12 protein-coding genes (1899 amino acids). Limulus polyphemus and 2 myriapod species serve as outgroup members. Numbers above branches from top to bottom: Bayesian posterior probability (%), maximum parsimony bootstrap value (%, out of 100 trees), and minimum evolution bootstrap value (%, 100 trees).

Contrary to the analysis of Nardi et al. (2003), we found collembolans and insects combined in a monophyletic Hexapoda in the maximum likelihood tree, but with only moderate support in Bayesian inference (0.89). Hexapoda were not recovered in analyses in which ambiguously aligned parts were removed from the alignments only superficially (data not shown), so this may be a crucial point in phylogenetic analyses based on mitogenomic amino acid alignments. Hexapods form a monophyletic clade with Malacostraca and Branchiopoda in the maximum likelihood tree, but again with only moderate support from Bayesian inference (0.88). None of the tested methods supports a monophyletic Crustacea, and although our results are not highly significant, we believe that hexapods are derived from a crustacean ancestor, as proposed previously in other molecular studies of arthropod relations (e.g., Wilson et al., 2000; Hwang et al., 2001; Mallatt et al., 2004). The relation of Hexapoda to the crustacean subtaxa remains an open question. While mitogenomic data up to now have favored Malacostraca and analyses based on 18S rRNA have favored Branchiopoda as sistergroup to Hexapoda, we propose a third hypothesis for consideration: a combined taxon of Malacostraca and Branchiopoda as sistergroup to Hexapoda.