Introduction

Animal mitochondrial DNA (mtDNA) typically encodes a small set of 13 proteins that, together with up to 65 nuclear-encoded proteins, form the enzyme complexes comprising the electron transport chain (Brown and Wallace 1994). Cytochrome c oxidase (COX or complex IV), in particular, is composed of 3 mitochondrial- and 10 nuclear-encoded subunits. The three mitochondrial subunits of COX form a “catalytic core” that interacts with the nuclear-encoded cytochrome c (CYC; Tsukihara et al. 1996). Due to these protein-protein interactions, amino acid residues within the mitochondrial subunits of COX must act cooperatively with amino acid residues in other mitochondrial subunits, with residues in nuclear COX subunits, and with residues in CYC. The high degree of cooperation between COX subunits provides an excellent opportunity to study the importance of such interactions on the evolution of the mitochondrial and nuclear genomes (Schmidt et al. 2001).

In mammals, studies have shown that protein-protein interactions impact the rate and pattern of amino acid substitution in COX and CYC. The in vitro studies of Osheroff et al. (1983) documented decreased catalytic activity when CYC and COX from divergent taxa were cross-reacted. Analyses of human and orangutan cell lines containing heterospecific mtDNA demonstrated extensive disruption of the electron transport system function, including COX activity, relative to cell lines containing homospecific mtDNA (e.g., Barrientos et al. 2000; Dey et al. 2000; Bayona-Bafaluy et al. 2005). In addition, phylogenetic analyses have provided evidence for an accelerated rate of nonsynonymous substitutions in COX proteins among anthropoid primates. For example, the studies by Cann et al. (1984) and Adkins et al. (1996) found that increased rates of amino acid substitution in the mitochondrial-encoded COX2 subunit were associated with increased rates of substitution in CYC. More recently, Grossman et al. (2004) have reported an accelerated rate of nonsynonymous substitution in 9 of 13 COX genes. These subunits, including the mitochondrial-encoded COX1 and COX2 subunits, cluster together in the COX holoenzyme, and several key substitutions lie within 20 Å of the site where electrons are transferred from CYC to COX2 (Grossman et al. 2004). Finally, at the structural level, Schmidt et al. (2001) have shown that the rates of amino acid evolution are accelerated at interaction points for the mitochondrial-encoded subunits of COX. Collectively, these studies provide convincing evidence for the coevolution of the mitochondrial and nuclear genes involved in COX expression and function, a phenomenon that Grossman et al. (2004) suggest has been ongoing since the origin of mitochondria.

Tigriopus californicus, an abundant harpacticoid copepod found in high intertidal rock pools on the Pacific coast of North America, provides an alternative, non-mammalian model for studying the extent to which COX gene evolution is shaped by the interaction between mitochondrial- and nuclear-encoded proteins. Sequence divergence between populations of T. californicus is typically very high for mitochondrial-encoded COX genes. For example, Burton and Lee (1994) observed interpopulation sequence divergence in excess of 20% at the mitochondrial COI locus. The rates of synonymous and nonsynonymous substitution appear to be higher for mtDNA than for nuclear protein-coding genes in T. californicus (Willet and Burton 2004). Nevertheless, substantial divergence has also been documented for nuclear-encoded COX-related genes. Rawson et al. (2000) observed five fixed amino acid substitutions (∼5% divergence) among CYC coding sequences from copepods sampled from Abalone Cove (Los Angeles County), California, and Santa Cruz, California.

Functional studies have demonstrated that interactions between the divergent mitochondrial and nuclear genomes in T. californicus affect COX activity and individual fitness and therefore may also impact the patterns of amino acid substitution in the genes encoding COX subunits. For example, Edmands and Burton (1999) introduced population-specific mtDNA genotypes into the genomic background of alternative populations through repeated backcrossing. They observed a progressive decline in COX activity with increasing genomic introgression, particularly among crosses involving copepods sampled from Santa Cruz. Their results indicate that mtDNA-encoded COX proteins have decreased function when their nuclear counterparts are derived from divergent populations. Willet and Burton (2001, 2003) observed that the viability of CYC nuclear genotypes in the F2 generation of interpopulation crosses was highly dependent on the population source for the mtDNA. Using in vitro assays, Rawson and Burton (2002) found that CYC derived from a San Diego population yielded significantly higher activity in combination with COX from the same population compared to reactions with COX from Santa Cruz copepods, and vice versa. More recently, Harrison and Burton (2006) have employed site-directed mutagenesis to show that single amino acid substitutions at key residues in CYC differentially affect the rates of CYC oxidation by COX derived from divergent copepod populations. Taken together, these studies suggest that there has been substantial coevolution of the mitochondrial and nuclear COX-related genes at the interpopulation level in T. californicus.

The present study examines DNA-sequence variation among full-length COII sequences sampled from seven T. californicus populations. Despite the critical role that the COX2 subunit plays in cellular respiration, Burton et al. (1999) observed extremely high amino acid divergence among four COII sequences from San Diego and Santa Cruz copepods. We have expanded this data set to include COII sequences from additional individuals and populations. Further, we have hypothesized that the evolution of this gene has been shaped by the extensive interactions between COX2 and the nuclear-encoded COX subunits and CYC. For example, population-specific amino acid substitutions in the mitochondrial subunits of COX may be adaptive in the sense that they compensate for divergence in other COX subunits or CYC in order to maintain COX function. For amino acid residues that are involved in intergenomic coadaptation (i.e., at which compensatory changes have occurred), there should be evidence of positive Darwinian selection (Rand et al. 2004). One approach to detecting positive selection is to compare the rate of nonsynonymous to synonymous substitutions in a protein coding gene (Yang 2005). The ratio of nonsynonymous to synonymous substitutions (dN/dS = ω) is expected to equal 1 if nonsynonymous substitutions have no impact on fitness and are fixed at the same rate as synonymous substitutions. When nonsynonymous substitutions are deleterious, then the nonsynonymous rate will be lower than the synonymous rate and ω < 1. However, in cases where nonsynonymous substitutions are adaptive they should be fixed at a higher rate than synonymous substitutions and ω > 1. Thus, ω should be significantly >1 for amino acid residues in either nuclear or mitochondrial subunits which are directly involved in any intergenomic coadapted gene complex.
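The expectations for ω described above can be illustrated with a short sketch. The function names and the tolerance around 1 are hypothetical choices made for illustration only; formal inference in this study rests on likelihood ratio tests, not on a fixed cutoff.

```python
import math


def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio; infinite when dS is zero but dN is not."""
    if ds == 0:
        return math.inf if dn > 0 else math.nan
    return dn / ds


def regime(w: float, tol: float = 0.05) -> str:
    """Label the selective regime implied by a point estimate of omega.

    `tol` is an arbitrary illustrative band around 1, not a statistical test.
    """
    if math.isnan(w):
        return "undefined"
    if w > 1 + tol:
        return "consistent with positive selection"
    if w < 1 - tol:
        return "consistent with purifying selection"
    return "consistent with neutrality"
```

A point estimate alone does not establish significance; an estimate above 1 on a short branch, for instance, may simply reflect the absence of synonymous changes.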

However, the coevolution of nuclear and mitochondrial genes need not require positive selection. Instead, coevolution can occur through coordinated, reciprocal amino acid substitutions that simply change the functional constraints on interacting proteins (Rand et al. 2004). Under such a scenario, purifying selection is relaxed so that ω ∼ 1. We have used a maximum likelihood (ML) approach to examine whether there is evidence of positive selection or reduced selective constraint among codons in T. californicus COII. In particular, we have employed the program CODEML to obtain site- and lineage-specific estimates of the nonsynonymous to synonymous rate ratio (ω) for a set of 26 COII sequences drawn from seven highly divergent copepod populations. Although our analysis suggests that most of T. californicus COII is under strong purifying selection, we have identified a small set of codons that are likely to play a significant role in nuclear-mitochondrial coevolution in this species.

Materials and Methods

Isolation of T. californicus COII

Initial sequence information for a portion of the COII gene was obtained for T. californicus by employing the polymerase chain reaction (PCR) with oligonucleotide primers CO2a (CCACAAATTTCGGAGCATTGACC) and CO2b (ATAGGNCATCARTGATACTGA). These primers target conserved regions in the 3′ end of COX2 and were designed by aligning COII nucleotide sequences for human (X93334), mouse (J01420), cow (J01394), frog (M10217), sea urchin (X12631), fruit fly (U37541), Artemia (X69067), and honeybee (L06178) and by comparison to the conserved primer sequences (C2-J-3400 and C2-N-3661) given by Simon et al. (1994). DNA template was prepared for PCR by boiling individual copepods sampled from San Diego, California, in 30 μl of distilled water for 5 min. All 30 μl was subsequently used in a 50-μl PCR reaction with 50 pmol of primers CO2a and CO2b, 2.5 nmol of each dNTP, and 2.0 units of Taq DNA polymerase in the buffer supplied by the manufacturer (Promega). The reactions were denatured for 3 min at 94°C and then incubated for 35 cycles of 94°C for 30 s, 40°C for 30 s, and 72°C for 90 s.
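Primer CO2b contains the degenerate IUPAC symbols N and R, so it represents a small pool of oligonucleotides and not a single sequence. The sketch below counts and enumerates that pool. The helper names are hypothetical, and the primer string assumes that the space in the printed sequence is a typesetting artifact.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# CO2b as given in the text, with the internal space removed (assumption).
CO2B = "ATAGGNCATCARTGATACTGA"
```

Calling `degeneracy(CO2B)` multiplies four choices for N by two for R, so the pool holds eight sequences, whereas the non-degenerate CO2a primer has a degeneracy of one.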

The initial PCR reactions amplified several products including one that was ∼290 base pairs in size, as predicted by the position of the primers. This product was gel purified and direct sequenced once in each direction using the original PCR primers, a Big Dye sequencing kit (Applied Biosystems), and an ABI model 373 automated sequencer. From this sequence, two new PCR primers, CO2-LF (GAGATGCTGTCCCTGGGCGGCTAAATCA) and CO2-LR (AGTAACCAAAATTCGCACAGGCACATTAATAGG), were designed to amplify the whole mitochondrial genome following the methods of Nelson et al. (1996). DNA template was prepared for T. californicus individuals from the San Diego population as described above and used in long PCR amplifications using Expand Taq polymerase (Boehringer-Mannheim) with buffer system 3 and the protocol provided by the manufacturer. A large PCR product (approx 15 kb) was obtained by performing 30 cycles of 92°C for 30 s, 60°C for 30 s, and 68°C for 15 min. The whole mitochondrial amplification product was digested simultaneously with the restriction enzymes BamHI and HindIII and the resulting fragments were ligated into the vector pGEM-3z (Promega) digested with the same enzymes. The resulting recombinant plasmids were used to transform chemically competent JM109 cells (Promega) and positive colonies were selected by following the manufacturer’s blue/white screening protocol. Clones containing putative COII sequence were identified in PCR amplifications with a vector-specific primer, either M13 forward or M13 reverse, in combination with one of the two gene-specific primers, CO2a or CO2b. Two COII-positive clones, one containing the 5′ end and the other the 3′ end of the T. californicus COII gene, were thus identified and sequenced. When combined with our initial sequence obtained with primers CO2a and CO2b, these new sequences generated a full-length COII sequence from which the primers CO2e (AAACCTAAATTTATTGCACTAATCTGCC) and CO2f (TTAGGGTATGTGGTTTTACAGG) were designed to amplify the complete coding sequence (227 codons) of the T. californicus COII gene.

Analysis of COX2 Variation in T. californicus

We sampled two to four adult copepods from high intertidal pools at each of seven sites along the Pacific coast (Fig. 1). DNA template from individual copepods was prepared by the boiling method described above and used in a 50-μl PCR reaction with 50 pmol each of primers CO2e and CO2f, 2.5 nmol of each dNTP, and 2.0 units of Taq DNA polymerase in the buffer supplied by the manufacturer (Promega) with a MgCl2 concentration of 1.5 mM. The reactions were denatured for 3 min at 94°C and then incubated for 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 90 s. The resulting PCR products were spin column purified (Qiagen) and direct sequenced using the PCR primers CO2e and CO2f. The overlap of the sequence information generated in each direction was nearly complete for each copepod; consensus sequences for each individual were determined by comparing sequence chromatograms with the program Sequencher ver 3.0 (Gene Codes Corp.) and have been submitted to GenBank under accession numbers AY293745 to AY293770. Sequencher was also used to align the 26 sequences obtained in this study along with a full-length COII sequence for the congener T. japonicus (NC003979). The within-population polymorphism and between-population divergence at synonymous and nonsynonymous sites were estimated using the modified Nei-Gojobori method with Jukes-Cantor correction as implemented in the MEGA2 software package (Kumar et al. 2001). The program PAUP* (Swofford 1999) was used to conduct a heuristic parsimony search with simple stepwise addition of sequences and tree reconnection branch swapping in order to estimate the phylogenetic relationships among the sequences.
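The Jukes-Cantor correction mentioned above converts an observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). The sketch below shows only this correction step; it does not reproduce the modified Nei-Gojobori counting of synonymous and nonsynonymous sites performed by MEGA2, and the function name is hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance for an observed proportion of differences p.

    The correction is undefined once p reaches 0.75, the saturation point
    expected for unrelated sequences under this model.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Because the correction grows without bound as p approaches 0.75, a corrected distance can exceed one substitution per site even though the observed proportion of differences cannot: `jukes_cantor(0.6)` is about 1.21.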

Figure 1

A maximum parsimony phylogram based on 26 full-length nucleotide sequences for T. californicus COII sampled from seven populations, with T. japonicus COII serving as the outgroup. The phylogeny is a consensus of four trees, each 480 steps long. The approximate number of nucleotide substitutions is presented above each branch, with bootstrap support values >80% given in parentheses. The approximate number of amino acid substitutions is given below each branch. Inset: Geographic location of the seven populations of Tigriopus californicus from which COII gene sequences were sampled.

We examined variation in the nonsynonymous to synonymous rate ratio (ω) among sites and lineages in the estimated phylogeny using the ML approach initially described by Yang and Nielsen (2002). First, we applied a series of models in which ω varied across sites (codons) to the T. californicus COII data in order to test for the presence of codons under positive selection. These models were implemented using the program CODEML in the PAML software package (vers. 3.14 [Yang 1997, 2004]). The ML analysis was restricted to unique sequences so that some populations (Abalone Cove, Flat Rock, Morro Bay, and Santa Cruz) were represented by a single sequence, while other populations (La Jolla, Carmel, and San Diego) were represented by two or three sequences each. In the ML analysis, we first fit a one-ratio model (M0) that assumes ω is constant for all sites and lineages. We then fit a series of more complex models including model M1a, which assumes that there is a proportion (p0) of conserved sites at which ω << 1 while the remaining codons (p1) are neutral (ω1 = 1), and model M2a, which includes the same site classes as model M1a but adds a third class of codons, for which ω is a freely varying parameter and the proportion of which (p2) is estimated from the data. We also considered model M7, which assumes that the ω values are drawn from a beta distribution within the interval (0,1), and model M8, which adds one extra class of codons for which ω is a freely varying parameter and can exceed 1. The relative fit of these models was assessed using likelihood ratio tests (LRTs). For each pair of models to be compared, a test statistic was calculated as twice the difference between the log-likelihood values of the two models. This test statistic was compared to a chi-square distribution with the degrees of freedom equal to the difference in the number of parameters for the two models. Comparisons of model M2a to M1a and M8 to M7 provide for tests of selection, and when the alternative models (M2a or M8) indicate the presence of sites for which ω > 1, they can also be considered tests for positive selection.
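The likelihood ratio test described above can be sketched in a few lines. This is an illustration, not the CODEML implementation: the function names are hypothetical, and the chi-square tail probability is computed with a standard-library recurrence for integer degrees of freedom so that the example has no external dependencies.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting values: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lrt(lnl_null: float, lnl_alt: float, df: int) -> tuple[float, float]:
    """Return (2 * delta lnL, p-value) for a pair of nested models."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)
```

For example, a test statistic of 3.54 with two degrees of freedom gives a p-value of about 0.170, and a statistic of 0.86 with one degree of freedom gives about 0.354.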

We employed a second set of ML models, as implemented in CODEML, to examine the degree to which the ω ratio varies among lineages. The free-ratio model was used to estimate a unique value for ω along each branch in the T. californicus COII phylogeny. Comparison of this model to the one-ratio model (M0) by an LRT constitutes a test of the heterogeneity of the nonsynonymous to synonymous rate ratio among lineages. We then fit a series of two-ratio and four-ratio models in which ω for focal branches in our phylogeny was assumed to be different from the background ratio (ω0) estimated for the rest of the phylogeny. An LRT was used to compare the fit of each of these models relative to the single-ratio model (M0) and therefore represents a test of whether selection pressure varies along these specific branches in the phylogeny. When ω was observed to be greater than one for any focal branch, we also tested the relative fit of a model where ω for that branch was constrained to be one. An LRT between the constrained and the unconstrained models was then conducted as a test for positive selection along that branch.

We also applied the “branch-site” approach outlined by Yang and Nielsen (2002) and later modified by Yang (2004) to our T. californicus COII phylogeny. This approach tests for positive selection among residues along focal “foreground” branches in the underlying phylogeny. For each focal branch, we fit model A (Yang 2004), in which one class of sites was assumed to be conserved with ω0 << 1 and a second class of sites was assumed to be neutral with ω1 = 1, with the proportion of each class (p0 and p1) estimated from the data. The model, however, includes two additional site classes that occur at proportions p2 and p3. These sites are assumed to have ω0 or ω1, respectively, in the background lineages of the phylogeny but in specified “foreground” branches they have a ratio ω2 that is free to vary and can exceed 1. For each run of model A, an LRT was constructed in which the log-likelihood value obtained under model A was compared to the value obtained from the nearly neutral model (M1a). However, this test, called test 1 by Yang (2004), can potentially mistake reduced selective constraint for positive selection among residues on the foreground branch. Thus, when there were significant differences in the relative fit of model A and M1a, we also compared the log-likelihood of model A with a null model in which ω2 in the foreground branch was constrained to be equal to 1. The latter test, referred to as test 2 by Yang (2004), represents a more stringent test of positive selection. When the relative fit of the various models is significantly different, CODEML also implements a Bayes empirical Bayes procedure to infer which site class a particular codon most likely belongs to (Yang and Nielsen 2002; Yang 2004). Thus, we used CODEML to examine the posterior probability that particular codons belonged to site classes with ω << 1, ω ≈ 1, and ω > 1 in the branch-site models as well as the site-specific M1a and M2a models (above).
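The four site classes of model A can be made concrete with a small sketch. The split of the remaining probability mass between the two foreground classes, in proportion to p0 and p1, follows the parameterization commonly given for model A in the branch-site literature; that formula is an assumption here because it is not spelled out in the text, and the function name and input values are hypothetical.

```python
def model_a_site_classes(p0: float, p1: float) -> dict[str, float]:
    """Site-class proportions under branch-site model A (assumed form).

    p0: proportion of sites conserved on all branches (omega0 << 1).
    p1: proportion of sites neutral on all branches (omega1 = 1).
    The remainder, 1 - p0 - p1, is divided between classes whose background
    ratio is omega0 (p2) or omega1 (p3), and whose foreground ratio is omega2.
    """
    if p0 < 0 or p1 < 0 or p0 + p1 <= 0 or p0 + p1 > 1:
        raise ValueError("require p0, p1 >= 0 and 0 < p0 + p1 <= 1")
    rest = 1.0 - p0 - p1
    return {
        "p0": p0,
        "p1": p1,
        "p2": rest * p0 / (p0 + p1),
        "p3": rest * p1 / (p0 + p1),
    }
```

With illustrative inputs p0 = 0.83 and p1 = 0.04, the two foreground classes together account for 13% of sites, most of it in the class that is conserved on background branches.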

Results

We observed nucleotide substitutions at 290 of the 678 bases (43%) that comprise T. californicus COII. Among the 26 full-length sequences we obtained, the vast majority of these substitutions occurred between sequences sampled from different populations and the average pairwise sequence divergence was nearly 20% (range, 0.9 to 26.4%). In contrast, sequence polymorphism within populations was much lower. No intrapopulation variation was observed for the sequences drawn from four of the seven populations sampled (Table 1), and the average pairwise sequence polymorphism was only 0.18% (range, 0 to 1.04%). The low intra- versus high interpopulation divergence is reflected in the COII phylogeny estimated using maximum parsimony where the sequences from each population form distinct clades well supported by bootstrap analysis (Fig. 1). The rate of synonymous substitution greatly exceeded the rate of nonsynonymous substitution both within and between populations (Table 1). In some cases, substitutions occurred at nearly every synonymous site, so that estimates of the synonymous substitution rate among sequences sampled from populations such as La Jolla and Santa Cruz exceeded 1.

Table 1 Estimates of within-population nucleotide polymorphism and between-population nucleotide divergence: For each comparison the top value is for synonymous sites, while the lower value is for nonsynonymous sites

Despite the lower rate of nonsynonymous substitution, there was still extensive interpopulation sequence divergence at the amino acid level. We observed replacement substitutions at 38 of the 226 (17%) residues in T. californicus COX2. Although sequences from some populations differed by up to 30 amino acid substitutions, only three clades were supported by bootstrap analysis (Fig. 2). The sequences from Santa Cruz, Carmel, and Morro Bay formed one clade, those from Abalone Cove and Flat Rock formed a second clade, and those from San Diego and La Jolla formed the third. We labeled these clades the northern, central, and southern sequence clades, respectively.

Figure 2

Phylogenetic relationships among the inferred amino acid sequences for the 12 unique COII sequences from Fig. 1. Branch lengths are proportional to the number of amino acid replacements along each lineage in the phylogeny, and bootstrap support values for the three clades identified in the phylogeny are provided at the right. Focal branches used in the maximum likelihood analysis are indicated in boldface. Maximum likelihood estimate of the nonsynonymous/synonymous rate ratio for each branch in the phylogeny is given above each branch.

The site-specific maximum likelihood models we employed indicate that the vast majority of codons in T. californicus COII are under purifying selection. The estimate of ω obtained under the one-ratio model (M0) and thus averaged across all sites and lineages in the phylogeny was 0.030 (Table 2). The nearly neutral model (M1a) provided a much better fit to the data than the one-ratio model, by 24.5 log-likelihood units. This model suggested that nearly 96% of the codons in COII are under purifying selection (ω0 = 0.020), while 4% of the sites in T. californicus COII were under relaxed selective constraint (ω1 = 1). Although model M2a allows for sites under positive selection, none were detected, as there was little improvement in the fit of model M2a relative to model M1a and an LRT for these two models was not significant. Likewise, a comparison of models M7 and M8 provides for a test of positive selection but an LRT for these models was not significant.

Table 2 Log-likelihood values (l) and parameter estimates for the codon-based maximum likelihood models in which ω varies among sites only

The branch-specific analysis indicated that there is significant heterogeneity in the nonsynonymous to synonymous rate ratio among lineages in the T. californicus COII phylogeny (Table 3). The free-ratio model, which assumes a unique ω for every lineage fit the data significantly better than the one-ratio model (2Δl = 49.16; p < 0.0001; df = 20). Lineage-specific estimates of ω were < 1 with the exception of the branch leading to the Abalone Cove-Flat Rock (central) sequence clade (Fig. 2). The free-ratio model, however, is parameter rich and not amenable to hypothesis testing. Thus, based on previous studies of COX function in T. californicus, we focused specific attention on the branches leading to the northern, central, and southern amino acid clades. The two-ratio model, in which ω for each of these three branches was constrained to be equal but different from the background ratio (ω0), did not provide an improved fit relative to the one-ratio model (2Δl = 0.86; p = 0.354; df = 1). A four-ratio model in which ω for each of the three focal branches was free fit the data significantly better than the one-ratio model (2Δl = 29.02; p < 0.0001; df = 3). Under this model, ω was substantially < 1 for the branches leading to the northern and southern sequence clades, while ω = ∞ for the central sequence clade branch. However, the model in which ω was free to vary for the southern and northern branches but was constrained to be equal to 1 for the central branch was not significantly different from the unconstrained four-ratio model. Thus, our analysis provides no evidence for lineage-specific positive selection and the high ω value for the central branch that we observed is likely due to the fact that this branch subtends the outgroup in our analysis.

Table 3 Log-likelihood values (l) and parameter estimates for the codon-based maximum likelihood models in which ω varies among lineages within the T. californicus COII phylogeny

We also employed a set of branch-site models to examine whether there was among-site variation in selection pressure along each of the northern, southern, or central branches in the T. californicus COII phylogeny (Table 4). For the branch leading to the northern sequence clade, there was no significant difference between model A and the nearly neutral model (2Δl = 3.54; p = 0.170; df = 2). Similarly, model A for the southern branch did not fit the data better than model M1a. However, model A for the branch leading to the central sequence clade fit the data better than the nearly neutral model by > 14 log-likelihood units, a difference that was statistically significant (test 1; p < 0.0001; df = 2). Model A for this branch was also significantly better than the alternative model where ω for the central branch was constrained to be equal to 1 (test 2; 2Δl = 4.9; p < 0.027; df = 1). The results from model A for the central branch suggested that perhaps 13% of the codons in T. californicus COII may have been under positive selection along this branch. However, examination of the site class posterior probabilities indicated positive selection at only three sites (see Fig. 3) and only at a probability level of 0.50 < p < 0.60. The results from all of the maximum likelihood models discussed above were obtained consistently regardless of the substitution model or starting parameter estimates we used in the analysis.

Table 4 Parameter estimates and log-likelihood values (l) for the codon-based maximum likelihood models in which the ω ratio varies among sites and branches

Figure 3

Consensus COX2 amino acid sequences for each of seven T. californicus populations. For comparison the sequences for T. japonicus and bovine COX2 (accession no. J01394) are also included in the alignment. This alignment indicates that there has been a three-amino acid insertion in the copepod sequences between residue 114 and residue 115 in the bovine sequence. These inserted residues were numbered 114a–c in our analysis in order to keep the numbering consistent with bovine COX2. The relative positions of the extramembrane, transmembrane, and matrix domains, as well as the “heme-patch” and CuA-binding residues, are indicated above the alignment, while residues that have been implicated in either direct transfer of electrons from cytochrome c or binding of cytochrome c are in boldface. Sites experiencing either relaxed selective constraint or positive selection, as inferred by the site-specific or branch-site model, are indicated by open and filled diamonds, respectively.

Discussion

Population genetic studies have repeatedly demonstrated that, for both nuclear and mitochondrial genes, genetic divergence between some T. californicus populations is extremely high and temporally stable even over short geographic distances (Burton 1986, 1997, 1998; Burton and Lee 1994; Rawson et al. 2000; Willet and Burton 2001). In the present study, we have shown that within-population sequence polymorphism at the mitochondrial COII locus was virtually nonexistent. In contrast, sequence divergence between populations in the central and northern sequence clades exceeded 20%. Thus, our observations for mtCOII are consistent with the high levels of variation observed at other loci for this species and are further evidence that these populations have long histories of independent evolution.

Despite the critical role that the COX2 subunit plays in electron transfer, we observed amino acid substitutions at nearly 17% of the residues that make up this protein in T. californicus. These replacement substitutions are not evenly distributed across the gene. The COX2 subunit consists of two extramembrane domains, the second of which is a large C-terminal domain of 143 residues (Fig. 3). Experimental and molecular modeling studies (e.g., Millett et al. 1983; Holm et al. 1987; Wilson and Cameron 1994; Witt et al. 1998; Roberts and Pique 1999; Wang et al. 1999) have identified 18 residues in this C-terminal domain that either interact directly with cytochrome c or act as the binding site for a copper (CuA) atom and thus play a central role in the transfer of electrons from cytochrome c to oxygen during oxidative phosphorylation. All 18 of these residues were fixed among the T. californicus and T. japonicus sequences we analyzed. Outside of these residues, we observed replacements at 31 of 118 (26%) residues in the large C-terminal domain. In contrast, amino acid substitutions were observed at only 10% (4 of 42) and 6% (1 of 14) of the residues in the two transmembrane and mitochondrial matrix domains, respectively.

We are particularly interested in the degree to which the amino acid substitutions in T. californicus COII are the result of or facilitated by interactions between COX2 and nuclear-encoded COX subunits and cytochrome c. The amino acid replacements we have observed may be adaptive if they compensate for divergence in other COX-related proteins in order to maintain COX function and such compensatory mutations are expected to drive nuclear-mitochondrial coadaptation (Rand et al. 2004). Further, any residues directly involved in intergenomic coadaptation should show evidence of positive selection. Although the site- and lineage-specific maximum likelihood models we employed failed to detect positive selection, the branch-site models suggested that positive selection has occurred along the branch leading to the central sequence clade in our COII phylogeny. However, even though model A for this branch was statistically significant under both test 1 and test 2 recommended by Yang (2004), and indicated that perhaps as many as 13% of the codons in COII may have been under positive selection along this branch, examination of the site class posterior probabilities provided only weak support for positive selection at three sites (shown in Fig. 3).

The lack of strong support for positive selection acting on particular residues in T. californicus COII is likely to be a function of the power of our analysis. Anisimova et al. (2001, 2002) have shown that when codon substitution models are analyzed in a ML framework, the power of the analysis is dependent on the sequence length, the level of divergence in the data set, and the number of sequences analyzed. In the present study, the number of sequences analyzed is likely to have reduced the effective power of our analysis. Anisimova et al. (2002) have suggested that power under the site-specific models is low when there are only 5 or 6 sequences in the data set but increases to 100% with 17 sequences. The branch-site models are intended to have greater power to detect positive selection, relative to the site-specific models where positive selection is inferred only when the average rate of nonsynonymous substitution (dN) for a particular codon is higher than the average rate of synonymous substitution (dS) when averaged over all lineages (Yang and Nielsen 2002). This appears to be the case in our analysis; the site- and lineage-specific models failed to detect positive selection, while the branch-site models suggest that some codons are under positive selection along the central branch in our phylogeny. Even so, many data sets may not contain enough information for the empirical Bayes analysis to identify specific residues under positive selection (Yang and Nielsen 2002). Because the branch-site model focuses on only one lineage, there is no opportunity for multiple changes at each site to accumulate in small data sets, thereby reducing the ability of the method to reliably obtain parameter estimates. Our analysis included 26 sequences of which 12 were unique. More importantly, in the central sequence clade there were only two unique sequences. Thus, the detection of positive selection on T. californicus COII could be enhanced by increasing the number of sequences in the data set as a whole and, particularly, by increasing the number of unique sequences in the central sequence clade.

Alternatively, the lack of detectable positive selection may have a biological basis. ML approaches have been highly successful at detecting positive selection for three categories of genes: genes encoding proteins involved in host defense or pathogen avoidance of host defenses, genes encoding proteins or pheromones involved in reproduction, and genes that have acquired a new function, often as a result of a gene duplication event (Yang 2005). In the case of mitochondrial genes, clonal inheritance reduces the effective population size for genes contained in organellar genomes relative to nuclear genes. Reduced effective population size in turn increases the likelihood that deleterious mutations become fixed in a population by chance. In addition, the lack of recombination in the mitochondrial genome increases the opportunity for deleterious mutations to become fixed as part of a selective sweep, even when selection acts outside of COII. Thus, the lack of recombination in mitochondrial genomes, along with clonal inheritance, suggests that cytonuclear coadaptation should be asymmetric, with compensatory changes and positive selection more likely to occur in the nuclear genome (Rand et al. 2004). To date, positive selection on mitochondrial variation has been documented in very few cases (e.g., Ruiz-Pesini et al. 2004; reviewed by Rand et al. 2004).

However, the coevolution of nuclear and mitochondrial genes can proceed without coadaptation and positive selection. Nuclear-mitochondrial coevolution may, instead, occur through coordinated amino acid substitutions in both genomes that alter the functional constraints on the interacting proteins by relaxing purifying selection (Rand et al. 2004). In this regard, our ML analysis suggests that there are sites in T. californicus COII where purifying selection has been relaxed (see Fig. 3). It is important to note that CODEML is not designed to test for relaxed selective constraint because neutral evolution is the null model on which the method tests for positive selection. Even so, the site-specific, nearly neutral model (M1a) provided a much better fit to the data than did the one-ratio model (M0) and examination of the site class posterior probabilities under this model identified four sites that had a high probability of belonging to the site class with ω = 1 (see Fig. 3). Thus, for these sites the null model of neutral evolution could not be rejected even though our analysis indicates that there is strong purifying selection (ω = 0.20) for the rest of the codons in T. californicus COII.
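
Comparisons between nested codon models, such as M0 and M1a, are commonly made with a likelihood ratio test, in which twice the difference in log-likelihood is compared to a chi-square distribution. The sketch below illustrates the arithmetic only. The log-likelihood values are hypothetical rather than values from our analysis, the function names are our own, and the single degree of freedom used for the M0 versus M1a comparison is an assumption based on the difference in the number of free parameters. The chi-square tail probability is computed from closed forms that hold only for 1 or 2 degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, approximate p-value) for two nested models."""
    # Numerical error can make the statistic slightly negative; clamp at zero.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods (NOT from this study).
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-998.0, df=1)
```

With these illustrative numbers the statistic is 4.0, which falls just beyond the conventional 5% critical value of 3.84 for one degree of freedom.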

If our hypothesis is correct that nuclear-mitochondrial interactions have played a role in the molecular evolution of COII, then there should be evidence that the four residues under relaxed selective constraint and the three residues potentially under weak positive selection in the central sequence clade interact in some way with residues in nuclear-encoded genes. To address this, we used information from experimental and molecular modeling studies (e.g., Millett et al. 1983; Holm et al. 1987; Wilson and Cameron 1994; Roberts and Pique 1999) and the crystal structure for bovine COX (Protein Data Bank Accession 1OCC; Tsukihara et al. 1996) to ask whether such interactions are likely. Following the reasoning of Schmidt et al. (2001) we defined interacting residues as those in COX2 which are ≤4 Å apart from a residue belonging to any nuclear-encoded COX subunit in the crystal structure for COX. Although by this definition variable residues in T. californicus COX2 interact with no fewer than six nuclear-encoded COX subunits, three of the seven residues where there is reduced selective constraint or weak evidence of positive selection interact with only a single nuclear subunit, COXVIC (Table 5).
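
The distance criterion described above can be sketched as a simple search over atomic coordinates. The example is a minimal illustration under stated assumptions: distance is measured between any pair of atoms (the criterion could equally be applied to specific atoms of each residue), the chain labels "M", "N1", and "N2" and all coordinates are hypothetical placeholders rather than entries from the 1OCC structure, and the function name is our own.

```python
import math


def interacting_residues(atoms, focal_chain, partner_chains, cutoff=4.0):
    """Map each focal-chain residue to the partner chains it contacts.

    atoms -- iterable of (chain_id, residue_number, (x, y, z)) tuples,
             with coordinates in angstroms
    A residue is counted as interacting with a partner chain when any of its
    atoms lies within `cutoff` angstroms of any atom in that chain.
    """
    atoms = list(atoms)
    focal = [a for a in atoms if a[0] == focal_chain]
    partners = [a for a in atoms if a[0] in partner_chains]
    contacts = {}
    for _, residue, xyz in focal:
        for partner_chain, _, partner_xyz in partners:
            if math.dist(xyz, partner_xyz) <= cutoff:
                contacts.setdefault(residue, set()).add(partner_chain)
    return contacts


# Hypothetical toy coordinates (NOT taken from the bovine COX structure).
atoms = [
    ("M", 10, (0.0, 0.0, 0.0)),      # focal subunit, residue 10
    ("M", 11, (10.0, 0.0, 0.0)),     # focal subunit, residue 11
    ("N1", 5, (3.0, 0.0, 0.0)),      # 3.0 angstroms from residue 10
    ("N2", 7, (10.0, 3.0, 2.0)),     # about 3.6 angstroms from residue 11
    ("N2", 8, (20.0, 20.0, 20.0)),   # too distant to contact either residue
]
contacts = interacting_residues(atoms, "M", {"N1", "N2"})
```

The all-against-all comparison is adequate for a single subunit; for a full holoenzyme a spatial index would be a more efficient choice.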

Table 5 Variable residues in T. californicus COX2 for which the site-specific or branch-site models indicated that there was relaxed selective constraint or possibly positive selection and whether these residues interact with other mitochondrial-encoded and nuclear-encoded COX subunits (intergenic contacts)

Evidence that these interactions play an important role in maintaining the structure and activity of T. californicus COX comes from functional studies by Edmands and Burton (1999). They introduced population-specific mtDNA genotypes into the genomic background of alternative populations through repeated backcrossing and observed a progressive decline in COX activity with increasing genomic introgression. The effect was particularly strong among matings between copepods sampled from Abalone Cove and Santa Cruz, California. Our observation of an elevated rate of nonsynonymous substitutions along the branch leading to the Abalone Cove-Flat Rock sequence clade is consistent with the results of Edmands and Burton (1999).

Intergenomic coevolution may also be the result of interactions between COX2 and CYC. A number of residues that affect enzyme-substrate binding have been identified in both COX2 and CYC. These include a series of surface lysine residues in CYC and surface carboxyl-bearing residues in COX2 (Wang et al. 1999; Zhen et al. 1999). These residues are invariant among T. californicus CYC (Rawson et al. 2000) and COX2 sequences (this study; Fig. 3). However, we observed an amino acid substitution within the “heme-patch” region of COX2 that may be under positive selection. This region consists of alternating aromatic and acidic amino acids from Gly101 to Glu109 that form a hydrophobic loop that overlies the CuA-binding site (Roberts and Pique 1999) and interacts directly with the exposed heme edge of cytochrome c. Although amino acid residues in the heme-patch are typically highly conserved, we observed multiple substitutions, including the replacement of Ser107 by Thr107 among the sequences from the Abalone Cove and Flat Rock populations, and our ML analysis indicated that the Thr107 residue may have been the result of positive selection.

The presence of sites that may be directly involved in intergenomic coevolution is in agreement with the experimental results of Rawson and Burton (2002), Willett and Burton (2001, 2003), and Harrison and Burton (2006). Rawson and Burton (2002) have shown, using in vitro enzyme assays, that COX derived from SD and SC populations had significantly higher activity with cytochrome c derived from their respective populations, and that activity was consistently reduced when COX and cytochrome c came from the alternate population. Harrison and Burton (2006) extended this analysis to include a COX variant from copepods sampled from Punta Morro, Baja California, Mexico. Even though CYC isolated from San Diego and Punta Morro copepods differs by only a single amino acid substitution, the rate of oxidation of San Diego CYC decreases significantly when reacted with Punta Morro COX. Willett and Burton (2001) estimated the relative viability of population-specific cytochrome c alleles in the F2 generation of interpopulation crosses. They found that genotypes homozygous for Abalone Cove- or Santa Cruz-specific cytochrome c alleles had lower survival, relative to control crosses, when they occurred in conjunction with the mitochondrial genome, and thus the COII genotype, of the alternate population. Our observation of seven sites with an elevated frequency of nonsynonymous substitution, particularly among sequences from the Abalone Cove and Flat Rock populations, is highly consistent with the results from these functional assays. However, definitive evidence of the impact of nonsynonymous substitutions on the shape of T. californicus COX2 will likely require obtaining the crystal structure for T. californicus COX.