Introduction

Since their discovery, numerous functional and structural studies have been carried out on the oxygen carriers hemerythrins (Hrs), hemoglobins, and hemocyanins, three nonhomologous classes of metalloproteins involved in reversible oxygen binding (Terwilliger 1998; Kurtz 1999). The molecular evolution of the widely distributed hemoglobins has been extensively investigated (Moens et al. 1996). Similar studies have been done on the hemocyanins of arthropods and mollusks (van Holde et al. 2001; Burmester 2002). Conversely, a molecular evolutionary analysis has never been performed on the Hrs.

Hr is present in the phyla Sipuncula, Brachiopoda, and Priapulida (Klippenstein 1980; Terwilliger 1998). It has also been recorded in annelids belonging to the Magellonidae, Nereididae, and Glossiphoniidae families (Jushi and Sullivan 1973; Manwell and Baker 1988; Coutte et al. 2001). The erythrocytes that circulate in the coelomic fluid and in the vascular system of the sipunculans and brachiopods contain intracellular polymeric Hr (Robitaille and Kurtz 1988; Cutler 1994) that exhibits either a trimeric or an octameric quaternary structure. The trimeric form has been found solely in some sipunculan species (Addison and Bruce 1977; Smith et al. 1983; Wilkins and Harrington 1983; Uchida et al. 1990), whereas the octameric structure is present in both the sipunculans (Ferrell and Kitto 1971; Wilkins and Harrington 1983; Demuynck et al. 1991) and the brachiopods (Satake et al. 1990; Zhang and Kurtz 1991). Strong intersubunit interactions are responsible for the remarkable stability of the oligomers (Holmes and Stenkamp 1991).

Myohemerythrin (myoHr), a cytoplasmic monomeric protein, is present in the muscles of sipunculans and annelids (Hendrickson, Klippenstein and Ward 1975; Ward et al. 1975; Takagi and Cox 1991). Finally, Vergote and coworkers (2004) discovered an additional member of the Hr family, called neurohemerythrin (nHr), expressed in the central nervous system of the leech Hirudo medicinalis (Annelida, Hirudinidae).

Both Hr and myoHr have an active oxygen-binding site containing two iron atoms bound to specific amino acids (Holmes and Stenkamp 1991). Polypeptide chains of 113 and 118 amino acids constitute sipunculan Hr and myoHr, respectively. They fold into two pairs of antiparallel helices (A, B, C, and D) that are stabilized by hydrophobic interactions as well as by the coordinated irons (Holmes and Stenkamp 1991; Zhang and Kurtz 1992). Until now, few Hr and myoHr sequences of sipunculans, brachiopods, and annelids have been determined, and no data are available for the Priapulida phylum. This paucity of molecular information has made it very difficult to properly investigate the evolution of the Hr family even at the phylum level. To address this limitation, we focused our work on the Sipuncula by sequencing, completely or partially, seven new Hr or myoHr cDNAs. Our findings and analysis provide new comparative data that significantly increase the set of known Hr and myoHr sequences. Moreover, the new data allowed us to study for the first time the molecular evolution of this family of proteins in the Sipuncula phylum. Finally, we discuss the possible scenarios that characterized the evolution of Hr/myoHr by comparing our results with the recently proposed phylogenies for the Sipuncula phylum (Maxmen et al. 2003; Staton 2003).

Materials and Methods

Biological Material

Coelomic erythrocytes from Sipunculus nudus Linnaeus 1766 and Golfingia vulgaris vulgaris de Blainville 1827 were isolated from living worms provided by the Station Biologique de Roscoff (France) and from living S. nudus specimens obtained from fishermen in Venice (Italy).

Hr Primer Design

Degenerate forward and reverse Hr-specific primers were designed according to an amino acid sequence multiple alignment obtained from the Hr sequences available in the Swiss-Prot database: Phascolopsis gouldii (P02244), Themiste zostericola (P02245), T. dyscriptum (P02246), and Siphonosoma cumanense (P22766). The following two primers (MWG Biotech) were then used for PCRs on a cDNA template: HR3A, 5′-DAT YTT NCC YTT RTA YTT RAA RTC-3′ (forward), and HR5A, 5′-GGN TTY CCN ATD CCN GAY CC-3′ (reverse).
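As a rough illustration of what degeneracy means for the two primers above, the following sketch counts how many distinct oligonucleotides each degenerate sequence stands for, assuming the standard IUPAC nucleotide ambiguity codes. The function name `degeneracy` and the lookup table are introduced here for illustration only and are not part of the original protocol.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can stand for
# (standard ambiguity codes; this table is an illustrative helper).
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    return prod(IUPAC_CHOICES[base] for base in seq)


# Primer sequences as given in the text (5' to 3').
HR3A = "DAT YTT NCC YTT RTA YTT RAA RTC"
HR5A = "GGN TTY CCN ATD CCN GAY CC"

for name, primer in (("HR3A", HR3A), ("HR5A", HR5A)):
    length = len(primer.replace(" ", ""))
    print(f"{name}: {length} nt, {degeneracy(primer)}-fold degenerate")
```

Under these assumptions, HR3A (24 nt) and HR5A (20 nt) each correspond to a pool of 768 distinct oligonucleotides.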

Total RNA Extraction and cDNA Synthesis

Erythrocytes from coelomic fluids of S. nudus and G. v. vulgaris were separated by centrifugation for 5 min at 2000g and crushed in liquid nitrogen. Total RNA was recovered and extracted using Trizol Reagent (Gibco). Reverse transcription was initiated directly on total RNA, without further purification, with the oligo dT CTC CTC TCC TCT CCT CTT recommended by the Promega reverse transcriptase kit protocol. Moreover, a pool of total RNA was extracted from the intestinal tube tissue of S. nudus to synthesize a second cDNA template.

Hr Amplification and Sequencing

Each partial myoHr or Hr cDNA was amplified by PCR using a Perkin–Elmer GeneAmp PCR System 2400. PCRs were carried out as follows: initial denaturation at 96°C for 5 min, then 35 cycles consisting of 96°C for 50 s, 50°C for 50 s, and 72°C for 50 s. The reaction was completed by an elongation step of 10 min at 72°C. Amplifications were carried out in 25-μl reaction mixtures containing 10–50 ng of cDNA target, 50–100 ng of each degenerate primer, 200 μM dNTPs, 2.5 mM MgCl2, and 1 unit of Taq DNA polymerase (Promega). PCR products were visualized on a 1% agarose (Eurobio) gel under UV radiation. Gel slices containing DNA fragments of the expected size (∼200 bp) were collected and subsequently purified on Ultrafree-DA (Millipore). PCR products were then cloned using a TOPO-TA Cloning Kit (Invitrogen). Purified plasmids containing the Hr insert were sent to the Biotechnology Center CRIBI (University of Padua, Italy) for sequencing. The 3′ and 5′ end coding sequences were obtained by 5′/3′ RACE (Roche) following the protocols provided with the kit.

Hr Molecular Phylogenetic Analysis

Amino acid sequence multiple alignments were performed using the ClustalW software (Thompson et al. 1994). The analyzed data set includes the Hr, nHr, and myoHr amino acid sequences of the following species: annelids Periserrula leucophryna (AY312845), Hirudo medicinalis (AY521548), Theromyzon tessulatum (AF279333), and Nereis diversicolor (P22761, S38261); brachiopods Lingula reevii (P23543, P23544) and Lingula unguis (P22764, P22765); sipunculans Phascolopsis gouldii (P27686, P02244), Themiste zostericola (P02247, P02245), Themiste dyscriptum (P02246), and Siphonosoma cumanense (P22766); and the seven new Hr and myoHr amino acid sequences presented under Results. Phylogenetic studies were performed according to the Bayesian inference (BI), maximum likelihood (ML), and maximum parsimony (MP) methods (Felsenstein 2003). BI analysis was performed with the MrBayes 2.1 program (Huelsenbeck and Ronquist 2001). The JTT substitution matrix (Jones, Taylor, and Thornton 1992) was used in the reconstruction, while site heterogeneity was modeled with a four-category Γ distribution. The Metropolis-coupled Markov chain Monte Carlo sampling approach was used to calculate posterior probabilities. Prior probabilities for all trees were equal, starting trees were random, tree sampling was done every 100 generations, and burn-in values were determined empirically from the likelihood values. To check for consistency of results, four heated Markov chains were run simultaneously for 1 million generations.

The ML analysis was performed with the PHYML 2.4 program (Guindon and Gascuel 2003) using the parameters listed above for the BI analysis. MP analysis was done using the program PHYLIP 3.6 (Felsenstein 2002).

Nonparametric bootstrap resampling (BT) (Felsenstein 1985) was performed to test the robustness of the tree topologies obtained from MP and ML analyses (1000 replicates). The tree topologies were visualized with the Treeview 1.6.6 and NJplot programs (Page 1996; Perrière and Gouy 1996).
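The resampling step behind the nonparametric bootstrap can be sketched as follows. This is a minimal illustration, not the procedure implemented in PHYML or PHYLIP: alignment columns are drawn with replacement to build each pseudo-replicate, and a tree would then be inferred from every replicate. The toy alignment, the sequence names, and the function name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment (not real Hr data); all rows have equal length.
toy = {
    "seqA": "MGFPIPDP",
    "seqB": "MGFEIPEP",
    "seqC": "MPFDIPEP",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicate alignments generated")
```

The support value of a clade is then the percentage of replicate trees in which that clade appears.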

Finally, genetic distances were calculated and used to discuss the evolution of G. v. vulgaris and S. nudus Hr.

Molecular Adaptation

Selective pressures acting on the myoHr and Hr proteins were investigated by considering the nonsynonymous-to-synonymous substitution rate ratio ω (=dN/dS). We followed the guidelines provided by Yang and Bielawski (2000) to test whether positive (ω > 1), neutral (ω = 1), or purifying selection (ω < 1) characterized the evolution of these oxygen carriers. In particular, we used the likelihood approach described by Yang et al. (2000). This approach relies on various site-specific likelihood models describing the distribution of ω among sites. We applied the M0 (one-ratio), M1 (neutral), M2 (selection), M3 (discrete), M7 (β), and M8 (β & ω) models to assess negative, positive, or neutral selection acting at each codon (Yang et al. 2000). A brief description of these alternative models is provided under Results. All parameters required by the above-mentioned models were computed with the PAML program (Yang 2002). Multiple runs were done, starting with different ω values, to avoid being trapped in local optima of the likelihood surface (Yang 2002). The fit of the different models to our data set was tested by likelihood ratio tests (LRTs) comparing the alternative nested evolutionary models. Analysis was restricted to the nucleotide sequences determined so far, i.e., the new myoHr and Hr nucleotide sequences of S. nudus and G. v. vulgaris presented in this paper and those of the sipunculan P. gouldii Hr (AF220529) and the annelid P. leucophryna myoHr (AY312845), H. medicinalis nHr (AY521548), and T. tessulatum myoHr (AF279333).

Results

Identification and Characterization of the Hr of S. nudus and G. v. vulgaris

We sequenced an S. nudus cDNA coding for a myoHr (119 amino acids) (AJ632198) and two S. nudus cDNAs coding for two Hrs (118 amino acids) that we named SNa (AJ632021) and SNb (AJ632197). We also sequenced two G. v. vulgaris cDNAs coding for two Hrs (113 amino acids) named Gv1 (AJ632199) and Gv2 (AJ632200). Finally, we partially sequenced two G. v. vulgaris cDNAs coding for two Hrs that we named Gv3 (AJ632201) and Gv4 (AJ632202) (Fig. 1).

Figure 1

Alignment of sipunculan hemerythrins and myohemerythrins. The alignment was obtained using the ClustalW software (Thompson et al. 1994). (*) Conserved amino acids; (:) very similar amino acids; (.) similar amino acids. The residues involved in iron binding are shaded gray. Residues in position 91, which is possibly under positive selection, are white on a black background. The five amino acids, located between C and D α-helices, present only in myoHrs and in S. nudus Hrs, are in boldface. a, b, c, and d refer to the α-helices (Holmes and Stenkamp 1991).

The pairwise comparisons of the deduced amino acid sequences with the sipunculan Hr/myoHr sequences available in the databases revealed a high percentage of identity (S. nudus Hrs vs. Hrs, 45%; G. v. vulgaris Hrs vs. Hrs, 80%; S. nudus myoHr vs. myoHrs, 60%). These new sequences were unambiguously identified as Hr and myoHr because they share fundamental molecular signatures with the Hrs/myoHrs characterized so far (Fig. 1). In particular, the amino acid sequences show several features that are known to be conserved in the Hr protein family (Coutte 2001). All of the residues involved in iron binding (His-25, His-55, Asp/Glu-59, His-76, His-80, His-109, Asp-114) (Xiong et al. 2000; Coutte 2001) are strictly conserved, and their spacing agrees with the consensus spacing for the iron ligands in Hr and myoHr. Most of the residues (Pro-5, Pro-7, Trp-10, Phe-14, Asp-22, Phe-29, Tyr-69, Phe-115, Tyr-117) that have been indicated (Xiong et al. 2000; Coutte 2001) as important for structural maintenance are conserved in both S. nudus and G. v. vulgaris Hrs. Lys-120 is conserved only in G. v. vulgaris Hr, while in both subunits of S. nudus Hr it is replaced by a Leu. The residues that form the O2-binding pocket (Phe-56, Trp-105, Leu-106, and Ile-110) are fully conserved in the S. nudus and G. v. vulgaris Hrs (Xiong et al. 2000; Coutte 2001; Farmer et al. 2001). This group of residues also includes Leu-28, which is conserved in G. v. vulgaris, while in both subunits of S. nudus Hr it is replaced by the very similar Ile residue, as found in T. zostericola myoHr. The distance matrix obtained by applying the Dayhoff matrix (Dayhoff et al. 1978) to the sipunculan myoHrs/Hrs is presented in Table 1. The average distance among all the sipunculan Hrs/myoHrs is 0.7624 ± 0.3543, while the average distances restricted solely to Hrs or myoHrs are 0.6616 ± 0.4268 and 0.5471 ± 0.0905, respectively. The minimum distance is that between P. gouldii Hr and T. zostericola Hr. Gv1 vs. Gv4 and Gv2 vs. Gv3 also exhibit distances (0.1204 and 0.1083) clearly smaller than the Hr average distance.
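The percentage identities quoted in this section come from pairwise comparisons of aligned sequences. A minimal sketch of such a calculation is given below, under the assumption that columns containing a gap in either sequence are ignored; other conventions exist, and the original text does not state which one was used. The function name and the two aligned fragments are hypothetical and serve only as an example.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the ungapped aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are skipped
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not taken from Fig. 1).
row_1 = "MGFPIPDPYVW-DESF"
row_2 = "MGFEIPEPYKWHDESF"
print(f"{percent_identity(row_1, row_2):.1f}% identity")
```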

Table 1 Sipunculan Hr and myoHr distance matrix

Phylogeny

Preliminary analyses that included the brachiopod myoHrs revealed that these proteins were strongly divergent from annelid–sipunculan myoHrs/Hrs. The branch connecting the brachiopod clade to the root of the annelid–sipunculan clade was at least 6.2 times longer than any other internal branch of the tree (data not shown). These sequences were therefore excluded from the final analyzed data set to avoid possible effects due to long-branch attraction.

The tree obtained by applying the ML to the multiple alignment in Fig. 1 is presented in Fig. 2. The MP cladogram differs from this tree in the placement of N. diversicolor myoHrb close to N. diversicolor myoHra (74% BT support) and P. gouldii myoHr close to T. zostericola myoHr (64% BT support). BI analysis is largely congruent with the ML tree. However, BI topology shows a monophyletic sipunculan myoHr group (0.72 BI support) with P. gouldii myoHr close to T. zostericola myoHr (1.00 BI support). Moreover, the S. nudus Hrb + S. nudus Hra clade branches off as sister to all other sipunculan myoHrs/Hrs, with the latter group receiving very poor BI support (0.52).

Figure 2

Evolutionary relationships among the annelid and sipunculan Hrs and myoHrs. This is the best likelihood tree (−lnL = 2495.688319) obtained using the PHYML program (Guindon and Gascuel 2003). The bar represents 0.1 substitutions per site. BT values, expressed as percentages, are shown as plain (ML) or underlined (MP) numbers. BI values are presented as fractions. The portion relative to the octameric Hrs is presented at a higher magnification to provide a clear representation of the phylogenetic relationships among these closely related sequences.

ML and MP analyses support a monophyletic sipunculan Hr group. However, the most basal node does not receive statistical support. Sipunculan Hrs having an octameric quaternary structure are resolved as a single group in all analyses, and this finding is strongly supported by BI and BT values (1.00; 100%). The octameric clade includes the Hr sequences of G. v. vulgaris, T. zostericola, T. dyscriptum, and P. gouldii. T. dyscriptum Hr is the sister sequence of this group. The other sequences are clustered in pairs (e.g., G. v. vulgaris Hr1 + G. v. vulgaris Hr4) that are well supported by BI and BT values (Fig. 2); however, the relationships among these pairs are not well resolved, as shown by the low BI and BT support for this portion of the tree. All analyses identify a very close phylogenetic relationship between Gv1–Gv4 and Gv2–Gv3. S. cumanense Hr, which has a trimeric quaternary structure, is placed as the sister group of the octameric clade, and this relationship receives strong BT support only in the MP analysis. The S. nudus Hrs, whose quaternary structure is not well resolved (see Discussion), are recovered in all analyses as the most divergent sequences within the sipunculan Hr clade. The sipunculan myoHrs do not form a monophyletic group in ML and MP analyses. Finally, all phylogenetic reconstructions recover an annelid myoHr clade receiving moderate to strong statistical support.

Alignment Analysis

The sipunculan and annelid Hr and myoHr multiple alignment is presented in Fig. 1. In our analysis we assumed that myoHrs and Hrs have a common gene ancestor, an assumption supported by the numerous conserved amino acid residues shared by these proteins. Remnants of the common ancestral sequence can also be identified in the different symplesiomorphic amino acid residues shared by the sipunculan octameric/trimeric Hrs and myoHrs (see legend to Fig. 1). The S. nudus Hrs, whose quaternary structure is not yet resolved (see Discussion), share the identical symplesiomorphic residues with trimeric Hrs.

Molecular Adaptation

Parameter estimates for the site-specific models applied to myoHrs and Hrs are presented in Table 2. The one-ratio model (M0) gives a log likelihood of lnL = –2644.667 with ω = 0.0895. This indicates the dominating role of purifying selection in the evolution of myoHrs and Hrs. Model M1 (neutral) assumes two site classes in the sequence: conserved sites with ω0 = 0 and neutral sites with ω1 = 1. This model has the same number of parameters as M0 (one ratio) but fits the data worse, with a log likelihood of lnL = –2667.795. Model M2 (selection) adds another site class to M1 (neutral), with a free ω ratio estimated from the data, thus allowing for the possibility of positive selection. Parameter estimates did not identify sites having ω > 1 (Table 2). The M2 model fits the data better than the M0 and M1 models. Moreover, the LRTs (M2 vs. M1, M2 vs. M0) are highly significant (p = 0). Model M3 (discrete) assumes three site classes with the proportions (p0, p1, p2) and ω ratios (ω0, ω1, ω2) estimated from the data. The estimates suggest that sites are under extremely strong (25.82%), very strong (48.55%), or strong (25.63%) purifying selection. M3 fits the data significantly better than any of the simpler models M0, M1, and M2 (p ≤ 0.00325).

Table 2 Adaptive evolution parameters estimate for annelid and sipunculan myoHrs and Hrs

M7 (β) specifies that individual codons take 1 of 10 categories of ω, all estimated from the data, but where no category has ω > 1, so that the model does not allow positive selection. M8 (β & ω), however, allows positive selection by specifying an 11th category of codons in which ω can exceed 1.0. The estimates suggest that 1.7% of sites are under diversifying selection with ω = 1.95595. However, only position 91 of the multiple alignment (S, K, G, R, A, H, D) (Fig. 1) receives support by Bayesian posterior probability (Table 2). The LRT comparing M8 (β & ω) vs. M7 (β) has the statistic 2Δℓ = 2[–2557.603364 – (–2558.210842)] = 1.214956, which is not significant (p = 0.545) when compared with the χ² distribution with df = 2. Thus the possibility that site 91 is under positive selection is not corroborated by the LRT. The M3 model fits the data better than any other applied model, as shown by its best likelihood score (Table 2). The posterior probabilities for the ω site classes under model M3 are shown in Fig. 3.
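The M8 vs. M7 comparison above can be checked with a few lines of code. The sketch below recomputes the LRT statistic from the two log likelihoods reported in the text and derives the p value from the χ² distribution with df = 2, whose survival function has the closed form exp(−x/2). The function name `lrt_df2` is introduced here for illustration; this is not the code used in the original analysis, which relied on PAML.

```python
from math import exp


def lrt_df2(lnl_null: float, lnl_alt: float):
    """Likelihood ratio test for nested models differing by two parameters.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p value under
    a chi-square distribution with 2 degrees of freedom.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = exp(-stat / 2.0)  # chi-square survival function, df = 2 only
    return stat, p_value


# Log likelihoods reported in the text: M7 (null) and M8 (alternative).
stat, p_value = lrt_df2(lnl_null=-2558.210842, lnl_alt=-2557.603364)
print(f"LRT statistic = {stat:.6f}, p = {p_value:.3f}")
```

With the reported values this gives a statistic of 1.214956 and p = 0.545, in agreement with the figures in the text. The closed form holds only for df = 2; other degrees of freedom require a general χ² routine.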

Figure 3

Posterior probabilities of site classes for sites along the Hr and myoHr genes under model M3 (discrete). This model assumes three site classes with the proportions (p0, p1, p2) and ω ratios (ω0, ω1, ω2) estimated from the data (Table 2). Reference sequence: Sipunculus nudus myoHr.

Discussion and Conclusion

Our sequence analysis unambiguously demonstrated the presence of different subunits in the Hrs of S. nudus and G. v. vulgaris, a finding that is in agreement with the available data on sipunculan and brachiopod Hrs.

Gel filtration experiments, performed on the Hr of the sipunculan Phascolosoma lurco (=P. arcuatum), showed the presence of two different subunits, named HrI and HrII, according to their increasing retention time. These subunits correspond, respectively, to 25% and 75% of the total protein (Clark and Webb 1981). Two different subunits have been detected in P. gouldii by applying disk electrophoresis and DEAE-cellulose chromatography (Klippenstein 1972) and by proteolysis and starch gel electrophoresis (Groskopf et al. 1963; Manwell 1963). Analysis of the distribution of these two subunits in a population of 181 specimens of this species indicated the presence of two alleles in Hardy-Weinberg equilibrium (Manwell 1963). The P. gouldii major subunit, which represented 80%–85% of the pooled Hr obtained from the coelomic fluid and 50%–100% in each specimen, consists of four variants. These variants are characterized by five amino acid interchanges (Klippenstein 1972). The presence of two subunits having the same concentration has also been demonstrated in the brachiopods Lingula unguis and L. reevii (Satake et al. 1990; Zhang and Kurtz 1991).

The Gv1 vs. Gv4 and Gv2 vs. Gv3 distances are markedly smaller than the average Hr distance (see Results). Thus a first explanation could be to consider each pair as alleles of the same gene, as demonstrated for P. gouldii (Manwell 1963). However, the possibility that a very recent duplication event occurred in G. v. vulgaris cannot be excluded, since the distance observed between clearly separated Hrs (i.e., P. gouldii Hr and T. zostericola Hr) is even smaller than those observed for the above-mentioned sequences. Further investigations are therefore necessary to resolve this point.

Our evolutionary reconstruction, together with the comparison of primary structures of annelid and sipunculan myoHrs, supports a common origin from a monomeric ancestral protein. Our phylogenetic reconstructions produced a monophyletic annelid myoHr group, thus suggesting that the recently discovered nHr of the leech is in fact a myoHr. We were unable to recover a consistently supported sipunculan myoHr clade, and this could be the result of the limited available data. Finally, we obtained a monophyletic Hr group where paralogous as well as orthologous proteins are present, thus indicating that a multiplication process shaped the evolution of these proteins.

Among sipunculan Hrs the branching-off of the S. nudus Hrs as sister to the other sipunculan Hrs is supported in our analyses by the insertion of five amino acid residues PVXXX located between C and D α-helices (Fig. 1) in their primary structure. This characteristic is absent in all other sipunculan Hrs, while it is shared with the myoHr sequences (Coutte 2001).

Two competing phylogenies for the Sipuncula phylum have recently been published (Maxmen et al. 2003; Staton 2003). Among the differences between these phylogenies, the placement of the Sipunculus and Siphonosoma genera is important for our discussion (Figs. 4A and B). We used the phylogenies provided by the above-mentioned authors to map some of the steps that characterize the evolution of the sipunculan Hrs. Maxmen and coworkers (2003) as well as Staton (2003) recovered a clade including the genera Themiste, Phascolopsis, Golfingia, and Phascolion (the latter genus, however, is not included by Staton [2003]) (Figs. 4A and B). Members of these four genera have an octameric Hr form (Ferrell and Kitto 1971; Demuynck et al. 1991; Holmes and Stenkamp 1991); thus, we can infer that a quaternary octameric Hr was present at least in the common ancestor of these taxa.

Figure 4

The distribution of the Hr quaternary structures in the different genera of Sipuncula based on alternative phylogenetic hypotheses. A Tree redrawn and simplified from Fig. 4 of Maxmen et al. (2003). B Tree redrawn and simplified from Fig. 4 of Staton (2003). Only taxa considered in the present paper or having a known Hr quaternary structure are shown. Black rectangle: Hr having a trimeric quaternary structure; white rectangle: Hr having an octameric quaternary structure; ?: Hr having an unresolved quaternary structure; dotted branch: taxon not resolved by Staton (2003); *: group poorly supported by bootstrap in the Staton (2003) analysis.

Members of the genera Phascolosoma and Siphonosoma both have Hrs exhibiting a trimeric quaternary structure (Addison and Bruce 1977; Smith et al. 1983; Wilkins and Harrington 1983; Uchida et al. 1990), while the quaternary structure of Sipunculus Hr is not yet resolved (see below).

Chromatography experiments (Vanin 2001) favor a trimeric structure for the S. nudus Hr. This point is possibly reinforced by the presence in its amino acid sequences of the same plesiomorphic features shared by trimeric Hrs and the myoHr. However, the latter aspect could be due to a remnant of the common ancestral sequence. Previously, the S. nudus Hr molecular weight, calculated by osmotic pressure measurements, was 66,000 Da (Roche and Roche 1935). This value would favor a quaternary structure including four to six subunits, assuming a MW of 13,000 Da for each subunit. Finally, Bates and coworkers (1968), as a result of their sedimentation equilibrium experiments, suggested that each molecule should contain six to eight subunits. Thus the currently available evidence does not allow us to establish definitively the quaternary structure of S. nudus Hr. This aspect is very relevant to a proper understanding of the pattern that characterized the structural evolution of the sipunculan Hr.

In fact, if Sipunculus has a trimeric form, it is reasonable to suppose that this structure predated the octameric structure. The latter subsequently developed solely in the clade including Themiste, Phascolopsis, Golfingia, and Phascolion (Figs. 4A and B). Conversely, if the Sipunculus Hr quaternary structure is octameric, the reconstruction of the evolutionary history of sipunculan Hrs depends very much on the species phylogeny considered. If the Maxmen and coworkers phylogeny is correct (Fig. 4A), then the trimeric structure was a novelty that appeared at the base of the clade including Phascolosoma and Siphonosoma. Conversely, if the Staton reconstruction is correct (Fig. 4B), two independent origins of the octameric structure must be postulated for Sipunculus and the common ancestor of Golfingia, Phascolopsis, Themiste, and Phascolion. This scenario can be even more complicated if we postulate the origin of the octameric form in the common ancestor of all sipunculan genera depicted in Fig. 4B (except Phascolosoma) and the subsequent reversal to the trimeric form in Siphonosoma. Thus currently available evidence cannot ascertain the first quaternary structure to appear during the evolution of sipunculan Hrs if the phylogeny depicted in Fig. 4A is correct. Conversely, if we consider the tree presented in Fig. 4B correct, the ancestral Hr quaternary structure was trimeric.

To resolve this fundamental point, further studies addressing the quaternary structure of Sipunculus Hrs as well as the Sipuncula species phylogeny are necessary.

What is clear, however, is that starting from a common gene ancestor coding for a monomeric protein, two distinct quaternary structures evolved in the sipunculan Hrs, and this differentiation was probably favored by the acquisition of distinct physiological advantages. A possible pressure to change the quaternary structure could be related to modulation of oxygen binding. However, currently available data do not help to test this point. In fact, in sipunculans, the only evidence of apparent cooperativity in O2 binding within intact coelomic hemerythrocytes has been reported in P. gouldii (Mangum and Kondon 1975; Mangum and Burnett 1987). Conversely, purified Hrs in the same species do not exhibit cooperativity in either O2 binding to deoxyHr or anion binding to metHr. Finally, in octameric brachiopod Hrs, the purified protein exhibits cooperativity in O2 binding (Richardson et al. 1987). Thus it is not clear whether having two different quaternary structures could be an evolutionary advantage. Other physicochemical properties could have played a role in the evolution of the two alternative octameric and trimeric forms. However, their possible role remains untested with the currently available evidence.

The site-specific models applied to myoHrs and Hrs indicate that purifying selection mainly shaped the evolution of these proteins, even if one site (position 91; Fig. 1) appears to have evolved under diversifying selection. This site, apparently under positive selection, is placed at the end of α-helix C, immediately contiguous with the most variable region of the protein. However, the latter result is not conclusive because the M8 model, which allows positive selection, does not fit the analyzed data set significantly better than the corresponding null model M7, which does not permit positive selection.

A better taxon sampling and a better understanding of the gene structure of these proteins are surely the next important steps toward a thorough knowledge of the patterns that characterized the evolution of this group of oxygen carriers.