Introduction

Ohno (1970) was the first to suggest that gene duplication and polyploidization have played a pivotal role in the evolution of vertebrates. It has been hypothesized that the increase in genetic material at the origin of vertebrates might have been due to two rounds of genome duplications (the “2R” hypothesis), one in the common ancestor of all vertebrates and the second after the divergence of Agnatha (jawless vertebrates) and Gnathostoma (jawed vertebrates) (Sidow 1996; but see Skrabanek and Wolfe 1998; Hughes and Friedman 2003). In diverse ray-finned fish (actinopterygians) the presence of extra genes and genomic clusters has been detected compared to the other vertebrate species, leading to the suggestion of an additional genome duplication in the actinopterygian lineage (Amores et al. 1998; Postlethwait et al. 2000; Naruse et al. 2004). However, our knowledge of the molecular organization of genomes following polyploidization in vertebrates is still very limited. Understanding the evolutionary processes that cause homologous chromosomes to diverge into homeologous chromosomes (i.e., duplicated homologous chromosomes), and the fate of duplicated genes within such homeologous chromosomes, is essential for elucidating the evolution and radiation of the subphylum Vertebrata. A comparative genome analysis within and between closely related species that may have undergone a polyploidization event in their recent evolutionary history is essential to furthering our knowledge in this area.

Salmonid fish are perhaps one of the best model species for studying polyploidization and genome evolution. These species are descended from a single taxon, which underwent an autotetraploidization event as long ago as 50–100 million years ago (MYA) (Ohno 1970) and perhaps as recently as 25 MYA (Allendorf and Thorgaard 1984). The observation of multivalent formation during meiosis and evidence for tetrasomic segregation at some loci are an indication that the process of restoring disomic inheritance has not been completed in these species (Johnson et al. 1987; Allendorf and Danzmann 1997; Sakamoto et al. 2000). Recent genetic maps and comparative studies of a few salmonid fish, mostly based on microsatellite data, have provided information regarding the homology, homeology, and chromosomal rearrangements that have taken place following tetraploidization within this family of fish (Nichols et al. 2003; Woram et al. 2003, 2004; Leder et al. 2004). Since knowledge in the area of salmonid genomics is still largely incomplete, the identification and characterization of genes and gene families that have been conserved across diverse taxa are required if we are to gain a more complete understanding of the consequences of polyploidization within fish and vertebrates in general.

Hox genes constitute a particularly useful tool for studying the evolution of vertebrate genomes due to their conserved nucleotide sequences and clustered nature in genomic complexes. Based on the analysis of Hox genes in a representative of the sister taxon of vertebrates, amphioxus (Branchiostoma floridae) (Garcia-Fernàndez and Holland 1994; Ferrier et al. 2000; Ferrier 2004), as well as two gnathostome vertebrates, Indonesian coelacanth (Latimeria menadoensis) and horn shark (Heterodontus francisci) (Powers and Amemiya 2004), it has been hypothesized that the common ancestor of vertebrates and cephalochordates possessed a single cluster bearing 13 or 14 Hox genes. Identification of four Hox clusters located on different chromosomes in all tetrapods (sarcopterygians) that have been investigated so far is an indication that two rounds of cluster or whole-genome duplications occurred: the first before the divergence of agnathan and gnathostome lineages and the second in the gnathostome lineage (Kappen et al. 1989; McGinnis and Krumlauf 1992; Krumlauf 1994; Ladjali-Mohammedi et al. 2001).

From the available data it appears that unlike the situation in the sarcopterygians, additional Hox clusters might be a shared feature among all teleosts or even more primitive ray-finned fish. Studies in different teleost species such as zebrafish (Danio rerio) (Amores et al. 1998; Prince et al. 1998), pufferfish (Spheroides nephelus and Takifugu rubripes) (Aparicio et al. 1997; Amores et al. 2004), medaka (Oryzias latipes) (Naruse et al. 2000), striped sea bass (Morone saxatilis) (Snell et al. 1999), killifish (Fundulus heteroclitus) (Misof and Wagner 1995), tilapia (Oreochromis niloticus) (Málaga-Trillo and Meyer 2001; Santini et al. 2003), or even a representative of the sister group to all other actinopterygians, bichir (Polypterus palmas) (Ledje et al. 2002; but see Chiu et al. 2004), suggest that a segmental or a whole-genome duplication occurred in the last common ancestor of euteleosts if not earlier. Interestingly the Hox cluster repertoire varies greatly among diverse teleost species. Identification of seven clusters in zebrafish, medaka, and pufferfish, each with different gene complements and cluster architecture, is a reflection of the evolutionary history of these different taxonomic orders and might further be correlated with the radiation and accelerated morphological evolution in teleost fish (Amores et al. 1998, 2004; Wittbrodt et al. 1998; Málaga-Trillo and Meyer 2001; Prohaska and Stadler 2004).

Information regarding sequence and organization of Hox clusters in salmonids is still very limited. Fjose et al. (1988) identified three distinct Hox sequences in Atlantic salmon (Salmo salar), with two of these sequences located on the same inserted fragment, confirming the presence of these genes in the form of genomic clusters. The inferred amino acid sequences show the highest similarity to HoxB2, HoxA5, and HoxA7 genes from human (Homo sapiens) and mouse (Mus musculus). In the present work we further investigated the organization of these genomic clusters in another member of the Salmonidae family, rainbow trout (Oncorhynchus mykiss). Sequence analysis and mapping data suggest the presence of at least 14 putative Hox clusters in this species. However, up to 16 clusters might exist in salmonids. Many duplicated genes seem to have been retained and share a high percentage of amino acid similarity. We also characterized two Hox genes located on the HoxCb cluster that may have been lost independently in other teleost species studied to date. Finally, our data point to the presence of conserved syntenic blocks between salmonids and human and we also suggest new putative chromosomal homeologies in rainbow trout.

Materials and Methods

Mapping Families

Two backcross reference families of rainbow trout, designated lot 25 (n = 48) and lot 44 (n = 90), provided the source material for this study. These families were initially used to detect quantitative trait loci (QTL) for upper temperature tolerance and spawning time (Jackson et al. 1998; Danzmann et al. 1999; Sakamoto et al. 1999; O’Malley et al. 2003), and details on the background of the families are given in the references provided. The families are currently being used for genome-wide mapping projects in rainbow trout.

Primer Design, PCR Amplification, Cloning, and Sequencing

Total genomic DNA was isolated from muscle, liver, fin, or gill tissue following protocols outlined by Taggart et al. (1992), Estoup et al. (1993), and Bardakci and Skibinski (1994). DNA sequences of the Hox genes for mouse, human, zebrafish, pufferfish, and medaka were obtained from GenBank (http://www.ncbi.nlm.nih.gov). The orthologous Hox genes were aligned and the consensus blocks were identified by means of CLUSTALX (Thompson et al. 1997) and/or DIALIGN2 (Morgenstern 1999). Primer3 (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) was used for designing gene-specific degenerate primers, using the identified consensus blocks as templates (Table 1). These primers have the capability to amplify either large segments of the second exon (where the homeobox is located) or both exons with the inclusion of the noncoding region. Polymerase chain reactions (PCR) were carried out in 50-μL mixtures containing 80 ng genomic DNA, 1 × PCR buffer (20 mM Tris-HCl, 50 mM KCl, at pH 8.4; GIBCO BRL), a 700 μM concentration of each dNTP (Roche Diagnostics), 2 mM MgCl2 (GIBCO BRL), 0.09 mg·mL−1 BSA (GIBCO BRL), a 0.04 μM concentration of each primer, and 1 U of Taq polymerase (GIBCO BRL). The amplifications were carried out in a Peltier Thermal Cycler PTC-200 (MJ Research) as follows: denaturation at 95°C for 5 min followed by 30 amplification cycles of 95°C for 30 s, 60°C for 1 min, and 72°C for 2–3 min and a final extension of 72°C for 10 min. The amplified PCR products were verified on a 1.5% TBE agarose gel, before the subsequent PCR purification, cloning, and sequencing. PCR products were then purified using the Wizard PCR Prep Kit (Promega) and inserted into the pGEM-T Easy Vector (Promega). On average, plasmid DNA of six clones of each PCR product was extracted (QIAprep Plasmid Miniprep Kit, Qiagen) and sequenced by the method described by Sanger et al. (1977), using T7 and/or SP6 primers and the Big Dye Terminator Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems, Foster City, CA). Sequences were analyzed on an ABI 377 automated DNA sequencer (PE Applied Biosystems) and compared to sequences in GenBank using BLASTX and BLASTN algorithms. We treated all sequences whose substitutions fell within the range of the Taq polymerase mutation rate (1 in 9000 nucleotides polymerized, or 1 in 300 bp after 30 cycles of PCR [Tindall and Kunkel 1988]) as either alleles or PCR errors.
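
The conversion from the per-nucleotide Taq error rate to the per-base rate after 30 cycles is simple arithmetic, and the following minimal Python sketch makes it explicit. It assumes a linear accumulation of errors across cycles, and the 600-bp fragment length in the usage lines is a hypothetical value chosen only for illustration; the function names are likewise illustrative and not part of any published pipeline.

```python
def pcr_error_rate(per_base_per_cycle: float, cycles: int) -> float:
    """Approximate cumulative per-base error rate after `cycles` PCR cycles.

    Assumes errors accumulate linearly with cycle number.
    """
    return per_base_per_cycle * cycles


def expected_pcr_errors(fragment_length_bp: int,
                        per_base_per_cycle: float = 1 / 9000,
                        cycles: int = 30) -> float:
    """Expected number of polymerase-induced substitutions in one cloned fragment."""
    return fragment_length_bp * pcr_error_rate(per_base_per_cycle, cycles)


# 1 error per 9000 nucleotides polymerized, 30 cycles: about 1 error per 300 bp.
rate = pcr_error_rate(1 / 9000, 30)
print(f"about 1 error per {1 / rate:.0f} bp")

# Hypothetical 600-bp cloned fragment (illustrative length only).
print(f"expected errors in 600 bp: {expected_pcr_errors(600):.2f}")
```

Under these assumptions, two clones of the same fragment that differ by only one or two substitutions cannot be distinguished from polymerase error on sequence data alone.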

Table 1 Gene-specific degenerate primers designed based on the alignment and identification of consensus sequences of the orthologous Hox genes from zebrafish (Danio rerio), medaka (Oryzias latipes), pufferfish (Fugu rubripes), mouse (Mus musculus), and human (Homo sapiens). All primers are given in the 5′→3′ direction

We further tested the probability that two fragments that differ by n nucleotides may be alleles or duplicated sequences using the method explained by Misof and Wagner (1995) and Misof et al. (1996), considering the following equation:

$$ {\rm P} (k|\mu ) ={\sum\limits_{k = 0}^\mu {{e^{- \mu} \mu ^k} \over {k!}}} $$

where μ is the expected value of allelic nucleotide substitutions, estimated using the observed nucleotide differences between the orthologous Hox genes in rainbow trout and Atlantic salmon (Moghadam HK, Ferguson MM, and Danzmann RG, unpublished data). Probability values <1% were considered allelic variations.
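
The Poisson terms in the equation above can be evaluated directly. The sketch below is an illustration only: the summation limit and the direction of the tail used in the original test are not fully recoverable from the text, so the code exposes the single-term probability, the cumulative probability up to n differences, and the upper-tail probability of n or more differences. The value of μ in the usage lines is hypothetical and is not the estimate used in the study.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k substitutions when mu are expected."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def poisson_cdf(n: int, mu: float) -> float:
    """Probability of observing n or fewer substitutions: sum of pmf for k = 0..n."""
    return sum(poisson_pmf(k, mu) for k in range(n + 1))


def prob_at_least(n: int, mu: float) -> float:
    """Upper-tail probability of observing n or more substitutions."""
    if n <= 0:
        return 1.0
    return 1.0 - poisson_cdf(n - 1, mu)


# Hypothetical expected number of allelic differences (assumption, not from the paper).
mu = 2.0
for n in (1, 5, 10):
    print(n, poisson_pmf(n, mu), poisson_cdf(n, mu), prob_at_least(n, mu))
```

Which of these quantities is compared against the 1% threshold depends on how the test of Misof and Wagner (1995) is set up, so the original reference should be consulted before reusing the sketch.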

Identity Search and Phylogenetic Analysis

BLASTX and BLASTN searches of the GenBank database were performed to determine the identity of rainbow trout Hox sequences. The homeobox sequences were then aligned using the CLUSTALX program (Thompson et al. 1997), and the output file was used to calculate a distance matrix (p-distance) from which a neighbor-joining (NJ) tree was constructed. Sites containing missing data or alignment gaps were removed in a pairwise fashion. The robustness of the inferred topology was determined by analyzing 1000 bootstrap replicates of the data set. The phylogenetic analysis was conducted using MEGA version 2.1 (Kumar et al. 2001). Protein identity/similarity calculations were further performed using MatGAT v2.02 (BLOSUM50 scoring matrix [Campanella et al. 2003]).
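
As an illustration of the distance measure described above, the following Python sketch computes p-distances from aligned nucleotide sequences with pairwise deletion of gapped or missing sites. It is a minimal sketch, not the MEGA implementation: the set of characters treated as missing and the toy sequences are assumptions made for the example.

```python
from itertools import combinations

# Characters treated as gaps or missing data in a nucleotide alignment (assumption).
GAP_OR_MISSING = set("-?N")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping sites gapped or missing in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_OR_MISSING or b in GAP_OR_MISSING:
            continue  # pairwise deletion: drop the site for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def distance_matrix(aligned: dict) -> dict:
    """Return p-distances for every unordered pair of named, aligned sequences."""
    return {
        (name_a, name_b): p_distance(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


# Toy alignment with invented sequences, for illustration only.
toy = {"seq1": "ACGT-A", "seq2": "ACGAAA", "seq3": "ACGTTA"}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

Pairwise deletion keeps a site for every pair in which both sequences are informative, which retains more data than removing any column that contains a gap in at least one sequence.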

Single-Nucleotide Polymorphism (SNP) Identification

Putative Hox genes were localized to the rainbow trout genetic map by detection of SNP markers using a combination of heteroduplex analysis (HA; Ganguly et al. 1993) and/or single-stranded conformation polymorphism (SSCP; Orita et al. 1989a, b). Gene specific primers were designed to generate PCR products ranging from 150 to 1000 bp in HA (Boyd et al. 1993) and 150 to 300 bp in SSCP (Glavac and Dean 1993; Hayashi and Yandell 1993) (Table 2). Polymerase chain reactions were performed in 11-μL reaction volumes with one of the primers being 5′-fluorescently end-labeled with tetrachloro-6-carboxyfluorescein (TET). The PCR reaction mixture, containing 30 ng of genomic DNA, 1 × PCR buffer, a 136 μM concentration of each dNTP, 2 mM MgCl2, 0.09 mg·mL−1 BSA, a 0.003 μM concentration of each primer, 0.2 U of Taq DNA polymerase, was subjected to the following amplification conditions: denaturation at 95°C for 5 min followed by 30 amplification cycles of 95°C for 30 s, 60°C for 1 min, and 72°C for 30–60 s and a final extension at 72°C for 10 min.

Table 2 List of identified rainbow trout (Oncorhynchus mykiss) Hox genes, their corresponding linkage groups (LG), length of identified fragments, GenBank accession numbers, and reverse and forward primers used to identify single-nucleotide polymorphism markers. The symbols i and ii indicate duplicated genes that have been mapped; genes whose LG could not be identified are assigned the numbers 1 or 2 instead. All primers are given in the 5′→3′ direction

A modification of the protocol of Ganguly et al. (1993) was applied to detect SNP using HA. Heteroduplexes were generated by addition of ethylenediaminetetraacetic acid (EDTA) to each PCR product to a final concentration of 10 mM, followed by denaturation at 95°C for 5 min and incubation at 68°C for 1 h. Two microliters of the sample was mixed with 2 μL of loading dye (20% ethylene glycol, 30% formamide, and 0.25% bromophenol blue) and then loaded onto polyacrylamide gels. The gels consisted of 6–10% polyacrylamide (37.5:1 ratio of acrylamide to bisacrylamide; Fisher Biotech), 10% ethylene glycol (Sigma), 10% glycerol (Fisher Biotech), 15% urea (Fisher Biotech), 15% formamide (Fisher Biotech), and 0.5 × TTE buffer (44 mM Tris, 14.5 mM taurine, 0.25 mM EDTA). The running buffer was 0.5 × TTE in the upper chamber and 1 × TTE in the lower chamber. The samples were electrophoresed either at a constant 25 W for 3–10 h or at 10–15 W for 12–16 h. Electrophoresis was conducted at room temperature.

To prepare the samples for SSCP, 2 μL of PCR product was mixed with 18 μL of loading dye (95% formamide, 10 mM NaOH, and 0.25% bromophenol blue), followed by heating at 95°C for 5 min and then quenching on an ice bath. Electrophoresis was performed using 0.5 × Mutation Detection Enhancer gels (MDE; CAMBREX) for a minimum of 6 h using a constant power of 10 W at 4°C.

Gels were prepared using 0.4-mm spacers and were run on a standard DNA sequencing gel apparatus. For both methods, 1.5 μL of the final PCR mixture (PCR reaction cocktail plus loading dye) was loaded for electrophoresis. Scanning and visualization were done using an FMBIO II scanner and Image Analysis software (Hitachi Genetics System) with the wavelength of 585 nm.

Genetic Nomenclature

We followed the conventions outlined by Scott (1992) and Amores et al. (1998) for designating Hox genes and Jackson et al. (1998) for designating duplicated genes in salmonid fish. Paralogous genes are denoted by lowercase a or b symbols that follow the gene number (1–13), and duplicated homeologous genes located on different linkage groups (LG) are arbitrarily designated with a lowercase i or ii (Jackson et al. 1998). However, genes that we were unable to map to a given LG were designated 1 or 2 instead of i or ii.
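
The naming convention can be made concrete with a small parser. The Python sketch below is an assumption-laden illustration: the regular expression, the class, and the field names are hypothetical, and the pattern covers only gene names of the form used in this paper (for example HoxA11bi or HoxD9a-1), not cluster names such as HoxAai or tetrapod names such as HoxB2.

```python
import re
from dataclasses import dataclass

# Hox + cluster letter + paralog group + teleost copy (a/b) + salmonid copy (i, ii, -1, -2).
_HOX_NAME = re.compile(r"Hox([A-D])(\d{1,2})([ab])(ii|i|-[12])")


@dataclass(frozen=True)
class HoxName:
    cluster: str        # A, B, C, or D
    paralog_group: int  # gene number, 1 to 13
    teleost_copy: str   # a or b
    salmonid_copy: str  # i/ii if mapped to an LG, 1/2 if not
    mapped: bool


def parse_hox_name(name: str) -> HoxName:
    """Split a gene name such as 'HoxC4bii' into its components."""
    match = _HOX_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized Hox gene name: {name!r}")
    cluster, group, teleost_copy, suffix = match.groups()
    group_number = int(group)
    if not 1 <= group_number <= 13:
        raise ValueError(f"paralog group out of range in {name!r}")
    mapped = not suffix.startswith("-")
    return HoxName(cluster, group_number, teleost_copy, suffix.lstrip("-"), mapped)


print(parse_hox_name("HoxA11bi"))
print(parse_hox_name("HoxD9a-1"))
```

Here a suffix of i or ii marks a duplicate assigned to a linkage group, whereas -1 or -2 marks a duplicate that could not be mapped.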

Linkage Analysis

LINKMFEX (version 1.6) software (http://www.uoguelph.ca/~rdanzman/software) was used to test for deviation from Mendelian segregation ratios (1:1, log-likelihood G-test) and for linkage analysis of the putative Hox genes (using a minimum logarithm of odds [LOD] of 4.0). Different modules of LINKMFEX (i.e., GENOVECT, MAPORD, and MAPDIS) were used to determine the most likely order of the Hox genes with other genetic markers and to obtain map distances (x = θ). LG were graphically portrayed using MAPCHART (Voorrips 2002).
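
The statistics named above can be sketched in a few lines of Python. This is not the LINKMFEX implementation: it assumes a backcross with known linkage phase, a two-class G-test with one degree of freedom, and map distance taken as 100 times the recombination fraction (x = θ). All genotype counts in the usage lines are hypothetical.

```python
import math


def g_test_1to1(count_a: int, count_b: int):
    """Log-likelihood G-test of a 1:1 segregation ratio; returns (G, p-value), 1 df."""
    expected = (count_a + count_b) / 2
    g = 0.0
    for observed in (count_a, count_b):
        if observed > 0:
            g += 2 * observed * math.log(observed / expected)
    # Chi-square survival function for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2))
    return g, p_value


def recombination_fraction(recombinants: int, total: int) -> float:
    """Maximum-likelihood estimate of theta, capped at 0.5 (free recombination)."""
    return min(recombinants / total, 0.5)


def lod_score(recombinants: int, total: int) -> float:
    """log10 likelihood ratio of linkage at the estimated theta versus theta = 0.5."""
    theta = recombination_fraction(recombinants, total)
    lod = total * math.log10(2)
    if recombinants > 0:
        lod += recombinants * math.log10(theta)
    if total - recombinants > 0:
        lod += (total - recombinants) * math.log10(1 - theta)
    return lod


def map_distance_cm(theta: float) -> float:
    """Map distance in centimorgans under x = theta (no mapping-function correction)."""
    return 100 * theta


# Hypothetical counts for a family of 48 progeny.
print(g_test_1to1(26, 22))
theta = recombination_fraction(5, 48)
print(theta, lod_score(5, 48), map_distance_cm(theta))
```

With these assumptions, a marker pair would be declared linked only if the LOD score reached the threshold of 4.0 stated above.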

Results

Sequence Identity and Phylogenetic Analysis of the Homeobox Data

Using 13 pairs of gene specific degenerate primers (Table 1), we were able to obtain partial sequence information of 26 putative Hox genes in rainbow trout (Table 2). All the identified sequences passed the allelic test (Misof and Wagner 1995; Misof et al. 1996; see supplementary data), with the HoxA11b variants (i.e., HoxA11bi and HoxA11b-2) being the only exceptions. However these two fragments were also included in the study, as the phylogenetic relationships and comparative analysis between rainbow trout and Atlantic salmon indicate that these forms are distinct genes (Moghadam HK, Ferguson MM, Danzmann RG, unpublished data). The gene identities were confirmed by comparing the full-length nucleotide and deduced amino acid sequences to the GenBank database entries using BLASTN and BLASTX algorithms. Phylogenetic analysis of the inferred amino acid sequences (data not shown) as well as the homeobox nucleotide sequence data (Fig. 1) clustered the fragments with their orthologs from zebrafish, suggesting the presence of potentially 14 Hox genomic clusters in rainbow trout. These clusters can be summarized as HoxAai, HoxAaii, HoxAbi, HoxAbii, HoxBai, HoxBaii, HoxBbi, HoxBbii, HoxCai, HoxCaii, HoxCbi, HoxCbii, HoxDai and HoxDaii (Fig. 1, Table 2).

Figure 1

Neighbor-joining tree representing the phylogenetic relationships of the putative “homeobox” sequence data obtained from rainbow trout (Oncorhynchus mykiss; Onc; based on 22 distinct genes) to the nearest reported orthologous Hox genes in zebrafish (Danio rerio; Dan). The number at each node represents the percentage bootstrap value of 1000 trials. Bootstrap values lower than 50% are not shown. Note that only those sequences with the available homeobox information were used for the tree construction.

For most of the genes, such as those representing HoxA3a, HoxA4a, HoxB4a, etc., two distinct orthologs of the zebrafish genes have been identified in rainbow trout. These genes share a high percentage of identity (perfect matches) and similarity (perfect and imperfect matches) in both their deduced amino acid sequences (Table 3) and their nucleotide sequences (data not shown) and are most probably duplicated genes that have been retained within the genome following tetraploidization. In particular, if the sequence information of the second exon, where the homeobox is located, is available, the amino acid identity between duplicates usually exceeds 90%. The similarity analyses as well as the phylogenetic relationships between the putative amino acid sequences (Moghadam HK, Ferguson MM, Danzmann RG, unpublished data) suggest that the duplicated sequences are more closely related to each other than they are to the reported orthologs from zebrafish. Two exceptions, however, are the sequences obtained for HoxC4bii and HoxC9bi. While the identity between HoxC4a-2 and HoxC4ai is about 94% and their similarity to their ortholog from zebrafish is about 90%, HoxC4bii has just 70% identity (about 76% similarity) to either paralog or to the ortholog from zebrafish (Table 3, Fig. 2). This is an indication that HoxC4a-2 and HoxC4ai shared the most recent ancestral cluster and are the true orthologs of the zebrafish HoxC4a. It should be noted that the similarity/identity of the human HoxC4 to the rainbow trout orthologs is almost equal among all identified sequences, which might suggest a fish-specific duplication that occurred following the divergence of actinopterygians and sarcopterygians. The identity between HoxC9a-1 and HoxC9bi (76%) is also less than the identity obtained for HoxC9a-1 and zebrafish HoxC9a (84%; Table 3, Fig. 3), suggesting a possible orthology between the latter two sequences.

Table 3 Percentage amino acid similarity (underlined: perfect and imperfect matches) and identity (perfect matches only) between the homeologous/paralogous Hox sequences obtained from rainbow trout (Oncorhynchus mykiss) and the reported orthologs from human (Homo sapiens) and zebrafish (Danio rerio). a and b correspond to duplicated Hox genes identified in zebrafish

Figure 2

Neighbor-joining tree (p-distance) inferred from the alignment of ∼200 amino acid sequence data, showing the phylogenetic relationships between the putative HoxC4 sequences in rainbow trout (Oncorhynchus mykiss; Onc) and the reported orthologs from zebrafish (Danio rerio; Dan) and human (Homo sapiens; Hom) using amphioxus (Branchiostoma floridae; Amp) as outgroup. Numbers at branch nodes indicate the percentage bootstrap support for that node based on 1000 replications.

Figure 3

Unrooted neighbor-joining tree (p-distance) inferred from the alignment of ∼250 amino acid sequence data, showing the phylogenetic relationships between the putative HoxC9 sequences in rainbow trout (Oncorhynchus mykiss; Onc) and the reported orthologs from zebrafish (Danio rerio; Dan) and human (Homo sapiens; Hom). Number at the branch node indicates percentage bootstrap support for that node based on 1000 replications.

Analysis of the sequences obtained for HoxD9a suggests that three distinct copies of this gene might exist in rainbow trout. While HoxD9aii shares about 85% nucleotide similarity with either HoxD9ai or HoxD9a-1, the nucleotide similarity between the latter two sequences is close to 96%; the differences between them consist of a 45-bp insertion and/or deletion (INDEL; starting from position 926) and four nucleotide substitutions (Fig. 4). In fact, the alignment of the HoxD9ai and HoxD9a-1 fragments with their putative cDNA sequence (GenBank accession number BX077293; Guiguen Y, INRA-SCRIBE, Campus de Beaulieu, Rennes Cedex 35042, France; unpublished data) reveals that the INDEL and three of the nucleotide substitutions occur within the noncoding region. Since no segregation data could be obtained for HoxD9a-1, we cannot exclude the possibility that this sequence is a cloning artifact (i.e., recombination between HoxD9ai and HoxD9aii), a duplicate, or an allele of HoxD9ai.

Figure 4

Alignment of three distinct nucleotide sequences obtained using the HoxD9a primer. While the sequence similarity between HoxD9ai and HoxD9a-1 is close to 96%, the similarity between each of these sequences and HoxD9aii is just about 85%.

Genetic Mapping of the Hox Genes

The rainbow trout genetic map consists of 31 linkage groups (LG). We were able to localize 12 putative Hox genomic clusters on 11 of these LG by mapping at least one representative of each cluster (Fig. 5). Further, the genomic localization of HoxD9aii still remains tentative, as this sequence was linked to three unassigned markers. It should also be noted that the genomic localizations of the putative HoxA2bi and HoxA2bii genes were inferred based on different segregation patterns of the amplified fragments using different sets of gene-specific primers (Table 2). However, the subsequent cloning and sequencing of the PCR products from these two primer sets did not reveal any nucleotide divergence. Inferred homeologies observed with the duplicated Hox genes (i.e., HoxB5bi and HoxB5bii [LG 17/22], HoxA2bi and HoxA2bii [LG 27/31], and HoxB4ai and HoxB4aii [LG 2/9]) are in agreement with those reported by Sakamoto et al. (2000) and Nichols et al. (2003). New suggestive homeologous affinities were identified between LG 3 and 16 (HoxA4aii and HoxA4ai) and LG 12 and 29 (HoxC6bi and HoxC6bii). Based on the previous research (Sakamoto et al. 2000; Nichols et al. 2003), the putative homeologs for LG 3, 12, 16, and 29 are LG 25, 16, 12, and 2, respectively. Given that most LG in the rainbow trout genome are metacentric in type and that whole arm fusions have likely been the predominant mode of chromosome evolution within salmonids (Wright et al. 1983), we may expect that up to two homeologous affinities will be identified per LG.

Figure 5

Genetic map of the rainbow trout (Oncorhynchus mykiss) putative Hox genes (italic and underlined) for male (M) and/or female (F) parents in the two reference families (i.e., lot 25 and lot 44). Numbers indicate estimated genetic distances as centimorgans (x = θ). LGUna, cluster currently unassigned to the Nichols et al. (2003) rainbow trout genetic map.

Discussion

Chromosome Evolution and Organization of Hox Clusters in Rainbow Trout

The clustered organization of Hox genes is remarkably well conserved among all vertebrates that have been studied to date (Schughart et al. 1989; Krumlauf 1994; de Rosa et al. 1999). This allowed us to further infer the presence of extra Hox genomic clusters in rainbow trout by localizing at least one representative of each complex in the genetic map of this species. Evidence of extra Hox clusters has been reported in diverse actinopterygian species such as medaka (Naruse et al. 2000), Southern and Japanese pufferfish (Aparicio et al. 1997; Amores et al. 2004), and zebrafish (Amores et al. 1998; Prince et al. 1998). This has led to the suggestion that there was a segmental or a whole-genome duplication in the ancestor of the ray-finned fish producing a possible eight Hox complexes (Amores et al. 1998, 2004) and hence extending the one-to-four rule in sarcopterygians into one-to-four-to-eight in fish (Meyer and Schartl 1999). Our results are in accordance with this evolutionary hypothesis which predicts that the polyploid ancestor of modern salmonid fish possessed up to 16 Hox clusters (8 identical pairs), each located on different acrocentric chromosomes. By obtaining partial sequence and mapping information of 26 distinct Hox genes in rainbow trout, we were able to infer the presence of extra Hox genomic clusters in this species compared to the other actinopterygians that have been investigated so far. However, it should be noted that these results still need to be confirmed further by identifying more upstream and downstream cognate members of each cluster.

Our sequence data suggest the presence of potentially 14 Hox genomic clusters in rainbow trout, and additional clusters cannot be ruled out, as we were not able to design gene-specific degenerate primers to screen for any of the genes located on the HoxDbi or HoxDbii clusters. Twelve of these clusters have been assigned to the identified LG in this species, with one cluster mapped to a small unassigned fragment. It is further evident from the topology of the homeobox NJ tree (Fig. 1) that many of the orthologous zebrafish Hox genes are represented by two copies in rainbow trout. This reinforces the hypothesis of a mass genome tetraploidization event that occurred in the lineage leading to salmonids (Ohno 1970; Allendorf and Thorgaard 1984). Therefore we suggest that the 1-to-4-to-8 rule should be extended to 1-to-4-to-8-to-16 in salmonid species.

The NJ tree contains two large clades: one including the anterior (Hox1–3) and the central (Hox4–6) genes and the other including the posterior genes (Hox9–11). This topology is expected and has been previously reported (Zhang and Nei 1996; Mito and Endo 2000), as the Hox genes first expanded by lateral duplication of at least two progenitors (i.e., one anterior and one posterior) (Garcia-Fernàndez and Holland 1994), followed by interchromosomal duplications in the lineage leading to vertebrates (Schughart et al. 1989). Therefore it is expected that the paralogous genes located on different clusters will be more similar to each other than to the cognate members within each complex.

In new tetraploid species, the process of diploidization causes the structural divergence of a pair of homologs into two new ancestrally related homeologous chromosomes (Ohno 1970; Sybenga 1972). In most salmonids the process of diploidization has been accompanied by the differential Robertsonian fusion of chromosome arms. This may involve the fusion of two homeologous acrocentric chromosomes into one metacentric chromosome (Ohno et al. 1969) or the fusion of nonhomeologous chromosome arms, although the latter process is believed to have been more prevalent (Wright et al. 1983; Allendorf and Thorgaard 1984; Hartley 1987). While evidence exists that the fusion of homeologs has occurred (Woram et al. 2004), it is more likely that nonhomeologous fusions occur (Phillips and Rab 2001). Therefore, if a Hox cluster resides on a pair of acrocentric chromosomes undergoing fusion, then identifying a chromosome in the extant salmonids bearing two clusters is not surprising. This evolutionary scenario might explain the localization of two Hox clusters (HoxDai and HoxAbii) to LG 31 in rainbow trout. These clusters have a genomic distance of about 70 cM on the female map and appear to map at opposite ends of the chromosome spanning the putative centromeric region identified via gene–centromere mapping and crossover frequency analysis (Sakamoto et al. 2000; unpublished data). The previous homeologous affinities identified between LG 27 and LG 31 suggest that the chromosome arm bearing the HoxA2bii cluster on RT-31 shares homeology with RT-27. This is supported by the observation of numerous duplicated microsatellite markers that map to both of these LG and flank the HoxA2b duplicates.

Orthology/Paralogy of Rainbow Trout Hox Sequences

Members of the Hox gene family are characterized by their extreme DNA sequence conservation, particularly in the second exon where the homeobox is located. This can often make differentiation between paralogous genes difficult if not impossible (Pavell and Stellwag 1994; Misof and Wagner 1995; Misof et al. 1996). This characteristic poses an even greater problem in identifying duplicated genes in species such as salmonids that have undergone a more recent tetraploidization event. In our analysis we tried to overcome this problem by identifying large fragments of the first and the second exons as well as the intervening noncoding region, whenever possible. Our data indicate that many of these putative duplicated Hox genes not only have been retained within the genome of rainbow trout, but also share amino acid similarities >90%, a suggestion of the possible conserved function of these genes. Previous studies have suggested that about two-thirds of the duplicated genes in salmonids have become silenced (Allendorf and Thorgaard 1984). This might not be true for the Hox genes, since even with our limited data we were able to identify about 10 putative duplicated sequences from the 13 Hox genes we initially attempted to identify. In zebrafish and pufferfish it has been shown that about 11 duplicated Hox genes have been retained in their genomes (Amores et al. 1998, 2004). In two cases studied so far it has been established that these extra genes have been preserved either due to the partitioning of different functions between duplicates (subfunctionalization [Force et al. 1999; Bruce et al. 2001]) or because their function has been shifted among members of the paralogue groups (function shuffling [Prince 2002; Jozefowicz et al. 2003]) rather than acquiring novel function. In salmonids, however, this seemingly high percentage of gene retention may be more related to the physical arrangements of the Hox clusters on each chromosome. 
It has been suggested that most functionally duplicated loci are located telomerically (Wright et al. 1983; Allendorf et al. 1986) and can remain functional since the homeologous chromosomes still can pair during meiosis and exchange chromatid segments at their distal ends (Wright et al. 1983; Allendorf and Danzmann 1997). Therefore, Hox clusters located at distal ends of chromosomes may be retained preferentially due to crossing-over. The available mapping data cannot exclude such a mechanism with the exception of the HoxB4a duplicates, which appear to map proximal to the centromeric region on LG 2/9.

The data reported herein further support a strong similarity of the Hox sequences between the rainbow trout and its orthologs in zebrafish and human on the basis of their amino acid similarities. As expected, many of the duplicated sequences show a higher percentage of similarity to each other than to the zebrafish orthologs, pointing to the duplication that has occurred in the lineage leading to salmonids. However, our results for two of the alleged sequences, “HoxC9bi” and “HoxC4bii,” are not as expected and suggest that these genes could be located on a duplicated HoxCb cluster. This is intriguing, as no evidence of a HoxC9b or HoxC4b gene has been reported in any actinopterygian. Models proposed for the evolution of vertebrate Hox clusters suggest that these genes were lost in the ancient teleost ancestor and that only the HoxC4a and HoxC9a paralogs are retained within the actinopterygians (Amores et al. 1998, 2004; Meyer and Málaga-Trillo 1999). However, if these genes were specific to the salmonid lineage, then one would expect a phylogenetic topology in which the duplicated genes cluster together rather than with the orthologs. Therefore, it is more probable that the HoxC9b and HoxC4b genes were present on the HoxCb cluster of the ancient teleost ancestor that lived about 140 MYA (Nelson 1994) and have been lost independently in lineages leading to zebrafish, pufferfish, and medaka. More conclusive evidence can be drawn in future as further sequence and mapping information is accumulated from diverse teleost orders.

Comparative Evolutionary Conservation of Syntenic Chromosome Blocks Containing Rainbow Trout Hox Clusters

Recent comparative studies in zebrafish, medaka, and human have shown that regardless of all the chromosomal rearrangements that have taken place during the course of evolution in these diverse vertebrate species, their genomes still possess many conserved chromosomal segments (Postlethwait et al. 2000; Naruse et al. 2004). For example, one large syntenic region has been identified on chromosome 17 in human (where the growth hormone gene and the HoxB cluster are located), and its corresponding orthologous LG segments in species such as zebrafish, medaka, mouse, pufferfish, and salamander (Ambystoma mexicanum) (Postlethwait et al. 2000; Voss et al. 2001; Naruse et al. 2004; Vandepoele et al. 2004). Here we have localized the HoxBaii cluster to LG 9 in rainbow trout, where it shows a tight linkage to growth hormone gene type 1 (GH1), providing support for a possible conserved synteny between salmonids and human. So far, two active growth hormone genes have been identified in salmonids (i.e., GH1 and GH2 [Agellon et al. 1988; Oakley and Phillips 1999]), with the GH2 gene being localized to LG 2 in the Nichols et al. (2003) rainbow trout genetic map. It should be noted that although GH2 has still not been identified in our current linkage map (Sakamoto et al. 2000; unpublished data), we would expect this gene also to show a strong linkage to the HoxBai cluster, as we have localized HoxB4ai to LG 2, and LG 2 and 9 share homeology with each other.

Further, we suggest a possible conserved synteny between LG 12 and LG 29 in rainbow trout and human chromosome 12, where the HoxC cluster and the Natural Resistance-Associated Macrophage Protein gene (NRAMP) are tightly linked (12q13). The NRAMP genes form a family also known as SLC11A. While two copies of SLC11A (SLC11A1 and SLC11A2) exist in mammals, sequences from teleost fish display higher identity to mammalian SLC11A2 than to SLC11A1, with no known teleost ortholog of SLC11A1 (Larhammar et al. 2002). Salmonids possess two copies of SLC11A2, namely, NRAMP-α and NRAMP-β (or SLC11A-α and SLC11A-β) (Dorschner and Phillips 1999). Our results indicate linkage between NRAMP-β and the HoxCbi cluster on LG 12, and a suggestive association between the HoxCbii cluster and NRAMP-α, as both have been located on LG 29, but in different mapping families (i.e., Nichols et al. 2003). It should be noted that the linkage between SLC11A2 and the HoxC cluster has not been retained in zebrafish, as SLC11A2 is located on LG 21, whereas the HoxCa and HoxCb clusters have been localized to LG 23 and LG 11 of the zebrafish genetic map, respectively.

Inferred Homeologies Suggested by Putative Hox Genes

Many putative homeologies have been identified in rainbow trout by mapping genetic markers that show duplicate expression in this species (e.g., Young et al. 1998; Sakamoto et al. 2000; Nichols et al. 2003). Mapping the putative Hox genomic clusters in the rainbow trout genetic map allowed us to confirm some of those known homeologies and further suggest new possible homeologies in this species. Previous studies have suggested homeology between LG 17/22, 27/31, 2/9, 12/16, 29/2, and 3/25, with 1, 5, 10, 4, 4, and 1 duplicated marker(s) conferring the homeology, respectively (Young et al. 1998; Sakamoto et al. 2000; Nichols et al. 2003). Our mapping data for HoxB5bi and HoxB5bii (17/22), HoxA2bi and HoxA2bii (27/31), and HoxB4ai and HoxB4aii (2/9) support these identified homeologies. In addition, localization of the putative HoxCbi, HoxCbii, HoxAai, and HoxAaii clusters to LG 12, 29, 16, and 3, respectively, is suggestive of new potential homeologies. However, caution must be exercised in inferring homeologies that are based solely on highly conserved gene families such as the Hox genes, especially if just a single gene per cluster has been identified. The conserved nature of these genes can sometimes cause cross-amplification of the paralogous genes in addition to the targeted gene. Problems would be encountered if all the amplified fragments have the same size, with the paralogous gene being polymorphic. Less ambiguous results can be obtained by (i) mapping the same gene in more than one reference family or in both parents, (ii) identifying the genomic localization of at least two members of each cluster, and (iii) using comparative mapping as a potential tool for further clarification. Here we were able to make such a clarification for the putative HoxCbii and HoxCbi clusters by localizing two members of each cluster to their respective LG. HoxC4bii and HoxC6bii have been mapped to LG 29 of the male parent of both mapping families (i.e., lot 25 and lot 44) and the female parent of lot 25, and HoxC9bi and HoxC6bi have been localized to LG 12 in the male and the female parent of lot 25. Also, association of these genomic clusters with the NRAMP gene family, as discussed in the previous section, provides further support for the possible homeology between these two LG. With regard to the new putative homeology between LG 3 and LG 16 identified by mapping HoxA4aii and HoxA4ai, it should be noted that our data actually provide supporting evidence for Phillips et al. (2003), who reported a possible homeology between LG 3 and LG 16 in rainbow trout by mapping the duplicated major histocompatibility class I genes to these LG.
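
The comparison made in this section, between previously suggested homeologous LG pairs and the pairs implied by the duplicated Hox genes, can be summarized with a short sketch. It is only an illustrative bookkeeping aid, not part of the mapping analysis itself: the names `PREVIOUS`, `HOX_PAIRS`, and `classify` are hypothetical, and the LG pairs and marker counts are simply those quoted in the text above.

```python
# Illustrative sketch only: tabulates the LG pairs quoted in the text.
# All names here (PREVIOUS, HOX_PAIRS, classify) are hypothetical.

# Previously suggested homeologous LG pairs, with the number of duplicated
# markers reported as supporting each pair.
PREVIOUS = {
    frozenset({17, 22}): 1,
    frozenset({27, 31}): 5,
    frozenset({2, 9}): 10,
    frozenset({12, 16}): 4,
    frozenset({29, 2}): 4,
    frozenset({3, 25}): 1,
}

# LG pairs implied by the duplicated Hox genes or clusters mapped here.
# Pairs are unordered sets, so no gene-to-LG assignment is assumed.
HOX_PAIRS = {
    "HoxB5bi/HoxB5bii": frozenset({17, 22}),
    "HoxA2bi/HoxA2bii": frozenset({27, 31}),
    "HoxB4ai/HoxB4aii": frozenset({2, 9}),
    "HoxCbi/HoxCbii": frozenset({12, 29}),
    "HoxAai/HoxAaii": frozenset({3, 16}),
}


def classify(hox_pairs, previous):
    """Split Hox-derived LG pairs into previously suggested and new ones."""
    confirmed, new = {}, {}
    for label, pair in hox_pairs.items():
        key = tuple(sorted(pair))
        if pair in previous:
            confirmed[label] = key
        else:
            new[label] = key
    return confirmed, new


confirmed, new = classify(HOX_PAIRS, PREVIOUS)

for label, pair in confirmed.items():
    print(f"{label}: LG {pair[0]}/{pair[1]} supports a known homeology")
for label, pair in new.items():
    print(f"{label}: LG {pair[0]}/{pair[1]} suggests a new homeology")
```

Treating each pair as an unordered set reflects that homeology is symmetric; it also makes clear that LG 12/29 and LG 3/16 are absent from the earlier list even though each of these LG already appears there in a different pairing.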

In conclusion, the data presented herein provide further support for the presence of extra Hox genomic clusters in teleost fish and for the subsequent tetraploidization event in the lineage leading to salmonids. Rainbow trout possess at least 14 Hox genomic complexes, with many of the duplicated genes being retained in the genome. The genetic complement of these clusters, however, might be different from those of zebrafish, medaka, and pufferfish, as no evidence of the presence of HoxC4b or HoxC9b has been reported in any other teleost species studied so far. Further, the conserved syntenic blocks identified between salmonids and human are an indication of the relative linkage of these genes in the last common ancestor of sarcopterygians and actinopterygians, which diverged more than 400 MYA (Carroll 1988).