Introduction

How primitive genomes were organized is an intriguing question, the answer to which can only be inferred from features that have survived in current genomes (Birnbaum et al. 2000). Most of the gene associations that have been conserved involve genes that are related either structurally (indicative of a common origin) or functionally (indicative of selective pressures). Here we describe a gene association of apparently nonrelated genes that appears to have survived evolution from the Urbilateria (De Robertis and Sasai 1996), the distant ancestor of current Bilateria. We found that the genes for glucose transporter-3 (SLC2A3), synaptobrevin-1 (VAMP1), and γ-enolase (ENO2) map within a region of approximately 3 Mb of human chromosome 12 (12p13) (Fig. 1a). This region has been shown to be syntenic with the 17p12 region of human chromosome 17 (Lundin 1993). Consistently, the genes for β-enolase (ENO3), synaptobrevin-2 (VAMP2), and Glut-4 (SLC2A4) map within 2.5 Mb of this region of chromosome 17. A third syntenic group was localized in human chromosome 1, at 1p35-p31. This region includes the genes for α-enolase (ENO1), synaptobrevin-3 (VAMP3), and Glut-1 (SLC2A1). These gene groupings of human chromosomes 1, 12, and 17 represent a conserved gene association that we named EVG (from eno-vamp-glut).

Figure 1

Human (a) and Caenorhabditis elegans and Drosophila melanogaster (b) EVGs. (a) Ideograms of human chromosomes 1, 12, and 17 are shown with the mapping positions of human EVGs. (b) Chromosome II of C. elegans and chromosome 3 of D. melanogaster are shown schematically with the approximate location of Ce-EVG and Dm-EVG. The structure of EVGs has been derived from published sequence data from The C. elegans Sequencing Consortium (1998), the database of Drosophila genomic sequences (Myers et al. 2000), and the database resources of the National Center for Biotechnology Information (Wheeler et al. 2001). Accession numbers of sequences used in this study are as follows: human ENO3, NP_001967; human ENO2, NP_001966; human ENO1, NP_001419; Drosophila melanogaster Eno, CAA34895; Caenorhabditis elegans Eno, Q27527; human VAMP1, AAA60603; human VAMP2, AAA60604; human VAMP3, AAB05814; Drosophila melanogaster n-syb, AAB28707; Caenorhabditis elegans VAMP (II), CAB03366; Caenorhabditis elegans Snb1, T33239; human SLC2A1, NP_006507; human SLC2A3, NP_008862; human SLC2A4, NP_001033; Caenorhabditis elegans Glut, CAA91409; Drosophila melanogaster Glut1, AAC36683.

EVG also seems to be present in the genomes of Drosophila melanogaster and Caenorhabditis elegans. These species possess homologues of SLC2A, VAMP, and Eno in an arrangement reminiscent of EVG (Fig. 1b). Ce-EVG spans 512 kb of C. elegans chromosome II (The C. elegans Sequencing Consortium 1998). Dm-EVG, however, is split into two regions: that of DmGlut1 (Escher and Rasmuson-Lestander 1999) and Dmn-syb (a VAMP gene; DiAntonio et al. 1993), which map within 684 kbp near the telomere of chromosome 3R of the fly, and that of the enolase gene (Eno), located in the 22A band of chromosome 2L (Bishop and Corces 1990).

The statistical significance of these groupings of genes was estimated according to Trachtulec and Forejt (2001), modified for paralogue substitution by multiplying the probability of by-chance association by the number of all possible paralogues of each gene. These estimates gave low probabilities that EVGs have occurred by chance (7.3 × 10⁻³, 2.1 × 10⁻⁵, and 1.4 × 10⁻⁵ for human EVG1, EVG12, and EVG17 and 7.5 × 10⁻⁴ and 3 × 10⁻² for C. elegans and Drosophila EVGs, respectively). Overall, the cumulative probability that human, Drosophila, and C. elegans EVGs could have occurred by chance and in an unrelated manner is very low (4.8 × 10⁻¹⁷), reinforcing the idea that the EVGs of the three species are evolutionarily related and not the result of genetic convergence.
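The cumulative figure can be checked by simple arithmetic. The sketch below assumes that the five by-chance probabilities quoted above are treated as independent events, so that their joint probability is their product. It does not reproduce the estimation procedure of Trachtulec and Forejt (2001) itself, and the variable and function names are illustrative only.

```python
import math

# By-chance probabilities reported in the text for each EVG.
p_chance = {
    "human EVG1": 7.3e-3,
    "human EVG12": 2.1e-5,
    "human EVG17": 1.4e-5,
    "C. elegans EVG": 7.5e-4,
    "Drosophila EVG": 3e-2,
}


def cumulative_probability(probabilities):
    """Joint probability of independent events (assumption): the product."""
    return math.prod(probabilities)


cumulative = cumulative_probability(p_chance.values())
print(f"{cumulative:.1e}")  # prints 4.8e-17
```

Under this independence assumption, the product agrees with the reported cumulative value of 4.8 × 10⁻¹⁷.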

In the mammalian lineage, EVG has experienced at least two rounds of duplication, which gave rise to the current three human EVGs. Since teleost fish possess at least two Glut, two Eno, and two VAMP genes, EVG could have already duplicated at least once when teleosts appeared. The existence of three EVGs in humans supports the hypothesis of Susumu Ohno (1999) on vertebrate genome evolution, which proposes that two genome-wide duplications (tetraploidization) of an ancestral genome occurred after the separation of vertebrates and invertebrates. This hypothesis has recently been revived by several reports on the existence of en bloc duplications in vertebrate genomes (Abi-Rached et al. 2002; Gu et al. 2002; McLysaght et al. 2002; reviewed by Spring 2002).

The existence of EVG in D. melanogaster and C. elegans suggests that this gene arrangement appeared in evolution before the separation of these phyla from vertebrates. Estimates of when these three taxa separated range from 670 million years (Myr) ago (Ayala and Rzhetsky 1998) to 993–1177 Myr ago (Wang et al. 1999), which suggests that EVG could be more than 1 billion years old, well before the Cambrian explosion. EVG was not found in the Saccharomyces cerevisiae or Schizosaccharomyces pombe genomes. Arabidopsis thaliana has putative genes for a sugar transporter, a synaptobrevin (VAMP), and an enolase, which map within 3.6 Mb of chromosome II. Although the statistical significance of this gene grouping is comparable to that of EVG, we should be cautious in considering it a plant version of EVG, because of the rather low sequence similarities of the A. thaliana VAMP and Glut-like genes to their animal counterparts.

During evolution, gene rearrangements have occurred among the EVG genes, although they did not change the general topology of the syntenies. For example, the VAMP-like gene within Ce-EVG (Vamp[II]) is not the authentic orthologue of the VAMP gene present in any of the three human EVGs. The most probable orthologue, Ce-Snb-1 (The C. elegans Sequencing Consortium 1998), is located on chromosome V. The evolution of EVG has also permitted the inclusion of other genes in the group, besides the Glut, VAMP, and Eno genes that define the association. However, these extra genes are not conserved among all EVGs, nor are they conserved among species, and thus they were probably not integral to the primitive EVG but were later evolutionary additions to the association.

Primitive genomes can be regarded as collections of genes in linkage groups or primordial chromosomes. Different evolutionary histories have led to the current chromosomal situation, with most of the original gene vicinity relationships being lost. The evolutionary persistence of EVG is quite striking, since Ecdysozoa and Vertebrata, the two major clades where EVG has been found so far, have followed quite different evolutionary paths. Given the enormous cytogenetic changes that genomes have experienced over such a long period of evolution, different gene organizations would be expected to have evolved. Exceptions to this are gene clusters that have been maintained due to selective pressures or that have originated by successive internal duplications of one or several ancestral genes. Long-lasting gene associations usually group genes with some kind of relationship, either structural or functional. Although several examples of conserved syntenies exist among vertebrate genomes, these are much scarcer between invertebrates and vertebrates and even more difficult to find when the search is extended to distantly related phyla like Ecdysozoa and Vertebrata.

Table 1 summarizes the known common syntenies found in current Ecdysozoa (of which D. melanogaster and C. elegans are representatives) and vertebrates. The best-known examples are the clusters of homeobox genes, which include the Hox, ParaHox, NKL, and EHGbox gene clusters (Pollard and Holland 2000). These clusters group genes that are structurally and functionally related, and a common evolutionary origin is obvious. Their conservation is thought to be due to the selective pressures imposed by their coordinated expression during embryonic development. The Wnt genes provide another example of this type of preserved gene association. The Wnt-related genes form a cluster in Drosophila melanogaster, and two paralogous groups have been found in the human genome, which appear to have originated by genomic duplication (Nusse 2001). Again, the Wnt cluster groups genes that are structurally related (derived from a common ancestor by gene endoduplication) and involved in functionally related roles, which could have been instrumental for their evolutionary conservation.

Table 1 Conserved syntenies between invertebrates and vertebrates

However, syntenies of structurally or functionally unrelated genes that have been maintained since the separation of Ecdysozoa and Vertebrata are much rarer. Table 1 includes the three known examples of syntenies of nonrelated genes that are found in the genomes of D. melanogaster, C. elegans, and humans: FGFR/VMAT (Pebusque et al. 1998), TBP/C5 (Trachtulec and Forejt 2001; Trachtulec et al. 1997), and EVG (this article). These syntenies could represent the vestiges of the gene organization of an ancient urbilaterian genome. Of the three, EVG is the smallest, in base pairs, in both the D. melanogaster (EVG, 684 kbp, vs FGFR/VMAT, 18 Mbp, and TBP/C5, 2.5 Mbp) and C. elegans (EVG, 512 kbp, vs FGFR/VMAT, 6–8 Mbp, and TBP/C5, 4.7 Mbp) genomes.

What determined the evolutionary conservation of EVG? The final answer to this question may be difficult to find. However, the evolutionary persistence of this synteny may shed light on the reasons why it originated. It is tempting to speculate that the evolutionary appearance of EVG might be related to the original functions of these genes and the selective advantage that their linkage could represent for the organism in the rapidly changing environment of Earth during those primeval times. Genes clustered in each human EVG seem to share a similar pattern of expression. Thus, SLC2A4, VAMP2, and ENO3, which define human EVG-17, are expressed predominantly in muscle. Human EVG-12 genes are expressed in neuronal tissue, whereas genes of human EVG-1 are broadly expressed. This hints at mechanisms of concerted gene expression as a possible explanation for the evolutionary appearance of EVG. Enolase and glucose transporters (SLC2As) are both involved in glucose metabolism, and the vesicular transport of Glut-4 (SLC2A4) is dependent on VAMP2 (Ramm et al. 2000; Randhawa et al. 2000). It seems reasonable that, for the primitive organism in which the EVG association appeared, the linkage of these genes could confer a selective advantage, probably based on mechanisms of concerted expression of the three genes. Thus, although the concomitant expression of the current EVG genes may not necessarily be relevant for their present functions, it could have been so at the origin of the EVG association, and what remains today are just the relics of that relevance.

The coordinated expression of the ancestral EVG genes could have been attained by way of a peculiar genomic structure of the chromosomal region where these genes were located. This could rely on large-scale genomic structures that define chromatin loops important for the expression of the genes within the loop. Recent reports on large-scale gene expression mapping have shown the clustering of housekeeping and highly expressed genes in chromosomal domains (Boutanaev et al. 2002; Caron et al. 2001; Lercher et al. 2002, 2003; reviewed by Oliver et al. 2002; Spellman and Rubin 2002). These observations demonstrated that genes located at nearby chromosomal positions often show similar expression patterns. The underlying molecular mechanisms are still a matter of discussion, although regulatory elements, such as insulators (West et al. 2002), nuclear compartmentalization (Carmo-Fonseca 2002), and genes of the Polycomb and Trithorax groups (Simon and Tamkun 2002), have been implicated in the establishment and maintenance of large-scale expression domains. This type of mechanism could be relevant for the evolutionary conservation of large chromosomal regions and help to explain the evolutionary persistence of syntenic groups like EVG, even though the genes that define the current syntenies do not seem to share clear functional and/or structural relationships.