Introduction

Eukaryotic cells probably utilize several hundred GTPases to control processes ranging from protein synthesis to cell growth, but members of one particular family, the Ras-related small GTPases, have emerged as key players in the regulation of many important biological processes including growth and differentiation, morphogenesis, cell division and motility, cytokinesis, and trafficking through the Golgi apparatus, nucleus, and endosomes (Exton 1998). Small GTPases are monomeric guanine nucleotide-binding proteins with an Mr of 20–25 kDa and are grouped into five subfamilies: Ras, Rho, Arf (ADP-ribosylation factors), Rab, and Ran. Ras was the first small GTPase to be discovered (Ellis et al. 1981). Ras proteins regulate cell growth, proliferation, and differentiation. Rho proteins (Rho, Rac, and Cdc42) control the actin cytoskeleton, and Arfs are critical components in vesicular trafficking pathways and microtubule dynamics. Rab GTPases form the largest subfamily of small GTPases and are essential regulators in the secretory and endocytic pathways and in vesicle trafficking. Ran, the most recently discovered of these GTPases, plays a central role in protein and RNA trafficking in and out of the nucleus (Moore and Blobel 1993).

Small GTPases function as molecular switches and cycle between active and inactive conformations: in complex with GDP they are inactive, and they are activated when GTP is bound (Bourne et al. 1991; 1898771). GTP hydrolysis and GDP/GTP exchange are catalyzed by GTPase activating proteins (GAPs) and guanine nucleotide exchange factors (GEFs or GDSs), respectively (Boguski and McCormick 1993).

Small GTPases are found in all eukaryotic organisms, from Giardia to human. Their number, especially in the Rab subfamily, is generally much higher in metazoans than in unicellular organisms. The motifs that define proteins from the Ras superfamily are mainly some completely conserved residues implicated in GTP binding: the GXXXXGKS/T, F, T, DXXGQ/TE, NXXD, and SXK. Members from different subfamilies share a maximum of 40% identity.
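The conserved motifs listed above can be located in a protein sequence with simple pattern matching. The following is a minimal sketch, not the procedure used in this study: the regular expressions are one possible reading of the motif notation (in particular, DXXGQ/TE is read here as D-X-X-G-[Q or T]-E), the function name `find_motifs` is hypothetical, and the test peptide is invented. The single-residue positions F and T cannot be searched this way, and the short motifs NXXD and SXK match by chance quite often, which is why each motif is searched only downstream of the previous one.

```python
import re

# Regex readings of the motif notation used in the text (X = any residue).
# Reading "DXXGQ/TE" as D-X-X-G-[Q or T]-E is an assumption of this sketch.
MOTIFS = [
    ("GXXXXGKS/T", re.compile(r"G.{4}GK[ST]")),
    ("DXXGQ/TE", re.compile(r"D..G[QT]E")),
    ("NXXD", re.compile(r"N..D")),
    ("SXK", re.compile(r"S.K")),
]


def find_motifs(protein):
    """Return {motif: 1-based start or None}, searching N- to C-terminus in order."""
    positions = {}
    start = 0
    for name, pattern in MOTIFS:
        match = pattern.search(protein, start)
        if match:
            positions[name] = match.start() + 1
            start = match.end()  # the next motif must lie further downstream
        else:
            positions[name] = None
    return positions


# Invented toy peptide for illustration; not a real GTPase sequence.
toy = "MAAGAGGVGKSALTDTAGQEEYNKCDLSAKCVLS"
print(find_motifs(toy))
```

A hit for all four patterns in the expected order is only a first filter; classification into subfamilies still requires alignment against known proteins.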

Sponges (Porifera) are the most ancient and simplest extant multicellular organisms, which branched off first from the common ancestor of all Metazoa. From the extensive analysis of sponge genes and proteins, especially from the demosponge Suberites domuncula, we now know that the sponge genome encodes a large number of proteins, including many metazoan novelties, thus tracing their origins to the root of metazoan evolution (Cetkovic et al. 1998, 2004a, b; Müller et al. 2004; Müller 1998; Gamulin et al. 1994; Schacke et al. 1994; Pfeifer et al. 1993). However, only three S. domuncula proteins from the Ras family of small GTPases have been described until now: Ras (CAA77070), RhoA (CAH04892), and Cdc42 (CAH04893).

A collection of more than 13,000 partial cDNA sequences (ESTs) from S. domuncula was recently obtained, encoding at most 3000 different proteins. We have inspected the S. domuncula EST database and have identified full or partial cDNA sequences encoding ∼50 different small GTPases from the Ras family. The results of analyses of the complete, or nearly complete, sequences of 44 Ras-like proteins from S. domuncula (41 new and 3 known proteins) are reported here. In addition, results of the topological expression of small GTPase Cdc42, studied by in situ hybridization, are also discussed.

Materials and Methods

S. domuncula Database of Expressed Sequence Tags (ESTs)

Live specimens of the marine sponge S. domuncula (Porifera, Demospongiae, Tetractinomorpha, Hadromerida, Suberitidae) were collected in the northern Adriatic Sea near Rovinj, Croatia, and then kept in an aquarium in Mainz, Germany, at a temperature of 17°C. The preparation of the S. domuncula cDNA library in the λ ZAP-Express vector was described earlier (Kruse et al. 1997). Random sequencing of cDNAs was performed from the 5′-ends. The average length of the sequences is 650–750 base pairs. ESTs were assembled into 4429 “unique” clusters using MiraEST Assembler (Chevreux et al. 2004). The S. domuncula EST database is available at http://www.spongebase.genoserv.de. We have also included in this analysis about 1800 additional (redundant) ESTs, which are not yet deposited in this database.

Sequence Analysis

TBLASTN was used to identify single S. domuncula ESTs or EST clusters encoding homologues of Ras-like small GTPases from humans. ESTs encoding small GTPases were translated and analyzed using the Translate and ProtParam tools at the ExPASy proteomics server (http://www.expasy.org). Sponge small GTPases were further analyzed by NCBI search of the Conserved Domain Database (CDD) with Reverse Position Specific BLAST. Multiple sequence alignments (MSAs) and construction of the phylogenetic tree from the MSA were performed with the CLUSTAL X program (Thompson et al. 1997). The programs GeneDoc (Nicholas and Nicholas 1997) and TreeView, version 1.6.6 (Page 1996), were used for graphic presentation of the results. Accession numbers of sponge and human small GTPases are shown in Table 1.
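Both TBLASTN and the Translate tool rely on conceptual translation of a nucleotide sequence in all six reading frames. As an illustration only, that step could be sketched as follows. This is not the code of either tool; it assumes the standard genetic code, the function names are hypothetical, and the example sequence is invented.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, offset=0):
    """Translate complete codons starting at `offset`; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Return the translations of the three forward and three reverse frames."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


# Invented 12-nucleotide example; '*' marks a stop codon in the output.
for frame, peptide in six_frames("ATGGGCAAATCT").items():
    print(frame, peptide)
```

Because the ESTs were sequenced from the 5′-ends of the cDNAs, the coding frame is normally one of the three forward frames, but a six-frame view also catches clones inserted in the opposite orientation.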

Table 1. Small GTPases from S. domuncula: comparison with most related human proteins

Results

The Ras Family of Small GTPases from the Marine Sponge Suberites domuncula

We have identified in the S. domuncula database of ESTs 44 members of the Ras family of small GTPases. Ras-like proteins from all five subfamilies of small GTPases were found. However, the number of Ras-like small GTPases in the actual collection of ESTs is ∼50, because several partial, truncated, and unreliable cDNA sequences coding for additional Ras-like proteins were excluded from the analysis. Complete amino acid (aa) sequences for 36 proteins were obtained; 8 protein sequences are missing several amino acids from the (nonconserved) N- or C-terminus. This was not important for the identification and classification of these proteins. Six proteins from the Ras subfamily, 5 from Rho, 6 from Arf, 1 Ran, and 26 Rabs or Rab-like proteins were identified. When clear orthologues in Metazoa were found (>50% identity), sponge proteins were named according to the names of their orthologues in humans, with the prefix Sd for S. domuncula. Small GTPases from S. domuncula are more related in primary structure to orthologues/homologues from vertebrates than to those from two model invertebrates: C. elegans and D. melanogaster. Accession numbers for 44 Ras family members from S. domuncula are given in Table 1.
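The naming rule described above (a sponge protein takes the human name with the prefix Sd when identity exceeds 50%) can be illustrated with a short sketch. It assumes that identity is counted over aligned columns in which neither sequence has a gap, which is only one of several conventions and may differ from the values reported by BLAST or CLUSTAL X. The function names and the aligned toy peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)


def sponge_name(human_name, identity, threshold=50.0):
    """Apply the >50% identity rule: prefix 'Sd' to the human orthologue's name."""
    return f"Sd{human_name}" if identity > threshold else None


# Invented aligned toy peptides; '-' marks an alignment gap.
identity = percent_identity("MKV-LAG", "MKVALSG")
print(round(identity, 1), sponge_name("Ran", identity))
```

Proteins that fall below the threshold return `None` here; the "-like" names used in the text for such proteins were assigned case by case and are not modeled by this sketch.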

The multiple alignment of 44 S. domuncula proteins from the Ras family is shown in Fig. 1, and the unrooted tree of orthologous protein pairs from S. domuncula and humans in Fig. 2.

Fig. 1. Multiple alignment of 44 small GTPases from the marine sponge Suberites domuncula. The motifs implicated in GTP binding (GXXXXGKS/T, F, T, DXXGQ/TE, NXXD, and SXK) are marked.

Fig. 2. Unrooted phylogenetic tree of 44 small GTPases from the marine sponge Suberites domuncula and their most related human proteins. The bar corresponds to 0.1 (10%) substitution per site.

Members of the Ras Subfamily

Six S. domuncula small GTPases belong to the Ras subfamily: five have clear orthologues in Metazoa, including mammals/humans (Table 1). They were named SdR-Ras2, SdK-Ras2, SdRap1, SdRalA, and SdRheb. A protein named SdRap1-like has additional amino acids at both ends and shares only 41% identity (63% overall similarity) with SdRap1, or maximum 38% identity with human Raps. Several ESTs for the same protein were identified (nine in the case of SdR-Ras2), and only SdRheb was encoded on a single EST. Regions important for GTP/GDP binding are completely conserved in all six proteins from the Ras subfamily (Fig. 1), and only SdK-Ras2 does not end with CXXX (CKV at the C-end). Orthologous proteins from humans and S. domuncula form a separate cluster in Fig. 2.

Members of the Rho Subfamily

Five S. domuncula small GTPases belong to the Rho subfamily. They all have clear orthologues in Metazoa, including mammals/humans (Table 1), and were named SdRho1, SdRho2, SdRho3, SdCdc42, and SdRac. The three SdRho proteins share at most 74% identity (Rho1 and Rho3). SdRho2 was encoded on five ESTs, and the others on one or two ESTs. The insertion characteristic of Rho subfamily proteins is present in all five members from S. domuncula, and all proteins have CXXX at the C-end (Fig. 1). As in other Cdc42 proteins, in the G2 region of SdCdc42 the sequence reads TQVD. SdRho3 has one extra glycine in the PM3 region. SdRho proteins and orthologues from humans cluster together and form a separate branch in Fig. 2.

Members of the Rab Subfamily

Rab proteins form the largest subfamily of Ras-like small GTPases. Seventeen Rabs and nine Rab-like proteins were found in the S. domuncula EST database. SdRab1, -2, -3, -4, -5, -8, -10, -11, -14, -18, -21, -24, -28, -32, -35, -39, and -41/43 display 53% to 92% identity with orthologues from humans (Table 1). SdRab1-like is not closely related to SdRab1 (only 44% identity). SdRab20-like is most similar to Rab20 from Strongylocentrotus purpuratus (43% identity; 72171441). This protein has an insertion in the same region as the Rho proteins (Fig. 1). SdRab21-like (43% identity with SdRab21) is related to a D. melanogaster Rab called Chrowded (45% identity; AAF47018) and an unknown protein from S. purpuratus (59% identity; 72044634). Proteins named SdRab-like 1 to 6 do not have clear orthologues in Metazoa and show a similar degree of homology with small GTPases from Metazoa and unicellular eukaryotes (maximum 37% identity). In addition, catalytic regions in SdRab-like proteins are not always well conserved. Interestingly, five ESTs encode SdRab-like 6, while most other SdRabs were found on one or two ESTs and only SdRab4 and SdRab11 are encoded on six and five ESTs, respectively. All full-size SdRabs and Rab-like proteins have characteristic motifs with two cysteines at the C-terminus. SdRab5, -8, -14, and -41/43 and SdRab-like4 do not have complete C-ends, while SdRab24 and SdRab-like2 have incomplete N-ends (Fig. 1). In the large branch of Rab proteins, SdRab-like proteins form a separate subcluster (Fig. 2).

Members of the Arf Subfamily

Six S. domuncula small GTPases belong to the Arf subfamily, the most divergent subfamily in the Ras family and putative progenitors of all small GTPases (Kahn et al. 2005). Nevertheless, Arfs are well-conserved proteins and five S. domuncula Arfs have clear orthologues in other Metazoa. They were therefore named SdArf1, SdArf3, SdArf6, SdArl1, and SdArl10 (Table 1). The SdArl1 protein is not complete and is missing up to 40 N-terminal amino acids including the P-loop (Fig. 1). A protein named SdArf1-like shows high homology with human Arf1 (66% identity), although much less than SdArf1 (93%). SdArf6 was found encoded on eight ESTs; SdArl10, on five; and others, on one to three ESTs. In the G3 region Arfs, but not Arls (Arf-like proteins), have a variation of the TCAT sequence and an amino-terminal glycine, the site of N-myristoylation (Fig. 1). Arfs do not contain cysteine at the C-end and are not posttranslationally modified at the C-terminus. SdArf proteins and their orthologues from humans form the most diverged branch of Ras-like small GTPases (Fig. 2).

Ran Protein

The single Ran orthologue, SdRan, with 83% identity to human Ran, was encoded on three ESTs in the S. domuncula database. Most animals have only one Ran. SdRan has a canonical sequence with important regions fully conserved. Rans are closest to Arfs; they also lack cysteines at the C-ends and are not posttranslationally modified.
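The C-terminal features noted for the five subfamilies (CXXX in Ras and Rho proteins, two cysteines in Rabs, no C-terminal cysteine in Arfs and Ran) lend themselves to a simple heuristic check. The sketch below is an illustration under stated assumptions: the five-residue window used to look for two cysteines is an arbitrary choice of this example, the function name is hypothetical, and the toy tails are invented. Incomplete C-ends, as in several of the sequences described above, would be reported as "none".

```python
def c_terminal_motif(protein, window=5):
    """Classify the C-terminus as 'double-Cys', 'CXXX', or 'none' (heuristic)."""
    tail = protein[-window:]
    # Two or more cysteines within the last `window` residues (assumed window).
    if tail.count("C") >= 2:
        return "double-Cys"
    # A single cysteine exactly four residues from the end.
    if len(protein) >= 4 and protein[-4] == "C":
        return "CXXX"
    return "none"


# Invented toy C-terminal tails for illustration.
for tail in ("AAAACVLS", "AAAAGCC", "AAAACSC", "AAAAKDEL"):
    print(tail, c_terminal_motif(tail))
```

The double-cysteine test runs first because a tail such as CCXX would otherwise also satisfy the CXXX rule.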

Discussion

Ras-like Small GTPases

The S. domuncula EST database (http://www.spongebase.genoserv.de) contains 11,242 partial cDNA sequences, organized in 4429 “unique” clusters (Chevreux et al. 2004). Clusters contain from 1 to more than 100 sequences. However, due to small sequencing errors, or differences in allelic sequences, identical or nearly identical proteins are often encoded on several “unique” clusters. In addition to this database, 1800 new (redundant) ESTs from S. domuncula were recently obtained. Altogether, about 13,000 ESTs from S. domuncula were analyzed in this work. Our estimate is that partial/full cDNAs for fewer than 3000 different proteins are present in the current collection of ∼13,000 ESTs. This is obviously far from the real number of protein-coding genes in this sponge. However, cDNAs encoding ∼50 small GTPases were already identified in this database. In addition to the 44 complete, or nearly complete, protein sequences reported here, we have found partial, truncated, or unreliable cDNAs for 4–6 more Rabs (SdRab6?, SdRab30?, SdRab13?) or Rab-like proteins. Isoforms of the same protein were not found. Most small GTPases are encoded by only 1–5 ESTs in the current database; it is therefore reasonable to speculate that the final number of genes for these proteins in the S. domuncula genome could be much higher than 50. Notably, we have not yet found cDNAs for Sar protein(s) in S. domuncula. By comparison, the C. elegans genome codes for only 56 members of Ras-like small GTPases (Lundquist 2006), and that of D. melanogaster for 90 members (Jiang and Ramachandran 2006).

We were not surprised that S. domuncula small GTPases generally display higher degrees of sequence conservation with orthologues/homologues from vertebrates/mammals than with those from either C. elegans or D. melanogaster. This is a general characteristic of sponge genes/proteins, observed in many previous studies (Gamulin et al. 2000; Cetkovic et al. 2004a; Perina et al. 2006). However, a large collection of protein sequences from the sea urchin S. purpuratus is now available for comparison. Interestingly, S. domuncula proteins are now most related to their orthologues from S. purpuratus. It is important to mention that Echinodermata are Deuterostomia, like all chordates, including vertebrates.

Rab proteins form the largest subfamily of Ras-like small GTPases. The human genome encodes 60 Rab proteins (including isoforms), grouped in functional subclasses (Moore et al. 1995), named Rab1 to Rab41 (Pereira-Leal and Seabra 2001). We have already identified 26 Rab or Rab-like proteins from S. domuncula, compared with only 29 proteins from this subfamily encoded in the genomes of C. elegans or D. melanogaster (Pereira-Leal and Seabra 2001). Clear orthologues of 17 human Rabs were identified, including SdRab24 (64% identity with human Rab24), which is missing in both model invertebrates. On the other hand, several Rabs (e.g., Rab6, -7, -9, -30), widely present in Metazoa and found in C. elegans and D. melanogaster, are not encoded in the current collection of S. domuncula ESTs. There is little reason to believe that they are really absent in sponges.

SdRab-like proteins, the most diverged members of the S. domuncula Rab subfamily, form a separate deep branch in the phylogenetic tree shown in Fig. 2. They are not closely related and do not share unique sequence motifs. They also have only moderately conserved Rab specific motifs F1–F5 (Pereira-Leal and Seabra 2000), even when conservative substitutions are considered (Table 2). However, some SdRabs with clear orthologues in mammals also do not have fully conserved Rab specific motifs F1–F5 (Fig. 1). Interestingly, all SdRab-like proteins have F (Phe) positioned 3 aa from the GxxxxGKS/T conserved PM1 motif (Fig. 1). This residue is part of α-helix 1 (Huber and Scheidig 2005) and is always occupied by V/L/I/M (or T in Ras) in all other small GTPases, including those from S. domuncula. We have inspected hundreds of Rabs from different organisms and have found only six putative Rabs with Phe at this position: the EhRabK group of four Rab-like proteins from Entamoeba histolytica (Saito-Nakano et al. 2005), another RabGTPase (EAL49954) from the same protist, and an unknown protein (putative Rab) from D. melanogaster (AAL49269).
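The residue discussed above can be extracted programmatically once the PM1 motif has been located. The following sketch assumes that "3 aa from the motif" means the third residue after the final S/T of GxxxxGKS/T, on the C-terminal side where α-helix 1 begins; that reading, the function name, and the toy peptides are assumptions of this example rather than details given in the text.

```python
import re

# PM1 (P-loop) motif GxxxxGKS/T written as a regular expression.
P_LOOP = re.compile(r"G.{4}GK[ST]")


def residue_after_p_loop(protein, offset=3):
    """Return the residue `offset` positions after the PM1 motif, or None."""
    match = P_LOOP.search(protein)
    if match is None:
        return None
    index = match.end() + offset - 1  # match.end() is the first residue after the motif
    return protein[index] if index < len(protein) else None


# Invented toy peptides; the second mimics the Phe found in SdRab-like proteins.
print(residue_after_p_loop("MAAGAGGVGKSALTIQ"))
print(residue_after_p_loop("MGDSGVGKSSLFLRF"))
```

Applied to a set of aligned Rab sequences, a tally of this residue would show how unusual Phe is at the position compared with V/L/I/M.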

Table 2. Rab specific motifs F1–F5 in Rab-like proteins from S. domuncula

The number of GTPases from the Rab subfamily in unicellular eukaryotes varies widely: from 7 in the yeast Schizosaccharomyces pombe or 11 (including isoforms) in Saccharomyces cerevisiae (Pereira-Leal and Seabra 2001) to 16 in Trypanosoma brucei (Ackers et al. 2005) and more than 90 putative Rab proteins in the protozoan parasite E. histolytica (Saito-Nakano et al. 2005). However, only six Rab subfamilies or subclasses from E. histolytica (22 proteins including isoforms) share more than 40% identity with six human Rabs (1, 2, 5, 7, 8, and 11) and all other putative Rabs have no obvious homologues in other organisms (Saito-Nakano et al. 2005). Only nine Rabs from T. brucei have potential orthologues/homologues in animals: TbRabs1, -2, -4, -5, -6, -7, -11, -18, -21, -23, and -28 (Ackers et al. 2005). A very similar situation is also found in the plant Arabidopsis thaliana: 57 Rab proteins correspond in terms of homology to only 8 animal Rabs: Rab1, -2, -5, -6, -7, -8, -11, and -18 (Pereira-Leal and Seabra 2001). Therefore, all these organisms have fewer than 10 subfamilies of Rabs that correspond to metazoan/human Rabs. We have already identified 17 clear orthologues of animal Rabs in a sponge, the simplest metazoan organism with tissue-like cell assemblies and only a few specialized cell types (Müller 2006). Recent investigations of basal metazoan taxa, Porifera and Cnidaria, have shown that these simplest multicellular animals encode many mammalian orthologues, which are missing (or are highly diverged) in fly and worm genomes (Gamulin et al. 2000; Kortschak et al. 2003; Cetkovic et al. 2004b). Their genes (intron/exon organization) are also more related to vertebrate genes than to those from C. elegans and D. melanogaster (Schmitt and Brower 2001; Müller et al. 2002; Cetkovic et al. 2004a). Both model invertebrate organisms recently experienced accelerated evolution, and therefore sponge proteins very probably better reflect structures of proteins in the ancestral metazoan genome.

It was already noticed that unlike other families involved in vesicle trafficking (coats, SNAREs, and Sec1s), the Rab subfamily of small GTPases expanded and diversified in function significantly during the evolutionary leap from single to multicellular organisms (Bock 2001). The number of Rabs is not connected with the number of cells in a metazoan organism: C. elegans and D. melanogaster have the same number of Rabs (29) and differ a million times in the number of cells (Pereira-Leal and Seabra 2001). We have already identified 26 Rabs in a collection of maximum 3000 partial sponge protein sequences. It is therefore very likely that the Rab subfamily of small GTPases expanded and diversified in function very early in the evolution of Metazoa, at the level of the hypothetical ancestors of all multicellular organisms—Urmetazoa. Furthermore, it is reasonable to expect that (many) more members from other subfamilies of Ras-like small GTPases are also present in S. domuncula. The total number of different proteins from Ras-like family in this sponge could thus be equal to or even higher than that in two model invertebrates. Our results indicate that duplications and diversifications of genes encoding Ras-like small GTPases, especially the Rab subfamily of small GTPases, happened very early in the evolution of Metazoa.

Proteins Involved in the Dynamic Actin Network and Phagocytosis

Sponges are suspension-feeders that are devoid of body cavities. As in the slime mold Dictyostelium discoideum, phagocytosis is the major route of nutrition of these simple multicellular animals. Thanks to their ability to filter, retain, and digest bacteria by phagocytosis, sponges were recently used as a bioremediator to remove pathogenic bacteria in integrated aquaculture ecosystems (Fu et al. 2006). In higher organisms, phagocytic cells, such as macrophages, are essential for host defense against invading pathogens, and phagocytosis contributes to inflammation and the immune response. To engulf particles by phagocytosis, profound rearrangements of the actin cytoskeleton and plasma membranes are required. Actin polymerization is thought to be the driving force for membrane extension around the particle. Control of actin filament polymerization depends on Cdc42 and Rac, two small GTPases from the Rho subfamily (Chimini and Chavrier 2000). In their GTP-bound conformation Cdc42 and Rac interact with downstream effectors, such as the Wiskott-Aldrich syndrome protein (WASP), which activates the actin-nucleating Arp2/3 complex, and this complex is necessary for the assembly of actin filaments at the phagocytic cup (Millard et al. 2004). The Arp2/3 complex is made of seven proteins and is evolutionarily well conserved among eukaryotes. Cortactin, a predominant substrate of Src family kinases, also directly activates the Arp2/3 complex and is one of the key regulatory proteins of the actin cytoskeleton in cancer cell migration and invasion (Yamaguchi and Condeelis 2006). Signaling via Src kinases and cortactin is a characteristic of Metazoa and is not found in unicellular eukaryotes, fungi, plants, or D. discoideum. However, Src kinases from S. domuncula, a true metazoan organism, are well investigated (Cetkovic et al. 2004a), and cortactin was also previously identified. Cortactin from S. domuncula is 477 aa long, has five HS1 repeats, and ends with the SH3 domain (CAC80140).

In this work we have identified in the S. domuncula EST database full or partial cDNA sequences for proteins from Arp2/3 complex and WASP and have compared sponge proteins with orthologues from vertebrates, D. melanogaster, C. elegans, D. discoideum, and S. cerevisiae. The results are summarized in Table 3.

Table 3. Proteins involved in the dynamic actin network and phagocytosis: comparison with human, D. melanogaster, C. elegans, D. discoideum, and S. cerevisiae

The highest homology was always found with proteins from vertebrates; the degree of primary sequence conservation varied very little from fish to human. However, orthologous proteins from two model invertebrates, especially from C. elegans, displayed significantly lower overall homology with sponge proteins, while WASP and Arp2/3 proteins from D. discoideum diverged even more. The lowest sequence conservation was found with orthologues from S. cerevisiae.

The topological expression of the cdc42 gene in S. domuncula was studied by in situ hybridization. Frozen sections through sponge tissue (8 μm thick) were processed according to the described procedure (Perovic et al. 2003) and hybridized in parallel with digoxigenin-labeled antisense and sense probes for the cdc42 gene transcript. The results are shown in Fig. 3. The expression level of cdc42 is highest in the endopinacoderm (epithelial-like layer), especially in the oscule region. A dynamic actin cytoskeleton has been implicated not only in the endocytotic pathway (Engqvist-Goldstein and Drubin 2003) but also in defining cell polarity (Cohen et al. 2001) and axis formation (Holland 2002). In a recent study it was proposed that the oscule region is one organizing center which controls apical-basal axis formation in sponges (Müller 2005). Our finding suggests that Cdc42 in S. domuncula is, among other functions, also involved in axis formation.

Fig. 3. In situ hybridization analysis applying probes for Cdc42 (Sdcdc42). A Applying a labeled Sdcdc42 antisense probe, high expression is seen in the endopinacoderm layer (>), surrounding the atrial cavity (at). This cavity is localized below the oscule (o). B In a parallel experiment a labeled Sdcdc42 sense probe was used, which showed comparably weak expression. Bars = 500 μm.