Introduction

The ATP binding cassette (ABC) superfamily comprises proteins sharing conserved motifs and represents a large family in both prokaryotic and eukaryotic organisms (Biemans-Oldehinkel et al. 2006; Higgins 1992). The most recognizable feature of proteins within this superfamily is the nucleotide binding domain (NBD), which contains a Walker A domain, the ABC signature sequence, and a Walker B site. Bacteria have used ABC transporters to transport small molecules both into and out of the cytoplasm. Bacterial transporters may require additional components to function. For example, substrate binding proteins are needed for import-type transporters, and toxin exporters of gram-negative bacteria require both a membrane fusion protein and an outer membrane protein (Delepelaire 1994). With the exception of a few plant and yeast genes that have been demonstrated to function as importers, the majority of ABC transporters function as exporters (Wilcox et al. 2002; Yazaki 2006). Three recent reviews summarize our present understanding of how ATP hydrolysis drives substrate transport across membranes (Hollenstein et al. 2007; Lee et al. 2007; Rea 2007).

Eukaryotic ATP binding modules also have been utilized in other cellular processes, including the regulation of ion transport (Dean and Annilo 2005), chromosome condensation and DNA repair (Cobbe and Heck 2003; Hirano 2002; Hopfner and Tainer 2003), mRNA processing (Belfield et al. 1995; Zhao et al. 2004), and RNAi (Sundaram et al. 2006). In the eukaryotic kingdoms, the NBD regions are sufficiently conserved to enable newly identified transporters to be classified into a series of subfamilies on the basis of NBD sequence and number of transmembrane (TM) domains. Half transporters are defined as having a (TM6-NBD) forward orientation or (NBD-TM6) reverse orientation (Biemans-Oldehinkel et al. 2006). Full transporters have domain organizations of (TM6-NBD)2 or (NBD-TM6)2. The minimal functional unit of ABC transporters has two NBDs (Tusnady et al. 2006). However, the yeast full transporter Pdr5p was found to reconstitute into lipid bilayers as a dimer, suggesting that the functional unit of this transporter utilizes four NBDs (Ferreira-Pereira et al. 2003). Complete and quasi-complete inventories of eukaryotic ABC transporters were first reported for yeast and humans, respectively (Bauer et al. 1999; Klein et al. 1999). Comprehensive inventories have since been compiled for A. thaliana (Rea 2007; Sanchez-Fernandez et al. 2001), Dictyostelium (Anjard et al. 2002), Oryza sativa (Garcia et al. 2004; Jasinski et al. 2003), and Caenorhabditis elegans (Sheps et al. 2004). Eight major families have been defined in eukaryotes, from ABCA to ABCH, based on gene organization and conserved sequences within the NBDs (Anjard et al. 2002; Klein et al. 1999). Some of these families have both full and half transporters, while three families (ABCE, ABCF, and ABCH) do not have associated transmembrane domains. An additional subfamily classified as part of the ABC superfamily is the Structural Maintenance of Chromosomes (SMC) family (Cobbe and Heck 2003).
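
The domain organizations described above can be summarized in a small lookup. The following Python sketch is illustrative only and is not part of the analysis reported here; the function name and the labels it returns are assumptions, and "TM" stands for one block of six transmembrane helices (TM6).

```python
def classify_transporter(domains):
    """Classify an ordered list of domain labels, e.g. ["TM", "NBD"].

    "TM" denotes one TM6 block and "NBD" one nucleotide binding domain.
    The returned labels are illustrative, not a formal nomenclature.
    """
    pattern = "-".join(domains)
    organizations = {
        "TM-NBD": "half transporter, forward orientation",
        "NBD-TM": "half transporter, reverse orientation",
        "TM-NBD-TM-NBD": "full transporter, forward orientation",
        "NBD-TM-NBD-TM": "full transporter, reverse orientation",
        "NBD-NBD": "soluble protein with two NBDs and no TM domains",
    }
    # Any other arrangement is left unclassified rather than guessed.
    return organizations.get(pattern, "unclassified")
```

For example, classify_transporter(["NBD", "TM"]) returns the reverse-orientation half transporter label, while an arrangement that is not in the table is reported as unclassified.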

The conserved features of NBDs have made the task of assigning ABC transporters to particular subfamilies in newly sequenced genomes a relatively straightforward process using standard bioinformatic tools. However, defining the substrate capabilities of individual transporters has proven to be more challenging. In yeast, Pdr5p and Snq2p share 58% similarity, and both transport a wide range of chemically distinct compounds (Kolaczkowski et al. 1998), while close homologues such as Pdr12p and Pdr15p function in other roles (Hatzixanthis et al. 2003; Wolfger et al. 2004). In animals, proteins involved in drug resistance are associated with members of subfamilies B, C, and G (Callaghan et al. 2006; Haimeur et al. 2004). The identification of orthologues has become an important resource, as orthologues are thought to retain similar functions over evolutionary time. However, amino acid sequence analysis of ABC transporters in the genomes of H. sapiens, D. melanogaster, C. elegans, and S. cerevisiae revealed that there were surprisingly few orthologous groups that could be identified (Sheps et al. 2004). The paucity of orthologues is due to the whole-genome duplication events and selective gene loss that have occurred in the yeast and vertebrate genomes (Dehal and Boore 2005; Wolfe and Shields 1997). Furthermore, functional analysis of even mammalian orthologues to human MRP1 has shown that they have significantly different transport capabilities (Haimeur et al. 2004). Thus for ABC transporters, sequence similarity is not always a reliable indicator of function.

Phytophthora species comprise a genus of plant pathogens with a huge economic impact on a wide range of agricultural and forest species (Erwin and Ribeiro 1996). One stage of their life history includes a fungal-like hyphal growth habit, but these pathogens are part of the kingdom Stramenopila and are now grouped in the Chromalveolata (Simpson and Roger 2004). Members of this group are thought to share a common photosynthetic ancestor, evidence for which was found in the genome sequences of Phytophthora sojae and Phytophthora ramorum (Tyler et al. 2006). P. sojae is primarily a pathogen of soybeans and lupins (Erwin and Ribeiro 1996). P. ramorum has a wide host range, causing disease in at least 38 plant species from 12 families (Rizzo et al. 2005). The devastation of the oak forests on the coastal range of California was first noted only a few years ago, and similar outbreaks have also been reported in Europe (Rizzo et al. 2005). Draft sequences from a whole-genome shotgun approach identified 19,027 predicted genes in P. sojae and 15,743 predicted genes in P. ramorum (Tyler et al. 2006). The two species are closely related; the average sequence similarity between predicted orthologues was 91% (Stokstad 2006), and initial surveys identified 9768 putative orthologues (Tyler et al. 2006). The Phytophthora genome contains 855 genes with a distinctive phototrophic origin, providing strong support for a photosynthetic ancestry of stramenopiles that is consistent with the chromalveolate hypothesis (Reyes-Prieto et al. 2008; Simpson and Roger 2004). Thus the migration of genes from the red algal symbiont to the nucleus of the stramenopile ancestor appears to have played an important role in the diversity of this genome. Here, we have used a comparative genomics approach to determine whether such an analysis would be informative about the ancestry of ABC proteins in Phytophthora species.

Materials and Methods

Identification of ABC Transporter Genes

The DOE-JGI genome browser for the genomes of P. sojae and P. ramorum was used to identify and retrieve sequences (Tyler et al. 2006). Updates to the original draft sequences of both species are now located at the Virginia Bioinformatics Institute Microbial Database (http://annuminas.vbi.vt.edu/) (Tripathy et al. 2006). An important additional resource in the annotation of ABC genes was the P. sojae EST database (Qutob et al. 2000; Torto-Alalibo et al. 2006). A keyword search using “ABC transporter” generated hits to 195 and 206 gene models in the P. ramorum and P. sojae genomes, respectively. Manual annotation of the draft sequences suggested that many gene models contained excess introns, and introns not needed to avoid a stop codon were removed. Contigs containing ABC domains were extended where possible to obtain full-length sequences.

The BLAST tool (Altschul et al. 1990) on the DOE browser http://genome.jgi-psf.org/euk_cur1.html was used to match truncated models on smaller scaffolds with complete models. Complete models of one family were used in a BLAST search of the opposite genome to identify all potential homologues. Representative members of ABC transporters from other organisms including prokaryotes were also used in a BLAST analysis of the P. sojae and P. ramorum genomes to identify additional putative matches to the ABC family and to facilitate assignment to particular subfamilies. The P. ramorum and P. sojae genomes are organized into gene islands where the genes are tightly clustered (Tyler et al. 2006), and thus some of the predicted ABC gene models contained additional domains that were likely to be separate proteins. Unusual motifs were only included as part of an ABC gene when they were within an exon containing a portion of an ABC transporter. The full-length sequences of all genes were used in a BLAST analysis against the assembled genomes of P. ramorum and P. sojae to determine the scaffold locations for each gene model. Predicted P. sojae genes were also subjected to BLAST analysis against the P. sojae EST database to determine whether predicted gene models were expressed. Visual inspection of BLAST hits was used to determine whether the genes matched ESTs in free-living (zoospore and mycelial) or infection libraries. Resequencing of the P. sojae genome to close sequence gaps within gene models and extend scaffolds is presently under way (B. M. Tyler, personal communication).

Highlighting Conserved Sequences Within NBD Families

Amino acid sequences immediately downstream of the Walker B site of the NBD fold have been observed to be strongly conserved within gene families (Smart and Fleming 1996). ABC family members were aligned using Clustal X (Thompson et al. 1997). The Walker B site (HHHHD) was identified by manual examination of the aligned sequences. The aspartate residue and the next 10 or more conserved residues in the alignment were retrieved to serve as identifiers for ABC subfamilies.
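
The retrieval of subfamily identifiers described above can be sketched in Python. This is a simplified illustration, not the procedure used here: the Walker B site was located by manual examination of Clustal X alignments, whereas the sketch scans an ungapped protein sequence with a regular expression. The set of residues treated as hydrophobic (H), the function name, and the default length of 10 residues are assumptions.

```python
import re

# Assumption: residues counted as hydrophobic ("H") in the HHHHD pattern.
HYDROPHOBIC = "AVLIMFWCY"
# Four hydrophobic residues followed by the conserved aspartate.
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}D")


def d_loop_signatures(protein, length=10):
    """Return the Walker B aspartate plus the next `length` residues.

    One string is returned per non-overlapping HHHHD match, so a full
    transporter with two NBDs would normally yield two signatures.
    Matches near the C terminus may be shorter than `length` + 1.
    """
    signatures = []
    for match in WALKER_B.finditer(protein):
        aspartate = match.end() - 1  # index of the conserved D
        signatures.append(protein[aspartate:aspartate + 1 + length])
    return signatures
```

Because a pattern as short as HHHHD can also occur by chance outside the NBD, any hit from a scan of this kind would still need to be checked against the alignment.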

Sequence Alignments and Phylogenetic Analyses

Members of subfamilies of two diatom genomes (Phaeodactylum tricornutum and Thalassiosira pseudonana) were identified using BLAST searches on the DOE-JGI server (http://genome.jgi-psf.org/euk_cur1.html) using P. sojae and P. ramorum proteins as bait (Armbrust et al. 2004; Tyler et al. 2006). ABC proteins in the Fusarium graminearum and Magnaporthe grisea genomes were first identified using a keyword search for ABC transporters using the Web servers of the Broad Institute (http://www.broad.mit.edu/annotation/fgi/). Sequences from each subfamily were then used in a BLASTp search to retrieve all proteins with significant matches. The most poorly matched sequence was then used in an additional BLAST search, and this step was repeated until no new ABC proteins were found. At a later stage of the analysis, representative members of the different ABC subfamilies from each fungal genome were used in a query search of the other fungal genome to identify all potential orthologues. Scaffold locations for gene models were determined by Blastn analysis of P. ramorum models against the V1 assembly of P. ramorum, and, whenever possible, Blastn analysis of P. sojae models using the V3 assembly of the P. sojae genome (http://annuminas.vbi.vt.edu/).
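
The repeated search described above amounts to iterating until the set of hits stops growing. The Python sketch below illustrates that loop under stated assumptions: `search` is a hypothetical callable standing in for a BLASTp query (it is not a real BLAST interface) that returns significant hits ordered from best to worst match, and the next query is taken to be the most poorly matched sequence among the newly found hits.

```python
def iterative_search(seed, search):
    """Collect hits by repeated searching until no new sequence appears.

    `seed` is the identifier of the first query. `search` is a
    hypothetical stand-in for a BLASTp run: it takes one identifier and
    returns significant hits ordered from best to worst match.
    """
    found = {seed}
    query = seed
    while query is not None:
        hits = search(query)
        new_hits = [hit for hit in hits if hit not in found]
        found.update(new_hits)
        # Continue from the most poorly matched new sequence, if any.
        query = new_hits[-1] if new_hits else None
    return found
```

The loop always terminates on a finite database, because every round either adds at least one new identifier or stops.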

Full-length sequences were aligned with Clustal X (Thompson et al. 1997) and manually inspected. Models with introns that spanned regions in the NBDs that were conserved in all other models were excluded as pseudogenes unless there was EST support for them. ABC sequences from the A. thaliana, H. sapiens, and S. cerevisiae genomes were obtained from GenBank along with selected ABC transporters from other genome databases and used in multiple alignments to determine the probable phylogenetic origin of oomycete proteins. Phylogenetic analysis was carried out using PAUP version 4.0 using both parsimony and distance analysis (neighbor joining; NJ) with 1000 bootstrap replicates. Predicted orthologues of P. sojae ABC proteins in A. thaliana, H. sapiens, and S. cerevisiae were also identified by implementing a local version of the Reciprocal Smallest Distance analysis (RSD) program (Wall et al. 2003). Full-length sequences of proteins were analyzed using InterProScan (http://www.ebi.ac.uk/InterProScan/) to identify novel accessory domains that are sometimes present in ABC transporters (Biemans-Oldehinkel et al. 2006). Predotar was used to identify proteins with localization signals to the mitochondria and endoplasmic reticulum (Small et al. 2004). Data for these models are available at the VBI Microbial database (http://annuminas.vbi.vt.edu/).

Results and Discussion

Here we present a classification of all complete models of the ABC superfamily in the genomes of P. ramorum and P. sojae. Manual annotation of the ABC superfamily of gene models from the draft sequences of P. ramorum and P. sojae identified 135 and 136 transporters, respectively (Supplementary Table S1). In both genomes, the majority (74%) of models were defined by having one single open reading frame, while an additional 14% of the models had only one intron. Models with multiple introns were often found to have deletions in the NBDs. Since these regions both are highly conserved across kingdoms and enable the binding and hydrolysis of ATP, such models were classified as pseudogenes and were excluded from this analysis. Gene IDs are roughly correlated with scaffold position (Supplementary Table S1). Examination of family trees revealed many examples where closely related models arose as tandem duplication events. The high level of synteny between the genomes (Tyler et al. 2006) facilitated the task of identifying both complete models and orthologues of ABC proteins in the two sequenced genomes. The total number of ABC proteins is comparable to that of the A. thaliana genome and greater than the number of these proteins in fungal pathogens such as Fusarium graminearum and Magnaporthe grisea as well as the human, yeast, and Dictyostelium genomes (Table 1). Of the published eukaryotic genomes, only those of poplar (Tuskan et al. 2006) and the human parasite Trichomonas vaginalis (Carlton et al. 2007) have been reported to have larger numbers of ABC proteins. The oomycete genomes include representatives of each ABC subfamily found in other eukaryotes, with the exception of the ABCA half transporters and members of the ABCH subfamily. In addition, both of the oomycete genomes contain homologues of bacterial ABC proteins involved in DNA repair and iron transport. 
Phylogenetic analysis of ABC subfamilies using both parsimony and NJ resulted in similar trees with comparable levels of support, so only NJ trees are shown.

Table 1 Numbers of genes in the ABC families of selected eukaryotic genomes

The RSD algorithm relies on a global alignment of best BLAST hits and a pairwise maximum likelihood estimation of evolutionary distances to identify orthologous proteins in distant genomes (Wall et al. 2003). To identify oomycete ABC transporters with orthologues in other kingdoms, P. sojae ABC transporters were used in RSD analysis of the A. thaliana, H. sapiens, and S. cerevisiae genomes. This analysis identified 20 P. sojae ABC proteins with orthologues in one or more of these genomes (Table 2). Transporters that are orthologous to those in other kingdoms may be involved in similar processes and thus also serve to identify networks of interacting proteins. For example, Ps133911 is orthologous to proteins in all three genomes. While the orthologue in humans (ABCG2) plays a role in drug resistance (Cervenak et al. 2006), the orthologous yeast protein ADP1 was identified as a component of the yeast secretome pathway (Schuldiner et al. 2005).
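
The reciprocal step of this approach can be illustrated with a short Python sketch. It is not the RSD program of Wall et al. (2003): the alignment and maximum likelihood distance estimation are assumed to have been done already, and the function name and the nested-dictionary input format are assumptions made for illustration.

```python
def reciprocal_smallest_distance(dist_ab, dist_ba):
    """Pair proteins that are each other's closest match.

    dist_ab[a][b] is the estimated evolutionary distance between query
    a (genome A) and hit b (genome B); dist_ba holds the reverse
    searches. Distances are assumed to be precomputed.
    """
    orthologues = {}
    for a, hits in dist_ab.items():
        if not hits:
            continue
        closest_b = min(hits, key=hits.get)
        reverse_hits = dist_ba.get(closest_b, {})
        # Keep the pair only if the reverse search leads back to a.
        if reverse_hits and min(reverse_hits, key=reverse_hits.get) == a:
            orthologues[a] = closest_b
    return orthologues
```

A query whose closest hit is itself closer to a different protein in the query genome is left unpaired, which is one way a protein can end up without a predicted orthologue.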

Table 2 List of predicted orthologues identified by reciprocal smallest distance analysis

Since gene prediction programs often result in truncated proteins, we found it useful to identify conserved features within each NBD for each subfamily. The conserved aspartate residue of the Walker B sequence and the sequences immediately downstream which comprise the ‘D’ loop (Ambudkar et al. 2006) were observed to be strongly conserved within each subfamily (Table 3). In full transporters, the two NBDs each contain a unique signature sequence. This signature sequence made it possible to distinguish easily between two closely positioned half transporters and a full transporter in the same family. Some families contained particularly divergent sequences in both the TM regions and the typically conserved regions of the NBDs. These genes have diverged markedly from other known eukaryotic ABC transporters and possibly, in a few cases, have a distinct phylogenetic origin.

Table 3 Conserved sequence determinants in the “D” loop of ABC domains

ABCA Family

P. ramorum and P. sojae have 10 and 11 full-size transporters, respectively. All of the P. sojae transporters are orthologous to P. ramorum transporters. The diatom genomes each contain a single full-size transporter that is not closely related to the oomycete transporters (Table 1 and Supplementary Fig. S1). In the human genome, 8 of the 12 members of this family are associated with trafficking of lipophilic compounds, but the specific role(s) of the majority of these proteins has yet to be determined (Pohl et al. 2005). Expression has been found in free-living or infection libraries for 9 of the 11 P. sojae transporters, but only 2 (Ps137961 and Ps143446) are uniquely expressed in infection libraries. This may indicate that the primary role of these transporters is related to adaptation to environmental changes in the free-living state.

ABCB Half Family

In eukaryotes, ABCB half transporters fulfill a variety of roles, and individual proteins are localized to the plasma membrane, mitochondria, lysosomes, or endoplasmic reticulum (Abele and Tampé 1999; Allikmets et al. 1999; Graf et al. 2004; Kispal et al. 1999; Kobayashi et al. 2004; Kushnir et al. 2001). P. ramorum and P. sojae contain seven and five half transporters, respectively (Table 1). Ps128790 and Pr81934 cluster away from other transporters and form a clade with bacterial transporters (Fig. 1).

Fig. 1

Phylogenetic analysis of half transporters of the ABCB family. Bootstrap values are shown for unrooted trees. Scale bar indicates the relative length of each branch. Gene ID numbers are preceded by a species prefix. An asterisk indicates 100% bootstrap support

Models Pr72289 and Ps108370 cluster with A. thaliana, diatom, and human mitochondrial ABC transporters (Fig. 1). Predotar analysis (Small et al. 2004) showed that both oomycete transporters contain predicted mitochondrial transit peptides. Characterized transporters in this group play a key role in the folding of proteins containing Fe/S prosthetic groups (Kispal et al. 1999). Mutations of these proteins in humans and A. thaliana can result in a plethora of disorders related to iron homeostasis or loss of functional mitochondrial and cytosolic Fe/S proteins (Allikmets et al. 1999; Kushnir et al. 2001). A third group of Phytophthora sequences clusters with A. thaliana and human genes that have yet to be characterized. Localization studies of the human transporters show that ABCB2 and ABCB3 are localized to the endoplasmic reticulum, while ABCB9 is localized to the lysosome, and ABCB10 is localized to the mitochondria (Dean 2002). Of the nine diatom ABCB half transporters in T. pseudonana, only two have orthologues to oomycete genes (Fig. 1).

ABCB (MDR) Family

P. ramorum and P. sojae have seven and nine MDR transporters, respectively (Table 1 and Fig. 2). MDR transporters such as Pgp play a major role in drug resistance of animal cells (Callaghan et al. 2006). Plant MDR transporters are thought to play a major role in the export of phytoalexins (Yazaki 2006). The majority of these plant MDR transporters are localized to the plasma membrane and function as efflux transporters, but some MDR transporters function as importers (Yazaki 2006). Functional analysis of an MDR transporter from Coptis japonica shows that it is involved in the active uptake of berberine from xylem vessels (Sakai et al. 2002). In plant genomes, MDR-type transporters form three independent clades (Fig. 2). Intriguingly, the two plant genes that have been functionally characterized as importers, CjMDR1 (Shitan et al. 2003) and the auxin importer At2g47000 (Santelia et al. 2005; Terasaka et al. 2005), are in the same clade, while two plant genes that have been identified as auxin exporters (At2g36910 and At3g28360) form a separate cluster (Geisler and Murphy 2006). Thus sequence conservation appears to be more strongly associated with the direction of transport than with the substrates being mobilized. RSD analysis showed that Ps109245 is orthologous to the human gene Q6KG50 (no known function) and to the A. thaliana gene Q9SY12 (At4g01820), which cluster in the same clade as characterized plant importers (Table 2 and Fig. 2). Thus Ps109245 and related sequences (Pr91114 and Ps109290) may cluster away from other oomycete ABCB genes because they function as importers.

Fig. 2

Phylogenetic analysis of full transporters of the ABCB family (MDR-type transporters). Bootstrap values are shown for unrooted trees. Scale bar indicates the relative length of each branch. Gene ID numbers are preceded by a species prefix. An asterisk indicates 100% bootstrap support

Two sequences in each of the diatom genomes (Tp22526 and Tp22016, Pt37558 and Pt21548) have an unusual domain arrangement (TM6NBD-NBD TM6). The EST support for both genes in T. pseudonana (http://genome.jgi-psf.org/Thaps3/Thaps3.home.html) suggests that these are functional proteins.

ABCC (MRP) Family

In both plants and animals, members of the MRP family of transporters play a key role in the removal of toxins including metals, by extrusion from the cell or concentration in the vacuole (Deeley and Cole 2006; Klein et al. 2006; Sanchez-Fernandez et al. 2001). In Arabidopsis, At2g34660 is localized to the vacuole and exports glutathionated herbicides and anthocyanins from the cytoplasm (Lu et al. 1998). Proteins in this family are also associated with the transport and regulation of ion fluxes. In animals, the cystic fibrosis transmembrane regulator, ABCC7 (CFTR), functions as a chloride ion channel and also regulates K+ channels (Matsuo et al. 2003). Two other ABCC transporters, ABCC8 (SUR1) and ABCC9 (SUR2), function as regulators of K+ channels (Dean and Annilo 2005). A feature of some members of the MRP family in plants, animals, and Dictyostelium is a hydrophobic N-terminal region of about 230 amino acids (Tusnady et al. 2006), but in oomycetes this extra domain of transmembrane helices was found only in Pr94491 and Ps128854.

In the oomycetes, the MRP family is a particularly diverse family, with 22 and 20 members in P. ramorum and P. sojae, respectively, that comprise seven clades (Supplementary Fig. S2). None of the oomycete proteins are orthologous to other eukaryotic ABCC transporters. In fact, four clusters showed significant divergence from typically conserved amino acid positions for this family, including the D loop (Table 3). The most divergent pair among members of this family is Pr96983 and Ps140011.

Three genes were included in the ABCC family, although they were not complete models (Table 1). Ps156344 has eight introns and deletions of conserved residues in the NBDs, but was included since there is EST support for this gene. Gene models Pr85272a and Pr85272b span gaps in the assembled genome sequence surrounding the most conserved regions of the gene models.

ABCD Family

Members of the ABCD family in animals, Dictyostelium, fungi, and oomycetes are half transporters of the format (TM)6NBD and are localized to the peroxisomal membrane (Theodoulou et al. 2006). Mutational analysis of the two yeast transporters suggests that they function as heterodimers (Shani and Valle 1996) and are involved in the import of long-chain fatty acids into the peroxisome so that these substrates can be utilized by β-oxidation (Verleur et al. 1997). In contrast, both the rice and the A. thaliana genomes contain full-size ABCD transporters that are distinct from those of other sequences (Supplementary Fig. S3). A second group of plant transporters, At1g54350 and OsAK064992, contains a plastid localization sequence and forms a separate clade with three sequences from T. pseudonana and two sequences from the red alga Cyanidioschyzon merolae. P. ramorum and P. sojae each contain two members of this family that share closest homology to the human gene ABCD3.

ABCE Family

With the exception of some plants, eukaryotes contain only one ABCE sequence (Anjard et al. 2002; Tuskan et al. 2006), and this is also the case in these oomycete genomes. Members of this family have two NBDs but no TM domains. Sequence conservation of ABCE proteins among eukaryotic organisms from different kingdoms is much higher than observed in other families. For example, the oomycete sequences share 83% and 81% sequence homology with proteins from the A. thaliana and human genomes, respectively. This high level of conservation suggests that, at least in this family, sequence homology may also be indicative of similar functions. Members of this family have been classified as RNase L inhibitors (Bisbal et al. 1995), but homologues of RNase L appear to be restricted to mammals (Zhao et al. 2004). Inhibition of the C. elegans ABCE homologue by RNAi is lethal, and a combination of other functional assays suggests that this protein may play a role in the control of transcription and translation (Zhao et al. 2004).

ABCF Family

The ABCF family is another subclass of ABC proteins with no transmembrane domains but two homologous NBDs. Phylogenetic analysis of the oomycete models, along with homologues from A. thaliana, humans, and yeast, identified seven pairs of orthologous genes in the oomycetes, an increase over the three genes in the vertebrate genomes and five genes in A. thaliana (Table 1 and Supplementary Fig. S4). The function of a few of these proteins has been characterized in other species; thus the oomycete orthologues likely perform similar roles (Dean and Annilo 2005). Gene models Ps121330 and Pr40048 are orthologous to HsABCF1, which interacts with eukaryotic initiation factor 2 (Table 2) (Dean 2002). Ps108178 and Pr71795 are orthologous to the S. cerevisiae protein ARB1, which functions in several steps related to the assembly of 40S and 60S ribosomes, including shuttling from the nucleus to the cytoplasm (Dong et al. 2005).

ABCG Half Family

The ABCG family of transporters includes both full-length (PDR) and half (white) ABC transporters in what is called reverse orientation, with the NBD preceding the transmembrane spanning domains. There are 20 and 22 half transporters, respectively, in P. ramorum and P. sojae (Table 1). Of the eight diatom transporters in this subfamily, only three are associated with oomycete clades (Supplementary Fig. S5). None of the oomycete transporters in this family cluster with ABCG transporters from other eukaryotes (not shown). The NBD of half transporters is most homologous to the C-terminal NBD of full-length ABCG transporters, and with notable exceptions (see below) half transporters are more similar to each other than to the C-terminal regions of PDR transporters.

In addition to gene duplication events, the conversion of full transporters to half transporters represents another means of producing novel transporters. A comparison of the alignments of P. sojae and P. ramorum contigs indicates that the introgression of a predicted phospholipase gene (Ps135746) at the midpoint of the genome sequence for a PDR transporter has converted the C-terminal half of the protein into Ps135745, a gene model with EST support. Structurally, this gene model is now an ABCG-type transporter. The N-terminal portion of the ancestral PDR gene is now likely a pseudogene (Ps135747) due to mutations resulting in the loss of TM domains. Additional examples for which there is presently no EST support include Ps135756, which shares 90% amino acid identity with the C-terminal half of the PDR transporter Pr79857, and Ps131627 and Ps131086, which share 53% and 95% identity, respectively, with the first half of the PDR transporters Ps127268 and Ps131094.

The full-length transporter Ps131094 may be another example where two half transporters have been produced by splitting of a gene. This gene has a 191-base-pair intron that effectively splits both halves of the gene, and both halves have EST support. An alternative splicing arrangement could potentially result in the production of two half transporters.

ABCG (PDR) Family

Full-length transporters in the ABCG family are also referred to as pleiotropic drug resistance (PDR) transporters, with the yeast proteins Pdr5p and Snq2p probably being the best characterized (Decottignies and Goffeau 1997). While the A. thaliana genome contains 15 members, and the rice genome contains 23 members (Crouzet et al. 2006), this family has 49 members in both P. ramorum and P. sojae (Supplementary Fig. S6). Due to the large number of family members, the phylogenetic analysis reported here included only sequences from P. ramorum and P. sojae. Members of this family are more similar to plant PDRs than to those from fungal genomes. However, a phylogenetic analysis of P. sojae, diatom, and Arabidopsis PDRs indicated that all of the P. sojae PDRs clustered away from those of A. thaliana and the two diatom genomes (not shown).

Closely related genes in the oomycete PDR family appear to have evolved from sequential duplication events and are often clustered without interruption along the scaffold (Table 1). Phylogenetic analysis of the PDRs in P. ramorum and P. sojae suggests that expansion of the family occurred from several independent clades in the ancestral progenitor (Supplementary Fig. S6). Orthology between the two species has largely been maintained, and there are no examples of selective expansion of a clade in P. ramorum that might account for the larger host range of this species.

The PDR family also includes two very large genes, Pr74351 (2734 aa) and Ps133919 (3198 aa), that span a single open reading frame. The additional amino acids in these sequences are in the N-terminal region prior to the first NBD. Unlike all other PDR transporters, these genes exhibit divergence in the typically conserved sequences of the two NBDs. These sequences are also the most divergent orthologous pairs that we characterized, sharing only 72% identity across the portion of the protein defining the typical domain structure of PDRs and 52% identity in the novel N-terminal regions of these proteins. Hidden Markov analysis using the sequence analysis tools of InterProScan (http://www.ebi.ac.uk/Tools/InterProScan/) shows that the N-terminal cytoplasmic region has similarity to AcrB, the multidrug efflux protein of E. coli (Yu et al. 2003), and that the first TM domains in both proteins contain a lipocalin-related protein IIPR00056 motif.

The P. sojae gene PDR1, the only member of this family to be characterized to date (Connolly et al. 2005), is one of three tandemly repeated genes (PDR1, Ps159444, and Ps143204) coding for the same core sequence of 1348 amino acids that are part of a contig containing six other closely related sequences: Ps143201, Ps143203, Ps145369, Ps144697, Ps144695, and Ps132439. BLAST analysis of P. sojae trace files (http://vmd.vbi.vt.edu/toolkit/) using the N- and C-terminal sequences of PsPDR1 identified 27 and 37 matching sequence reads, consistent with multiple copies of this gene in the assembled genome, which has an estimated ninefold coverage (Tyler et al. 2006). These sequences form a clade with eight P. ramorum sequences that are distinct from all other members of this family (Supplementary Fig. S6). A duplication event in the same clade has also occurred in the P. ramorum genome. The nucleotide sequences of P. ramorum models Pr84169 and Pr84173 are 99% identical, while sequences in the 5′UTR and promoter regions of the gene models are unique.

Analysis of EST data indicated that the PDR1 sequence was expressed at levels that were two- to threefold higher than those of several housekeeping genes and higher than those of any of the other ABC transporters (Connolly et al. 2005). PDR1 is expressed only by swimming zoospores, and heterologous expression of this gene in S. cerevisiae drug transporter mutants showed that it could transport at least five unrelated hydrophobic compounds (Connolly et al. 2005). Zoospores are carried along by surface ground water and swim at speeds equivalent to 15× their body length per second (Carile 1983). Thus elevated expression of a putative drug resistance transporter may confer a significant selective advantage for the organism in the soil environment.

SMC Family

The structural maintenance of chromosomes (SMC) family regulates the processes of chromosome condensation and sister chromatid cohesion. Members of the Rad50 family, which are involved in double-strand break repair, are also included in this family because they share the same basic architecture (Hopfner and Tainer 2003). Eukaryotes typically have six SMC proteins and one Rad50 member (Cobbe and Heck 2003; Hirano 2002), and orthologues of all of these are represented in P. ramorum and P. sojae (Supplementary Fig. S7).

Other

The Phytophthora species and diatom genomes each contain a soluble protein with a single ABC domain. Gene models Pr74109 and Ps108599 both share 38% identity and 60% similarity with the A. thaliana gene At5g02270, which encodes a similar-sized protein that is localized in the cytoplasm and upregulated in response to sucrose (Elena et al. 2006). However, the Phytophthora genes cluster independently from similar-sized ABC proteins from plants, fungi, and Dictyostelium (Supplementary Fig. S8). Thus, no reliable inferences can be made about the function of these oomycete genes.

Bacterial ABC Proteins

While ABC proteins are present in all three branches of living organisms, it has been noted previously that import-type ABC transporters that utilize periplasmic binding proteins are restricted to prokaryotes (Saurin et al. 1999). The protein models Pr86535 and Ps132365 share homology with bacterial iron binding proteins (Koster 2001). SignalP analysis (Bendtsen et al. 2004) predicts that both proteins contain signal peptides (probability, 0.856 and 0.989, respectively), consistent with their potential role as secreted proteins. In E. coli, periplasmic binding proteins bind iron hydroxamates and form a complex with FhuB and FhuC to enable the uptake of ferric iron (Coulton et al. 1987).

The oomycete and diatom genomes contain a homologue of the bacterial DNA photolyase (PhrA), a protein with two ATP-binding cassettes (Dorrell et al. 1995). The diatom genomes also contain components of several additional bacterial ABC transporters (Tp38095, Tp30400, Pt15102, Pt5405, Pt15267, Pt41351, and Pt50537) that are not present in the oomycete genomes.

Summary and Future Directions

The identification of orthologous genes is often an important first step in predicting gene function in newly sequenced genomes. The association of orthology with function is based on the hypothesis that orthologous genes have retained the function of the single gene present in a common ancestor. However, cross-species comparisons within the animal kingdom showed that the frequency of orthologous pairs of ABC transporters was lower than that predicted for genes involved in trafficking (Sheps et al. 2004). In our analyses, we also observed that relatively few membrane transporters of the Phytophthora ABC family clustered with those from other eukaryotes. Fortunately, the level of orthology is much higher between the oomycete genomes. Using both phylogenetic analysis and gene order arrangements on scaffolds of the two genomes, we identified 89 orthologue pairs of membrane transporters. With the release of other oomycete genomes, we will soon be in a position to learn whether this high level of orthology in the ABC superfamily extends to more distantly related oomycete genomes.

Heterologous expression of ABC transporters in yeast mutants can be a useful tool in assessing the transport capabilities of individual transporters (Klein et al. 2006). Thus far, data are available for only one ABC transporter in P. sojae (Connolly et al. 2005), but these observations did not provide any clues as to the endogenous substrates of this transporter. Another approach has been to identify transporters that are upregulated in response to added drugs or toxins (de Waard et al. 2006). In P. infestans, the expression patterns of 41 full transporters and 13 half transporters were monitored in isolates that varied 10- to 100-fold in their sensitivity to fungicides (Judelson and Senthil 2006). No correlation between elevated expression of ABC transporters and fungicide tolerance was noted. However, five genes in the PDR and MRP families were identified as being upregulated in response to one or more fungicides. Such expression studies may be helpful in generating hypotheses that lead to the functional characterization of specific transporters.

However, some of the recent work describing the activities of eukaryotic ABC transporters suggests that their functions may not be limited to transport. For example, AtMRP5 functions in guard cells as a regulator of different ion channels (Klein et al. 2006). Haf-6 was shown to be required for efficient RNAi in C. elegans, and eight additional ABC proteins appeared to affect the RNAi response (Sundaram et al. 2006). The function of ABC membrane transporters extends beyond the transport of metabolites (Bouchard et al. 2006; Geisler and Murphy 2006). Protein-protein interactions will be key to an understanding of the multiple roles of ABC proteins. Yeast-based two-hybrid techniques that have been developed to identify proteins that associate with membrane proteins (Bürkle et al. 2005; Wolfger et al. 2004) may be especially useful in identifying proteins that regulate these transporters. Other interactors that could be identified in such screens include sterol binding proteins and protein toxins destined for export.