Introduction

Plant-parasitic nematodes are major parasites of most crops and cause considerable economic losses in agriculture (Müller 1999). Traditional management strategies, like crop rotation and host resistance, are often inefficient when compared to nematicidal agrochemicals, which in turn are neither environmentally safe nor nematode-specific. Hence, there is a strong demand for the development of novel control strategies. Characterizing parasites molecularly and unraveling the mechanisms these animals use to infect plants can aid in the development of new control strategies. Molecular studies have long been hampered by the species' obligate parasitic life cycle. However, research is now beginning to benefit from new methods for gene identification. Among the most powerful and cost-effective approaches for rapid gene discovery is the single-pass sequencing of randomly sampled cDNA clones or expressed sequence tags (ESTs). Different high-throughput EST projects have now brought the total number of publicly available sequences from parasitic nematodes to more than 300,000 (McCarter et al. 2003; Wylie et al. 2004). Organizing and analyzing these data by using standard similarity-based searches can greatly contribute both to basic understanding of nematode biology and applied research toward new means of nematode control (McCarter et al. 2003; Parkinson et al. 2004).

The beet cyst nematode Heterodera schachtii is a sedentary endoparasitic nematode, and its life cycle consists of two phases: a soil stage and a plant stage. During the soil stage, the preparasitic juveniles (J2s) hatch from eggs and locate the roots of a suitable host. Following entry into the root, a single plant cell is selected and modified into a complex feeding site partly by cell-wall degradation and by fusion of neighboring protoplasts. At this stage, the nematode loses its ability to migrate through the plant and exploits the created feeding structure to withdraw nutrients during the remainder of its life cycle. It is unknown how the nematode manipulates the plant, but there is evidence that secretory proteins, produced by the nematode's subventral or dorsal pharyngeal glands and presumably injected into the plant cell, have an important role in this process (Vanholme et al. 2004). The identification of these proteins will aid in unraveling the parasitic strategy of this animal. Proteomic studies led to the purification and separation of H. schachtii secreted proteins by two-dimensional (2D) gel electrophoresis (De Meutter et al. 2001), but their ultimate identification was limited by a lack of sequence information. As recently as 2003, only four sequences from H. schachtii were available in the public databases [two endoglucanases (GenBank AJ299386 and GenBank AJ299387), a vacuolar adenosinetriphosphatase (ATPase) subunit (GenBank AJ249961), and a hypothetical secreted protein (GenBank AJ297919)]. We undertook an EST approach and increased the amount of available sequence data 300-fold. The sequences originate from preparasitic J2s and consequently represent genes expressed at the onset of parasitism.

Although the data set was created to support the identification of proteins based on mass spectrometry tags obtained from purified secretions, we also used it to identify putative secreted proteins directly by in silico analysis. This was facilitated by the presence of an amino terminal signal peptide (SP), which precedes the mature protein and is the trigger for export of many secretory proteins. SPs are on average 24 amino acids (aa) long but differ greatly in size (11–80 aa). Although their primary sequence is extremely variable, three short regions with distinct physicochemical properties can be distinguished: an amino terminal positively charged region, a central hydrophobic core, and a more polar carboxy terminal region preceding the signal peptidase cleavage site (Ladunga 2000). Using bioinformatic tools, it is possible to predict the presence of such a motif at the N-terminal end of precursor proteins, and different EST projects have used such predictions (Harcus et al. 2004; Torto et al. 2003). We developed a computer program to facilitate screening for putative SPs in open reading frames (ORFs) deduced from ESTs. As a result, we identified 50 putative secreted proteins, which could serve as a starting point for further experimental analyses.

Materials and methods

Collection and sterilization of nematodes

Heterodera schachtii was propagated on Brassica oleracea cv. Ramosa, grown in white sand supplemented with 4 g l−1 slow-release fertilizer (Osmocote, Scotts Europe BV, the Netherlands). Seeds were germinated in Arabaskets (Betatech, Ghent, Belgium), and 3 weeks after germination, the plants were transferred to a 2-l clay pot containing sand with cysts of a previous infection. The plants were grown in a growth chamber with a 16-h day/8-h night regime at 22°C for 10–20 weeks. The cysts were collected using sieves (1 mm² to retain organic material and 0.01 mm² to retain the cysts) and incubated in 3 mM ZnCl2 to stimulate hatching of the J2s. The J2s were purified from debris by a 5-min centrifugation at 1,000×g on a 35% sucrose step gradient followed by two washing steps in synthetic pore water (SPW; Smant et al. 1997), supplemented with 0.005% sodium dodecyl sulfate (SDS), and sterilized with 50 μg ml−1 chlorhexidine and 0.5 mg ml−1 cetrimonium bromide (30 min) followed by two washes with an excess of sterile SPW. After sterilization, motile nematodes were selected by letting them move through three layers of Miracloth (Calbiochem, La Jolla, CA, USA) and placed in a 100-μm sieve in a Petri dish with root exudate from cabbage. The cabbage root exudate was prepared by giving cabbage plants an excess of water and collecting the flow-through, which was filter-sterilized and stored at 4°C until further use. Live nematodes were collected, washed in SPW, and concentrated by centrifugation (10 min at 1,000×g).

RNA extraction

Approximately 10⁶ freshly hatched J2s were ground in liquid nitrogen. Total RNA was isolated using LiCl extraction (Sambrook et al. 1989) and precipitated in ethanol. The RNA pellet was washed in 75% ethanol and resuspended in dimethyl pyrocarbonate-treated water. The quality of the RNA was checked by gel and spectrophotometric analysis.

Library construction, EST sequencing, and submission

cDNA was made by combining Dynabead (Dynal Biotech, Oslo, Norway) oligo-dT priming with a modified protocol from the SMART cDNA library construction system (BD Biosciences, Palo Alto, CA, USA) (Mitreva et al. 2004a). Sequencing and processing prior to submission were performed as described by Mitreva et al. (2004b) and included cleaning the sequences using SeqClean (TIGR Gene Indices Software Tools; http://www.tigr.org/tdb/tgi/software) to remove contaminating vector sequences, poly(A) tails, and short sequences. All ESTs have been submitted to the Expressed Sequence Tags database (dbEST; http://www.ncbi.nlm.nih.gov/dbEST). Information for clone request and sequence trace files is available at http://www.nematode.net.

Assembly and in silico analysis

Processed ESTs were assembled into multimember clusters of overlapping sequences using the TIGR gene indices clustering tool (TIGR Gene Indices Software Tools; Pertea et al. 2003) delivering CAP3 assemblies (Huang and Madan 1999) and singleton ESTs (without similarity to other ESTs). Additional analyses were done using customized Perl-based scripts and programs. One of them, called SPIT (Secreted Proteins Identified from Tags; Fig. 1), was developed to predict putative signal sequences for secretion on clustered EST data in conjunction with SignalP (http://www.cbs.dtu/services/Signal) (Bendtsen et al. 2004; Nielsen and Krogh 1998). SPIT translates the consensus sequence in six different reading frames and looks for the longest ORF preferentially starting with a methionine. The retained sequences (passing the threshold based on their relative length) are fed to SignalP. This program combines two signal prediction methods, SignalP-NN (based on neural networks) and SignalP-HMM (based on hidden Markov models). A signal sequence was considered present when it was predicted both by the artificial NN and HMM. Finally, the membrane topology prediction program TMHMM (Krogh et al. 2001) was used to exclude proteins with putative transmembrane (TM) regions. Identification of sequence similarity was performed using basic local alignment search tool (BLAST) analyses (matrix=BLOSUM62) against the nonredundant and dbEST databases (Altschul et al. 1990).
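The ORF-selection step of SPIT is described only in outline, so the following is a minimal sketch of one possible reading of it, not the published Perl code. It translates a consensus sequence in six frames with the standard genetic code, prefers the longest ORF that begins with a methionine, and otherwise falls back to the longest stop-free stretch so that 5′-truncated tags can still be considered. The function names and the absolute min_len cutoff of 50 aa are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(seq):
    """Yield the six conceptual translations of a nucleotide sequence."""
    seq = seq.upper()
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    for strand in (seq, reverse_complement):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases are translated as "X".
            yield "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(seq, min_len=50):
    """Return the preferred ORF as a peptide string, or None if too short.

    Assumed preference rule: the longest Met-initiated ORF wins if it
    reaches min_len; otherwise the longest stop-free stretch is used.
    """
    with_met, without_met = "", ""
    for protein in six_frame_translations(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(with_met):
                with_met = segment[start:]
            if len(segment) > len(without_met):
                without_met = segment
    if len(with_met) >= min_len:
        return with_met
    if len(without_met) >= min_len:
        return without_met
    return None
```

Peptides returned by such a routine would then be passed to SignalP and TMHMM; parsing the output of those programs is outside the scope of this sketch.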

Fig. 1 Semiautomated processing pipeline for the identification of secreted proteins based on EST clusters and singletons. The number of ESTs retained after each step is listed. Details are described in the “Materials and methods” section

Results

The sequencing of 2,662 tags from a root-extract stimulated second-stage juvenile H. schachtii cDNA library resulted in the submission of 1,285,177 nucleotides to dbEST between June and September 2003. General properties of the sequenced tags are presented in Table 1. Figure 2 shows the distribution of the relative number of ESTs in relation to their length and illustrates that the majority of the ESTs are longer than 500 nt. A consequence of the random sampling of cDNA clones for sequencing is a general redundancy of the data set because certain genes will be sampled multiple times. Such tags usually correlate with highly abundant transcripts in an organism under study at the point of sampling (Audic and Claverie 1997). For example, 67 ESTs had significant homology to a gene coding for the fatty acid-binding protein FAR-1 (also called SEC-2; GenBank Y09293), reflecting the abundant expression of this gene in J2s (Prior et al. 2001; Vanholme et al. 2002). To compress the size and increase the quality of the data set, an in silico nonredundant database was derived by clustering the entries. From a total of 2,628 high-quality sequences, 1,824 were assembled in 408 multimember clusters, varying in size from 2 ESTs (454 cases) to 67 ESTs (2 cases) (Fig. 3). The redundancy of the library, as estimated by the number of reads that were assembled into clusters, was approximately 68%, and highly redundant sequences (occurring >10 times) made up approximately 27% of all successful reads (Table 2). Clustering increased the overall length from 482 nt for an individual tag to 589 nt for a contig. Since it is not possible to determine whether each cluster represents a single gene or a mixture of transcripts from closely related genes, a cluster was defined as a group of related sequences. Adding the 804 singletons, the data set represents an estimated 1,212 genes of H. schachtii, which is approximately 6.4% of the total gene repertoire based on the assumption that H. schachtii contains a similar number of genes as Caenorhabditis elegans (∼19,000; The C. elegans Sequencing Consortium 1998).
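The gene estimate follows from simple arithmetic on the clustering output. As a minimal sketch that uses only counts quoted in this section (the variable names are illustrative and not part of the original analysis):

```python
# Counts quoted in the text; variable names are illustrative.
high_quality_reads = 2628
clustered_reads = 1824
multimember_clusters = 408
singletons = 804
c_elegans_genes = 19_000  # assumption in the text: same gene count as C. elegans

# Every high-quality read is either in a multimember cluster or a singleton.
assert clustered_reads + singletons == high_quality_reads

# Each cluster and each singleton is counted as one putative gene.
estimated_genes = multimember_clusters + singletons
fraction_of_repertoire = estimated_genes / c_elegans_genes

print(estimated_genes)                  # 1212
print(f"{fraction_of_repertoire:.1%}")  # 6.4%
```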

Table 1 Properties of the sequences obtained from the H. schachtii library

Fig. 2 Size distribution of the ESTs obtained from the H. schachtii cDNA library

Fig. 3 Histogram showing the distribution of 2,633 ESTs by cluster size. For example, there were four clusters of 25 ESTs each, thus representing 100 ESTs

Table 2 Summary of the 20 most abundantly represented transcripts in the EST data set obtained from the cDNA library

Although a considerable number of nematode transcripts carry a highly conserved 22-nucleotide trans-spliced leader (SL) sequence [use of SL1 by transcripts is estimated at 70% in C. elegans (Blumenthal and Steward 1997) and 60% in Globodera rostochiensis (Dautova et al. 2001)], only one of the 2,662 ESTs contains such a sequence (gi|32325592; GenBank CD750667). In a similar study performed on ESTs of Meloidogyne incognita, an SL was found in 33 clusters obtained from 5,700 ESTs (McCarter et al. 2003). In a less stringent analysis, a 5′-truncated form of the SL was found in 37 of 1,234 ESTs obtained from Nippostrongylus brasiliensis (Harcus et al. 2004). Although higher than our observation, this is still far below the expected 60–70%. The presence of truncated cDNAs or lower-quality sequence at read ends can partly explain this discrepancy. Alternatively, the absence of SL sequences may be the result of the experimental approach undertaken. Most projects aiming to clone putative parasitism genes (e.g., Dautova et al. 2001; Gao et al. 2003; Huang et al. 2003; this work) are based on the SMART technology, in which the 5′-end of the mRNA is extended with an oligonucleotide during first-strand synthesis. The altered 5′-cap of the SL sequence, consisting of a trimethylguanosine instead of a monomethylguanosine (Thomas et al. 1988), can be responsible for a negative selection against SL-containing mRNA sequences. Indeed, several reports mention that the altered cap structure can be responsible for altered biochemical properties (Fernandez et al. 2002; Lall et al. 2004). With the number of identified and characterized putative parasitism genes increasing, it has been noted that most lack an SL sequence, indicating a possible negative correlation between SL and parasitism genes (Gao et al. 2003). Nevertheless, controversy will remain concerning this relation as long as the number of parasitism sequences characterized is limited.

The algorithm SPIT (Fig. 1), developed for automated identification of putative secreted proteins, was applied to the complete H. schachtii EST data set. Out of 1,212 sequences, 898 were retained (314 contigs and 584 singletons) based on the criterion that they contained an ORF encoding a peptide of at least 50 aa. The distribution of their length (Fig. 4) illustrates the benefits of the clustering process. Contigs generally have longer ORFs than singletons; 19.1% (60/314) and 0.8% (5/584) of the ORFs obtained from contigs and singletons, respectively, code for proteins of more than 200 aa. The fidelity of in silico translation was checked by comparing the aa composition of the theoretically constructed proteins to the aa frequency of other organisms (see supplementary information). The calculated aa composition of H. schachtii was more similar to that of eukaryotes than to that of prokaryotes. In addition, the codon preference was more like that of the related species Heterodera glycines than that of C. elegans [Codon Usage Database (http://www.kazusa.or.jp/codon)]. This indicates that the nucleotide sequences (cluster consensus and singletons) were likely to be correctly translated. Moreover, 61.5% (552/898) of the ORFs could be aligned with the proteome of C. elegans (at E<10e−3).
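The composition check can be illustrated with a short sketch. The text does not state how the compositions were compared, so the Euclidean distance used here is an assumption, and the peptides are toy inputs rather than real H. schachtii or reference data.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(proteins):
    """Return the relative frequency of each of the 20 standard amino acids."""
    counts = Counter()
    for protein in proteins:
        counts.update(aa for aa in protein.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def composition_distance(freq_a, freq_b):
    """Euclidean distance between two frequency profiles (assumed metric)."""
    return sqrt(sum((freq_a[aa] - freq_b[aa]) ** 2 for aa in AMINO_ACIDS))


# Toy example: find which hypothetical reference profile is closer to the query.
query = aa_frequencies(["MKLLA", "MKLLG"])
references = {
    "reference_1": aa_frequencies(["MKLLAG"]),
    "reference_2": aa_frequencies(["WWYYCC"]),
}
closest = min(references, key=lambda name: composition_distance(query, references[name]))
```

In practice the query profile would be built from all deduced ORFs and the reference profiles from published proteome-wide frequencies.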

Fig. 4 Histogram showing the distribution of clusters and singletons by the size of their ORF (in amino acids)

An analysis with SignalP revealed the presence of an SP for secretion in 31 contigs and 34 singletons. Fifteen of the sequences were eliminated based on the presence of a putative TM region as predicted by TMHMM (http://www.cbs.dtu.dk/services/TMHMM), indicating that they are integral membrane proteins rather than secreted proteins. Hence, only 5.6% (50/898) of the predicted proteins are putatively secreted (Table 3), lower than the 10% observed in some proteomes (Ladunga 2000). Several factors could have decreased this number. First, the accurate prediction of secretory proteins from partial sequences relies on the presence of the 5′-end of the transcript and the localization of the correct translation initiation site. For example, SPIT was unable to retain a single clone out of the 67 ESTs (CL1contig2) encoding the secretory protein Hs-FAR-1. Alignment subsequently showed that these clones were all truncated at their 5′-end. Another example is EST gi|32324822 (GenBank CD750279), which is similar to Mi-CRT1 (GenBank AF402771, E=1e−41), a gene of M. incognita coding for a calreticulin. Mi-CRT1 is expressed in the subventral glands of the nematode (Jaubert et al. 2002), and the corresponding protein is injected in the feeding site during the parasitic interaction (Jaubert et al. 2005). This EST also lacks the 5′-end. Second, to decrease the number of false positives, we used rather stringent selection criteria and retained only those sequences that scored above the cutoff value for both the NN and HMM models. Finally, some proteins lack a typical hydrophobic SP but contain internal signal sequences, which allow them to be secreted by the normal secretory pathway or by other types of secretion independent of the classical secretory pathway (Bendtsen et al. 2004; Nickel 2003). To circumvent these drawbacks, Mitreva et al. (2004c) predicted secretory and TM candidates from Trichinella ESTs based on homology to other putative secreted proteins rather than finding the SP directly in the EST sequences.
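The selection rule described here (an SP predicted by both SignalP methods, and no putative TM region) can be sketched as a simple filter. The record type and field names below are hypothetical; parsing of actual SignalP and TMHMM output is omitted, and treating any predicted TM region as grounds for exclusion is an assumed reading of the text.

```python
from dataclasses import dataclass


@dataclass
class OrfPrediction:
    """Hypothetical per-ORF summary of SignalP and TMHMM results."""
    name: str
    signalp_nn: bool   # signal peptide called by SignalP-NN
    signalp_hmm: bool  # signal peptide called by SignalP-HMM
    tm_regions: int    # putative transmembrane regions called by TMHMM


def has_signal_peptide(pred: OrfPrediction) -> bool:
    # Stringent rule from the text: both SignalP methods must agree.
    return pred.signalp_nn and pred.signalp_hmm


def is_putative_secreted(pred: OrfPrediction) -> bool:
    # Assumed reading of the TMHMM step: any putative TM region excludes the ORF.
    return has_signal_peptide(pred) and pred.tm_regions == 0


# Toy records (not real predictions) covering the three outcomes.
examples = [
    OrfPrediction("orf_a", True, True, 0),   # kept
    OrfPrediction("orf_b", True, False, 0),  # rejected: methods disagree
    OrfPrediction("orf_c", True, True, 2),   # rejected: putative membrane protein
]
kept = [p.name for p in examples if is_putative_secreted(p)]

# Counts reported in the text.
with_sp = 31 + 34        # contigs + singletons with a predicted SP
secreted = with_sp - 15  # minus sequences with a putative TM region
share = secreted / 898   # fraction of the 898 retained ORFs
```

With the reported counts, this reproduces the 50 putative secreted proteins and the 5.6% share given above.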

Table 3 Putative secreted proteins retained by SPIT from contigs and singletons of the cDNA library

Discussion

By constructing a cDNA library from preparasitic nematodes, we ensured that the obtained transcripts encode genes expressed during the initial stage of parasitism such as plant penetration and migration. In addition, to enrich the library for putative parasitism genes, the nematodes were stimulated with root exudates to simulate the plant environment. This treatment results in the accumulation of secretory granules and transcriptional activity in the pharyngeal glands of the cyst nematodes (Blair et al. 1999; Perry et al. 1989; Smant et al. 1997). It would be interesting to extend the EST approach toward early parasitic stages when the nematode induces the feeding site. However, the use of these stages is hampered by their inaccessibility, as they are enclosed inside the plant root. Although it is feasible to dissect the nematodes manually from the root, it is difficult to do this when large quantities are needed to construct a representative cDNA library (de Boer et al. 2002a).

Organization, interpretation, and characterization of the generated sequences require the creation and implementation of a bioinformatic approach that can be applied uniformly across the data set. For this purpose, a computer program (SPIT) was written in Perl to predict secreted proteins from the data set. A similar approach (PexFinder) was undertaken previously to identify secreted proteins from an EST library of the plant pathogen Phytophthora infestans (Torto et al. 2003). However, our in silico pipeline was made uniform by combining different steps in a single script and expanding the algorithm to facilitate additional analyses. One of the advantages of the program is the ability to retain sequences missing the start codon. For example, contig CL376Contig1 [a homolog of a gene coding for a putative gland protein of H. glycines (GenBank AF500031)] was retained, although it was lacking a start codon. Due to the presence of a putative TM domain in this protein, the sequence was not included in Table 3. Clearly, both analyzing partial sequences and gene sets and applying different approaches to screen the same type of data may inflate or deflate the apparent number of SP-containing proteins encoded in a species' genome. For example, an alternative tag-filtering approach performed on ESTs obtained from the nematode N. brasiliensis revealed an SP in 17.6% (176/997) of the retained ORFs (Harcus et al. 2004). However, the sequences were not further analyzed for putative TM regions, and one would expect that a considerable number of the retained sequences will be trapped in the cellular membrane and cannot be considered as secreted.

Out of the 50 putative secreted proteins, 23 (46%) were novel (Table 3). The fact that many putative parasitism genes are novel sequences (Gao et al. 2003) makes them intriguing candidates for further characterization. Our semiautomated pipeline was validated by the identification of ESTs coding for parasitism factors sharing homology with previously identified secreted proteins from other plant-parasitic nematodes. Therefore, we are confident that our findings hold equally true for the “pioneer” sequences. The function of the putative homologues is indicative of a role in plant parasitism, so we will briefly discuss some of the interesting candidates.

Cell-wall-modifying proteins

The polypeptide deduced from the EST gi|33140501 (GenBank CF101434) shows similarities to a cellulose-binding protein (CBP) of H. glycines (Hg-CBP-1; GenPept AAR01198; E=1e−9) and, accordingly, was named Hs-CBP. Normally, these sequences are domains of larger proteins such as endo- or exoglucanases. In such cases, the cellulose-binding domain (CBD) is linked with a catalytic domain and regulates the interaction of the enzyme with the substrate cellulose. Remarkably, the transcript does not encode any other domain. Although it is possible that our information is deduced from a truncated cDNA, this is unlikely. The presence of several in- and out-of-frame stop codons at the 3′-end of the sequence indicates that the EST possibly contains the complete ORF coding for a protein of 139 aa. In addition, similar proteins were detected in other plant-parasitic nematodes [e.g., H. glycines (GenPept AAN32887) and M. incognita (GenPept AAC05133)], and the sequence is more similar to these sequences than to the CBD of nematode endoglucanases. Previous analyses (Ding et al. 1998; Gao et al. 2004) indicate that the CBPs are secreted by plant-parasitic nematodes and may have a role in parasitism. The specific function of this protein is not yet known, but recombinant CBD was found to modulate the elongation of different plant cells in vitro (Shpigel et al. 1998). If indeed involved in cell expansion, Hs-CBP has a similar function as expansin, which was recently detected in secretions of cyst nematodes (Gr-EXPB1; GenPept CAC83611) (Qin et al. 2004). This functional similarity is accompanied by sequence similarity (32% identity between Gr-EXPB1 and Hs-CBP at the protein level).

Three contigs (CL126contig1, CL221contig1, and CL279contig1) showed homology to endoglucanases. The sequence of CL221contig1 is similar to Hg-ENG5 (GenBank AY336935; E=5e−46), an endoglucanase gene of H. glycines. Hg-ENG5 differs significantly in its catalytic domain compared to the other cloned endoglucanases of H. schachtii [Hs-ENG1 (GenPept CAC12958) and Hs-ENG2 (GenPept CAC12959)] (De Meutter et al. 1999). In contrast to these endoglucanases, Hg-ENG5 possesses activity toward xylan (Gao et al. 2004), which is a major component of hemicellulose of the plant cell wall. CL279contig1 showed only weak similarity to previously cloned endoglucanases (Hs-ENG1; E=3e−6), and the homology was restricted to the CBD of Hs-ENG1. As expected, based on the similarity with the CBD, homology searches revealed some degree of similarity with Gr-EXPB1 (GenPept CAC83611; E=9e−3). Although the protein might function as a CBP, the protein is more similar to a CBD than to the previously described Hs-CBP (EST gi|33140501; GenBank CF101434). The isolation of a full-length clone would indicate if this contig is coding for a fragment of an endoglucanase or is functioning as an independent CBP. If the EST indeed originates from an endoglucanase-encoding transcript, the enzyme will consist of a CBD extended with the catalytic domain. Such topology is not yet described for nematode endoglucanases, in which CBDs are normally present as C-terminal extensions of the catalytic domains.

Another contig (CL018contig1), comprising 20 ESTs, showed homology with pectate lyase. Remarkably, the deduced polypeptide sequence is more related to pectate lyases of bacteria [Streptomyces sp. (E=1e−40) and Bacillus sp. (E=3e−28)] or fungi [Magnaporthe grisea (E=3e−28)] than to pectate lyases of related cyst nematodes (Gr-PEL1; GenPept AAF80746; E=2e−16/Hg-PEL2; GenPept AAM74954; E=1e−14/Hg-PEC; GenPept AAK08974; E=8e−14) (de Boer et al. 2002b; Popeijus et al. 2000), indicating that the sequence codes for a pectate lyase distinct from the others described in cyst nematodes. Similarity to bacterial genes has been found for other genes coding for cell-wall-degrading enzymes in nematodes and has led to the speculation of horizontal gene transfer between bacteria and nematodes (Smant et al. 1998).

Finally, one of the retained ESTs (gi|32324875; GenBank CD750306) showed similarity to arabinogalactan endo-1,4-β-galactosidases (Xanthomonas campestris; GenPept AAM42894; 62% BLASTX identity). Those enzymes hydrolyze the 1,4-β-galactosidic linkage in arabinogalactans of the plant cell wall and have not yet been identified in animals. Although additional experiments are needed to confirm the endogenous origin of this gene, a database search revealed an EST of H. glycines (gi|15770814; GenBank BI749012) coding for a similar enzyme (74% BLASTN identity). This finding decreases the possibility of contamination because it is unlikely that a similar contaminating organism was copurified with both H. schachtii and H. glycines samples. Moreover, detecting a novel cell-wall-degrading enzyme supports the finding that plant-parasitic nematodes secrete a complex cocktail of cell-wall-degrading enzymes to tackle the different components of the highly resistant cell wall and “soften” the plant tissue, which facilitates their migration. The large number of ESTs coding for cell-wall-modifying proteins mirrors the known high expression level of these genes in other Tylenchids. Currently, these are the best studied parasitism factors from plant-parasitic nematodes.

Other parasitism-related genes

Two ESTs forming CL323contig1 encode an ORF of 184 aa, which showed 95% identity to the N-terminal part of H. glycines chitinase (Hg-CHI-1; GenPept AAN14978) (Gao et al. 2002). Chitin, the substrate of this enzyme, is an unbranched polysaccharide polymer consisting of β-1,4-linked N-acetyl-d-glucosamine and forms an integral part of the nematode eggshell. A chitinase-derived EST was also found in a cDNA library from preparasitic M. incognita (GenBank BE239118) (Dautova et al. 2001). These tags could be residual from hatching, during which large amounts of chitinase are produced. However, Schwekendiek et al. (1999) showed the expression of chitinases during different parasitic stages of H. glycines, indicating that chitinase may be expressed after hatching. Moreover, Gao et al. (2002) analyzed a chitinase of H. glycines in more detail and showed that the transcript of this gene accumulated specifically in the subventral gland cells of parasitic stages but could not be detected in eggs or freshly hatched J2s. These observations indicate that this enzyme could have other functions besides the cleavage of chitin during hatching. Although a role in moulting is considered as a possibility, the expression in the pharyngeal glands points toward a role in the parasitic interaction.

Two ESTs, gi|33140127 (GenBank CF101060) and gi|33140147 (GenBank CF101080), showed similarity to genes coding for venom allergen-like proteins. Members of this class are similar to the allergen 5 (AG-5) family of extracellular proteins isolated from hymenopteran venom (Fang et al. 1988). Although the function of these proteins is unknown, it is suggested that they play a central role in the transition of the nematode from free living toward parasitic stages (Hawdon et al. 1996). A sequence analysis of the deduced polypeptides confirmed that the transcripts are derived from different genes sharing only 45% identity. gi|33140147 was more related to Hg-VAP-1 (H. glycines; GenBank AF374388; 84% identical), whereas gi|33140127 showed a higher similarity with Hg-VAP-2 (H. glycines; GenBank AY033601; 97% identical). In H. glycines, homologous genes are expressed in the subventral glands of preparasitic and parasitic stages (Gao et al. 2001). Their developmental expression in conjunction with the presence of a putative signal for secretion suggests that these proteins may have a role in plant–host interaction.

Expressed sequence tag gi|33139816 (GenBank CF100749) showed homology to Gr-SXP-1 (GenBank AJ271910; 82% identical), a gene expressed in the epidermis of G. rostochiensis and coding for a secreted protein (Jones et al. 2000). Gr-SXP-1 belongs to the family of SXP/RAL-2 proteins, which have no known function but are nematode-specific, making them suitable as targets for control. SXP proteins are also candidate antigens for vaccination against animal-parasitic nematodes (Wang et al. 1997). In plant-parasitic nematodes, only two other members of this family were described: Gr-AMS-1 (GenPept CAB66341) (Jones et al. 2000) expressed in the amphids of G. rostochiensis, and Mi-SXP-1 (GenPept AAR35032) (Tytgat et al. 2005) expressed in the subventral glands of M. incognita.

CL320contig1 showed similarity to genes coding for cathepsin L-like cysteine proteinase. This protein is secreted in the intestine of parasitic nematodes and is most likely involved in digestion (Lilley et al. 1996) and indirectly involved in parasitism (Jasmer et al. 2003). Strategies toward developing nematode-resistant plants based on the expression of cysteine proteinase inhibitors are being pursued (Lilley et al. 2004; Urwin et al. 1997).

Conclusion

The analysis presented demonstrates that single-pass sequencing of random clones, although generating incomplete sequences, is a powerful tool for rapid identification of genes that may mediate important aspects in the plant–nematode interaction. By screening the data set to select tags coding for secreted proteins, interesting candidate parasitism genes were selected, which warrant further characterization. Furthermore, these preliminary findings confirmed that a proper implementation of developed bioinformatic tools can help prioritize targets for potential nematode control.