Introduction

Arthropod herbivore–plant interactions include feeding strategies, such as chewing and piercing–sucking, used by the arthropod to enter the plant tissues and access the plants’ nutritional contents. Within each strategy there are operational differences which may be crucial for determining the defense mechanisms utilized by the plant once it is fed upon. Our understanding of the plant–mirid interaction, which entails an insect’s use of the macerate and flush feeding strategy with extraoral digestion, is much less well developed than is our understanding of how other piercing–sucking insects (phloem or individual cell feeders) interact with plants.

The most important agricultural pest species belonging to the subfamily Mirinae (Hemiptera: Heteroptera: Miridae) are Lygus lineolaris Palisot de Beauvois and L. hesperus Knight, commonly known as plant or lygus bugs (Schaefer and Panizzi 2000). The plant symptoms resulting from lygus bug feeding include organ abscission, deformation of developing fruits, necrosis at the feeding site, seeds with aborted embryos, and reduced vegetative growth (Strong 1970). Plants also respond to lygus bug feeding biochemically. Gossypium hirsutum L. cv. Delta Pineland 5415 plants infested with L. hesperus release volatile organic compounds (VOCs) qualitatively similar to those induced by Spodoptera exigua larvae feeding on cotton (Rodriguez-Saona et al. 2002). However, the specific herbivore-associated molecular patterns (HAMPs) in L. hesperus saliva eliciting the release of VOCs in planta are not known.

Interactions between plants and arthropod herbivores have features similar to those described for plant–pathogen interactions, particularly as the substantial plant cell wall is a significant barrier protecting the plant cellular contents. Fungal pathogens secrete a variety of plant cell wall-degrading enzymes (CWDEs) into the host’s tissues to facilitate exploration of the plant tissue and to draw food from it (Esser and Lemke 1994; Annis and Goodwin 1997). Polygalacturonases (PGs) in this CWDE arsenal are known to be virulence factors for pathogenic fungi (ten Have et al. 1998; D’Ovidio et al. 2004a) and bacteria (Roper et al. 2007), and their enzymatic products, pectin-derived oligosaccharides (PDOs), are considered to be microbe-associated molecular patterns (MAMPs) (Ebel and Mithöfer 1998). Schaefer and Panizzi (2000) indicate that the macerate-and-flush feeding strategy of mirids means that “the bugs inject salivary pectinase into the plant tissue to macerate the cells” and conclude that pectinase is probably an important and fundamental enzyme in the Miridae family. Strong and Kruitwagen (1968) identified PG activity in salivary glands of the western tarnished plant bug (WTPB), L. hesperus, and concluded that this pectinase was a major factor in the crop damage caused by this species’ feeding (Strong 1970). Shackel et al. (2005) supported Strong’s conclusion by observing flower abortion after a partially purified WTPB PG was introduced into alfalfa florets and cotton flowers by micro-injection. Essentially no damage was observed in controls injected with buffer or bovine serum albumin. Nevertheless, the study did not address whether the hydrolytic activity of PG was required to cause the damage observed. The complexity of the PGs in the salivary glands or guts of mirids has not been defined but, as reported for many plant pathogens (e.g., ten Have et al. 1998), the tarnished plant bug (TPB) L. lineolaris has a PG gene family (Allen 2007; Allen and Mertens 2008). Furthermore, L. hesperus salivary glands contain both exo- and endo-PG activity (Celorio-Mancera et al. 2009).

It is reasonable to apply our current knowledge of plant–pathogen interactions to the understanding of plant interactions with herbivores, because of the similarities already identified regarding the plant tissue maceration mechanisms (Mithöfer and Boland 2008 and refs. therein). Based on this premise, herein we provide molecular tools that should facilitate research on the plant–mirid interaction. We report that PG activity is required for flower bud damage, that the WTPB has a PG gene family containing at least two PG genes that are expressed in WTPB saliva, and that WTPB PGs are introduced into a diet on which WTPB is feeding.

Materials and methods

Insect supply

Lygus hesperus eggs were obtained from Prof. Wayne Bailey, Forage and Field Crops Entomologist at the University of Missouri, Columbia. Egg packs were placed in one-gallon (3.79 l) plastic containers with an organdy-covered window in the lid. Containers were kept on laboratory shelves at 26 ± 1°C with a photoperiod of 12:12 (L:D) h. After the eggs hatched (7–8 days), diet packs containing a suspension of L. hesperus diet (Bio-serv, Frenchtown, NJ; www.insectrearing.com), prepared following the manufacturer’s instructions, were placed on the mesh window on top of the container. A constant fresh supply of artificial diet was maintained until insects reached adulthood (within ca. 20 days). Specimens 3–5 days after reaching adulthood were used for salivary gland excision and extraction. Wild WTPBs were collected with entomological sweep nets at the alfalfa fields belonging to the Department of Animal Science at UC Davis, within an approx. 200 m radius from coordinates: N 38°32.038′, W 121°47.850′, elevation 34 ft (10 m). WTPB individuals were sorted and identified using a field key (Mueller et al. 2003).

Polygalacturonase assay

PG activity was tested using the radial diffusion assay, as described previously (Taylor and Secor 1988; Shackel et al. 2005). Fifteen microliters of the sample to test for PG activity is placed in wells cut in an agar sheet containing the enzyme substrate, polygalacturonic acid sodium salt (Sigma, St. Louis, MO). If the sample has PG activity, the enzyme digests the substrate while diffusing radially. After overnight incubation at 37°C, the agar sheet is stained with ruthenium red (0.02%) to detect uncleaved substrate. Non-stained zones represent the digested substrate; therefore, the larger the “cleared” zone, the greater the PG activity in the sample.
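The comparison of cleared zones can be made quantitative by converting each measured zone diameter into an area. The following is a minimal sketch only: the function name and the example diameters are hypothetical, and it assumes the zone is circular and that the well itself is not subtracted, neither of which is stated in the protocol.

```python
import math


def cleared_zone_area_mm2(diameter_mm: float) -> float:
    """Area (mm^2) of a circular cleared zone from its measured diameter (mm).

    Assumption: the zone is circular and the well area is not subtracted.
    """
    if diameter_mm < 0:
        raise ValueError("diameter must be non-negative")
    return math.pi * (diameter_mm / 2.0) ** 2


# HYPOTHETICAL diameters (mm); these are not measurements from the study.
example_diameters_mm = {"sample A": 22.6, "sample B": 12.0, "buffer control": 0.0}

zone_areas = {
    name: cleared_zone_area_mm2(d) for name, d in example_diameters_mm.items()
}

for name, area in zone_areas.items():
    print(f"{name}: {area:.1f} mm^2")
```

A larger area indicates greater PG activity in the sample, as described above; a well with no cleared zone gives an area of zero.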

Assessment of PG activity as damage causing factor

The test to determine whether the damage to alfalfa florets caused by micro-injection of WTPB PG (Shackel et al. 2005) was due to PG activity rather than a response to the PG protein structure per se utilized two forms of recombinant Aspergillus niger PG II: an active, wild-type protein and an inactive, mutant protein (D202N, with a single amino acid substitution; Armand et al. 2000). These fungal PGs were used as “model” PGs since they are readily available in sufficient quantities for the experiment and a comparable active/inactive pair of PGs from L. hesperus is not presently available. Previous work (Shackel et al. 2005) has shown that alfalfa floret abscission is promoted by micro-injection of PGs from L. hesperus, fungi and bacteria. Micro-injection of the active A. niger PG II and D202N in 50 mM sodium acetate buffer, pH 5.0, followed published protocols (Shackel et al. 2005) and used florets of the alfalfa selection UC-2705-177 (Medicago sativa L.). Protein concentrations used for injection were similar, 1.8 × 10⁻³ mg/ml and 1.96 × 10⁻³ mg/ml for the active and inactive proteins, respectively. On average, three florets from several inflorescences were injected with 6 ± 2 nl of each protein solution. Data were analyzed using the Chi-square test for categorical data, taking into consideration that our data fall into two classes. Using the percentage of non-aborted florets as a continuous variable, we also analyzed the data using ANOVA (GLM Procedure, The SAS System). Our null hypothesis for both analyses was that the effect of injecting inactive PG on alfalfa floret development would not differ statistically from the effect of injecting active PG.
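For two treatments and two outcome classes (aborted or not aborted), the Chi-square test reduces to a 2 × 2 contingency table with one degree of freedom. The sketch below illustrates that calculation under stated assumptions: the counts are hypothetical and chosen only for illustration (the counts behind Table 1 are not reproduced here), the function name is invented, and the Pearson statistic is computed without continuity correction, which may differ from the settings used in the original analysis.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# HYPOTHETICAL counts as (aborted, not aborted); not the study's data.
active_pg = (8, 4)      # 8 of 12 florets aborted
inactive_pg = (0, 12)   # no florets aborted

stat, p = chi_square_2x2(*active_pg, *inactive_pg)
print(f"chi-square = {stat:.2f}, p = {p:.4g}")
```

With counts this small the expected cell frequencies can fall below 5, in which case an exact test is often preferred; the sketch is meant only to show the form of the calculation.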

Preparation of salivary gland extract (SGE)

Dissection of the salivary gland apparatus from both wild and diet-reared insects was as described by Shackel et al. (2005). Salivary gland pairs obtained from wild insects (400) and diet-reared insects (200) were placed in a microcentrifuge tube with 1 ml and 0.5 ml of distilled sterile water (dd H2O), respectively, on ice. Samples were vortexed and centrifuged for 5 min at 16,000×g, in a microcentrifuge. The supernatant (SGE) and pellet were stored separately at −20°C.
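The two extracts were prepared at the same nominal ratio of gland pairs to water, which can be checked with simple arithmetic. The sketch below uses the gland counts and volumes given above and the 20 μl aliquot loaded for PAGE; the function name is hypothetical, and the calculation assumes no volume change during extraction.

```python
def gland_pairs_per_ul(pairs: int, volume_ml: float) -> float:
    """Nominal salivary gland pair equivalents per microliter of extract."""
    return pairs / (volume_ml * 1000.0)


wild_per_ul = gland_pairs_per_ul(400, 1.0)     # wild insects, 1 ml water
reared_per_ul = gland_pairs_per_ul(200, 0.5)   # diet-reared insects, 0.5 ml water

# Gland pair equivalents in a 20 ul aliquot of the diet-reared SGE.
aliquot_equivalents = reared_per_ul * 20.0

print(wild_per_ul, reared_per_ul, aliquot_equivalents)
```

Both preparations therefore correspond to the same nominal number of gland pairs per unit volume, so equal volumes of the two extracts are directly comparable on that basis.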

In-gel PG activity assay and N-terminal amino acid sequencing

Lanes of duplicate acrylamide gels were loaded with 20 μl aliquots of a SGE prepared from 200 pairs of diet-reared L. hesperus salivary glands and subjected to PAGE. One of these replicate gels was subjected to an in-gel PG activity assay to identify the locations of PG isoforms (Shackel et al. 2005). After the positions of active PG were documented photographically, the gel was stained with Biosafe Coomassie (Biorad, Hercules, CA) to visualize all the proteins. The protein band with PG activity was excised from the PAGE using a sterile blade along with a piece of the gel with no protein or PG activity (control) for the liquid chromatography tandem mass spectrometry (LC-MS/MS) peptide analysis at the Proteomics Facility, UC Davis. Aliquots of the SGE for the second gel were subjected to reduction and alkylation as described in the protocol recommended by the Molecular Structure Facility at UC Davis (http://msf.ucdavis.edu/protocols.html) in order to prepare the sample for N-terminal sequencing. Following electrophoresis, the gel was electroblotted onto a Sequi-PVDF membrane. The blot was stained with Biosafe Coomassie and its protein profile was compared with that of the gel in which PG activity patterns had been identified. A prominent protein band corresponding to the L. hesperus PG in the in-gel activity assay was identified in the Coomassie-stained blot and subjected to Edman degradation using an ABI 494-HT Procise Edman Sequencer at the Molecular Structure Facility.

Identification of WTPB PG genes

Total RNA isolation

Total WTPB salivary gland RNA was prepared from the pellet from the SGE from field-collected WTPB salivary gland pairs using the Invitrogen TRIzol® Plus RNA purification kit (Carlsbad, CA), according to the manufacturer’s instructions.

Reverse transcription-polymerase chain reaction (RT-PCR)

Reverse transcription was done using total WTPB salivary gland RNA. The first strand of cDNA was synthesized using Moloney Murine Leukemia Virus reverse transcriptase according to the supplier’s recommended conditions (Roche, Indianapolis, IN). This first strand cDNA was used as a template in PCR reactions using a pair of degenerate primers (LhPGF and LhPGR) and the gene specific primer pairs for Llpg1, Llpg2 and Llpg3 provided by Dr. Margaret Allen (degenerate and specific primer pair sequences are provided in ESM, LhPGprimers.pdf). PCR reactions (20 μl in final volume) contained 2 μl of the first strand cDNA reaction, 10 mM Tris–HCl (pH 8.3), 3.5 mM MgCl2, 0.2 mM of each dNTP, 40 pmol (degenerate) or 10 pmol (Llpg specific) primers, and 1 unit of DNA polymerase (TaKaRa ExTaq™, Japan). Thirty-four cycles of 45 s at 94°C, 15 s at 54°C, and 75 s at 72°C produced PCR products that were ligated to the pCR®4-TOPO® vector (Carlsbad, CA) using the Invitrogen TOPO TA Cloning® kit (Carlsbad, CA) for sequencing. Plasmid DNA purification was performed using the QIAprep Spin Qiagen Miniprep Kit. Clones were sequenced at the DNA Sequencing Facility, UCD.
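The reaction set-up above implies final primer concentrations and a total cycling time that are easy to derive. The following sketch works through that arithmetic; it assumes that the stated primer amounts are per primer, and it counts only the hold times of the 34 cycles (ramping, and any initial denaturation or final extension, are not stated and are excluded). Variable and function names are illustrative.

```python
# Values taken from the PCR description above.
CYCLES = 34
STEP_SECONDS = {"denature_94C": 45, "anneal_54C": 15, "extend_72C": 75}
REACTION_VOLUME_UL = 20.0


def primer_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Final primer concentration; pmol/ul is numerically equal to uM."""
    return amount_pmol / volume_ul


cycling_seconds = CYCLES * sum(STEP_SECONDS.values())
cycling_minutes = cycling_seconds / 60

# Assumption: 40 pmol and 10 pmol are the amounts of each primer.
degenerate_um = primer_concentration_um(40, REACTION_VOLUME_UL)
specific_um = primer_concentration_um(10, REACTION_VOLUME_UL)

print(f"hold time: {cycling_minutes} min")
print(f"degenerate primers: {degenerate_um} uM; specific primers: {specific_um} uM")
```

The fourfold higher amount of degenerate primer is consistent with the usual practice of compensating for the fact that only a fraction of the molecules in a degenerate pool match the template exactly.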

Rapid amplification of cDNA ends (RACE)

When translated into an amino acid sequence, the cDNA clone amplified by PCR using degenerate primers gave an open reading frame that lacked an initiating methionine and the stop codon; therefore, 3′ RACE (Ambion FirstChoice® RLM-RACE kit Austin, TX) and 5′ RACE (5′ RACE Invitrogen GeneRacer™ kit Carlsbad, CA) were performed according to the manufacturers’ instructions using as gene specific primers sequences a to d (ESM). Since the cDNA clone identified when using LlPG2F and LlPG2R specific primers lacked only the stop codon, 3′ RACE was performed (as above) using gene-specific primers g and h (ESM). PCR products were cloned into the pCR®4-TOPO® vector and sequenced, as described above.
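The decision of which RACE reactions are needed follows from whether a partial clone contains an initiating ATG and an in-frame stop codon. A minimal sketch of that check is given below. The function name and the two short sequences are hypothetical (they are not the cloned Lhpg sequences), and the sketch assumes the clone is read in frame from its first nucleotide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_status(cdna: str) -> dict:
    """Report start/stop codon presence for a clone read in frame from base 1."""
    seq = cdna.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    has_start = bool(codons) and codons[0] == "ATG"
    has_stop = bool(codons) and codons[-1] in STOP_CODONS
    internal = codons[:-1] if has_stop else codons
    continuous = not any(c in STOP_CODONS for c in internal)
    return {
        "has_start": has_start,
        "has_stop": has_stop,
        "continuous": continuous,
        "needs_5prime_race": not has_start,
        "needs_3prime_race": not has_stop,
    }


# HYPOTHETICAL fragments for illustration only.
internal_fragment = orf_status("GATGTTAATAACATTGAA")    # no ATG, no stop
missing_stop_only = orf_status("ATGGATGTTAATAACATT")    # ATG present, no stop

print(internal_fragment)
print(missing_stop_only)
```

The first case corresponds to a clone that requires both 5′ and 3′ RACE, and the second to a clone that requires only 3′ RACE, mirroring the two situations described above.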

Genomic DNA isolation

WTPB genomic DNA was isolated from 0.65 g (approx. 100 individuals) of adult females using the DNeasy® Tissue Kit (Qiagen, Valencia, CA) following the manufacturer’s instructions.

Full length cloning of WTPB gDNA and cDNA pg sequences

Primers were designed for the amplification of the putative full-length PG gene sequences obtained by the methods described above using both WTPB gDNA and cDNA. Primer pair e/f (ESM) was used for amplification of a full-length cDNA clone originally identified using the degenerate primer pair LhPGF/LhPGR. The primer pair i/j (ESM) was used for the amplification of a full length cDNA clone originally identified using Llpg2 gene-specific primers. The PCR program and conditions followed were as recommended in the manual for the use of the Advantage® 2 PCR Enzyme System (Clontech, Mountain View, CA). Cloning and sequencing of PCR amplicons were performed as described above.

Sequence alignments

Similarity of the nucleotide sequences obtained for each putative pg clone was determined using the “BLAST 2 sequences” (bl2seq) at the National Center for Biotechnology Information (NCBI) website (http://www.ncbi.nlm.nih.gov/BLAST). BLAST database searches were performed using the NCBI BLAST server. Phylogenetic analyses were performed with the multiple sequence alignment program for DNA or proteins, ClustalW2, European Bioinformatics Institute (EBI), (http://www.ebi.ac.uk/Tools/clustalw2/index.html). Nucleotide sequence translations were performed using the ExPASy Proteomics server, “Translate tool” (http://www.expasy.ch/tools/dna.html). Nucleotide sequences for the PG-encoding genes of L. hesperus were submitted to the GenBank database.

Identification of WTPB-PGs extruded into the diet

Collection diet sachet

The technique described by Habibi et al. (2001), developed to recover secreted saliva without causing damage to insect tissues, was used to collect WTPB proteins secreted into a diet during feeding. One-day-old adult L. hesperus specimens that had been reared on artificial diet were allowed to feed for 1 day on a 3% agarose-solidified “collection diet” containing 5% sucrose for the recovery of salivary constituents. The collection diet was provided in Parafilm®-covered pouches (collection diet sachets), each containing ca. 60 μl of diet. Twenty-four sachets were used for each one-gallon container of insects. After the insects had fed for 24 h, the collection diet was removed from the Parafilm and ground with a mortar and pestle, with dd H2O added to facilitate grinding. The resulting sachet extract (final volume, 8 ml) was stirred gently overnight at 4°C. The samples were centrifuged (5 min, 10,000×g), the supernatant was filtered using a 0.2 μm filter, and 2 ml aliquots were freeze-dried. Three volumes of cold 100% acetone were used to precipitate proteins from the concentrated sachet extracts. Precipitated proteins were collected by centrifugation (5 min, 16,000×g) and allowed to air dry. Collection diet that had never been exposed to insects was handled in the same way to provide a control protein extract. The precipitated protein samples were analyzed by LC-MS/MS. The nucleotide sequences for the PG-encoding genes of WTPB (see “Results”) and L. lineolaris PGs (Allen and Mertens 2008) were included in a database created for the evaluation of the LC-MS/MS results for WTPB protein samples from SGE PAGE and collection diets. Treatment and control collection diet samples were also assayed for PG activity using the radial diffusion assay before concentration.

Protein identification

The excised protein bands from PAGE and the total precipitated proteins from sachet extracts were used for mass spectrometry-based protein identification. Briefly, proteins were reduced and alkylated according to previously described procedures (Shevchenko et al. 1996), and digested with sequencing grade trypsin per manufacturer’s recommendations (Promega, Madison, WI). LC-MS/MS was performed using an Eksigent Nano LC 2-D system (Eksigent, Dublin, CA) coupled to an LTQ ion trap mass spectrometer (Thermo-Fisher, San Jose, CA) through a New Objective Picoview Nano-spray source. Peptides were loaded onto an Agilent nano trap (Zorbax 300SB-C18, Agilent Technologies) at a loading flow rate of 5 μl/min. Peptides were then eluted from the trap and separated by a nano-scale 75 μm × 15 cm New Objective picofrit column packed in-house with Michrom Magic C18 AQ packing material. Peptides were eluted using a 45 min gradient of 2–80% buffer B (Buffer A = 0.1% formic acid, Buffer B = 95% acetonitrile 0.1% formic acid). The top ten ions in each survey scan were subjected to automatic low energy CID. Protein identification also was performed using an Agilent LC MSD ion trap coupled to a chip cube interface (Agilent, Dover, DE). Peptides were loaded on the trap located on the Agilent nano-HPLC chip at a flow rate of 4 μl/min. Peptides were then eluted from the trap and separated on the 5 cm Zorbax C18 column, contained on the chip. Peptides were eluted using a 45 min gradient of 2–80% buffer B (as above). The top ten ions in each survey scan were subjected to automatic low energy CID.

Database searching

Tandem mass spectra were extracted by Mascot Distiller version 2.0. Charge state deconvolution and deisotoping were not performed. All MS/MS samples were analyzed using Sequest (ThermoFinnigan, San Jose, CA; version 27, rev. 12) and X! Tandem (http://www.thegpm.org; version 2006.04.01.2). X! Tandem was set up to search the TIGR Arabidopsis thaliana database plus the L. lineolaris PG protein sequences (LlPG1, LlPG2 and LlPG3, Allen and Mertens 2008) and the amino acid sequences predicted for the cloned Lhpg-gDNAs and Lhpg-cDNAs, assuming trypsin digestion to generate peptides (30716 entries). Sequest also was set up to search the same database, using the same sequences and assuming trypsin generation of peptides. Sequest and X! Tandem were searched with a fragment ion mass tolerance of 1.00 Da and a parent ion tolerance of 1.5 Da. Oxidation of methionine and the iodoacetamide derivative of cysteine were specified as variable modifications in Sequest and X! Tandem.
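Both search engines compare observed spectra with peptides generated in silico from the database under the assumption of trypsin digestion. The sketch below illustrates that step only; it is not the Sequest or X! Tandem implementation. It assumes the common rule of cleavage C-terminal to K or R except before P (the exact enzyme settings are not stated above), and the function names are hypothetical. The example input is the N-terminal sequence reported in the Results, and the 1.5 Da default is the parent ion tolerance given above.

```python
def trypsin_digest(protein: str, missed_cleavages: int = 0) -> list[str]:
    """In-silico tryptic peptides: cleave after K or R unless followed by P."""
    seq = protein.upper()
    pieces, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        pieces.append(seq[start:])
    # Join adjacent pieces to model up to `missed_cleavages` missed sites.
    peptides = []
    for n_missed in range(missed_cleavages + 1):
        for j in range(len(pieces) - n_missed):
            peptides.append("".join(pieces[j:j + n_missed + 1]))
    return peptides


def within_parent_tolerance(observed_da: float, theoretical_da: float,
                            tolerance_da: float = 1.5) -> bool:
    """True if the observed parent mass lies within the search tolerance."""
    return abs(observed_da - theoretical_da) <= tolerance_da


n_terminal = "VDVNNIEQLEAAKNAQDINI"
peptides = trypsin_digest(n_terminal)
peptides_one_missed = trypsin_digest(n_terminal, missed_cleavages=1)
print(peptides)
print(peptides_one_missed)
```

In a real search each theoretical peptide mass would be computed from residue masses, including the variable modifications listed above, before the tolerance comparison is applied.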

Criteria for protein identification

Scaffold (version Scaffold-01_07_00, Proteome Software Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm (Keller et al. 2002). Protein identifications were accepted if they could be established at greater than 90.0% probability and contained at least one identified peptide. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii et al. 2003). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.
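The acceptance criteria above amount to two probability thresholds and a minimum peptide count. The following sketch expresses them as a filter; it is an illustration of the stated thresholds rather than a reproduction of Scaffold, and the record type, function names, and example hits are all hypothetical.

```python
from dataclasses import dataclass

PEPTIDE_MIN_PROB = 0.95   # peptides accepted at greater than 95.0% probability
PROTEIN_MIN_PROB = 0.90   # proteins accepted at greater than 90.0% probability
MIN_PEPTIDES = 1          # at least one identified peptide


@dataclass
class ProteinHit:
    name: str
    protein_probability: float
    peptide_probabilities: list[float]


def accepted_peptides(hit: ProteinHit) -> list[float]:
    """Peptide probabilities that pass the peptide-level threshold."""
    return [p for p in hit.peptide_probabilities if p > PEPTIDE_MIN_PROB]


def is_accepted(hit: ProteinHit) -> bool:
    """Apply the protein-level threshold and the minimum peptide count."""
    return (hit.protein_probability > PROTEIN_MIN_PROB
            and len(accepted_peptides(hit)) >= MIN_PEPTIDES)


# HYPOTHETICAL hits for illustration; not results from the study.
hits = [
    ProteinHit("hypothetical PG A", 0.99, [0.99, 0.97, 0.60]),
    ProteinHit("hypothetical PG B", 0.92, [0.96]),
    ProteinHit("hypothetical protein C", 0.85, [0.98]),
    ProteinHit("hypothetical protein D", 0.95, [0.95, 0.90]),
]

accepted = [hit.name for hit in hits if is_accepted(hit)]
print(accepted)
```

Both thresholds are treated as strict inequalities, following the "greater than" wording of the criteria.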

Results

PG activity is a damage causing factor

In the field, WTPBs feed on small, green, unopened buds of alfalfa racemes. This feeding generally causes developmental arrest so that the buds turn straw colored and dry and often abscise or abort (Strong 1970). Micro-injection of a partially purified WTPB PG protein in the laboratory produces similar symptoms (Shackel et al. 2005). To establish that PG activity and not the PG protein per se is responsible for these symptoms, active and inactive PGs were micro-injected into alfalfa florets. Aborted florets were observed within 5 days in 67% of the florets injected with 6 ± 2 nl of solution containing 108 ± 36 ng of active A. niger PG II. No florets injected with 117 ± 39 ng of the inactive D202N A. niger PG II showed WTPB-like damage symptoms within the same time period (Table 1). The statistical analyses show significant differences between the two treatments (Table 1). The in vitro PG assay of active A. niger PG II produced a clear zone of 400.96 mm² (total of 2.7 × 10⁻⁴ mg protein in 15 μl) while a slightly greater amount of A. niger D202N mutant PG II protein (2.94 × 10⁻⁴ mg protein in 15 μl) produced no detectable clear zone.

Table 1 Summary of tests in which 6 ± 2 nl were injected into individual alfalfa florets

In-gel PG activity of WTPB SGE

The gel electrophoresis analysis of PG activity in the SGE from L. hesperus reared in the laboratory shows a prominent PG band (Fig. 1a) with a molecular mass between 35.5 kDa and 50.7 kDa, based on comparison to standards. Subsequent staining of the activity gel with Coomassie blue revealed the proteins in the SGE (Fig. 1b), with a prominent band (marked by an asterisk) coinciding with the band with PG activity. After the electrophoresis of three additional aliquots of the SGE preparation that had been reduced and alkylated prior to electrophoresis, the gel was blotted to nitrocellulose and Coomassie stained (one lane of the three replicates is shown, Fig. 1c). The protein profiles in all three replicate lanes were very similar; all contained the band corresponding to the prominent PG activity. The N-terminal sequence of the PG band was VDVNNIEQLEAAKNAQDINI obtained by Edman degradation of the blotted reduced and alkylated protein band (Fig. 1c).

Fig. 1

The PAGE gel was stained to reveal PG activity bands in WTPB SGE after protein separation (a); activity is indicated by lighter-gray regions in the gel (just one prominent band was observed and is denoted by a white asterisk). Protein profile of the same lane as in (a) after Coomassie staining (b). One lane of three replicates showing the WTPB SGE PAGE blot of the reduced and alkylated proteins (c). The N-terminal sequence of the protein marked * was obtained by Edman degradation. Protein molecular markers (Bio-Rad prestained SDS-PAGE low molecular weight standards, Hercules, CA) (d)

Isolation of gDNA and cDNA clones for WTPB polygalacturonase

To clone the L. hesperus PG sequences, a forward degenerate primer for PCR, LhPGF, was designed based on the amino acid sequence NIEQLEA, of the WTPB SGE PG N-terminal protein sequence. This LhPG N-terminal sequence was the most similar (90% amino acid identity) to the N-terminal sequence predicted for Llpg3 (Allen and Mertens 2008). A seven amino acid sequence (QDDCLAI) corresponding to an internal site in Llpg3 was selected to design the reverse degenerate primer LhPGR. This internal sequence was selected because it included a well-conserved aspartic acid residue-containing region critical for PG activity (Pickersgill et al. 1998; Armand et al. 2000). Both PCR and RT-PCR using these primers with L. hesperus gDNA and cDNA produced major bands of ~500 bp (data not shown, designated LhpLhgDNA and LhpLhcDNA, respectively). These PCR products were ligated to the pCR®4-TOPO® vector, and six clones for the gDNA amplicon and one for the cDNA amplicon were selected for sequencing. Two of the gDNA clones (LhpLhgDNA-5 and LhpLhgDNA-6) contained a continuous open reading frame and shared 88% amino acid identity. LhpLhgDNA-6 shared 98% amino acid identity and 100% similarity to the cDNA clone LhpLhcDNA-1, which also contained a continuous ORF. Amplification of the 5′- and 3′-ends of LhpLhcDNA-1 was performed by 5′ and 3′ RACE using gene specific primers, in combination with the inner and outer 5′ and 3′ RACE primers provided with the kit. The 1050-bp sequences of the five full-length gDNA and five cDNA amplicons obtained using primers e/f share only 55–59% nucleotide identity with the known LlPG genes; thus, we have classified them as belonging to a second Lhpg sequence, named Lhpg4, with only partial homology to the L. lineolaris PG genes. An additional sequence was obtained when the degenerate primer pair LhPGF/LhPGR was used with L. lineolaris cDNA. This sequence shares 73–75% identity with the Lhpg4 sequences and has been designated Llpg4 (data not shown). 
All ten Lhpg4 sequences (Lhpg4-1gDNA to Lhpg4-5gDNA and Lhpg4-1cDNA to Lhpg4-5cDNA) contain an open reading frame with an inferred 349 amino acid sequence. The GenBank accession numbers of sequences Lhpg4-1gDNA, Lhpg4-4gDNA and Lhpg4-2cDNA are EU431963, EU450666 and EU450667, respectively. These sequences are provided in L. hesperus PG sequences.pdf (ESM).
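Primers based on amino acid sequences such as NIEQLEA and QDDCLAI are degenerate because most residues are encoded by more than one codon. The sketch below estimates the number of distinct nucleotide sequences that could encode each heptapeptide under the standard genetic code. It is an illustration of the principle only: the function name is hypothetical, and the primers actually used (given in the ESM) may have been designed with reduced degeneracy.

```python
import math

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def back_translation_degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (full back-translation)."""
    return math.prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


forward_site = "NIEQLEA"   # from the SGE PG N-terminal sequence
reverse_site = "QDDCLAI"   # internal site in Llpg3

forward_degeneracy = back_translation_degeneracy(forward_site)
reverse_degeneracy = back_translation_degeneracy(reverse_site)

print(forward_site, len(forward_site) * 3, "nt", forward_degeneracy)
print(reverse_site, len(reverse_site) * 3, "nt", reverse_degeneracy)
```

In both sites the single leucine contributes the largest factor (six codons), which is why primer designers often avoid leucine-, serine-, and arginine-rich stretches or truncate the primer at the final wobble position.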

Both PCR and RT-PCR using the LlPG2F/LlPG2R primer pair with L. hesperus gDNA and cDNA produced bands of ~1000 bp (data not shown, designated LlpLhgDNA and LlpLhcDNA, respectively). These PCR products were ligated to the pCR®4-TOPO® vector, and 11 clones for the gDNA amplicon and three for the cDNA amplicon were selected for sequencing. Five of these gDNA clones contained a continuous open reading frame sharing >96% identity. LlpLhgDNA-5 shared 99% identity with the identified cDNA clone, which also contained a continuous ORF (LlpLhcDNA-3). The 3′-end of LlpLhcDNA-3 was completed by 3′ RACE using gene specific primers g and h designed from the sequence and the manufacturer’s inner and outer 3′-end primers. Because the 1098-bp-long sequences of the six full-length gDNA and three cDNA amplicons obtained using the primer pair i/j share ~95% identity with Llpg2, we have classified all nine as Lhpg2 sequences. All Lhpg2 sequences (Lhpg2-1gDNA to Lhpg2-6gDNA, Lhpg2-1cDNA to Lhpg2-3cDNA) contain an open reading frame with an inferred amino acid sequence of 365 residues; only the Lhpg2-4gDNA and Lhpg2-5gDNA clones shared 100% identity at the nucleotide level. The accession number for the sequence of these identical clones (Lhpg2-4gDNA and Lhpg2-5gDNA) is EU431962 and for Lhpg2-2cDNA is EU431964 (ESM).
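The reported nucleotide lengths and inferred protein lengths are mutually consistent if each full-length sequence is taken to include its stop codon. The short sketch below makes that check explicit; the function name is hypothetical, and the inclusion of the stop codon in the stated lengths is an assumption.

```python
def residues_from_orf_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded residues for an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


lhpg4_residues = residues_from_orf_length(1050)   # Lhpg4 sequences
lhpg2_residues = residues_from_orf_length(1098)   # Lhpg2 sequences

print(lhpg4_residues, lhpg2_residues)
```

Under that assumption, 1050 bp corresponds to 350 codons (349 residues plus a stop) and 1098 bp to 366 codons (365 residues plus a stop), in agreement with the inferred sequences of 349 and 365 amino acids.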

A phylogram includes the reported putative LlPGs (Allen and Mertens 2008), two of the LhPGs cloned herein, two other insect PGs whose encoding genes have been identified so far (Girard and Jouanin 1999; Shen et al. 2003) and a set of microbe and plant PGs (Fig. 2). Phylogram branch lengths are proportional to the amount of evolutionary change inferred by the ClustalW2 program. The insect PGs are from the rice weevil (Sitophilus oryzae; gi:11493893) and the mustard beetle (Phaedon cochleariae; gi:4210806). The set of plant PGs includes Arabidopsis thaliana gi:16226535, Medicago truncatula gi:41529571, Solanum lycopersicum fruit gi:225933, and Zea mays gi:22419. The only nematode PG (gi:21628922) reported to date, from Meloidogyne incognita, is also included in the phylogram. Fungal PGs from each of the following species are represented in the multiple sequence alignment: A. niger PG II (gi:227288), used for the assessment of PG activity as a damage causing factor; a PG isoform isolated from Botryotinia fuckeliana necrotroph asexual form (gi:56236524); the hemibiotroph Fusarium oxysporum f. sp. radicis-lycopersici (gi:42412524); and the hemibiotrophic oomycete Phytophthora infestans (gi:15546021). PGs of bacterial origin are represented by Xylella fastidiosa strain Temecula 1 PG precursor (bacteria are delivered to the plant’s vascular system by means of an insect vector; gi:77747698), Xanthomonas oryzae pv. oryzae MAFF 311018 PG (bacteria are delivered to plants via rain droplets or wind; gi:84368142), and Treponema pectinovorum PG (bacteria isolated from human gingival crevices; gi:47132315). In the ClustalW2 sequence alignment, S. oryzae PG shares 27% to 34% amino acid identity with Lygus sp. predicted PG sequences. LhPG4-1gDNA shares 46%, 39%, 49% and 38% identity with the predicted LlPG1, LlPG2, LlPG3 and LhPG2-2cDNA sequences, respectively.
LhPG2-2cDNA shares 96% identity with the predicted LlPG2 sequence and 46% and 37% identity with the sequences predicted for LlPG1 and LlPG3, respectively.

Fig. 2

Phylogram of a selected set of plant and microbe PGs including the predicted amino acid sequences for the three full-length L. lineolaris PG genes and the two full-length L. hesperus PG sequences identified in the fed-upon collection diet of WTPB insects. S.oryzae = Sitophilus oryzae PG, gi:11493893; P.cochleariae = Phaedon cochleariae PG, gi:4210806; A.thaliana = Arabidopsis thaliana PG, gi:16226535; M.truncatula = Medicago truncatula PG, gi:41529571; S.lycopersicum = Solanum lycopersicum fruit PG, gi:225933; Z.mays = Zea mays PG, gi:22419; M.incognita = Meloidogyne incognita PG, gi:21628922; A.niger = A. niger PG II, gi:227288; B.fuckeliana = Botryotinia fuckeliana PG, gi:56236524; F.oxysporum = Fusarium oxysporum PG, gi:42412524; P.infestans = Phytophthora infestans PG, gi:15546021; X.fastidiosa = Xylella fastidiosa strain Temecula 1 PG precursor, gi:77747698; X.oryzae = Xanthomonas oryzae pv. oryzae MAFF 311018 PG, gi:84368142 and T.pectinovorum = Treponema pectinovorum PG, gi:47132315

WTPB-PGs are extruded into the insect diet

To determine whether L. hesperus PG proteins could be detected in the exudates of WTPB salivary glands or in a diet upon which the insect has fed, proteins were isolated and analyzed for peptides that correspond to the cloned L. hesperus PG sequences. PG activity was detected in the collection diet using the PG radial diffusion assay, and no activity was detected for the control. Table 2 lists the proteins identified in two samples: the PG-active protein band detected in the gel electrophoresis analysis of WTPB SGE, and the collection diet after lygus bug feeding. No proteins with significant similarities to Lygus PGs were detected in the controls.

Table 2 Proteins identified after searching “ath1pluslygus” database by Mascot and X! Tandem using LC-MS/MS to determine their peptide abundance in the PG protein band detected after electrophoresis and identified by an asterisk in Fig. 1b and in the fed-upon sachet protein collection diet

The predicted Lhpg4-1gDNA and Lhpg2-2cDNA amino acid sequences were identified in both the L. hesperus SGE protein band with PG activity and in the collection diet. However, LhPG4 had the highest percentage of sequence coverage in these two samples. Because these L. hesperus PGs were found in SGE and in the collection diet, they were selected for inclusion in the phylogram analysis (Fig. 2). Peptides from proteins similar to the L. lineolaris PGs were also detected but these proteins did not have peptide sequences corresponding exactly to the L. hesperus sequences that have been cloned.

Discussion

Debate over whether mechanical or biochemical factors cause the damage inflicted on plants by phytophagous Heteroptera (Schaefer and Panizzi 2000) has indicated that more rigorous tests are needed to determine what is responsible for the tissue damage caused by these significant pests. We simulated insect piercing–sucking feeding using a micro-injection technique (Shackel et al. 2005), and the results strengthened Strong’s conclusion that PG action is the principal cause of WTPB-inflicted plant tissue damage (Strong 1970; Shackel et al. 2005). Injection of partially purified WTPB PG and of pure fungal and bacterial PGs damaged alfalfa and cotton floral tissues, whereas injection of buffer or bovine serum albumin solutions did not. However, the question of whether PG activity or the PG protein’s structure per se was responsible for the plant tissue damage remained unanswered. It is uncertain whether peptide domains, rather than whole proteins, represent the active part of HAMPs such as glucose oxidase, β-glucosidase and alkaline phosphatase (Mithöfer and Boland 2008). Because these HAMPs are enzymes, it is also uncertain whether the plant responds to the protein itself or to its activity, or whether the actual HAMP is a product of the enzymatic reaction. Furman-Matarasso et al. (1999) reported that the activity of the so-called ethylene-inducing xylanase was due to a plant response to structural epitopes of the protein rather than to its hydrolytic activity. We have demonstrated that injection of a mutated A. niger PG lacking enzymatic activity but maintaining its tertiary structure caused no damage in planta, whereas injection of the active A. niger PG did. Thus, the conclusion that the action of PG, rather than PG protein structure per se, is responsible for the development of TPB-like damage symptoms is experimentally established, but the identity of the HAMP in this herbivore–plant interaction has not been determined. Cloning mirid PG genes provides key molecular tools for determining whether a particular PG or its enzymatic products act as HAMPs in the interaction.

Electrophoretic separation and in-gel PG assay of SGE provided a PG-active protein band that was subjected to amino acid sequencing. This, in turn, guided the design of degenerate primers that allowed the amplification of full-length gDNA and cDNA clones of a novel L. hesperus PG, LhPG4. The initial ClustalW2 alignment between the N-terminal sequence of the PG-active band in the WTPB salivary gland extract and the N-terminal sequence predicted by the Lhpg4 clones (gDNA and cDNA) showed only 70% positional identity. This discrepancy was resolved by analysis of the raw Edman degradation data, which indicated that at least two proteins were present in the sample and that an N-terminal sequence VDVNDIEQLEAAKNSQRITL was also supported. This sequence differed from the predicted N-terminal sequence for Lhpg4 only in the lysine at position 7, and this difference is a conservative amino acid substitution. It is not surprising that at least two proteins were detected by Edman degradation, since the L. hesperus PGs have similar molecular weights and thus would not be resolved by electrophoresis. An analysis of the PG activity in L. hesperus salivary gland protein extracts has identified several distinct PG proteins (Celorio-Mancera et al. 2009), and PG proteins that do not resolve effectively in the PAGE activity gel could explain the apparent “single” strong PG activity band in the profile.
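The positional identity figures discussed above can be illustrated with a minimal sketch that compares two already-aligned, equal-length peptide sequences residue by residue. The function name is hypothetical. The second sequence is also hypothetical: it is built by placing a lysine at position 7 of the Edman-supported sequence, only to mirror a single-residue difference, and it is not the actual predicted Lhpg4 N-terminus. Gapped alignments, such as those produced by ClustalW2, would need additional handling.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical residues in two aligned sequences.

    Hypothetical helper for illustration: assumes the sequences are already
    aligned, have the same non-zero length, and contain no gap characters.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# N-terminal sequence supported by the raw Edman degradation data.
edman_seq = "VDVNDIEQLEAAKNSQRITL"

# Hypothetical comparison sequence: one substitution at position 7
# (index 6), for illustration only; not the real predicted sequence.
variant_seq = edman_seq[:6] + "K" + edman_seq[7:]

identity = percent_identity(edman_seq, variant_seq)
```

With one mismatch in 20 aligned positions, the sketch returns 95% identity.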

We obtained full-length gDNA and cDNA clones for Lhpg2, presumably the homologue of Llpg2 from L. lineolaris, with 95% amino acid identity shared between the two predicted protein products. Lhpg4 shares only 39% to 49% amino acid identity with the L. lineolaris PGs described by Allen and Mertens (2008). We cloned several gDNA and cDNA sequences for both Lhpg2 and Lhpg4, and the predicted amino acid sequences differ by 1 to 4% within each group of L. hesperus PG clones. We have ruled out sequencing errors as the source of these differences in the Lhpg2 and Lhpg4 clones because the differences were consistent in duplicate sequencing runs. Rather, we suspect that this diversity of cloned sequences reflects single nucleotide polymorphisms or genetic heterogeneity. The amino acid identity between the Lhpg2 and Lhpg4 clones is around 40% and the nucleotide identity about 50%.

The analysis reported in Table 2 is based on the identification of tryptic peptides in the PG-active protein band from the L. hesperus SGE and of peptides from proteins isolated from sachets following L. hesperus feeding. The number of unique PG peptides and the percentage of coverage of a given PG are important factors in identifying proteins. The peptide analysis indicates with 100% probability that LhPG4-1gDNA is in the artificial diet; however, it indicated only a 93% probability that the protein predicted for the Lhpg4-2cDNA clone is also in the collection diet. The proteins encoded by Lhpg4-2cDNA and Lhpg4-1gDNA share 97% amino acid identity. Although the evidence for the presence of the Lhpg4-2cDNA gene product among the proteins in the collected diet is therefore statistically weaker, the apparent LhPG4 isozyme polymorphism detected in our cloning of the WTPB PGs may explain the uncertainties in identifying the PGs encoded by the Lhpg4 cDNA and gDNA sequences in the diet. Pathogens produce multiple PGs encoded by a family of highly polymorphic PG genes. For example, the gray mold fungal plant pathogen Botrytis cinerea has six PG genes whose encoded proteins are detected in host tissues (ten Have et al. 1998). This multiplicity of PG isozymes may allow the pathogen to colonize different plant hosts under different conditions and act as a “back up” mechanism to protect against losses of pathogenicity functions (Annis and Goodwin 1997; Markovic and Janecek 2001; D’Ovidio et al. 2004a, b). Therefore, it should not be surprising that, in the evolutionary “tug of war” between plants and insects, Lygus spp. also produce several PG isoforms. Other virulence proteins also are encoded by relatively large gene families; for example, thirty cDNAs encoding cathepsin L-like cysteine proteases have been identified in the cowpea bruchid (Callosobruchus maculatus) (Zhu-Salzman et al. 2003).

Seven unique peptides matching LlPG1 were detected in the collection diet fed on by L. hesperus. However, L. hesperus was the only Lygus species that fed on this diet. Therefore, the detection of peptides in the diet similar to those in LlPG1 suggests that there may be a not-yet-identified L. hesperus PG sequence very similar to Llpg1. The LlPG1 peptide sequences could be used as a starting point for designing degenerate primers for cloning the putative Lhpg1. The phylogram (Fig. 2) is based on the predicted amino acid sequences for the three L. lineolaris PGs and the sequences predicted for LhPG4 (from the Lhpg4-1gDNA sequence) and LhPG2 (from the Lhpg2-2cDNA sequence). The two L. hesperus PG sequences were selected for the phylogenetic comparison because the peptide analysis predicts with 100% probability that these proteins were present in the collection diet that the insects had fed upon. The analysis demonstrates the close similarity between the products of Llpg2 and Lhpg2-2cDNA and shows that the protein encoded by Lhpg4-1gDNA is an outlier among the known Lygus spp. PGs, emerging as an independent clade more closely related to LlPG3.

PG activity has been reported in extracts of whole or partially dissected Lygus spp. and in isolated salivary glands. Biochemical studies have suggested that PG isoforms are present in WTPB extracts (Strong and Kruitwagen 1968; Agblor et al. 1994). Now, with the identification of the gene families for L. lineolaris PGs (Allen 2007; Allen and Mertens 2008) and L. hesperus PGs (this manuscript) emerging as a separate subclade (Fig. 2), specific Lygus spp. PGs, among several possible saliva constituents (Laurema and Varis 1991; Schaefer and Panizzi 2000), and their enzymatic products can be evaluated in planta as possible HAMPs. Evidence demonstrating that PG is extruded into the diet when the insect is feeding had been limited to a histological description of TPB feeding on cotton floral buds (Williams and Tugwell 2000). The characteristic polyphagy of TPBs might be due to the insects’ ability to change their salivary constituents according to the biochemical characteristics of their diets. However, Varis et al. (1983) concluded that it is not the quality of food that has a decisive influence on the variation of salivary enzyme activities in L. rugulipennis, but rather the inter- and intraspecific variation of activities at the level of isozymes. This information, together with questions about the origin of the PG activity found in the guts of mirids (Frati et al. 2006), emphasizes the importance of identifying the enzymes that are actually secreted into the insect’s diet. Of all the predicted amino acid sequences for the Lhpg4 and Lhpg2 gDNA and cDNA clones (17 sequences in total), LhPG4-1gDNA and LhPG2-2cDNA were detected in the collection diet with 100% probability under the parameters used for protein identification. Therefore, we have now provided direct evidence that TPB PGs are extruded into the diet, corroborating the inference based on earlier observations of plant tissue maceration (Williams and Tugwell 2000).

In addition, all the WTPB PG genes cloned in our study whose products were detected in SGE or among the proteins extruded into feeding sachets encode proteins with molecular weights of 37–40 kDa. All the PG proteins found in the collection diet sachets are also found in the SGE PG protein band, suggesting that the PGs extruded into this food source have a salivary gland, rather than gut, origin. However, we have not examined the expression of LhPG4-2cDNA, LhPG4-1gDNA or LhPG2-2cDNA in the insect gut to exclude the possibility that the PGs are produced in both organs. Additional evidence that Lygus spp. PGs are secreted enzymes comes from the position of the N-terminal sequence in LhPG4, which reveals a signal peptide sequence between the initiation codon and the codon for the first amino acid of the N-terminus of the isolated protein.

Lygus lineolaris has more than 300 hosts, and L. hesperus is also polyphagous, with more than 100 host plants. Both Lygus species have developed resistance to pesticides, making chemical control difficult (Leigh et al. 1977). As we approach the challenge of integrating different research approaches within an ecological context to facilitate an appropriate protocol for engineering insect-resistant crops, it is crucial to develop the tools for a future molecular ecology era (Gatehouse 2008; Zheng and Dicke 2008). We have provided evidence that WTPB PG activity is inhibited by pear fruit, alfalfa and cotton polygalacturonase inhibiting proteins (PGIPs) (Shackel et al. 2005). Others have reported that an orange flavedo PGIP inhibits the citrus root weevil PG (Doostdar et al. 1997), and two of the four known PGIPs of the common bean inhibit PG activity from mirid bugs, including L. rugulipennis and L. pratensis (D’Ovidio et al. 2004b; Frati et al. 2006). PGIPs play a significant role in plant defense by inhibiting pathogen PGs (Powell et al. 2000; Ferrari et al. 2003). Expression of PGIP genes has been observed in response to insect feeding. In Brassica napus DH12075, two PGIP genes have been characterized. Bnpgip1 is highly induced by flea beetle feeding and mechanical wounding, but only weakly responsive to fungal attack (Li et al. 2003). However, it is not known whether mirid feeding induces PGIP gene expression. Given the parallel between pathogen–plant interactions and TPB–plant interactions, it is reasonable to test whether inhibition of TPB PG activity in planta mitigates the damage caused to plant structures by the insect. The possibility of a strategy for plant protection against Lygus spp. can be studied by testing the inhibitory activity of different plant PGIPs against single, homogeneous mirid PG isoforms.