Introduction

Up to this point, the reader should have a good understanding of HIV-1 at the protein level. In Chaps. 1 and 2, we have explored in depth the role of host proteins and the techniques that are routinely used in traditional research approaches. In this chapter we will introduce some of the fundamentals of proteomics and how they can best be applied to the study of HIV-1 and other viruses.

When our group started working on HIV-1 proteomics over a decade ago, we had to overcome many challenges. Given the limited amount of sample available for HIV-1, we didn’t have enough material to follow the stringent process testing that we would normally have done for our classical biochemical experiments. Eventually, with the help of our colleagues at the AIDS and Cancer Vaccine Program, we began using large quantities of 1000-fold concentrated virus isolated from >500 mL of cell culture supernatants. Armed with sufficient virus to systematically address where we were making mistakes, we were able to adapt our methods to work with HIV-1 and began to disarm many experimental and methodological landmines. In this chapter, we share the lessons we learned and the obstacles we had to overcome before we could obtain meaningful results and publications. It is my hope that the reader will be able to use these experiences as a starting point for the study of HIV-1 or for other projects involving limited numbers of samples and specialized approaches.

Proteomic Studies of HIV-1

Proteomics: “A Three-Legged Stool”

Proteomics has been described by many as a “three-legged stool”—if any one leg fails, there are catastrophic results for the user. In terms of HIV-1 proteomics, the legs of the stool consist of (1) sample preparation, (2) mass spectrometry, and (3) bioinformatics. In this chapter, we will principally address the first two legs, and we dedicate Chap. 6 to addressing the third. I have attempted to provide practical advice and the necessary minimal amount of background information for investigators wishing to attempt studies in this area or for physicians to better understand the type of information that one might consume as this field matures.

Sample Preparation

The purity problem of HIV-1 was explored in Chap. 2; however, we will expand upon this issue as it applies to proteomics. Ott and colleagues have nicely described the use of both subtilisin and CD45 depletion for the generation of pure virus preparations for host and viral proteins within the lipid bilayer of the virion and whole virions, respectively [1]. Our group has also contributed to this area, publishing methods of virus isolation including density manipulation and refined protocols for the use of OptiPrep reagents [2]. As we get closer to completing the initial “cataloging” of host proteins that are incorporated into virions, we are very likely to shift toward examining posttranslational modifications (PTMs) of viral proteins (described in Chap. 5) and the analysis of HIV-1 virions isolated directly from patients. Also, in the near future, we are likely to begin to take more quantitative approaches to the study of HIV-1 virions in patients. This shift in applications also necessitates a shift in our sample preparation procedures, away from the traditional large-scale biochemical approaches typically used in proteomics and toward those that allow for more rapid isolation of virions, such as affinity purification. For affinity purification approaches, although a CD44-based kit has been marketed for HIV-1 purification, we only recently confirmed that CD44 is incorporated into HIV-1 virions from both macrophages and T-lymphocytes [2]. Thus, consideration of a CD44-positive enrichment kit (Miltenyi), as a first step alongside a CD45 depletion kit, may accelerate sample processing. The caveats with these approaches, though, are that CD44 enrichment might preferentially enrich for R5-tropic virions [3] or may miss subsets of viruses that do not incorporate CD44.
Although affinity enrichment and depletion methods are very straightforward, ultracentrifugation remains one of the best approaches to isolate HIV-1 [3] and is widely used for the enrichment of virus for ultralow limits of detection by quantitative polymerase chain reaction (PCR) methods. Adaptation of these protocols to the latest generation of benchtop ultracentrifuges that can rapidly obtain extremely high relative centrifugation forces in a short period of time may be an alternative path moving forward.

For the near future, it is likely that primary viruses must be expanded in cell culture ex vivo unless one is working with samples isolated from patients in the acute phase of HIV-1 infection, where HIV-1 titers are very high. If ex vivo expansion of virus is required, then careful consideration must be given to the expansion strategy because, from a host-protein standpoint, HIV-1 will reflect the cell type that it last replicated in, potentially altering any host-protein phenotypes or PTMs that would have existed in the patient [2]. As tissue culture approaches improve for ex vivo expansion of cells [4] and our knowledge grows about the differences between primary virions and those expanded ex vivo, this may become less of an issue in the future. From a sequence perspective, if one is using proteomic approaches to confirm virus sequence changes, ex vivo expansion is less of a concern.

Regardless of the technique for the purification or generation of HIV-1 virions for study, for the purpose of cataloging viral proteins or the study of viral protein PTMs, virus must be significantly concentrated so that total protein concentration is in the range of 1 mg/mL to facilitate both the digestion of virus by proteases (most commonly trypsin) and the cleanup and recovery of the digested material. To this end, ultracentrifugation offers a very rapid way of “pelleting” virus to concentrate it, and the supernatant can be discarded. Even if the resulting pellet is invisible to the naked eye, the virus can then easily be resuspended in a digestion buffer containing trypsin to improve recovery. In our group we use an acid-cleavable detergent like RapiGest (Waters) to accelerate the digestion of the virus and also inactivate the virion at the same time. Acid-cleavable or similar detergents are a must as common detergents are contaminants for mass spectrometry analysis in following steps (reviewed in [5]).

Protein and Peptide Quantitation

While there is a plethora of protein quantitation kits on the market based upon different methodologies (reviewed in [6]), users are often unaware of the effect of interfering substances, which can vastly impact protein quantitation. In our experience, protein and peptide quantitation is one of the most important steps in proteomics. If sufficient experimental material exists (>50 μg), our first recommendation is a precipitation-based approach, like the 2D Quant kit from GE Healthcare. Specifically made to quantitate proteins from samples prepared with complex lysis buffers that include detergents, this kit simply precipitates out the protein and then follows a colorimetric approach. The downside of this approach is the relatively high amount of protein required, which in the context of HIV-1 only really allows for studies involving infected tissues or cells. Therefore, as an alternative we have used a custom protein binding assay described in [6] with a fluorescent post-dye like Sypro Ruby (Invitrogen) or Lava Purple (epicocconone, the Gel Company/GE Healthcare). This method uses a dot-blot apparatus: interfering substances are washed through a filter, whereas the protein itself adsorbs to a nitrocellulose membrane. The sample is run along with a standard curve and read on a fluorescent scanner. Protocols for each of these methods are available on the vendor sites. Recently, Feist and Hummon reviewed similar approaches for microgram amounts of materials for proteomic studies [5, 7].

Prior to sample concentration, most individuals working with HIV-1 generally use ELISAs or quantitative PCR to determine the quantity of virus present in the sample. This information can be used to help estimate the total amount of protein in a sample. Based upon published studies, there are roughly 1,500 copies of capsid (p24) per copy of viral RNA [8], which works out to be approximately 16,000 virions per picogram of p24. Using results obtained by Marozsan and colleagues, these numbers appear to overestimate the number of virions by a factor ranging from 33 to over 100, when calculating the ratios of p24 to viral RNA [9]. Given the higher reproducibility of viral RNA measures versus p24 assay results, we prefer to normalize by viral copy number. By ELISA, in our experience [9], the protein concentration of a virus preparation is generally 10–20-fold higher than the p24 level. As with any quantitative assay, one must be very careful to run the sample in dilution to make sure that the reported value is within the range of the standard curve used for the assay. In practical terms, we generally use 25 mL of infected culture supernatant to obtain sufficient virus to perform a discovery experiment. Our general assumption is that HIV-1 grows to approximately 10 ng/mL of p24 in permissive cell lines. Thus, we have 250 ng of p24 or, using the low end of our rule of thumb, 2.5 μg of total protein. This is on our low end for a thorough cataloging of virus by mass spectrometry but sufficient for us to obtain significant coverage of the virus for most applications.
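The arithmetic behind these estimates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the molecular weight of p24 is taken as roughly 24 kDa (a value not given above), the ~1,500 copies of p24 are treated as a per-virion figure, and the function names are hypothetical.

```python
# Back-of-the-envelope estimates of virus quantity from a p24 measurement.
AVOGADRO = 6.022e23            # molecules per mole
P24_MW_G_PER_MOL = 24_000.0    # ASSUMPTION: p24 is roughly 24 kDa
P24_COPIES_PER_VIRION = 1_500  # ASSUMPTION: ~1,500 p24 copies treated as per virion


def virions_per_pg_p24(copies_per_virion=P24_COPIES_PER_VIRION):
    """Number of virions whose combined p24 content weighs one picogram."""
    p24_molecules_per_pg = 1e-12 / P24_MW_G_PER_MOL * AVOGADRO
    return p24_molecules_per_pg / copies_per_virion


def total_protein_ug(supernatant_ml, p24_ng_per_ml, fold_over_p24):
    """Total protein (micrograms) from a p24 titer and a fold-over-p24 rule of thumb."""
    p24_ng = supernatant_ml * p24_ng_per_ml
    return p24_ng * fold_over_p24 / 1000.0  # convert ng to ug


if __name__ == "__main__":
    print(f"{virions_per_pg_p24():,.0f} virions per pg of p24")
    print(f"{total_protein_ug(25, 10, 10):.1f} ug total protein (10-fold rule)")
    print(f"{total_protein_ug(25, 10, 20):.1f} ug total protein (20-fold rule)")
```

With 25 mL of supernatant at 10 ng/mL of p24, the 10-fold and 20-fold ends of the rule of thumb bracket the expected yield, which is a useful sanity check before committing a sample to digestion.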

For discovery experiments, the limited abundance of HIV-1, especially from small cell culture experiments or from primary isolates, is the biggest risk factor for success in HIV-1 proteomics. A good rule of thumb is to work backward from your optimal analysis to determine how much material is required for your mass spectrometry experiment. For instance, if one is looking to find as many proteins as possible in an HIV-1 isolate, then the ideal amount of protein for nano-high-performance liquid chromatography (nano-HPLC) ranges between 1 and 10 μg of digested and cleaned peptide on column. Working backward, using the optimal workflow recommended below, always assume that the amount of peptide remaining after digestion, cleanup, and quantitation by HPLC will be about half of the amount of protein present prior to digestion. Thus, for a discovery experiment, the risks expand greatly for starting amounts of material under 20 μg. Since purified virus is roughly 1 μg/mL of total protein in supernatant, this means that 20–25 mL of supernatant is the minimum required. For plasma from a patient, this would depend on the titer of virus in the patient, but 20 mL of plasma would be a good starting point for a discovery experiment, so long as the patient is viremic.
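The "work backward" rule of thumb can be written down as a short calculation. The sketch below assumes a recovery of one half through digestion and cleanup and a yield of 1 μg of total protein per mL of supernatant, as in the text; the function names and default values are illustrative only and should be replaced with figures measured in your own workflow.

```python
def starting_protein_ug(on_column_ug, recovery_fraction=0.5):
    """Protein needed before digestion to end up with `on_column_ug` of peptide.

    recovery_fraction is the ASSUMED fraction surviving digestion and cleanup.
    """
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in the interval (0, 1]")
    return on_column_ug / recovery_fraction


def supernatant_ml(protein_ug, protein_ug_per_ml=1.0):
    """Culture supernatant volume needed to supply `protein_ug` of viral protein."""
    if protein_ug_per_ml <= 0:
        raise ValueError("protein_ug_per_ml must be positive")
    return protein_ug / protein_ug_per_ml


if __name__ == "__main__":
    target_on_column = 10.0  # upper end of the 1-10 ug nano-HPLC range
    needed = starting_protein_ug(target_on_column)
    print(f"Start with at least {needed:.0f} ug of protein")
    print(f"That is about {supernatant_ml(needed):.0f} mL of supernatant")
```

Changing `recovery_fraction` shows quickly how sensitive the required starting volume is to losses during cleanup, which is why quantitation after cleanup matters so much.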

Sample Cleanup: Beware of Traditional Detergents

Most classical methods working with HIV-1 involve inactivation of virus with 1 % Triton X-100. Since detergents are a profound interfering substance for both nano-HPLC and mass spectrometry, different strategies must be considered for inactivation of virus. While new approaches and products allow for the removal of detergents [10], many of which are now commercially available (Pierce or Thermo Scientific), in our laboratory we have substituted an acid-cleavable detergent (reviewed in [11]) such as 0.5 % RapiGest (Waters) or, alternatively, a buffer containing urea to facilitate digestion of proteins [2], obviating the need to use detergents at all. The use of a high-quality trypsin, following the manufacturer’s instructions, is also essential to avoid contamination with trypsin autolysis products.

The next step of good sample preparation is removing salts (another interfering substance) from the sample and actually quantifying the peptide that will be injected onto the mass spectrometer. In our group we have moved directly to off-line HPLC cleanup of the peptides and use a standard curve by HPLC. By far this is the gold standard for peptide quantitation and desalting, but from a practical standpoint, this is somewhat like driving around the block in a Ferrari, as using HPLC instrumentation for this purpose can be costly and using a neighbor’s HPLC system is not practical since often they are set up with assays specific to their laboratory needs. A much less expensive approach is the use of peptide spin cartridges, which allow for washing and desalting of peptides in a typical laboratory setting. The subsequent use of the LavaPep [6] (The Gel Company) or a similar peptide quantitation method, like nanodrop (Thermo Scientific), provides the investigator with precise information on the abundance of peptides prior to performing a mass spectrometry experiment as the recovery can sometimes be variable between spin cartridges. The use of a peptide standard mix to make a standard curve is essential to ensure accurate quantitation (available from ProteoChem, Life Technologies and Sigma-Aldrich). Regardless of the approach, purified quantitated peptides maximize the probability of success of a mass spectrometry experiment.

Mass Spectrometry: The Basics

Prior to discussing the acquisition of mass spectrometry data, a lengthy aside is necessary to introduce mass spectrometry (MS), as the success of the HIV-1 proteomic experiment is now dependent on a well-executed MS experiment. While the end point of an MS experiment is obtaining information on both protein composition and abundance, understanding the principles of MS helps inform the investigator as to which types of mass spectrometers are better suited to which purpose. While any experienced proteomic core director should be able to guide the newly initiated, it never hurts to come to the conversation somewhat informed. Unfortunately, there is a paucity of reviews of mass spectrometry for the lay audience and, for the most part, a lack of fundamental education in this area for most scientists and clinicians. At the time of writing, the Agard lab at University of California San Francisco (UCSF) has a fantastic primer available online (http://www.msg.ucsf.edu/agard/protocols.html, MS101; Google Keyword search: Agard lab mass spectrometry 101) that I have drawn heavily from in my own teaching of the fundamentals of mass spectrometry at Johns Hopkins School of Medicine (https://jh.box.com/ms-basics-graham). The next section presents a lay view of mass spectrometry; it is in no way meant to be a comprehensive review of the subject but is intended to allow the reader to have an informed conversation with a mass spectrometrist.

Vacuum System and Source

Often the first thing that somebody notices when a mass spectrometer is installed in the laboratory is that it is loud. The noise associated with an MS instrument is due to the vacuum systems needed to keep the instrument operating at very low pressure, ranging from 10⁻³ to 10⁻⁵ Torr or lower depending on the section of the instrument or what operations are being performed (for reference, the pressure in outer space ranges from 10⁻⁶ to <10⁻¹⁷ Torr in interstellar space, the moon surface atmosphere being 10⁻¹¹ Torr). Think of the classic experiment that is performed in high school labs across the nation: a falling ball and a falling feather. In air at normal atmospheric pressure (760 Torr), a feather falls much more slowly than the ball due to air resistance. In a vacuum they fall at the same rate (see the Human Universe: Episode 4 on YouTube by Brian Cox). Ions, despite their incredibly small sizes, bang into other molecules and are slowed down just like any other matter. Therefore, it’s the job of the source region to allow samples to enter the MS system from an area of high pressure to the inside of the instrument, where a very high vacuum exists. The trick with mass spectrometry has always been getting the sample from the solid or liquid phase into the gas phase so that it can be moved around inside a vacuum. The other problem is how to move it around once we have converted our analyte into a gas. The job of the source of a mass spectrometer is to convert our analyte into the gas phase and, at the same time, impart a charge on the analyte. The charge, either positive or negative, is the only way we can move something around in the gas phase, using electric and magnetic fields.

The source most widely used today is electrospray ionization (ESI). In this method peptides are dissolved in an aqueous solution and run through a capillary column (from an HPLC system) and then through an emitter needle that is held at high voltage, while gas, typically nitrogen, is blown into the source region to evaporate the solvent. The result is that the droplets of solution, which contain charged particles, rapidly evaporate and reach a point (the Rayleigh limit) where the repulsion between like charges is stronger than the surface tension of the droplet, resulting in an explosion of the particles out of solution into the gas phase (into the air). Since the most commonly used protease, trypsin, cleaves after a lysine or arginine, which are basic amino acids, peptides typically become protonated (H+), resulting in peptides having at least one positive charge. Protonation is facilitated by acidic pH conditions. After being charged into ions, the peptides are focused through a series of ion optics toward the next component of the system, the mass analyzer. While air and other non-charged particles are also entering into the instrument, the ion optics create fields that are stronger than the airflow created by the vacuum system. Therefore, the charged particles, or ions, are continually concentrated relative to their environment as they enter into regions of the instrument held at lower pressure.

The Mass Analyzer

As mentioned above, vacuum allows charged peptides to move in the instrument. Highly efficient roughing pumps and high-velocity turbo pumps are used to remove as much air as is feasible to create a vacuum. Once this is achieved, peptides can be accelerated, decelerated, and steered using electric fields. While the details are beyond the scope of this book, in most electrospray ionization (ESI) instruments, a series of ion optics is used to steer the beam of ions and focus it toward the mass analyzer, where masses are separated.

The job of the mass analyzer is to separate different masses entering through the source regions. The same principles that are involved in redirecting ions (varying voltages and radio frequencies) are used by the mass analyzers to get rid of unwanted ions or enrich desired ions.

There are three major types of mass analyzers that are commonly used in modern instruments: the quadrupole (Q), the ion trap (IT), and the time of flight (TOF) mass analyzers. In the most common configuration for protein analysis, multiple analyzers are combined, generating what is referred to as a tandem mass spectrometer. Two or three analyzers are typically combined in series, giving rise to several different configurations.

Likely, the easiest mass analyzer to conceptualize is the time of flight mass spectrometer (reviewed in [12]). In a TOF instrument, ions are accelerated by an electric field and then separated according to the time they take to travel to the detector. The ions hitting the detector are recorded, and this information is presented in a mass spectrum, with mass (m/z, defined below) on the x-axis and the intensity of the signal on the y-axis. To visualize how a TOF mass analyzer works, imagine a bowling ball and a marble sitting side by side in a lane at a bowling alley. If the exact same amount of force is applied to the bowling ball and the marble at precisely the same time, the marble will reach the end of the lane sooner than the bowling ball. Since we can measure the time it takes for this to occur and we know the amount of force that has been applied, we can calculate the mass of the marble and the bowling ball. While the equations look a bit different, for an ion in a TOF instrument, the time of flight increases with mass (it is proportional to the square root of the mass-to-charge ratio). The only conceptual trick is that since we cannot physically push the ions but instead need to use voltage to apply force, the ions will receive energy in a dose equal to the number of charges that they have on the molecule. For example, a molecule with one charge will receive the equivalent of one push of equal energy, whereas a molecule with two charges will receive two pushes of equal energy, and so on. In order to calculate the mass of an unknown peptide, knowing the time (measured) and the force applied, but not charge state (number of charges), other inferences need to be made. Despite the name mass spectrometry, the mass on a mass spectrum is in reality the “m/z,” or mass over charge ratio. Indeed, the speed at which a peptide ion flies in the instrument increases with its charge. So for a protonated peptide the value measured is actually m/z = (M + zH)/z, where M = mass of the neutral peptide, H = mass of a proton, and z is the charge.
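For readers who prefer numbers to bowling balls, the idealized physics of a linear TOF can be sketched as follows. An ion of charge z accelerated through a voltage V gains energy zeV, which sets its velocity and therefore its flight time over a drift tube of length L. The 20 kV accelerating voltage and 1 m drift length below are arbitrary assumptions chosen for illustration, not the parameters of any particular instrument, and real instruments add refinements (such as reflectrons) that are ignored here.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per unit charge
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time_us(mass_da, charge, accel_voltage_v=20_000.0, drift_length_m=1.0):
    """Flight time in microseconds through an idealized field-free drift tube.

    Energy gained is z*e*V = (1/2)*m*v^2, so v = sqrt(2*z*e*V/m) and t = L/v.
    The voltage and length defaults are ASSUMED values for illustration only.
    """
    mass_kg = mass_da * DALTON_KG
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE_C * accel_voltage_v / mass_kg)
    return drift_length_m / velocity * 1e6


if __name__ == "__main__":
    for mass, z in [(1000.0, 1), (4000.0, 1), (2000.0, 2)]:
        print(f"M = {mass:6.0f} Da, z = {z}: {flight_time_us(mass, z):.2f} us")
```

Running the example shows two things: quadrupling the mass at the same charge only doubles the flight time, and an ion of 2000 Da carrying two charges arrives at the same time as an ion of 1000 Da carrying one, which is why the instrument reports m/z rather than mass.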
Fortunately with high-resolution mass spectrometers, z (charge state) can be calculated by using the information stored within the isotopic envelope. This is generated by the natural distribution of isotopes and their relative abundance within a peptide chain. For instance, the natural abundance of 13C generates different isotopic forms of the same peptide. The isotopic envelope, which can be observed before correcting for isotopic distribution, is a representation of the natural occurrence of heavier isotopes (e.g., 13C). Since peptides are mostly composed of carbon, hydrogen, nitrogen, and oxygen, we can use the natural abundance of heavy isotopes to determine the charge state by looking at the mass difference between the light and heavy isotopes (for reference see [13]). For example, the mass of 12C is exactly 12.00 Da, and the mass of 13C is approximately 13.00 Da. Therefore, in a population of ions of a typical peptide, most carbon atoms will be 12C; however, some will be 13C. When these different forms are resolved in a mass analyzer, we can see the population with the all-12C form and the population containing one 13C. To calculate the charge, we look at the difference in m/z between the two forms of the population. If the difference between one isotope peak and the next is 1, then there is only a +1 charge; if it is 0.5, then there is a +2 charge; and so on. Typically, with electrospray ionization instruments, the charge state is +2 and above. Therefore, an important caveat is that we need an instrument of sufficiently high resolution to resolve the differences between the nearest peaks.
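The isotope-spacing rule lends itself to a small worked example. The sketch below infers the charge state from the spacing between two adjacent isotope peaks and then converts an observed m/z back to a neutral mass. The constants are more precise than the rounded figures used in the text (the 13C to 12C difference is taken as about 1.00336 Da and the proton as about 1.00728 Da), and the function names are hypothetical.

```python
PROTON_MASS_DA = 1.007276     # mass of a proton in daltons
C13_C12_DELTA_DA = 1.003355   # mass difference between 13C and 12C in daltons


def mz_from_mass(neutral_mass_da, charge):
    """m/z of a peptide of neutral mass M carrying `charge` protons: (M + z*H)/z."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def charge_from_isotope_spacing(mz_peak_1, mz_peak_2):
    """Charge state inferred from two ADJACENT isotope peaks of the same peptide."""
    spacing = abs(mz_peak_2 - mz_peak_1)
    if spacing == 0:
        raise ValueError("peaks must have different m/z values")
    return round(C13_C12_DELTA_DA / spacing)


def neutral_mass(mz, charge):
    """Neutral peptide mass M recovered from an observed m/z and charge state."""
    return charge * mz - charge * PROTON_MASS_DA


if __name__ == "__main__":
    z = charge_from_isotope_spacing(500.0, 500.5)
    print(f"Spacing of 0.5 implies z = {z}")
    print(f"Neutral mass at m/z 500.0: {neutral_mass(500.0, z):.3f} Da")
```

Note that a spacing of 0.5 or 0.33 can only be distinguished from a spacing of 1 if the instrument resolves peaks that close together, which is the resolution caveat made above.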

After the TOF, the next mass analyzer that is easiest to conceptualize is a quadrupole mass analyzer. The quadrupole, Q or quad for short, is in essence composed of two pairs of parallel rods (four poles) aligned along an axis and spaced 90° apart. If one were to look at them end-on against a watch dial, one would be at 12 o’clock, one at 3, one at 6, and one at 9. A radio-frequency (RF) voltage is applied to the rods, and a direct-current (DC) voltage is superimposed on top of this. In lay terms, one set of forces is used to nudge ions off the axis, and the other to nudge ions onto the axis. For example, if filtering out lower masses is desired, just enough energy is applied to keep the mass of interest between the rods; lower mass ions will crash into the rods or leave the ion beam because, as in a TOF instrument, a lighter ion will travel farther with the same force. To filter out higher masses, just enough energy is applied to steer lower masses to the center of the beam, and higher masses will not be moved toward the central axis and will eventually exit the ion beam. The small ions will ping-pong back and forth, but the large ions with initial kinetic energy won’t be overcome by the small forces. Thus, by working together, the poles in the quadrupole can act as a mass filter for the masses of interest. To generate a mass spectrum, a quadrupole mass spectrometer has to scan across the mass range, allowing each individual m/z to pass through in turn. Since ions are nudged along, the resolution of these instruments tends to be much lower than that of other instruments, and quadrupoles are often used in combination with other mass analyzers in hybrid instruments. One of the most powerful applications of quadrupoles is when three are placed in series, also known as a triple quadrupole (Q3) instrument.
In this case, the first quadrupole can be used to select a particular ion, the second quadrupole to fragment it, and the third quadrupole to transmit only the resulting fragment ions (also known as product ions or transition ions) to the detector. In this approach, termed selected or multiple reaction monitoring, highly specific “transition” ions can be monitored with incredible gains in signal-to-noise ratios. This is because co-eluting peptides that by chance have the same mass as the peptide of interest (isobaric ions) are eliminated prior to reaching the detector. Selected reaction monitoring is described in detail in a recent review by Gianazza and colleagues [14].

The next major type of mass analyzer is the Orbitrap mass analyzer [15], which is an evolution beyond a traditional ion trap mass analyzer. A traditional ion trap mass analyzer uses similar principles to a quadrupole, except instead of letting ions pass through the gate or not, a trap keeps the ions trapped in an orbit. To measure a mass spectrum, ions are scanned out of the trap (using the same forces as a quadrupole) to the detector. Alternatively, individual ions can be kept in the trap and all the other ions ejected. An Orbitrap instrument uses some of the same principles as an ion trap, except instead of ions traveling inside of the trap, the ions spin between an outer electrode shell and an inner central axis electrode or spindle. An outer “trap” is usually necessary to load ions from the source region into the Orbitrap to overcome the field generated between the outer shell and inner spindle. Similar to a TOF, the heavier the ion, the farther away it “orbits” the electrode, and the lighter an ion is, the closer the orbit.

The Detector

In order to actually detect ions that have been separated by a mass analyzer, a detector is needed. Now working in reverse order, in the case of the Orbitrap, the detector is built into the trap on opposite sides. This configuration is necessary, since ions moving within the electric field of the trap induce currents on the outer shell electrode. These signals are picked up on either side of the field, and the signals can be deconvoluted using Fourier transformation to generate an exquisitely high-resolution mass spectrum. This high resolution is achieved since the current itself is deconvoluted from the actual path of the ions versus being interpreted from electronics as signals are detected in a TOF instrument. In a quadrupole or a TOF instrument, once ions have been separated and sorted, the signal must be converted from ions to electrons. While detectors can vary in their construction, in the case of non-Orbitrap detectors, ions are sent colliding into charged surfaces that amplify the signal into electrons, photons, or both. The intensity of the signal is then recorded and reported along an axis that is mass to charge or m/z. Detectors, like any electronic equipment, only work within certain ranges, typically four orders of magnitude or less. This is an important consideration, since the detectors often cannot detect weak signals in the presence of strong signals, and if detectors become saturated with too much signal, they can take some time to “reset.” From an experimental standpoint, this means that if the signal is too low, it will not stand out from the electronic noise, and if a signal is too high, you will lose the ability to quantitate if the detector is saturated.

Tandem Mass Spectrometry

The final concept that must be introduced to the reader is tandem mass spectrometry. As mentioned, a tandem mass spectrometer is an instrument in which two or more stages of mass analysis are performed in tandem. For most MS applications, a hybrid mass spectrometer is used. For proteomic applications, a modest resolution is required (~10,000–15,000 resolution) to determine the charge state of multiply charged peptides. Given this resolution requirement, most hybrid mass spectrometers use at least a quadrupole as an analyzer. Only one vendor, Thermo Scientific, owns the patent on the Orbitrap mass analyzer. Generally, the quadrupole is used as a mass analyzer to rapidly select ions for fragmentation, followed by different analyzers (such as TOFs and traps). As mentioned, several different configurations exist on the marketplace including trap-TOFs and other magnetic sector detectors which are beyond the scope of this book.

The most common use of a tandem MS instrument is to first measure the mass and intensity of the analytes (MS) and then to isolate one molecular ion in particular, fragment it, and measure the mass of the fragments (a second MS spectrum). We term this operation MS/MS, MS2, or tandem MS. Conceptually, there are two types of tandem MS instruments: those that operate in tandem separated by space and those that operate in tandem separated by time. Tandem-in-space instruments carry out the isolation of ions, fragmentation of ions, and measurement of fragment ions in different spaces in the instrument. The Q-TOF is the best example of a tandem-in-space instrument, as the first MS experiment allows all ions to pass through the quadrupole and collision cell and be separated by the TOF. In the second MS experiment, the quadrupole isolates the mass of interest and the ion is fragmented (either in a collision cell or by increasing energy of the ions), and the fragments are separated in the TOF. As electronic components improve, at the time of writing, Q-TOFs can easily operate in the 50–200 Hz range, performing many MS/MS experiments in a second.

The second type of instrument is a tandem-in-time hybrid instrument. In a tandem-in-time experiment, the operations are performed in the same region of the mass spectrometer but at different times. An example would be an ion trap instrument, where ions are first collected and scanned out to perform the MS experiment, and then all ions but the ion of interest are scanned out, the ion is fragmented, and the fragments are scanned out for the second MS experiment. Hybrid trap instruments now exist in which ions can be measured in the Orbitrap for the first experiment and a quadrupole used to collect ions; the ions are then fragmented in the loading trap and the fragments measured in the Orbitrap. In this manner, the speed of the instrument operations can be increased significantly, with Orbitrap instruments operating in the 18–20 Hz range. While “slower” than a Q-TOF instrument, the ability to optimize MS/MS by varying fill times of ions in the trap and the ability to perform additional experiments make a trap instrument more versatile.

Mass Spectrometry in the Context of HIV-1 Proteomics

Chromatography Considerations

Having covered the principles of mass spectrometry in the preceding section, we can appreciate that tandem mass spectrometry will be the most important application of mass spectrometry for most researchers engaging in HIV-1 proteomic studies. In previous primers on proteomics, we have spent considerable time extolling the virtues of performing extensive protein separation techniques to increase the coverage of proteins [16]. In the context of HIV-1 proteomics, the limited amount of sample available to the investigator precludes the use of protein separation methods given the considerable losses that can occur in most gel-based or chromatography approaches. To reiterate the sample preparation approach described above, an in-solution digest with trypsin followed by peptide quantitation is the method of choice for HIV-1 proteomics. Fortunately, we are well beyond the days of slow instrumentation, where often only one or two MS/MS events could be performed per second. With instruments now exceeding 50 Hz, the number of MS/MS events that can be obtained per second reduces the need for extensive multiple dimension protein and peptide fractionation approaches.
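To put these acquisition rates in perspective, a rough upper bound on the number of MS/MS events in a single run is simply the rate multiplied by the run time. The sketch below uses a 90-min gradient as an example length and ignores the time spent on survey (MS) scans and any idle time, so real numbers will be lower; the function name is hypothetical.

```python
def max_msms_events(gradient_min, msms_rate_hz):
    """Upper bound on MS/MS events in one run (ignores survey scans and dead time)."""
    if gradient_min < 0 or msms_rate_hz < 0:
        raise ValueError("gradient length and rate must be non-negative")
    return int(gradient_min * 60 * msms_rate_hz)


if __name__ == "__main__":
    for rate_hz in (2, 20, 50):
        events = max_msms_events(90, rate_hz)
        print(f"{rate_hz:>3} Hz over 90 min: up to {events:,} MS/MS events")
```

The roughly 25-fold difference between a 2 Hz and a 50 Hz instrument over the same gradient is the reason a single long nano-HPLC run can now stand in for much of the up-front fractionation that older instruments required.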

Given that most experiments will involve a complex mixture of peptides but be limited to under 10 μg of peptide, the best investment for discovery proteomics is nano-HPLC with long gradients and long columns for separation of peptides. Recently, Hsieh and colleagues published a very elegant study examining the relationships between column and gradient lengths on MS and MS/MS performance, showing the performance gains of longer nano-HPLC columns [17]. Indeed, some companies are now marketing 1 m-long nano-HPLC monolithic columns (Dionex) that have exceptional performance. HPLC “chip” systems, which reduce the number of connections and the “dead volume” of those connections, are also becoming more and more robust. These systems include offerings from Eksigent/ABSciex (cHiPLC) and Agilent (ChipCube System). The chip systems offer less user variability, as do purchased columns; however, they also tend to be much more expensive. It is at this point, though, that the investment in off-line desalting and accurate peptide quantitation will protect the investment no matter what choices are made. At minimum, most facilities should be able to offer a 30-cm column on which to perform nano-HPLC separations. Testing the system configuration for performance is a good investment prior to running an extensive experiment with biological samples. Often, a single sample run in triplicate can help to determine the optimal load for the column and optimal chromatography gradients for the sample. In our laboratory, we routinely profile ~1,500 proteins from 10 μg of peptide from HIV-1 virions and ~3,000 proteins from HIV-1-infected MDMs using a 30-cm, 150-μm ID column packed with 3-μm C18 resin with a 300 Å pore size, over a 90-min gradient at 500 nL per min on an ABSciex 5600 instrument (manuscript in preparation). This generic method, with direct loading onto the analytical column, is highly reproducible.
Given the limited amount of sample available in a typical experiment involving virions or infected primary cells, nano-HPLC is the method of choice; however, as sources on UHPLC systems continue to improve, the gap between micro-flow and nanoflow HPLC will likely narrow. At the time of writing, approximately tenfold more material must be used with UHPLC to obtain the same limits of detection as nanoflow chromatography methods; however, the increased performance and stability of the UHPLC system warrants consideration.

Mass Spectrometry Acquisition: Qualitative Versus Quantitative Methods

Data-Dependent Analysis

By far the most common type of qualitative mass spectrometry experiment is data-dependent analysis (DDA). In this type of experiment, a full scan of all of the masses is first taken by the instrument (MS), then a specific mass is isolated and fragmented (typically by collision-induced dissociation or CID), and the fragment masses are measured (MS/MS). Since peptide separation is occurring in real time, the width of a typical peak is only a few seconds. Now that the mass spectrometry field has moved away from slower instruments operating between 2 and 5 Hz and capable of only performing 1–5 MS/MS events per second, the need for extensive peptide fractionation is lessened. New high-resolution, high-performance mass spectrometers operate at speeds of up to 200 Hz at the time of writing, allowing for the acquisition of much more data in a short period of time. This reduces the probability of missing a peptide stochastically. Given the limited amount of material for an HIV-1 experiment, performing an experiment on an instrument slower than 50 Hz (for a Q-TOF) is simply not recommended. If a faster instrument is not available, then strategies must be considered to either separate proteins prior to digestion or peptides after digestion using different fractionation strategies (reviewed in [18]).
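As a rough illustration of why acquisition speed matters, the sketch below multiplies the acquisition rate by the chromatographic peak width to estimate how many MS/MS events fit across a single eluting peak. The 5-s peak width is an assumed value chosen only for illustration, and the estimate is an upper bound because it ignores the time spent on MS survey scans.

```python
def msms_events_per_peak(acquisition_rate_hz, peak_width_s):
    """Upper bound on the MS/MS events that fit across one chromatographic peak."""
    return acquisition_rate_hz * peak_width_s


# Assumed peak width; the text says only that a typical peak lasts a few seconds.
PEAK_WIDTH_S = 5.0

for rate_hz in (2, 5, 50, 200):
    events = msms_events_per_peak(rate_hz, PEAK_WIDTH_S)
    print(f"{rate_hz:>3} Hz: up to {events:.0f} MS/MS events per peak")
```

The slower the instrument, the fewer co-eluting peptides can be sampled before the peak has passed, which is why fractionation becomes necessary again on slower platforms.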

From the reader’s perspective, the fundamental goal of a DDA experiment is the acquisition of MS/MS data on as many peptides generated from proteins as possible. DDA experiments are typically semiquantitative. As the speed of acquisition and sampling of peptides increases, the number of times a spectrum is observed can be used as an estimate of the protein abundance. This method, termed spectral counting, is a good start at estimating protein abundance. If biological replicates are available, then this approach can be used along with simple statistical tests between groups to identify proteins that are changing under different conditions.
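Spectral counting and a simple between-group test can be sketched in a few lines. The example below is a minimal illustration, not our laboratory's pipeline: the protein names and replicate values are hypothetical, normalizing each count to the total spectra in the run is one of several possible normalizations, and Welch's t statistic stands in for whichever "simple statistical test" the investigator prefers.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def spectral_counts(psm_proteins):
    """Count how many identified MS/MS spectra were assigned to each protein."""
    return Counter(psm_proteins)


def normalized_counts(counts):
    """Express each protein's count as a fraction of all spectra in the run."""
    total = sum(counts.values())
    return {protein: n / total for protein, n in counts.items()}


def welch_t(group_a, group_b):
    """Welch's t statistic comparing two groups of biological replicates."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical run: one protein label per identified spectrum.
run = ["PROT_A", "PROT_A", "PROT_B", "PROT_A", "PROT_C"]
print(normalized_counts(spectral_counts(run)))

# Hypothetical normalized counts for one protein across three replicates per group.
uninfected = [0.010, 0.012, 0.011]
infected = [0.020, 0.023, 0.021]
print(f"Welch t = {welch_t(infected, uninfected):.2f}")
```

With only a handful of replicates, the statistic should be read as a screening tool for candidates rather than as proof of a change.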

Label-Free Quantitative Approaches: Spectral Counting and Data-Independent Analysis

We are quickly advancing toward observing more and more of the proteome in each experiment, and the issue of quantitation is often becoming more important than detection of unknown proteins. Since we already know all of the viral proteins in HIV-1, for example, should we bother trying to isolate and identify all of them? Perhaps not. If we have already generated a large database of proteins using traditional (DDA) approaches, then we can construct in silico databases based upon the time at which a peptide eluted along with its fragmentation spectrum. Once this database is constructed, we can perform an experiment where we simply skip to the fragmentation step. Generically, in this type of approach, the instrument measures all of the precursor ion masses (and intensities) and then quickly isolates a range of masses, simultaneously fragments them, and measures the fragment ions all together. By mapping the precursor ions and fragment ions back to the databases constructed beforehand, rather than trying to isolate a single ion, we can enhance the sensitivity of detection by approximately tenfold. This type of approach is marketed by different vendors (MSE by Waters, All Ions by Agilent, and SWATH by ABSciex, to name a few); however, in essence it is taking advantage of higher collision energies and looking for fragment ions that are unique to the peptide of interest. The downside of these methods is that care must be taken to ensure that peptides are normalized properly prior to acquisition on the instrument and that the samples are run on the same column to ensure that retention times of peptides do not drift. If possible, it is best to run each sample twice, once in DDA mode and then once in data-independent analysis (DIA) mode. In this manner, one obtains the best of both worlds: accurate quantitation and the ability to identify unknowns in each sample.
Another drawback of this method is that at the time of writing, the informatic tools to manage proteomic data generated in this manner are limited and often require investing in the vendor’s proprietary software platforms or installing open-source software for Sequential Windowed Acquisition of All Theoretical Fragment Ion Mass Spectra (SWATH-MS) data, like OpenSWATH, which can be beyond the capability of most users. Other iterations of these methods exist, including what Thermo terms parallel reaction monitoring, and all of these methods have significant advantages over DDA methods. Additionally, these approaches do not suffer from the limitations of the labeling chemistries described briefly below. SWATH approaches have already shown utility in the study of HIV-1-infected macrophages [19, 20].
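The idea of isolating a range of masses, rather than a single precursor, can be sketched as a list of consecutive isolation windows. The precursor range (400 to 1,200 m/z) and the 25 m/z window width below are assumed values for illustration only; real methods are vendor specific and may use overlapping or variable-width windows.

```python
def dia_windows(mz_start, mz_end, width):
    """Split a precursor m/z range into consecutive fixed-width isolation windows."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        lower = upper
    return windows


def window_index(precursor_mz, windows):
    """Return the index of the window whose fragments would include this precursor."""
    for index, (lower, upper) in enumerate(windows):
        if lower <= precursor_mz < upper:
            return index
    return None  # precursor falls outside the acquired range


# Assumed acquisition scheme, not a vendor default.
windows = dia_windows(400, 1200, 25)
print(len(windows), "windows per cycle")
print("Precursor at m/z 512.3 is fragmented in window", window_index(512.3, windows))
```

Because every precursor in a window is fragmented together, the fragment spectra are mixed, which is why the previously built library of retention times and fragment ions is needed to assign signals back to peptides.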

Labeling Approaches

Isobaric Tagging

Prior to DIA, isobaric tags for relative and absolute quantitation (iTRAQ) and similar methods were used to label peptides from different conditions and combine them for separation in a single run. The principle of isobaric mass tags is that they are intact and of the same mass during the peptide-labeling step. Once the tagged peptides fragment, fragment masses that are unique to each tag are detectable, and an uncharged “balance” region is liberated along with the peptide fragments. In this way, up to eight samples can be combined, an approach referred to as “multiplexing,” and quantitated relative to one another in a single experiment with iTRAQ reagents (ABSciex). While we and others in the field have experience with these methods in the context of HIV ([2, 21, 22]), we are now using this method less frequently due to variability in sample labeling and challenges with normalization and the bioinformatics of quantitation. An attractive alternative to iTRAQ reagents is the use of tandem mass tags (TMT) from Pierce. These tags are also isobaric like iTRAQ reagents but are available in a number of different covalent chemistries, including amine-, cysteine-, and carbonyl-reactive chemistries. TMTs, like iTRAQ reagents, have also been used to study HIV-related neurological disease in synaptosomes [23]. Specialized algorithms are required at the data analysis step to ensure that samples are normalized properly, and careful consideration must be given to the reproducibility of the chemistry so that labeling is consistent between samples. The quality of the reagents also deserves consideration to avoid any degradation of the reporters. Many of the challenges associated with the use of iTRAQ reagents were addressed by Luo and Zhao from a statistical viewpoint [24]. Given the expense of the reagents and challenges with labeling and quantitation, many groups, including ours, are moving toward label-free quantitation as described above.

Stable Isotope Labeling by Amino Acids in Cell Culture

One method that merits special mention is the use of heavy amino acids for experiments involving in vitro cultures. The stable isotope labeling by amino acids in cell culture (SILAC) method is especially powerful for in vitro experiments where cells can replicate for at least seven generations to ensure uniform uptake of the label. This is accomplished by growing cell lines in a tissue culture medium that contains a heavy amino acid. This allows for the mixing of peptides from different biological samples in the same MS run. The ratios of proteins can then be determined by comparing the precursor intensities of the “light” and “heavy” forms of each peptide. A nice example of the successful application of this technique was recently published by Barrero and colleagues to examine metabolic pathways altered by HIV-1 viral protein R (Vpr) [25]. While powerful, the major drawback of this technique is that sufficient label must be incorporated to resolve the light and heavy peaks, especially for higher-charged peptides (reviewed in [26]). This can be accomplished by using LysC as a protease; however, this also results in larger peptide fragments that can be difficult to sequence. Another caveat is that cells have to be adapted to serum-free culture conditions, which may impact results. The same caveats exist as described in the introductory chapter insofar as culturing of HIV-1 and changes in host-protein composition in the progeny virions.
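The reason higher-charged peptides are harder to resolve follows from simple arithmetic: the spacing between the light and heavy peaks on the m/z axis is the mass shift divided by the charge. The sketch below assumes a single heavy lysine carrying six 13C and two 15N atoms (a shift of about 8.0142 Da); the label choice and the intensities are assumptions for illustration, not values from the studies cited above.

```python
# Assumed label: 13C6,15N2-lysine, roughly +8.0142 Da per labeled residue.
HEAVY_LYS_SHIFT_DA = 8.0142


def mz_spacing(mass_shift_da, charge):
    """Separation between light and heavy precursor peaks on the m/z axis."""
    return mass_shift_da / charge


def heavy_to_light_ratio(heavy_intensity, light_intensity):
    """Relative abundance estimated from the paired precursor intensities."""
    return heavy_intensity / light_intensity


for charge in (1, 2, 3, 4):
    spacing = mz_spacing(HEAVY_LYS_SHIFT_DA, charge)
    print(f"z = {charge}: light and heavy peaks are {spacing:.3f} m/z apart")

# Hypothetical precursor intensities for one peptide pair.
print("heavy/light =", heavy_to_light_ratio(3.0e6, 1.5e6))
```

As the charge increases, the pair moves closer together and the isotopic envelopes of the light and heavy forms are more likely to overlap.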

Informatics

Intelligent informatic approaches are essential when dealing with HIV-1. We have therefore dedicated our final chapter to HIV-1 informatics, where we will discuss these aspects in detail (Chap. 6). If the reader has followed the advice outlined in this book, then after making excellent informed choices about sample preparation, chromatography approaches, and instrumentation, they will now have reams of MS/MS data on peptides that need to be identified. The first rule of databases is that if the information is not present in a database, then it will not be found. As for HIV-1, especially in the study of polymorphisms, we address this limitation in the subsequent chapter on HIV-1 sequencing (Chap. 4) along with strategies to build appropriate databases in our HIV-1 informatics chapter (Chap. 6). For example, in our group, we have generated custom databases that contain only the entries for human taxonomy and HIV sequences. A comprehensive strategy is elegantly outlined in subsequent chapters. In the case where careful quantitative information is sought for different mutations, detailed sequence information must be generated de novo. This point is so important that we discuss it redundantly in this chapter, since many individuals will likely elect to have core facilities execute the proteomic portions of their studies and may skip subsequent chapters. While most core facilities have reasonable search approaches vetted by the reviewers of manuscripts that have been produced using primary data from the facility, many facilities will not be aware of the nuances of data analysis for HIV-1. Two suggestions for the reader are, first, to ensure that an appropriate database is constructed that will adequately cover viral sequences and, second, to obtain the search results and load them into either an institutional copy of Scaffold (Proteome Software) or a trial version from the company.
For the most part, Scaffold will take the uninformed user to an intermediate level by following standard workflows in the software package. Since the metadata from instrumentation is captured in the search results, the software will harvest these data and will help the user to generate automated reports that are acceptable to the major journals where proteomic research is published. Finally, a special mention needs to be made of the HIV-1 proteomic resources available at BioAfrica [27] (bioafrica.net), which has a comprehensive toolbox for HIV-1 bioinformatics and is an excellent starting point for learning what resources are available to the investigator.
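A combined human and HIV search database of the kind described above can be assembled by concatenating FASTA entries. The following is a minimal sketch under stated assumptions: the headers and sequences are invented placeholders, and dropping exact duplicate sequences is our illustrative choice rather than a requirement of any search engine.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    entries = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            entries[header] = ""
        elif header is not None:
            entries[header] += line
    return entries


def merge_databases(*databases):
    """Combine databases, keeping the first entry seen for each distinct sequence."""
    merged = {}
    seen_sequences = set()
    for database in databases:
        for header, sequence in database.items():
            if sequence not in seen_sequences:
                seen_sequences.add(sequence)
                merged[header] = sequence
    return merged


def to_fasta(entries):
    """Serialize a {header: sequence} dictionary back to FASTA text."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in entries.items())


# Placeholder entries; real databases would come from a curated human proteome
# and from sequencing of the viral isolate under study.
human = parse_fasta(">HUMAN_EXAMPLE_1\nMKTAYIAK\nQRQISFVK\n")
viral = parse_fasta(">HIV1_EXAMPLE_VARIANT_A\nPIVQNLQGQMV\n>HIV1_EXAMPLE_VARIANT_B\nPIVQNIQGQMV\n")
print(to_fasta(merge_databases(human, viral)))
```

Keeping each sequenced viral variant as its own entry reflects the first rule of databases: a polymorphic peptide can only be matched if a sequence containing it is present.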

Targeted HIV-1 Proteomics and the Path to Clinical Applications

Selected Reaction Monitoring

In our earlier introduction to mass spectrometry, we introduced the concept of quadrupole mass filters and the triple quadrupole (QQQ) instrument. Conceptually, we described a precursor ion being selected in the first quadrupole (Q1), the second quadrupole (Q2) being used as a collision cell, and the third quadrupole (Q3) allowing only the fragment ions specific to the entity of interest to pass through. The probability of an isobaric (same mass) precursor eluting at the same retention time from the HPLC and having the same product ion is extremely low. Thus, while the overall intensity of the signal is much lower than in traditional MS/MS, by using SRM (also known as multiple reaction monitoring or MRM), we can increase the overall signal-to-noise ratio, so that the limits of detection of most targets can be improved 20–100-fold over traditional methods. Also, since these methods are quite adaptable to the higher flow rates of HPLC systems that use larger columns and hold more material, one can often use much more starting material to improve the chance of detecting a target of interest.

Low sample abundance is a recurring theme of this book, and as the guidelines are shifting toward immediate treatment of HIV-1 patients, the chances of obtaining primary virus in great quantity are low. Therefore, SRM approaches provide us with some hope in the field that there may be a place for mass spectrometry in the clinical laboratory helping to inform treatment decisions about HIV-1-infected individuals. While we are years away from this becoming a reality, recently we have used SRM approaches to detect conserved HIV-1 peptides down to the low femtogram level on column. Theoretically, if validated, assays like this could replace expensive amplification-based assays in the clinical laboratory to determine HIV-1 viral load. As we understand more about the sequences leading to HIV-1 drug resistance, in addition to determining viral load, the possibility also exists to look for polymorphisms in viral proteins that are associated with drug resistance.

It is our strong opinion that 40 years after the development of the ELISA, we will start to see the replacement of the immunoassay with affinity-based mass spectrometry methods [28, 29]. As costs decrease for mass spectrometry and the sensitivity is increased, it is not unrealistic for affinity enrichment methods to be used with mass spectrometry detection. This is particularly true of technologies like SISCAPA, which stands for stable isotope standards and capture by anti-peptide antibodies, a term coined by Leigh Anderson, who patented the approach. Briefly, this approach uses antibodies targeting peptides generated after proteolytic cleavage along with heavy synthetic peptides used as a standard for quantitation. Much like a competitive ELISA, displacement of the heavy form of the peptide with the light form provides quantitative information on the analyte. Logical extension of the art allows for many combinations of this fundamental assay, including post-capture addition of standard or capture of native proteins with their subsequent digestion. Regardless, these types of approaches allow for the development of MS assays that could examine various posttranslational modifications of viral proteins or even allow for the quantitation of host proteins after pulldown and separation of virus particles from the blood.

Quick Start Guide for SRM

SRM assay development and assays that approximate SRM, like parallel reaction monitoring and DIA, described above, are becoming very common. Typically a minimum of two different peptides are used to build a targeted assay for a specific protein. These should be peptides that are unique to the protein of interest. The dominant product ion is typically used for quantitation with the addition of at least one or two qualifying ions (also monitored as transitions) to ensure that the relative ratios of the ions are consistent, thus reducing the chance of accidentally quantitating an isobaric species that co-eluted. For accurate quantitation of a target, heavy peptides are synthesized commercially that are shifted at least 8–12 Da and spiked into the sample at a known concentration. This mass shift is essential so that the isotopic envelope of the heavy standard doesn’t overlap with the native isotopic envelope at higher charge states. The transitions for the heavy peptides are also included along with the transitions for the native forms. Comparing the relative intensity of the heavy internal standard to the measured intensity of the target allows for quantitation. An external standard is used to ensure that the measurements are within the linear range of the detector. As with ELISAs or any other quantitative assay, dilutions may be required to bring a target into the linear range for quantitation. Due to its specificity, once developed, an SRM assay can be very fast (<5 min) and very inexpensive.
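The quantitation and qualifier-ion logic described above reduces to two small calculations. The sketch below is illustrative only: the peak areas, the 50-fmol spike, the expected qualifier ratio, and the 20% tolerance are all assumed values, and a validated assay would set its own acceptance criteria.

```python
def quantify_target(light_area, heavy_area, heavy_spike_fmol):
    """Estimate the native (light) peptide amount from the spiked heavy standard."""
    return light_area / heavy_area * heavy_spike_fmol


def qualifier_ratio_ok(quant_area, qualifier_area, expected_ratio, tolerance=0.2):
    """Check that the qualifier/quantifier ion ratio matches the expected value.

    A ratio outside the tolerance suggests a co-eluting isobaric interference.
    The 20% relative tolerance is an assumed default.
    """
    observed_ratio = qualifier_area / quant_area
    return abs(observed_ratio - expected_ratio) <= tolerance * expected_ratio


# Hypothetical peak areas for one peptide and its heavy standard.
amount = quantify_target(light_area=5000.0, heavy_area=10000.0, heavy_spike_fmol=50.0)
print(f"Estimated target on column: {amount:.1f} fmol")
print("Qualifier ratio acceptable:", qualifier_ratio_ok(1000.0, 480.0, expected_ratio=0.5))
```

In practice the same check would be applied to the heavy transitions, and the result reported only when the measurement falls within the linear range established with the external standard.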

Thinking Back to Our Experiments and Motivations

If executed properly, proteomics becomes a very powerful tool for the HIV-1 virologist. Concurrent with the time of writing, we have published the first special issue of Proteomics on the subject “Virology meets Proteomics” (Proteomics Vol. 15 (2015) No. 12). To our knowledge, this represents one of the first collective works on viral proteomics and includes two publications on HIV-1 proteomics.

In particular, one of the least explored elements of HIV-1 proteomics is the study of viral proteins and their posttranslational modifications. In work we published in the early 2000s using two-dimensional gel electrophoresis, we observed several isoelectric shifts of HIV proteins, consistent with phosphorylated forms. Also, the issue of differential cleavage products of protease is yet to be explored. What about pathogenic versus nonpathogenic viruses? Many have shown the essential role of host restriction factors in making virions noninfectious, and others have shown the role of host proteins in making the virus more infectious. Recently, we published a study examining HIV-1 acylation [30], which showed changes in cellular acylation that were impacted by HIV-1 infection. The experimental possibilities are endless. Through careful quantitation and simple mass spectrometry-based experiments, a typical researcher should be well empowered to produce reasonable amounts of materials and biological replicates and to use statistics to quantitate differences in their targets of interest. The power of this method is so great that in one experiment, we are now typically observing >1,500 host proteins in HIV-1 with as little as 5 μg of total protein using the methodologies described above. It is our hope that the history of HIV-1 proteomics and the practical guidance provided herein and in other chapters will inspire and educate scientists to become successful and contribute to this quickly growing field.

Alternative Approaches

We would be remiss not to call out elegant studies that fall under the umbrella of proteomics but use other approaches, like pulldowns or protein arrays. These very powerful technologies are more mature in areas outside of virology; however, the LaBaer group has recently shown the power of these methods for studying an array of different antiviral responses to viruses [31].

Affinity Pulldown Approaches

A major contribution in the area of IP/interaction studies has been made by Ileana Cristea’s group, which has performed elegant work using targeted pulldown-based strategies for specific proteins, looking for host proteins that interact with viral protein targets [32, 33]. Her group has used reporter constructs with tags so that one can not only pull down and examine proteins interacting with viral proteins but also examine by microscopy where these interactions are occurring. These types of affinity-based approaches have been applied to identify restriction factors involved in the control of HIV replication, like SAMHD1 [34], by using affinity tags on viral proteins. Others have also performed elegant work using viral clones that express affinity tags to pull down interacting proteins after the infection of various cell lines [35].

Antigen Presentation

Don Hunt has pioneered the concept of major histocompatibility complex (MHC) presentation for cancer [36]. The same techniques that are being used for characterizing MHC-bound peptides can also be applied to HIV-1 to potentially identify novel antigens that could be used as therapeutic vaccine targets. With new methods being developed to simultaneously profile small molecules and peptides, sample preparation requirements are becoming more streamlined and may minimize extensive processing requirements [37]. This is particularly true for HIV-1, where the virion itself contains peptides bound to class I and class II MHCs. While identification of peptides with nonspecific cleavages is a challenging informatics problem, we strongly believe that there is great utility in this method for defining how viral proteins are processed into antigens for vaccine development [37]. Informatics tools and approaches for this purpose are described in Chap. 6.

Protein Arrays

Beautiful work has been performed by the group of Bill Robinson at Stanford, showing the power of antigen arrays for antibody characterization over 10 years ago in the HIV field [38], and more recently applied to other viruses [31]. By spotting proteins to an array and characterizing their composition by mass spectrometry, this technique opens the door to understanding antibody development to various elements, either host or viral proteins or modified viral proteins. As technology improves to clone out the variable, diversity, and joining region (VDJ) rearrangements of antibodies, this method shows promise in the identification of neutralizing antibodies and targets that could contribute to the development of sterilizing vaccines [39].

Conclusions

While we are still several years away from mass spectrometry being a “black-box” type of instrument where we simply inject our sample and walk away, rapid recent advancements in mass spectrometry data acquisition and bioinformatics have taken much of the pain out of the path to success. The most fundamentally important aspect of HIV-1 proteomics, or of any proteomic success, is sample preparation and the accurate quantitation of peptides post-desalting. Subsequent chapters expand in much greater detail on strategies geared toward the measurement of different posttranslational modifications of HIV and associated proteins as well as the informatics approaches designed to enhance success.