Introduction

Characterizing the proteome of complex biological samples by LC-MS/MS-based shotgun proteomic analysis depends on a number of critical factors, including effective protein solubilization and denaturation, thorough protein-to-peptide digestion, and efficient recovery of tryptic peptides from the digestion buffer [1]. Together, these key steps determine the overall extent of proteome coverage, and each is known to be critically influenced by the buffer used during sample preparation. The enzyme most commonly used for sample digestion is trypsin, which cleaves proteins on the C-terminal side of lysine and arginine residues [2]; hence, buffer selection is typically restricted to reagents that are compatible with this enzyme. While several different detergents, chaotropes, and surfactants have been tested as protein-solubilizing agents, very few of these are fully compatible with shotgun proteomic sample preparation. Ideally, the proteome-solubilizing agent should also be capable of improving protein digestion and enhancing tryptic peptide recovery, thereby enabling high-confidence identification of proteins and post-translational modifications (PTMs), as well as enhancing the performance of multiple reaction monitoring in targeted quantitative proteomic analysis [3–6].

The protein-solubilizing agent sodium deoxycholate (SDC) may be capable of significantly improving proteome coverage because it is fully compatible with the sample preparation process and is easily removed from the digestion reaction via acid precipitation (AP) [7–9]. More importantly, SDC is able to enhance trypsin activity up to five-fold when used at concentrations ranging from 0.01 to 1 % [7], yielding an overall quantity of tryptic peptides comparable with conventional urea-based methods [10, 11]. Removal of SDC from in-solution digestion (ISD) preparations can be achieved not only by AP but also by phase transfer (PT) to a non-miscible organic solvent phase added to the tryptic-digested sample [7]. When these techniques were directly compared during the analysis of rat liver mitochondria-enriched fractions isolated by ISD:SDC-based processing/extraction, the authors observed that the AP and PT methods identified comparable numbers of unique proteins, but the PT protocol identified ∼11 % more unique peptides [10].

SDC has previously been reported to offer excellent denaturing and solubilizing performance when compared with other protease digestion enhancers including detergents, surfactants, bile salts, chaotropes, and various organic solvents [7, 10]. Indeed, in previous comparisons, SDC and RapiGest were determined to exhibit the highest solubilizing capacities of all MS-compatible reagents tested [7, 10]. While the strong ionic detergent sodium dodecyl sulfate (SDS) displays very high solubilizing capacity, this reagent is not compatible with MS [7]. The filter-aided sample preparation (FASP) method introduced by Mann and colleagues represents a universal strategy for sample processing that combines the capacity to remove low molecular weight components, one of the main virtues of in-gel digestion (IGD), with the robustness of ISD. FASP is a versatile strategy because it allows ISD processing of samples prepared in special and generally MS-incompatible buffers, such as SDS, making their use possible in ESI-MS systems. SDC inclusion in FASP strategies has also been proposed [10, 12], showing better performance when SDC is coupled to SDS rather than urea [10]. Additionally, an enhanced FASP (eFASP) method using PT removal of deoxycholic acid was introduced and showed increased tryptic digestion efficiency for cytosolic and membrane proteins as compared to urea [12].

The potential utility of SDC for proteomic sample preparation has been studied intensively with respect to the analysis of moderately complex biological samples such as membrane protein-enriched fractions. However, the composition of the acid-precipitated SDC pellet for complex biological samples has yet to be determined in detail, despite the fact that this often-discarded material may contain co-precipitated tryptic peptides with potential to substantially increase proteome coverage. We therefore sought to determine whether unique tryptic peptides could be successfully recovered from the acid-precipitated SDC pellet of human plasma samples and assessed the potential impact of this approach on total proteome coverage. We now report an optimized alternative to urea-based ISD/sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE)-based in-gel digestion (IGD) methods that represents a highly efficient novel approach for the characterization of complex proteomic samples by LC-MS/MS.

Materials and methods

Reagents and chemicals

All reagents were purchased from Sigma-Aldrich (St. Louis, MO, USA) unless otherwise indicated. Proteomics sequencing-grade modified trypsin was from Promega (Madison, WI, USA). Water (HPLC grade) and acetonitrile (ACN, HPLC grade) were purchased from Thermo Scientific Inc. (Bremen, Germany).

Human plasma samples

Plasma samples were from anonymous human subjects recruited via Tan Tock Seng Hospital, Singapore, and stored at −150 °C until use. Seven plasma samples (100 μL each) were pooled (to reduce biological variation) and then used for optimization of the SDC-assisted ISD protocol. Twenty-seven additional plasma samples (25 μL each) were then randomly divided into three groups (nine subjects per group) for assessment of the optimized method. The study was approved by the Research Ethics Committee of Tan Tock Seng Hospital and Nanyang Technological University, Singapore. Written informed consent was obtained from all study participants (n = 34).

Standard SDC-assisted and urea-assisted in-solution tryptic digestions

Conditions used for trypsin digestion of the plasma samples are summarized in Table 1. Briefly, plasma samples were processed according to the method described by Proc et al. [11], with minor modifications. Urea-based sample processing was carried out according to the protocol described by León et al. [10], with minor modifications.

Table 1 Summary of the reduction, alkylation, and digestion conditions used for each of the methods tested

Standard SDS in-gel tryptic digestion

SDS denaturation of proteins was coupled with IGD and performed as previously described (protocol workflow described in Table 1) [13, 14]. Proteins were resolved on a 12 % SDS-PAGE gel, which was subsequently stained with Coomassie Brilliant Blue. Protein bands were excised from the gel and diced into 1-mm² pieces prior to reduction, alkylation, and digestion.

Optimized SDC-assisted in-solution tryptic digestion

Reduction, alkylation, and tryptic digestion of proteins were performed as previously described [10], with minor modifications (Table 1). Briefly, pooled plasma (25 μL) was diluted 10-fold with 100 mM ammonium acetate (AA) and then denatured under reducing conditions using 1 % w/v SDC and 10 mM dithiothreitol (DTT) for 30 min at 60 °C. Samples were then alkylated with 20 mM iodoacetamide (IAA) in the dark for 45 min. Prior to tryptic digestion, samples were diluted two-fold with 100 mM AA and 10 mM DTT and incubated at 37 °C for 30 min. Proteins were digested enzymatically overnight at 30 °C using sequencing-grade modified trypsin at a trypsin-to-substrate ratio of 1:50 (w/w). The enzymatic reaction was quenched and the SDC precipitated simultaneously by addition of formic acid (FA) to a final concentration of 0.5 %. To further optimize SDC precipitation and peptide recovery, a sequential extraction of peptides from the SDC pellet was included (Fig. 1). The SDC precipitate was pelleted by centrifugation at 12,000 × g for 10 min, and the resultant supernatant (S) was collected into a new tube. The SDC pellet was then treated as follows: the pellet was re-dissolved in 600 μL 0.5 % ammonium hydroxide, precipitated by addition of 0.5 % FA, and pelleted by centrifugation at 12,000 × g for 10 min. The washing process was repeated twice, resulting in two additional supernatants, P1 and P2 (named after their pellets of origin). Samples were then desalted using Sep-Pak C18 cartridges (Waters, UK), and the eluted peptides were dried using a vacuum concentrator. P1 and P2 fractions were analyzed separately during the protocol optimization process and subsequently combined to test the final optimized version of the protocol.

Fig. 1 Flow chart outlining the novel SDC-assisted in-solution tryptic digestion method. PX represents the fraction obtained after completing either one (P1) or two (P2) washes of the SDC pellet

LC-MS/MS

Peptides were desalted using a Sep-Pak 50-mg C18 cartridge (Waters, Milford, MA, USA) and eluted with 75 % ACN and 0.1 % FA prior to LC-MS/MS analysis. Eluted peptides were first concentrated using a vacuum concentrator (Eppendorf, Hamburg, Germany) and reconstituted with 3 % ACN and 0.1 % FA. Peptides (0.5 μg) were separated and analyzed with a Dionex UltiMate 3000 UHPLC coupled to a linear quadrupole ion trap-Fourier transform Ultra apparatus (LTQ-FT Ultra, Thermo Scientific Inc.). Separation was performed using a reversed-phase Acclaim PepMap RSL column (75 μm ID × 15 cm, 2 μm particle size, Thermo Scientific Inc.). Mobile phases were 0.1 % FA (phase A) and 80 % ACN with 0.1 % FA (phase B). Separation of samples was performed in a 240-min gradient of 3–6 % B for 2 min, 6–30 % for 208 min, 30–54 % for 14 min, 54–72 % for 1 min, 72 % for 5 min, 72–5 % for 2 min, and then initial isocratic conditions for 8 min. On-line ionization was performed using a Michrom CaptiveSpray ion source (Bruker-Michrom Inc., Auburn, USA) at an electrospray potential of 1.5 kV and an ion transfer tube temperature of 180 °C. Data acquisition was conducted in centroid and positive mode (350–1600 m/z range) in the FT-ICR cell at a resolution of 100,000 and maximum injection time of 1000 ms using Xcalibur version 2.0 SR2 (Thermo Scientific Inc., Bremen, Germany). The automatic gain control (AGC) target for FT was set to 1.0e+06, and precursor ion charge state screening was enabled. The 10 most intense ions with a 500-count threshold were fragmented by collision-induced dissociation, with a normalized collision energy of 35 %, activation Q of 0.25, isolation width for precursor ions of 2 Da, and activation time of 30 ms. MS/MS data were acquired in the linear ion trap with AGC target for full MS of 3.0e+04 and MSn of 1.0e+04 with maximum injection time of 200 ms. Peptide ions with 1+ charge were excluded from fragmentation. The dynamic exclusion list was enabled with a repeat count of 1 and exclusion duration of 40 s.
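The gradient program described above can be written out as a small table to confirm that the segments add up to the stated 240-min run. The following is a minimal sketch only: the segment values are transcribed from the text, while the names `GRADIENT`, `total_runtime`, and `percent_b_at` are hypothetical and linear interpolation within each segment is an assumption.

```python
# Gradient segments as (start %B, end %B, duration in min), transcribed from the text.
# The final segment is re-equilibration at the initial isocratic conditions,
# so its %B is left unspecified here (None).
GRADIENT = [
    (3, 6, 2),
    (6, 30, 208),
    (30, 54, 14),
    (54, 72, 1),
    (72, 72, 5),
    (72, 5, 2),
    (None, None, 8),
]


def total_runtime(segments):
    """Return the summed duration (min) of all gradient segments."""
    return sum(duration for _, _, duration in segments)


def percent_b_at(t, segments):
    """Return %B at time t (min), assuming linear ramps within each segment.

    Returns None during the final re-equilibration segment.
    """
    elapsed = 0
    for start, end, duration in segments:
        if t <= elapsed + duration:
            if start is None:
                return None
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    raise ValueError("t is beyond the end of the run")


if __name__ == "__main__":
    print(total_runtime(GRADIENT))      # total run time in min
    print(percent_b_at(106, GRADIENT))  # %B halfway through the main ramp
```

Because phase B is 80 % ACN, the actual organic content at any time point is 0.8 times the %B value returned here.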

Bioinformatics and data analysis

Raw MS/MS data were converted into Mascot generic format files using Thermo Proteome Discoverer software (version 1.4.1.14, Thermo Fisher Scientific Inc.). The concatenated target-decoy UniProt human database (downloaded on 29 October 2013; 176,946 sequences and 70,141,034 residues) was used for data searching. Database searching and carbamylation analysis were performed using an in-house Mascot server (version 2.3.02, Matrix Science, Boston, MA, USA) with precursor MS and MS/MS tolerances of 10 ppm and 0.8 Da, respectively. Up to two missed tryptic cleavage sites per peptide were tolerated unless specified otherwise. Deamidation (Asn) and oxidation (Met) were set as variable modifications. The peptide/protein list obtained by Mascot was processed for further analysis using an in-house script together with Excel. Peptides with a Mascot score >20 were used to generate the peptide/protein list for determination of false discovery rates (FDR = 2.0 × decoy hits/total hits). Identifications sharing the same sequence were counted as a single unique peptide. The grand average of hydropathy (GRAVY), charge, and isoelectric point (pI) of the identified peptides were calculated using in-house software. Venn diagrams were constructed using Venny 2.0.2.
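The FDR estimate defined above (FDR = 2.0 × decoy hits/total hits, applied to peptides with a Mascot score >20) can be illustrated with a short sketch. This is not the in-house script used in the study; the function names and the record layout (dictionaries with `score` and `is_decoy` keys) are hypothetical.

```python
def target_decoy_fdr(decoy_hits, total_hits):
    """FDR for a concatenated target-decoy search: 2.0 x decoy hits / total hits."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return 2.0 * decoy_hits / total_hits


def fdr_above_score(hits, min_score=20.0):
    """Estimate FDR among hits whose score exceeds min_score.

    `hits` is an iterable of dicts with hypothetical keys:
    'score' (float) and 'is_decoy' (bool).
    """
    kept = [h for h in hits if h["score"] > min_score]
    decoys = sum(1 for h in kept if h["is_decoy"])
    return target_decoy_fdr(decoys, len(kept))


if __name__ == "__main__":
    # Made-up example hits, for illustration only.
    example = [
        {"score": 45.0, "is_decoy": False},
        {"score": 31.2, "is_decoy": False},
        {"score": 22.5, "is_decoy": True},
        {"score": 60.1, "is_decoy": False},
        {"score": 15.0, "is_decoy": True},  # below threshold, ignored
    ]
    print(fdr_above_score(example))
```

The factor of 2.0 reflects the usual assumption that, in a concatenated search, each decoy hit implies roughly one false hit among the targets as well.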

ANOVA with Bonferroni correction for multiple comparisons was performed using GraphPad Prism 6 (GraphPad Software, Palo Alto, USA), and only p values <0.01 were considered significant. Results are reported as mean ± standard error of the mean (SEM).

Gene enrichment analysis was performed with FunRich 2.1.2 [15].

Proteomics data deposition

The mass spectrometry proteomics data have been submitted to the ProteomeXchange Consortium [16] via the PRIDE partner repository under the dataset identifier PXD002811.

Results and discussion

Comparison of classical proteomics methods for digestion of plasma samples

To evaluate the efficiency of the most common buffers used for shotgun proteomic sample preparation, we performed a systematic comparative analysis of human plasma sample processing using urea, SDS, or SDC. To do this, we used either 8 M urea or 1 % SDC for protein denaturation/ISD or performed a conventional SDS-based IGD protocol. In accordance with previous studies [17], we observed that the standard urea-assisted tryptic digestion method provided the highest proteome coverage, closely followed by the ionic detergent SDC (see Electronic Supplementary Material (ESM) Fig. S1).

Although several surfactants have been proposed as effective reagents for membrane protein denaturation and solubilization, the most suitable detergents, SDS and CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), are not compatible with most shotgun proteomic protocols [18]. However, since SDS is one of the most efficient known solubilizing agents [7], there have been numerous attempts to remove this detergent from ISD preparations in order to render these suitable for MS analysis, most notably filter-aided sample preparation [19, 20] and off-line cation-exchange chromatography [21]. More commonly, SDS is used for polyacrylamide gel electrophoresis of proteins (SDS-PAGE), so we included this technique in our comparison of classical sample processing methods and sought to determine the relative efficiency of IGD versus that of ISD. The number of unique proteins identified using SDS-assisted IGD was comparable to the ISD protocols, but the number of unique peptides identified was substantially lower (47.4 % of the total unique peptides identified using the standard urea-based protocol), likely due to extensive peptide loss during sample extraction from the polyacrylamide gel [17, 22].

The routine use of SDC in MS sample preparation provides several major advantages over other more common techniques, and reportedly offers an efficient alternative to urea/SDS-based approaches to the analysis of moderately complex biological samples, such as cell culture-derived mitochondrial proteomes [10]. However, since current SDC-assisted ISD protocols deliver proteome coverage only similar to or lower than that obtained by urea, these techniques have been regarded as ill-suited to the analysis of highly complex human samples.

Optimization of SDC-assisted in-solution tryptic digestion

When combined with a routine AP procedure, the use of SDC to prepare human plasma samples for shotgun proteomic analysis resulted in the identification of fewer unique proteins and peptides than did the standard urea-assisted ISD method. While this could be taken to indicate poor protein/peptide recovery following SDC-based processing, Masuda et al. [7] have previously conjectured that unique peptides could in fact be co-precipitated with SDC during the AP process, and Lin et al. found co-precipitation of peptides in membrane protein samples [23]. We therefore hypothesized that, when SDC is used to process complex samples for shotgun proteomics, a substantial proportion of novel peptides is lost because these peptides remain bound to the surfactant during SDC precipitation at low pH, and that they could be recovered from the SDC pellet by performing a sequential re-solubilization and precipitation procedure.

In order to study the co-precipitation of peptides with SDC, we tested whether this detergent could increase peptide recovery from ISD/AP-processed samples when using two different reaction buffers (four replicates per condition): 25 mM ammonium bicarbonate (ABB) buffer (SDCABB), which was used according to the methodology reported by Proc et al. [11], and 100 mM AA buffer (SDCAA), which we have previously shown to reduce experimentally induced deamidation and achieve optimal identification of unique peptides [24]. Additional comparisons were made with urea-assisted digestions performed in ABB (ureaABB) and AA (ureaAA) buffers using two replicates per condition. For all experiments performed in this study, FDR distribution was plotted against pep_expect distribution (ESM Figs. S2 and S3). Pep_expect < 0.05 (as generated by Mascot) was set as the threshold for peptide identification since this was found to be more stringent than the application of a 1 % FDR threshold.

Standard SDC protocols analyze only the first supernatant obtained after acid precipitation [10, 25, 26], which in our experiments yielded an average of 276 identified proteins. No significant differences were detected between the two buffers tested (ABB and AA), and the number of proteins identified in the supernatant fraction (S) was comparable to the urea-assisted ISD performed in ureaAA buffer (257 proteins identified; Fig. 2A). Consistent with the findings of previous studies [24], the number of proteins recovered from the ureaABB-processed sample was significantly lower, with just 228 proteins being identified. The total number of peptides identified in fraction S was comparable between the ureaAA and ureaABB buffers (∼4000 peptides each).

Fig. 2 Optimization of SDC-assisted in-solution tryptic digestion. SDC-assisted in-solution tryptic digestion was performed using either ammonium acetate (SDCAA) or ammonium bicarbonate (SDCABB) and compared with urea-assisted in-solution tryptic digestions performed under the same conditions (ureaAA and ureaABB, respectively). A Number of proteins identified using the tested methodologies. For SDC-assisted protocols, proteins in fractions S, P1, and P2 were identified separately. No significant differences were detected between S and ureaAA. B Number of total peptides identified using the tested methodologies. There were no significant differences in the total number of peptides identified when using buffers AA and ABB, but SDC significantly increased peptide identification relative to urea (p < 0.01). C Number of unique peptides identified by the SDCAA protocol (total obtained by combining replicates). Unique peptides identified in fractions S, P1, and P2 were filtered to remove peptides commonly found in more than one fraction. D Venn diagram showing overlap between unique peptides identified in fractions S, P1, and P2 (totals obtained by combining replicates). E Analysis of low-abundance plasma proteins (LAPP; ng/mL concentration or lower) identified in S and P1 + P2 fractions. Identification of LAPP was carried out by comparison with the low-abundance plasma proteins (<ng/mL) present in the Plasma Proteome Database. High-abundance plasma proteins are indicated as HAPP. Four replicates were considered in this analysis. The Venn diagram shows the averaged number of proteins. F Estimation of technique reproducibility as calculated by evaluating peptide overlap between four replicates for the supernatant fraction (S) and pellets (P1 + P2)

We next analyzed the composition of the SDC pellet in order to test our hypothesis that a substantial subset of peptides binds and interacts with the ionic detergent and can therefore be co-precipitated during AP treatment. Indeed, using a sequential re-solubilization and precipitation procedure, we were able to successfully identify more than 230 proteins from the peptides that co-precipitated with SDC (Fig. 2A). Independent analysis of fractions P1 and P2, when combined with the analysis of fraction S, significantly increased the number of peptides identified to a total of 13,703 (averaged from SDCAA and SDCABB) (Fig. 2B). No significant differences were detected between the two buffers tested with respect to the number of peptides identified in fractions P1 and P2. Although the buffer used did not affect the proteome coverage achieved by SDC-assisted ISD, we considered SDCAA to be optimal due to the reduced incidence of experimentally induced deamidation in AA buffer [24].

When we proceeded to test the optimized SDCAA protocol, we identified a total of 13,376 peptides, of which 1728 were unique. Of these unique peptides, 1062 were derived from the supernatant (S), 495 were recovered during the first washing step (P1), and 171 were obtained during the second washing step (P2) (Fig. 2C). ESM Data Sets 1 and 2 contain the lists of total peptides identified from SDCAA and SDCABB, respectively, and ESM Data Set 3 contains the list of total peptides identified from ureaAA and ureaABB for each replicate. By comparing peptidome composition between fractions, we confirmed that 406 unique peptides were present in all three fractions analyzed (S, P1, and P2) (Fig. 2D).
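A comparison of peptide composition between the three fractions, of the kind summarized in a three-way Venn diagram, can be sketched with plain set operations. The function name `venn3_regions` and the example sequences are hypothetical; this is an illustration of the bookkeeping, not the Venny-based analysis used in the study.

```python
def venn3_regions(s, p1, p2):
    """Count unique peptide sequences in each region of a three-set Venn diagram."""
    s, p1, p2 = set(s), set(p1), set(p2)
    return {
        "S only": len(s - p1 - p2),
        "P1 only": len(p1 - s - p2),
        "P2 only": len(p2 - s - p1),
        "S+P1": len((s & p1) - p2),
        "S+P2": len((s & p2) - p1),
        "P1+P2": len((p1 & p2) - s),
        "all": len(s & p1 & p2),
    }


if __name__ == "__main__":
    # Made-up peptide sequences, for illustration only.
    fraction_s = ["LVNEVTEFAK", "AEFAEVSK", "YLYEIAR", "DLGEENFK"]
    fraction_p1 = ["AEFAEVSK", "YLYEIAR", "KVPQVSTPTLVEVSR"]
    fraction_p2 = ["YLYEIAR", "DLGEENFK", "HPYFYAPELLFFAK"]
    for region, count in venn3_regions(fraction_s, fraction_p1, fraction_p2).items():
        print(region, count)
```

Converting each list to a set first collapses repeated identifications of the same sequence, so each peptide is counted once per fraction.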

The unique peptides recovered from the SDC pellets represented a 63.5 % increase in the total number derived from fraction S (Table 2). Similarly, the total number of proteins identified was increased by 67.7 % following sequential re-solubilization and precipitation of the SDC pellets. This level of efficiency compared favorably with the reported improvement in protein/peptide recovery attained using PT-assisted techniques; Masuda et al. [7] achieved a 57 % improvement in total proteins recovered and a 48.3 % improvement in total peptides recovered from a membrane protein-enriched fraction derived from Escherichia coli (SDC-ISDPT vs. SDC-ISDAP, shown in Table 2). Given the membrane-bound origin of the sample tested by Masuda et al., this likely included many hydrophobic peptides that were susceptible to being co-precipitated with SDC using a conventional AP method. In contrast, when the same methodology was applied to a rat mitochondrial fraction more closely resembling the normal cellular proteome, the difference between AP and PT was smaller, with just a 0.38 % increase in the total number of proteins identified and a 13.0 % improvement in total peptide recovery (León et al. [10]; SF-SDC-ISDPT vs. SF-SDC-ISDAP, shown in Table 2). Again, when PT removal of SDC was coupled to spin filter-aided ISD, there was only modest improvement in the recovery of either proteins (11.6 %) or peptides (14.3 %), likely due to the intensive sample handling required to apply this methodology.

Table 2 Comparison of standard and optimized SDC-assisted ISD protocols (standard SDC-ISD, optimized SDC-ISD) with the detergent removal methodologies reported by León et al. (SDC-ISD (AP), SDC-ISD (PT)). The spin filter-aided versions of the detergent removal protocols are included for comparison (SF-ISD:SDC (AP), SF-ISD:SDC (PT))

We next evaluated the reproducibility of the SDCAA sample processing method by assessing peptide overlap between four independent sample preparations (Fig. 2F). Our optimized SDC-based method exhibited peptide overlaps of 94, 89, and 93 % for the fractions S, P1, and P2, respectively, thus displaying an excellent level of reproducibility comparable to that reported for PT-assisted detergent removal methods [7]. We therefore proceeded to assess the ability of our SDC-based method to recover low-abundance plasma proteins (LAPP) from the combined fractions P1 + P2. To do this, the list of all unique proteins identified in S and P1 + P2 across four technical replicates was assessed for the presence of proteins reported in the Plasma Proteome Database [27] at the nanogram-per-milliliter level or lower. Using this approach, we determined that fraction S contained 76 % of the total LAPP identified by SDCAA and that the extraction of peptides co-precipitated with SDC allowed us to identify an additional 30 % of LAPP. Both fractions shared 45 % of the total LAPP (Fig. 2E and ESM Data Set 4). From the peptides that co-precipitated with SDC, 135 unique proteins were identified, compared with 82 in fraction S. Taken together, these results demonstrate the importance of extracting the peptides that co-precipitate with SDC: this simple procedure allows the identification of a significant subset of proteins that are overlooked by standard AP and significantly expands the proteome coverage obtained with SDC in shotgun proteomic plasma biomarker research. Although MS technology has evolved rapidly in recent decades, biomarker research is still limited by the wide dynamic range of the human plasma proteome, which contains proteins varying in concentration by an estimated 10 orders of magnitude [28, 29]. Since the top 10 most abundant plasma proteins represent ∼90–95 % of total protein content, it has become commonplace in MS-based studies to reduce the dynamic range of the samples by first fractionating these using methods including chromatography, gel-based electrophoresis, and immunoaffinity depletion of high-abundance plasma proteins (HAPP) [30]. It is important to note, therefore, that the SDC-based ISD method we report not only enhances trypsin performance and maximizes proteome coverage but also acts as a fractionation strategy that can help to reduce the dynamic range of the blood plasma proteome, thereby improving the identification of low-abundance plasma proteins of potential interest to biomarker researchers.

Physicochemical properties of identified peptides derived from SDC-assisted in-solution tryptic digestion

We next sought to better characterize the peptides obtained using our optimized SDC method by analyzing their physicochemical properties, including GRAVY, pI, and peptide charge (Fig. 3). Supernatant S contained a high abundance of hydrophobic proteins (GRAVY score > 0.5), which was consistent with the conventional use of SDC for the analysis of membrane-derived proteins (Fig. 3A) [7, 8, 31]. However, hydrophilic proteins (GRAVY score < −0.5) were also enriched in this fraction. According to the pI distribution, peptides derived from proteins with low pI (<7) were easily extracted in fraction S (Fig. 3B), which also contained a high number of low- or negatively charged peptides (charge < +1).
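For readers unfamiliar with the GRAVY score used here, the following minimal sketch shows how it is conventionally obtained: the mean of per-residue hydropathy values over the sequence. The Kyte-Doolittle scale is assumed, since the text does not state which scale the in-house software used; the function name `gravy` and the example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
# (assumed scale; the in-house software may differ).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Grand average of hydropathy: mean hydropathy value over all residues."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


if __name__ == "__main__":
    score = gravy("LVNEVTEFAK")  # made-up example input
    if score > 0.5:
        label = "hydrophobic"
    elif score < -0.5:
        label = "hydrophilic"
    else:
        label = "mildly hydrophilic/hydrophobic"
    print(round(score, 3), label)
```

The ±0.5 cut-offs in the usage example mirror the thresholds quoted in the text; a sequence containing a non-standard residue raises `KeyError` rather than being silently skipped.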

Fig. 3 Physicochemical properties of peptides identified using the SDCAA protocol. A Hydropathic analysis of peptides identified in fractions S and P1 + P2. B The isoelectric point distribution of all peptides identified in fractions S and P1 + P2. C Peptide charge distribution of all peptides identified in fractions S and P1 + P2. Peptide net charge was calculated based on the number of constituent lysine (+), arginine (+), aspartic acid (−), and glutamic acid (−) residues, as well as the presence of an N-terminal amino group (+) and C-terminal carboxylic group (−)
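The residue-counting rule for peptide net charge given in the caption above can be expressed in a few lines. This is a sketch of that counting rule only, with a hypothetical function name `net_charge`; it ignores pH-dependent protonation and residues such as histidine, which the caption does not list.

```python
def net_charge(peptide):
    """Net charge by residue counting, as described in the Fig. 3 caption.

    +1 for each Lys (K) and Arg (R) and for the N-terminal amino group;
    -1 for each Asp (D) and Glu (E) and for the C-terminal carboxyl group.
    """
    peptide = peptide.upper()
    positive = peptide.count("K") + peptide.count("R") + 1  # +1: N-terminus
    negative = peptide.count("D") + peptide.count("E") + 1  # +1: C-terminus
    return positive - negative


if __name__ == "__main__":
    # Made-up example sequences, for illustration only.
    for seq in ["LVNEVTEFAK", "KVPQVSTPTLVEVSR", "DLGEENFK"]:
        print(seq, net_charge(seq))
```

Under this rule the two termini cancel, so the result reduces to the count of K and R minus the count of D and E; the termini are kept explicit to match the caption.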

The peptides obtained from combined fractions P1 + P2 would typically be disregarded when conducting a single SDC/AP protocol [10, 25, 26], but our physicochemical analyses revealed the presence of multiple mildly hydrophilic/hydrophobic proteins in that fraction (GRAVY value from −0.5 to 0.5). Proteins with high pI values (8–13) that were not successfully recovered in the first supernatant (S) were instead efficiently enriched in the P1 + P2 fraction. According to the distribution of peptide charges shown in Fig. 3C, peptide recovery from the SDC pellet increased the percentage of highly charged peptides obtained (including +2-charged peptides, which correspond to the average net charge of complex tryptic samples generated at pH 2) [32].

When comparing all the parameters assessed here (GRAVY values, pI, and peptide charge), the optimized SDC-assisted ISD was found to be well-suited to the analysis of all major protein types, thus offering excellent overall proteome coverage (rather than enriching for membrane proteins only, as suggested by previous reports). Indeed, the protein compositions of SDC fractions S, P1, and P2 were complementary, and the optimized protocol acted as an effective fractionation strategy for improving the identification of low-abundance proteins and long peptides by efficiently separating them from highly abundant soluble peptides.

Miscleavage of peptides

The efficiency of trypsin cleavage and generation of nonstandard protein products can be influenced by many intrinsic properties of the peptides themselves, particularly the frequency/distribution of lysine and arginine residues along the backbone [33]. Consequently, proteolysis of proteins during sample processing is often incomplete, and peptides with missed cleavage sites are common in tryptic samples. We therefore assessed the extent of peptide miscleavage and the length variability associated with our optimized SDC approach (Fig. 4). Miscleavage rates for the urea- and SDC-assisted ISDs were both ∼20 %, in line with the findings of León et al. [10]. According to our data, the majority of tryptic miscleaved peptides were co-precipitated during the first AP process, possibly because the longer peptides remained partially bound to the surfactant and readily pelleted together with SDC. This is consistent with our observation that longer peptides were more frequently identified in the combined P1 + P2 fraction (Fig. 4A), and may explain why previous investigations, which restricted their analyses to the first supernatant S, detected only low levels of peptide miscleavage when using SDC [10].
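Counting missed cleavage sites in an identified peptide amounts to counting internal lysine or arginine residues. The sketch below illustrates this; the function name `count_missed_cleavages` is hypothetical, and the optional rule that K/R followed by proline is not a cleavable site is an assumption that may not match the search settings used in this study.

```python
def count_missed_cleavages(peptide, proline_rule=True):
    """Count internal K/R residues that trypsin could have cleaved after.

    The C-terminal residue is excluded because it is the site that was cleaved.
    With proline_rule=True (an assumption), K/R followed by P is not counted.
    """
    peptide = peptide.upper()
    missed = 0
    for i, residue in enumerate(peptide[:-1]):
        if residue in "KR":
            if proline_rule and peptide[i + 1] == "P":
                continue
            missed += 1
    return missed


def miscleavage_rate(peptides, proline_rule=True):
    """Fraction of peptides that contain at least one missed cleavage site."""
    peptides = list(peptides)
    if not peptides:
        raise ValueError("no peptides given")
    miscleaved = sum(1 for p in peptides if count_missed_cleavages(p, proline_rule) > 0)
    return miscleaved / len(peptides)


if __name__ == "__main__":
    # Made-up example sequences, for illustration only.
    example = ["LVNEVTEFAK", "KVPQVSTPTLVEVSR", "AEFAEVSK", "YLYEIAR", "DLGEENFK"]
    print([count_missed_cleavages(p) for p in example])
    print(miscleavage_rate(example))
```

Applied separately to the peptides of fraction S and of P1 + P2, a tally of this kind gives the per-fraction distribution of zero, one, two, or more missed sites.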

Fig. 4 Characterization of peptides in fractions S and P1 + P2 using the SDCAA protocol and gene enrichment analysis. A Peptide length distribution for all peptides identified in fractions S and P1 + P2. B Graph showing incidence of miscleaved sites in the SDC fractions (S and P1 + P2) compared with urea (ureaAA). Peptides with up to five miscleaved sites were considered. C Functional gene enrichment analysis based on cellular components

When the recovered peptides were analyzed and classified according to the number of missed cleavage sites (up to five max. per peptide), we observed no significant differences between the urea-assisted ISD and fraction S data (Fig. 4B). However, the combined P1 + P2 fraction exhibited a higher incidence of peptides with one or two miscleaved sites, as well as a significantly lower number of fully cleaved peptides. These data indicated that analysis of miscleaved and long peptides in the fractions P1 and P2 is required in order to perform robust structural characterization of the plasma proteome.

Gene enrichment study

We next performed a gene enrichment analysis in order to categorize the proteins identified in the pellet fraction. The subset of peptides obtained by re-precipitation and solubilization of the SDC pellet was observed to improve the identification of membrane and cytoplasmic proteins (Fig. 4C). In contrast, extracellular proteins were more abundant in fraction S and ureaAA, consistent with the high solubility of these proteins in common digestion buffers [34]. Additionally, analysis of the combined fraction P1 + P2 enhanced the identification of proteins derived from membrane-bound organelles, including both nuclear and exosomal proteins.

Concluding remarks

In the current study, we confirmed that unique peptides co-segregate with the detergent SDC during acid precipitation of digested plasma samples and that these peptides can be efficiently recovered from the pellet by sequential re-solubilization and precipitation. Analysis of the peptides obtained from the SDC pellet significantly increased the total number of unique peptides detected in human plasma samples; hence, optimization of this SDC-based approach represents a substantial methodological advance in the field of shotgun proteomics for complex biological samples. Indeed, by acting as a de facto fractionation strategy, our SDC protocol also efficiently reduced the dynamic range of the human plasma proteins under study, and is therefore well-suited to the preparation of complex biological samples for biomarker discovery studies.