Introduction

Prostate Cancer

Worldwide, prostate cancer (PCa) is the second most commonly diagnosed cancer and the sixth most common cause of cancer death amongst men [1]. Within the USA alone, 238,590 new cases and 29,720 PCa deaths were recorded in 2013 [2]. The principal problem arising from PCa is its propensity to metastasise. PCa preferentially metastasises to the bone marrow of the axial skeleton, and it is these metastases that are the major cause of PCa-associated morbidity and mortality [3, 4].

However, it is clear that not all PCa lesions progress towards life-threatening disease. Early studies by Franks [5] and more recently by Thompson et al. [6] showed a universally high incidence of microscopic PCa lesions in young men. This incidence does not, however, translate into clinically significant disease or progression towards metastatic disease and death. Autopsy studies have shown that latent disease can be detected in up to 30% of 50-year-old men, increasing to 75% in men over 80 [7]. The clinical relevance of this latent disease is currently unknown, and deciding which tumours will become significant, with a risk of metastasising, remains one of the predominant diagnostic challenges facing urologists today. This variability in PCa leads to a significant level of uncertainty in PCa prognostication, with subsequent overtreatment of the disease.

Epidemiological studies have shown that clinical incidence and the rate of progression to metastatic disease are greater in western societies than in developed non-western countries such as Japan [8]. It should be noted, however, that adoption of a western lifestyle by non-western men, usually through migration to a western country, has been associated with increased disease prevalence and an increased risk of aggressive disease [9]. This has led to the proposition that environmental factors, including diet, rather than the underlying genetics, are the drivers of disease progression.

PCa is usually asymptomatic until it advances, with most tumours being found either incidentally or through routine screening and health checks. Currently, the most common indication of disease is either an abnormal digital rectal examination (DRE) or a raised prostate-specific antigen (PSA) test. However, both of these tests lack specificity and sensitivity. Although it has been reported that approximately one-fifth of all PCa tumours are detected by DRE, DRE has a sensitivity and specificity of only 52% and 81%, respectively [10].

PSA

Prostate-specific antigen (PSA) is a 33 kDa protein secreted by the prostate epithelium and is an androgen-regulated serine protease involved in semen liquefaction and spermatozoa release [11]. PSA is detectable in high concentrations (mg/ml) within the semen and can be measured within the serum, although at lower concentrations (ng/ml). Although PSA is the current gold standard for PCa detection, the PSA test has significant problems. The precise threshold at which a biopsy should be triggered is unclear. The Prostate Cancer Prevention Trial (PCPT) reported that up to 33.5% of men with a PSA score of <4.0 ng/ml, a value often used as a cut-off for no disease, actually had PCa [12]. A recent meta-analysis demonstrated that 13% of patients had false-positive PSA tests, leading to 5.5% having unnecessary biopsies. The reported sensitivity and specificity of the PSA test of 79% and 59%, respectively [13], combined with the fact that conditions such as benign prostatic hyperplasia (BPH) and prostatitis, as well as ejaculation or vigorous exercise up to 48 h prior to testing, can raise PSA levels, have led to concerns around PSA screening for PCa. Schröder et al. [14] reported on the European Randomised Study of Screening for Prostate Cancer: PSA screening reduced PCa mortality by 20%, but it was associated with a high risk of overdiagnosis, such that 1410 men would need to be screened and 48 additional cases radically treated to prevent one death from PCa.
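The screening figures quoted from the European study can be unpacked with simple arithmetic. The sketch below is a back-of-envelope illustration, not part of the trial analysis: it assumes that the 20% figure is a relative risk reduction applying to the same population and follow-up period as the number needed to screen, and the function names are hypothetical.

```python
# Back-of-envelope screening arithmetic (illustrative; function names are hypothetical).


def absolute_risk_reduction(number_needed_to_screen: float) -> float:
    """The absolute risk reduction (ARR) is the reciprocal of the number needed to screen."""
    return 1.0 / number_needed_to_screen


def implied_baseline_risk(arr: float, relative_risk_reduction: float) -> float:
    """If ARR = baseline risk x RRR, the unscreened risk follows by division.

    Assumption: the RRR and the ARR refer to the same cohort and follow-up.
    """
    return arr / relative_risk_reduction


NNS = 1410          # men screened per PCa death prevented (from the text)
RRR = 0.20          # relative reduction in PCa mortality (from the text)
EXTRA_TREATED = 48  # additional men radically treated per death prevented (from the text)

arr = absolute_risk_reduction(NNS)
baseline = implied_baseline_risk(arr, RRR)
extra_treated_per_1000 = EXTRA_TREATED / NNS * 1000

print(f"Deaths prevented per 1000 men screened: {arr * 1000:.2f}")
print(f"Implied PCa mortality without screening, per 1000 men: {baseline * 1000:.2f}")
print(f"Additional men radically treated per 1000 screened: {extra_treated_per_1000:.1f}")
```

Under these assumptions, fewer than one death is prevented per 1000 men screened, which illustrates why a relative reduction quoted on its own can overstate the benefit of screening.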

This paucity of markers affects not only the detection of disease but also its management. A significant number of men still present with metastatic PCa and will undergo androgen deprivation therapy (ADT). These men will progress to castrate-resistant prostate cancer (CRPC) within a median of 11 months [15]. A significant proportion of men presenting with high-risk non-metastatic disease will also go on to develop CRPC and metastatic disease [16]. Although there is a range of therapies for metastatic CRPC, the lack of robust biomarkers has proved problematic in therapy selection, scheduling and disease monitoring. As with initial diagnosis, the reasons for this are multifactorial, including the unreliability of PSA and tumour heterogeneity. A recent expert consensus meeting stated that there are currently no validated predictive biomarkers available for use in daily clinical practice in CRPC [17].

There is, therefore, a clear need to identify new robust clinical biomarkers for the detection, grading, therapeutic management and monitoring of PCa.

Proteomics

The term ‘proteome’ was coined in 1994 by Marc Wilkins at a symposium and appeared in print in 1995 [18]. It was used to define the entire protein complement within a cell/tissue/biological sample.

Proteomics is, therefore, the study of the proteome and is defined by Anderson and Anderson as ‘the use of quantitative protein-level measurements of gene expression to characterise biological processes and decipher the mechanisms of gene expression control’ [19]. The study of proteomics is not merely a study of protein expression, however. Proteomics also encompasses the study of the function of those proteins, including activity, post-translational modifications, localisation and protein interactions [20].

Proteomics employs a wide range of technologies. Well known, but now outdated, is two-dimensional gel electrophoresis (2DGE). 2DGE is a gel-based method that separates proteins in two dimensions, firstly by isoelectric point (pI) and secondly by molecular weight, enabling better resolution than a single-dimensional separation. Interestingly, this technique was first utilised some 20 years prior to the invention of the word ‘proteome’ and was described as a ‘high resolution two-dimensional electrophoretic method’ [19].

Proteomic biomarker discovery generally takes the form of nontargeted relative quantification, leading to biomarkers being described as up- or downregulated. The methods used are either gel based (1D, 2D and DIGE), followed by mass spectrometry (MS) for identification purposes, or entirely MS based. Advances in MS have resulted in proteomic tools that can compare and identify proteins implicated in different disease states with no need for an intermediate gel-based step. Many MS-based methods involve protein digestion, and so the protein analysis is actually based on surrogate peptides (bottom-up), but there are some technologies that focus on the whole protein (top-down).

Top-down

2DGE has the ability to resolve thousands of proteins and, coupled with MS for protein identification, became a renowned tool for protein biomarker discovery [21]. 2DGE, however, has several limitations such as an inability to resolve all proteins present due to huge fold differences in protein expression. In addition, there are problems with protein solubility and under-representation of basic proteins [22].

In 1997, 2D differential in-gel electrophoresis (2D DIGE) was highlighted as an alternative that overcame problems with comparing two different gels. 2D DIGE utilises different fluorescent dyes to label three different protein samples (e.g. normal, disease, control) which allows them to be run on the same IPG strip and gel enabling direct comparisons between the experimental samples and a control [21, 23]. Both 2DGE and 2D DIGE result in the identification of protein spots in relation to a particular disease or condition, but MS must be utilised to elucidate the identity of protein spots. Protein spots are excised and trypsin digested to produce peptides. MS techniques used to identify proteins from these peptide fragments include matrix-assisted laser desorption ionisation time-of-flight MS (MALDI-TOF-MS) and electrospray ionisation tandem MS (ESI-MS/MS) [24].

Surface-enhanced laser desorption ionisation TOF-MS (SELDI-TOF-MS) utilises chips with a number of different binding affinities to study subsets of proteins from a sample. Laser ionisation releases bound proteins which are detected by MS. Relative abundance of proteins can then be compared across many samples. Unfortunately, SELDI-TOF-MS fails to result in direct protein identification, often requires sample pre-fractionation and has received some criticism regarding reproducibility due to sample handling variation [25].

MS imaging (MSI) shares similarities with SELDI-TOF-MS and is utilised for the profiling of proteins in relation to topological information at the tissue and cellular level. Studies have utilised MALDI-TOF-MS to gain knowledge of the carcinogenic alterations that can occur outside of tumour margins, highlighting the importance of the tumour microenvironment in tumourigenesis [26].

Top-down proteomics is still trailing behind in terms of sensitivity, but technologies are emerging that are capable of performing these analyses.

Bottom-up

Proteolytic peptides are either labelled with isotopic tags (ICAT), isobaric tags (iTRAQ [27] or TMT tagging [28]) or non-isobaric tags (mTRAQ [29] or acetylation) or are analysed label-free using methods such as spectral counting [30]. For a review of some of these MS-based methods, see Schulze and Usadel [31].

Isotopic labelling is utilised in ICAT (isotope-coded affinity tags) whereby samples are differentially tagged with stable isotopes that contain a protein-reactive group, a glycol linker and a biotin tag. Linkers (tags) are made from eight hydrogen (light reagent) or eight deuterium (heavy reagent) atoms to enable differentiation between two samples. Once samples have been tagged, they are trypsin digested, fractionated via avidin affinity chromatography and then scanned by MS. Relative quantities of proteins can be determined and then differentially expressed protein peaks identified by MS/MS. The main drawback to ICAT is that the linkers only bind to cysteine and as approximately 10% of proteins do not contain cysteine, these are not labelled or analysed [24].
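The light/heavy comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular software: the peak list is hypothetical, each peptide is assumed to carry a known number of labelled cysteines and a known charge, and the function name and tolerance are invented for the example. The only physical input is the mass difference between eight deuterium and eight hydrogen atoms.

```python
# Pair light- and heavy-tagged peptide peaks and report heavy/light intensity ratios.
# Hypothetical helper for illustration; real ICAT software also handles isotope
# envelopes, unknown charge states and chromatographic alignment.

DEUTERIUM_MINUS_HYDROGEN = 2.014101778 - 1.007825032  # Da, monoisotopic masses
TAG_SHIFT = 8 * DEUTERIUM_MINUS_HYDROGEN              # about 8.05 Da per labelled cysteine


def pair_light_heavy(peaks, charge=1, n_cys=1, tol=0.02):
    """Return (light_mz, heavy_mz, heavy/light ratio) for peaks separated by the tag shift.

    peaks: list of (m/z, intensity) tuples.
    charge, n_cys: assumed charge state and number of labelled cysteines.
    tol: m/z tolerance for accepting a pair (an arbitrary illustrative value).
    """
    expected_shift = TAG_SHIFT * n_cys / charge
    pairs = []
    for light_mz, light_intensity in sorted(peaks):
        if light_intensity <= 0:
            continue
        for heavy_mz, heavy_intensity in peaks:
            if abs((heavy_mz - light_mz) - expected_shift) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_intensity / light_intensity))
    return pairs


# Hypothetical singly charged peaks: one light/heavy pair and one unpaired peak.
example_peaks = [(1000.50, 2000.0), (1008.55, 4000.0), (1200.00, 500.0)]
pairs = pair_light_heavy(example_peaks)
print(pairs)
```

In this made-up example the heavy-labelled sample contributes twice the signal of the light-labelled one, which would be read as a twofold relative increase for that cysteine-containing peptide.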

A similar approach, called isobaric tags for relative and absolute quantification (iTRAQ), allows quantification and identification of differentially expressed proteins in up to ten samples [32]. Isobaric (same mass) reagents are used to differentially label the amine residues of proteins in each sample prior to MS. The main drawback to this method is the potential experimental variation induced from the lengthy sample preparation required [24].
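Because isobaric tags share the same intact mass, relative quantification is read from the reporter ions released on fragmentation. The sketch below assumes the 4-plex reagent, whose reporter channels are conventionally labelled 114 to 117, and summarises peptide-level ratios by their median; the summarisation choice, the function name and the intensities are assumptions made for illustration only.

```python
from statistics import median


def protein_ratios(peptide_reporters, reference="114"):
    """Summarise reporter-ion ratios for one protein.

    peptide_reporters: list of dicts mapping channel label -> reporter intensity,
    one dict per identified peptide of the protein.
    Returns channel -> median ratio relative to the reference channel.
    """
    channels = sorted(
        {ch for peptide in peptide_reporters for ch in peptide if ch != reference}
    )
    ratios = {}
    for ch in channels:
        values = [
            peptide[ch] / peptide[reference]
            for peptide in peptide_reporters
            if ch in peptide and peptide.get(reference, 0) > 0
        ]
        if values:
            ratios[ch] = median(values)
    return ratios


# Hypothetical reporter intensities for three peptides of one protein.
peptides = [
    {"114": 100.0, "115": 200.0, "116": 50.0},
    {"114": 50.0, "115": 150.0, "116": 25.0},
    {"114": 80.0, "115": 160.0, "116": 44.0},
]
ratios = protein_ratios(peptides)
print(ratios)
```

Using the median rather than the mean limits the influence of a single outlying peptide, which is one reason it is a common choice for this kind of summary.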

Multidimensional protein identification technology (MudPIT) exploits multidimensional high-pressure liquid chromatography (HPLC) to separate peptides prior to identification by MS/MS. Peptides can be either labelled or label-free. Protein mixtures are digested, and the peptide fragments are separated using a strong cation exchange column, followed by a reverse phase hydrophobicity column. Peptides eluted from the reverse phase column are then identified by MS/MS. The main benefits of this method are that complex protein mixtures can be separated and protein identification can be carried out rapidly without any pre- or post-separation labelling. Although this method is highly sensitive and can be performed label-free (by the use of spectral counting [33]), identification of differential protein expression has been reported to be problematic [24].
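Spectral counting can be illustrated with a small example. The sketch below uses the normalised spectral abundance factor (NSAF), one widely used normalisation in which each protein's spectral count is divided by its length and then by the sum over all proteins; it is offered as an assumption about how such counts might be handled, not as the method of the cited study. Protein names, counts and lengths are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalised spectral abundance factor for each protein in one sample.

    spectral_counts: protein -> number of MS/MS spectra assigned to it.
    lengths: protein -> sequence length in amino acids.
    Longer proteins yield more peptides, so counts are length-corrected
    before being scaled to sum to 1 across the sample.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


# Hypothetical sample: A and B have equal counts, but B is twice as long.
counts = {"protein_A": 30, "protein_B": 30, "protein_C": 10}
lengths = {"protein_A": 300, "protein_B": 600, "protein_C": 100}

abundances = nsaf(counts, lengths)
for protein, value in abundances.items():
    print(f"{protein}: {value:.2f}")
```

Comparing NSAF values for the same protein across samples then gives a label-free estimate of relative abundance, although low counts make such estimates noisy.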

There are numerous pre-fractionation methods and MS technologies, which are being updated continuously to improve detection and quantification capabilities in bottom-up proteomic analyses. These methodologies have resulted in thousands of potential disease biomarkers.

SWATH MS is a data-independent acquisition (DIA) method which aims to complement traditional mass spectrometry-based proteomics techniques such as those described above. SWATH’s main advantage is that it produces a complete and permanent recording of all fragment ions of the detectable peptide precursors present in a biological sample, therefore allowing reinterrogation of the raw data without the need to repeat ‘wet’ experiments. Taken together, it offers high throughput with high reproducibility and consistency.

The method comprises two steps: the data acquisition method and a targeted data analysis approach building on the high-throughput SRM (selected reaction monitoring) scoring (using the mProphet approach) developed in the Aebersold lab [34]. SWATH-MS data consist of highly multiplexed fragment ion maps that are deterministically recorded over the user-defined precursor mass range and chromatographic separation, making it by far the most comprehensive MS approach.
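The idea of stepping deterministically through a user-defined precursor range can be sketched as follows. The range of 400 to 1200 m/z and the 25 m/z window width are illustrative values chosen for the example rather than settings taken from the text, and both function names are hypothetical.

```python
def swath_windows(mz_start, mz_end, width):
    """Split a precursor m/z range into consecutive, non-overlapping isolation windows."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        lower = upper
    return windows


def window_for(windows, precursor_mz):
    """Return the window in which a precursor is co-fragmented, or None if out of range."""
    for lower, upper in windows:
        if lower <= precursor_mz < upper:
            return (lower, upper)
    return None


# Illustrative settings (assumed, not from the text).
windows = swath_windows(400.0, 1200.0, 25.0)
print(len(windows), "windows; first", windows[0], "last", windows[-1])
print("Precursor at m/z 512.3 falls in window", window_for(windows, 512.3))
```

Every precursor inside a window is fragmented together in each cycle, regardless of its intensity, which is what makes the resulting fragment ion maps complete and open to later reinterrogation.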

Urine Proteomics

Urine represents a logical source of biomarkers to identify disease within the organs and tissues of the urinary tract, with a urine test for PCa potentially offering a far less invasive alternative to blood testing or DRE. Recent studies have, therefore, sought to characterise the proteome of urine in PCa patients to identify potential diagnostic biomarkers. Davalieva et al. [35] used a two-dimensional difference gel electrophoresis (2D DIGE) approach combined with matrix-assisted laser desorption ionisation-mass spectrometry (MALDI-MS) to identify 23 proteins, predominantly secreted enzymes, with statistically significant differences in abundance in the urine of PCa patients compared to those with benign prostatic hyperplasia (BPH). Nine of the proteins were found to be part of the ‘acute phase response’ signalling pathway, perhaps functioning within the inflammatory tumour microenvironment, and five of these were selected for further validation by immunoturbidimetry. Measurement of haptoglobin and alpha-1-microglobulin/bikunin precursor in combination offered the best diagnostic accuracy, greater than that achieved by measuring serum PSA. Interestingly, previous studies have also highlighted these proteins as being differentially expressed in PCa [36, 37], and, as such, it may be useful to carry out further validation in independent cohorts.

Promising results were similarly obtained in an earlier study by Okamoto et al. [38], who used surface-enhanced laser desorption ionisation time-of-flight mass spectrometry (SELDI-TOF-MS) to identify a peptide panel of 72 peaks which could differentiate those with PCa from cancer-free controls. Hierarchical clustering allowed the discrimination of the two groups with a sensitivity of 91.7% and a specificity of 83.3%. However, urine samples analysed in this study were collected following prostate massage to enhance the detection of prostate-specific proteins, and the approach can thus be considered more invasive. Other studies have evaluated the usefulness of panels of proteins or peptides in PCa diagnosis, rather than single proteins, which are typically less reliable. A 12-peptide panel, for example, has been proposed to allow detection of PCa using a sensitive capillary electrophoresis MS approach [39, 40]. The peptides were more readily detected in the first-void urine, which has previously been shown to contain higher levels of prostate-specific biomarkers (e.g. PSA) than the midstream urine [41].

Urine is undoubtedly one of the most convenient bodily fluids to collect for biomarker research and use in the clinic. However, urinary biomarker studies are not completely free from the practical issues associated with large-scale ‘omics’ studies utilising hundreds or thousands of biological samples. Some common issues for consideration are the time of day at which the sample is collected, the portion of urinary flow to be collected (as discussed previously, PSA is highest in the first-void urine) and degradation of the protein marker over time (e.g. during sample handling or storage). Regardless, future studies of urine biomarkers for PCa are likely to prove fruitful if well designed.

Serum/Plasma Proteomics

Blood contains a huge number of proteins, and being in contact with each organ and tissue, it becomes perfused with proteins secreted from those organs and tissues and with proteins ‘leaking’ from damaged or diseased cells. Additionally, it is minimally invasive to sample. However, the difficulty with blood as a biomarker pool is that the concentration of proteins covers several orders of magnitude making it incredibly difficult to uncover the lower abundance proteins due to the masking effects of proteins such as albumin and immunoglobulins.

Promising results from SELDI-TOF-MS studies for the detection of new PCa serum biomarkers have been reported. Adam et al. [42] used SELDI-TOF-MS to analyse serum from 167 PCa patients, 77 BPH patients and 82 healthy controls. They detected nine peaks with a quoted 83% sensitivity and 97% specificity. Petricoin et al. [43] analysed sera from a training set of 25 controls and 31 PCa patients, before applying the algorithm to a test set of 266 blinded samples (38 PCa). Seven peaks were detected that could identify 36 out of 38 PCa patients in the test set (95% sensitivity and 78–83% specificity). A further study by Qu et al. [44] analysing 386 serum samples (326 training set, 60 test set) found that 74 peaks could discriminate PCa from healthy samples with a sensitivity and specificity of 100%, but 21 peaks could also do the same with 97% sensitivity and specificity. However, a later study by McLerran et al. [45], using rigorous validation methods, cast doubt on SELDI-TOF-MS as a tool for biomarker discovery. This study found that the peaks reported by Adam et al. [46] and Qu et al. [44] did not stand up to validation and were not capable of differentiating PCa from control specimens. Whilst this was a blow for SELDI-TOF-MS, the authors point out that this does not suggest that the method does not work, but that extensive biomarker validation is vital to the biomarker discovery pipeline.
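Headline sensitivity and specificity figures can hide how many positive calls are wrong when cases are a minority of the samples tested. As a rough worked example, the sketch below takes the test-set counts quoted for Petricoin et al. (266 samples, 38 PCa, 36 detected) and assumes, purely for illustration, that the quoted specificity range applies to all 228 non-PCa samples as a single group; the function name is hypothetical.

```python
def ppv_from_counts(true_positives, n_negatives, specificity):
    """Estimate false positives and positive predictive value (PPV).

    Assumption: the given specificity applies uniformly to all n_negatives samples.
    """
    false_positives = round((1.0 - specificity) * n_negatives)
    ppv = true_positives / (true_positives + false_positives)
    return false_positives, ppv


N_TEST, N_PCA, N_DETECTED = 266, 38, 36  # counts quoted in the text
n_non_pca = N_TEST - N_PCA               # 228 samples without PCa

sensitivity = N_DETECTED / N_PCA
print(f"Sensitivity: {sensitivity:.1%}")

for specificity in (0.78, 0.83):         # range quoted in the text
    fp, ppv = ppv_from_counts(N_DETECTED, n_non_pca, specificity)
    print(f"Specificity {specificity:.0%}: about {fp} false positives, PPV {ppv:.0%}")
```

Under this assumption, fewer than half of the positive calls in the test set would be true cancers despite the high sensitivity, which helps to explain why independent validation carries so much weight.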

Another SELDI-TOF-MS study by Pan et al. [47] analysed 178 samples (83 PCa patients, 95 controls), yielding 18 proteins differentially expressed between PCa patients and controls. After the application of a decision tree algorithm, eight proteins were identified that could correctly screen PCa patients with 93% sensitivity and 96% specificity.

An 8.9 kDa peak, identified as an apolipoprotein A-II (apoA-II) isoform, was found via SELDI-TOF-MS [48] in PCa patients whose PSA levels ranged from 0 to 4 ng/ml. Consequently, it has been suggested that apoA-II could be a marker of PSA-negative PCa. However, apoA-II is an acute-phase protein which may be raised in a variety of clinical conditions due to inflammation and is thus of limited use as a standalone biomarker [49], although it could be useful in a multivariate biomarker diagnostic. This is a potential limitation of many proteomic methods that lack immunodepletion or fractionation strategies to span the several orders of magnitude of protein expression in serum.

A further three studies utilised SELDI-TOF-MS for the analysis of serum samples: Le et al. [50] could differentiate PCa patients with and without bone metastases with 89.5% sensitivity using a cluster of SAA (serum amyloid A) isoforms; Al-Ruwaili et al. [51] had a panel of 20 peaks capable of distinguishing indolent and aggressive disease (45 vs. 54; based on Gleason score) with 73.3% sensitivity and 60% specificity; and Rosenzweig et al. [52] utilised a high-resolution SELDI-qTOF instrument to identify two predictive markers (complement component 4a and protein C inhibitor) of recurrence in pre-radical prostatectomy serum samples.

A study by Qin et al. [53] utilised anion displacement chromatofocusing chromatography followed by 2D DIGE to analyse sera from 10 PCa and 10 BPH patients. They identified (using MS/MS) three low-abundance proteins: SCCA1 (squamous cell cancer antigen 1), S100A9 (calgranulin B) and haptoglobin-related protein. The authors propose this pre-fractionation method as a way to uncover low-abundance proteins within the serum proteome.

A study by Jayapalan et al. [54] combined 2DE with lectin-based methods to identify O-glycoproteins. They found APOA2 (apolipoprotein AII), complement C3 β-chain fragment, TTR (transthyretin), SERPINA1 (α-1-antitrypsin) and KNG1 (high-molecular-weight kininogen light chain) to be significantly differentially expressed. As APOA2 and the complement fragment are acute phase reactants, and SERPINA1 and TTR are not glycoproteins, it is difficult to imagine these are robust biomarkers. KNG1, however, has been implicated in breast, cervical and endometrial cancers.

Highlighting the confounding effect of inflammation in biomarker proteomics, Bergamini et al. [55] studied biomarkers present with and without inflammation in PCa and BPH. They found that SELDI-TOF-MS and 2DE protein profiles differed depending on whether samples were from patients who had evidence of inflammation and that the presence of inflammation could confound biomarker discovery. SELDI-TOF-MS analysis including inflammation samples found no significant difference between BPH and PCa, but exclusion of the inflammation samples revealed 20 significantly different peaks. 2DE profiles that excluded inflammation samples identified two additional proteins that had not been found in the inflammation comparisons.

Utilising immunodepletion strategies followed by 2D DIGE, Byrne et al. [23] identified 13 differentially expressed proteins between PCa patients with Gleason score 5 and Gleason score 7. PEDF (pigment epithelium-derived factor) and ZAG (zinc-α2-glycoprotein) have undergone validation, and the group suggests that PEDF is an accurate marker of early PCa. The same group went on to apply 2D DIGE and metabolomics (using nuclear magnetic resonance) [56] to identify biomarker panels for the diagnosis and progression of PCa. They found a three-biomarker panel to distinguish BPH from PCa (apolipoprotein A-IV, serum amyloid P component and glutathione peroxidase 3) with an AUC of 0.926, five biomarkers to distinguish Gleason score 5 from Gleason score 7 disease (kininogen-1, protein AMBP, complement factor H, coagulation factor XIII B chain and glutathione peroxidase 3) with an AUC of 0.549 and another three-biomarker panel to differentiate organ-confined from non-organ-confined disease (protein AMBP, haptoglobin pigment epithelium-derived factor and kininogen-1) with an AUC of 0.742.
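For readers less familiar with the AUC values quoted for these panels, the area under the receiver operating characteristic curve equals the probability that a randomly chosen case receives a higher panel score than a randomly chosen non-case, so 0.5 corresponds to chance and 1.0 to perfect separation. The sketch below computes it directly from that definition; the scores are hypothetical and the function name is an assumption for illustration.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count as half)."""
    correctly_ranked = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correctly_ranked += 1.0
            elif case == control:
                correctly_ranked += 0.5
    return correctly_ranked / (len(case_scores) * len(control_scores))


# Hypothetical panel scores for three cases and three controls.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]

print(f"AUC = {auc(cases, controls):.3f}")
```

This pairwise form is equivalent to the Mann-Whitney U statistic divided by the number of pairs, and it makes clear why an AUC close to 0.5 indicates that a panel separates the two groups little better than chance.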

Lam et al. [57] used a whole-protein top-down MS profiling method, encompassing MALDI-TOF MS, to identify a stage-specific marker in a cohort of 16 PCa vs. 15 healthy individuals. The marker was identified as PF4 (platelet factor 4), and they found it to be significantly decreased in patients with metastatic PCa but not in those with localised or no PCa. This was an interesting approach but on a limited number of samples.

In 2012, Rehman et al. [58] used immunodepleted samples followed by an iTRAQ approach to identify biomarkers that could distinguish BPH, localised PCa, PCa with local spread (biochemically detected) and metastatic PCa. There were several promising biomarkers, but the group specifically highlighted EEF1A1 (eukaryotic translation elongation factor 1 alpha 1) as it progressively increased in expression from BPH through to metastatic PCa.

From all of these serum studies, only two markers have been identified in two separate analyses—APOA2 and KNG1. With APOA2 being an acute phase reactant, it would not be prostate specific. However, KNG1 may prove useful as a diagnostic marker having been highlighted as differentially expressed between BPH and PCa [54, 56].

Tissue

PCa tissue or cell lines seem to be an obvious choice for proteomic analysis as it is a direct way to look at what is happening at the tumour level. Not only can this result in promising biomarkers, but can also yield valuable information on the mechanisms involved in the tumourigenesis of PCa. However, tissue is more difficult to obtain, requiring surgery and associated risks. In addition, the tumour microenvironment is complex, and assaying only tumour cells may not give the full view of what is happening during the course of the disease.

Using 2DGE, Meehan et al. [59] compared normal and cancerous tissue proteomic profiles from 34 radical prostatectomy samples. They identified 20 proteins (via MS) that were lost in the cancerous tissue and validated ubiquitin-like NEDD8 and CNN1 (calponin) using western blotting and immunohistochemistry. The group found that the cellular localisation of ubiquitin-like NEDD8 and CNN1 was altered in the cancerous tissue.

2DGE was again used by Lexander et al. [60] to study the proteomics of fresh radical prostatectomy specimens from 29 malignant (grouped into low and high Gleason score) and 10 benign samples. They identified 39 proteins whose expression differed between the groups, of which 15 had differential expression between low and high Gleason score (Gleason 6/7 and Gleason 8/9, respectively). The group then identified 30 proteins by MS, including glutathione S-transferase (GST)-π. GSTs are a well-characterised family of enzymes thought to have a role in preventing the initiation of carcinogenesis. Lee et al. [61] used antibodies to show that in 88 of 91 PCa samples, GST-π was not detectable. They also found hypermethylation of the regulatory sequence for the gene encoding GST-π (GSTP1) in all PCa samples, distinguishing PCa from benign tissue. Further, a correlation between methylation of GSTP1 and prognosis has been found [62].

Another group adopting a 2DGE approach found 21 protein spots differentially expressed between two subtypes of the LNCaP cell line [63]. Originally from a lymph node metastasis [64], this cell line is used frequently in biomarker studies due to its hormone sensitivity, and because it expresses PSA [65]. Ten of the 2DGE spots were identified using MS, and the group validated one spot in formalin-fixed paraffin-embedded (FFPE) tissue. This protein was the 60 kDa heat-shock protein (HSP60) whose expression correlated with clinical features of PCa. Cell line studies have advantages over tissue studies as they can reduce variables and sample selection bias, but they are only a model for disease, and therefore results do not always translate.

Alaiya et al. [66] identified a panel of 22 markers capable of not only distinguishing BPH from PCa but also low- and high-grade PCa. They had performed proteomic analysis using 2DGE and MS on fresh tissue from 8 PCa patients and 16 BPH patients, and they found that 15 of their markers overlapped with other studies from different geographical locations, indicating the homogeneity of tissue expression across different ethnic populations.

Lin et al. [67] utilised 2DGE to assess protein expression in tissue taken from prostate biopsies, comparing BPH and PCa (14 vs. 9, respectively), and found 52 protein spots significantly differentially expressed between the two groups. Using MS they were able to identify FLNA(7–15) (filamin A) and FKBP4 (FK506-binding protein 4), both androgen receptor co-regulators, and PRDX4 (peroxiredoxin 4), which were confirmed by western blotting as being altered in PCa tissue. FLNA(7–15) was decreased in PCa, whereas FKBP4 and PRDX4 were increased.

A similar study using 2DGE and biopsy tissue, 11 BPH vs. 12 PCa [68], reported 79 differentially expressed proteins including PAP (prostatic acid phosphatase). The authors went on to study prohibitin at the mRNA and protein level and found it to be upregulated in PCa. The group then studied protein expression between normal and PCa tissue in 24 radical prostatectomy specimens [69] using 2D DIGE and MS, identifying 79 proteins that were differentially expressed. By western blotting they confirmed overexpression of eIF4A3 (eukaryotic initiation factor 4A-III; thought to be involved in translation), DDAH1 (dimethylarginine dimethylaminohydrolase 1; has a role in NO signalling and possibly androgen-independent cellular growth), ARG2 (arginase 2; involved in polyamine metabolism, which is important in PCa development, and implicated in small cell lung cancer), and Prdx3 and Prdx4 (peroxiredoxin 3 and 4; antioxidant agents thought to be involved in cell proliferation, apoptosis and gene expression) in many PCa tissues compared to matched benign samples.

A study of epithelial and stromal cells in normal, BPH, prostatitis and PCa tissue using 2DGE and MS was performed by Khamis et al. [70]. They report a downregulation of cellular retinoic acid-binding protein 2 in basal cells of benign prostate. Caspase-1 and interleukin-18 receptor 1 were overexpressed in PCa leukocytes. Proto-oncogene Wnt-3 was downregulated in prostatitis endothelial cells, and tyrosine phosphatase non-receptor type 1 was found only in normal and benign endothelial cells. A downregulation of poly ADP-ribose polymerase 14 was reported in myofibroblasts of prostatitis tissue. Finally, an upregulation of integrin alpha-6 was seen in epithelial cells but could not be detected in PCa myofibroblasts.

Han et al. [71] also enlisted the 2D DIGE-MS approach in their study of four radical prostatectomy specimens, looking at proteome differences between PCa and adjacent tissue. Analysing these alongside gene expression microarray data, they went on to identify 60 proteins and from these selected 14 differentially expressed proteins to validate by ELISA in serum from 84 PCa, 35 BPH and 13 healthy patients. Their study resulted in three putative serum biomarkers associated with PCa: methylcrotonyl-CoA carboxylase 2 (beta) (MCCC2), TNF receptor-associated protein 1 (TRAP1) and inosine-5′-monophosphate dehydrogenase 2 (IMPDH2).

Another 2D DIGE-MS study comparing 5 BPH and 5 PCa radical prostatectomy samples [72] yielded 39 protein spots with significantly differential expression between the groups. They validated three proteins (with roles in the cell cycle and progression) in a further 28 BPH and 14 PCa samples (UBE2N, PSMB6 and PP1CB) using western blotting.

Rowland et al. [73] used 2D DIGE to study androgen ablation in LNCaP cells. They identified 107 proteins differentially expressed between androgen-supplemented cells and anti-androgen-supplemented cells, the majority of which have not previously been associated with the androgen-responsive network.

Skvortsov and co-workers [74] utilised 2D DIGE with MALDI-MS to compare proteomes from matched benign and tumour radical prostatectomy samples. They were able to identify 19 proteins that were significantly differentially expressed, of which HSP60 was significantly upregulated in PCa compared to benign tissue (as in the aforementioned subtypes of LNCaP [63]) and lamin A was able to discriminate between low-grade (Gleason score 6) and high-grade (Gleason score ≥ 8) disease.

Pang et al. [75] studied localised PCa (10), lymph node metastatic (LNM) PCa (7) and BPH tissue (10) samples using 2D DIGE with MS. They identified six markers associated with LNM PCa, proposing them as candidate biomarkers of aggressive disease, and validated them using real-time PCR, western blotting and immunohistochemistry. These markers are FABP5 (fatty acid-binding protein, epidermal), MCCC2 (methylcrotonyl-CoA carboxylase beta chain, mitochondrial), PPA2 (inorganic pyrophosphatase 2, mitochondrial), EZR (ezrin), STOML2 (stomatin-like protein 2) and TAGLN (transgelin).

A SELDI-TOF-MS study by Zheng and colleagues [76] reported a marker they call PCa-24 (m/z 24,782.56) to be present in 16 of 17 PCa samples obtained by laser capture microdissection (LCM) but not in paired normal cells nor in 12 BPH samples assayed alongside. Another group employed a similar methodology [77] to compare LCM-enriched normal, high-grade prostatic intraepithelial neoplasia (HGPIN; PCa precursor) and PCa cells from 22 radical prostatectomies. They identified a 24 kDa protein with expression in 19/27 PCa, 3/8 HGPIN and in none of the normal cells. This protein was identified as GDF15 (mature growth differentiation factor 15), and the authors claim that it could be a marker of prostate carcinogenesis. Another study [78] utilising SELDI-TOF-MS assessed protein expression and TNM stage of PCa in 43 primary PCa and 26 matched non-cancerous samples. They found that TIMP1 (metalloproteinase inhibitor-1) was differentially expressed between different stages.

An initial technical PCa study utilising ICAT (isotope-coded affinity tags) was an optimisation of the procedure coupled with ESI-MS/MS (electrospray ionisation tandem mass spectrometry) [79]. The group assessed the proteomes of non-tumourigenic (P69) and highly tumourigenic (M12) PCa cell lines and identified two overexpressed and four under-expressed proteins in the tumourigenic cell line. A further study utilised a similar approach to label specific membrane proteins (PSCA and c-ErbB2) and study expression levels in PCa cell lines [80]. Subsequent work utilising ICAT with LNCaP cells has compared protein expression between androgen-depleted and androgen-stimulated cells [81, 82] and examined cell surface and secreted proteins [83]. Several well-characterised PCa-associated proteins were identified in addition to other proteins with unclear roles in PCa. Further validation studies are required to ascertain their utility as diagnostic and prognostic markers.

iTRAQ has been used to examine tissue and cell line proteomes. Garbis et al. [84] studied snap-frozen BPH and PCa tissue from 20 patients (10 BPH and 10 PCa), utilising iTRAQ alongside LC-MS/MS to identify 825 proteins. They found that 30 were upregulated and 35 were downregulated in PCa compared to BPH. These markers included the well-characterised PCa markers AMACR, PAP and PSMA.

Sun and colleagues [85] also studied BPH and PCa using iTRAQ and 2D LC-MS/MS. From 50 biopsy samples (20 BPH, 20 PCa and 10 BPH with local PIN), they identified 46 proteins differentially expressed between BPH and PCa and 33 between PCa and BPH with local PIN. Their markers included PSA and PAP, and the authors went on to validate PSTN (periostin), which they claim to be a promising diagnostic marker.

Another iTRAQ study compared the poorly metastatic cell line LNCaP with its highly metastatic variant, LNCaP-LN333. Ten proteins were shown to be over-expressed and four under-expressed in the highly metastatic cell line. Gp96 and GRP78 were validated using 2DGE and western blotting, demonstrating their overexpression in the variant cell line. Immunohistochemistry of benign and malignant prostate tissue further validated Gp96. GRP78 was previously identified as differentially expressed in isogenic prostate cell lines and Gp96 in IFNγ-treated isogenic cell lines [86].

Sardana et al. [87] identified four PCa markers (follistatin, chemokine ligand 16, pentraxin 3 and spondin 2) using an MS-based top-down approach. For their initial discovery, they analysed the secretome of three different cell lines, and these biomarkers were subsequently validated on serum samples.

Geiger et al. [88] analysed the proteome of 11 cell lines, including LNCaP, using an LTQ Orbitrap Velos mass spectrometer with a ‘high field’ Orbitrap mass analyser and detected 10,369 proteins from the LNCaP cell line.

Using the newer SWATH-MS method, Liu et al. [89] searched for glycopeptides associated with aggressive PCa. They compared 10 normal prostate, 24 non-aggressive PCa, 16 aggressive PCa and 25 metastatic PCa tissues and reported 220 glycoproteins with differential expression associated with PCa aggressiveness and metastasis. Two biomarkers associated with aggressive PCa, NAAA (N-acylethanolamine acid amidase) and PTK7 (protein tyrosine kinase 7), were validated in an independent cohort. The authors report that these biomarkers could pinpoint aggressive disease and help minimise overtreatment of indolent tumours.

Semen Proteomics

As stated, prostate-specific antigen, PSA, is the current gold standard biomarker for PCa and is currently used in the clinic in spite of its documented issues with specificity and sensitivity. PSA was first isolated from seminal fluid and is found at much higher concentration within the ejaculate (1.2 mg/ml) than in the blood serum (<4 ng/ml) [13, 90]. However, even though the seminal fluid is known to be rich in proteins, there has been little published regarding its utility for PCa biomarker discovery.

Seminal fluid is a complex mixture arising from multiple organs and plays multiple roles in the ejaculate. The seminal fluid consists of secretions from the seminal vesicles, prostate gland, testes and epididymis, bulbourethral gland and the periurethral gland (see Table 14.1), with each gland providing components for the multiple functions of the seminal fluid.

Table 14.1 Major source and components of the seminal fluid

The seminal fluid also contains non-sperm cells collectively known as ‘round cells’. Round cells are a mixture of leukocytes, developing spermatids, Sertoli cells, epithelial cells, lymphocytes, neutrophils and macrophages, and their number varies from man to man and with age, health, sexual activity and fertility [99].

The main role of the seminal fluid was thought to be the liquefaction of the semen and the provision of nutrition for the spermatozoa. However, the role of the seminal fluid is much more complex. The seminal fluid interacts with the female urogenital tract to modulate both the local microenvironment and the female immune system to aid sperm survival. High levels of basic polyamines such as spermine, spermidine and putrescine make the seminal fluid alkaline in nature, which helps neutralise the normally acidic female urogenital tract [98]. Cytokines such as the immunosuppressive TGF-β, secreted by the seminal vesicles with levels reaching 150–200 mg/ml [91], help to reduce the female host immune response to the ‘invading’ and foreign male sperm. Secretions from the bulbourethral glands help lubricate the semen, aiding spermatozoa motility and thereby fertility [98].

Although the first published study of seminal proteomics was in 1888, describing the discovery of propeptone as a seminal contamination of urine, it is only recently that large-scale proteomic studies have been conducted. Starita-Geribaldi et al. [100] isolated >100 proteins from seminal plasma by 2D-MALDI-TOF-MS from men with azoospermia. This has been superseded by Batruch et al. [101] using 2D-LC-MS/MS who isolated >2000 proteins in seminal fluid from pre- and post-vasectomy men or men with azoospermia. However, this may not represent the entire proteome.

These data clearly demonstrate that the seminal fluid is a highly complex proteome which, like blood plasma, contains large amounts of secreted high-abundance proteins such as kallikreins and semenogelins, which can account for >80% of the total protein content [97]. The dynamic range of detected proteins is also large, spanning 9 orders of magnitude, from total concentrations of 40–60 mg/ml for the top secreted proteins down to <10 pg/ml for pro-inflammatory interleukins. Over 97% of these proteins are soluble, with the remaining 3% being found within microvesicles, many of which are secreted by the prostate gland and are known as prostasomes [102]. Prostasomes are membrane-bound vesicles between 40 and 500 nm in diameter with membranes predominantly composed of cholesterol. Prostasomes contain a sample of the interior of the prostate epithelial cells and potentially have functions in spermatozoa motility [103], seminal liquefaction [104], antibacterial activity [105], growth inhibition [106], protection from the acidic milieu of the vaginal tract [107] and immunomodulation of the vaginal tract [108]. Prostasomes have the advantage for proteomic analysis that there is no single dominating protein complicating downstream analysis [109].
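The dynamic range quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes the lower bound of the 40–60 mg/ml range and treats the <10 pg/ml figure as exactly 10 pg/ml; the function name and unit constants are hypothetical helpers rather than part of any cited method.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Return log10 of the ratio of two concentrations given in the same unit."""
    if high <= 0 or low <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high / low)


# Unit conversion factors to grams (assumed helper constants).
MG = 1e-3   # grams per milligram
PG = 1e-12  # grams per picogram

high_g_per_ml = 40 * MG  # lower bound of the 40-60 mg/ml range in the text
low_g_per_ml = 10 * PG   # the <10 pg/ml figure, taken as an upper limit

span = orders_of_magnitude(high_g_per_ml, low_g_per_ml)
print(f"Dynamic range: about {span:.1f} orders of magnitude")
```

Because the interleukin concentration is quoted as an upper limit, the true span may be somewhat larger than this estimate.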

Even though the introduction of PSA significantly increased the numbers of PCa cases detected, it has failed to make an impact on the number of PCa-associated deaths [110]. There is, therefore, a clinical need for new and robust markers. Global proteomics of the seminal fluid benefits from the collection being non-invasive and from being a proximal fluid, being in contact with any potential prostate tumour, as compared with circulating blood specimens. To date, there have only been a limited number of proteomic studies specifically studying PCa.

A study [111] using 2DGE identified pigment epithelium-derived factor (PEDF) as an early tumourigenesis biomarker in PCa. Weak expression was observed in some HGPIN samples and all PCa samples; the weak expression seen in HGPIN was associated with subsequent PCa diagnosis. This supports the previously mentioned study by Byrne et al. [23], who also found PEDF to be important in PCa progression.

Neuhaus et al. [112] took a top-down approach to analyse the seminal plasma from 125 patients (70 PCa, 21 BPH, 25 chronic prostatitis, 9 healthy controls). Native peptides from seminal plasma were separated using a Dionex UltiMate 3000 RSLC nanoflow system and sequenced by LC-MS/MS on an LTQ Orbitrap hybrid mass spectrometer. Using this approach, the authors showed that seminal plasma proteomics could generate biomarker profiles able to detect PCa and to discriminate between high- and low-grade disease. A stepwise application of a 21- and a 5-peptide biomarker profile was shown to detect PCa with a sensitivity and specificity of 83% and 67%, respectively. A second 11-peptide marker profile was able to differentiate between Gleason score 7 organ-confined (stage ≤T2c) and advanced Gleason score 7 tumours (stage ≥T3a) with a sensitivity and specificity of 80% and 82%, respectively. Whilst this approach enables the researcher to directly detect combinations of post-translational modifications, sequence variants and degradation products, the presence of post-translational modifications can hinder sequencing of the peptides. Therefore only eight of the peptides were definitively identified: semenogelin 1–4, stabilin-2, PAP, N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase and GTPase IMAP family member 6.
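For readers less familiar with how such sensitivity and specificity figures are derived, the following minimal sketch computes them from a 2x2 confusion table. The counts are hypothetical: they were chosen only so that the totals match the 70 PCa and 55 non-PCa patients described above and the rates round to 83% and 67%; they are not taken from the study itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test result against the reference diagnosis."""
    tp: int  # cancer patients correctly flagged
    fn: int  # cancer patients missed
    tn: int  # non-cancer patients correctly cleared
    fp: int  # non-cancer patients incorrectly flagged

    def sensitivity(self) -> float:
        """Fraction of true cancers that the test detects."""
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        """Fraction of non-cancers that the test correctly clears."""
        return self.tn / (self.tn + self.fp)


# Hypothetical counts (assumption): 70 PCa and 55 non-PCa patients.
counts = ConfusionCounts(tp=58, fn=12, tn=37, fp=18)

print(f"Sensitivity: {counts.sensitivity():.0%}")
print(f"Specificity: {counts.specificity():.0%}")
```

A specificity of about 67% in this illustration means that roughly one in three men without PCa would still be flagged, which is why such profiles are usually considered alongside other clinical information.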

An alternative approach to seminal plasma is to analyse the proteome of the prostasomes. An earlier study by Renneberg et al. [113] found approximately 80 peptide spots on 2D gels representing the normal prostasome proteome. Utleg et al. [109] took this further, analysing purified prostasome peptide fragments by μLC-ESI-MS/MS coupled with an iterative gas-phase fractionation (GPF) approach to achieve maximum coverage. A total of 139 proteins were confidently identified, of which 119 had a probability score of 0.9, with the remainder scoring between 0.5 and 0.7. The proteins identified fell into six groups: (1) enzymes (including PAP, PSA, TMPRSS2, fatty acid synthase), (2) transport and structural proteins (six members of the annexin family, actins, ezrin, cofilin, tubulins and profilins 1 and 2), (3) GTP proteins (Rab family), (4) chaperone proteins (HSP 27, 70, 71, 90 and grp 7), (5) signal transduction proteins (including 14-3-3 proteins β, γ, ε, ζ and σ, clusterin, calmodulin, zinc-alpha-2-glycoprotein) and (6) unannotated (see Utleg et al. [109] for complete listing). Many of the enzymes identified within this study are exclusively present in prostasomes. Coupled with the fact that many of the proteins identified are also related to PCa, and with the lack of a dominant protein, this makes prostasomes an attractive target for proteomic biomarker discovery.

Several of the proteins identified by Utleg et al. [109] have also been observed by galectin-3 binding of prostasomal proteins [114] and in seminal plasma proteins [115]. Galectin-3 is a 30 kDa carbohydrate-binding protein found on the surface of prostasomes and is a proteolytic substrate of PSA [116]. MS/MS identified candidate galectin-3 binding ligands such as PSA, PAP, zinc-alpha-2-glycoprotein, CD26, CD13, neprilysin, clusterin, antibacterial protein (FALL-39) and alpha1-acid glycoprotein [114]. Hassan et al. [115] also identified PSA, PAP and zinc-alpha-2-glycoprotein, along with progastricsin and PIP, as being differentially expressed in seminal plasma proteomics. Interestingly, 2D electrophoresis of pooled seminal fluid from normal men and from cancer patients gave rise to the same 917 spots. By examining 2D gels loaded with decreasing amounts of protein, five differentially expressed spot clusters were identified and sequenced following tryptic digestion.

However, to date there has not been a large-scale proteomic analysis of seminal fluid prostasomes in PCa patients [117]. This may reflect the potential limitations in seminal fluid sampling arising from religious, societal, physical and emotional issues and the age of the subjects.

These limitations would apply to both prostasome and global proteomic biomarker screening of seminal fluid. Combined with the need for highly standardised protocols and associated infrastructure, it is currently unlikely that seminal fluid proteomics will be taken up by the clinical fraternity. However, it remains a valid biofluid for biomarker discovery [117].

Exosomes

Exosomes are small (50–100 nm in diameter) vesicles which are endocytic in origin and secreted from the cell surface [118]. Exosomes are known to contain proteins and thought to act as an additional route of cell-cell signalling. Importantly, it is becoming increasingly apparent that exosomes are released by cancer cells (including PCa cells) into the bodily fluids and may therefore warrant further investigation as a potential source of biomarkers. In cancer, exosomes are proposed to function as shuttles for proteins, which act on surrounding cells to facilitate tumour growth and metastasis [119, 120]. In PCa, an increasing body of research is beginning to characterise exosomal proteins as potential diagnostic markers and markers of disease progression or drug response/resistance.

PCa exosomes have been isolated from a number of fluids including the blood, seminal plasma, expressed prostatic secretions and urine, as well as primary prostate tissue and cell lines. A recent study by Øverbye et al. [121] to identify urinary exosomal markers of PCa found that 246 proteins were differentially expressed between the exosomes of PCa patients and healthy controls, with the majority showing upregulation in the cancer samples. A total of 37 proteins found to be significantly enriched could distinguish PCa with 100% specificity and a sensitivity of 50% or higher. These included claudins, Ras-related proteins and various enzymes and regulatory factors. The protein showing the highest sensitivity (94%) and level of enrichment in PCa (140-fold) was TM256, a potential candidate for further validation.

Concluding Remarks

A simple PubMed search to identify studies with the search terms ‘proteomics’ and ‘prostate cancer’ anywhere in an article yielded 607 hits. This is quite an achievement for just 21 years since the inception of proteomics. Indeed, these results demonstrate biomarkers that should revolutionise PCa diagnosis, prognosis and treatment. Yet, the roadblock is that since these studies first started being published, few markers have even come close to becoming a clinically applicable test. Only two markers were identified in more than one study in serum: APOA2, an acute phase reactant, and KNG1, which may prove valuable as it has also been identified in other cancers [54, 56]. Additionally, PEDF was identified in both serum and semen [23, 111].

This lack of clinical applicability is almost certainly due to study design. Ransohoff [122] has described the phenomenon of overfitting, in which the differences between groups in MS peak patterns are due to chance rather than biological difference, and as such the results are not reproducible. This occurs when the number of features analysed (typically thousands of peaks) is greater than the number of samples (see Fig. 14.1). Overfitting can be detected by attempting to validate results in an adequately sized independent sample. Until such a validation has been performed, high-throughput study results are fairly meaningless.
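The overfitting problem described above can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not a reconstruction of any cited analysis: every 'peak' is pure Gaussian noise, the sample sizes and number of peaks are arbitrary, and the single-peak midpoint-threshold classifier is a deliberately simple stand-in for a real MS pattern classifier. The best-looking peak is selected in a small discovery set and then re-tested on independent data.

```python
import random


def accuracy(values, labels, threshold, sign):
    """Fraction of samples classified correctly by a one-peak threshold rule."""
    predictions = [1 if sign * (v - threshold) > 0 else 0 for v in values]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def fit_rule(values, labels):
    """Fit a midpoint threshold between the two group means for one peak."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    mean_cases = sum(cases) / len(cases)
    mean_controls = sum(controls) / len(controls)
    threshold = (mean_cases + mean_controls) / 2
    sign = 1 if mean_cases >= mean_controls else -1
    return threshold, sign, accuracy(values, labels, threshold, sign)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_discovery, n_validation, n_peaks = 20, 200, 5000  # arbitrary (assumption)

# Discovery set: 10 'cases' and 10 'controls', thousands of noise peaks.
disc_labels = [i % 2 for i in range(n_discovery)]
disc = [[rng.gauss(0.0, 1.0) for _ in range(n_peaks)] for _ in range(n_discovery)]

# Pick the peak that separates the groups best in the discovery data.
best = None
for j in range(n_peaks):
    column = [row[j] for row in disc]
    threshold, sign, acc = fit_rule(column, disc_labels)
    if best is None or acc > best[3]:
        best = (j, threshold, sign, acc)
best_peak, threshold, sign, discovery_acc = best

# Independent validation set: the same peak measured in new subjects.
# It is still pure noise, so no real signal exists to be found.
val_labels = [i % 2 for i in range(n_validation)]
val_values = [rng.gauss(0.0, 1.0) for _ in range(n_validation)]
validation_acc = accuracy(val_values, val_labels, threshold, sign)

print(f"Discovery accuracy of best peak:  {discovery_acc:.2f}")
print(f"Validation accuracy of same peak: {validation_acc:.2f}")
```

With thousands of candidate peaks and only 20 samples, at least one peak is expected to separate the groups well by chance, whereas accuracy in the independent set should fall back towards 0.5. This is the behaviour that validation in an independent cohort is designed to expose.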

Fig. 14.1 Diagrammatic representation of the stages involved in biomarker discovery through to clinical implementation giving an indication of the number of analytes and samples used at each stage

Group selection, sample collection and sample storage are all points at which bias can be introduced, and this represents a further potential pitfall for any proteomic study. Indeed, when McLerran et al. attempted to minimise bias, they found that SELDI-TOF-MS became unable to differentiate PCa from biopsy-negative controls [45, 123]. Nevertheless, with international centres such as those in the USA, Switzerland and the UK, the application of proteomics to stratified medicine will become a reality. It has to. We now have the ability to manage and mine huge amounts of live ‘big data’. A focus on precision medicine will allow us to help patients when they need it and in real time.