Introduction: What is Metabolomics?

At its most basic level, metabolomics refers to the study of some or all of the metabolites in a biological sample, be it tissue, cells, serum, or other bodily fluid [1]. Whilst metabolomics is not a new science, advances in detection methods, statistical analysis and computing power have led to renewed interest in this area and its potential in the field of cancer. It forms a distinct branch of the ‘omics’ sciences, along with genomics, proteomics and transcriptomics. Genomic analysis identifies the genes present, including mutations of functional genes. Yet only a subset will actually be expressed [2], meaning that the remainder may be of limited or no clinical significance. Furthermore, it will not identify the normal genes that are being overexpressed by other processes. The transcriptome, as defined by the measurable RNA present, then represents the output of the genome, while the proteins produced (the proteome) are the most relevant product, being a step closer again to clinical effect. However, the interplay between these proteins, their relative enzymatic activity and the direct clinical effects can still vary. For example, the presence of altered PI3K signalling molecules from a PIK3CA gene mutation does not necessarily result in increased downstream signalling of the AKT/mTOR pathway, and can depend on PTEN concentration [3, 4]. The metabolome, by contrast, represents the step ‘after the fact.’ It is the collection of molecules that exists as a result of cellular processes, which are themselves a result of the enzymatic processes catalysed by products of the genome. It is thus direct evidence of what actually exists or existed, ie the phenotype, as opposed to what could exist, and offers a complementary and multidimensional picture of both the tumor and the host.

All cellular processes produce metabolites, whether as a specific function (hepatocytes) or as products of normal cellular activities such as maintenance of homeostasis, replication, and activation of signalling pathways. These in turn are also influenced by many factors including diet, toxins, diseases and drugs [5]. These metabolites therefore can represent any number of molecular classes, from small molecules or amino acids, to lipids or carbohydrates, or any of their breakdown products [6]. Collectively they are referred to as the metabolome, which is representative of all the processes occurring in a cell, an organ or the entire body at a particular time, and which necessarily varies over time according to the multitude of influences on the body, both normal and pathological.

Metabolites can be detected in any biological sample, ranging from blood (serum or plasma) to tissue, urine, sweat, tears, saliva, or even exhaled breath condensate [7, 8]. This represents a significant clinical advantage, as acquiring samples such as serum is straightforward yet may provide significant tumor-specific information, potentially representing a liquid biopsy and sparing the patient a more invasive procedure. The caveat to this is the sensitivity of the samples to incorrect handling—the metabolic profile may change after sampling depending on a number of factors including temperature and changes in pH [9]—as well as the modulating effect of a number of variables discussed later.

Cancer Metabolism

In cancer, a number of metabolic processes are altered, either within the cancer cell, the tumor milieu, or in other parts of the body as a result of the cancer. Where this results in a measurable change in metabolites, such changes represent a potential biomarker of cancer presence or activity. Significantly altered metabolic pathways within cancer cells are well recognised. For example, many cancer cells employ aerobic glycolysis in place of the usual mitochondrial oxidative phosphorylation to generate adenosine triphosphate (ATP), a phenomenon known as the “Warburg effect” which is believed to confer a survival advantage in hypoxic conditions [10, 11]. This feature of malignant cells is already exploited in cancer imaging: fluorodeoxyglucose (FDG)-positron emission tomography (PET) relies on the enhanced uptake of radio-labelled glucose by cancer cells to define tumors on imaging studies. Other common metabolic shifts in cancer result in changes in choline and fatty acid metabolism [12]. Choline is typically absent or at very low concentrations in normal tissue, and found in higher concentrations in tumor. Magnetic resonance imaging (MRI) can be adapted to include spectroscopic interrogation of parts of the image down to a single voxel to detect choline levels; areas of high choline concentration are very likely to represent presence of malignancy. This is currently employed in brain imaging of gliomas, and screening for early breast cancer in high-risk populations.

Whilst metabolomic studies are used to detect individual metabolites that might serve as predictive biomarkers, this is not the only application. Furthermore, although several metabolites have been identified that correlate with the progression and development of breast cancer, this has not resulted in any significant clinical gains. Current metabolomics research aims to take this considerably further by looking at groups of metabolites or indeed the metabolome as a whole. These collections of data will contain patterns that then represent the metabolic signature of the sample, which can be compared to the patterns of other samples without the need to identify any of the individual molecules. This has the advantage of incorporating known and unknown metabolites of all the upstream events: gene expression and activated cellular pathways from the tumor; reactive and immunological responses from the host; as well as integrated signalling pathway cross talk and environmental influences, by far a more comprehensive picture, albeit embedded in a vast sea of other metabolite data.

Metabolomic Techniques

Two standard techniques for metabolomic analysis are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS). MS has higher sensitivity than NMR, and requires smaller amounts of sample.

NMR is faster, less expensive and more reproducible [13]. Another advantage of NMR is that the sample requires only minimal handling prior to analysis. Because NMR does not damage analytes, it is particularly useful for studying metabolite levels in intact tissues, such as tumor biopsy samples, which can then be used in further experiments. In recent years, the development of high resolution 1H magic angle spinning (MAS) has made it feasible to acquire data on small slices of tissue without any treatment: with the rapid spinning of the sample at the magic angle of 54.7°, the line broadening effects and the associated loss of information are reduced [14–16], resulting in high resolution spectra (Fig. 1).

Both techniques have their role in metabolomic research, depending on the aim of the investigation. In particular NMR can be used for rapid, untargeted screening; then, once metabolic pathways of interest are discovered, MS can be used in a targeted way to detect specific metabolites that could not be revealed in the NMR spectra due to the low concentration.

Fig. 1

NMR MAS spectrum of ovarian cancer tissue. Each of the numbered spikes represents a separate metabolite, with the relative heights (signal strength) related to concentration. Forty have been identified here, but a sample may contain hundreds. (Adapted from Ben Sellem et al., “Metabolomic Characterization of Ovarian Epithelial Carcinomas by HRMAS-NMR Spectroscopy,” Journal of Oncology, vol. 2011, Article ID 174019, 9 pages, 2011. doi:10.1155/2011/174019. Permission for reproduction available under the Creative Commons Attribution License 3.0 (http://creativecommons.org/licenses/by/3.0/))

Analysis

Metabolomic data are high-dimensional in nature. As many as several hundred metabolite (relative) concentrations may be measured by means of NMR or MS platforms, usually on a limited number of samples. Biological information is retrieved from these data by means of univariate and multivariate statistical methods [17, 18]. Multivariate methods use the relationships among the variables, in contrast to univariate methods that focus solely on the mean and variance of a single variable. Commonly used univariate methods are t-test and analysis of variance [19]. Multivariate methods constitute a broad category that can be further divided into two types of data analysis: supervised and unsupervised.
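As an illustration of the univariate approach, the following is a minimal sketch of a two-sample comparison for a single metabolite using Welch's t statistic (a variant of the t-test that does not assume equal variances). The function name and the concentration values are hypothetical and are not taken from any study cited here.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical relative concentrations of one metabolite in two groups.
early = [1.0, 1.2, 0.9, 1.1, 1.0]
metastatic = [1.5, 1.7, 1.4, 1.6, 1.8]
t_stat, dof = welch_t(early, metastatic)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Because several hundred metabolites may be tested in this way, some correction for multiple comparisons would normally be applied to the resulting p-values.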

Unsupervised analysis looks at the measured data on their own, to try to identify patterns. As such, the analysis is not biased by the results, and is more open to discovery of novel metabolites or patterns of metabolite presence or concentration. It can be used to look for inherent patterns or intrinsic clustering that occurs within the samples, without knowing any outcome data, and may be more appropriate in exploratory experiments. On the other hand, it often involves extremely large quantities of data, requiring complicated mining methods to extract meaningful peaks or patterns. Once patterns have been established, they can be tested in samples with known characteristics or outcomes, to see if the patterns offer genuine discriminating power, eg for diagnosis, prognosis, or prediction of response to treatment. Some examples are principal component analysis (PCA) [20], and the recently published KODAMA [21].
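To make the idea of unsupervised pattern finding concrete, the sketch below extracts the first principal component of a small data matrix using power iteration on the covariance matrix. This is a simplified stand-in for PCA as implemented in numerical libraries, written in plain Python so that it is self-contained; the data values and function name are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 of mean-centred data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each sample onto the component to obtain its score.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores


# Hypothetical data: 6 samples x 3 metabolites; no group labels are used.
samples = [
    [1.0, 2.0, 0.5],
    [1.1, 2.1, 0.4],
    [0.9, 1.9, 0.6],
    [3.0, 4.0, 0.5],
    [3.1, 4.2, 0.6],
    [2.9, 3.9, 0.4],
]
loadings, scores = first_principal_component(samples)
print([round(s, 2) for s in scores])
```

In this toy data the first three and last three samples fall on opposite sides of zero along the first component, which is the kind of intrinsic clustering an unsupervised method can reveal without outcome data.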

Supervised analysis involves obtaining data referenced to a known established control. This might be any number of previously identified metabolites. Statistical methods like multiple regression [22] or partial least squares discriminant analysis (PLS-DA) [23] and machine-learning techniques like artificial neural networks [24], random forest [25] and support vector machines [26] are used as supervised techniques in metabolomics [17, 27, 28].

One concern is that using established prognosis calculators to supervise and thus define the profile may risk developing yet another calculator of similar power, and thus no enhanced utility. Current prognostication based on tumor grade, size, biomarker status and nodal status, such as Adjuvant! Online, or even gene expression profiling, still misclassifies a significant proportion of patients, and it is for this very reason that improved techniques are being sought. Thus, unsupervised analysis must be the initial technique, rather than supervising with established risk factors. Then, to validate the result, the gold standard is to design large cohort prospective studies.

The science of measuring and interpreting correlations in metabolomics to infer significance and true inter-relatedness is itself an evolving field [29]. As more metabolomic data are obtained and understanding of pathways is improved, these can be shared on public networks to try to offer a comprehensive picture of human metabolism [30, 31]. The Human Metabolome Database, for example, is one of several databases, and lists approximately 7900 metabolites [32].

Challenges

The metabolic profile of an individual is not static, but rather in constant flux according to the constant variation in cellular processes in response to a number of factors, including normal homeostasis, exercise, diurnal rhythm, diet, hormones, and drugs [13]. This introduces many variables that can be difficult to control for. For example, certain metabolites can vary depending on how recently a person ate, or what time they took their regular medications. This creates increased noise in data acquisition, rendering the data difficult to interpret. Furthermore, even if these variables are controlled carefully at the experimental stage, reproducibility in the real world may be difficult, as patients may be less likely to comply with dietary or other lifestyle restrictions [33, 34].

The metabolome of an individual [35, 36] will also vary significantly from that of another, regardless of the presence or not of malignant disease [5]. This is because it can reflect any number of small inherent differences, including race, sex, age, comorbidities and gut microflora, as well as the factors mentioned above [37].

Thus we see that there can be both intra-patient and inter-patient variability (Fig. 2). Any putative biomarker, be it a single metabolite or a metabolic signature, must be reliably discernible through this background variation if it is to become a useful and robust tool.

Fig. 2

The metabolome consists of metabolites from all cellular processes, which are influenced by intrinsic and extrinsic factors. Metabolites produced by cancer cells are superimposed on this landscape.

No standard reference exists yet for metabolomics, due to the great inherent variability from one patient population to the next, and the complex variety of chemometric techniques that can be employed in analysis. As such, each new experiment in metabolomics that looks to differentiate two groups first requires a training set to establish the specific patterns and levels that are associated with the outcome of interest, such as disease relapse following adjuvant chemotherapy. Once this is achieved, it must then be tested against the remaining data, or against multiple subsets of the data, to validate these patterns as having genuine correlation with the outcome of interest. Examples of this in breast cancer research will be detailed in the next section.
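The training and validation step described above can be sketched as a simple random partition of sample identifiers. This is only an illustrative assumption about how such a split might be made; the function name, the 50 % fraction, the seed and the count of 80 samples are hypothetical, and in practice the split would usually be stratified by outcome.

```python
import random


def split_training_validation(sample_ids, train_fraction=0.5, seed=0):
    """Shuffle sample identifiers reproducibly and split them in two."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split repeatable
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical cohort of 80 samples, identified by integers 0..79.
train_ids, validation_ids = split_training_validation(range(80))
print(len(train_ids), len(validation_ids))
```

The essential point is that the validation samples play no part in building the model, so that performance on them is an honest estimate.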

Metabolomics in Breast Cancer

In breast cancer, as in other tumor streams, metabolomic research remains in the experimental stage, with as yet little translation into clinical application. A number of potential applications have been and continue to be explored (Table 1).

Table 1 Applications of metabolomics in breast cancer

Metabolites as Biomarkers

Metabolomic analyses have detected a number of potential biomarkers which could proceed to further validation. An example is the ratio of glutamine to glutamate in tumor tissue, where it has been shown to correlate with estrogen receptor (ER) status, tumor grade and overall survival [38]. This illustrates how a broader analysis allowed appreciation of the importance of examining more than one metabolite at once. Glutamine or glutamate levels individually bear only rough and unreliable correlation with cancer presence, yet this study demonstrates that their levels relative to one another become more informative. Whether this will lead to enhanced predictive or prognostic ability is yet to be assessed, but the hypothesis-generating ability is in itself valuable.

Prediction of Stage

Studies of NMR spectra of fine needle aspirates of suspected early breast cancer showed that malignant tissue, nodal involvement and tumor vascular invasion could be predicted with high accuracy [39], and that the spectra could be used to predict grade, ER and progesterone receptor (PgR) status, or axillary spread [40, 41]. Larger numbers are needed to validate these results, and unless the profiles can be shown to offer superior prognostication to current methods, clinical utility is debatable. Nevertheless, it is evidence that the metabolic signature reflects the aggressiveness of the phenotype.

Prediction of Treatment Effect

Prediction of response to neoadjuvant chemotherapy using metabolomic data has been achieved using combined MS and NMR data [42]. Levels of four metabolites, threonine, glutamine, isoleucine and linolenic acid, were identified that correlated strongly with pathologic complete response (pCR) following neoadjuvant chemotherapy. What remains unclear, however, are the metabolic pathways implicated in the changing metabolite levels, and their roles in cancer development and treatment response. Furthermore, the predictive benefit needs to be compared to that already offered by clinicopathological features to ensure it increases prediction power and confers a clinical advantage.

Early Detection of Recurrence

Recurrence can be predicted earlier with metabolomics than with standard approaches, as shown in a study by Asiago et al. [43]. The investigators combined both NMR and MS techniques to analyze stored patient sera from resected early breast cancer patients. Multiple samples over time were available for each patient. A number of metabolites were found to be strongly associated with relapse, and a model was developed that predicted relapse with a sensitivity of 86 % and a specificity of 84 %. Compared to detection by standard clinical means, the profile was able to detect recurrence 13 months earlier on average in 55 % of patients. Whilst to date there is no proven clinical utility for early detection of metastatic disease, early diagnosis of local recurrence is associated with a survival advantage [44], and these results are exciting. This could form a basis for further studies into the benefits of early initiation of treatment for relapsed disease. It could also allow early recognition of failure of adjuvant endocrine therapy, preventing continuation of futile treatment or indicating new intervention to counteract resistance.

Predicting Recurrence Risk in Early Breast Cancer

Several studies have been performed to test whether metabolomic profiles have any prognostic power in early breast cancer, in terms of predicting relapse. It is worth going into the details of some of these trials to illustrate the techniques required for metabolomic analysis, the limitations of the studies, and the potential benefits.

In the field of early breast cancer, the improvement in prognostication remains a priority. This is because current practice favors over-treatment of women with systemic therapy due to an inability to identify and isolate those for whom adjuvant treatment is more likely to be beneficial. We know from early studies that even in high-risk node-positive disease, a subset of these women will be cured with local therapy alone. Seminal studies performed by the Milan group [45] comparing the CMF combination (cyclophosphamide, methotrexate, 5-fluorouracil) to no adjuvant therapy in women with node-positive early breast cancer, with over 25 years clinical follow up, demonstrated that 22 % of these clinico-pathologically high-risk women who had no adjuvant therapy remained disease free. Women with node-negative, ER-negative disease receiving no adjuvant treatment had higher survival of 40 %, with 20 years follow up [45]. Even allowing for the improved risk stratification offered by modern gene expression profiling, there is room for improvement: in the National Surgical Adjuvant Breast and Bowel Project (NSABP) B20 study comparing chemotherapy plus tamoxifen to tamoxifen alone in women with node-negative, ER-positive, resected early breast cancer, those with tumors classified as high-risk by the OncotypeDX 21 gene recurrence score had long term survival well over 60 % with tamoxifen alone [46, 47]. Today, many of those women would almost invariably be offered systemic therapy, and likely chemotherapy, with all the inherent risks and cost.

The search for biomarkers to improve stratification of patients with early breast cancer to detect those who will benefit from chemotherapy, and those for whom the toxicity outweighs the benefits, is vital. Current risk stratification relies on data taken from the biopsy and resected tumor: ER, PR, HER2, Ki-67, tumor grade, and extent of nodal involvement. Genome expression profiling has refined this, particularly in the node-negative cohort (Oncotype DX, 70 gene recurrence score), yet still a large proportion of women who were cured by surgery alone are not identified and are subsequently treated unnecessarily.

Common to these approaches is risk assessment based on features of the primary cancer alone, once it has been removed. Whilst offering clear prognostic benefit as surrogate markers, these may not reflect the biology of residual disease. In the postoperative setting in breast cancer, the decision to offer adjuvant therapy is based on the likelihood of relapse, which in turn is linked to the presence of micrometastatic disease, the residual tumor cells which may be genetically or phenotypically different from the primary cancer, and thus the cells that need to be addressed. Circulating tumor cells (CTC) or disseminated tumor cells may offer a more targeted approach, and are known to confer a worse prognosis [48]. However, detection and collection in the non-metastatic setting is difficult, such cells may still not be representative of all remaining cancer cells, and this approach may still fail to appreciate the host response.

Metabolomics offers a unique perspective, as it takes into consideration signals from the host, the tumor microenvironment, and the tumor cells themselves, as well as any interactions between them. This residual pool of cancer cells, and the host response to them, may result in a detectable change in the metabolic profile that might differentiate those who are likely to be cured by surgery alone from those who are more likely to relapse. It is for this reason that metabolomics may provide complementary and possibly more comprehensive information that could be added to current stratification models and aid in prognostication.

Establishing the Metastatic Metabolomic Signature

An initial test of the hypothesis that such signatures may be detectable and discriminating was performed by our group using one-dimensional proton NMR spectra of serum samples [49]. Forty-four patients with early breast cancer had serum taken for metabolomic analysis both pre- and postoperatively. As a control, 51 patients with advanced breast cancer also had serum taken. The aim was to see if serum metabolic profiles of early breast cancer patients differed from those with advanced disease; whether this changed after surgery; and whether the profiles could be used to generate a risk score that had prognostic power comparable to an existing prognosis calculator (Adjuvant! Online). A further 45 patients with early disease provided a postoperative blood sample that would be used as a validation series, ie to determine if risk scores generated in a new postoperative group have a similar correlation with prognosis compared to the initial group, demonstrating reproducibility and validity.

Once spectra were obtained from the serum samples, a series of analytical steps was required to allow meaningful comparisons, including data reduction using orthogonal projection to latent structure (OPLS), a technique used to convert each spectrum to a single point on a two dimensional graph to allow simple comparison of the different fingerprints. This demonstrated significant separation of the preoperative and metastatic groups into distinct clusters, illustrating that the fingerprints did indeed differ from one population to the other to varying extents. Double cross validation was then used to assess prediction ability of the model, showing a discrimination sensitivity of 75 %, specificity of 69 %, and predictive accuracy of 72 %, with some patients with metastatic disease being consistently misclassified as early, and some early patients as metastatic.

A ‘metabolomic risk score’ was then established for each early breast cancer patient based on how much their profile resembled the metastatic profile, measured as an inverse function of the distance to the barycentre of the metastatic cluster. In other words, the more the fingerprint resembled that of patients with metastatic disease, the higher the risk score. This is based on the premise that the presence of the primary and/or micrometastatic disease is more likely to yield a metastatic profile, and that its presence makes relapse more likely. High metabolomic risk score in preoperative patients was found to be highly correlated with misclassification as metastatic.
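The notion of a score that rises as a profile approaches the metastatic cluster can be sketched as follows. The text specifies only that the score is an inverse function of the distance to the barycentre; the particular form 1 / (1 + distance) used here, the two-dimensional coordinates and the function names are assumptions made for illustration and are not the published method.

```python
import math


def barycentre(points):
    """Coordinate-wise mean of a set of points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def metabolomic_risk_score(point, metastatic_points):
    """Score in (0, 1]; higher means closer to the metastatic barycentre.

    The form 1 / (1 + d) is an assumed inverse function of distance.
    """
    centre = barycentre(metastatic_points)
    distance = math.dist(point, centre)  # Euclidean distance
    return 1.0 / (1.0 + distance)


# Hypothetical 2D projections of metastatic sera.
metastatic_cluster = [(2.0, 1.0), (3.0, 2.0), (4.0, 0.0)]
near = metabolomic_risk_score((3.0, 1.5), metastatic_cluster)
far = metabolomic_risk_score((-1.0, 1.0), metastatic_cluster)
print(round(near, 3), round(far, 3))
```

Any monotonically decreasing function of distance would preserve the ranking of patients; only the choice of threshold would change.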

The metabolomic risk based on the preoperative serum was then compared to the 10-year breast cancer mortality estimate from Adjuvant! Online for each patient, with an arbitrary threshold of 10 % 10-year mortality risk separating low- and high-risk groups. Here, concordance was low. However, once the primary tumor was removed, there was considerable change in metabolomic risk, with 86 % of patients initially assessed as having high metabolomic risk switching to low metabolomic risk, suggesting that the signal was coming entirely from the primary cancer in this group. Interestingly, 8 of 10 patients assessed as both high preoperative metabolomic risk and high Adjuvant! Online risk moved to low metabolomic risk postoperatively. Only 6 out of 21 patients with high Adjuvant! Online risk had high postoperative metabolomic risk.

When the same technique was repeated with the validation set (post operative serum samples), a similar pattern was observed, with high concordance of low metabolomic risk with low Adjuvant! Online risk, but only 32 % of high Adjuvant! Online risk patients showing high metabolomic risk. Thus we see that this metabolomic risk score generally classifies more patients as low risk.

Key points from this trial are that a detectable metabolomic signature is present in patients’ serum that can indicate the presence of breast cancer, and distinguish early from metastatic disease in a high proportion of patients. The shift in signature from a high-risk (metastatic) to low-risk following removal of the primary tumor in 86 % of patients supports this. Where a metastatic signature exists post-operatively, this is more likely to be associated with a high-risk status according to traditional measures, yet fewer post-operative patients overall are classified as high-risk. This has the potential therefore to offer greater discriminatory power in selecting those who are less likely to require adjuvant therapy.

What is missing from this trial however is follow-up data, which would offer far greater evidence of predictive power than comparison with another risk calculator. Simply using established prognosis calculators to validate the profile may risk developing another calculator of similar power, and thus will not enhance utility. Furthermore, the trial requires further validation in different patient cohorts.

Predicting Clinical Outcome

To these ends, Tenori et al. [50] performed a similar study in which they examined serum 1H-NMR metabolic profiles in both early and metastatic breast cancer patients, again with the aim to demonstrate that the spectra could differentiate between the two groups, and also to establish a risk score that might predict relapse. Importantly though, in this case there were clinical follow-up data for the patients with early breast cancer, which had to be available for a minimum of 5 years or until relapse. Serum samples were selected from a biobank at the Memorial Sloan Kettering Cancer Center (MSKCC) in New York in which left-over patient samples are stored for scientific use, with patient consent. Eighty samples from patients with early breast cancer were selected, with the criteria that they must have post-operative serum available, taken up to 90 days post surgery, but prior to commencing adjuvant therapy.

Ninety-five samples from patients with metastatic disease were obtained, and their NMR spectra acquired to create the metastatic fingerprint. The early stage group was split into two groups of 40 samples; the first half was used to generate reference spectral patterns for early disease and to develop a risk score (training set), and the other half was used to test the risk score for concordance and accuracy (validation set). The underlying hypothesis was that sera of patients with early breast cancer with micrometastatic disease would have metabolic fingerprints more closely resembling those of the metastatic cohort, and that these patients would be more likely to experience disease relapse. Ten out of 40 patients in the training set, and 11 out of 40 in the validation set, had documented evidence of relapsed disease.

Random Forest (RF) classification was used to classify samples as either metastatic or early, based on the spectra. This is an analytical technique that can take large numbers of variables into consideration, is less prone to error or over-fitting, and does not require cross validation. This was performed on three different spectra for each sample using different NMR techniques: NOESY1D, CPMG, diffusion-edited. Similar to the previous study, there was high accuracy in predicting early or metastatic status, with correct prediction in 84–87 % of cases across the three NMR techniques.

An RF risk score was generated, based on the risk of a specimen from a patient with early breast cancer being classified as metastatic, and this score was taken as an indicator for clinical relapse. The RF risk scores generated from each of the spectra were then compared to the known outcomes of the patients using receiver operating characteristic (ROC) analysis. CPMG spectra resulted in the greatest area under the curve (AUC) on the ROC curve (0.863), and were selected for use in the validation set. From here, a cut-off for the RF risk score was determined, aiming for maximum accuracy with appropriate sensitivity and specificity. The RF risk score of ≥ 53 was used, yielding sensitivity, specificity and accuracy of 90, 67 and 73 %, respectively for predicting likelihood of relapse.
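The ROC analysis and cut-off selection can be sketched as below. The AUC is computed as the proportion of relapsed/non-relapsed pairs in which the relapsed patient has the higher score, and the cut-off is chosen to maximise accuracy alone, which is a simplification of the criterion described above. The scores, outcomes and function names are hypothetical and do not reproduce the study data.

```python
def roc_auc(scores, relapsed):
    """AUC: chance that a relapsed patient outscores a non-relapsed one."""
    pos = [s for s, r in zip(scores, relapsed) if r]
    neg = [s for s, r in zip(scores, relapsed) if not r]
    # Count concordant pairs; ties contribute one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, relapsed):
    """Return (cut-off, accuracy), where score >= cut-off means high risk."""
    best = None
    for cut in sorted(set(scores)):
        correct = sum((s >= cut) == bool(r) for s, r in zip(scores, relapsed))
        accuracy = correct / len(scores)
        if best is None or accuracy > best[1]:
            best = (cut, accuracy)
    return best


# Hypothetical risk scores and relapse outcomes (1 = relapsed).
risk_scores = [20, 35, 40, 55, 60, 45, 70, 80]
outcomes = [0, 0, 0, 0, 1, 1, 1, 1]
auc = roc_auc(risk_scores, outcomes)
cutoff, accuracy = best_cutoff(risk_scores, outcomes)
print(auc, cutoff, accuracy)
```

In a clinical setting the cut-off would typically be weighted towards sensitivity, since missing a patient who will relapse is usually considered the more costly error.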

This CPMG risk score model was then applied to the validation set in a blinded analysis (ie without knowledge of the clinical outcome). Here the correlation between predicted relapse and actual relapse was high, with AUC 0.824, demonstrating that in this cohort the risk calculator was robust. Sensitivity was 82 %, specificity 72 %, and predictive accuracy 75 %. Nevertheless, 25 % of patients were misclassified, and, if used to dictate adjuvant chemotherapy decisions, 18 % of patients who would have relapsed would not receive adjuvant treatment.
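These percentages can be checked against the cohort size with a short calculation. The individual counts below are not reported in the text; they are inferred here as the only whole numbers consistent with 11 relapses among 40 validation patients and the quoted percentages, and should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # relapsed patients correctly flagged
    specificity = tn / (tn + fp)  # relapse-free patients correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (assumption): 11 relapsed and 29 relapse-free patients.
sens, spec, acc = classification_metrics(tp=9, fn=2, tn=21, fp=8)
print(f"{sens:.0%} {spec:.0%} {acc:.0%}")
```

Under these inferred counts, 2 of 11 relapsing patients (18 %) would be classified as low risk, and 10 of 40 patients (25 %) would be misclassified overall, matching the figures quoted.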

The model was tested further by comparing it to already-validated prognostic methods that employ clinicopathological features of the primary disease. On univariate analysis, tumor size, nodal status and RF score all had significant association with recurrence, but on multivariate analysis none remained significantly associated (tumor grade was not included, as all early cancers were grade 3). When compared to Adjuvant! Online in multivariate analysis, only RF score showed statistically significant association with relapse, indicating that the RF score offered prognostic power over and above that offered by Adjuvant! Online in this cohort.

There were some potential confounders in the trial, some of which were accounted for. First, when searching the MSKCC database for patients with early disease, only cases with ER-negative disease were selected for the relapse-free cohort, as 5 years follow up was deemed insufficient for ER-positive early breast cancer. No selection for ER status was made on the relapsed cohort or the metastatic cohort. Subsequent analyses showed that ER-positivity could not be predicted from the metabolomic spectra, and the authors concluded that differences in ER status between the early and the advanced breast cancer cohorts could not explain the observed results. This was further validated by confining the study to ER-negative patients only and repeating the analysis, subsequently achieving similar sensitivity, specificity and accuracy. Second, the time interval between surgery and blood sampling varied from 5 to 80 days, but again further analysis demonstrated that metabolomic spectra could not be used to differentiate early sampling (time interval < 30 days) from late (30–80 days).

Limitations

The first study controlled for a number of variables by confining the patient population to a single institution, and by taking blood samples specifically for metabolomic analysis after an overnight fast and with a diary of the previous day’s food intake and medication. This reduces a number of potential confounders, but in doing so also reduces the generalisability. Furthermore it lacked outcome data for its early patients, instead comparing its risk score stratification to standard clinicopathological prediction. But it served as a proof of concept.

The second study also used serum from a single institution, but here the serum had been stored for a variable length of time, and the study did not control for fasting state or time of blood collection. Whilst potentially confounding, this may render positive results more robust, as the likely effect of such variation is dilution or disguise of genuine metabolomic profile differences. More importantly, perhaps, a large proportion of the early breast cancer patients went on to receive chemotherapy, undoubtedly influencing the outcome data. Thus its predictive ability here may be limited to identifying those who are likely to relapse in spite of chemotherapy.

Other groups have demonstrated the presence of a metabolic signature from breast cancer. A similar study aiming to create a model to differentiate early and metastatic breast cancer using 1H-NMR spectra was performed by Jobard et al. [51], using a training cohort of 46 early and 39 metastatic breast cancer patients, and an independent validation cohort of 61 early and 51 metastatic breast cancer patients. Their model was reported to have even higher discriminating power. Crucially, however, serum samples for the early patients were taken preoperatively, ie with the primary cancer in situ. Thus it represents more a discriminator of tumor bulk than of tumor presence. Furthermore, it did not examine the model against any clinical outcome, and its utility in prognostication or prediction remains unknown. Common to all these trials is the problem of small numbers of participants.

Specific Metabolites

In each of the studies described, certain individual metabolites were identified that showed significant correlation with the presence of metastatic disease (Table 2). In the Tenori study, reduced serum histidine and increased glucose and lipids were significantly correlated with metastatic disease [50]. In the Jobard study, however, nine different metabolites were identified, which included low histidine [51], while glucose and lipids showed only a trend towards significance. Much greater reproducibility will be needed before any particular metabolite can be used clinically. Moreover, this tends to move away from the unique benefit of metabolomics, ie the consideration of the combined picture of tumor and host response. Many single metabolites, including amino acids, have been shown to correlate with the presence of cancer, yet none have proven discriminatory enough to be clinically meaningful [52, 53].

Table 2 The identified discriminating metabolites detected in four metabolomic studies. Note the low rate of concordance between studies. MBC, metastatic breast cancer; NS, not statistically significant

Further Trials

These exploratory trials give support to the potential of metabolomics in the detection of micrometastatic disease and the prediction of relapse, but require further validation in larger cohorts. A proposed trial by our group aims to repeat the experiment performed by Tenori et al. using a larger data set. Serum samples from some 600 early (post-operative) and metastatic breast cancer patients with documented follow up data from a number of centres will be analysed, a risk score generator created, and prediction of outcome compared to actual clinical outcome. While aiming to achieve similar results to the first study and demonstrate reproducibility, it will also shed light on transferability to other populations.

Metastatic Breast Cancer

Studies of metabolomics within metastatic breast cancer have been less productive. This is likely in part due to the greatly increased mutational load and heterogeneity in advanced disease, which lead to far more complex, variable and inconsistent metabolic profiles. Another study by Tenori et al. [54] aimed to predict responses to treatment based on changes in metabolomic profile before and after treatment, but was unable to demonstrate any discriminatory power. In a small subset of HER2-positive patients, metabolomic analysis was able to predict response to lapatinib plus paclitaxel, but the results in this cohort were discouraging.

A proposed investigation will aim to study the serum metabolic profiles of a large cohort of metastatic breast cancer patients as part of a much broader prospective longitudinal cohort study, and follow their progress over time. It is hypothesised that metabolomic analyses may demonstrate prognostic or predictive power for response to therapy and disease time course, identify novel biomarkers and help to refine data derived from ‘upstream’ analysis such as gene expression profiling.

Conclusions

Metabolomic studies in breast cancer have shown that a metabolic signature of cancer exists and can be detected in patient serum. Metabolomics has the potential to allow early identification of relapsed disease, predict likelihood of relapse, and act as a biomarker of disease activity and response to treatment. It is limited by its complexity, requiring high-cost specialised equipment and analysis, which may hinder its progress into larger patient population studies, while retrospective analysis of completed clinical trials is frequently unfeasible.

It would be ideal, for example, to go back to early placebo controlled trials in the adjuvant treatment of early breast cancer to assess differences between the metabolomic spectra of those who were cured with surgery alone and those who relapsed. Unfortunately, of course, this is not possible for a number of reasons, not least of which is a lack of stored serum. Given this barrier, one may conclude that it will be impossible to develop evidence strong enough to convince clinicians and patients to ignore a traditional ‘high-risk’ assessment, and forgo adjuvant therapy, based on a novel risk score without the backing of a placebo controlled trial, and that such a trial would be ethically impossible. The dream of sparing ‘cured’ patients adjuvant therapy, at least by metabolomic methods, may indeed be unattainable.

A more achievable goal may be to focus on the lower risk groups who would traditionally forgo adjuvant chemotherapy, and attempt to predict relapse. A prospective study could then assess the benefit of adding adjuvant chemotherapy to those deemed more likely to relapse. For example, future studies might combine genomic risk with metabolomic risk in patients with ER-positive early breast cancer, and observe for differences in outcome between those assessed as low genomic and low metabolomic risk, and those with low genomic but high metabolomic risk, all treated with adjuvant hormone therapy alone. In this way it may be seen if metabolomics offers complementary risk stratification power.

For now, in this field at least, metabolomics remains exploratory, until a robust algorithm for analysing metabolic spectra can be achieved that accurately predicts both the presence of cancer and the clinical outcome, and is resistant to the influence of the multitude of normal variables that impact the metabolome. Only then can it be prospectively validated as a meaningful tool to aid in risk stratification and decision making about adjuvant therapy.