Introduction

The goal of treatment in major depressive disorder (MDD) is complete remission (Keller 2003). Anti-depressant pharmacotherapy is one of the key strategies, especially in the treatment of severely ill patients. However, after the initial trial of anti-depressant treatment with selective serotonin reuptake inhibitors (SSRI), remission is achieved in only 30 % of patients (Rush et al. 2006). Furthermore, a substantial number of patients do not respond to multiple trials of anti-depressants, develop a chronic course of disease and become treatment resistant. Chronically depressed patients have a significant quality-of-life impairment, and a chronic course of disease is associated with a high socio-economic burden (Rapaport et al. 2005). In order to assess resistant patients during the course of anti-depressant treatment, different staging models have been proposed. These models vary considerably in their definition and measurement of treatment-resistant depression (TRD) (Fava 2003; Ruhé et al. 2012; McIntyre et al. 2014; Trevino et al. 2014). In addition, none of the existing models is routinely used by clinicians.

Initially, Thase and Rush introduced a simple system for staging anti-depressant resistance, the Thase and Rush model (TRM), aimed at clinical psychiatrists managing non-responders (Thase and Rush 1997). The model consists of five stages, usually beginning with an SSRI as first-line intervention. The model has been used and reviewed extensively (Fava 2003; Nemeroff 2007; Ruhé et al. 2012). Another approach, the Maudsley Staging model (MSM), attempted to overcome some disadvantages of the hierarchical TRM, for example, by incorporating electroconvulsive therapy (ECT) as a non-hierarchical item (Fekadu et al. 2009). Furthermore, the authors tried to take into account both severity of disease and duration of illness (see Supplementary Figure 3).

Other staging models such as the European Staging Model or the Massachusetts General Hospital Staging Model were not analysed in the present study. These models propose a minimum trial duration of 6 weeks, whereas Thase and Rush originally proposed a 4-week trial duration as minimum trial length. The national and international guidelines recommend treatment modification after 4 weeks of treatment without response (NCCMH 2010; Deutsche Gesellschaft für Psychiatrie 2015). Therefore, the adequate trial length for a treatment trial without response was considered to be 4 weeks in our study. Furthermore, compared to the European Staging Model and the Massachusetts General Hospital Staging Model, only the MSM includes both duration of illness and severity of depression (Ruhé et al. 2012).

In the absence of biomarkers, both staging models addressed TRD solely on the basis of clinical information. In order to sharpen our diagnostic assessment and to individually target our treatment strategies, there is an unmet need for valid blood-based biomarkers (Chan et al. 2014; Niculescu et al. 2015; Bahn and Chan 2015).

Several studies have investigated molecules and genes associated with treatment response as well as TRD. Catechol-O-methyltransferase (COMT), brain-derived neurotrophic factor (BDNF) and the serotonin transporter (SLC6A4) gene polymorphisms have been investigated and may be associated with TRD and treatment response (Baune et al. 2008; Schosser et al. 2012). However, the analysis of other candidates, for example cyclic adenosine monophosphate response element binding (CREB1) or dystrobrevin binding protein 1 (DTNBP1) gene, has not shown significant association with TRD (Schosser et al. 2012). Furthermore, a recently published large genome-wide association study investigating genetic variation that may contribute to response to SSRI treatment has failed to show any significance at the genome-wide level (Biernacka et al. 2015).

Cytokines and other proteins involved in inflammation such as Interleukin 6 (IL-6) and tumour necrosis factor (TNF) may be implicated in the response to treatment with anti-depressants and the development of TRD (Lanquillon et al. 2000; O’Brien et al. 2007; Powell et al. 2013). A possible role of TNF in TRD has even led to clinical trials investigating the effect of TNF-alpha antagonists such as infliximab as a monotherapy or an add-on medication in patients suffering from TRD (Raison et al. 2013; Schmidt et al. 2014). In summary, molecular markers that enhance clinical staging, a blood test for response prediction and a means of identifying patients at high risk of developing TRD are still lacking.

Apart from the above-mentioned methods, high throughput proteomic techniques such as mass spectrometry may offer an alternative to discover blood-based protein biomarkers for TRD and treatment response (Chan et al. 2014). Moreover, a recently developed mass spectrometric analysis method known as selective reaction monitoring (SRM), which makes use of triple quadrupole mass spectrometry, provides a new tool for the quantitative and highly specific detection of pre-selected analytes in complex biological samples such as human serum (Lange et al. 2008; Picotti and Aebersold 2012). With this in mind and to extend the previous work on TRD biomarkers, we set out to identify serum biomarkers of TRD using different high throughput proteomic platforms. We hypothesized that molecular changes would be detectable across the different staging groups and tested this hypothesis by comparing two clinical models of TRD, the TRM and the MSM, using an inpatient cohort of 65 individuals suffering from TRD.

Materials and methods

Study participants and sample collection

This study was approved by the ethics committee of the medical association Westphalia-Lippe, Germany (reference 2009-019-f-S). After study procedures had been fully explained, subjects provided written informed consent.

Sixty-five patients with detailed clinical information were selected from a cohort recruited for the EU-funded MoodInflame project, aimed at early diagnosis, treatment and prevention of mood disorders by targeting the activated inflammatory response system (reference 222963) (for more information see http://moodinflame.eu). All patients were inpatients from three different centres who had been diagnosed with MDD and were taking anti-depressant medication at the time of sample collection. Patients were screened and included at any time during their in-hospital treatment. All patients met the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition, Text Revision (DSM-IV-TR) criteria for MDD. Clinical tests, including administration of the Inventory of Depressive Symptomatology (IDS-C30), were performed by psychiatrists in compliance with good clinical practice to minimise variability at inclusion. All patients were symptomatic at inclusion (IDS-C30 >12). MDD patients with other psychiatric co-morbidities were identified and screened for personal or family history of neuropsychiatric disorders using M.I.N.I. (German Version 5.0.0). Remission was defined as an IDS-C of ≤11 at discharge and partial remission as an IDS-C of >11 at discharge (Trivedi et al. 2004; Huijbers et al. 2012).

Staging of patients

All patients were staged using the TRM as well as MSM. For the TRM, patients were strictly staged according to definitions of the stages in the original publication by an experienced clinician (see Supplementary Figure 1) (Thase and Rush 1997). An adequate treatment trial was defined as a period of at least 4 weeks with a moderate dose in line with national guidelines. Only clinical information concerning the current episode was used for the TRM and MSM. This included a period of 8 weeks before admission for which reliable data of medication compliance could be collected for every patient. MSM stages were assigned based on the total score meaning mild (scores = 3–6), moderate (scores = 7–10) and severe (scores = 11–15) treatment resistance (see Supplementary Figure 2) (Fekadu et al. 2009). In general, medication only received a score if it was given at adequate dose for at least 6 weeks. Therefore, some patients treated with tricyclic anti-depressants (TCA) or ECT at the time of blood collection did not receive a score because the trial length of the TCA or previous antidepressants was not sufficient for classification purposes.
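The MSM score bands quoted above can be written down as a small lookup. The following Python sketch is illustrative only; the function name `msm_severity` is hypothetical, and the valid range of 3 to 15 is taken from the bands given in the text.

```python
def msm_severity(total_score: int) -> str:
    """Map an MSM total score to the resistance band quoted in the text.

    Bands: mild = 3-6, moderate = 7-10, severe = 11-15 (Fekadu et al. 2009).
    """
    if not 3 <= total_score <= 15:
        raise ValueError("MSM total score is expected to lie between 3 and 15")
    if total_score <= 6:
        return "mild"
    if total_score <= 10:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    # Print the band for every admissible total score.
    for score in range(3, 16):
        print(score, msm_severity(score))
```

A total score of 11 is the lowest value in the severe band.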

Since only one patient reached stage III resistance with a score of 11, the patient was included in the group of patients with stage II resistance (this patient was not an outlier in the statistical analysis). Important patient demographic and clinical characteristics such as gender, age, BMI, smoking, alcohol consumption, chronic illness, family psychiatric disorder, depression severity (IDS-C) and use of non-psychiatric medication as well as psychiatric comorbidities were compared for both models in order to detect significant differences between the staging groups.

LC-MSE analysis

Serum samples were randomized and quality controls were added, before depletion of the most abundant proteins using a MARS14 (Agilent, USA) on an ÄKTA™ purifier UPC 10 chromatography system (GE Healthcare, UK) as described previously (Jaros et al. 2013).

A tryptic digest was performed after depletion and the samples were stored at −80 °C until LC-MSE analysis was performed. Using a Waters quadrupole time of flight (QTof) Premier mass spectrometer, LC-MSE analysis was carried out by running every sample in triplicate, as described by Levin et al. After data processing with ProteinLynx Global Server v.2.5 (Waters Corporation) and Rosetta Inpharmatics Biosoftware Elucidator v.3.3 (USA), protein identification was performed as described previously (Stelzhammer et al. 2014).

SRM mass spectrometry

Samples were analysed using a targeted SRM mass spectrometry approach on a predetermined set of peptides as described previously (Lange et al. 2008; Gottschalk et al. 2014). Briefly, a Xevo TQ-S mass spectrometer (Waters Corporation) was coupled online through a New Objective nanoESI emitter (7-cm length, 10-µm tip; New Objective) to a nanoAcquity nano-ultra-performance liquid chromatography system (Waters Corporation). Peptide selection was done for candidate proteins identified previously by our lab and others (see Supplementary Table 1) (Penninx et al. 2003; Hummel et al. 2011; Stelzhammer et al. 2014). For each target peptide, a heavy isotope-labelled internal standard (JPT Peptide Technologies GmbH) was spiked into the peptide mixture for accurate quantification and identification.

Transitions were calculated and selected using Skyline v2.5 (MacLean et al. 2010). Each transition corresponded to singly charged y-ions from doubly or triply charged precursor ions in the range of 350–1250 Da. Method refinement was performed on quality control samples in order to select the peptides with the maximal intensities and highest spectral library similarity (dot p > 0.9). In a further development step, heavy-label spiked quality control samples were analysed in scheduled SRM mode to confirm identity via co-elution, extract the optimal fragment ions for SRM analysis, obtain accurate peptide retention times and optimise collision energy and cone voltage for the quantification run using Skyline software (MacCoss Lab Software; MacLean et al. 2010).

Multiplex bead-based immunoassay

A High Sensitivity Hu Cytokine-T cell kit (MAGPX10223002) from Millipore was used as described in the manufacturer’s protocol. In summary, samples were thawed and antibody-immobilized beads were prepared. After preparing standards and buffers, the plate was prepared by washing it with a wash buffer. Samples, standards and beads were added to the appropriate wells and incubated overnight at 4 °C. Contents were then removed and detection antibodies added. Following a 1-h incubation at room temperature (RT), streptavidin-phycoerythrin was added and incubated for another 30 min at RT. Finally, drive fluid was added and the plate was run on a Luminex MagPix Plate Reader.

Statistics

All statistical analyses were performed in R (http://www.R-project.org/) (R Development Core Team 2013).

LC-MSE and SRM data analysis

The processed and normalised LC-MSE data were log transformed to stabilize variance, and quality controls (QCs) were assessed. Peptides with over 30 % missing values and peptides with missed cleavages were excluded. Sample outliers were examined using principal components analysis (PCA) (Beniger et al. 1980) and through inspection of quantile-quantile (Q-Q) plots.
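The missing-value filter and log transformation can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name `filter_and_log`, the use of `None` for a missing value, the base-10 logarithm and the example intensities are all assumptions made for illustration.

```python
import math


def filter_and_log(intensities, max_missing=0.30):
    """Drop features with too many missing values, then log10-transform the rest.

    intensities: dict mapping a feature name to a list of raw intensities,
                 one per sample, with None marking a missing value.
    max_missing: features whose missing fraction exceeds this are excluded.
    """
    kept = {}
    for feature, values in intensities.items():
        missing_fraction = sum(v is None for v in values) / len(values)
        if missing_fraction > max_missing:
            continue  # more than 30 % missing: exclude the feature
        # Missing values stay missing; observed values are log10-transformed.
        kept[feature] = [None if v is None else math.log10(v) for v in values]
    return kept


if __name__ == "__main__":
    # Hypothetical raw intensities for two peptides across four samples.
    raw = {
        "peptide_A": [100.0, 1000.0, None, 10.0],  # 25 % missing, kept
        "peptide_B": [None, None, 5.0, 7.0],       # 50 % missing, excluded
    }
    print(filter_and_log(raw))
```

The threshold is applied strictly, so a feature with exactly 30 % missing values is retained, in line with the wording "over 30 %".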

The SRM data were pre-processed using the R package MSstats (Clough et al. 2012). The data were log2 transformed and quantile normalisation was applied to remove systematic bias between MS runs. The resulting profile, QC and condition plots were carefully inspected to identify potential sources of variation for each protein, evaluate any systematic bias between MS runs and assess the variability of each condition per protein, respectively. Transitions with over 30 % missing values were excluded. Sample outliers were examined as described above.
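For readers unfamiliar with quantile normalisation, the idea can be sketched in a few lines of Python. This is a textbook version for complete data and is not the MSstats implementation; the function name `quantile_normalise` and the example values are hypothetical.

```python
def quantile_normalise(runs):
    """Force every run onto the same empirical distribution.

    runs: list of equal-length lists of (log-transformed) intensities,
          one list per MS run, without missing values.
    Ties within a run are broken by original position (stable sort).
    """
    n = len(runs[0])
    if any(len(run) != n for run in runs):
        raise ValueError("all runs must contain the same number of values")

    # Reference distribution: mean of the k-th smallest value across runs.
    sorted_runs = [sorted(run) for run in runs]
    reference = [sum(s[k] for s in sorted_runs) / len(runs) for k in range(n)]

    normalised = []
    for run in runs:
        # Indices of the run ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: run[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = reference[rank]  # replace value by reference quantile
        normalised.append(out)
    return normalised


if __name__ == "__main__":
    print(quantile_normalise([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalisation every run contains the same set of values, so systematic shifts between runs are removed while the rank order within each run is preserved.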

For both the LC-MSE and SRM data, protein-level quantification and testing for differential abundance between the staging groups were performed using a random intercept linear mixed-effects model, as implemented in the R package nlme (Pinheiro et al. 2009). Random intercepts for subjects nested within recruitment centres were specified for each model to account for the hierarchical structure of the data. The confounding effects of patient demographic characteristics such as age, BMI, gender, smoking status, alcohol use and chronic illnesses were accounted for in each model as fixed effects. The false discovery rate was controlled using the Benjamini-Hochberg procedure (Benjamini and Hochberg 1995).
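The Benjamini-Hochberg adjustment itself is simple enough to sketch. The Python function below computes adjusted p values in the usual step-up form; the function name and the example p values are hypothetical and are not taken from the study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices ordered from the smallest to the largest raw p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values are monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p values for four proteins.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Because each raw p value is multiplied by the number of tests divided by its rank, nominally significant proteins can lose significance when many proteins are tested at once.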

Multiplex bead-based immunoassay data analysis

Immunoassay data were pre-processed to remove analytes with greater than 30 % missing values. Missing values were defined as measurements below or above the detection limits. The resulting data were log10 transformed. Logistic regression was applied with staging status as the outcome and each analyte as the predictor variable. The demographic variables listed above were made available for forward and backward stepwise selection (Hastie and Pregibon 1992), with selection based upon the Bayesian information criterion (BIC) (Schwarz 1978).
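As a reminder of how BIC-based selection trades fit against model size, the criterion and a single forward step can be sketched in Python. The function names, the log-likelihoods and the candidate covariates below are hypothetical; only the standard definition BIC = k ln(n) − 2 ln(L) is assumed.

```python
import math


def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_parameters * math.log(n_samples) - 2.0 * log_likelihood


def forward_step(current_bic, candidate_bics):
    """Pick the covariate whose addition gives the lowest BIC.

    candidate_bics: dict mapping a covariate name to the BIC of the model
                    with that covariate added.
    Returns the chosen covariate, or None if no candidate improves the BIC.
    """
    best = min(candidate_bics, key=candidate_bics.get)
    return best if candidate_bics[best] < current_bic else None


if __name__ == "__main__":
    n = 65  # cohort size reported in the text
    # Hypothetical log-likelihoods: analyte only versus analyte plus covariate.
    base = bic(-40.0, 2, n)
    candidates = {"age": bic(-39.5, 3, n), "BMI": bic(-39.8, 3, n)}
    print(base, candidates, forward_step(base, candidates))
```

In this hypothetical case neither covariate improves the log-likelihood enough to offset the ln(n) penalty per extra parameter, so the step adds nothing.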

Results

Demographic and clinical variables for the staging groups in both models were compared (Table 1). For the TRM, no significant differences existed for any of the covariates including age, gender, BMI, smoking and depression severity. For the MSM, only depression severity was significantly different between stage I (mean = 30.53, standard deviation ±8.15) and stage II (mean = 41.72, standard deviation ±9.50), which was due to depression severity being an item of the model itself (see Supplementary Figure 2).

Table 1 Patient demographic and clinical characteristics

Using logistic regression, neither model showed significant predictive capability to discriminate between the group of patients achieving remission and the patients with only partial remission at discharge (for TRM log(odds) = −0.89, p = 0.119; for MSM log(odds) = −1.16, p = 0.067). Although the predictive validity of the MSM has been shown previously, this could not be replicated in our study.

Using a non-hypothesis-driven label-free LC-MSE approach, a number of proteins were identified to be significantly different between the staging groups in both models (Table 2, Supplementary Table 2 for the list of all detected proteins). Based on a Gene Ontology (GO) term analysis, these proteins were mainly involved in the biological process of acute phase response, complement activation, coagulation and oxygen transport (The Gene Ontology Consortium 2000; The Gene Ontology Consortium 2015). For the TRM stage comparison, a total of eight proteins could be detected as significantly different: Serum amyloid P-component (β = 0.035, p = 0.008, FC = 1.08), Ficolin-3 (β = 0.033, p = 0.021, FC = 1.08), C4b-binding protein beta chain (β = 0.053, p = 0.023, FC = 1.13), C4b-binding protein alpha chain (β = 0.030, p = 0.024, FC = 1.07), complement C1q sub-component subunit C (β = 0.026, p = 0.037, FC = 1.06), Histidine-rich glycoprotein (β = −0.053, p = 0.024, FC = 0.88), nuclear factor of activated T-cells (β = 0.047, p = 0.027, FC = 1.11) and Beta-Ala-His dipeptidase (β = −0.053, p = 0.049, FC = 0.88). Interestingly, almost all of the proteins were involved in acute phase response, complement activation and coagulation.

Table 2 Results of the non-hypothesis driven label-free LC-MSE approach

For the MSM stage comparison, ten proteins were significantly changed: heparin co-factor 2 (β = −0.023, p = 0.004, FC = 0.95), plasma serine protease inhibitor (β = 0.034, p = 0.012, FC = 1.08), anti-thrombin-III (β = 0.039, p = 0.023, FC = 1.09), interleukin-1 receptor accessory protein (β = 0.067, p = 0.026, FC = 1.17), complement factor D (β = 0.059, p = 0.037, FC = 1.15), haemoglobin subunit alpha (β = −0.400, p = 0.008, FC = 0.40), haemoglobin subunit beta (β = −0.227, p = 0.017, FC = 0.59), putative post-meiotic segregation increased 2-like protein 11 (β = 0.099, p = 0.004, FC = 1.26), calcium-binding protein 5 (β = −0.068, p = 0.023, FC = 0.85) and cytosolic beta-glucosidase (β = 0.051, p = 0.039, FC = 1.12).

At the protein level, no overlap could be detected between the MSM and TRM stage comparisons. However, the top GO terms were similar in both comparisons. Except for the haemoglobin subunits alpha and beta, which are involved in oxygen transport, fold changes were mainly subtle. After correction for multiple testing using the Benjamini-Hochberg method, no protein remained statistically significant (adjusted p value >0.05).
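The relation between the reported regression coefficients (β) and fold changes (FC) can be checked with a short calculation. The Python sketch below assumes that FC is the back-transformed coefficient, FC = base^β, with base 10 for the LC-MSE data. This base is inferred from the reported numbers and is not stated in the methods, so it should be read as an assumption; the function name `fold_change` is hypothetical.

```python
def fold_change(beta, log_base):
    """Back-transform a coefficient estimated on a log scale to a fold change."""
    return log_base ** beta


if __name__ == "__main__":
    # Coefficients reported for the MSM stage comparison (LC-MSE data).
    reported = {
        "haemoglobin subunit alpha": -0.400,
        "haemoglobin subunit beta": -0.227,
        "putative post-meiotic segregation increased 2-like protein 11": 0.099,
    }
    for protein, beta in reported.items():
        # Assumed base 10; compare the output with the FC values in the text.
        print(protein, round(fold_change(beta, 10), 2))
```

Under this assumption a coefficient of −0.400 corresponds to abundance falling to roughly 40 % of the reference level, whereas a coefficient of 0.099 corresponds to an increase of roughly a quarter.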

A multiplex bead-based assay was used to further elucidate possible changes in inflammatory proteins previously implicated in depression (Penninx et al. 2003; Kaestner et al. 2005; Simon et al. 2008). A total of seven analytes were included in the analysis (IFN-γ, IL12p70, IL-4, IL-6, IL-7, IL-8, TNF-α). In the TRM comparison, significant differences in TNF-α levels between the groups could be detected (log(odds) = −4.95, p = 0.045). There were no significant differences for the MSM stage comparison (Table 3). In addition, no significant correlation could be shown between the MSM score and the analytes (data not shown).

Table 3 Results of multiplex bead-based assay

Furthermore, another mass spectrometry method, selective reaction monitoring (SRM), was used to analyse the samples. For the SRM assay, a panel of apolipoproteins and inflammation-related proteins implicated in major depression was selected (see Supplementary Table 1). These analytes were chosen based on previous findings by our group (Stelzhammer et al. 2014; Bot et al. 2015) and others (Hummel et al. 2011). In addition, the most significant proteins identified by label-free LC-MSE were selected for further validation. Significant changes could only be detected for the TRM stage comparison (Table 4). Three apolipoproteins, A-I (β = 0.029, p = 0.035, FC = 1.02), M (β = −0.017, p = 0.009, FC = 0.99) and F (β = −0.031, p = 0.024, FC = 0.98), as well as alpha-1-antichymotrypsin (β = 0.025, p = 0.032, FC = 1.02) were found to be changed. However, changes of ficolin-3, complement C1q sub-component subunit C and histidine-rich glycoprotein, which were significantly changed in the label-free LC-MSE experiment, could not be validated by SRM.

Table 4 Results of selective reaction monitoring analysis

Discussion

The objective of this study was to evaluate possible molecular phenotypes underlying TRD based on two clinical staging methods using proteomics. The use of proteomics for discovering blood-based biomarkers in different medical fields has progressed substantially over the last years (Hanash et al. 2008; Shao et al. 2015). New analysis methods such as SRM have been developed recently and offer novel ways of analysing a pre-determined set of proteins in a complex mixture like human serum across multiple samples (Picotti and Aebersold 2012). To our knowledge, this is the first study using SRM to detect pre-selected analytes in an MDD cohort suffering from TRD.

While group comparison in the TRM showed significantly changed proteins in all three assays, only the LC-MSE analysis showed significant changes in the MSM. GO term analysis revealed that the identified proteins were mainly involved in complement activation and coagulation. Up-regulation of the closely interacting proteins serum amyloid P component and the C4b-binding protein in the stage II TRM group might suggest an altered regulation of the complement system in more severely affected patients (García de Frutos and Dahlbäck 1994). Interestingly, changes of TNF alpha levels in the TRM using a multiplex bead-based immunoassay corresponded to results of previous studies, which showed a relationship between response and TNF alpha level decrease (Lanquillon et al. 2000; Strawbridge et al. 2015). SRM analysis revealed three apolipoproteins being changed in the TRM group comparison. A possible role of apolipoprotein changes during treatment of MDD has been suggested by previous studies (Sadeghi et al. 2011; Hummel et al. 2011).

A possible role of inflammation in the pathogenesis of MDD has been studied extensively in patients as well as in animal models of depression (Dantzer et al. 2008; Iwata et al. 2013). Furthermore, levels of inflammation-related genes predict lack of response to anti-depressants (Cattaneo et al. 2013). A recent meta-analysis revealed an association of IL-1β and IL-6 levels with suicidality (Black and Miller 2015). Anti-inflammatory drugs have also been found to antagonize the therapeutic efficacy of anti-depressant agents (Warner-Schmidt et al. 2011; Miller and Raison 2016). Therefore, further studies are needed to elucidate a potential relationship between inflammatory activation and different stages of TRD.

Some limitations of the study design have to be considered. The sample size for this pilot study was small and an independent validation cohort was not available; hence, larger prospective studies are warranted. Furthermore, the cohort of patients used in this study was not specifically recruited to assess all stages of TRD as defined by the two staging methods. Therefore, patients with a very high number of unsuccessful treatment trials (five or more) could be under-represented. Since the cohort did not include outpatients, no control group of responders to a first treatment trial was available. In addition, the TRM did not account for the use of augmentation strategies. Since medication had to be taken for a period of 6 weeks in the MSM, only 16 patients received augmentation therapy long enough to score >0. Robust clinical data offering information about response time to an augmentation strategy in TRD are not available, and this question needs further evaluation in specifically designed clinical trials (Keller 2005; Carvalho et al. 2007). Other staging models have been designed that were not analysed in this study, leaving open the question of whether different molecular phenotypes underlie the staging in these models (Ruhé et al. 2012).

In conclusion, our findings suggest that proteins involved in complement system activation, inflammatory response and lipid transport could be interesting candidates to stratify TRD at the molecular level. However, given that the molecular changes between the staging groups were subtle, the results have to be interpreted cautiously. With regard to the limitations of the study, these pilot data show the need for optimization of the clinical staging models by conducting prospective clinical trials. Advances in proteomic technologies in terms of analytical sensitivity and resolution as well as cost-effectiveness now allow for improved targeted molecular measurements. Such advances may offer a wider range of biomarkers potentially capable of allowing for stratification of molecular phenotypes underlying TRD.