Introduction

Multiple myeloma (MM) is a clonal B-cell disorder with the characteristics of excessive secretion of monoclonal immunoglobulin or light chains and lytic bone lesions [14]. Clinically, the patients with MM often presents with bone pain, renal failure, anemia, hypercalcaemia and susceptibility to infections [4, 5]. Just because MM has no specific clinical manifestations, especially in the early stage, it becomes difficult to make an accurate diagnosis and easy to commit a misdiagnosis or missed diagnosis [1]. As MM cells have resistance to conventional chemotherapy drugs, these drugs can only alleviate most patients’ clinical symptoms or prolong their life [3, 6, 7]. Although the past years have witnessed a great progress in understanding molecular pathogenesis and classification of MM [8] and in improving therapeutic effects due to novel agents, such as thalidomide and bortezomib or new therapies like autologous stem-cell transplantation [911], the MM patients’ prognosis is still not so good [1214]. In clinical practice the key to improve the prognosis relies on early diagnosis. The discovery of tumor markers specific to MM will be of great significance to the early diagnosis and prognosis evaluation.

A previous study determined a panel of three markers (8131, 22752, 11660 m/z) over the mass range m/z 1000 to 50000 Da (optimized mass range 2000–15000 Da) using weak cation exchange magnetic beads (WCX) and MALDI-TOF-MS to analyze serum protein samples [15]. Apparently the three proteins have bigger molecular weight, >8000 Da. If the optimized mass range is down-regulated, it is possible to find novel tumor markers with smaller molecular weight.

MALDI-TOF-MS is a powerful and highly sensitive tool for the detection of large numbers of peptides and proteins, in particular the acquisition of complex profile spectra that reflect the status quo in biological mechanisms and cell pathways, and has been successfully applied to identify disease biomarkers by simply surveying human serum [16]. CLINPROT is an integrated set of tools provided by Bruker Daltonics for preparation, measurement and visualization of peptide and protein biomarkers in context to clinical proteomics. This system is consisted of magnetic bead-based sample preparation, MALDI-TOF MS acquisition, and a bioinformatics package for inspection and comparison of data sets as well as for the discovery of complex biomarker pattern models, which greatly facilitates the clinical proteomic studies [17]. Analogously to the previous clinical proteomic profiling studies [15, 1719], this study was to screen serum tumor markers of MM over the mass range m/z 700–10000 Da using the WCX and the ClinProTools software provided by Bruker Daltonics Inc.

Materials and methods

Study population

Profile spectra were obtained from 48 MM patients and 74 controls. All 122 serum samples were gathered from the Second Affiliated Hospital of Xi’an Jiaotong University and Xi’an Central Hospital from March 2008 to September 2010. The blood collection protocols were approved and the informed consent was obtained. The 48 patients (male/female 33/15, median age 61 years, age range 41–80) were newly diagnosed according to the diagnostic criteria for MM from the International Myeloma Working Group [2], including IgG M-protein type 28 patients, IgA M-protein type 8 patients, and monoclonal light chain type 4 patients, non-secretory type 8 patients (Electronic supplementary material). And the collection of these patients’ blood samples was done before chemotherapy. The 74 controls (male/female 50/24, median age 60 years, age range 45–75) were those who came to our two hospitals to undergo the healthy physical examination and had no any abnormal symptoms and results. The patients and the controls were subdivided into a training set (26 MM patients and 26 controls) for model generation and a blinded testing set (22 MM patients and 48 controls) for validation. All experiments were performed in triplicate.

Reagent and instrumentation

From Bruker Daltonics Inc. (Billerica, MA, USA) we purchased the CLINPROT System, including WCX magnetic beads kit, matrix, acetone, dehydrate alcohol, magnetic bead separator, AnchorChip target, MALDI-TOF-MS (Microflex® and its equipped software FlexControl2.2 and Flexanalysis3.O), and the analytic software ClinproTools2.2. The 200-μl Orcugen sample tubes were purchased from ORCUGENTM (USA).

Sample preparation

In all cases, 3 ml of fasting blood samples were gotten early in the morning and collected in 4 mL BD vacutainers without anticoagulants, and then centrifuged (2500 rpm, 10 min) at room temperature within 2 h. The supernatant serum was transferred into other marked centrifuge tubes and followed by centrifugation at 4°C (10000 rpm, 10 min). Each pooled serum sample was allocated into 5 tubes (100 μl/tube), frozen, and stored at −80 °C for future analysis. No sample underwent more than one freeze-thaw cycle before analysis.

The serum samples stored at −80 °C were thawed at room temperature. 5-μl serum samples were diluted in a 200-μL Orcugen sample tube by adding to 10-μl binding buffer and 10-μl WCX magnetic beads. This sample tube was allowed to stand for 5 min and then placed in a magnetic bead separator for 1 min, after which the supernatant was discarded carefully with a pipette. Subsequently we added to the tube 100-μl washing buffer, moved the tube to and fro in two adjacent holes of the magnet separator, placed the tube in the magnet separator to stand for 1 min so as to make the magnetic beads adhere to the wall of the tube, and then discarded the supernatant carefully using a pipette. This washing process was repeated twice. Following binding and washing, we added 5-μl elution buffer to the tube to elute the bound proteins from the magnetic beads, and then placed the tube in the magnet separator for 2 min, after which the supernatant was transferred into a 0.5-ml tube with 5-μl stabilization buffer. Binding, washing, elution, and stabilization buffer were all provided by Bruker Daltonics Inc., and the preparations above for prefractionation were done according to the Bruker’s instructions.

Proteomic analysis

In the proteomic profiling analysis, the serum samples from the MM patients and the controls were randomized, and the investigator was blinded to their identity. We prepared a matrix solution with α-Cyano-4-hydroxycinnamic acid (HCCA) of 0.3 g/l in ethanol:acetone 2:1 (prepare fresh every day). Then we diluted 1 μl of eluted sample in 10 μl of matrix solution. Each sample was spotted onto a MALDI-TOF AnchorChip(TM) target (600 μm anchor diameter). After air-drying at room temperature, the AnchorChip target was put into the Microflex, and then followed by calibration and data acquisition on an automated robotic platform (ClinProt Robot).

The ClinProTools(TM) software was used for data analysis, which begins with a raw data pretreatment, including normalization of a set of spectra derived from a patient cohort, internal signal alignment using prominent internal signal peaks, and a peak picking procedure. The whole pretreatment was automatically performed using default settings without any user interaction. The pretreated data were used for visualization and statistical analysis in ClinProTools. In this study, spectra were generated on a Bruker autoflex(TM) MALDI-TOF MS operated in the linear mode with a total of 300 laser shots summed per sample (2 × 30 shots from each spot multiplied by 5 spots). Flexanalysis 3.0 software was used to compile spectra and detect peaks. The m/z ratios between 700 and 10000 were selected as the target mass range for analysis because this range contained the resolved protein and peptides with smaller molecular weight. The m/z range between 0 and 700 was eliminated from analysis to avoid interference from adducts, artifacts of the energy-absorbing molecules, and other possible chemical contaminants. Peak statistics was performed by means of a Welch’s t test.

The ClinProTools(TM) provides a number of highly sophisticated mathematical algorithms that generate models to differentiate between samples from diseased patients and from controls. Mass spectra from the two model generation cohorts (26 MM patients and 26 controls) were analyzed using the Genetic Algorithm in ClinProTools(TM). The resulting optimized model was subsequently used to validate the 70 test samples.

Results

This study used the ClinProTools(TM) software to interpret the data of MALDI-TOF spectra derived from serum samples of the two groups. The samples have been prefractionated with MB-WCX, and this process was simple to be scaled-up and automated using a liquid handling robot as shown by Villanueva et al. [18]. We only used one control group, namely healthy controls without other disease controls, which enabled us to distinguish general healthy marker candidates from those that are specific for MM.

Exemplary MALDI-TOF mass spectra obtained from the samples after purification using MB-WCX are shown in Fig. 1. 89 peaks within the respective spectra were mass labeled in the range of 700–10000 Da using a signal to noise threshold of 3.0. The spectra obtained were highly reproducible, because the experiments for each sample were performed in triplicate.

Fig. 1
figure 1

Serum protein spectra of MM patients and controls. a Red curve shows the serum protein spectra of MM patients in the range 700–10000 Da, and b green curve the serum protein spectra of controls

ClinProTools(TM) offers a variety of viewer options for the analysis of clinical profiling data, e.g., Pseudo-Gel Viewer and Stack Plot Viewer. They can display the mass spectra for comparative visualization. The Stack view and pseudo-gel view of 52 mass spectra from the training data set is shown in Fig. 2. Peak heights and peak areas are independently calculated in the ClinProTools(TM) with Welch’s t test to determine the statistical separation strength of all the peaks and then to generate an output file. On the list of the output file there were 49 discriminating mass peaks with statistical significance (P < 0.0001) in the range of 700–10000 Da (Table 1).

Fig. 2
figure 2

Stack View and Pseudo Gelview of mass spectra in the training set. a The Stack View of 52 mass spectra from the training data set. Both groups (MM and healthy control) contain 26 samples that have been prepared in triplicate. This 3-dimensional stack view gives some hints concerning signal scattering in a sample class. b The Pseudo Gelview of the 52 Mass spectra, and all individual spectra are shown in a density scale

Table 1 Forty-nine discriminating m/z peaks between MM group and the control group

In the ClinProTools(TM) the peaks with high separation power in the Welch’s t test are used to generate a biomarker pattern model. Classical peak statistics, the genetic algorithm and other algorithms may be used independently or in combination. Once saved, pattern models may be validated in a cluster analysis with an independent test sample set. Here we used the Genetic Algorithm to determine the peaks with the highest separation power and to generate a diagnostic biomarker model. The ClinProTools(TM) provided us a model constructed with four peaks. The four peaks were 2900.4, 3315.96, 7763.24, and 2660.65 m/z respectively. The spectra of the four peaks shown in Fig. 3 reveal the significant differences between the peak areas and peak heights from controls and MM patients.

Fig. 3
figure 3

Four peaks with the highest separation power. The four peaks (2900.4, 3315.96, 7763.24, and 2660.65 m/z) are determined by Genetic Algorithm as the highest separation power to generate a diagnostic biomarker model. The protein spectra of the four peaks show the significant differences between the peak areas and peak heights from controls and MM patients

We established an independent and blinded test data set of sufficient cohort size (70 samples, including 22 MM patients and 48 controls) for validation. Validation determines the predictive capability of a pattern as a percentage of the correctly classified test data. The resulting profile model constructed in training data set was then used to validate the independent test set. This model correctly identified 19 MM patients from the 22 MM samples (86.36 % sensitivity), and correctly recognized 42 healthy controls from the 48 healthy samples as non-diseased (87.5 % specificity).

Discussion

Based on the 2003s epidemiological statistics of International Myeloma Working Group, MM is the second most common hematologic malignancy affecting at least 32000 new patients per year in USA and accounting for approximately 1 % of all malignant diseases and 10 % of hematologic malignancies [2]. However, pathogenesis of MM has not been fully understood. Tumor markers are usually the proteins produced by body in response to cancer growth or by cancer tissue itself. They may be detected in blood, urine, or tissue samples. The detection of tumor markers is of great significance for oncologic research and clinical practice. We hope we may lay a solid foundation for future researches by identification of some MM biomarker candidates.

We have completed a primary study concerning identification of MM biomarker candidates using CLINPROT(TM). The aim of this study was not to establish diagnostic biomarker patterns, because such a confirmation to a diagnostic biomarker pattern needs larger data set and identification via TOF/TOF fragment analysis to provide biological relevance to the statistical analysis. The aim of this study was just to find MM biomarker candidates over the mass range m/z 700 to 10000 Da.

The past decades saw a great development in the detecting methods of tumor markers. CLINPROT system is one of these detecting methods. Many previous studies showed us this system was advanced, reliable and efficient in tumor marker discovery [16, 2022]. In this study, we utilized the ClinProTools(TM) to detect the serum samples in the training data set and obtained 49 discriminating m/z peaks. And then we used the statistical features of ClinProTools to evaluate those peaks with seemingly high power to differentiate between the two classes by means of a Welch’s t test. The result of the Welch’s t test is a P value, which indicates the probability that the observed intensity differences of the individual peaks are not based on coincidence. Accordingly, the lower the P value, the better a respective peak signal is suited to be used to separate the two classes [17]. ClinProTools(TM) automatically selected the four peaks with the highest separation power (2900.4, 3315.96, 7763.24 and 2660.65 m/z) in the training set to build a biomarker model. And this model resulted in a sensitivity of 86.36 % and a specificity of 87.5 % in the validation in the testing data set. This means it is reliable the four proteins have the power to distinguish MM patients from the healthy.

Among the four peaks, 2900.4 m/z is obviously up-regulated and 7763.24 m/z down-regulated in the MM group. The differences in these two ranges between the two groups (MM/controls: 4.96/1.31 kDa in the range of 2900.4 m/z; 2.09/8.86 kDa in the range of 7763.24 m/z) are so great that we presume the two proteins might be unique to MM. But this opinion needs further confirmation. Because previous studies prove some tumor markers can be seen in several cancer types, and even some well-known tumor markers may also be elevated in non-cancerous conditions [2325].

In 2006, Bhattacharyya et al. reported they identified a set of four peaks with the discriminating power between MM patients with and without skeletal involvement using copper ion loaded immobilized metal affinity SELDIchip arrays in the mass range of 2000 to 20000 Da. The four peaks are 5802, 6639, 6443 and 2019 m/z [26]. They are different from the three protein peaks (8131, 22752, 11660 m/z) identified by Wang [15] over the range of 1000–50000 Da using WCX and MALDI-TOF-MS. By the three peaks Wang built a model that could correctly identify 87 % (26/30) MM patients and 100 % (34/34) healthy controls. Although this current research and Wang’s study basically adopted the same study design and the same magnetic beads, namely WCX, this current research captured a different panel of four peaks with smaller molecular weight (2900.4, 3315.96, 7763.24 and 2660.65 m/z). This situation should be resulted from that the mass ranges observed by the two studies are different. We have noticed there is a part of overlap in the optimized mass range of the two studies. Therefore we think to some extent this situation should also be related with the different algorithms for construction of diagnostic model and the different serum samples of MM patients. In Wang’s study part of serum samples of MM patients were collected after chemotherapy, but all MM serum samples in this study were collected before chemotherapy. One thing is common between the two studies in our opinion. It is that due to not setting relevant disease controls in both studies, neither Wang’s three biomarkers nor our four biomarkers can differentiate myeloma from other plasma cell dyscrasias.

The biggest limitation of this study exists in that we did not set disease controls with different sort of diseases in study design, which undoubtedly narrows the probability of clinical application of our novel findings. Another limitation is that we only used one type of magnetic beads (MB-WCX) to do the prefractionation rather than using multiple types of beads together as Ketterlinus [17] did, which might reduce the reliability of our findings to some extent. For the four novel biomarkers of MM we believe more researches are needed to disclose their corresponding function and structure and this will cast new insight into MM diagnosis and therapy.

Conclusion

Using CLINPROT system and MB-WCX we have screened the serum samples from MM patients and healthy controls and got four novel peaks to build a diagnostic model that achieved high sensitivity and specificity in separation of MM patients and healthy controls in an independent testing data set for validation. The four biomarkers are 2900.4, 3315.96, 7763.24 and 2660.65 m/z. Meanwhile, it may be confirmed that CLINPROT system is a powerful and reliable tool for clinical proteomic researches.