Introduction

Despite recent advances in the early detection, prediction, and treatment of micrometastatic disease in breast cancer patients, metastasis remains the major cause of death in affected individuals. Metastasis occurs in a series of discrete biological steps in which a single, frequently clinically occult micrometastatic cell travels from the primary tumor to a distant location, where it lodges and grows. By the time a distant metastasis is detected by traditional means such as imaging or serology, it is often incurable. These single circulating tumor cells (CTCs) therefore not only represent important clinical targets for adjuvant therapy, but their detection and quantification also have significant prognostic value. We previously demonstrated that the detection of CTCs in the peripheral blood by quantitative real-time PCR (QPCR) predicted a significantly worse overall survival and progression-free survival [1] in metastatic breast cancer patients. Likewise, detection of CTCs in the peripheral blood [2], or of disseminated tumor cells (DTCs) in the bone marrow [3] or lymph nodes [4], in early stage breast cancer patients is also associated with a significantly poorer outcome.

Methodologies to detect and quantify CTCs have frequently utilized protein markers differentially expressed on CTCs relative to the non-tumor cell background, such as haematopoietic cells of the peripheral blood or bone marrow. While automated methods of detection and quantification of CTCs labeled by immunohistochemical staining are improving, manual labeling can be time-consuming and identification of positively-stained CTCs can be subjective [5]. Moreover, these methods can be prone to false-positives and can lack sensitivity when the cells of interest are at very low frequency, as is invariably the case in peripheral blood [6]. In an attempt to increase sensitivity and concurrently decrease subjectivity in the enumeration of CTCs in cancer patients, increasing numbers of studies have investigated QPCR as a detection platform [7–12]. Despite this work, however, there has been little consensus as to the optimal analytical procedures for achieving this goal. Sample handling and preparation, target cell enrichment, marker gene selection, and data analysis techniques vary considerably between studies, and each has a marked effect on the quality of the data generated and the conclusions that can be derived from them. The work described herein therefore aimed to test various methods, from storage and sample processing to data acquisition and analysis, to find an optimal combination of techniques for an effective and practical QPCR-based platform for the detection of CTCs in the peripheral blood of breast cancer patients.

Materials and methods

Patient and healthy control groups

Blood (1 × 8 cc) was collected in tubes containing a Ficoll-Hypaque density fluid separated by a polyester gel barrier from a sodium citrate anticoagulant (BD Vacutainer® CPT™). Peripheral blood mononuclear cells (PBMCs) were isolated from the blood samples of 18 patients with advanced breast cancer (M1 disease, according to the Union Internationale Contre le Cancer criteria) during a routine follow-up visit to the Netherlands Cancer Institute (NKI)/Antoni van Leeuwenhoek Hospital, and also from 23 healthy female volunteers. All patients and volunteers gave informed consent, and the study was approved by the Medical Ethical Committee of the NKI.

Cell spiking

For sensitivity assays, different numbers of cells derived from a mixture of the five breast cancer cell lines MCF-7, T47D, BT474, SKBR3, and ZR75-1, chosen to represent a large proportion of the different histological and genetic breast cancer subtypes seen in the clinic, were spiked into peripheral blood samples from healthy female volunteers.

Tumor cell enrichment

Two methods for producing a sample enriched in CTCs were compared. In the first, PBMCs underwent an immunomagnetic enrichment using anti-CD326 (HEA/EpCAM) and/or anti-ErbB2 Micro Beads (MACS®, Miltenyi Biotec) as per the manufacturer's instructions. Briefly, beads were incubated with the PBMCs for 30 min at 4°C, after which labeled cells were collected on a magnetic separation column. After removal of the column from the magnetic field, the retained HEA+ and/or ErbB2+ cells were eluted and added to either cell lysis buffer (5 M Guanidine thiocyanate (Merck, Germany), pH 6.8, 0.05 M Tris, 0.02 M EDTA, 1.3% Triton) or RNA-Bee (Campro Scientific). For the second method, anti-CD326-FITC antibodies (Miltenyi Biotec) were incubated with PBMCs in a fashion analogous to that above, in labeling buffer (phosphate buffered saline (PBS) pH 7.2, 0.5% BSA, and 2 mM EDTA) for 10 min in the dark at 4°C. Cells were subsequently washed and sorted using the Moflo High speed cell sorter (Dako, Glostrup, Denmark).

mRNA isolation and cDNA synthesis

Two methods of RNA extraction and cDNA synthesis were compared. In the first, RNA was precipitated from the cell lysate and dissolved in lysis buffer from the μMACS™ One-step cDNA kit (Miltenyi Biotec). Oligo (dT) Micro Beads were added and the mixture placed onto the μMACS column in the thermoMACS™ Separator. cDNA synthesis proceeded as per the manufacturer's instructions. In the second method, RNA was extracted from cells lysed in RNA-Bee as per the manufacturer's instructions. The resulting mRNA was then used as a template for cDNA synthesis using the SuperScript III Reverse Transcriptase system (Invitrogen) primed by random hexamers.

Quantitative real time PCR

Based on the published genomic sequences of CK19, p1B, EGP-2, PS2, mammaglobin and SBEM, the sequences of the real-time quantitative PCR primers (Sigma Genosys, Cambridge, UK) and of the 5′-fluorescently FAM-labeled probes (Applied Biosystems, Nieuwerkerk a/d IJssel, The Netherlands) were designed using the Perkin Elmer Primer Express® software (PE, Foster City, USA) (Table 1). All primers were designed to be intron-spanning to preclude amplification of genomic DNA. To normalize relative levels of expression, commercially available primers and probes for the housekeeping genes β-actin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) were used (Applied Biosystems).

Table 1 Primer sequences used to amplify each of the six marker genes used to identify CTCs in peripheral blood

Serially diluted cDNA synthesized from the amplified RNA of 82 snap frozen breast cancer tissues was used to generate standard curves for control and marker gene expression measurements. For all cDNA dilutions, fluorescence was detected from 0 to 50 PCR cycles for the control and marker genes in singleplex reactions, which allowed the deduction of the CT-value for each product (CT-value (threshold cycle) being the PCR cycle at which a significant increase in fluorescence is detected due to the exponential accumulation of PCR products, represented in arbitrary units (TaqMan Universal PCR Master Mix Protocol, Applied Biosystems) [13]). The quantities found for the β-actin control and marker genes were used to calculate the relative quantity of control and marker gene expression in each sample. The second control gene, GAPDH, was only used for confirmation of β-actin expression. Each experiment was performed in triplicate. Quality control measures for the PCR reactions included the addition of a genomic DNA control and a negative non-template control.
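The standard-curve calculation described above can be illustrated with a minimal sketch. It assumes a simple least-squares line relating CT to log10 of the input quantity and a ratio of marker to β-actin quantities; the dilution series, CT values, and function names are hypothetical and are not taken from the study.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares line: CT = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to read a quantity (arbitrary units) from a CT."""
    return 10 ** ((ct - intercept) / slope)


def relative_quantity(marker_ct, control_ct, marker_curve, control_curve):
    """Marker quantity normalized to the control (housekeeping) gene quantity."""
    marker_q = quantity_from_ct(marker_ct, *marker_curve)
    control_q = quantity_from_ct(control_ct, *control_curve)
    return marker_q / control_q


# Hypothetical ten-fold dilution series (log10 of relative input) and CT values.
dilution_log10 = [0, -1, -2, -3, -4]
marker_curve = fit_standard_curve(dilution_log10, [20.0, 23.3, 26.6, 29.9, 33.2])
control_curve = fit_standard_curve(dilution_log10, [15.0, 18.3, 21.6, 24.9, 28.2])

# Hypothetical triplicate CT values for one sample, averaged before lookup.
sample_marker_ct = sum([26.5, 26.6, 26.7]) / 3
sample_control_ct = sum([18.2, 18.3, 18.4]) / 3
rel = relative_quantity(sample_marker_ct, sample_control_ct,
                        marker_curve, control_curve)
print(f"slope={marker_curve[0]:.2f}, relative quantity={rel:.3f}")
```

Averaging the triplicate CT values before reading the curve is one possible choice; averaging the three derived quantities instead would be an equally reasonable assumption.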

Statistics

The performance of several common algorithms for class prediction using the expression data of four marker genes (CK19, P1B, EGP2, and MmGl) was compared using the software package BRB Array Tools 3.5 developed by Dr. Richard Simon and Amy Peng Lam (see http://linus.nci.nih.gov/BRB-ArrayTools.html). These included the compound covariate predictor [14], diagonal linear discriminant [15], 1-nearest neighbor, 3-nearest neighbors, nearest centroid [16], support vector machines [17], Bayesian compound covariate predictor [18], and the quadratic discriminant analysis (QDA) score function, used as previously described.

Results

Effects of sample storage on marker gene expression

Tumor cells (1 × 10⁵) derived from a mixture of five breast cancer cell lines were spiked into samples of peripheral blood collected from healthy females. The expression of the common ‘housekeeping’ genes GAPDH and β-actin, and of the tumor marker genes CK19, P1B, PS2, EGP2, MmGl, and SBEM, was then measured by QPCR in the PBMC and tumor cell mixture at various timepoints post-collection, in samples stored at either room temperature or 4°C. Absolute mRNA abundance detectable in the sample was relatively stable for GAPDH; however, abundance varied markedly for β-actin (Fig. 1a), with a large, rapid decrease shortly after collection for cells stored at 4°C. A large decrease was also apparent for cells stored at room temperature, but it occurred more gradually, reaching its lowest point at 24 h post-collection. These time- and temperature-specific effects were also seen when the absolute abundance of each of the tumor marker genes was measured (Fig. 1b). Storage temperature had a more marked effect on the relative mRNA abundance of marker genes, however, with most genes showing a steady decrease over time when stored at 4°C, with an average decrease in abundance of 54% by 48 h post-collection. This concurrent decrease in both housekeeping and marker gene mRNA abundance over time meant that the relative abundance of most tumor marker genes appeared generally stable, except for a slight decrease in abundance in the first 4 h post-collection (Fig. 1c). If storage of samples is necessary, it should optimally be at room temperature, at which samples can be stored for at least 48 h with little change in relative mRNA abundance.

Fig. 1

The effects of storage time and temperature on the absolute mRNA abundance of ‘housekeeping’ genes (a), and tumor marker genes (b), and also the relative abundance of tumor marker genes (c), as quantified by QPCR. The relative mRNA abundance for most marker genes appeared relatively stable (with the exceptions of β-actin and PS2), with a slight decline as the storage time approaches 48 h when stored at 4°C

Effects of tumor cell enrichment and sample processing on CTC marker gene mRNA detection

One hundred tumor cells derived from a mixture of five breast cancer cell lines were spiked into samples of peripheral blood collected from healthy females. CTCs were detected by QPCR measurement of six tumor marker genes in samples which underwent tumor cell enrichment using antibodies directed against CD326 (HEA/EpCAM) and/or ErbB2, either by cell sorting or via an immunomagnetic enrichment procedure (Fig. 2, Methods C–H). These were compared to samples that underwent no enrichment (Fig. 2, Methods A–B). The use of on-column RNA extraction and cDNA synthesis was also compared to off-column preparations (Fig. 2, Methods A–B, E–F). In each treatment group, control samples consisting of healthy peripheral blood only (denoted by the open circles in Fig. 2) were included to determine the level of non-target expression of each of the marker genes. Table 2 contains a summary of the treatments compared.

Fig. 2

Effects on relative levels of tumor cell marker mRNA due to sample processing, including tumor cell enrichment and RNA extraction/cDNA synthesis, as detected by QPCR. See Table 2 for a key for Methods A–H. For each processing method, peripheral blood samples into which tumor cells have been spiked are denoted by the solid circle, and healthy control samples are denoted by the open circle. Median values for each group are denoted by the short horizontal line in each group

Table 2 Summary of sample treatments compared, including tumor cell enrichment and RNA extraction/cDNA synthesis, and the resulting average increase in relative mRNA abundance (averaged across all marker genes) as detected by QPCR

As expected, enriching samples of peripheral blood for tumor cells using any enrichment method (Fig. 2, Methods C–H) resulted in higher relative mRNA levels than using no enrichment (Fig. 2, Methods A–B; average 597-fold increase). Enriching for cells using immunomagnetic columns and antibodies directed against both ErbB2 and CD326 (Method F) resulted in a mean relative tumor marker gene expression level 4-fold and 16-fold higher than selecting for cells that were ErbB2+ (Method C) or CD326+ (Method D) alone, respectively.

An advantage in detecting relative levels of tumor cell marker gene mRNA was also seen in both non-enriched (Method A versus Method B) and enriched (Method E versus Method F) populations when an on-column RNA extraction and/or cDNA synthesis was performed (using the μMACS™ mRNA Isolation Kit followed by the μMACS One-step cDNA Kit (Miltenyi Biotec)), as opposed to first eluting off the cells captured on the column and then performing a separate total RNA extraction and cDNA synthesis (using RNABee (Campro Scientific) RNA isolation reagent followed by SuperScript III Reverse Transcriptase System (Invitrogen)).

Using two immunomagnetic columns sequentially (Method G) further purified the target cell population, and almost always resulted in a non-detectable signal from the negative control (peripheral blood only) samples. While this exclusion of non-target cells resulted in high relative expression, the absolute expression was lower in some samples than in those that underwent only a single immunomagnetic column enrichment (Method F), indicating that some tumor cells were also likely excluded by such rigorous enrichment (absolute expression data not shown).

The efficacy of cell sorting using an antibody directed against CD326 proved variable (at the time of experimentation, no suitable antibody against ErbB2 was available for cell sorting in conjunction with CD326). While this and the double-column enrichment method (Method G) consistently excluded the detection of non-target cells (a mean of only 2.2 cells was collected per sort from negative control samples of ∼1 × 10⁶ PBMCs (range 0–7; data not shown), resulting in non-detectable levels of marker gene expression), some samples containing tumor cells also displayed non-detectable levels of marker gene expression, indicating that stringent enrichment was again likely sometimes excluding tumor cells.

When high specificity is required, for example when helping to identify patients who have very low numbers of CTCs and who therefore may not benefit from aggressive adjuvant chemotherapy, the optimal technique was therefore double column enrichment or cell sorting (Methods G and H, respectively). However, as this comes at a cost of somewhat decreased sensitivity, the optimal combination of sensitivity, specificity, and high relative tumor cell gene expression detection, which may be more useful for regular prognostication purposes, may be the use of a single column enrichment (Method F).

Platform detection limits

To test the limits of detection of this platform, 12 samples containing 1 × 10⁶ PBMCs and ∼10 tumor cells (made by serial dilution and confirmed by both manual and automated counting) were assayed using Method F and compared to Method A (Fig. 3). The gene expression values of four marker genes were combined into a single score using a quadratic discriminant analysis (QDA; see below). The majority (66%) of the QDA scores for these samples assayed by Method F were above the highest QDA score of the negative controls, demonstrating that the method more often than not had sufficient sensitivity to detect 10 tumor cells in a background of 1 × 10⁶ PBMCs. When enrichment was not used (Method A), the QDA scores were not significantly different from the negative controls.
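The detection criterion used here, a spiked sample counting as detected when its score exceeds the highest score among the negative controls, can be sketched as follows. The scores are invented for illustration (chosen so that 8 of 12 spiked samples pass, in line with the roughly two-thirds reported), and the function name is hypothetical.

```python
def detection_rate(spiked_scores, negative_scores):
    """Count spiked samples scoring above the highest negative-control score."""
    threshold = max(negative_scores)
    detected = sum(1 for score in spiked_scores if score > threshold)
    return detected, detected / len(spiked_scores)


# Hypothetical discriminant scores, NOT data from the study.
negative_scores = [-3.1, -2.4, -1.8, -0.9, -0.2]
spiked_scores = [2.5, 1.9, 0.4, 3.3, -0.5, 1.1,
                 -1.2, 0.8, 2.2, -0.3, 0.1, -2.0]

n_detected, fraction = detection_rate(spiked_scores, negative_scores)
print(f"{n_detected} of {len(spiked_scores)} detected ({fraction:.0%})")
```

Using the maximum negative-control score as the threshold makes the criterion conservative: a single high-scoring control raises the bar for every spiked sample.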

Fig. 3

To test the limits of the detection platform using Method F (compared to Method A), QDA values from samples of 1 × 10⁶ PBMCs into which 10 tumor cells were spiked were calculated. Method F detected tumor cells in the samples two-thirds of the time (reflected in the positive QDA score), whereas Method A was unable to distinguish samples containing tumor cells from those that did not. The horizontal line represents the median value for each group

Effects of the number of marker genes used to detect circulating tumor cells

In a previous study by us [19], peripheral blood samples from 23 healthy female controls and 16 metastatic breast cancer patients were prepared using Method E (above), and the mRNA abundance of six tumor marker genes (CK19, P1B, EGP2, PS2, SBEM, and MmGl) was assayed. Using these data, the effect of different numbers and combinations of marker genes on the ability to differentiate these two groups was investigated. To facilitate this, the expression data were combined into a single value using QDA as previously described [1, 19, 20]. This value is optimized in such a way as to make the highest QDA value for the control group the baseline for the test group. Every possible combination of expression data of one to six of these marker genes was used to derive a QDA value, and for each number of marker genes the combination that provided the highest leave-one-out cross-validation class prediction result was considered optimal (Fig. 4). The separation of QDA values between healthy controls and metastatic patients increased as the number of marker genes increased to five; however, the cross-validation result for the four-marker panel was highest overall at 95.5%. There appeared to be no advantage in using six marker genes over four or five. These results also suggest that using only one (CK19) or two (CK19 and P1B) marker genes was much poorer at separating metastatic patients from healthy controls than using three or more, due to the high proportion of healthy controls expressing CK19.
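The exhaustive search over marker panels described above covers 63 non-empty combinations of six genes. A minimal sketch of such a search is given below; `accuracy_of` stands in for whatever cross-validated accuracy estimate is used, and `toy_accuracy` with its weights is an arbitrary placeholder, not a result from the study.

```python
from itertools import combinations

MARKERS = ["CK19", "P1B", "EGP2", "PS2", "SBEM", "MmGl"]


def all_marker_panels(markers):
    """Yield every non-empty combination of markers, smallest panels first."""
    for size in range(1, len(markers) + 1):
        for panel in combinations(markers, size):
            yield panel


def best_panel_per_size(markers, accuracy_of):
    """For each panel size, keep the panel with the highest accuracy estimate."""
    best = {}
    for panel in all_marker_panels(markers):
        accuracy = accuracy_of(panel)
        size = len(panel)
        if size not in best or accuracy > best[size][1]:
            best[size] = (panel, accuracy)
    return best


def toy_accuracy(panel):
    """Placeholder scoring function with arbitrary weights (hypothetical)."""
    weights = {"CK19": 0.50, "EGP2": 0.20, "MmGl": 0.15,
               "P1B": 0.10, "SBEM": 0.03, "PS2": 0.02}
    return sum(weights[marker] for marker in panel)


best = best_panel_per_size(MARKERS, toy_accuracy)
for size in sorted(best):
    print(size, best[size][0])
```

In practice `accuracy_of` would train and evaluate a classifier on only the selected markers; because the same samples are used both to choose the panel and to report its accuracy, the winning estimate is likely to be somewhat optimistic.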

Fig. 4

The optimal combination of different numbers of tumor marker genes was used to derive a QDA value to separate healthy female controls (open circles) from metastatic breast cancer patients (closed circles). Median QDA score is denoted in each sample set by the horizontal line. The cross-validation accuracies are shown above each X-axis category. Four marker genes was the optimal number that allowed both good separation and highest cross-validation class prediction results. No further advantage was seen in using six marker genes. (Optimal marker combinations used: 1 gene = CK19; 2 genes = CK19, p1B; 3 genes = CK19, EGP2, MmGl; 4 genes = CK19, P1B, EGP2, MmGl; 5 genes = CK19, p1B, EGP2, MmGl, SBEM; 6 genes = CK19, p1B, EGP2, MmGl, SBEM, PS2.)

Effects of statistical methods to estimate tumor cell load from tumor marker gene expression data

Using the previously generated dataset derived from 23 healthy female controls and 16 metastatic breast cancer patients described above, eight mathematical algorithms which have been frequently utilized to classify biological samples using gene expression data (compound covariate predictor [14], diagonal linear discriminant [15], 1-nearest neighbor, 3-nearest neighbors, nearest centroid [16], support vector machines [17], Bayesian compound covariate predictor [18], and QDA) were tested in an attempt to find the optimal statistical method for distinguishing healthy females from metastatic patients. A leave-one-out cross-validation strategy was used to estimate classification efficiency, sensitivity, and specificity for each classifier (Table 3). Generally, all algorithms demonstrated high classification accuracy (range 74–94%, mean 88%), with the compound covariate predictor, diagonal linear discriminant, nearest centroid, and Bayesian compound covariate predictor assigning each sample identically into the two groups. This resulted in each achieving the highest correct classification rate of 90%, with a sensitivity and specificity for the metastatic patients of 0.81 and 0.96, respectively.
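Leave-one-out cross-validation holds out each sample in turn, trains on the remainder, and tallies the held-out predictions. The sketch below illustrates this with a nearest centroid classifier, one of the eight algorithms listed; it is a plain re-implementation for illustration rather than the BRB Array Tools routine, and the two-marker expression values are invented.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose mean expression vector is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids,
               key=lambda label: squared_distance(sample, centroids[label]))


def leave_one_out(xs, ys, positive="patient"):
    """Hold out each sample once and summarize the held-out predictions."""
    tp = tn = fp = fn = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, xs[i])
        if ys[i] == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / len(xs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical relative expression values for two markers (not study data).
xs = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],   # controls
      [2.0, 1.8], [1.7, 2.2], [2.5, 2.1], [0.3, 0.2]]   # patients
ys = ["control"] * 4 + ["patient"] * 4

result = leave_one_out(xs, ys)
print(result)
```

As a consistency check on the reported figures, a sensitivity of 0.81 among 16 patients and a specificity of 0.96 among 23 controls would correspond to 13/16 and 22/23 correct, or 35/39 (about 90%) overall.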

Table 3 Leave-one-out cross-validation performance of eight classification models which used QPCR data to distinguish 23 healthy controls from 16 metastatic patients

Discussion

Metastatic spread is the primary cause of death in breast cancer patients, and the detection and quantification of the CTCs which cause metastasis has significant, independent diagnostic and prognostic value. Despite the great potential benefit in being able to detect these cells, however, thus far there has been no clear consensus as to the optimal methods to achieve this. While automated immunohistochemical methods are beginning to show promise, traditional manual methods are time-consuming, and the results can be subjective [6]. We and others [7–12] have used QPCR as an alternate method for the detection of tumor cell-specific mRNA in patients as an estimate of tumor cell load, although studies have used a variety of different procedures to achieve this. The aim of this study was therefore to test various processing and analytical methodologies in an attempt to find the optimal procedure for the detection of CTCs by QPCR. Tumor cell dissemination is known to proceed via both hematogenous and lymphogenous routes, and CTCs/DTCs can be detected in blood, bone marrow, and lymph nodes. We have focused on sampling peripheral blood, which has the advantages of being much less invasive and considerably easier to collect than other sampling materials, and which is therefore more amenable to becoming the basis for a practical clinical test for routine use.

Once peripheral blood for such a test has been collected from a patient, it is important to know how soon it must be tested in order to provide a reliable result. This is of particular importance in QPCR-based platforms which measure mRNA abundance, as mRNA is a molecule prone to rapid degradation. Absolute quantitation of both common housekeeping genes and tumor marker genes demonstrated a rapid decrease in mRNA abundance shortly after collection (Fig. 1a, b), before stabilizing or slightly increasing at 24–48 h post-collection, in line with other studies [21]. The marker genes assayed generally demonstrated a greater decrease in abundance over time compared to housekeeping genes, perhaps reflecting a cellular mechanism which preserves more important mRNA such as GAPDH or β-actin within the cell during times of stress [22]. This concurrent decrease in abundance for both housekeeping and marker genes meant, however, that the effect on the relative abundance of each tumor marker was largely dampened. Storage temperature also significantly affected the levels of mRNA detected, with the observed large decrease significantly retarded in samples stored at room temperature compared to those stored at 4°C, an effect which may be due to a decrease in mRNA production in cells stored at low temperatures, rather than (or outweighed by) an increase in mRNA degradation [23]. Furthermore, the mRNA abundance of samples stored at room temperature demonstrated greater stability over time than that of samples stored at 4°C. Therefore, while immediate assay or stabilization of samples is recommended for the optimal detection of tumor cell-specific marker gene mRNA, these mRNAs will still remain at detectable levels even after at least two days of storage.

Pre-processing peripheral blood samples by enriching for cells of interest before mRNA measurements were made was of significant benefit for the detection of CTCs (Fig. 2). While an ideal tumor marker gene would be exclusively expressed in transformed cells, this is seldom the case, with several studies reporting that most, if not all, tumor marker genes currently used are expressed at some level (albeit much lower) in subsets of normal, healthy cells [24–26]. This was also apparent in the current study, where each of the six tumor marker genes measured was present in non-enriched negative control samples containing only healthy PBMCs (Fig. 2; Method A). Even extremely low-level expression of tumor markers in background cells can be a significant source of confounding when background cells outnumber target cells by a large margin, as is inevitably the case when assaying samples of peripheral blood or bone marrow. It is therefore important to enrich for target cells, for example as we have demonstrated here by isolating ErbB2+ and/or CD326+ cells, which results in a clearer delineation of samples containing tumor cells from those that do not, and therefore a likely more accurate quantitation of tumor cells present in a sample.

The platform used for tumor cell enrichment had a large impact on the relative abundance of marker gene mRNA detected by QPCR (Table 2). Using antibodies directed against the epithelial cell surface antigens ErbB2 or CD326 alone increased average relative tumor marker mRNA detected by 94-fold and 24-fold, respectively, over non-enriched samples, and using both antibodies together further increased detected mRNA levels. These data suggest that there may be subsets of tumor cells that express different antigens. While this underlines the importance of using more than one antibody for enrichment, it also highlights an important limitation of this technique, namely that by enriching a cell population using one or a small number of cell surface antigens, other important subsets of tumor cells which do not express these antigens may be missed. For example, tumor stem-like progenitor cells with very high metastatic potential [27] may not be retained in a population enriched by their expression of epithelial antigens, as progenitor cells frequently differ in cell surface markers from differentiated cells [20]. Furthermore, there are several distinct molecular subtypes of breast tumor cell [28], not all of which may express the same surface antigens. Therefore, while it appears that tumor cell enrichment increases the usefulness of CTC detection, the optimal choice of antigens to facilitate enrichment will likely require additional investigation, although a combination of antibodies will most likely result in increased sensitivity.

An immunomagnetic enrichment procedure utilizing two columns sequentially instead of a single column increased the relative tumor marker gene abundance detected by QPCR in some instances, but not consistently. Similar to cell sorting, however, it did appear to consistently remove the majority of non-target PBMCs, resulting in non-detectable levels of tumor marker gene expression in the negative controls, unlike when a single column was used. While very high specificity may be of value in certain situations, high sensitivity in a clinical test or study would likely be desirable in most circumstances even when it is accompanied by somewhat decreased specificity. Though enrichment by cell sorting demonstrated the highest specificity of all methods, with all negative control samples showing non-detectable levels of any tumor marker gene, this frequently came at the cost of a significant decrease in sensitivity (Fig. 5). As stated previously, exploiting additional target cell surface antigens or the use of a more optimized gating strategy would likely increase the sensitivity while keeping specificity high. Based on the current results, however, the use of a single immunomagnetic column enriching for ErbB2+ and/or CD326+ cells coupled with an on-column RNA extraction and cDNA synthesis (Method F) gives the optimal combination of high sensitivity and specificity.

Fig. 5

Sensitivity and specificity of each of the methods used for sample processing. Generally, specificity increases at the expense of sensitivity as more rigorous enrichment methods are applied to the samples

The number and combination of tumor markers used in an assay of this nature has potentially the largest effect on its prognostic or diagnostic value. To determine the effects of these factors, expression data from between one and six tumor marker genes were combined into a single QDA value and used in an attempt to separate healthy controls from metastatic breast cancer patients (Fig. 4). QDA is a statistical technique to find the combination of quadratic and linear functions of variables (in this case marker genes) which leads to an optimal separation between groups (in this case breast cancer patients with advanced disease and healthy female controls). It is a generalization of the more familiar Fisher’s Linear Discriminant Analysis (LDA), which allows only linear functions [12]. A positive discriminant score derived from the marker gene expression indicates the presence of breast tumor cells, and a negative discriminant score indicates the absence of tumor cells. It can be seen that, using the optimal combination for each number of marker genes, the separation of the two groups increases as the number of marker genes used increases to five. Using four marker genes, however, results in the highest cross-validation class prediction rate of 95.5%. Using six marker genes appears to have no additional benefit. Using one or two marker genes, which in both cases included the often considered ‘prototypical’ tumor marker gene CK19, appeared to be poor at reliably distinguishing healthy controls from metastatic patients in this assay, due to the high proportion of healthy controls which contained cells expressing some level of this gene. This problem of specificity is common in routine clinical immunohistological staining, particularly with markers such as CK19 [29]. The results presented here suggest that the use of multiple markers would be highly beneficial in this instance.
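The idea of a quadratic discriminant score whose sign indicates tumor cell presence can be illustrated with a simplified sketch. It assumes equal class priors and treats markers as independent Gaussians within each class (a diagonal-covariance simplification), so it is not the QDA score function of refs [1, 19, 20] and omits the baseline adjustment to the highest control value; all expression values and function names are hypothetical.

```python
import math


def fit_class(rows):
    """Per-marker mean and sample variance for one class of samples."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / (n - 1)
                 for col, m in zip(columns, means)]
    return means, variances


def log_density(sample, means, variances):
    """Log of a Gaussian density with independent markers (diagonal covariance)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
               for x, m, var in zip(sample, means, variances))


def discriminant_score(sample, patient_model, control_model):
    """Positive when the sample resembles patients more than controls.

    The score is quadratic in each marker because the two classes are
    allowed different variances; with equal variances it becomes linear.
    """
    return (log_density(sample, *patient_model)
            - log_density(sample, *control_model))


# Hypothetical relative expression of two markers (not study data).
controls = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.4, 0.3]]
patients = [[2.0, 1.5], [3.0, 2.5], [1.5, 3.0], [2.5, 2.0]]
control_model = fit_class(controls)
patient_model = fit_class(patients)

high = discriminant_score([2.2, 2.1], patient_model, control_model)
low = discriminant_score([0.2, 0.2], patient_model, control_model)
print(f"tumor-like sample: {high:.1f}, control-like sample: {low:.1f}")
```

A full QDA would also estimate the covariance between markers within each class, which requires more samples per class than markers to remain stable.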

Eight mathematical algorithms were compared in their ability to distinguish metastatic breast cancer patients from healthy female controls using gene expression data from the four tumor marker genes CK19, P1B, EGP2, and MmGl. It is beyond the scope of this work to fully describe the benefits of each mathematical method; suffice it to say that all algorithms tested performed similarly and well, with an average correct classification rate of 88%. In the clinic, an assay of this nature would generally only be performed on patients already known to have breast cancer. Therefore, as with the previous results, a high sensitivity in predicting high-risk patients (in this case metastatic patients) is likely more important than high specificity, as it is essential to identify the most high-risk individuals accurately. In this case, the compound covariate predictor, diagonal linear discriminant, nearest centroid, and Bayesian compound covariate predictor performed equally well, with a correct leave-one-out cross-validation classification rate of 90%, and a sensitivity and specificity for the metastatic patients of 81 and 96%, respectively. In terms of ease-of-use in a clinical context, however, the QDA was designed to provide a simple positive or negative score derived from weighted gene expression values reflecting tumor cell presence or absence, respectively, and would therefore likely be the most amenable for a simple but powerful clinical test.

In conclusion, we have demonstrated the efficacy of a QPCR-based platform for the detection of CTCs in cancer patients. We have optimized this assay using samples of peripheral blood as opposed to bone marrow, which has the advantage of being considerably less invasive, and have demonstrated that it displays both high sensitivity and specificity, and the ability to detect down to 10 tumor cells from a background of greater than one million peripheral blood cells. While debate continues over whether the detection of tumor cells in blood or bone marrow is of the most value, both have been shown to be of prognostic benefit. Of the methods described above, the optimal combination for a detection platform would likely include the enrichment of tumor cells in the sample using both ErbB2 and CD326 antibodies on a single immunomagnetic enrichment column, followed by on-column mRNA extraction and cDNA synthesis. Using the QDA function, data from multiple marker genes can be conveniently combined into a single value providing an objective estimation of tumor cell load in a given sample. The platform described represents an accurate and objective assay which could augment current routine clinical assays for CTCs and DTCs, an important indicator of disease progression in cancer.