Introduction

In the last decade, new optical guidance techniques have been implemented for the diagnosis and treatment of cancer [1, 2]. One of these techniques is diffuse reflectance spectroscopy (DRS) [3–5]. DRS can identify tissue characteristics by measuring their intrinsic light absorption and scattering properties at different wavelengths. By illuminating tissue with a selected light spectrum, an ‘optical fingerprint’ of the tissue is obtained which represents specific quantitative biochemical and morphological information. The characteristics of the observed DRS spectrum depend on metabolic rate, vascularity, intra-vascular oxygenation and tissue morphology. Hence, DRS can provide detailed information on the underlying biological composition of tissue and as such has the potential to differentiate tumour tissue from normal tissue. In this way, DRS may be able to improve cancer diagnostics and therapy monitoring. For example, DRS could be incorporated into a biopsy needle, thus constructing an optically guided biopsy tool. Such a tool could reduce the number of indeterminate breast biopsies, which presently still ranges from 5 to 30 % [6–9].

However, breast tissue has arguably been considered one of the most challenging human tissue types for DRS analysis due to its morphologic inhomogeneity [10]. Breast tumours show a large diversity in histology, whereas the composition of normal breast tissue varies according to age and hormonal status. Despite this diversity and inter-patient variation, several groups have focussed on breast tissue discrimination with DRS [5, 11–17]. In these studies, the accuracy of DRS differentiation between normal and malignant breast tissue varies from 65 to 90 %. All studies so far have in common that a collective analysis of the DRS data is performed, resulting in an overall discriminative accuracy of DRS for the whole study population.

Although the results reported are promising, discriminative accuracy of DRS may be improved by limiting the effects of inter-patient variation. This could be accomplished by means of individual patient analyses in which normal tissue is directly compared to tumour tissue for every patient individually. Moreover, such an approach also complies with clinical practice by providing an individual diagnosis to every individual patient.

The aim of the present study is to determine the diagnostic accuracy of DRS in individual breast cancer patients. If successful, such an approach would be an important step towards the development of intelligent medical tools such as optical biopsy needles.

Materials and methods

Clinical study design

This study was conducted at The Netherlands Cancer Institute (NKI-AVL) under approval of the Internal Review Board Committee. Breast tissue was obtained from 47 female patients who had undergone either a local excision or total mastectomy of the breast due to the presence of a fibroadenoma or (pre)malignant disease. Shortly after surgical resection, tissue was transported to the Pathology Department for optical spectroscopy analysis. After gross inspection by the pathologist, the spectroscopy measurements were performed on freshly excised tissue within 2 h after resection. The optical spectra were collected from macroscopically normal fat, glandular tissue and fibroadenoma lesions as well as from (pre)malignant tissue. On average, five optical measurements were performed at each measurement location. A biopsy was taken from each location for histological comparison. Tissue samples were fixed in formalin, paraffin-embedded, cut into 2- to 3-μm-thick sections and stained with standard hematoxylin & eosin staining. An experienced pathologist, who was blinded to the outcome of the spectroscopy analysis, examined the histological slides. For each measurement location, the percentages of adipose, glandular and fibroadenomatous tissue as well as ductal carcinoma in situ (DCIS) and invasive carcinoma were scored. Adipose and glandular tissue specimens were histologically classified according to the predominant tissue type within the biopsy specimen. A tissue specimen had to contain at least 20 % fibroadenomatous tissue, DCIS or invasive carcinoma to be classified as such. Samples with mixtures of fibroadenomatous tissue, DCIS or invasive carcinoma at one measurement location were excluded from the study because they could not be assigned to a single tissue class.

Instrumentation

The instrumentation and calibration procedure of our optical spectroscopy system have been described in recent papers [18–20].

The system consists of a console comprising a tungsten/halogen broadband light source, two spectrometers and an optical probe with three optical fibres. The two spectrometers resolve light in the visible wavelength range between 400 nm and 1,100 nm (Andor Technology, DU420A-BRDD) and in the near infrared wavelength range from 800 up to 1,700 nm (Andor Technology, DU492A-1.7), respectively. The optical probe contains three optical fibres: one fibre is connected to the light source, while the other two fibres are connected to the spectrometers to capture the light diffusely scattered from the tissue (Fig. 1). The average tissue volume that is illuminated is roughly 5 mm³. The acquisition time of each spectrum was on average 0.2 s.

Fig. 1 Schematic overview of the diffuse reflectance spectroscopy system. The centre-to-centre distance between the distal ends of the emitting and collecting fibres is 2.48 mm. VIS visible; NIR near infrared

Spectral data processing

The light delivered by the illuminating optical fibre is subject to optical absorption and scattering. Each biological substance in the probed tissue has its own intrinsic optical absorption property as a function of wavelength. These specific optical absorption spectra are well characterized and available from the literature. Given this knowledge, an analytical model was used to extract the chromophore composition and the scattering properties of the tissue samples from the measured spectra over the wavelength range from 500 to 1,600 nm [12, 18, 20–22]. This model was first described by Farrell et al. [23]. The analytical model was fitted to the measurements by applying a nonlinear Levenberg–Marquardt inversion algorithm.
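As an illustration of the fitting step only, the sketch below applies a minimal Levenberg–Marquardt iteration to a two-parameter toy model. It is not the analytical model of Farrell et al. used in this study: the power law a·(λ/800 nm)^(−b), the synthetic noise-free data, the starting values and all function names are assumptions chosen for illustration.

```python
import math

def power_law(x, a, b):
    """Toy model (assumption): amplitude a at x = 1 and decay exponent b."""
    return a * x ** (-b)

def cost(xs, ys, a, b):
    """Sum of squared residuals between data and toy model."""
    return sum((y - power_law(x, a, b)) ** 2 for x, y in zip(xs, ys))

def fit_levenberg_marquardt(xs, ys, a, b, max_iter=200):
    """Minimise the squared residuals with damped Gauss-Newton steps."""
    mu = 1e-3                      # damping factor
    c = cost(xs, ys, a, b)
    for _ in range(max_iter):
        if c < 1e-20:              # residuals are numerically zero
            break
        # Accumulate J^T J and J^T r for the two parameters.
        saa = sab = sbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            f = power_law(x, a, b)
            r = y - f
            ja = x ** (-b)             # d(model)/da
            jb = -f * math.log(x)      # d(model)/db
            saa += ja * ja
            sab += ja * jb
            sbb += jb * jb
            ga += ja * r
            gb += jb * r
        # Marquardt damping scales the diagonal of J^T J.
        haa = saa * (1.0 + mu)
        hbb = sbb * (1.0 + mu)
        det = haa * hbb - sab * sab
        da = (ga * hbb - sab * gb) / det
        db = (haa * gb - sab * ga) / det
        c_new = cost(xs, ys, a + da, b + db)
        if c_new < c:              # accept the step, trust the model more
            a, b, c = a + da, b + db, c_new
            mu /= 10.0
        else:                      # reject the step, damp more strongly
            mu *= 10.0
    return a, b

# Synthetic spectrum: wavelengths in nm, normalised to 800 nm.
wavelengths = list(range(500, 1601, 100))
xs = [w / 800.0 for w in wavelengths]
ys = [power_law(x, 12.0, 1.1) for x in xs]   # hypothetical "true" values

a_hat, b_hat = fit_levenberg_marquardt(xs, ys, a=5.0, b=0.5)
print(round(a_hat, 4), round(b_hat, 4))
```

In a real analysis the residual function would be the full diffuse reflectance model with the chromophore fractions and scattering parameters as unknowns; only the accept/reject logic of the damped update carries over from this sketch.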

In particular, the total haemoglobin concentration (THC; the sum of oxygenated and deoxygenated haemoglobin), the water, lipid and collagen volume fractions and the β-carotene concentration were determined. We also determined the optical scattering. Optical scattering depends on the size and the density of cellular and subcellular structures and was defined by a reduced scattering amplitude at an arbitrarily given wavelength (e.g. at 800 nm). Accurate water, lipid and collagen volume fractions could be derived due to the inclusion of the near-infrared part of the spectrum (wavelength range 1,000–1,600 nm) [20]. Spectral characteristics analysis was performed with a Matlab software package (MathWorks Inc., Natick, MA). The distribution of the quantified values of each tissue parameter was displayed in boxplots.

Spectral data processing allowed measurements to be classified into five different breast tissue classes: adipose tissue, glandular tissue, fibroadenoma, DCIS and invasive carcinoma. In addition, we distinguished normal breast tissue (all tissue locations of adipose, glandular and fibroadenomatous tissue) from malignant breast tissue (DCIS and invasive carcinoma).

Tissue classification analysis

A classification and regression tree (CART) algorithm was used to automatically classify each collected tissue into one of the defined breast tissue types based on the chromophore concentrations and scattering parameter values derived from the measurements [24]. With the CART algorithm, a decision tree was created based on absolute thresholds calculated from those tissue parameters displaying the most significant differences between the defined tissue classes. Each spectrum was then separately classified based on the calculated thresholds in the decision tree using a leave-one-out (LOO) cross validation method and compared to the histology analysis [20]. Previously, the CART analysis with LOO cross validation scheme was studied by Nachabé et al. [20] and compared to other generally used spectral classification algorithms. The main advantage of the CART method is that the results can easily be interpreted and correlated to clinical details, since the input parameters are thresholds of the calculated values of the main tissue parameters. The CART analysis was performed for all acquired data collectively and also for each included patient individually with both normal breast tissue and (pre)malignant tissue being measured. Classification as either normal breast tissue or malignant breast tissue in the collective analysis was performed by randomly taking as many samples of all tissue classes as the lowest sample size of any tissue class within both groups. This was done to avoid overestimation of the discrimination accuracy due to the higher representation of one of the tissue classes over the other within either the normal or malignant tissue group. Within the normal breast tissue group, 160 tissue measurements of both adipose and glandular tissue were randomly selected and added to the corresponding number of fibroadenomatous tissue measurements. For the malignant tissue measurements, 120 invasive carcinoma tissue measurements were randomly selected. 
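To make the thresholding and validation idea concrete, the sketch below fits a single-split decision tree (a stump, the elementary step of CART) on one parameter by minimising the Gini impurity and evaluates it with leave-one-out cross validation. It is a simplification, not the multi-parameter CART analysis performed in this study, and the fat fractions, labels and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(values, labels):
    """Return (threshold, label_below, label_above) with the lowest
    weighted Gini impurity. Assumes at least two distinct values."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for v, lab in pairs if v <= thr]
        right = [lab for v, lab in pairs if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[0]:
            best = (score, thr,
                    Counter(left).most_common(1)[0][0],
                    Counter(right).most_common(1)[0][0])
    return best[1:]

def predict(stump, value):
    thr, below, above = stump
    return below if value <= thr else above

def leave_one_out_accuracy(values, labels):
    """Refit the stump without sample i, then classify sample i."""
    hits = 0
    for i in range(len(values)):
        stump = fit_stump(values[:i] + values[i + 1:],
                          labels[:i] + labels[i + 1:])
        hits += predict(stump, values[i]) == labels[i]
    return hits / len(values)

# Hypothetical fat fractions (%) with their histology labels.
fat = [82, 75, 68, 71, 64, 22, 30, 15, 27, 35]
histology = ["normal"] * 5 + ["malignant"] * 5

full_stump = fit_stump(fat, histology)
loo_accuracy = leave_one_out_accuracy(fat, histology)
print(full_stump, loo_accuracy)
```

Because every prediction is made by a tree that never saw the held-out spectrum, the leave-one-out estimate is less optimistic than the accuracy on the training data, and the learned threshold remains directly readable as a value of the tissue parameter.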
For the individual patient analysis, all measurements in each defined tissue class were analyzed and compared to the corresponding histological diagnosis. In these individual patient classification analyses, only the fat content, collagen content and the scattering parameters were taken into account to avoid overfitting of the data. For this comparison we chose an arbitrary threshold of 90 % agreement of all DRS measurements at a marked tissue site with the histopathological diagnosis to either determine the DRS measurements as correct (≥90 %) or define the measurements as ‘uncertain’ (<90 %).
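The 90 % agreement rule can be written down in a few lines. The following is a minimal sketch, assuming each DRS measurement at a site has already been assigned a class label; the function name and the example labels are hypothetical.

```python
def site_verdict(drs_labels, histology_label, threshold=0.90):
    """Return ('correct' | 'uncertain', agreement fraction) for one tissue site."""
    agreement = sum(lab == histology_label for lab in drs_labels) / len(drs_labels)
    verdict = "correct" if agreement >= threshold else "uncertain"
    return verdict, agreement

# Hypothetical sites: 9 of 10 measurements agree, then 4 of 5 agree.
site_a = site_verdict(["malignant"] * 9 + ["normal"], "malignant")
site_b = site_verdict(["malignant"] * 4 + ["normal"], "malignant")
print(site_a, site_b)
```

With about five measurements per location, a single disagreeing spectrum already drops the agreement to 80 % and marks the site as ‘uncertain’, so the rule is strict for small numbers of measurements.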

Statistical analysis

The DRS-estimated quantification of each parameter in the breast tissue classes cannot be described by a parametric distribution such as the Gaussian distribution. The statistical differences of each parameter in the defined tissue classes were therefore determined using the non-parametric Kruskal–Wallis test [25]. P values smaller than 0.05 were considered statistically significant.
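For readers unfamiliar with the test, the sketch below computes the Kruskal–Wallis H statistic from pooled ranks, including the standard correction for ties. The three groups of values are hypothetical, and the p value uses the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(−H/2); for samples this small the approximation is rough.

```python
import math
from collections import Counter

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with tie correction (values must not all be equal)."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    tie_sum = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1.0 - tie_sum / (n ** 3 - n))

# Hypothetical parameter values in three tissue classes.
groups = [
    [10.2, 12.5, 11.1],
    [20.4, 25.0, 22.3],
    [60.1, 71.5, 66.0],
]
h_stat = kruskal_wallis_h(groups)
p_value = math.exp(-h_stat / 2.0)   # chi-square tail, valid for 2 degrees of freedom only
print(round(h_stat, 3), round(p_value, 4))
```

Because the statistic depends only on ranks, it makes no assumption about the shape of the parameter distributions, which is why it suits the non-Gaussian DRS estimates.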

Results

Tissue specimens

A total of 47 breast tissue specimens from female patients were included in this study. The mean age at time of operation was 52 years (range 20–74 years). Within the 47 resected tissue specimens, 160 measurements were performed in 32 locations of fibroadenomatous lesions, 121 measurements in 24 areas of DCIS and 314 measurements in 35 invasive carcinoma lesions. Five of these lesions were lobular carcinomas and 30 were ductal carcinomas. In addition, 294 measurements were taken in 79 areas of adipose tissue of the breast specimen and 184 measurements in 37 areas of glandular tissue, giving a total of 1,073 DRS measurements (Table 1).

Table 1 Diagnosis for each tissue measurement generated by the DRS analysis compared to the histology diagnosis of the measurement location with the calculated sensitivity and specificity of all the measurements in each tissue class

Cohort data analysis

For each tissue measurement, the chromophore volume fractions and scattering coefficients were calculated. The distributions of each of the six most significantly different tissue parameters for all five distinguished tissue classes are depicted in Fig. 2. Adipose tissue is best distinguished from the other tissue classes by fat, water and β-carotene content, as well as by the scattering coefficient at 800 nm. Fibroadenomatous tissue can also clearly be discriminated from the other four tissue classes based on fat and β-carotene content. For the other tissue classes, such as glandular tissue, invasive carcinoma and DCIS, there is a notable overlap in the parameters measured, and further analysis was performed using the CART algorithm.

Fig. 2 Boxplots of most significantly different tissue parameters

Based on the results of the tissue parameters depicted in Fig. 2, each measurement was diagnosed by the CART algorithm and assigned to one of the five defined tissue classes. In addition, measurements were classified as either normal breast tissue or malignant breast tissue. The results of the classification by the CART algorithm are displayed in Table 1. A high specificity (≥90 %) for all tissue classes was noted. The result of discrimination between normal and malignant breast tissue samples is displayed in Table 2. When compared to pathology, DRS yielded a sensitivity of 90 % and a specificity of 88 %. Overall accuracy was 89 %.
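For reference, sensitivity, specificity and overall accuracy follow from the counts of a two-by-two comparison with pathology as sketched below. The counts in the example are hypothetical and are not the values of Table 2; the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp: malignant by pathology, malignant by DRS
    fn: malignant by pathology, normal by DRS
    tn: normal by pathology, normal by DRS
    fp: normal by pathology, malignant by DRS
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for illustration only.
sens, spec, acc = diagnostic_metrics(tp=40, fn=10, tn=45, fp=5)
print(sens, spec, acc)
```

Overall accuracy is a weighted average of sensitivity and specificity, with weights set by the number of malignant and normal measurements, which is why unequal group sizes can inflate it.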

Table 2 Classification of tissue measurements defined as normal or as malignant breast tissue

Tissue heterogeneity

The microscopic heterogeneity of the various tissue samples is illustrated in Fig. 3. Three examples of invasive carcinoma are displayed with different percentages of malignant tissue within the specimen, respectively 20, 50 and >90 % invasive carcinoma as determined by pathology analysis. The corresponding spectrum measured for each sample displays notable differences. To further investigate how such differences may affect discriminative accuracy in distinguishing glandular tissue from invasive carcinoma, all of the 314 measurements of invasive carcinoma were divided into <50 % or >50 % malignant cells within the tissue specimen. Next, the quantification of tissue parameters in both groups was compared to all the measurements in glandular tissue (Table 3). Both malignant groups can be discriminated from glandular tissue based on the parameters total haemoglobin concentration and fat. No significant differences between malignant and glandular tissue are identified for the parameters scattering at 800 nm, β-carotene and collagen when the percentage of invasive carcinoma within the tissue specimen is <50 %. These parameters do, however, differ significantly when the percentage of invasive carcinoma is >50 %.

Fig. 3 Example of the pathological heterogeneity within several tissue samples defined as invasive carcinoma

Table 3 Significant differences of the quantification of the most notable tissue parameters of all invasive carcinoma measurements with either <50 % or >50 % malignant cells within each tissue specimen compared to the glandular tissue measurements

In one of the 47 tissue specimens, all five defined tissue classes were present and could be examined. The histology of the five tissue classes and the corresponding spectral analysis are displayed in Fig. 4. For each tissue sample, a marked heterogeneity of the histology is notable. At each tissue location an average of 5 optical measurements was performed. In general only small variations within the consecutive measurements were observed indicating high reproducibility within a specific tissue location. The small variations within a tissue location are displayed in Fig. 4f. To discriminate the five tissue classes, the scattering coefficient at 800 nm wavelength was plotted versus the fat concentration. Using these two parameters the five tissue classes within this individual patient could clearly be discriminated (Fig. 4f).

Fig. 4 H&E staining of the five different tissue classes within one of the included patient specimens. A adipose tissue; B glandular tissue; C fibroadenomatous tissue; D DCIS; E invasive carcinoma. The percentages of the main tissue components are given as defined by the pathologist. The DRS spectrum for each tissue sample is depicted. In F, the distribution of all the measurements for each defined tissue class is illustrated, based on the quantification of scattering at 800 nm and fat content from each tissue spectrum. DCIS ductal carcinoma in situ; a.u. arbitrary units; nm nanometre; 1/cm reciprocal centimetre

Individual data analysis

Since the number of tissue classes measured per patient was limited, we focused only on the ability of DRS to differentiate normal breast tissue (adipose, glandular and fibroadenomatous) from malignant breast tissue (invasive carcinoma and DCIS) in the individual analysis. Results of the DRS analysis in all individual patients are displayed in Table 4. In all patients except two, the specificity was 100 %. In these two patients, only 1 out of 10 and 1 out of 17 DRS measurements in benign tissue led to a false-positive DRS diagnosis of malignancy. In only 3 of the 36 patients was the sensitivity for the DRS diagnosis of malignancy below 100 %. The percentage of false-negative measurements varied from 5 to 37 % in these 3 patients. When 90 % agreement between all DRS measurements and the pathology analysis is taken as an arbitrary threshold to confirm the diagnosis of breast cancer, DRS measurements were ‘uncertain’ in only one out of 36 patients (patient 6).

Table 4 Illustration of all measurements for each defined tissue class for each patient individually

Discussion

Optical technologies, such as DRS, are increasingly being explored for their merits in supporting the diagnostic workflow in breast cancer. Heterogeneity in normal breast tissue and tumour tissue may, however, interfere with its diagnostic accuracy. We too observed a wide heterogeneity in the histology of breast cancer in the present study. To reduce this inter-patient variability we hypothesized that an individual patient analysis would provide superior discriminative accuracy of DRS measurements as compared to the generally performed analysis on patient cohorts.

To test this hypothesis we performed a cohort analysis as well as an individual analysis for the various DRS measurements. In addition, spectra were acquired in the wide wavelength range of 500–1,600 nm, in contrast to the wavelengths 500–1,000 nm commonly used by other research groups [18, 19].

For the cohort analysis all measurements were grouped as either benign or malignant. Sensitivity and specificity were 90 and 88 %, respectively, yielding an overall accuracy of 89 %. During individual analysis all measurements per patient were again categorized as either benign or malignant. The discriminative accuracy of these individual analyses was nearly 100 %. Only in one patient the diagnosis was uncertain.

Several previous studies have used DRS for the analysis of breast tissue [5, 12, 13, 15–17, 26–28]. These studies use collective data from all included patients to discriminate normal breast tissue from malignant breast tissue. The resulting sensitivity ranged from 67 to 87 % and the specificity from 76 to 96 %. The collective analysis of normal versus malignant tissue in the present study yields comparable results [5, 13, 15, 16, 26, 27]. Yet, a fair comparison between studies remains difficult due to the many different data analysis techniques used.

In contrast to most other papers, we distinguished five different tissue classes. For this purpose, DRS was performed in a wider wavelength range than usual. Recent findings by Taroni et al. showed that collagen is an important absorber to include in the model for fitting the measured spectra as it has distinct absorption features above 900 nm [29–31]. Therefore, we measured the absorption coefficient of collagen up to 1,600 nm and included it in our model. Moreover, measurements at wavelengths above 1,000 nm allow better quantification of the lipid and water fraction of the tissue. The boxplots clearly displayed a notable distribution range of the optical parameters within the different tissue classes. Adipose tissue and fibroadenoma could be well differentiated from the other three tissue classes, yet the discrimination of glandular tissue from DCIS and invasive carcinoma was not straightforward due to overlap in the quantification of the tissue parameters. This has also been reported by Volynskaya et al. [12], who did not observe any significant differences in collagen and β-carotene in these tissue classes, as well as by Zhu et al. [5], who reported similar concentrations of β-carotene and THC. We observed significant variations within all the tissue classes in THC as displayed in Fig. 2. The pressure applied to the tissue during the measurements with possible relocation of blood, as well as the damaged blood vessel system due to the excision, could cause this variation. Therefore, caution should be taken when drawing conclusions from THC in an ex vivo setting.

To allow further classification several classification algorithms have been described. Volynskaya et al. [12] and Zhu et al. [5] respectively used logistic regression analysis and linear support vector machine. In the present study, further classification of tissue classes was performed using a Classification and Regression Tree (CART) algorithm. As previously demonstrated by Nachabé et al. [20], CART analysis with LOO cross validation scheme proved to be a favourable classification scheme when compared to other generally used algorithms. We observed an overall discrimination accuracy of DRS of 84 % when distinguishing the 5 tissue classes. Sensitivity was lowest for glandular tissue (77 %) and DCIS (71 %). Specificity for all tissue classes was 90 % or higher.

The difficulty in discriminating between the three tissue classes glandular tissue, DCIS and invasive carcinoma can be explained by bearing in mind that primary malignant degeneration in the breast is generally a gradual evolution from glandular tissue to carcinoma in situ to invasive carcinoma. Moreover, the large differences in composition between various cancerous lesions, as displayed in Fig. 3, cause significant variations in collected spectra. Depending on the proportion of malignant cells within the measured specimen, the different tissue parameters will be more or less significantly different from normal glandular tissue, as is demonstrated in Table 3.

These differences in tumour composition, as well as sampling variation and the resultant effect on the quantification of the tissue parameters in relation to normal glandular tissue, prompted us to investigate a more individualized analysis of the DRS measurements. Indeed, with an overall discriminative accuracy of nearly 100 %, the individual analyses were superior to the results of the cohort analysis.

When we translate these results on individual analysis to the clinical practice of tissue biopsy, it is important to note that the acquisition of data is fast and can be performed in real time. This means that accurate positioning of an optically guided biopsy needle within the target lesion becomes possible. The breast tissue analysis for each individual patient resulted in a correct clinical diagnosis of all normal breast tissue measurements and in all but one malignant tissue measurement. We therefore argue that smart biopsy tools that incorporate DRS into biopsy needles could prevent indeterminate breast biopsies.

Although our data are promising, specific improvements should be noted for future studies. First, the results of this study are based on ex vivo analysis. Before clinical use of a DRS system can be considered, the presented results must be confirmed in an in vivo analysis. In addition, the discriminative accuracy for glandular tissue could still be improved. As shown in Table 1 discrimination of glandular tissue is still less reliable than that of adipose tissue. This may be due to the general histological inhomogeneity in the area of glandular breast tissue.

In conclusion, the results presented in this article demonstrate that the analysis of optical characteristics of different tissue classes within the breast of a single patient is superior to an analysis using the results of a cohort data analysis. We argue that for future application of DRS into clinical practice, such as breast tissue biopsy, emphasis should be put on individual tissue data analysis. A prospective in vivo analysis of breast tissue is underway to confirm the presented results as a next step towards the clinical application of smart biopsy tools and surgical instruments.