1 Introduction

Despite advances in medicine, cancer remains one of the most dangerous diseases. The World Health Organization (WHO) reported 18.1 million new cancer cases and 9.6 million cancer deaths in 2018 [1]. The detection and treatment of cancer is therefore one of the greatest challenges for medicine in the twenty-first century. Modern interdisciplinary technologies offer an effective approach to this problem. In most cases, the earlier a tumor is diagnosed and treated, the better the patient’s prognosis and the greater the chance of complete recovery. Many recent technological innovations have used physical principles, such as those of optics and coherent photonics, to improve early diagnostic and therapeutic procedures and thereby reduce cancer incidence and mortality.

The development of optical methods in modern medicine in the field of diagnostics, surgery, and therapy has stimulated the study of the optical properties of human and animal tissues, since the efficiency of optical sensing of tissues depends on the photon propagation and fluence rate distribution [2].

Examples of diagnostic use include monitoring of blood oxygenation and tissue metabolism, analysis of the main tissue components, detection of malignant neoplasms, and various recently proposed optical imaging techniques. The latter are particularly interesting for virtual optical biopsy and the precise determination of tumor boundaries during surgical operations. Therapeutic usage mostly includes applications in photodynamic therapy. For all these applications, knowledge of the optical properties of tissues is of great importance for the interpretation and quantification of diagnostic data and for predicting the distribution of light and absorbed energy for therapeutic and surgical use.

In this chapter, we provide an overview of the optical properties of benign and malignant tumors measured over a wide wavelength range and discuss the main cancer markers for various types of tumors.

2 Tumor Optical Properties Measurements: A Brief Description

Among the numerous methods for measuring the optical properties of tissue, the most widely used are integrating sphere spectroscopy, reflectance spectroscopy, and Raman and fluorescence spectroscopy.

Iterative methods for processing experimental data, as a rule, take into account discrepancies between the refractive indices at the boundaries of the sample as well as the multilayer nature of the sample. The following factors are responsible for the errors in the estimated values of the optical coefficients and need to be borne in mind in a comparative analysis of the optical parameters obtained in various experiments [3]:

  • The physiological conditions of tissues (the degree of hydration, homogeneity, species-specific variability, frozen/thawed or fixed/unfixed state, in vitro/in vivo measurements, smooth/rough surface);

  • The geometry of irradiation;

  • The matching/mismatching interface refractive indices;

  • The numerical aperture of photodetectors;

  • The separation of radiation experiencing forward scattering from unscattered radiation;

  • The theory used to solve the inverse problem.

To analyze the propagation of light under multiple scattering conditions, it is assumed that absorbing, fluorescing, and scattering centers are uniformly distributed across the tissue. UV-A, visible, and NIR radiation usually undergoes anisotropic scattering, in which singly scattered photons have a clearly preferred direction; this may be due to the presence of large cellular organelles (mitochondria, lysosomes, Golgi apparatus, etc.) [3,4,5].

When the scattering medium is illuminated by unpolarized light and/or only the intensity of multiply scattered light needs to be computed, a sufficiently strict mathematical description of continuous wave (CW) light propagation in a medium is possible in the framework of the scalar stationary radiation transfer theory (RTT) [3,4,5,6]. This theory is valid for an ensemble of scatterers located far from each other and has been successfully used to develop some practical aspects of tissue optics. The main stationary equation of RTT for average spectral power flux density \( {I}_{\lambda}\left(\overset{\rightharpoonup }{r},\overset{\rightharpoonup }{s}\right) \) (in W/(cm² sr)) for wavelength λ at point \( \overset{\rightharpoonup }{r} \) in the given direction \( \overset{\rightharpoonup }{s} \) and monochromatic irradiation has the form

$$ \frac{\partial {I}_{\lambda}\left(\overset{\rightharpoonup }{r},\overset{\rightharpoonup }{s}\right)}{\partial s}=-{\mu}_t{I}_{\lambda}\left(\overset{\rightharpoonup }{r},\overset{\rightharpoonup }{s}\right)+\frac{\mu_s}{4\pi}\underset{4\pi }{\int }{I}_{\lambda}\left(\overset{\rightharpoonup }{r},{\overset{\rightharpoonup }{s}}^{\prime}\right)\;p\left(\overset{\rightharpoonup }{s},{\overset{\rightharpoonup }{s}}^{\prime}\right)d{\varOmega}^{\prime }+\varepsilon \left(\overset{\rightharpoonup }{s},{\overset{\rightharpoonup }{s}}^{\prime}\right), $$
(1.1)

where \( p\left(\overset{\rightharpoonup }{s},\overset{\rightharpoonup }{s^{\prime }}\right) \) is the scattering phase function, 1/sr; dΩ′ is the unit solid angle about the direction \( {\overset{\rightharpoonup }{s}}^{\prime } \), sr; μt = μa + μs is the total attenuation coefficient, 1/cm; μa is the absorption coefficient, 1/cm; μs is the scattering coefficient, 1/cm; and \( \varepsilon \left(\;\overset{\rightharpoonup }{s},{\overset{\rightharpoonup }{s}}^{\prime}\right) \) is the internal light source, which accounts for the effects of fluorescence and Raman scattering. However, in most practically interesting cases, the absorption and scattering coefficients of tissues can be measured neglecting the effects of fluorescence and Raman scattering, since their quantum efficiency is relatively small. This is equivalent to using Eq. (1.1) without the internal radiation sources.

The scalar approximation of the radiative transfer equation (RTE) gives poor accuracy when the size of the scattering particles is much smaller than the wavelength, but provides acceptable results for particles comparable to and exceeding the wavelength [7].

The phase function \( p\left(\overset{\rightharpoonup }{s},\overset{\rightharpoonup }{s^{\prime }}\right) \) describes the scattering properties of the medium and is actually the probability density function for scattering in the direction \( {\overset{\rightharpoonup }{s}}^{\prime } \) of a photon traveling in the direction \( \overset{\rightharpoonup }{s} \); in other words, it characterizes an elementary scattering event. If scattering is symmetric relative to the direction of the incident wave, then the phase function depends only on the scattering angle θ (angle between directions \( \overset{\rightharpoonup }{s} \) and \( {\overset{\rightharpoonup }{s}}^{\prime } \)), i.e., \( p\left(\overset{\rightharpoonup }{s},\overset{\rightharpoonup }{s^{\prime }}\right)=p\left(\theta \right) \). The assumption of random distribution of scatterers in a medium (i.e., the absence of spatial correlation in the tissue structure) leads to normalization:

$$ \underset{0}{\overset{\pi }{\int }}p\left(\theta \right)2\pi \sin \theta d\theta =1. $$

In practice, the phase function is usually well approximated with the aid of the postulated Henyey–Greenstein function [2,3,4,5,6, 8]:

$$ p\left(\theta \right)=\frac{1}{4\pi}\frac{1-{g}^2}{{\left(1+{g}^2-2g\cos \theta \right)}^{3/2}}, $$
(1.2)

where g is the scattering anisotropy parameter (mean cosine of the scattering angle θ):

$$ g=\left\langle \cos \theta \right\rangle =\underset{0}{\overset{\pi }{\int }}p\left(\theta \right)\cos \theta \, 2\pi \sin \theta \, d\theta . $$

The value of g varies in the range from −1 to 1; g = 0 corresponds to isotropic (Rayleigh) scattering, g = 1 to total forward scattering (Mie scattering by large particles), and g = −1 to total backward scattering [3,4,5,6,7,8,9].
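The normalization condition and the definition of g can be checked numerically for the Henyey–Greenstein function of Eq. (1.2). The following is a minimal sketch, not part of the original text: the function names, the grid size, and the value g = 0.9 are illustrative assumptions.

```python
import numpy as np

def hg_phase(theta, g):
    """Henyey-Greenstein phase function p(theta) of Eq. (1.2), in 1/sr."""
    return (1.0 - g**2) / (4.0 * np.pi * (1.0 + g**2 - 2.0 * g * np.cos(theta)) ** 1.5)

def integrate_over_sphere(values, theta):
    """Trapezoidal integral of values(theta) * 2*pi*sin(theta) over [0, pi]."""
    integrand = values * 2.0 * np.pi * np.sin(theta)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta)))

theta = np.linspace(0.0, np.pi, 200001)  # assumed grid, fine enough for g = 0.9
g = 0.9                                  # assumed illustrative anisotropy factor
p = hg_phase(theta, g)

norm = integrate_over_sphere(p, theta)                      # should be close to 1
mean_cos = integrate_over_sphere(p * np.cos(theta), theta)  # should be close to g
print(norm, mean_cos)
```

For strongly forward-peaked scattering (g close to 1) the peak near θ = 0 becomes very narrow, so a finer grid may be needed for the same accuracy.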

Other phase functions commonly used to analyze the propagation of light in turbid media, including tissue, are the small-angle scattering phase function [10, 11], the Mie phase function [12,13,14,15], the δ-Eddington phase function [16, 17], the Reynolds–McCormick phase function [18,19,20], the Gegenbauer kernel phase function [14, 15, 21, 22], and their modifications [23,24,25].

2.1 Integrating Sphere Spectroscopy

Integrating sphere spectroscopy (ISS) is commonly used as an optical calibration and measurement tool and, in particular, has been used successfully to measure the optical properties of tissues [2, 3, 5]. A detailed theory of integrating sphere spectroscopy is presented in [26,27,28,29,30,31,32]. The inner surface of an integrating sphere is uniformly coated with a highly reflective diffuse material (reflectance exceeding 0.98) to achieve a homogeneous distribution of light at the sphere’s inner wall. A light beam falling on the inner surface is scattered evenly in all directions (Lambertian reflection), and after multiple Lambertian reflections the light flux is evenly distributed (spatially integrated) over the inner surface of the sphere. A standard integrating sphere usually has three ports: an input port, an output port, and a third port for the detector. In certain applications, a fourth port is also used so that the specularly reflected beam can leave the sphere into a light trap. The surfaces of real integrating spheres, however, are not perfectly Lambertian. To prevent measurement errors caused by specular reflection, one or more baffles coated with a highly reflective material are often placed inside the sphere to further diffuse the specular reflection and keep directly reflected light from reaching the detector.

Using an integrating sphere to measure the spectral reflectance and transmittance of tissue samples has several advantages over direct measurement of the samples with a spectrometer. First, in a regular spectrometer measurement the incident light directly illuminates the sample surface, and the detected reflectance often depends on the angle and distance between the incident beam and the detector. When an integrating sphere is used, all backreflected fluxes are captured and normalized by the sphere, so the angular dependency is no longer an issue. Second, the detector-object distance is usually fixed in an integrating sphere measurement. Even a small change in the sample-sphere distance will not affect the results as long as all reflected light bounces back into the sphere. Additionally, with integrating spheres the spectral measurements are less dependent on the shape of the light beam and the homogeneity of the sample, since both the incident light beam and the reflected/scattered light are normalized on the inner surface of the sphere before being captured by the detector.

The optical parameters of tissue samples (namely the absorption coefficient μa, the scattering coefficient μs, and the anisotropy factor of scattering g) can be measured by various methods. The single- or double-integrating sphere method combined with collimated transmittance measurement (see Fig. 1.1) is most often used for in vitro tissue studies. Briefly, this approach implies either sequential or simultaneous determination of three parameters: the total transmittance Tt = Tc + Td (Td is the diffuse transmittance), the diffuse reflectance Rd, and the collimated transmittance Tc = Id/I0 (Id is the intensity of transmitted light measured using a distant photodetector with a small aperture, and I0 is the intensity of incident radiation).
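As a simple illustration of how these measured quantities are used, the sketch below computes Tc = Id/I0 and Tt = Tc + Td, and then estimates the total attenuation coefficient μt from Tc. The last step assumes the Bouguer–Beer–Lambert law, Tc = exp(−μt d), for a slab of thickness d and neglects Fresnel reflection at the sample surfaces; this relation is a standard assumption added here, not something stated in the text above, and all function names and numbers are hypothetical.

```python
import math

def collimated_transmittance(i_detected, i_incident):
    """Tc = Id / I0, the ratio of detected to incident intensity."""
    return i_detected / i_incident

def total_transmittance(t_collimated, t_diffuse):
    """Tt = Tc + Td."""
    return t_collimated + t_diffuse

def mu_t_from_collimated(t_collimated, thickness_cm):
    """Estimate mu_t (1/cm), ASSUMING Tc = exp(-mu_t * d) with no surface reflection."""
    return -math.log(t_collimated) / thickness_cm

# Hypothetical readings for a 0.1 cm thick sample (illustrative numbers only).
t_c = collimated_transmittance(i_detected=0.02, i_incident=1.0)
print(total_transmittance(t_c, t_diffuse=0.30))
print(mu_t_from_collimated(t_c, thickness_cm=0.1))
```

In practice, corrections for refractive index mismatch and for scattered light reaching the detector are needed before μt obtained in this way can be trusted.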

Fig. 1.1 Measurement of tissue optical properties using an integrating sphere. (a) Total transmittance mode, (b) diffuse transmittance mode, (c) diffuse reflectance mode, (d) collimated transmittance mode, (e) double-integrating sphere. 1 is the incident beam; 2 is the tissue sample; 3 is the integrating sphere; 4 is the entrance port; 5 is the transmitted radiation; 6 is the exit port; 7 is the diffuse reflected radiation

Any three measurements from the following five are sufficient for the evaluation of all three optical parameters [3]:

  1. Total (or diffuse) transmittance for collimated or diffuse radiation;

  2. Total (or diffuse) reflectance for collimated or diffuse radiation;

  3. Absorption by a sample placed inside an integrating sphere;

  4. Collimated transmittance;

  5. Angular distribution of radiation scattered by the sample.

The optical parameters of the tissue are deduced from these measurements using different theoretical expressions or numerical simulations: the inverse Monte Carlo (IMC) [33,34,35,36,37,38,39,40,41] or inverse adding-doubling (IAD) [42,43,44,45,46,47,48,49,50,51] methods, or methods based on the diffusion approximation of the transfer equation [52,53,54,55,56]. However, the diffusion approximation has limitations: it describes tissue with a low albedo poorly and does not treat boundary conditions accurately. To overcome these shortcomings, the IAD and IMC methods are most commonly used.

The adding-doubling technique is a numerical method for solving the 1D transport equation in slab geometry. It can be used for tissue with an arbitrary phase function and an arbitrary angular distribution of the spatially uniform incident radiation, but it assumes an infinitely wide beam, because lateral light losses cannot be taken into account. The angular distribution of the reflected radiance (normalized to an incident diffuse flux) is given by Prahl et al. [42]:

$$ {I}_{\mathrm{ref}}\left({\eta}_c\right)=\underset{0}{\overset{1}{\int }}{I}_{\mathrm{in}}\left({\eta}_c^{\prime}\right)R\left({\eta}_c^{\prime },{\eta}_c\right)2{\eta}_c^{\prime }d{\eta}_c^{\prime }, $$
(1.3)

where Iin(ηc) is an arbitrary incident radiance angular distribution, ηc is the cosine of the polar angle, and \( R\left({\eta}_c^{\prime },{\eta}_c\right) \) is the reflection redistribution function determined by the optical properties of the slab.

The distribution of the transmitted radiance can be expressed in a similar manner, with obvious substitution of the transmission redistribution function \( T\left({\eta}_c^{\prime },{\eta}_c\right) \). If M quadrature points are selected to span over the interval (0, 1), the respective matrices can approximate the reflection and transmission redistribution functions:

$$ R\left({\eta}_{\mathrm{ci}}^{\prime },{\eta}_{\mathrm{cj}}\right)\to {R}_{\mathrm{ij}};\kern0.62em T\left({\eta}_{\mathrm{ci}}^{\prime },{\eta}_{\mathrm{cj}}\right)\to {T}_{\mathrm{ij}}. $$
(1.4)

These matrices are referred to as the reflection and transmission operators, respectively. If a slab with boundaries indexed as 0 and 2 is comprised of two layers, (01) and (12), with an internal interface 1 between the layers, the reflection and transmission operators for the whole slab (02) can be expressed as:

$$ {\boldsymbol{T}}^{02}={\boldsymbol{T}}^{12}{\left(\boldsymbol{E}-{\boldsymbol{R}}^{10}{\boldsymbol{R}}^{12}\right)}^{-1}{\boldsymbol{T}}^{01}, $$
$$ {\boldsymbol{R}}^{20}={\boldsymbol{T}}^{12}{\left(\boldsymbol{E}-{\boldsymbol{R}}^{10}{\boldsymbol{R}}^{12}\right)}^{-1}{\boldsymbol{R}}^{10}{\boldsymbol{T}}^{21}+{\boldsymbol{R}}^{21}, $$
$$ {\boldsymbol{T}}^{20}={\boldsymbol{T}}^{10}{\left(\boldsymbol{E}-{\boldsymbol{R}}^{12}{\boldsymbol{R}}^{10}\right)}^{-1}{\boldsymbol{T}}^{21}, $$
$$ {\boldsymbol{R}}^{02}={\boldsymbol{T}}^{10}{\left(\boldsymbol{E}-{\boldsymbol{R}}^{12}{\boldsymbol{R}}^{10}\right)}^{-1}{\boldsymbol{R}}^{12}{\boldsymbol{T}}^{01}+{\boldsymbol{R}}^{01}, $$
(1.5)

where E is the identity matrix defined in this case as:

$$ {E}_{\mathrm{ij}}=\frac{1}{2{\eta}_{\mathrm{ci}}{w}_i}{\delta}_{\mathrm{ij}}, $$
(1.6)

where wi is the weight assigned to the i-th quadrature point and δij is a Kronecker delta symbol, δij = 1 if i = j, and δij = 0 if i ≠ j.

The definition of the matrix multiplication also slightly differs from the standard. Specifically,

$$ {\left(\boldsymbol{AB}\right)}_{\mathrm{ik}}\equiv \sum \limits_{j=1}^M{A}_{\mathrm{ij}}2{\eta}_{\mathrm{cj}}{w}_j{B}_{\mathrm{jk}}. $$
(1.7)

Equations (1.5) allow one to calculate the reflection and transmission operators of a slab when those of the constituent layers are known. The idea of the method is to start with a thin layer for which the RTE can be rather easily simplified and solved, producing the reflection and transmission operators, and then to proceed by doubling the thickness of the layer until the thickness of the whole slab is reached. Several techniques exist for layer initialization. The single-scattering equations for reflection and transmission for the Henyey–Greenstein function are given by van de Hulst [57] and Prahl [58]. The refractive index mismatch can be taken into account by adding effective boundary layers of zero thickness whose reflection and transmission operators are determined by Fresnel’s formulas. Both the total transmittance and the reflectance of the slab are obtained by straightforward integration of Eq. (1.3). Different methods of performing the integration and the IAD program provided by S. A. Prahl [42, 58] allow one to obtain the absorption and scattering coefficients from the measured diffuse reflectance Rd and total transmittance Tt of the tissue slab. This program is a numerical solution to the steady-state RTE (Eq. (1.1)) that implements an iterative process: the reflectance and transmittance are estimated from a set of optical parameters, which is adjusted until the calculated reflectance and transmittance match the measured values. Values for the anisotropy factor g and the refractive index n must be provided to the program as input parameters.
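The modified identity and matrix product of Eqs. (1.6) and (1.7), and one doubling step for two identical layers, can be sketched as follows. This is an illustrative sketch under stated assumptions, not Prahl's IAD program: Gauss–Legendre quadrature on (0, 1) is an assumed choice of quadrature points, all function names are hypothetical, and the layers are taken as identical and symmetric, so that a single pair (R, T) describes each of them. The check at the end uses a purely attenuating layer (no scattering), chosen only because its doubled transmission is known in closed form.

```python
import numpy as np

def quadrature(M):
    """Gauss-Legendre nodes and weights mapped from (-1, 1) to (0, 1) (assumed scheme)."""
    x, w = np.polynomial.legendre.leggauss(M)
    return 0.5 * (x + 1.0), 0.5 * w

def star(A, B, eta, w):
    """Modified product of Eq. (1.7): (AB)_ik = sum_j A_ij * 2*eta_j*w_j * B_jk."""
    return A @ np.diag(2.0 * eta * w) @ B

def identity(eta, w):
    """Identity of Eq. (1.6): E_ij = delta_ij / (2*eta_i*w_i)."""
    return np.diag(1.0 / (2.0 * eta * w))

def star_inverse(A, eta, w):
    """Inverse X under the modified product, i.e. star(A, X) = E."""
    c = np.diag(2.0 * eta * w)
    # A c X = c^{-1}  =>  X = (c A c)^{-1}
    return np.linalg.inv(c @ A @ c)

def double_layer(R, T, eta, w):
    """Operators of two identical symmetric layers stacked together."""
    E = identity(eta, w)
    inv = star_inverse(E - star(R, R, eta, w), eta, w)
    T_inv = star(T, inv, eta, w)
    T2 = star(T_inv, T, eta, w)                          # T (E - R R)^{-1} T
    R2 = star(star(T_inv, R, eta, w), T, eta, w) + R     # T (E - R R)^{-1} R T + R
    return R2, T2

eta, w = quadrature(4)
tau = 0.05                                               # assumed optical thickness
T = np.diag(np.exp(-tau / eta) / (2.0 * eta * w))        # attenuation only
R = np.zeros((4, 4))                                     # no scattering, no reflection
R2, T2 = double_layer(R, T, eta, w)
print(np.diag(T2) * 2.0 * eta * w)                       # compare with exp(-2*tau/eta)
```

Repeating `double_layer` n times starting from a thin layer of optical thickness τ gives the operators of a slab of thickness 2ⁿτ, which is the doubling procedure described above.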

It was shown that using only four quadrature points, the IAD method provides optical parameters that are accurate to within 2–3% [42]. Higher accuracy can be obtained by using more quadrature points, at the cost of increased computation time. Another valuable feature of the IAD method is its validity for samples with comparable absorption and scattering coefficients [42], for which methods based only on the diffusion approximation are inadequate. Furthermore, since both the anisotropic phase function and Fresnel reflection at the boundaries are accurately approximated, the IAD technique is well suited to optical measurements of biological tissues and blood held between two glass slides. The adding-doubling method provides accurate results when the side losses are not significant, but it is less flexible than the Monte Carlo (MC) technique.

Both the real geometry of the experiment and the tissue structure may be complicated. Therefore, the inverse Monte Carlo method has to be used if reliable estimates are to be obtained. A number of algorithms implementing the IMC method are now available in the literature [5, 15, 19, 33, 37,38,39, 59,60,61]. Many researchers use the Monte Carlo (MC) simulation algorithm and program provided by S. L. Jacques, L. Wang, et al. [35, 62, 63]. The MC technique is employed to solve the forward problem in the inverse algorithm for determining the optical properties of tissues and blood. The MC method is based on the formalism of the RTT, where the absorption coefficient is defined as the probability of a photon being absorbed per unit path length, and the scattering coefficient as the probability of a photon being scattered per unit path length. The effects of fluorescence and Raman scattering may also be taken into account in a similar way by introducing the probability of generating new photons with different frequencies for the correspondingly absorbed or scattered initial photons. Using these probabilities, a random sampling of photon trajectories is generated. Among the first IMC algorithms to be designed were those for determining all three optical parameters of the tissue (μa, μs, and g) from in vitro measurements of the total transmittance, diffuse reflectance, and collimated transmittance using a spectrophotometer with integrating spheres [5, 15, 33, 37, 38, 40, 41, 44, 50, 60, 61, 64]. The initial approximation (to speed up the procedure) is obtained with the help of the Kubelka–Munk theory, specifically its four-flux variant [3, 5, 33, 37, 38, 65,66,67]. The algorithms take into consideration the sideways loss of photons, which becomes essential in sufficiently thick samples. Similar results have been obtained using the condensed IMC method [5, 60, 61, 68,69,70,71,72,73].
Figure 1.2 demonstrates the typical flowchart of the IMC method [41].

Fig. 1.2 The typical flowchart of the IMC method [41]

In the basic MC algorithm, a photon described by three spatial coordinates and two angles (x, y, z, θ, ϕ) is assigned a weight W = W0 and placed in its initial position, depending on the source characteristics. The step size s of the photon is determined as s = − ln (ξ)/μt, where ξ is a random number uniformly distributed in the interval (0, 1). The direction of the photon’s next movement is determined by the scattering phase function, which is used as the probability density distribution. Several approximations for the scattering phase function of tissue and blood have been used in MC simulations. They include two widely used empirical phase functions, the Henyey–Greenstein phase function (HGPF) (see Eq. (1.2)) and the Gegenbauer kernel phase function (GKPF), as well as the Mie phase function.

In most cases, azimuthal symmetry is assumed. This leads to p(ϕ) = 1/2π and, consequently, ϕrnd = 2πξ. At each step, the photon loses the fraction (1 − Λ) of its weight due to absorption, so that W → WΛ, where Λ = μs/μt is the albedo of the medium.
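The sampling rules of a single photon step can be sketched as follows. The step length and azimuthal angle follow the expressions given above. The polar angle is drawn from the Henyey–Greenstein function of Eq. (1.2) using its standard inverse-CDF formula, which is an addition not written out in the text. The weight update multiplies the weight by the albedo Λ, so that the fraction 1 − Λ is deposited as absorbed energy. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def sample_step(mu_t, rng):
    """Free path length s = -ln(xi) / mu_t with xi uniform in (0, 1]."""
    xi = 1.0 - rng.random()          # rng.random() lies in [0, 1), so xi is never 0
    return -np.log(xi) / mu_t

def sample_phi(rng):
    """Azimuthal angle phi = 2*pi*xi, from p(phi) = 1/(2*pi)."""
    return 2.0 * np.pi * rng.random()

def sample_cos_theta_hg(g, rng):
    """cos(theta) drawn from the Henyey-Greenstein function (inverse-CDF formula)."""
    xi = rng.random()
    if abs(g) < 1e-6:                # isotropic limit: cos(theta) uniform in [-1, 1]
        return 2.0 * xi - 1.0
    frac = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    return (1.0 + g * g - frac * frac) / (2.0 * g)

def absorb(weight, mu_a, mu_s):
    """Return (remaining weight, absorbed weight) after one interaction."""
    albedo = mu_s / (mu_a + mu_s)
    return weight * albedo, weight * (1.0 - albedo)

rng = np.random.default_rng(0)
mu_a, mu_s, g = 1.0, 99.0, 0.9       # assumed illustrative values, 1/cm and unitless
print(sample_step(mu_a + mu_s, rng), sample_phi(rng), sample_cos_theta_hg(g, rng))
print(absorb(1.0, mu_a, mu_s))
```

Averaged over many draws, the step length tends to 1/μt and the sampled cosine tends to g, which gives a quick consistency check of the samplers.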

When the photon reaches the boundary, part of its weight is transmitted according to the Fresnel equations. The amount transmitted through the boundary is added to the reflectance or transmittance. Since the refraction angle is determined by Snell’s law, the angular distribution of the outgoing light can be calculated. The photon with the remaining part of the weight is specularly reflected and continues its random walk.

When the photon’s weight falls below a predetermined minimal value, the photon can be terminated using the “Russian roulette” procedure [35, 62, 63]. This procedure saves time, since there is no point in continuing the random walk of a photon that will not contribute appreciably to the measured signal. At the same time, it ensures that the energy balance is maintained throughout the simulation.
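A minimal sketch of the Russian roulette step is given below. The threshold and the survival chance are assumed illustrative values, and the function name is hypothetical. A photon below the threshold survives with probability 1/m and, if it survives, its weight is multiplied by m, so the expected weight is unchanged and the energy balance is preserved on average.

```python
import random

def russian_roulette(weight, threshold=1e-4, survival_chance=0.1, rng=random):
    """Terminate low-weight photons without biasing the total energy.

    Returns the new weight; 0.0 means the photon has been terminated.
    threshold and survival_chance are assumed values for illustration.
    """
    if weight >= threshold:
        return weight                       # photon is still significant, keep going
    if rng.random() < survival_chance:
        return weight / survival_chance     # survivor carries the weight of the lost ones
    return 0.0                              # photon terminated

# Averaged over many photons, the weight after roulette matches the weight before it.
rng = random.Random(0)
n = 200000
mean_after = sum(russian_roulette(5e-5, rng=rng) for _ in range(n)) / n
print(mean_after)                           # fluctuates around 5e-5
```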

The MC method has several advantages over the other methods because it can take into account mismatched medium-glass and glass-air interfaces, losses of light at the edges of the sample, any phase function of the medium, and the finite size and arbitrary angular distribution of the incident beam. The only disadvantage of this method is the long computation time needed to ensure good statistical convergence, since it is a statistical approach. The standard deviation of a quantity (diffuse reflectance, transmittance, etc.) approximated by the MC technique decreases in proportion to \( 1/\sqrt{N} \), where N is the total number of launched photons. It is worth noting that stable operation of the algorithm is maintained by generating 10⁵ to 5 × 10⁵ photons per iteration. Two to five iterations are usually necessary to estimate the optical parameters with approximately 2% accuracy.
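The \( 1/\sqrt{N} \) convergence can be illustrated with a deliberately simple estimator rather than a full tissue simulation. In the sketch below, the unscattered (ballistic) transmittance of a slab is estimated as the fraction of sampled free paths longer than the slab thickness, and the scatter of repeated estimates is compared for two photon counts. The quantity estimated, the parameter values, and the function names are assumptions made for illustration.

```python
import numpy as np

def estimate_ballistic_transmittance(mu_t, d, n_photons, rng):
    """Fraction of photons whose first free path exceeds the slab thickness d."""
    xi = 1.0 - rng.random(n_photons)         # uniform in (0, 1]
    steps = -np.log(xi) / mu_t
    return float(np.mean(steps > d))

def spread_of_estimate(mu_t, d, n_photons, n_repeats, rng):
    """Standard deviation of the estimate over independent repetitions."""
    estimates = [estimate_ballistic_transmittance(mu_t, d, n_photons, rng)
                 for _ in range(n_repeats)]
    return float(np.std(estimates))

rng = np.random.default_rng(42)
mu_t, d = 10.0, 0.1                          # assumed values: 1/cm and cm
sigma_small = spread_of_estimate(mu_t, d, 1_000, 200, rng)
sigma_large = spread_of_estimate(mu_t, d, 100_000, 200, rng)

# With 100 times more photons the spread should shrink by roughly sqrt(100) = 10.
ratio = sigma_small / sigma_large
print(sigma_small, sigma_large, ratio)
```

The same scaling is why halving the statistical error of a simulated reflectance or transmittance requires about four times as many photons.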

2.2 Diffuse Backscattered Reflectance Spectroscopy

Diffuse backscattered reflectance spectroscopy (BS) [5, 72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87] is well suited for use in biomedical applications owing to its low instrumentation cost, easy implementation, and non-destructive measurement setup. Hence, many different BS measurement configurations have been developed. Optical fiber arrays and non-contact reflectance imaging are two typical sensing configurations in BS measurement, which can be implemented with a fiber-optic probe (FOP), monochromatic imaging (MCI), and hyperspectral imaging (HSI). In FOP measurements, a single spectrometer, multiple spectrometers, or a spectrograph-camera combination coupled with multiple detection fibers can be used to measure diffuse reflectance at different distances from the point of light incidence. It is also desirable to probe a tissue sample at greater depth. To overcome the shortcomings of a rigid FOP, a flexible FOP with numerous optical fibers covering a spatial distance range of 0–30 mm can be used for measuring the tissue optical properties. The optical fibers are coupled to a multichannel hyperspectral imaging system, which allows simultaneous acquisition of reflectance spectra from the sample. Using fibers of several different sizes in the probe also effectively expands the dynamic range of the camera, allowing spectra to be acquired from greater depths in the sample.

As a non-contact method, MCI is more suitable for measuring the optical properties of tissues under monochromatic irradiation. A laser diode or a combination of a supercontinuum laser and a monochromator can be used to illuminate a sample at a specific wavelength. The diffuse reflectance is acquired with a CCD camera. This BS configuration is simple and relatively easy to implement. The acquired 2D scattering images are reduced to 1D scattering profiles by radial averaging, which assumes that the scattering images are axisymmetric with respect to the laser incident point. However, this assumption is not satisfied for anisotropic tissues where the light is guided by the tissue fibers. For example, in the case of bovine muscle tissue, the effect of the fibers resulted in scatter spots with a rhombus shape. Measurement at multiple wavelengths requires sequential wavelength scanning. In addition, a substantial portion of the signal of each pixel comes from the surrounding areas, which may affect the accuracy of the measurement. Therefore, the point-spread function (PSF) must be characterized in order to minimize errors in the intensity values used to interpret the image data.

In hyperspectral imaging, spectral and spatial information is acquired simultaneously; it is therefore advantageous for measuring diffuse reflectance profiles over a broad spectral range. As a rule, a typical hyperspectral imaging-based BS system in line-scan mode has high spatial resolution and consists mainly of a high-performance CCD camera, an imaging spectrograph, a zoom or prime lens, a light source, and an optical fiber coupled with a focusing lens for delivering a broadband beam to the sample.

As an indirect method for optical property measurement, computation of the optical parameters from BS measurements usually requires sophisticated modeling based on the diffusion approximation of radiative transfer theory or on MC simulation, coupled with appropriate inverse algorithms. Numerical methods are generally required for solving the radiative transfer equation or for inverse MC simulation. These methods are flexible and allow different geometries of experimental setups to be modeled, but they may be subject to statistical uncertainties in the estimated reflectance. Moreover, one of the major drawbacks of the numerical methods is that they require substantial computational time. To overcome these shortcomings, the condensed IMC method can be used: a library of MC-simulated BS profiles is calculated for a grid of μs, μa, and g values, and the library is then used either as a look-up table or for training a neural network.

Another way to reconstruct the tissue optical parameters (such as μa and the reduced scattering coefficient \( {\mu}_s^{\prime }={\mu}_s\left(1-g\right) \)) has been proposed by Zonios et al. [88,89,90,91]. Their approach is based on the diffusion approximation and assumes that \( R\left(\lambda \right)=\frac{\mu_s^{\prime}\left(\lambda \right)}{k_1+{k}_2{\mu}_a\left(\lambda \right)} \). Here R(λ) is the diffuse reflectance, λ is the wavelength, and k1 and k2 are constants that depend on the probe geometry. The optical coefficients μa and \( {\mu}_s^{\prime } \) can be related to the absorption and scattering properties of the tissue through, for example, Eqs. (1.8) and (1.9):

$$ {\displaystyle \begin{array}{c}{\mu}_a\left(\lambda \right)={C}_{\mathrm{Hb}}\left[\alpha \cdotp {\varepsilon}_{\mathrm{Hb}\mathrm{O}}\left(\lambda \right)+\left(1-\alpha \right){\varepsilon}_{\mathrm{Hb}}\left(\lambda \right)\right]+{C}_w\cdotp {\varepsilon}_w\left(\lambda \right)\\ {}\kern1.6em +{C}_{\mathrm{mel}}\cdotp {\varepsilon}_{\mathrm{mel}}\left(\lambda \right)+{C}_{\mathrm{col}}\cdotp {\varepsilon}_{\mathrm{col}}\left(\lambda \right)+\dots, \end{array}} $$
(1.8)

where CHb is the total concentration of hemoglobin, α is the oxygen saturation of hemoglobin, Cw is the concentration of water, Cmel is the concentration of melanin, Ccol is the concentration of collagen, and εHbO, εHb, εw, εmel, εcol are the absorption coefficients of oxyhemoglobin, deoxyhemoglobin, water, melanin, and collagen, respectively.

$$ {\mu}_s^{\prime}\left(\lambda \right)=\frac{A}{\lambda^w}, $$
(1.9)

where parameter A is defined by the concentration of scattering particles in the tissue, and the wavelength exponent w is independent of the particles concentration, characterizes the mean size of the particles, and defines the spectral behavior of the scattering coefficient [92].
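The power law of Eq. (1.9) and the reflectance model R = μs′/(k1 + k2 μa) can be sketched numerically as follows. The fit is done by linear regression in log-log space, which is one simple choice among several; the spectrum is synthetic, and the values of A, w, k1, and k2 as well as the function names are assumptions for illustration only, not measured tissue data. Note that the numerical value of A depends on the units chosen for λ (nm here).

```python
import numpy as np

def fit_power_law(wavelength_nm, mu_s_reduced):
    """Fit mu_s'(lambda) = A * lambda**(-w); returns (A, w).

    Taking logarithms gives ln(mu_s') = ln(A) - w * ln(lambda), a straight line.
    """
    slope, intercept = np.polyfit(np.log(wavelength_nm), np.log(mu_s_reduced), 1)
    return float(np.exp(intercept)), float(-slope)

def diffuse_reflectance(mu_s_reduced, mu_a, k1, k2):
    """Reflectance model R = mu_s' / (k1 + k2 * mu_a) with probe constants k1, k2."""
    return mu_s_reduced / (k1 + k2 * mu_a)

# Synthetic, noise-free spectrum generated from assumed parameters.
wavelength_nm = np.linspace(450.0, 900.0, 10)
A_true, w_true = 2.0e4, 1.2
mu_s_reduced = A_true * wavelength_nm ** (-w_true)

A_fit, w_fit = fit_power_law(wavelength_nm, mu_s_reduced)
print(A_fit, w_fit)

# Reflectance for an assumed constant absorption and assumed probe constants.
print(diffuse_reflectance(mu_s_reduced, mu_a=0.5, k1=1.0, k2=2.0))
```

With real, noisy spectra the log-log fit weights the data differently from a direct nonlinear fit, so the two approaches may give slightly different A and w.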

Accurate estimation of optical parameters by inverse algorithms is not an easy task because of the complexity of the analytical solutions and potential experimental errors in the measurement of diffuse reflectance from a medium. Moreover, for many biological materials, the values of the absorption coefficient over a specific spectral region (especially from 700 to 900 nm) are rather small, which makes it more difficult to estimate the optical parameters accurately. For these reasons, errors within 10% in the measured μa and \( {\mu}_s^{\prime } \) are generally considered acceptable. In general, the estimation of optical parameters can be formulated as a nonlinear least-squares optimization problem with several important assumptions: constant-variance errors, uncorrelated errors, and a Gaussian error distribution. The results will not be valid if these assumptions are violated. In addition, when estimating the optical parameters of layered media, the increased number of free parameters can dramatically increase the computational time, further complicating the estimation and/or making the problem ill-posed. Different strategies, such as a multi-step method, sensitivity analysis, and statistical evaluation [93], have been proposed to optimize the inverse algorithms and improve the estimation accuracy.

2.3 Raman Spectroscopy

Raman Scattering

Neoplastic cells are characterized by increased nuclear material, an increased nuclear-to-cytoplasmic ratio, increased mitotic activity, abnormal chromatin distribution, and decreased differentiation [94, 95]. There is a progressive loss of cell maturation, and proliferation of these undifferentiated cells results in increased metabolic activity. The morphologic and biochemical changes that occur with malignant tissue are numerous and in many cases depend on the specific type and location of the cancer. Biochemical tumor markers include cell surface antigens, cytoplasmic proteins, enzymes, and hormones. These general features of neoplastic cells result in specific changes in nucleic acid, protein, lipid, and carbohydrate quantities and/or conformations [95]. There are multiple molecular markers, located in the membrane, the cytoplasm, the nucleus, and the extracellular space, that may be indicative of neoplasia. As most biological molecules are Raman active, with distinctive spectra in the fingerprint region (500–1800 cm⁻¹), vibrational spectroscopy is a desirable tool for cancer detection.

Raman spectroscopy is based on the inelastic scattering of photons by molecular bond vibrations. Therefore, the alteration of molecular signatures in a cell or tissue that has undergone cancerous transformation can be detected noninvasively by Raman scattering, without labeling.

In general, the majority of scattered photons have the same frequency as the incident photons when light passes through the tissue (Fig. 1.3). This is known as Rayleigh or elastic scattering. However, a very small portion of photons changes energy after collision with molecules because of inelastic, or Raman, scattering. The energy difference between the incident and scattered photons (the Raman shift, expressed as a wavenumber in cm−1) corresponds to the vibrational energy of the specific molecular bond interrogated [96,97,98].
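The relation between the two photon wavelengths and the Raman shift can be sketched in a few lines of Python. The function name and the example wavelengths are illustrative assumptions, not values taken from the studies cited here.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Return the Raman shift in cm^-1 for wavelengths given in nm.

    Positive values correspond to Stokes scattering (the scattered photon
    has lower energy than the incident one); zero is Rayleigh scattering.
    """
    if excitation_nm <= 0 or scattered_nm <= 0:
        raise ValueError("wavelengths must be positive")
    # 1 nm = 1e-7 cm, so a wavelength in nm corresponds to 1e7 / nm in cm^-1.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


# Illustrative values only: 785 nm excitation, photon detected at 850 nm.
shift = raman_shift_cm1(785.0, 850.0)
print(f"Raman shift: {shift:.0f} cm^-1")
```

With these example numbers the shift is about 974 cm−1, which lies inside the fingerprint region.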

Fig. 1.3 Energy level diagram for elastic (Rayleigh) and inelastic spontaneous Raman scattering

The ground state vibrational frequencies and energies vary depending on the strengths of the bonds and the masses of the atoms involved in the normal mode motion. The greatest variety of vibrational transitions in biological molecules occurs in the fingerprint range (500–1800 cm−1). Signatures in the high-wavenumber (HW) range (2800–3500 cm−1) arise from transitions between states of modes involving symmetric or asymmetric stretching of C–H bonds. The intensity of the Raman peaks of a particular molecule is directly proportional to the concentration of that molecule within a sample, so the resulting spectrum is a superposition of the Raman responses of all the Raman-active molecules within the sample. Therefore, a Raman spectrum is an intrinsic molecular fingerprint of the sample, revealing detailed information about DNA, protein, and lipid content as well as macromolecular conformations, which can be extracted from the measured spectra. The capacity of a spectrum to encode chemical information can be estimated as the maximum number of distinct spectral states one can discriminate, and amounts to up to 50 spectral peaks in the entire Raman spectrum [99]. The original analyses of Raman signals are based on differences in the intensity, shape, and location of the various Raman bands between normal and cancerous cells and tissues. These characteristic Raman bands provide information not only about the biological components of the cell but also about their quality, quantity, symmetry, and orientation. They can be used for understanding the spectral signature as it pertains to the disease process. However, it should be taken into account that the high sensitivity to small biochemical changes is accompanied by a weak Raman signal (the inelastic scattering cross section is approximately \(10^{-30}\) \(\mathrm{cm}^2\) per molecule), often in the presence of a high background. Therefore, significant problems exist in acquiring viable Raman signatures from chemically complex and widely varying biological tissue.
The primary challenge in obtaining Raman spectra from biological materials is the intrinsic fluorescence, which is present in almost all tissues and is several orders of magnitude more intense than the Raman signal.

A typical Raman setup is shown in Fig. 1.4 and consists of three primary components: a laser source (1), a sample light delivery and collection module (2), and a spectrometer with a CCD detector (3). The diagnostic effectiveness of a Raman system is tightly bound to the instrumentation parameters, which have to be chosen very carefully to measure the weak Raman signals. Generally, the choice of instrumentation is a compromise between different factors driven by the tissue under study and the pathophysiological processes involved. For example, the laser power is constrained by the required signal-to-noise ratio (SNR) on the one hand and by the maximum permissible exposure on the other.

Fig. 1.4 The typical Raman setup: 1: laser; 2: sample light delivery and collection module; 3: spectrometer with CCD detector; 4: PC. L1, L2: fiber-coupling lenses; OBJ: objective lens; DM: dichroic mirror; M: mirror; BPF: bandpass filter; LPF: longpass filter

The key component of a Raman system is the detector, which in most cases is a charge-coupled device (CCD). Several important factors have to be considered when choosing the appropriate CCD array for a Raman spectroscopy application; in particular, the noise level and the quantum efficiency (QE) are of great importance. A typical CCD camera used in spectroscopy consists of a rectangular chip in which the horizontal axis corresponds to the wavelength/wavenumber axis and the vertical axis is used to stack multiple fibers for increased throughput; the vertical pixels can subsequently be binned for improved SNR. While different types of chips are commercially available for different applications, a back-illuminated, deep-depletion CCD provides the highest QE in the NIR region. Most CCDs use a thermoelectric (TE) multistage Peltier system to actively cool the camera down to at least −70 °C in order to achieve excellent dark noise performance. In fact, current Raman systems for most biomedical applications are limited only by shot noise. The selection of an appropriate excitation wavelength is often governed by the need to reduce the fluorescence and scattering background, which decreases with increasing wavelength. However, because of the \(1/\lambda^{4}\) dependence, the Raman intensity also decreases with increasing wavelength, and the quantum efficiency of silicon-based CCD detectors falls rapidly for wavelengths above 1000 nm.

Overall, researchers in this field tend to prefer 785 nm and 830 nm excitation as a reasonable compromise for most tissues. A comprehensive overview of different Raman-instrumentation schemes and various probe designs is given in [100,101,102].
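The compromise described above can be made concrete with a short calculation. The sketch below uses only the \(1/\lambda^{4}\) scaling and the definition of the Raman shift; the function names are hypothetical, 532 nm is chosen arbitrarily as the reference excitation, and all other factors (cross sections, detector response, tissue absorption) are assumed equal.

```python
def stokes_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) of a Stokes-shifted photon for a given Raman shift."""
    scattered_cm1 = 1e7 / excitation_nm - shift_cm1
    if scattered_cm1 <= 0:
        raise ValueError("shift exceeds the excitation photon energy")
    return 1e7 / scattered_cm1


def relative_raman_intensity(excitation_nm: float, reference_nm: float) -> float:
    """Raman intensity relative to a reference excitation wavelength,
    using only the 1/lambda^4 scaling (all other factors assumed equal)."""
    return (reference_nm / excitation_nm) ** 4


for excitation in (532.0, 785.0, 830.0):
    edge_fp = stokes_wavelength_nm(excitation, 1800.0)  # upper end of fingerprint range
    edge_hw = stokes_wavelength_nm(excitation, 3500.0)  # upper end of HW range
    rel = relative_raman_intensity(excitation, 532.0)
    print(f"{excitation:.0f} nm: 1800 cm^-1 -> {edge_fp:.0f} nm, "
          f"3500 cm^-1 -> {edge_hw:.0f} nm, relative intensity {rel:.2f}")
```

Under these assumptions, 785 nm excitation keeps the whole fingerprint region below about 914 nm, while the HW region extends to roughly 1082 nm, beyond the 1000 nm limit where silicon QE drops. The Raman intensity at 785 nm is about one fifth of that at 532 nm.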

Data Processing and Analysis

The background can be subtracted directly from the raw signal by shifting the excitation wavelength by a few nanometers and then taking the difference of the acquired signals, but this hardware technique requires specific design considerations, including the use of tunable stabilized lasers [103, 104]. Other common methods for fluorescence elimination use software-based mathematical techniques such as frequency-domain filtering [105], wavelet transformation [106], and polynomial fitting [107, 108]. Polynomial curve fitting has an advantage over other fluorescence reduction techniques because of its inherent ability to retain the spectral contours and intensities of the input Raman spectra and the minimal presence of artificial peaks in low-SNR spectra [100, 108].
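A minimal sketch of polynomial background removal is given below. It uses an iterative clipped polynomial fit, which is one common variant and is not necessarily the exact procedure of the cited references. The function name, the default polynomial order, the number of iterations, and the synthetic spectrum are all assumptions made for illustration. NumPy is required.

```python
import numpy as np


def polynomial_baseline(wavenumbers, intensities, order=5, n_iter=100):
    """Estimate a smooth background by iterative polynomial fitting.

    At each pass the polynomial is refitted and any point lying above
    the fit (a likely Raman peak) is clipped down to it, so the fit
    moves toward the broad background underneath the peaks.
    """
    x = np.asarray(wavenumbers, dtype=float)
    work = np.asarray(intensities, dtype=float).copy()
    # Rescale the axis to [-1, 1] to keep the fit well conditioned.
    t = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    baseline = work
    for _ in range(n_iter):
        coeffs = np.polyfit(t, work, order)
        baseline = np.polyval(coeffs, t)
        work = np.minimum(work, baseline)
    return baseline


# Synthetic example (not measured data): quadratic background plus one band.
x = np.linspace(600.0, 1800.0, 241)
background = 100.0 + 0.05 * (x - 600.0) - 2e-5 * (x - 600.0) ** 2
band = 50.0 * np.exp(-0.5 * ((x - 1004.0) / 8.0) ** 2)
corrected = (background + band) - polynomial_baseline(x, background + band, order=3)
print(f"corrected band height: {corrected.max():.1f}")
```

The polynomial order is the main tuning parameter: too low an order leaves residual background, while too high an order starts to follow the Raman bands themselves.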

Because the Raman scattering intensity is extremely weak, the measured Raman spectra require significant noise smoothing and binning before the underlying Raman bands can be extracted. Common approaches include the median filter, the moving-average window filter, the Gaussian filter, the Savitzky–Golay filter of various orders [109,110,111], and multivariate statistical approaches that remove the higher-order components and noise [112].
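As an illustration of one of the smoothing filters listed above, the sketch below implements the simplest Savitzky–Golay case, a 5-point window with a quadratic polynomial, using the standard convolution weights (−3, 12, 17, 12, −3)/35. The function name, the edge handling, and the example numbers are assumptions for illustration; practical work would normally use a library routine with a window and order tuned to the spectral resolution.

```python
def savitzky_golay_5pt(intensities):
    """Smooth a spectrum with the 5-point quadratic Savitzky-Golay filter.

    Uses the standard convolution weights (-3, 12, 17, 12, -3) / 35.
    The two points at each end are returned unchanged.
    """
    weights = (-3.0, 12.0, 17.0, 12.0, -3.0)
    values = [float(v) for v in intensities]
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / 35.0
    return smoothed


# Illustrative noisy trace (made-up numbers, not measured data).
noisy = [10.0, 12.0, 9.0, 11.0, 30.0, 52.0, 31.0, 10.0, 12.0, 9.0, 11.0]
print([round(v, 1) for v in savitzky_golay_5pt(noisy)])
```

Unlike a plain moving average, this filter reproduces any locally quadratic signal exactly, which is why it tends to preserve band heights and widths better at a comparable window length.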

Raman spectra are complex in nature because tissue contains a diverse set of small and large biomolecules. The vibrational frequencies associated with different functional groups and backbone chains, for example, in proteins, saccharides, and nucleic acids, often overlap, making it difficult to assign a specific observed band in the Raman spectrum to a specific functional group of a particular molecule in the tissue [113]. Moreover, while the peak location of an isolated functional group of atoms is typically known, the actual peak location of a functional group in a molecule may differ slightly from the isolated case because of interactions and bonding with its neighbors. Nevertheless, functional groups associated with specific molecules often give rise to relatively narrow and well-resolved bands in the Raman spectra. Table 1.1 summarizes the major Raman spectral modes for which spectral differences have been found between normal and cancerous tissues [94, 95, 98, 114,115,116,117,118,119,120,121]. A detailed description of Raman spectral modes for malignant tissues may be found in Refs. [94, 115]. Characteristic Raman peaks arise from nucleic acids, lipids (C–C, C–O stretching), proteins (C–C, C–N stretching), and C–O stretching of carbohydrates in the region between 800 and 1200 cm−1; C–N stretching and N–H bending (amide III band) with contributions from proteins (CH3CH2 wagging, twisting, bending), polysaccharides, lipids (CH3CH2 twisting, wagging, bending), and nucleic acids in the region between 1200 and 1400 cm−1; C–H, CH2, and CH3 vibrations in the region between 1400 and 1500 cm−1; C=O stretching vibrations (amide I band), proteins (C=C), nucleic acids, and lipids (C=C stretch) in the region between 1500 and 1760 cm−1; CH2 symmetric and asymmetric stretching modes of lipids and proteins in the region between 2850 and 3000 cm−1; and OH stretching modes of water in the region from 3100 to 3500 cm−1.
Although there are clear Raman bands in malignant tissue that are probably connected to the abundance of different biomolecules, no unique peaks have been found that could be assigned to any one type of cancer alone.

Table 1.1 Major molecular vibrational modes and biochemical assignments observed for normal and malignant tissues [94, 95, 98, 114,115,116,117,118,119,120,121]

The prognostic value of Raman diagnostics follows from its inherent chemical specificity, which makes it possible to determine changes in the composition of the tumor compared with the surrounding tissue. The observed alterations involve several biomarkers, such as the relative abundance of DNA [122,123,124]; changes in structural and hydrogen-bonding information for lipids, proteins, and nucleic acids [122, 125, 126]; variation in collagen and elastin content [122, 123, 127,128,129,130]; and an increase or decrease in the content of chemical components such as tryptophan [113, 123, 128, 129], keratin [113], carotenoids [129, 131], glycogen [123, 128, 131], cholesterol ester [122, 131], and tyrosine and proline [123, 128]. Most markers have several peaks, which makes detection more robust. In some cases it is possible to identify diagnostically relevant species from only a few factors. For example, Haka et al. [122] have shown that the relative abundances of calcium hydroxyapatite and calcium oxalate dihydrate correlate with malignancy in breast cancer. Peak pairs can also provide information on protein-to-DNA and protein-to-lipid ratios. Several research groups have shown that the ratio of intensities at 1455 cm–1 and 1655 cm–1 may be used for classification of tumor vs. normal tissue in the lung, brain, breast, colon, and cervix [109, 124, 129], since the 1655 cm–1 band corresponds to the C=O stretching of collagen and elastin, and the 1445 cm–1 band (CH2 scissoring) varies with the lipid-to-protein ratio. In most cases, however, spectral changes between healthy and diseased tissue appear in the context of the entire, highly complex tissue spectrum, and diagnostic information can be derived only with the help of spectral pattern recognition approaches. The Raman spectra also contain hidden links between different bands of the spectrum because the same chemical components contribute to several bands. This leads to the emergence of multiple correlations.
Consequently, multivariate statistical techniques have become the accepted practice for the development of discrimination and classification algorithms for diagnostic applications. Chemometrics is one of the powerful tools that are able to identify variations that lead to accurate and reliable separation of malignant and normal tissue. In the past few years discrimination techniques such as linear and nonlinear regression [132,133,134], principal component analysis (PCA) for data compression [112, 114] as well as classification techniques such as support vector machines (SVMs) [135], neural networks [125, 136], classification trees [137, 138], partial least-squares discriminant analysis (PLS-DA) [139, 140] have been employed. One of the perceived advantages of PLS-DA is that it has the ability to analyze highly collinear and noisy data. As a result, a combination of Raman spectral data and chemometrics is capable of differentiation between cancer and normal tissues as surveyed from the publications reviewed in Table 1.2.
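Before turning to full multivariate models, the simple two-band marker mentioned above (the ratio of intensities near 1455 and 1655 cm−1) can be sketched as follows. The helper names and the spectrum values are made up for illustration. No decision threshold is implied, since any cut-off would have to be derived from labeled data for the tissue in question.

```python
def band_intensity(wavenumbers, intensities, target_cm1):
    """Intensity at the sampled point closest to target_cm1."""
    idx = min(range(len(wavenumbers)),
              key=lambda i: abs(wavenumbers[i] - target_cm1))
    return intensities[idx]


def band_ratio(wavenumbers, intensities,
               numerator_cm1=1455.0, denominator_cm1=1655.0):
    """Ratio of two band intensities from a background-corrected spectrum."""
    denominator = band_intensity(wavenumbers, intensities, denominator_cm1)
    if denominator == 0:
        raise ZeroDivisionError("denominator band has zero intensity")
    return band_intensity(wavenumbers, intensities, numerator_cm1) / denominator


# Made-up, background-corrected intensities for illustration only.
wn = [1440.0, 1445.0, 1450.0, 1455.0, 1460.0, 1650.0, 1655.0, 1660.0]
spec = [0.80, 0.92, 1.00, 0.96, 0.85, 0.70, 0.80, 0.72]
print(f"I(1455)/I(1655) = {band_ratio(wn, spec):.2f}")
```

A single ratio of this kind ignores the correlations between bands noted above, which is one reason the multivariate methods listed in this section are generally preferred.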

Table 1.2 General overview of Raman spectroscopy cancer studies

Tissue Analysis

To assess the applicability of Raman spectroscopy for the clinical diagnosis of cancer, numerous studies have been conducted with extracted tissues that have been frozen (with liquid nitrogen or dry ice) at the time of collection and thawed for study or fixed in formalin to prevent deterioration. The fixation process chemically alters the tissue, primarily cross-linking the collagen proteins, and thus, affects the Raman spectral signature of the tissue. Although some differences are observed in the Raman spectra of fresh and fixed tissues, the variation appears to be small and does not fundamentally affect the potential diagnostic capability of the spectrum [95].

Lyng et al. [141] have examined formalin-fixed, paraffin-preserved specimens of benign lesions (fibrocystic, fibroadenoma, intraductal papilloma) and cancer (invasive ductal carcinoma and lobular carcinoma) with the aim of aiding the histopathological diagnosis of breast cancer. Several modes of vibration have been found to be significantly different between the benign and malignant tissues. The band at 1662 cm−1 is assigned to the amide I mode originating mainly from proteins and nucleic acids. The two weak bands at 1610 and 1585 cm−1 observed in the breast tissue are due to the ν(C=C) modes of aromatic amino acids (phenylalanine, tyrosine, and tryptophan). The band at 1448 cm−1 is assigned to the ν(CH2/CH3) modes from a combination of lipoproteins from the cell membrane, adipose tissue, and nucleic acids. The amide III bands are observed in the region of 1295–1200 cm−1 and are attributed to a combination of ν(CN) and ν(NH) modes of the peptide bond ν(–CONH). The bands at 936 and 856 cm−1 are assigned to the ν(C–C) modes of proline and valine and the ν(C–CH) modes of proline and tyrosine, respectively. The spectrum exhibits three major characteristic bands in this region, including those due to the ν(C=C) mode at 1515 cm−1, the ν(C–C) mode at 1156 cm−1, and the ring breathing mode at 1004 cm−1. The performance of the different algorithms (PCA-LDA, PCA-QDA, PLS-DA, linear c-SVC, linear nu-SVC, RBF c-SVC, and RBF nu-SVC) has been evaluated using the sensitivity and specificity calculated from the Raman results, with histopathology as the gold standard. The PCA-LDA, PCA-QDA, and PLS-DA models have achieved a similar sensitivity and specificity of 80%. The SVM models have achieved sensitivity and specificity exceeding 90%, but have required more processing time than the other models.
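Sensitivity and specificity, the two figures of merit used throughout this section, can be computed from classifier output and gold-standard labels as in the sketch below. The function name and the ten example labels are hypothetical and do not reproduce data from any cited study.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for binary labels.

    `reference` holds the gold-standard labels (True = malignant) and
    `predicted` the classifier output in the same order.
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have equal length")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the reference")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: five malignant and five benign specimens.
reference = [True] * 5 + [False] * 5
predicted = [True, True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, reference)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Sensitivity is the fraction of histopathologically malignant specimens that the classifier flags as malignant, and specificity is the fraction of benign specimens that it correctly leaves unflagged.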

Cell Lines

The complexity of tissue structure and environment makes the interpretation of tissue Raman spectra difficult. An understanding of the molecular, microscopic, and macroscopic origin of observed tissue Raman signals may be achieved by in vitro study of Raman spectra of biologically important molecules in solution, in single living cells, in cell cultures prepared from surgically removed human tissues [142] or established with cancer cell lines [114]. Cell lines are widely used in many aspects of laboratory research and particularly as in vitro models in cancer research. They have a number of advantages, for example, they are easy to handle and represent an unlimited self-replicating source that can be grown in almost infinite quantities. In addition, they exhibit a relatively high degree of homogeneity and ease of handling [143]. Raman spectra not only reveal differences in biological composition between cell lines but also represent the combined effect of these parameters in order to study various aspects of elementary biological processes such as the cell cycle, cell differentiation, and apoptosis.

The majority of researchers have primarily been focused on spectral differences in the fingerprint range 600–1800 cm−1 as it includes peaks that can be assigned to different biochemical compounds, such as lipids, proteins, or nucleic acids. The lipid content and the chemical structure of these compounds, for instance, can be evaluated using peak frequencies of 1754 cm−1 (C=O), 1656 cm−1 (C=C), 1440 cm−1 (CH2 bend), and 1300 cm−1 (CH2 twist). Specification of the protein content of biological samples can also be understood from 1656 cm−1 (amide I), 1450 cm−1 (CH2 bend), 1100–1375 cm−1 (amide III), and 1004 cm−1 (phenylalanine) [99, 115, 127, 142,143,144,145,146,147,148,149].

Oshima et al. [150] have demonstrated differences among cultures of normal and cancerous lung cell lines, namely adenocarcinoma and squamous cell carcinoma with low to medium and high malignancy. Single-cell Raman spectra have been obtained using a 532-nm excitation wavelength. Strong bands at 747, 1127, and 1583 cm−1 have been assigned to cytochrome c (cyt-c), indicating a resonance near 550 nm with the excitation light. Peaks at 1449, 1257, 1003, and 936 cm−1 have been assigned to the CH2 deformation, amide III, the symmetric ring breathing band of phenylalanine of the protein, and C–C stretching, respectively. The bands at 720, 785, 830, 1086, 1340, 1421, and 1577 cm−1 have been assigned to nucleic acids (DNA and RNA). The overlapping modes of the amide I band of protein and the C=C stretching band of lipids form a strong Raman peak at 1659 cm−1. PCA has been successfully applied, and 80% accuracy has been achieved in discriminating between the four cancer cell lines.

Guo et al. [151] have reported that Raman spectroscopy can be used to differentiate malignant hepatocytes from normal liver cells. It has been shown that the strong bands at 1447 and 1656 cm−1 can be attributed to the CH2 deformation mode and the C=C stretching mode of the lipids and proteins, respectively. The band originating at 786 cm−1 can be assigned to the O–P–O stretching mode of DNA. The bands appearing at 1004 and 1032 cm−1 can be assigned to the symmetric ring breathing mode and the C–H in-plane bending mode of phenylalanine, respectively. Statistical methods such as t test, PCA, and LDA have been used to analyze the Raman spectra of both cell lines. The results of t test have confirmed that the intensities of these bands are considerably different between two cell lines, except for the 1585 and 1625–1720 cm−1 bands.

Crow et al. [128] have studied different prostatic adenocarcinoma cell lines and have found that principal components allow identification of molecular species from their Raman peaks and provide an understanding of the origins of the statistical variations. PC1 represents increased concentrations of nucleic acids (721, 783, 1305, 1450, and 1577 cm−1), DNA backbone (O–P–O) (827 and 1096 cm−1), and unordered proteins (1250 and 1658 cm−1). PC2 represents decreased concentrations of α-helix proteins (935, 1263, and 1657 cm−1) and phospholipids (719, 1094, 1125, and 1317 cm−1). PC3 represents decreased concentrations of lipids (1090, 1302, and 1373 cm−1), glycogen (484 cm−1), and nucleic acids (786, 1381, and 1576 cm−1). The PCA/LDA algorithm has achieved near-perfect identification of each cell line, with sensitivities ranging from 96 to 100% and specificities all 99% or higher.
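The way principal components point back to individual Raman bands can be illustrated with a small PCA computed by singular value decomposition. This is a generic sketch on synthetic spectra, not the analysis pipeline of the cited study. The function name, the single synthetic band at 1004 cm−1, the noise level, and the two group sizes are assumptions. NumPy is required.

```python
import numpy as np


def pca(spectra, n_components=3):
    """PCA of a (n_spectra, n_wavenumbers) array via SVD.

    Returns (scores, loadings, explained_variance_ratio). Each loading
    is a spectrum-like vector whose peaks show which bands drive the
    corresponding component.
    """
    data = np.asarray(spectra, dtype=float)
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic spectra: one band at 1004 cm^-1 whose height differs by group.
rng = np.random.default_rng(0)
wn = np.linspace(600.0, 1800.0, 121)
band = np.exp(-0.5 * ((wn - 1004.0) / 10.0) ** 2)
heights = np.r_[np.full(10, 1.0), np.full(10, 2.0)]
spectra = heights[:, None] * band[None, :] + 0.01 * rng.standard_normal((20, 121))
scores, loadings, explained = pca(spectra, n_components=2)
strongest = wn[int(np.argmax(np.abs(loadings[0])))]
print(f"PC1 explains {explained[0]:.0%}; strongest loading near {strongest:.0f} cm^-1")
```

In this toy case the first loading peaks at the band that differs between the groups and the PC1 scores separate them. In a combined PCA/LDA scheme, scores of this kind would then be passed to a discriminant classifier.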

Krishna et al. [152] have used micro-Raman spectroscopy to investigate randomly mixed cancer cell populations, including human promyelocytic leukemia, human breast cancer, and human uterine sarcoma, as well as their respective pure cell lines. According to the results, cells from different origins can display variances in their spectral signatures and the technique can be used to identify a cell type in a mixed cell population via its spectral signatures.

Recent attention has been directed towards the use of high-wavenumber range (2800–3600 cm–1), as the HW spectral range exhibits stronger tissue Raman signals with less autofluorescence interference. In this spectral region most of the spectral features obtained from tissue are overlapping symmetric and asymmetric stretching of CH2 and CH3 vibrations of phospholipids and proteins with four main peaks, located at 2850 cm−1, 2880 cm−1, 2920 cm−1, and 2960 cm−1 [144, 147, 148]. There are also minor peaks of SH-stretching vibrations 2500–2600 cm−1 [115, 145] and broad band of OH-stretching vibrations (primarily due to water) in the spectral interval 3100–3500 cm−1 [145,146,147]. The CH stretch vibrations are sensitive to their environment by direct coupling and through Fermi resonances with C–H bending modes near 1500 cm−1. Together, these influences can introduce significant shifts and broadening of the CH stretch peaks [99, 127].

For example, Talari et al. [114] have studied Raman spectra of a normal (MCF-10A) and two cancerous breast cell lines with different concentrations of nucleic acid (MDA-MB-436 and MCF-7) in the fingerprint and HW ranges using a noninvasive dispersive micro-Raman system equipped with a 532-nm laser. Peak intensities have shown clear differences among the three cell lines in the lipid (2934 cm−1), amide I (1658 cm−1), and amide III (1244 cm−1) ranges. PCA over the whole spectral range has shown good overall separation between the three cell lines, but it has not formed separate clusters representing “normal” and “cancerous/diseased” classes [114]. This suggests a very large biochemical variation even between the two cancerous breast cell lines. The MCF-7 cell line appears to be much higher in lipids than MDA-MB-436 and MCF-10A, and PCA works well to single out this cell line on the basis of the high-wavenumber region, which includes major peaks of symmetric and asymmetric stretching CH2 vibrations of lipids at 2882 cm−1 and C–H and CH2 symmetric vibrations in lipids and proteins (2940 cm−1, 2921 cm−1, and 2948 cm−1). Although MCF-7 and MDA-MB-436 are both breast cancer subtypes, MDA-MB-436 does not appear to contain lipids at a concentration vastly different from that found in the normal MCF-10A cell line. Instead, the difference lies more in the relative protein and amino acid concentrations, which may be identified in the fingerprint region for adenine and guanine (1337 cm−1), CH2 deformation of lipids, adenine, and cytosine (1258, 1299, and 1304 cm−1), methylene twisting vibrations (1294 cm−1), different conformations in C=O stretching of proteins (1687 cm−1), anti-parallel β-sheets of amide I (1670 cm−1), tryptophan or β-sheet of protein (1621 cm−1), C=C of phenylalanine ring vibration, tyrosine (1607 cm−1), and tryptophan (1548 cm−1).

Gala de Pablo et al. [153] have studied the differences in Raman spectra (Fig. 1.5) between primary (SW480) and secondary (SW620) tumor cells, derived from a primary Dukes’ stage B adenocarcinoma and from a secondary tumor in a lymph node of the same patient. The CH2 and CH3 stretching contributions in the region of 2800–3200 cm−1 have shown a higher overall intensity for primary tumor cells and a greater CH2:CH3 ratio for secondary cells, indicating differences in lipid composition between the two cell lines, with a higher lipid content in the larger primary cells.

Fig. 1.5 Average single-cell spectra and variability spectrum for primary (SW480) and secondary (SW620) tumor cells, normalized to the amide I peak. The error around the average shows one standard deviation. The region around 2900 cm−1 is shown reduced by a factor of 4 to enhance the details in the fingerprint region. Adapted with permission from [153]

When normalizing to the amide I band, secondary tumor cells (SW620) show a larger contribution of α-helix proteins, saccharides, nucleic acids, and double bonds related bands, whereas primary tumor cells (SW480) show larger contribution of lipids, β-sheet, and disordered structure proteins. Principal component analysis with linear discriminant analysis yields the best classification between the SW620/SW480 cell lines, with an accuracy of 98.7 ± 0.3% (standard error).

Laser-Trapped Single-Cell Diagnostics

The combination of laser tweezers and Raman detection is a very attractive approach for identifying malignant cells in cytological diagnosis systems. Chen et al. [142] have applied PCA to Raman spectra of single colorectal epithelial cells laser-trapped in solution in order to differentiate cancerous from normal epithelial cells. The higher concentrations of nucleic acids and proteins in cancerous cells are reflected in major variations and an increase in Raman intensities at 788 cm−1 (DNA backbone O–P–O stretching), 853 cm−1 (ring breathing mode of tyrosine and C–C stretching of proline ring), 938 cm−1 (C–C backbone stretching of protein α-helix), 1004 cm−1 (symmetric ring breathing of phenylalanine), 1095 cm−1 (DNA PO2 symmetric stretching), 1257 cm−1 (amide III β-sheet), 1304 cm−1 (lipids CH2 twist), 1446 cm−1 (CH2 deformation of all components in cell), and 1657 cm−1 (C=O stretching of amide I α-helix). The PCA scores have been fed into a logistic regression algorithm to determine the parameter equation that best differentiates the cancer cells from the normal ones, yielding an overall sensitivity of 82.5% and specificity of 92.5%.

The extensive ex vivo studies have helped to form a reliable and detailed database of accurate Raman peak assignments and have provided knowledge about the differences in the spectral features of normal, benign, and malignant tissues (see references in Table 1.1). However, the real benefit of the method can only be explored through in vivo applications, which have become possible owing to advances in detector technology and progress in the development of miniature Raman fiber-optic probes. Accordingly, there has been a significant movement from ex vivo to in vivo studies in recent years. A partial list of Raman clinical applications for cancer diagnostics can be found in Table 1.2. Strong efforts have been made to transfer ex vivo tissue statistical models and classifiers to the in vivo clinical situation. For example, Molckovsky et al. [154] have found that an ex vivo classifier did not perform well in vivo; indeed, the PCA results for ex vivo and in vivo tissue differ. Therefore, the designed models need to be adapted to in vivo applications. In vivo studies are focused on three major clinical targets: early cancer diagnosis, biopsy guidance, and oncologic surgery guidance. As may be seen in Table 1.2, the average sensitivity and specificity obtained using Raman spectroscopy for different cancer types vary from 83 to 96% and from 77 to 94%, respectively. It is interesting to point out that multimodal approaches, which combine different modalities (OCT, fluorescence, and Raman spectroscopy), improve the sensitivity of in vivo Raman diagnostic systems by 5–8% and allow a more accurate diagnosis of premalignant lesions. Implementing biophysical models together with cross-validation algorithms makes it possible to obtain a statistical predictor for cancer diagnostics with a semi-quantitative biochemical justification.

2.4 Fluorescence Spectroscopy

Light-induced autofluorescence spectroscopy is a very attractive tool for the early diagnosis of cancer because of its high sensitivity, easy-to-use measurement methodology, lack of need for exogenous contrast agents, suitability for real-time measurements, and the generally noninvasive character of the detection technique, which allows one to work in vivo without prior preparation of the samples [5, 79, 80, 83,84,85]. Highly sensitive cameras and narrow-band filters now make it possible to obtain two-dimensional fluorescence maps of the investigated tissues, which support the exact determination of tumor borders and safety margins; this information is required for, and very useful in, subsequent therapy planning. Fluorescence spectroscopy is a very sensitive tool with broad applicability for tumor detection. Its diagnostic sensitivity depends on many factors related to the lesions investigated: their biochemical content, metabolic state, morphological structure, localization, and stage of tumor development.

Internal chemical compounds that can fluoresce after irradiation with light are called endogenous fluorophores. Investigating the fluorescence emission properties of such compounds can give information about their concentration and their distribution across the different tissue structures and layers, as well as about alterations in the microenvironment related to disease progression, including changes in pH or temperature and chemical transformations or reactions that these fluorophores undergo. Typical endogenous fluorophores used for evaluating the tissue state are divided into several groups depending on their chemical nature, including amino acids, proteins, co-enzymes, vitamins, lipids, and porphyrins. Protein cross-links, which are supramolecular structures related to the extracellular matrix of tissues, also add their fluorescence signals and enrich the picture of emission properties that can be used for the biomedical diagnostics of tissues. Both the concentration and the distribution of the endogenous fluorophores in tissues depend on the metabolic and structural peculiarities of the tissue investigated. When the microenvironment of the tissue or cells in which they are situated changes, some of them also undergo chemical transformations, which can alter their emission properties and can likewise be used to evaluate tumor growth and metabolic activity in the lesions investigated.

Compounds that absorb light without re-emitting it as fluorescence under normal conditions are known as endogenous chromophores. They also influence the emission response when fluorescence spectroscopy is used for analysis and can significantly alter the emission detected from the tissues investigated. In the ultraviolet (UV) spectral range, most of the biologically important molecules, including amino acids, DNA, RNA, structural proteins, co-enzymes, and lipids, absorb light. Typical endogenous chromophores with absorption bands in the visible and near-infrared range, where the endogenous fluorescence of tissue is typically observed, are melanin (pheo- and eumelanin, the pigments typical of mammalian skin and eye tissues), hemoglobin (the pigment in red blood cells) in its oxygenated and deoxygenated forms (oxy- and deoxyhemoglobin), and bilirubin (a yellow pigment, a product of the catabolism of heme in hemoglobin). These absorbers can significantly influence the emission signal from the tissue investigated in two ways: directly, through a filtering effect in which they absorb the excitation light, so that the fluorophores are excited less efficiently and yield fewer emitted photons; and indirectly, by reabsorbing the emission from the fluorophores. Their absorption bands are observed in the reported emission spectra for different types and localizations of tumors. Cancerous tissues are characterized by a different content and distribution of such chromophores in the tissue volume. Therefore, their influence on the emission spectra is a significant index of the process of malignization: it is non-specific, but it is a diagnostically important additional indicator of tumor development in the tissue investigated.

The fluorescence properties investigated for tissue cancer diagnostics are based on steady-state measurements of excitation and emission spectra or on time-resolved measurements of fluorescence decay.

The steady-state fluorescence spectroscopy technique is based on detecting the fluorescence intensity as a function of the registered wavelength (or, equivalently, energy or frequency) for a fixed excitation wavelength. Each fluorophore is characterized by a specific pair of excitation and emission maxima: (1) the wavelength at which light is absorbed and converted into a fluorescence signal with maximal efficiency, and (2) the wavelength at which the observed fluorescence intensity is maximal in absolute value compared with all other wavelengths in the emission range of the given compound. Such a pair of excitation and emission wavelengths is unique for each fluorophore occurring in biological tissue and can be used as an indicator of the presence of this compound. If multiple excitation wavelengths are used for the consecutive detection of such emission spectra, the so-called excitation–emission matrix can be constructed, which makes it possible to address the whole set of endogenous fluorophores in a complex sample, such as biological tissue, which consists of a mixture of several different fluorescent compounds. Excitation–emission matrices constructed in this way contain specific islands of high fluorescence emission that correspond to the specific pairs of excitation and emission wavelengths. In the ideal case, the number of such “islands” in the excitation–emission matrix corresponds to the number of endogenous fluorophores present in the tissue investigated. The fluorescence emission intensity corresponds to the number of excited molecules of a given type of fluorophore that re-emit light, which correlates directly with the quantity of this compound in the sample investigated.
In such a way, the steady-state fluorescence intensity measurements allow the determination of the fluorophores’ presence and concentration inside of the object investigated, and they are broadly used in experimental studies of neoplasia due to simple approaches needed for spectral data detection, processing, and analysis.
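
The construction of an excitation–emission matrix and the counting of its “islands” can be illustrated with a minimal sketch. It assumes Gaussian-shaped excitation and emission bands and two hypothetical fluorophores whose peak positions, amplitudes, and band widths are placeholders chosen for illustration, not measured tissue values. The function names are likewise illustrative.

```python
import math

def gaussian(x, center, width):
    """Unit-height Gaussian band profile (assumed band shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def build_eem(excitations, emissions, fluorophores):
    """Return the EEM as rows (one per excitation wavelength) of emission intensities.

    Each fluorophore is a tuple (excitation_max_nm, emission_max_nm, amplitude).
    The band widths of 15 nm and 20 nm are assumptions for this sketch.
    """
    eem = []
    for ex in excitations:
        row = []
        for em in emissions:
            total = 0.0
            for ex_max, em_max, amplitude in fluorophores:
                total += amplitude * gaussian(ex, ex_max, 15.0) * gaussian(em, em_max, 20.0)
            row.append(total)
        eem.append(row)
    return eem

def find_islands(eem, threshold):
    """Return (row, column) indices of local maxima above the threshold.

    A point counts as an island center if it exceeds all eight neighbours.
    """
    peaks = []
    for i in range(1, len(eem) - 1):
        for j in range(1, len(eem[0]) - 1):
            value = eem[i][j]
            if value < threshold:
                continue
            neighbours = [eem[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            if all(value > n for n in neighbours):
                peaks.append((i, j))
    return peaks

excitations = list(range(300, 501, 10))  # excitation wavelengths, nm
emissions = list(range(350, 701, 10))    # emission wavelengths, nm
hypothetical_fluorophores = [(340, 450, 1.0), (450, 530, 0.6)]

eem = build_eem(excitations, emissions, hypothetical_fluorophores)
islands = [(excitations[i], emissions[j]) for i, j in find_islands(eem, 0.1)]
print(islands)
```

In real tissue the islands overlap and are distorted by chromophore absorption, so the number of detected maxima is only a lower-bound estimate of the number of fluorophores.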

Time-resolved fluorescence spectroscopy is less widely used for biological tissue investigations because it requires very sensitive and fast detection equipment, which makes it more expensive. Time-resolved measurements yield the fluorescence decay time of the endogenous fluorophores after irradiation with a short pulse of excitation light. The fluorescence decay time, also called the fluorescence lifetime, characterizes the emissive decay from the excited to the ground singlet-state energy level of the endogenous fluorophore molecule. The typical decay time of diagnostically important fluorophores lies in the range from picoseconds to nanoseconds. The value of this parameter is specific to a given chemical compound, but it can also vary because it is highly sensitive to small perturbations in the microenvironment around the fluorophore molecule. Information about the fluorescence decay time and its deviations therefore provides knowledge about the interaction of a given fluorophore with surrounding molecules and about the microenvironment conditions of the molecular ensemble in general.
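
As an illustration of how a lifetime can be extracted from a decay curve, the sketch below assumes the simplest possible model, a single-exponential decay I(t) = I0·exp(−t/τ), and fits it by linear least squares on the logarithm of the intensity. The lifetime of 2.5 ns and the sampling grid are hypothetical. Tissue decays are usually multi-exponential and convolved with the instrument response, neither of which is handled here.

```python
import math

def fit_lifetime(times_ns, intensities):
    """Estimate (tau_ns, i0) of a mono-exponential decay by log-linear least squares.

    ln I(t) = ln I0 - t / tau, so the slope of ln I versus t equals -1 / tau.
    """
    logs = [math.log(v) for v in intensities]
    n = len(times_ns)
    mean_t = sum(times_ns) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ns)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ns, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -1.0 / slope, math.exp(intercept)

# Noise-free synthetic decay with a hypothetical lifetime of 2.5 ns.
true_tau_ns = 2.5
times = [0.25 * k for k in range(40)]
decay = [1000.0 * math.exp(-t / true_tau_ns) for t in times]

tau_ns, i0 = fit_lifetime(times, decay)
print(round(tau_ns, 3), round(i0, 1))
```

With noisy photon-counting data, a weighted or maximum-likelihood fit would normally be preferred over the unweighted log-linear fit used in this sketch.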

In the process of malignancy development prominent alterations in biochemical and morphological properties of the biological tissues are observed. They can lead to significant differences in the fluorescent spectra of normal and abnormal biological tissues, which can be detected and used as diagnostic indicators and/or as predictors of tumor lesion development.

Table 1.3 presents the typical endogenous fluorophores and chromophores and the changes in their fluorescence properties that are indicative of malignant alterations in biological tissues. The reasons for these changes are also briefly indicated, according to the investigations of the research groups cited.

Table 1.3 Endogenous fluorophores: excitation and emission maxima, the dynamics of their fluorescence intensity from normal to cancerous tissue development, and origins of the observed alterations in the fluorescent properties of cancer tissues

The alterations most often observed and discussed in the literature are related to changes in the NADH/NAD+ ratio, which change the level of autofluorescence intensity. The oxidized form of the coenzyme, NAD+, is not fluorescent, and its concentration increases in tumor cells because their metabolism is altered by the hypoxic environment in the tumor, which leads to a general decrease of the tumor fluorescence intensity in the 420–460 nm spectral region. A decrease in fluorescence intensity in the 470–500 nm region is also observed; it is caused by partial destruction of the tissue as the tumor lesion grows, by changes in the extracellular matrix, and by a decrease or even partial loss of structural protein content in the tumor area. This damage to the extracellular matrix affects the signals from collagen and elastin, the main structural proteins, as well as from protein cross-links. In some specific cases the opposite tendency is observed, namely when the tumor shows increased metabolic activity, fast growth, and low pigmentation, as in cutaneous squamous cell carcinoma (SCC) lesions. There, the autofluorescence intensity can be higher than that of the surrounding normal skin, and in advanced stages of SCC green flavin fluorescence can be detected and easily observed even with the naked eye.

Red fluorescence signals in vivo have also been observed and reported in the literature. The origin of this signal is hypothesized to be the accumulation of endogenous porphyrins in the tumor cells of various types of tumors. The specific emission signature of endogenous porphyrins, with a bright maximum at 635 nm and a less pronounced peak at 704 nm, can be observed in advanced stages of tumor growth (grades III and IV), which makes it a specific but not an encouraging index of lesion development. At the initial stage of lesion development, however, the fluorescence maxima at 635 and 704 nm usually have low intensity or are absent altogether, and they are not typically observed for grade I or II lesions. The porphyrin fluorescence signal can be increased by applying exogenous delta-aminolevulinic acid (5-ALA), a precursor of protoporphyrin IX (PpIX). After 5-ALA accumulates in the cells, it is transformed into heme. In normal cells the whole chain of heme synthesis is completed within a few hours, but in tumor cells the enzyme ferrochelatase is blocked, so the iron ion that would convert protoporphyrin IX into heme cannot be added to the molecule, and the concentration of PpIX rapidly increases in the cancerous area. In many clinical applications, photosensitizers from the porphyrin family are applied as exogenous fluorescent contrast agents.

3 Optical and Physiological Properties of Malignant Tissues

3.1 Lung cancer

In both sexes combined, lung cancer is the most commonly diagnosed cancer (11.6% of the total cases) and the leading cause of cancer death (18.4% of the total cancer deaths) [1]. The high mortality rate is explained by the fact that patients tend to be diagnosed at an advanced stage and by a lack of effective treatments. Part of the diagnostic process is white light or fluorescence bronchoscopy combined with tissue biopsy for definitive pathology. A problem with this technique is that it suffers from either low sensitivity or low specificity, and it is difficult to ensure the representativeness and quality of the biopsies during the procedure [242].

In an early study, Huang et al. [129] demonstrated the potential of Raman spectroscopy to accurately differentiate normal bronchial tissue specimens, squamous cell carcinoma, and adenocarcinoma. Compared to normal tissue, the Raman spectra of malignant tumor tissue were characterized by higher-intensity bands corresponding to nucleic acids (PO2 asymmetric stretching at 1223 cm−1 and CH3CH2 wagging at 1335 cm−1), tryptophan (752, 1208, 1552, and 1618 cm−1), and phenylalanine (1004, 1582, and 1602 cm−1) and by lower signals for phospholipids (CH2CH3 bending modes at 1302 and 1445 cm−1) and proline (855 cm−1). The peak at 1078 cm−1 in normal tissue, due to the C–C or C–O stretching mode of phospholipids, was shifted to 1088 cm−1 in tumor tissue and had lower normalized percentage signals, reflecting a decreased vibrational stability of lipid chains in tumors. The authors found that the ratio of the Raman band intensities at 1445 cm−1 (CH2 scissoring) and 1655 cm−1 (C=O stretching of collagen and elastin) had high discrimination power between normal and tumor tissues, with a sensitivity and specificity of 94% and 92%, respectively. Zakharov et al. [111, 243] used three ratios of the maximum scattering intensities in the 1300–1340 cm−1, 1640–1680 cm−1, and 1440–1460 cm−1 bands to separate lung tumor from healthy tissue, and subsequently differentiated adenocarcinoma from squamous cell carcinoma by the contrast of these ratios with the surrounding normal tissue. A sensitivity and specificity of 91% and 79%, respectively, were achieved. However, diagnostically useful information is contained not only in a few peaks; the entire spectral information could be important for the accuracy of tissue classification and cancer detection.
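
A band-ratio classifier of the kind described above can be sketched as follows. The spectra, the threshold of 1.0, and the assumption that tumor tissue shows the lower I1445/I1655 ratio are all hypothetical and serve only to show how sensitivity and specificity follow from a single ratio and a cut-off. This is not the procedure or the data of the cited studies.

```python
def band_ratio(spectrum, band_a, band_b):
    """Ratio of intensities at two Raman shifts; spectrum maps shift (cm^-1) to intensity."""
    return spectrum[band_a] / spectrum[band_b]

def classify_by_ratio(spectra, threshold, band_a=1445, band_b=1655):
    """Label a spectrum as tumor (True) when its ratio falls below the threshold.

    The direction of the inequality is an assumption of this sketch.
    """
    return [band_ratio(s, band_a, band_b) < threshold for s in spectra]

def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity); True denotes tumor."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical band intensities (arbitrary units), not measured data.
spectra = [
    {1445: 1.20, 1655: 1.0}, {1445: 1.10, 1655: 1.0}, {1445: 0.90, 1655: 1.0},   # normal
    {1445: 0.80, 1655: 1.0}, {1445: 0.70, 1655: 1.0},
    {1445: 1.05, 1655: 1.0}, {1445: 0.60, 1655: 1.0},                            # tumor
]
labels = [False, False, False, True, True, True, True]

predictions = classify_by_ratio(spectra, threshold=1.0)
sens, spec = sensitivity_specificity(labels, predictions)
print(sens, spec)
```

Moving the threshold trades sensitivity against specificity, which is why the two figures are always reported together.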

Similar spectral features were obtained by Magee et al. [244], who used shifted subtracted Raman spectroscopy to reduce the fluorescence from the lung tissue and principal component analysis with leave-one-out cross-validation for accurate tissue classification. The first in vivo study was conducted in 2008 by Short et al. [148]. The authors failed to obtain precise Raman spectra in the fingerprint range because of a high fluorescence background, which was explained by high levels of hemoglobin close to the tissue surface. Clear Raman peaks were registered only in the high-wavenumber (HW) range, where the intensity ratio of the extracted Raman peaks to the fluorescence was six times greater for the most intense Raman peaks than in the fingerprint range. Preliminary research on 26 patients demonstrated that the combination of Raman spectroscopy with white light bronchoscopy and autofluorescence bronchoscopy could reduce the number of unnecessary biopsies and achieve a sensitivity and specificity above 90% for the detection of lung cancer and high-grade dysplasia lesions [159].

Recently, McGregor et al. [156] used bronchoscopic Raman spectroscopy in vivo in 80 patients. The authors acquired Raman spectra from the high-wavenumber region (from 2775 to 3040 cm−1) with an acquisition time of 1 s. Major Raman peaks were observed for CH2 symmetric stretching modes of fatty acids and lipids at 2850 cm−1; CH3 symmetric stretching modes at 2885 cm−1; overlapping CH vibrations in proteins and CH3 asymmetric stretching modes of lipids and nucleic acids at 2940 cm−1; and in-plane and out-of-plane asymmetric CH3 stretching in lipid and fatty acid molecules at 2965 cm−1 and 2990 cm−1. The spectra of malignant lesions showed a distinctive loss of the lipid signal at 2850 cm−1. The intensity for the inflammation group was higher than for all other categories between 2850 cm−1 and 2900 cm−1. To extract a more reliable correlation of the spectra with pathology, principal component analysis with generalized discriminant analysis and PLS-DA with leave-one-out cross-validation (LOOCV) were used for spectral classification. The detection of high-grade dysplasia and malignant lung lesions resulted in a reported sensitivity of 90% at a specificity of 65%. In 2018, the same group developed a novel miniature Raman probe (1.35 mm in diameter) capable of navigating the peripheral lung architecture [245]. The spectra collected in vivo showed lipid, protein, and deoxyhemoglobin signatures in the fingerprint (1350–1800 cm−1) and HW (2300–2800 cm−1) ranges that might be useful for classifying pathology.
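
The leave-one-out cross-validation scheme mentioned above can be sketched independently of the classifier. In the example below, a simple nearest-centroid rule stands in for the PC-GDA and PLS-DA models of the cited study, which are not reproduced here. The two-feature vectors (for instance, normalized intensities at two high-wavenumber bands) are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose training centroid is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        dist = squared_distance(centroid(members), sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(features, labels):
    """Hold out each sample in turn, train on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)

# Hypothetical feature vectors, not measured data.
features = [[1.0, 0.80], [0.9, 0.85], [1.1, 0.75],   # benign
            [0.5, 0.90], [0.6, 0.95], [0.4, 1.00]]   # malignant
labels = ["benign"] * 3 + ["malignant"] * 3

loocv_accuracy = leave_one_out_accuracy(features, labels)
print(loocv_accuracy)
```

Because every spectrum is scored by a model that never saw it, the resulting estimate is less optimistic than the accuracy computed on the training set itself.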

It is known that repeated exposure to carcinogens, in particular cigarette smoke, leads to dysplasia of the lung epithelium. Continued exposure leads to genetic mutations, affects protein synthesis, and can disrupt the cell cycle and promote carcinogenesis. The genes most commonly mutated in lung cancer development are MYC, BCL2, and p53 for small cell lung cancer (SCLC) and EGFR, KRAS, and p16 for non-small cell lung cancer (NSCLC) [246,247,248]. Together, the broad divisions of SCLC and NSCLC represent more than 95% of all lung cancers.

Small Cell Lung Cancer

Histologically, SCLC is characterized by small cells with scant cytoplasm and no distinct nucleoli. The WHO classifies SCLC into three cell subtypes: oat cell, intermediate cell, and combined cell (SCLC with an NSCLC component, squamous or adenocarcinoma). SCLC is almost always associated with smoking. It has a rapid doubling time and metastasizes early; therefore, it is always considered a systemic disease at diagnosis. The central nervous system, liver, and bone are the most common sites of metastasis. Certain tumor markers help differentiate SCLC from NSCLC. The most commonly tested tumor markers are thyroid transcription factor-1, CD56, synaptophysin, and chromogranin. Characteristically, SCLC is associated with paraneoplastic syndromes, which can be the presenting feature of the disease.

Non-small Cell Lung Cancer

Five types of NSCLC are distinguished: squamous cell carcinoma, adenocarcinoma, adenosquamous carcinoma, large cell carcinoma, and carcinoid tumors. Squamous cell carcinoma is characterized by the presence of intercellular bridges and keratinization. These NSCLCs are associated with smoking and occur predominantly in men. Squamous cell cancers can present as a Pancoast tumor and with hypercalcemia. A Pancoast tumor is a tumor in the superior sulcus of the lung. The brain is the most common site of recurrence after surgery in cases of Pancoast tumor.

Adenocarcinoma is the most common histologic subtype of NSCLC. It is also the most common lung cancer in women and non-smokers. Classic histochemical markers include napsin A, cytokeratin-7, and thyroid transcription factor-1. Lung adenocarcinoma is further subdivided into acinar, papillary, and mixed subtypes.

Adenosquamous carcinoma comprises 0.4–4% of diagnosed NSCLC. It is defined as having more than 10% mixed glandular and squamous components. It has a poorer prognosis than either squamous cell carcinoma or adenocarcinoma. Molecular testing is recommended for these cancers.

Large cell carcinoma lacks the differentiation of a small cell and glandular or squamous cells [249].

The optical and physiological properties of lung tumor tissue were investigated in [36, 250,251,252]. Earlier, Qu et al. [250] investigated the optical properties of 10 human lung tumor samples (without indicating the kind of tumor) using the integrating sphere technique and the IAD method in the spectral range from 400 to 700 nm. The results of the measurements are presented in Table 1.4. Fishkin et al. [251] measured the optical properties of human large-cell primary lung carcinoma using a multi-wavelength frequency-domain photon migration instrument and found significant absorption differences between normal and tumor tissue at all wavelengths. Scattering changes were less significant but exhibited consistent wavelength-dependent behavior. Lower tumor scattering parameters (versus normal tissue) could be due to a loss of cellularity and increased water content in necrotic zones [251]. The authors demonstrated that the total hemoglobin content varied from 29.2 ± 2.4 to 42.9 ± 2.9 μM for normal tissue and from 85.1 ± 8.2 to 102 ± 10 μM for tumor tissue. Deoxyhemoglobin content varied from 6.22 ± 0.64 to 9.68 ± 1.04 μM for normal tissue and from 15.9 ± 3.2 to 20.2 ± 5.2 μM for tumor tissue. Oxyhemoglobin content varied from 23.0 ± 2.1 to 33.2 ± 2.7 μM for normal tissue and from 66.0 ± 7.4 to 86.0 ± 9.6 μM for tumor tissue. In turn, the oxygenation degree varied from 77.4 ± 8.2 to 82.2 ± 8.3% for normal tissue and from 77 ± 18 to 84 ± 13% for tumor tissue. Water content varied from 3.95 ± 1.94 to 5.87 ± 1.31 M for normal tissue and was 20.1 ± 10.8 M for tumor tissue [251]. Similar results were obtained by Fawzy et al. [252]. In that study, the authors measured in vivo 100 reflectance spectra of normal tissue and of benign and malignant lesions (small cell lung cancer, combined squamous cell carcinoma and non-small cell lung cancer, non-small cell lung cancer, and adenocarcinoma) in 22 patients.
According to their analysis, the mean blood volume fraction was higher for malignant lesions (0.065 ± 0.03) than for benign lesions (0.032 ± 0.02). The mean oxygen saturation was reduced from 0.90 ± 0.11 for benign lesions to 0.78 ± 0.13 for malignant lesions [252]. The significant increase in the blood volume fraction in malignant tissue is related to the overgrowth of the tumor microvasculature [253]. The significant decrease in blood oxygenation in malignant lesions is consistent with hypoxia-related changes during the development of cancer [254], which could be related to the increase in tissue metabolism and the high proliferation rate of the cancerous cells [252].
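
The oxygenation degree quoted above for the data of Fishkin et al. [251] follows from the oxy- and deoxyhemoglobin concentrations as the fraction HbO2 / (HbO2 + Hb). The short sketch below applies this standard relation to the reported normal-tissue values. Pairing the lower ends of the two ranges with each other, and the upper ends with each other, is an assumption of the sketch, since the source does not state which values were measured together.

```python
def oxygen_saturation(oxy_um, deoxy_um):
    """Fractional oxygen saturation from oxy- and deoxyhemoglobin concentrations (uM)."""
    return oxy_um / (oxy_um + deoxy_um)

# Normal-tissue values reported in [251]; the pairing of range ends is assumed.
normal_low = oxygen_saturation(23.0, 6.22)
normal_high = oxygen_saturation(33.2, 9.68)

print(f"{100 * normal_low:.1f}%  {100 * normal_high:.1f}%")
```

Both results, roughly 79% and 77%, lie within or at the edge of the 77.4–82.2% range reported for normal tissue, which is consistent with the quoted concentrations.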

Table 1.4 The optical properties of lung tumor tissues measured in vitro and in vivo [36, 250, 251]

3.2 Breast Cancer

Breast cancer is the second leading cause of cancer-related deaths in women worldwide, with a worldwide incidence of 2,088,849 cases (11.6% of the total cases) and a mortality of 522,000 deaths (6.6% of the total cancer deaths) [1]. A low-dose X-ray mammogram is the most common technique used to screen for microcalcifications in breast cancers. Mammography is not effective in dense breasts and cannot determine whether a lesion is benign or malignant. Therefore, it is always followed by either surgical excision biopsy or needle biopsy, and only 36.5% of the detected microcalcifications are identified as malignant tumors [255]. The intra-surgical assessment of the tumor margins is often quite challenging and requires objective and rapid guidance to eliminate the risk of additional resections, the rate of which varies between 7 and 73% as reported by different institutions [256].

In 1995, Frank et al. [257] demonstrated the possibility of Raman biopsy by fiber-optic sampling through a hypodermic needle. It was shown that the differences between a benign lesion (fibrocystic) and infiltrating ductal carcinoma were smaller than those between normal and malignant specimens. Rehman et al. [126] reported spectral differences between the nuclear grades of ductal carcinoma in situ and invasive ductal carcinoma of the breast. An increase in protein content and a relative decrease in lipid/acylglyceride content in the cancerous tissues were confirmed. The intensity at 1662 cm−1 (amide I group of proteins) varied with the degree of fatty acid unsaturation and depended mainly on the lipid-to-protein ratio. The normal tissue showed a weaker intensity at 1442 cm−1, which represents CH2 scissoring and CH3 bending in lipids and proteins, and this intensity increased with increasing nuclear grade. The same trend was observed for the OH–NH–CH peaks in the 2700–3500 cm−1 region, indicating varying concentrations of fatty acyl chains, phospholipids, cholesterol, creatine, proteins, and nucleic acids.

Haka et al. [160] showed that the types of microcalcifications could be easily distinguished based on the presence or absence of vibrational bands characteristic of calcium oxalate dihydrate at 912 cm−1 and 1477 cm−1 and of calcium hydroxyapatite at 960 cm−1, and that their relative abundances correlated with malignancies in breast cancer [99]. Subsequently, Saha et al. [116] presented ex vivo studies on the real-time identification of microcalcifications in stereotactic core needle breast biopsy specimens collected from freshly excised tissue from 33 patients. The authors employed ordinary least-squares fitting to approximate the acquired spectra with a breast model that included fit coefficients for total calcium, collagen, and fat. The same group later demonstrated the utility of Raman spectroscopy as a guidance tool for mastectomy procedures [159]. The modified classification model used an SVM algorithm for the diagnosis of lesions irrespective of microcalcification status, followed by a logistic regression algorithm for the detection of microcalcifications. The accuracy obtained for differentiation between normal tissue, fibrocystic change, fibroadenoma, and breast cancer was 82.2% [159]. Haka et al. [117] reported on the feasibility of using Raman spectroscopy for in vivo diagnostics in an operating room environment. With their classification model, they reached an overall accuracy of 93% (28 of 30) [258]. The highest sensitivity and specificity (94.9% and 93.8%, respectively) were achieved by Li et al. [162] with the proposed adaptive weight k-local hyperplane (AWKH) algorithm, which extended the K-local hyperplane distance nearest-neighbor algorithm for breast cancer classification.
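
The ordinary least-squares fitting step described above can be sketched as a linear unmixing problem: a measured spectrum is approximated by a weighted sum of component spectra, and the weights are the fit coefficients. The three six-channel basis spectra labeled calcium, collagen, and fat below are hypothetical placeholders, not the model components of the cited work. The normal equations are solved with a small Gaussian elimination routine to keep the example self-contained.

```python
def solve_linear_system(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = aug[i][n] - sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = acc / aug[i][i]
    return solution

def ols_fit(basis_spectra, measured):
    """Return least-squares fit coefficients via the normal equations (B^T B) c = B^T y."""
    k = len(basis_spectra)
    gram = [[sum(a * b for a, b in zip(basis_spectra[i], basis_spectra[j]))
             for j in range(k)] for i in range(k)]
    proj = [sum(a * y for a, y in zip(basis_spectra[i], measured)) for i in range(k)]
    return solve_linear_system(gram, proj)

# Hypothetical component spectra on six spectral channels (arbitrary units).
calcium = [0.0, 1.0, 0.2, 0.0, 0.0, 0.1]
collagen = [0.3, 0.1, 0.8, 0.4, 0.1, 0.0]
fat = [0.1, 0.0, 0.1, 0.9, 0.6, 0.2]

# Synthetic "measured" spectrum mixed with known weights.
true_weights = [0.2, 0.5, 0.3]
measured = [true_weights[0] * a + true_weights[1] * b + true_weights[2] * c
            for a, b, c in zip(calcium, collagen, fat)]

coefficients = ols_fit([calcium, collagen, fat], measured)
print([round(c, 6) for c in coefficients])
```

With real spectra the basis components are strongly overlapping and noisy, so the fit residual should be inspected before the coefficients are interpreted.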

Brozek-Pluska et al. [161] showed clear differences in carotenoid and fatty acid composition and in the products of their metabolism between cancerous tissue and the surrounding noncancerous tissue. The most pronounced differences were observed in the region of the bands at 1158 and 1518 cm−1, assigned to the C–C and C=C stretching modes of carotenoids, in the symmetric and asymmetric C–H vibrations of lipids at 2850 and 2940 cm−1, and in the region of the OH stretching mode of water around 3300 cm−1. The Raman intensities of the lipid peaks were significantly smaller in the cancerous tissue than in the noncancerous tissue, both in the fingerprint region (854, 1444, 1660, 1750 cm−1) and in the HW spectral range (2888, 2926 cm−1). The approach was reported to diagnose early-stage breast cancer with a sensitivity of 72% for malignant tissue and 62% for benign tissue and a specificity of 83% for normal tissue [161]. This was also observed by Abramczyk et al. [157, 158] after testing the same Raman system on 150 patients. The group found that the fatty acid composition and the products of fatty acid metabolism in cancerous breast tissue showed an increased content of 20-carbon essential fatty acid, whereas the composition of the surrounding noncancerous tissue was almost identical to that of monounsaturated oleic acid. This study suggested that carotenoids and lipids can be used as Raman biomarkers in breast cancer pathology.

The optical properties of breast tumors were investigated in [45, 259,260,261,262,263,264]; the data are partially summarized in [265] and presented in Table 1.5. Zhang et al. [45] compared the optical properties of normal breast tissue and benign and malignant neoplasms using the integrating sphere technique and the IAD method in the spectral range from 400 to 2200 nm. The authors observed an increase in water concentration and a decrease in lipid content in malignant tissue compared with normal tissue. Moreover, as can be seen in Table 1.5, the spectral behavior of the scattering properties of tumor tissue, in contrast to normal tissue, is determined primarily by relatively small (so-called Rayleigh) scatterers. Similar results were obtained earlier in [259, 263, 264] for in vitro and ex vivo experiments in the visible and near-infrared spectral range. In vivo studies of normal and malignant breast tissues were performed by Fantini et al. [260], Grosenick et al. [261], and Cerussi et al. [262] using frequency- and time-domain techniques. The authors observed an increase in both the absorption and scattering coefficients for tumor tissue in comparison with normal tissue. Moreover, the authors found a significant increase in total hemoglobin concentration: from 17.3 ± 6.2 μmol/L for normal tissue to 53 ± 32 μmol/L for tumor tissue [261], or from 17.5 ± 7.5 μM (normal tissue) to 24.7 ± 9.8 μM (tumor tissue) [262]. At the same time, the blood oxygen saturation was essentially unchanged: 74 ± 7% for normal tissue and 72 ± 14% for tumor tissue [261], or 67.7 ± 9.3% for normal tissue and 67.5 ± 8.4% for tumor tissue [262]. The lipid content in the tumor tissue decreased from 66.1 ± 10.3% (normal tissue) to 58.5 ± 14.8%, whereas the water content increased from 18.7 ± 10.3% (normal tissue) to 25.9 ± 13.5% [262]. Similar results were presented in [266,267,268].

Table 1.5 The optical properties of breast tumor tissues measured in vitro and in vivo [45, 259,260,261,262,263,264]

3.3 Skin Cancer

Malignant melanoma (MM) is the most aggressive form of skin cancer, with a worldwide incidence of 287,723 cases (1.6% of the total cases) and a mortality as high as 60,712 deaths (0.6% of the total cancer deaths) [1]. Although nonmelanoma skin cancers (NMSC), such as basal cell carcinoma (BCC) and squamous cell carcinoma (SCC), are associated with a low mortality rate, they are particularly common in fair-skinned populations of European descent (5.8% of the total cases), with high incidence rates found in Australia/New Zealand, North America, and Northern Europe [1, 269]. The standard treatment of skin cancer is complete removal of the lesion, which has a high cure rate without reducing life expectancy (the 5-year relative survival rate for melanoma is approximately 92% [269]). Therefore, investigations of Raman spectroscopy for skin cancer have primarily focused on early detection and on the discrimination of skin tumor types.

In early ex vivo studies, Gniadecka et al. [125, 270] revealed clear-cut changes in skin tumor tissues that made it possible to differentiate MM from normal skin, seborrheic keratosis, and BCC. The major spectral alterations were found in the region 1200–1750 cm−1. An increase in the intensity of Raman bands was observed for pigmented tumors in the region 2500–3500 cm−1. To demonstrate spectral changes for proteins, the ratio between the amide I band and δ(CH2)(CH3) in proteins and lipids (I1650/I1450) and the ratio between the amide III band and lipids at around 1320 cm−1 (I1270/I1320) were calculated. Neural network analysis of Raman spectra in the range 200–3500 cm−1 achieved a sensitivity of 85% and a specificity of 99% for the diagnosis of MM.

In 2008 Zhao et al. [230] reported 289 in vivo measurements of Raman spectra from nine different types of lesions. The authors stated a sensitivity of 91% and a specificity of 75% in differentiating malignant lesions from benign lesions. Further, the same group utilized a Raman probe to study MM, BCC, SCC, actinic keratosis, atypical nevi, melanocytic nevi, blue nevi, and seborrheic keratosis from 453 patients in 2012 [227] and from 645 patients in 2015 [225]. The collected single-point spectra were acquired in 1 s and subjected to principal component with generalized discriminant analysis (PC-GDA) and PLS for statistical data evaluation. The sensitivity to differentiate skin cancer versus benign lesions ranged between 95 and 99%, with a related specificity of 15–54%. Similar results were achieved by Silveira et al. [224] for a set of 145 spectra from biopsy fragments of normal (n = 30), BCC (n = 96), and MM (n = 19) skin tissues. The authors applied the best-fitting model to the spectra of biochemicals and verified that actin, collagen, elastin, and triolein were the most important biochemicals representing the spectral features of skin tissues.

Typical in vivo Raman spectra for normal skin tissue, melanoma, and BCC are depicted in Fig. 1.6. Overall, all skin lesions appear to share similar major Raman peaks and bands in the fingerprint region. There are no distinctive Raman peaks or bands that can be uniquely assigned to specific skin cancers by visual inspection alone. The strongest Raman peak is located around 1445 cm−1, with other major Raman bands centered at 855, 936, 1002, 1271, 1302, 1655, and 1745 cm−1. The development of malignant skin disease increases the content of metabolic products in the pathological areas of the skin and changes the concentration of proteins and lipids. Proteins predominantly contribute to the bands at 1240–1270, 1340, 1440–1460, and 1665 cm−1, whereas the spectral features arising from lipids, predominantly triolein, are observed in the 1271–1301, 1440, and 1650–1660 cm−1 bands [224]. One of the significant differences between malignant and benign formations is the process of metabolism and destruction of collagen. Cells of malignant tumors form fast-growing, poorly differentiated structures, and the development of such structures is accompanied by increased collagenase activity [95]. Collagenase destroys the molecular bonds of collagen fibers, and the resulting changes in the Raman spectra of skin tissue can be observed in the 1248, 1454, and 1665 cm−1 bands associated with collagen peaks [121].

Fig. 1.6 Average Raman spectra and standard deviation of melanoma (MM), basal cell carcinoma (BCC), and normal skin (NORM). Each spectrum was acquired by Raman setup [140] and preprocessed with baseline removal, smoothing by the Savitzky–Golay method, data normalization, and centering
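
The preprocessing chain named in the caption of Fig. 1.6 can be sketched as follows. The specific choices are assumptions made for illustration and are not taken from [140]: a straight-line baseline through the end points, the standard 5-point quadratic Savitzky–Golay weights, normalization to the maximum intensity, and centering understood as subtraction of the mean spectrum of the data set. The synthetic spectra are hypothetical.

```python
import math

# Standard 5-point quadratic Savitzky-Golay smoothing weights.
SG_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)

def remove_linear_baseline(spectrum):
    """Subtract the straight line joining the first and last points (assumed baseline model)."""
    n = len(spectrum)
    slope = (spectrum[-1] - spectrum[0]) / (n - 1)
    return [v - (spectrum[0] + slope * i) for i, v in enumerate(spectrum)]

def savgol_smooth(spectrum):
    """Apply 5-point Savitzky-Golay smoothing; the two points at each edge are left unchanged."""
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        out[i] = sum(w * spectrum[i + k - 2] for k, w in enumerate(SG_WEIGHTS))
    return out

def normalize_to_max(spectrum):
    """Scale the spectrum so that its largest absolute intensity equals one."""
    peak = max(abs(v) for v in spectrum)
    return [v / peak for v in spectrum]

def preprocess(spectrum):
    return normalize_to_max(savgol_smooth(remove_linear_baseline(spectrum)))

def center_dataset(spectra):
    """Subtract the mean spectrum of the data set from every spectrum."""
    n = len(spectra)
    mean_spectrum = [sum(s[k] for s in spectra) / n for k in range(len(spectra[0]))]
    return [[v - m for v, m in zip(s, mean_spectrum)] for s in spectra]

def synthetic_spectrum(peak_channel):
    """Hypothetical spectrum: sloped background plus one Gaussian band."""
    return [0.5 + 0.01 * i + math.exp(-0.5 * ((i - peak_channel) / 3.0) ** 2)
            for i in range(50)]

dataset = [preprocess(synthetic_spectrum(c)) for c in (20, 25, 30)]
centered = center_dataset(dataset)
```

The order of the steps matters: normalizing before baseline removal would scale the spectra by the fluorescence background rather than by the Raman signal.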

Schleusener et al. [229] performed in vivo measurements on 104 subjects with lesions using a multi-fiber Raman probe, which was optimized for collecting scattered light from within the epidermal layer down to the basal membrane, where the early stages of skin cancer develop. NMSC were discriminated from normal skin with a balanced accuracy of 73% (BCC) and 85% (SCC) using partial least-squares discriminant analysis (PLS-DA). Discriminating MM from pigmented nevi (PN) resulted in a balanced accuracy of 91%.
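
Balanced accuracy, the figure of merit quoted above, is the mean of sensitivity and specificity and is therefore less affected by unequal class sizes than plain accuracy. The sketch below shows the difference on hypothetical confusion-matrix counts that are not taken from any of the cited studies.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def plain_accuracy(tp, fn, tn, fp):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

# Hypothetical counts: 10 lesions of the target class, 90 of the other class.
counts = dict(tp=9, fn=1, tn=45, fp=45)

bal_acc = balanced_accuracy(**counts)
acc = plain_accuracy(**counts)
print(round(bal_acc, 2), round(acc, 2))
```

Here the plain accuracy is dominated by the larger class, while the balanced accuracy weights both classes equally.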

Lim et al. [223] employed a fiber-optic Raman probe in combination with fluorescence and diffuse backscattered reflectance techniques in order to improve diagnostic outcomes. Raman, fluorescence, and reflectance spectra were acquired from 137 lesions in 76 patients. Raman spectroscopy alone was shown to achieve 100% sensitivity and specificity for discriminating melanoma from benign pigmented lesions, but only 68% sensitivity and 55% specificity for distinguishing NMSC from normal tissues. With the multimodal approach, however, NMSC were classified with a sensitivity of 95% and a specificity of 71%. To support the data analysis, the group also analyzed the spectral contributions of individual skin components, such as collagen, elastin, triolein, nuclei, keratin, ceramide, melanin, and water, by fitting spectra obtained in vitro using Raman microscopy [222]. It was demonstrated that the biophysical model had a diagnostic capability similar to that of a statistical PCA-LDA model with leave-one-lesion-out cross-validation. More importantly, the biophysical model captured the relevant biophysical changes accounting for the diagnosis. In particular, the authors found that collagen and triolein were the most important biomarkers in discriminating MM from benign pigmented lesions, and that BCC had significantly different concentrations of nuclei, keratin, collagen, triolein, and ceramide compared to the surrounding healthy skin. Recently, the group demonstrated that the developed biophysical model can be used for skin cancer margin assessment in BCC surgical resection [226].

In 2017, Bratchenko et al. [228] tested combined Raman and autofluorescence ex vivo diagnostics of MM (n = 39) and BCC (n = 40) in the near-infrared and visible regions. The authors stated an accuracy of 97.3% for discriminating each of the skin cancers with the multimodal approach, whereas the accuracy determined for each modality separately was 79%. The same group subsequently performed in vivo measurements on 17 MM, 18 BCC, and 19 benign neoplasms of various types with a portable Raman system and confirmed the higher accuracy of the multimodal approach [140]. The diagnostic efficiency of the portable system was determined by PLS-DA analysis of the entire spectra, taking into account each feature of the spectra in the range from 300 to 1800 cm−1, including the maximum intensities and the exact band positions. For combined Raman and autofluorescence diagnostics, the authors reported accuracies of 89.5% and 91.1% for classifying MM vs other neoplasms and BCC vs other neoplasms, respectively.

Another multimodal approach combines optical coherence tomography (OCT) and Raman spectroscopy (RS) for the noninvasive characterization of skin lesions based on either morphological or biochemical features of the disease [271]. Although OCT does not clearly define features associated with malignancy, it provides a morphological context to guide placement of the RS acquisition axes for specific biochemical analysis of the tissue. In 2015, Zakharov et al. [111] reported an increase in the average accuracy of in vivo diagnosis of skin tumors (9 MM, 9 BCC, and various benign tumors) with a multimodal RS-BS-OCT system. It was shown that these methods were complementary and increased the diagnostic specificity for a variety of tumor types by 5–11%. In 2018, Varkentin et al. [272] presented a trimodal RS-OA-OCT system with an optoacoustic (OA) modality, which provided precise tumor depth determination owing to its potentially deeper penetration compared to OCT. The Raman signal was collected via the OCT scanning lens to maximize the signal-to-noise ratio of the measured signal while keeping radiation levels below the maximum permissible exposure limits. The preliminary results of the first RS-OA-OCT clinical trials showed good agreement with histology results and distinctive differences in Raman data between normal skin and different areas of melanocytic lesions.

The optical properties of skin tumor tissue are presented in Table 1.6, which demonstrates qualitatively similar behavior for all investigated nonmelanoma types of skin tumors [39, 273]. Scattering coefficient gradually decreases with the increasing wavelength. Quantitatively, infiltrative BCC is characterized by a higher scattering coefficient in comparison with the scattering of nodular BCC and SCC. The higher scattering coefficient of infiltrative BCC may be explained by its structural characteristics. Typically, these tumors have thin strands or cords of tumor cells extending into the surrounding highly scattering dermis. The value of \( {\mu}_s^{\prime } \) for SCC is consistently lower than for both types of BCC in the entire wavelength range [39].

Table 1.6 The optical properties of skin tumor tissues measured in vitro and in vivo [39, 273]

A similar result was obtained by Garcia-Uribe et al. [273] (see Table 1.6). Higher light scattering in cancerous tissue can be explained by the larger average effective size of the scattering centers. SCC in situ has not yet penetrated through the basement membrane of the dermoepidermal junction. SCCs typically appear as scaling plaques with sharply defined red color. Histologically, all epidermal layers may contain atypical keratinocytes. The larger number of atypical keratinocytes in SCCs can increase the light scattering in this type of skin lesion and significantly affect its contribution to diffusely reflected light on the surface. SCCs can penetrate the basement membrane to become invasive [273].

Actinic keratosis can appear rough and scaly and can develop into SCCs. Histologically, actinic keratosis is recognized by the presence of atypical keratinocytes in the deeper parts of the epidermis. Defective maturation of the superficial epidermal layers results in parakeratosis, alternating with hyperkeratosis [274, 275]. The amounts of atypical keratinocytes and collagen are factors related to the amount of light scattering in the lesion [273].

BCC is derived from the basal layer of keratinocytes, the deepest cell layer of the epidermis. BCCs can present nodular aggregates of basal cells in the dermis and exhibit peripheral palisading and retraction artifacts. Melanin can also be present in the tumor and in the surrounding stroma, as observed in pigmented BCCs. The aggregation of basal cells can increase the light scattering in these types of malignant lesions. The progression of seborrheic keratosis into BCC and SCC is rare [276, 277]. Seborrheic keratosis, composed of basaloid cells admixed with some squamous cells, can be pigmented when some cells contain melanin transferred from neighboring melanocytes [273].

Cugmas et al. [81] investigated the optical properties of canine skin ex vivo and of subcutaneous tumors ex vivo and in vivo, and found that the average water volume fraction in the skin samples was 81.4%. Darkly pigmented skin contained almost 10 times more melanin (2.22 mmol/L) than lightly pigmented skin (0.26 mmol/L). The authors estimated melanin concentrations of 115.0 and 443.5 mg/L for the lightly and darkly pigmented human skin, respectively. The average hemoglobin mass concentration was 1.07 g/L and saturation was 46%. For tumors, the water volume fraction was around 82% and saturation was slightly above 50%. However, benign tumors contained 0.62 g/L of hemoglobin and malignant tumors contained 7.93 g/L of hemoglobin [81].

3.4 Colorectal Cancer

Over 1.8 million new colorectal cancer cases and 881,000 deaths are estimated to occur in a year, accounting for about 1 in 10 cancer cases and deaths [1]. Overall, colorectal cancer ranks third in terms of incidence but second in terms of mortality. Surgery is the only curative modality for localized colon cancer. Colonoscopy is the most sensitive instrument for screening and detection of early-stage malignancies and premalignant polyps (adenomas). However, colonoscopy miss rates are about 20% for adenomas [277].

Li et al. [163] identified differences between normal and cancerous colorectal tissues in five spectral bands in the regions around 815–830, 935–945, 1131–1141, 1447–1457, and 1665–1675 cm−1. The strongest signals were observed at 1004 cm−1 (C–C stretching ring breathing of phenylalanine), 1323 cm−1 (CH3CH2 twisting of proteins and nucleic acids), 1450 cm−1 (δ(CH2) of phospholipids and collagen), and 1665 cm−1 (C=O stretching mode of amide I and lipids). It was shown that normalized intensities of Raman bands in the ranges of 800–860 and 1580–1660 cm−1 were greater in normal tissue than in cancer tissue, while Raman signals at 1210–1400 cm−1 increased in cancer tissue, which correlated with dysplasia progression. The Raman peak at 1323 cm−1 was wider and more intense in cancer tissue than in normal colorectal tissue, revealing an increase in nucleic acid content in tumor cells. The Raman bands at 1665–1675 cm−1, which are attributed to the amide I bands of protein in the α-helix conformation, were increased in malignant tissue; the authors associated this increase with a larger relative amount of protein in the β-pleated sheet or random coil conformation. They also reported a significant decrease in the intensity of the 1131–1141 cm−1 band (C–N stretching mode of proteins, lipids), indicating a relative reduction of lipid content in cancer tissue, in accordance with early micro-Raman investigations [278]. A diagnostic statistical model built with ant colony optimization and a support vector machine provided a diagnostic accuracy of 93.2% for identifying colorectal cancer from normal tissue.

In a similar study Widjaja et al. [279] investigated ex vivo 105 colonic tissue specimens from 59 patients (41 normal, 18 hyperplastic polyps, and 46 adenocarcinomas) using PCS-SVM diagnostic algorithm that utilized the entire Raman spectrum from 800 to 1800 cm−1. The Gaussian radial basis function kernel SVM algorithm was proven to be the best classifier for providing the highest diagnostic specificity 98.1–99.7% and 100% sensitivity for multiclass classification. Wood et al. [280] measured Raman spectra from a total of 356 colon biopsies (81 of normal colon mucosa, 79 of hyperplastic polyps, 92 of adenomatous polyps, 64 of adenocarcinoma, and 40 of ulcerative colitis) from 177 patients. Spectral classification accuracies comparing pathology pairs ranged from 72.1 to 95.9% for 10-s acquisitions and from 61.5 to 95.1% for 1-s acquisitions, reflecting the improved signal-to-noise ratio with longer spectral acquisition times.
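The link between acquisition time and signal-to-noise ratio mentioned above can be illustrated with a short calculation. This is a minimal sketch that assumes shot-noise-limited detection, in which the signal-to-noise ratio grows with the square root of the acquisition time; this assumption and the function name `relative_snr` are illustrative and are not taken from [280].

```python
import math


def relative_snr(t_long_s: float, t_short_s: float) -> float:
    """SNR gain from a longer acquisition, assuming shot-noise-limited detection.

    Under this assumption the signal grows linearly with time and the noise
    with its square root, so SNR scales as sqrt(time).
    """
    if t_long_s <= 0 or t_short_s <= 0:
        raise ValueError("acquisition times must be positive")
    return math.sqrt(t_long_s / t_short_s)


# Acquisition times quoted in the text: 10 s versus 1 s.
gain = relative_snr(10.0, 1.0)
print(f"Expected SNR gain from 1 s to 10 s: {gain:.2f}x")
```

In practice, detector read noise and tissue autofluorescence background can make the real gain smaller than this idealized estimate.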

Short et al. [155] and Li et al. [164] separately analyzed Raman spectra in regions from 1000 to 1800 cm−1 and from 2800 to 3800 cm−1 for ex vivo samples collected during endoscopic biopsy. It was shown that the peak intensity of C–H stretching vibration bands relating to the lipids (near 2958 cm−1, 2924 cm−1, and 2858 cm−1) decreased and even disappeared in the spectra of malignant tissues due to essential consumption of fat in carcinoma development. The entropy weight local-hyperplane k-nearest-neighbor classifier provided a sensitivity of 81.38% and a specificity of 92.69% for differentiating cancer from colitis samples. Petersen et al. [166] performed Raman fiber-optical measurements of 242 colon biopsy samples. The authors stated that better accuracy was achieved for leave-one-patient-out cross-validation in comparison with leave-one-spectrum-out cross-validation schemes due to minimization of systematic errors. Cancer was differentiated from normal tissue with a sensitivity of 79%, specificity of 83%, and an accuracy of 81%. PCA-LDA and PLS-DA discrimination models were compared on the same dataset of Raman spectra acquired in normal (n = 78) and cancerous (n = 81) colorectal tissues, with the PLS-DA algorithm with LOOCV being preferred [170]. PLS-DA modeling yielded a diagnostic accuracy of 84.3% for colorectal cancer detection, while the accuracy of PCA-LDA classification was 79.2%.
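The difference between the leave-one-spectrum-out and leave-one-patient-out schemes lies in how the data are split, which the following sketch makes explicit. It is an illustration only: the patient labels are hypothetical, the helper names are ours, and no classifier from the cited studies is reproduced. It simply counts how often the held-out spectrum comes from a patient who is still represented in the training set.

```python
from collections import defaultdict


def leave_one_spectrum_out(n_spectra):
    """Yield (train_idx, test_idx), holding out a single spectrum each time."""
    for i in range(n_spectra):
        yield [j for j in range(n_spectra) if j != i], [i]


def leave_one_patient_out(patient_ids):
    """Yield (train_idx, test_idx), holding out all spectra of one patient."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    for test_idx in by_patient.values():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(patient_ids)) if i not in held_out]
        yield train_idx, test_idx


def shares_patient(patient_ids, train_idx, test_idx):
    """True if any held-out spectrum has a same-patient spectrum in training."""
    train_patients = {patient_ids[i] for i in train_idx}
    return any(patient_ids[i] in train_patients for i in test_idx)


# Hypothetical example: six spectra from three patients.
patient_ids = ["P1", "P1", "P1", "P2", "P2", "P3"]

loso_overlaps = sum(
    shares_patient(patient_ids, tr, te)
    for tr, te in leave_one_spectrum_out(len(patient_ids))
)
lopo_overlaps = sum(
    shares_patient(patient_ids, tr, te)
    for tr, te in leave_one_patient_out(patient_ids)
)
print(f"folds with patient overlap: spectrum-out={loso_overlaps}, patient-out={lopo_overlaps}")
```

Splitting by patient keeps every fold free of same-patient spectra in the training set, so patient-specific systematic effects cannot be shared between training and test data.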

Bergholt et al. [119] demonstrated that simultaneous Raman endoscopy in fingerprint and high-wavenumber regions provided a diagnostic sensitivity of 90.9% and specificity of 83.3% for differentiating colorectal adenoma from hyperplastic polyps, which was superior to considering either region alone. It was found that adenomas were associated with significantly reduced Raman peak intensities at 1078 cm−1 (C=C stretching), 1425 cm−1 (δ(CH2) scissoring), 2850 and 2885 cm−1 (symmetric and asymmetric CH2 stretching), and 3009 cm−1 compared to hyperplastic polyps, pointing to a relative reduction in lipid content. An up-regulated protein content was largely indicated by the biomarkers at 1004 cm−1 (symmetric C–C stretching, ring breathing of phenylalanine) and band broadening of the 1655 cm−1 (amide I C=O stretching mode of proteins). The peak ratio of the asymmetric to symmetric OH stretching (defined as mean intensity ratio I3250/I3400) showed significant differences, which represented the evidence of re-arrangements in hydrogen-bonded networks in epithelial cells caused by local interactions with macromolecules such as proteins [281]. The fingerprint (FP) range contains highly specific information about proteins, lipids, and DNA conformations. On the other hand, the high-wavenumber (HW) technique contains information related to the CH2/CH3 stretching of lipids/proteins, as well as intense water bands reflecting the local conformation of water that are not contained in the FP range. The complementary properties of the FP and HW Raman spectral modalities for enhancing tissue diagnosis can partially be explained by back-tracking the misclassified spectra of each Raman modality.

Physiological differences between normal and tumorous tissues were investigated in [282, 283] and colon tumor optical properties were investigated in [284, 285]. In in vivo studies, Zonios et al. [282] showed that normal colorectal mucosa had a total hemoglobin concentration of 13.6 ± 8.8 mg/dL, whereas the corresponding value for the adenomatous polyp was approximately 72.0 ± 29.2 mg/dL. The hemoglobin oxygen saturation was found to be 0.59 ± 0.08 and 0.63 ± 0.1, respectively. The effective scattering size was found to be 0.94 ± 0.44 μm for polyps and 0.56 ± 0.18 μm for normal mucosa. The values for scatterer density were found to be (9.2 ± 7.5) × \( {10}^8 \) mm−3 and (3.5 ± 4.0) × \( {10}^8 \) mm−3 for the normal mucosa and the adenomatous polyp, respectively. The mean oxygen saturation was determined by Knoefel et al. [283] as 37 ± 19, 46 ± 13, 45 ± 10, and 49 ± 15% for adenocarcinomas, adenomatous polyps, hyperplastic polyps, and normal mucosa, respectively. Colon tissue properties are presented in Table 1.7.

Table 1.7 The optical properties of colon tumor tissues measured in vitro [284, 285]

3.5 Cervical Cancer

With an estimated 570,000 cases and 311,000 deaths in 2018 worldwide, this disease ranks as the fourth most frequently diagnosed cancer and the fourth leading cause of cancer death in women [1]. Cervical cancer ranks second in incidence and mortality behind breast cancer; however, it is the most commonly diagnosed cancer in 28 countries and the leading cause of cancer death in 42 countries, the vast majority of which are in Sub-Saharan Africa and South-Eastern Asia [1]. In 2001, Utzinger et al. [109] first demonstrated in vivo detection of squamous dysplasia, a precursor of cervical cancer, using a Raman fiber-optic probe in 24 measurements in 13 patients. A simple classification algorithm was introduced involving two intensity ratios, I1454/I1556 and I1330/I1454, which were, respectively, greater and lower for samples with squamous dysplasia than for all other tissue types. By integrating analytical algorithms with data collection, diagnostic accuracies as high as 88% were achieved [203]. The strong peaks observed at 1660 (amide I), 1450 (δ(CH2) deformation), and 1340 cm−1 (DNA) in the Raman spectra were characteristic of a cervical tumor and indicated increased DNA and protein, while decreased peaks at 1280 and 1240 cm−1 indicated collagenous proteins [197]. By considering the variations in the Raman spectra of normal cervix due to the hormonal or menopausal status of women, the diagnostic accuracy was improved from 88 to 94% [118, 286]. To further increase the diagnostic accuracy, the authors also incorporated spectral variations linked to confounding factors, such as age, race, smoking habits, body mass index, and menopausal status in cervical Raman spectra [287].
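The two-ratio rule described for squamous dysplasia can be sketched as follows. This is only an illustration of the idea: the source states the direction of each ratio (I1454/I1556 greater, I1330/I1454 lower in dysplasia) but gives no thresholds, so the cutoffs `r1_min` and `r2_max`, the helper names, and the three-point spectrum below are all hypothetical.

```python
def intensity_at(wavenumbers, intensities, target_cm1):
    """Return the intensity at the sampled wavenumber nearest to target_cm1."""
    nearest = min(
        range(len(wavenumbers)),
        key=lambda i: abs(wavenumbers[i] - target_cm1),
    )
    return intensities[nearest]


def dysplasia_ratios(wavenumbers, intensities):
    """Compute the two ratios named in the text: I1454/I1556 and I1330/I1454."""
    i1330 = intensity_at(wavenumbers, intensities, 1330)
    i1454 = intensity_at(wavenumbers, intensities, 1454)
    i1556 = intensity_at(wavenumbers, intensities, 1556)
    return i1454 / i1556, i1330 / i1454


def looks_dysplastic(r1, r2, r1_min, r2_max):
    """Flag a spectrum when I1454/I1556 is high AND I1330/I1454 is low.

    r1_min and r2_max are hypothetical cutoffs, not values from the study.
    """
    return r1 > r1_min and r2 < r2_max


# Synthetic three-point "spectrum" used only to exercise the functions.
wavenumbers = [1330, 1454, 1556]
intensities = [3.0, 6.0, 2.0]

r1, r2 = dysplasia_ratios(wavenumbers, intensities)
flag = looks_dysplastic(r1, r2, r1_min=2.0, r2_max=0.8)
print(f"I1454/I1556 = {r1:.2f}, I1330/I1454 = {r2:.2f}, flagged = {flag}")
```

In a real analysis the cutoffs would have to be derived from labeled spectra after background subtraction and normalization.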

Duraipandian et al. [198] reported an in vivo investigation on cervical precancer detection based on the measurement of 105 near-infrared Raman spectra from 57 sites in 29 patients. The authors employed a genetic algorithm partial least-squares discriminant analysis (GA-PLS-DA-dCV) to identify seven significant bands associated with lipids, proteins, and nucleic acids in tissue: 925–935 cm−1 (CCH deformation mode of glycogen, C–C stretching mode of protein and collagen), 979–999 cm−1 (phospholipids), 1080–1090 cm−1 (PO2 symmetric stretching mode of nucleic acids and C–C stretching mode of phospholipids), 1240–1260 cm−1 (amide III), 1320–1340 cm−1 (CH3CH2 wagging of nucleic acids and proteins), 1400–1420 cm−1 (CH3 bending vibration of proteins), and 1625–1645 cm−1 (C=O stretching mode of amide I, α-helix). A diagnostic accuracy of 82.9% was achieved for differentiation of low- and high-grade precancerous lesions. The potential of high-wavenumber (2800–3700 cm−1) Raman spectroscopy for in vivo detection of cervical precancer has been investigated by Mo et al. [201]. Significant differences in CH2 stretching bands of lipids at 2850 and 2885 cm−1, CH3 stretching bands of proteins at 2940 cm−1, and the broad Raman band of water at 3400 cm−1 were observed between normal and dysplastic cervical tissue. A follow-up study by Duraipandian et al. [288] explored the advantages of using both the low- and high-wavenumber regions for in vivo detection of cervical precancer, acquiring 473 Raman spectra from 35 patients. Raman spectral differences between normal and dysplastic cervical tissue were observed at 854, 937, 1001, 1095, 1253, 1313, 1445, 1654, 2946, and 3400 cm−1, mainly related to proteins, lipids, glycogen, nucleic acids, and water content in the tissue. PLS-DA together with LOPOCV yielded diagnostic sensitivities of 84.2%, 76.7%, and 85.0%; specificities of 78.9%, 73.3%, and 81.7%; and overall diagnostic accuracies of 80.3%, 74.2%, and 82.6% using the FP, HW, and integrated FP/HW Raman spectroscopic techniques, respectively, for in vivo diagnosis of cervical precancer.
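The three figures reported for each technique are tied together by the relation accuracy = p × sensitivity + (1 − p) × specificity, where p is the fraction of positive (precancer) spectra in the dataset. The sketch below solves this relation for p as a consistency check. The function name is ours, and because the published percentages are rounded, the result is only an approximate estimate, not a value stated in [288].

```python
def implied_positive_fraction(sensitivity, specificity, accuracy):
    """Solve accuracy = p*sensitivity + (1 - p)*specificity for p.

    All three inputs must use the same scale (here: percent).
    """
    if sensitivity == specificity:
        raise ValueError("p is undetermined when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# (sensitivity, specificity, accuracy) in percent, as quoted in the text.
reported = {
    "FP": (84.2, 78.9, 80.3),
    "HW": (76.7, 73.3, 74.2),
    "FP/HW": (85.0, 81.7, 82.6),
}

fractions = {
    name: implied_positive_fraction(*values) for name, values in reported.items()
}
for name, p in fractions.items():
    print(f"{name}: implied fraction of precancer spectra ~ {p:.2f}")
```

All three techniques imply roughly the same positive fraction (about one quarter of the spectra), which is what one would expect if they were evaluated on the same dataset.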

3.6 Prostate Cancer

Prostate cancer is among the most common cancers in men worldwide, with an incidence of 1,276,106 cases (7.1% of the total cases) and a mortality of 358,989 deaths (3.8% of the total cancer deaths) [1]. The diagnosis of prostate cancer is often made through transrectal ultrasound guided prostatic biopsy. Ten to twelve core biopsies are recommended [289]. When the diagnosis of prostate cancer is made, grading is used to describe the histologic appearance of the tumor cells. Prostate cancer is most commonly graded using the Gleason score (GS), which categorizes the degree of cancerous tissue as compared to normal prostate tissue.

As Raman probes are suitable for use during endoscopic, laparoscopic, or open procedures they can be used for screening, biopsy, margin assessment, and monitoring of prostate cancer treatment efficacy [290]. Patel and Martin [291] used Raman spectroscopy to characterize the transitional, central, and peripheral zones of normal prostates, revealing larger concentrations of DNA and RNA in the peripheral zone, as well as differences in the relative concentration of lipids and proteins between the three zones. Crow et al. [187, 292] and Stone et al. [123] showed the ability to differentiate prostate cancer into three categories (GS < 7, GS = 7, and GS > 7) and from benign prostatic hyperplasia with 89% accuracy from in vitro analysis of frozen biopsies. They found that the nucleus/cytoplasm (actin) ratio increased with malignancy, with malignant Raman spectra showing increased DNA concentration. An increase in the relative concentration of choline and cholesterol was shown to be associated with malignancy, potentially representing increased cell membrane synthesis from increased proliferation and increased necrosis, respectively.

Devpura et al. [196] detected benign epithelia and adenocarcinoma, distinguishing GS of 6, 7, and 8 in deparaffinized bulk tissues. The intensity at 782 cm−1, associated with DNA bases, increased with malignancy, as did the ratio I726/I634. The intensity at 726 cm−1 correlated with DNA, and the intensity at 634 cm−1 was constant across all tissues, which again demonstrated a relative increase in DNA content with malignancy. Spectral variation by Gleason score was observed in the ranges 900–1000 and 1292–1352 cm−1. Adenocarcinoma was identified using PCA with 94% sensitivity and 82% specificity, and Gleason scores of 6, 7, and 8 were distinguished with 81% accuracy. PCA and SVM were used by Wang et al. [194] to classify the spectra of 50 patients into two groups according to their GS (≤7 and >7), achieving 88% sensitivity, specificity, and accuracy. In the experiments performed with the 1064-nm laser, significant differences were found predominantly in the 1000–1450 cm−1 range [195]. Using SVM, the prostate samples were classified as malignant or benign with 96% accuracy, and their GSs were predicted with 95% accuracy. The specificity of the method was consistently high, with an average of 98%. The sensitivity varied from 67 to 100%, with an average of 89%.

In 2018, Aubertin et al. [193] demonstrated the feasibility of accurate diagnosis and grading of prostate cancer using a handheld contact Raman fiber probe with a 785 nm excitation laser. It was shown that the sensitivity and specificity of differentiation of benign and cancerous tissues vary depending on GS from 76 to 90% and from 73 to 89%, respectively. Later, the same group enhanced the accuracy of prostate cancer determination up to 91% by dual-wavelength excitation (785 nm and 671 nm) with simultaneous Raman spectra measurements in the fingerprint and high-wavenumber regions and an SVM classification model with a leave-one-patient-out cross-validation procedure.

Optical properties of experimental prostate tumors in vivo were investigated in [293]. The authors measured absorption and reduced scattering coefficients of R3327-AT and R3327-H prostate tumors at 630 and 789 nm and found that the absorption coefficient of the tumors was 0.9 ± 0.4 (630 nm) and 0.4 ± 0.2 (789 nm) for R3327-AT tumor, and 0.9 ± 0.4 (630 nm) and 0.5 ± 0.3 (789 nm) for R3327-H tumor. The reduced scattering coefficient of the tumors was 10.1 ± 3.5 (630 nm) and 5.3 ± 1.4 (789 nm) for R3327-AT tumor, and 12.3 ± 3.2 (630 nm) and 6.7 ± 1.7 (789 nm) for R3327-H tumor.
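To give a feel for what these coefficients imply for light penetration, the sketch below applies the standard diffusion-approximation expressions \( {\mu}_{\mathrm{eff}}=\sqrt{3{\mu}_a\left({\mu}_a+{\mu}_s^{\prime}\right)} \) and \( \delta =1/{\mu}_{\mathrm{eff}} \) to the mean values quoted above. Two assumptions are ours and should be checked against [293]: the coefficients are taken to be in cm−1 (the text above does not state units), and the diffusion approximation is taken to hold, which requires \( {\mu}_a\ll {\mu}_s^{\prime } \).

```python
import math


def effective_attenuation(mu_a, mu_s_reduced):
    """Diffusion-approximation effective attenuation coefficient.

    mu_eff = sqrt(3 * mu_a * (mu_a + mu_s')), in the same inverse-length
    units as the inputs (assumed here to be cm^-1).
    """
    return math.sqrt(3.0 * mu_a * (mu_a + mu_s_reduced))


# Mean (mu_a, mu_s') values quoted in the text; units ASSUMED to be cm^-1.
tumors = {
    ("R3327-AT", 630): (0.9, 10.1),
    ("R3327-AT", 789): (0.4, 5.3),
    ("R3327-H", 630): (0.9, 12.3),
    ("R3327-H", 789): (0.5, 6.7),
}

# Penetration depth delta = 1 / mu_eff (cm under the unit assumption above).
depths = {
    key: 1.0 / effective_attenuation(mu_a, mu_s_r)
    for key, (mu_a, mu_s_r) in tumors.items()
}
for (tumor, wavelength_nm), depth_cm in depths.items():
    print(f"{tumor} at {wavelength_nm} nm: delta ~ {depth_cm * 10:.1f} mm")
```

Under these assumptions, both tumor lines show a deeper estimated penetration at 789 nm than at 630 nm, reflecting the lower absorption and scattering at the longer wavelength.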

3.7 Bladder Cancer

Tumors of the genitourinary system account for almost one-fourth of malignancies [294]. Bladder cancer is the sixth most commonly occurring cancer in men and the 17th most commonly occurring cancer in women. There were almost 550,000 new cases in 2018 and mortality was 199,922 (2.1% of the total cancer deaths) [1].

According to the clinical course, bladder cancer is divided into non-muscle-invasive (TIS, Ta, T1), muscle-invasive (T2–T4), and metastatic disease. Superficial and muscle-invasive tumors of the bladder are urothelial carcinomas in 90–95% of cases, but differ in molecular genetic, morphological, and immunohistochemical characteristics. Muscle-invasive bladder cancer (MIBC) is a potentially fatal disease, as patients die within 24 months without treatment. In 50% of patients with MIBC who undergo radical surgery, relapse develops, which is associated with the morphological stage of development of the primary tumor and the state of the regional lymph nodes. The most common sites of urothelial cancer metastases are the regional lymph nodes (78%), liver (38%), lungs (36%), bones (27%), adrenal glands (21%), and intestines (13%); less often (1–8%), metastases develop in the heart, brain, kidneys, spleen, pancreas, meninges, uterus, ovaries, and prostate [295, 296].

To reduce the mortality of patients with MIBC, the early detection of relapses and metastases in the pre- and postoperative periods is of primary importance. Metastasis in patients with bladder cancer most often affects the regional lymph nodes of the pelvis and the bifurcation area of the common iliac arteries, while distant metastases affect the bones, liver, and lungs [297]. After radical treatment, the probability of tumor recurrence in the first year reaches 10–67%, and the probability of progression over 5 years is 0–55% [298].

The WHO 1973 grading system proposed by Mostofi et al. [299] differentiates papillary urothelial lesions into three grades: G1, G2, and G3 [300]. Tumors are graded according to the degree of cellular and architectural atypia. The lowest grade (G1) displays nearly no atypia, while the highest grade (G3) displays major atypia with major architectural disorders, such as loss of polarity or pseudostratification.

In 1997, a new multidisciplinary consensus meeting was held to revise terminology and provide updated recommendations to the WHO on the pathology of urothelial carcinomas. The WHO/International Society of Urological Pathology (ISUP) classification of 1998 distinguishes papilloma, papillary urothelial neoplasm of low malignant potential (PUNLMP), and low-grade (LG) and high-grade (HG) carcinomas.

The WHO 2016 system is based on the WHO/ISUP 1998 classification and the WHO 2004 classification, which refined the criteria of WHO/ISUP 1998. According to the WHO 2016 system, pTa and pT1 tumors are graded into LG and HG and all detrusor muscle-invasive urothelial carcinomas are considered to be HG tumors. pTa tumors do not invade the lamina propria (no lymphovascular invasion and distant metastasis). However, pT1 tumors grow under the basement membrane into the lamina propria, and lymphovascular invasion and metastasis can be seen in these cases. In many instances pathologists identify pT1 tumors as HG tumors, independently of their atypia [301, 302].

There are no reliable screening tests available for detecting bladder cancer; hence, the diagnosis is usually made based on clinical signs and symptoms. Microscopic or gross painless hematuria is the most common presentation and a hematuria investigation in an otherwise asymptomatic patient detects bladder neoplasm in roughly 20% of gross and 5% of microscopic cases [303, 304].

Currently, the generally accepted criteria for staging bladder cancer, such as the depth of tumor invasion, the degree of cell differentiation, and the involvement of regional lymph nodes, do not always allow the treatment outcome to be predicted reliably. This is confirmed by the analysis of long-term treatment results in patients with the same diagnosed stage of bladder cancer: some patients have a favorable outcome after organ-preserving surgery, while others relapse and quickly develop tumor progression [305].

Cystoscopy is an essential procedure for the diagnosis and treatment of bladder cancer, allowing for direct access to a tumor for biopsy, fulguration, and/or resection. Low-grade (LG), papillary (Ta) tumors can be reliably eradicated with one treatment, but more advanced disease (high grade and/or T1) often requires repeat resection for complete eradication. Following an initial diagnosis of HG Ta or T1, between 40% and 78% of re-TUR specimens can contain residual disease, with muscle invasion present in 2% and 14%, respectively [306, 307, 308, 309].

Carcinoma in situ (CIS) of the urinary bladder is extremely hard to diagnose. The symptoms are highly unspecific and the small, flat CIS lesions can easily be missed, thus remaining unseen in standard white light cystoscopy. Photodynamic diagnosis (PDD) is recommended by the European Association of Urology (EAU) as a diagnostic procedure in cases of suspected CIS [310]. PDD represents a great enhancement in the urological diagnosis of CIS of the urinary bladder and is superior to standard white light cystoscopy in cases where CIS is suspected [311].

The standard for diagnosis is cystoscopy, with biopsies of suspicious lesions, and transurethral resection to confirm the diagnosis [302]. Early in vitro and ex vivo studies demonstrated the potential of Raman spectroscopy to detect bladder tumors by increased DNA and cholesterol content and decreased collagen content [123, 312, 313]. In 2005, Crow et al. [187] were the first research group to integrate a fiber-optic probe into Raman spectroscopy to differentiate between benign and malignant bladder tissue in vitro. In 2009, Grimbergen and colleagues [189] also investigated the potential of using Raman spectroscopy for bladder tissue diagnosis during cystoscopy by examining 107 bladder tissue biopsies ex vivo using an endoscopic probe. Raman spectral measurements were obtained from fresh tissue samples immediately after surgery with an integration time of 2 s. The developed PCA-LDA model with leave-one-out cross-validation distinguished normal and malignant tissue with a sensitivity and specificity of 78.5% and 78.9%, respectively.

In 2010, Draga et al. [190] were the first research group to investigate the use of a Raman fiber-optic probe [189] for the diagnosis of bladder cancer in vivo. Raman spectra were obtained during transurethral resection of bladder tumor procedures on 38 patients. Spectra were measured with a previously reported high-volume Raman probe [194] with a penetration depth of 2 mm for integration times of 1–5 s. The authors found a significant peak at 875 cm−1 in the spectra of normal bladder tissues, which could be assigned to hydroxyproline, an important molecular component of collagen. In addition, the cancer spectra showed significantly elevated peaks at 1003, 1208, 1580, and 1601 cm−1 and at 1208, 1548, and 1617 cm−1, which might be attributed to the amino acids phenylalanine and tryptophan, respectively. Bladder cancer spectra also showed significantly elevated intensities at 680, 789, 1180, 1580, and 1610 cm−1, most likely due to nucleotide chains. However, a relative increase of lipid content in malignant mucosa in vivo was not ascertained, in contradiction to the results of [192, 200], which could be explained by the influence of Raman spectra collected from deeper layers due to the use of the high-volume Raman probe. PCA-LDA and leave-one-out cross-validation were used to distinguish cancer from normal tissue, achieving a sensitivity of 85% and specificity of 79%.

In 2012, Barman et al. [314] proposed the use of a confocal fiber-optic Raman probe to increase the specificity (in terms of tissue depth discrimination) for bladder cancer diagnosis. The confocal probe was designed by placing a pinhole aperture into the high-volume probe to decrease the depth of field to 280 μm, thus suppressing the spectral information from surrounding regions and from deeper tissue layers beyond the region of interest. All spectra were preprocessed and diagnostic algorithms were developed using PCA and logistic regression analysis along with leave-one-out cross-validation. The high-volume probe produced a sensitivity of 85.7% and specificity of 85.7%, whereas the confocal probe had a sensitivity of 85.7% and specificity of 100%. The significant increase in specificity of the confocal probe in comparison to the high-volume probe was associated with its smaller depth of field, giving this particular device an advantage in the application of Raman probes for real-time in vivo diagnosis of bladder pathology.
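For readers less familiar with these figures of merit, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The counts are hypothetical: they were chosen only because they reproduce the quoted percentages (12/14 ≈ 85.7%), and they are not the sample sizes reported in [314].

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: cancer sites classified correctly/incorrectly;
    tn/fp: normal sites classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# HYPOTHETICAL counts (14 cancer and 14 normal sites), for illustration only.
high_volume = diagnostic_metrics(tp=12, fn=2, tn=12, fp=2)
confocal = diagnostic_metrics(tp=12, fn=2, tn=14, fp=0)

for name, (sens, spec, acc) in (("high-volume", high_volume), ("confocal", confocal)):
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}, accuracy {acc:.1%}")
```

With these illustrative counts, removing the two false positives raises the specificity from 85.7% to 100% while leaving the sensitivity unchanged, which mirrors the pattern reported for the two probes.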

In 2018, Chen et al. [191] implemented a low-resolution fiber-optic Raman sensing system for discriminating different bladder pathologies. With the help of a specially trained and cross-validated PCA-ANN classification model, an overall diagnostic accuracy of 93.1% was obtained for the determination of normal, low-grade, and high-grade bladder tissues.

Bovenkamp et al. [188] demonstrated the applicability of RS-OCT for improved diagnosis, effective staging, and grading of bladder cancer by linking the complementary information provided by the two modalities. OCT discriminated well between the urothelium, lamina propria, and muscularis layers, which specifically identify the pathological degeneration of the tissue. Raman spectroscopy determined the molecular characteristics via point measurements at suspicious sites. It was shown that OCT differentiated healthy and malignant tissues with an accuracy of 71% in tumor staging and Raman spectroscopy yielded an accuracy of 93% in discriminating low-grade from high-grade lesions.

3.8 Stomach and Esophageal Cancer

Stomach cancer (cardia and noncardia gastric cancer combined) remains an important cancer worldwide and is responsible for over 1,000,000 new cases in 2018 and an estimated 783,000 deaths (equating to 1 in every 12 deaths globally), making it the fifth most frequently diagnosed cancer and the third leading cause of cancer death. Esophageal cancer is the eighth most frequent cancer, with a worldwide incidence of 572,034 cases (3.2% of the total cases) and a mortality of 508,585 deaths (5.3% of the total cancer deaths) [1]. By adapting Raman fiber-optic probe designs for endoscopic compatibility, in situ measurements of these disease targets have been enabled by many research groups (Table 1.3). In 2008, Teh et al. [181, 184] studied 73 gastric tissue samples from 53 patients and found the Raman peaks at 875 and 1745 cm−1 to be two of the most significant features for discriminating gastric cancer from normal tissue. A sensitivity of 90% and specificity of 95% between cancerous and healthy tissue were reported. In follow-up studies, a sensitivity of 94% and specificity of 96.3% were demonstrated for the distinction of gastric dysplastic tissue with the help of narrowband image-guided Raman endoscopy associated with a PCA-LDA model [182]; predictive accuracies were evaluated as 88, 92, and 94% for normal stomach and intestinal- and diffuse-type gastric adenocarcinomas, respectively [182]. Bergholt and co-authors performed in vivo studies for diagnosing dysplasia in Barrett’s esophagus [315, 316], premalignant and malignant lesions in the upper gastrointestinal tract [176], ulcers in the stomach [179], gastric dysplasia and neoplasia [167], intestinal metaplasia, Helicobacter pylori infection, and adenocarcinoma [177].

Gastric tissue Raman spectra contain a large contribution from triglyceride (major peaks at 1078, 1302, 1445, 1652, and 1745 cm−1) that reflects the interrogation of subcutaneous fat in the gastric wall [167, 179]. Remarkable spectral alterations are observed at the Raman peaks 875, 936, 1004, 1078, 1265, 1302, 1335, 1445, 1618, 1652, and 1745 cm−1 between different tissue pathologies due to major pathological features such as upregulation of mitotic and proteomic activity, an increase in DNA content, and a relative reduction in lipid content, as well as the onset of angiogenesis leading to neovascularization in the tissue [169, 178]. It has been demonstrated that the diagnostic capabilities can be optimized through the combination of near-infrared autofluorescence with Raman spectroscopy [178]. A total of 1098 normal tissue samples and 140 cancer gastric tissue samples from 81 patients were measured with a spectral acquisition time of 0.5 s. The differentiation between gastric cancer and normal tissue was achieved with a sensitivity of 97.9% and specificity of 91.5%. Several probabilistic models were proposed and tested for online in vivo diagnostics and pathology prediction: PLS-DA [168], PCA-LDA [178, 184], ACO-LDA [176], and CART [181]. The PLS-DA modeling provided a predictive accuracy of 80.0% [168], suggesting that Raman endoscopy with the integration of an online diagnostic framework could be a diagnostic screening tool for real-time in vivo gastric cancer identification.

It was demonstrated that the acquisition of both the fingerprint and the high-wavenumber regions of a Raman spectrum meaningfully enhanced the detection of esophageal neoplasia [175] and gastric intestinal metaplasia [176] in vivo in comparison with each region alone. A total of 157 patients were included. A PCA-LDA model with leave-one-tissue-site-out cross-validation on in vivo tissue Raman spectra yielded diagnostic sensitivities of 89.3%, 89.3%, and 75.0% and specificities of 92.2%, 84.4%, and 82.0% using the integrated FP/HW, FP, and HW Raman techniques, respectively, for identifying intestinal metaplasia from normal gastric tissue. Wang et al. [174] achieved a diagnostic accuracy of 93.0% (sensitivity of 92.5%; specificity of 93.1%) for differentiating gastric dysplasia from normal gastric tissue by using the beveled fiber-optic Raman probe, which was superior to the diagnostic performance (accuracy of 88.4%; sensitivity of 85.8%; specificity of 88.6%) obtained with the volume Raman probe.

Optical properties of stomach and esophagus, measured in the spectral range from 300 to 1140 nm in [317], are presented in Table 1.8. It is clearly visible that the absorption of light in healthy tissues (both the stomach and the esophagus) predominates over absorption in tumor tissues (adenocarcinoma and squamous cell carcinoma). At the same time, scattering properties of these tissues are comparable across the entire wavelength range.

Table 1.8 The optical properties of stomach/esophageal tumor tissues measured in vitro [317]

3.9 Oral Cancer

The global incidence of oral cancer is 354,864 cases (2.0% of the total cases) and mortality is 177,384 deaths (1.9% of the total cancer deaths) [1]. These cancers are mostly associated with tobacco and alcohol use, which affect the entire upper aerodigestive tract mucosa, resulting in molecular changes that can progress further into carcinomas. Oral cancers are typically diagnosed by biopsy and histopathology of the tissue [318]. The first in vivo studies of site-wise variations in the human oral cavity were carried out by Guze et al. [319] in the high-wavenumber region and by Bergholt et al. [320] in the fingerprint region. It was found that the Raman signal was not influenced by gender or ethnicity [319]; however, the inter-anatomical variability is significant and should be considered as an important parameter in the interpretation and rendering of Raman diagnostic algorithms for oral tissue diagnosis and characterization [320]. Singh et al. [321, 322] reported the discrimination of normal control, premalignant, and cancerous sites with prediction accuracies ranging from 75 to 98%, depending on oral cancer location, in smoking and non-smoking populations. Krishna et al. [211] investigated the classification of spectra acquired from multiple normal sites of 28 healthy volunteers and from 171 patients with oral lesions. Using a probability-based multiclass diagnostic algorithm, each oral tissue type (squamous cell carcinoma, submucous fibrosis, leukoplakia, and normal mucosa) was correctly classified in 89%, 85%, 82%, and 85% of the cases, respectively.

Sahu et al. [323] investigated the influence of anatomical differences between subsites on healthy versus pathological classification by examining Raman spectra acquired from 85 oral cancer and 72 healthy subjects. Mean spectra indicated a predominance of lipids in healthy buccal mucosa, contributions of both lipids and proteins in lip, and a major dominance of protein in tongue spectra. From healthy to tumor, changes in protein secondary structure, DNA, and heme-related features were observed. PC-LDA followed by LOOCV yielded overall classification rates of 98%, 54%, 29%, and 67% for healthy, contralateral normal, premalignant, and malignant conditions, respectively.

Amelink et al. [324] examined the physiological differences between normal and tumorous (squamous cell carcinoma, SCC) oral mucosa and reported that the oxygen saturation was (95 ± 5)% for normal tissue and (81 ± 21)% for SCC, the vessel diameter was 24 ± 14 μm for normal tissue and 25 ± 12 μm for SCC, and the blood volume fraction was (1.0 ± 0.9)% for normal tissue and (2.2 ± 2.3)% for SCC.

3.10 Liver Cancer

The global incidence of liver cancer is 841,080 cases (4.7% of the total cases) and mortality is 781,631 deaths (8.2% of the total cancer deaths) [1]. Surgical intervention is often indicated as a potential treatment for liver cancers identified at early stages. The liver’s highly specialized tissues regulate a wide variety of high-volume biochemical reactions and are characterized by strong background autofluorescence that overwhelms the Raman scattered signal in the fingerprint region. With a 785 nm Raman excitation laser, spectral signatures with signal-to-noise ratios sufficient for interpretation could be recovered from bulk liver specimens only in the high-wavenumber region [325]. Therefore, most investigations of liver malignancy were conducted in cell lines and thin slices of tissue, where the use of confocal collection geometry reduced autofluorescence [186, 326]. For example, Tolstic et al. [186] reported the precise multivariate PCA-SVM separation of two types of hepatocellular carcinoma cells by recognition of a spectral pattern with peak intensities at 2900–2850, 1655, 1440, 1304, 1266, and 1060 cm−1. The results confirmed that many molecular differences were hidden in lipids and associated with specific wavenumbers of unsaturated fatty acids.

Pence et al. [132] reported the use of a dispersive 1064 nm Raman system with a low-noise indium-gallium-arsenide (InGaAs) array to discriminate highly autofluorescent ex vivo bulk tissue specimens of healthy liver, adenocarcinoma, and hepatocellular carcinoma. The resulting spectra were combined with a multivariate discrimination algorithm, sparse multinomial logistic regression (SMLR), to predict class membership of healthy and diseased tissues, and the spectral bands selected for robust classification were extracted. These spectral bands included retinol, heme, biliverdin, or quinones (1595 cm−1); lactic acid (838 cm−1); collagen (873 cm−1); and nucleic acids (1485 cm−1). A sensitivity of 100% and a specificity of 89% were achieved for normal versus tumor classification.

Cholangiocarcinoma (CC) is a group of malignant tumors originating from bile duct epithelium. According to the WHO classification [327], the term cholangiocarcinoma is reserved for carcinomas arising in the intrahepatic bile ducts. The prognosis of this malignancy is dismal owing to its silent clinical character, difficulties in early diagnosis, and limited therapeutic approaches.

CC is the most common malignant tumor of the biliary tract found in the bile duct epithelial cells and the second most common primary tumor of the liver [328]. Depending on anatomical localization, CC is classified as intrahepatic CC or extrahepatic CC, including perihilar CC and distal CC. The Bismuth–Corlette classification provides preoperative assessment of local spread. The anatomic margins for distinguishing intra- and extrahepatic CCs are the second-order bile ducts [329].

The Liver Cancer Study Group of Japan proposed in 2000 a new classification based on growth (morphologic) characteristics being identified as mass forming, periductal-infiltrating, and intraductal-growing types [330].

Intrahepatic CC is a primary liver malignancy arising from the epithelial cells of the distal branch intrahepatic bile duct [331]. The incidence of intrahepatic CC exhibits wide geographical variation and generally accounts for between 5 and 30% of primary liver cancers [332,333,334]. Approximately 67% of CCs are perihilar.

Distal CCs are those that arise in the mid or distal bile duct. They are potentially amenable to pancreaticoduodenectomy.

The Classification of Malignant Tumors (TNM) of the American Joint Committee on Cancer and the International Union Against Cancer applies to all primary carcinomas of the liver, including hepatocellular carcinomas, intrahepatic bile duct carcinomas, and mixed tumors [335].

Hilar CC arises from the extrahepatic bile ducts (right and left hepatic ducts at or near their junction) and is considered an extrahepatic carcinoma [336].

In most peripheral CCs, hard, compact, and grayish-white massive or nodular lesions are found in the liver. They may grow inside the dilated bile duct lumen or show an infiltrative growth along the portal pedicle. Usually the tumors are not large compared to the whole liver. Hemorrhage and necrosis are infrequent, and the association with cirrhosis is only occasional. Tumors located just beneath the capsule of the liver show umbilication, as in metastatic liver cancer.

In most hilar CCs, the tumor infiltrates and proliferates along the extrahepatic bile duct, which is thickened in most cases. Mass formation can be minimal and there can be thickening and enlargement of the portal region. The infiltration in the liver has an arborescent appearance. Extensive parenchymal infiltration is also observed in most cases [337].

Currently, surgical resection remains the most effective treatment for intrahepatic CC [338]. The prognosis for patients with this disease remains disappointing despite advances in operative and nonoperative management [339]. A positive bile duct resection margin is correlated with a higher local recurrence rate and poor prognosis, and its role is similar to that of a positive lymph node [340].

However, because of vague symptomatic presentation, most patients are at an advanced stage by the time of diagnosis, and only about one-third of patients are eligible for surgical resection [341]. As a result, the overall outcome of intrahepatic CC remains extremely poor: patients who are unable to undergo surgical resection have a survival rate of less than 10% at 5 years. Moreover, the reported outcome after hepatic resection is also not optimistic, with a 5-year survival rate of 30–35% [342].

The principal reason for the dismal outcome of surgical treatment is the high incidence of postoperative intrahepatic CC recurrence: more than 60% of patients can subsequently develop cancer recurrence after hepatic resection.

The earliest description of CC of bile ducts without a palpable surface liver mass on laparotomy was given by Sanford in 1952 [343]. Tsushimi et al. hypothesized that CC arising within bile ducts originated from ectopic liver tissue [344].

3.11 Thyroid Cancer

Thyroid cancer is responsible for 567,000 cases worldwide, ranking in ninth place for incidence. The global incidence rate in women of 10.2 per 100,000 is three times higher than in men; the disease represents 5.1% of the total estimated female cancer burden, or 1 in 20 cancer diagnoses in 2018 [1]. The diagnosis is commonly based on clinical perceptions and ultrasonography-guided fine needle aspiration, which often gives inconclusive results, so that surgery can be indicated as the main treatment for these cases. Therefore, the need to biochemically characterize the thyroid gland during surgery is extremely important. Raman spectroscopy may greatly speed up the diagnostic process, whether pre-operatively or in the theater setting. The applicability of Raman spectroscopy for thyroid cancer diagnostics was confirmed in cell line studies [215, 216, 345]. In 2009, Harris et al. [215] reported an accuracy of 95% for identification of cancerous cell lines using neural network analysis. O’Dea et al. [216] demonstrated the possibility to correctly classify cell lines representing benign thyroid cells and various subtypes of thyroid cancer. Spectral differences were consistently observed between the benign and cancerous cell lines, with the strongest signals occurring at ~470, ~780, 855, 941, ~1230, 1278, 1343, 1402, 1436, 1456, 1571, 1650, 1690, and 1677 cm−1, representing significant differences in the molecular composition of carbohydrates, nucleic acids, lipids, protein structures, and amides. A PC-LDA model was applied to examine the possibility of correct classification of various subtypes of thyroid cancer. The well-differentiated papillary and follicular thyroid carcinoma cell lines were detected with sensitivities >90% and specificities >80%, although the model yielded lower performance scores for identifying the undifferentiated thyroid carcinoma cell lines (sensitivity of 77% and specificity of 73%). Rau et al. [217] demonstrated ex vivo the significant presence of carotenoids in papillary thyroid carcinoma with respect to the healthy tissue. The authors reported a sensitivity of 93% and a specificity of 100% in the discrimination of papillary and follicular thyroid carcinoma using combined fingerprint and high-wavenumber regions of Raman spectra and a PC-LDA statistical model with leave-one-out cross-validation. In 2019, Medeiros-Neto et al. [346] compared in vivo and ex vivo spectra of papillary carcinomas, confirming the efficacy of the technique in the biochemical identification of the analyzed tissue. The intense peaks related to an increased amount of DNA were registered at 1017, 991, 829, and 810 cm−1 (in vivo samples) and at 1421, 1324, 828, and 810 cm−1 (ex vivo samples). The amino acid tyrosine, a very important metabolite for the proper functioning of the thyroid gland, was evidenced by the peaks at 1205, 863, and 854 cm−1 (in vivo) and at 1605, 1206, 863, 853, and 828 cm−1 (ex vivo). Phenylalanine, produced from the hydroxylation process and essential for the thyroid, was observed at 1174 cm−1 (in vivo) and at 1174 and 1103 cm−1 (ex vivo), indicating the increase in protein concentration in carcinogenic tissues. Another important amino acid observed was tryptophan, at 1366 and 877 cm−1 (in vivo) and at 1556, 1360, and 878 cm−1 (ex vivo). This amino acid is important because it participates in the production of the hormones serotonin and melatonin and of tryptamine, which are tumor growth inhibitory substances.

3.12 Brain Cancer

The global incidence of brain cancer is 296,851 cases (1.6% of the total cases) and mortality is 241,037 deaths (2.5% of the total cancer deaths) [1]. Raman spectroscopy is a potential modality for identifying the margins of the tumor intraoperatively. For example, Kalkanis et al. [347, 348] distinguished normal gray matter and white matter from pathologic glioblastoma and necrosis in frozen brain tissue sections by imaging the relative intensities of the bands at 1004, 1300, 1344, and 1660 cm−1, which correspond primarily to protein and lipid content. Leslie et al. [349] investigated the application of Raman spectroscopy to the diagnosis of pediatric brain tumors by acquiring Raman spectra from fresh tissue samples. Support vector machine analysis was used to classify spectra, with the pathology diagnosis as the gold standard. Normal brain (321 spectra), glioma (246 spectra), and medulloblastoma (82 spectra) were identified with 96.9, 96.7, and 93.9% accuracy, respectively. Jermyn et al. [350] demonstrated that a handheld Raman probe could detect cancer cells intraoperatively that could not be detected by T1-contrast-enhanced and T2-weighted MRI. The gliomas were detected with 93% sensitivity and 91% specificity using a supervised machine learning boosted-trees classification algorithm that utilized all spectral data. Recently, the same research group demonstrated an increase in the accuracy of brain cancer detection by a multimodal optical system, from 91% for standalone RS to 97% when combined with fluorescence analysis [319].

Desrochers et al. [218] tested the use of high-wavenumber Raman spectroscopy in a practical fiber-optic probe that satisfied the stringent miniaturization constraints required for direct integration with a commercial brain biopsy needle. As expected, in comparison with gray matter, the white matter spectrum demonstrated larger contributions from lipids (2845 cm−1) and a lower contribution from proteins and nucleic acids (2930 cm−1). The data demonstrated an increase in the protein/lipid ratio for dense cancer compared to normal brain samples, consistent with findings [351] in ex vivo brain samples. However, the protein/lipid ratio in infiltrated samples did not differ significantly from that of normal brain. The authors showed that HW Raman spectroscopy could detect human dense cancer with >60% cancer cells in situ during surgery with a sensitivity and specificity of 80% and 90%, respectively.
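The protein/lipid ratio mentioned above can be illustrated as a simple band-intensity ratio. The following is a minimal sketch under stated assumptions: the ratio is taken as the intensity at 2930 cm−1 divided by the intensity at 2845 cm−1, the spectra are synthetic Gaussian bands, and all function names and band heights are hypothetical. It is not the processing pipeline of [218].

```python
import math


def gaussian(x, center, width, height):
    """Single Gaussian band used to build a synthetic spectrum."""
    return height * math.exp(-0.5 * ((x - center) / width) ** 2)


def intensity_at(wavenumbers, intensities, target):
    """Intensity at the sampled wavenumber closest to `target` (cm^-1)."""
    idx = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return intensities[idx]


def protein_lipid_ratio(wavenumbers, intensities,
                        protein_band=2930.0, lipid_band=2845.0):
    """Ratio of the protein-associated to the lipid-associated band intensity."""
    return (intensity_at(wavenumbers, intensities, protein_band)
            / intensity_at(wavenumbers, intensities, lipid_band))


# Synthetic high-wavenumber axis, 2800..3050 cm^-1 in 1 cm^-1 steps.
wavenumbers = [2800.0 + i for i in range(251)]


def synthetic_spectrum(lipid_height, protein_height):
    """Two-band toy spectrum; heights are illustrative, not measured values."""
    return [gaussian(w, 2845.0, 15.0, lipid_height)
            + gaussian(w, 2930.0, 20.0, protein_height)
            for w in wavenumbers]


lipid_rich = synthetic_spectrum(lipid_height=1.0, protein_height=0.6)
protein_rich = synthetic_spectrum(lipid_height=0.6, protein_height=1.0)

ratio_lipid_rich = protein_lipid_ratio(wavenumbers, lipid_rich)
ratio_protein_rich = protein_lipid_ratio(wavenumbers, protein_rich)
print(ratio_lipid_rich, ratio_protein_rich)
```

In practice the bands overlap and sit on a fluorescence background, so a real analysis would need baseline removal and possibly band integration rather than single-point intensities.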

Optical properties of brain tumors were studied in [41, 43, 77, 352,353,354,355], and a summary of those investigations is presented in Table 1.9. Genina et al. [41] measured the absorption, scattering, and reduced scattering coefficients and the scattering anisotropy factor of brain tissues in a wide spectral range from 350 to 1800 nm (see Table 1.9) for healthy rats and rats with model C6 glioblastoma. Glioblastoma multiforme (GBM) is the most common and aggressive form among all brain tumors (grade IV WHO). The development of glioblastoma is accompanied by the following main symptoms: headaches, dysfunction of memory and general brain function, visual impairment, poor speech, impaired sensitivity and motor activity, pathological changes in behavior, loss of appetite, etc.

Table 1.9 The optical properties of brain tumor tissues measured in vitro [41, 43, 77, 352,353,354,355]

Gebhart et al. [43] investigated human glioma optical properties using integrating sphere technique and inverse adding-doubling method in the spectral range 400–1300 nm. Astrocytoma of optic nerve and medulloblastoma were investigated in vivo by Bevilacqua et al. [77] using spatially resolved diffuse reflectance. Absorption and reduced scattering coefficients of different human tumor tissues (glioblastoma, meningioma, oligodendroglioma, and metastasis) were investigated by Honda et al. [352] by double-integrating sphere and IMC technique. Schwarzmaier et al. [353] investigated optical properties of human low (astrocytoma WHO grade II) and high (WHO grade III) grade glioma with integrating sphere technique and IMC method. Optical properties of glioma were investigated by Sterenborg et al. [354]. Yaroslavsky et al. [355] studied optical properties of human meningioma and astrocytoma (WHO grade II).

3.13 Kidney Cancer

The global incidence of renal cell cancer is increasing annually and the causes are multifactorial. It ranks as the second most common neoplasm found in the urinary system [356].

Renal cell carcinoma (RCC) is the commonest solid lesion within the kidney and accounts for approximately 90% of all kidney malignancies and 2–3% of all cancers, with the highest incidence occurring in Western countries [357]. The proportion of small and incidental renal tumors has significantly increased owing to the widespread use of abdominal imaging. Consequently, more than 50% of RCCs are currently detected incidentally [358]. Diagnosis and subtyping of RCC can usually be accomplished through a thorough morphologic investigation of the resected tumor, which in itself offers valuable prognostic information [301]. The main subtypes of RCC are clear cell, papillary, chromophobe, collecting duct, and unclassified [359].

The most frequent histological type of RCC is clear cell renal cell carcinoma, which accounts for 75% of all primary kidney cancers. Papillary and chromophobe RCC are less common subtypes [360, 361].

The classic clear cell renal cell carcinoma has a yellow-brown cut surface and it is inhomogeneous due to hemorrhage and necrosis. Macroscopically, it is relatively well separated from the normal renal tissue, but there may be a risk to form microscopic tumor satellites. The tumor cells are derived from the proximal convoluted tubule. The rich content of glycogen and fat in the cytoplasm of the cells produces a clear appearance in conventional staining. But there are also eosinophilic, sarcomatoid, and mixed patterns of differentiation.

The distinction of clear cell RCC from papillary renal cell carcinomas is not particularly difficult, but the distinction between RCC and other neoplasms in some cases is problematic and requires additional research methods, such as immunohistochemistry. Occasionally, the tumor cells harbor granular to pink eosinophilic cytoplasm and can resemble chromophobe RCC, which more typically contains polygonal cells with transparent to reticulated cytoplasm rimmed by thickened cell membranes [362].

The most effective treatment of RCC remains the surgical resection of the tumor mass by partial or total nephrectomy [363]. The functional benefits of nephron-sparing procedures have driven the indication of partial nephrectomy, which is recommended as the standard treatment in patients with T1a tumors [364]. Adjuvant therapy after nephrectomy has not been proven to prolong survival or to have any significant patient benefit [365].

The less invasive approaches include percutaneous radiofrequency ablation and laparoscopically assisted cryoablation. Thermal ablation is usually indicated for small renal masses in elderly patients with more comorbidities who are unable to undergo surgical intervention and for patients with bilateral tumors or a solitary kidney [366].

Optical properties of kidney tumor transplanted in rat were investigated in [48] (see Table 1.10).

Table 1.10 The optical properties of kidney tumor tissues measured in vitro [48]

3.14 Pancreatic Cancer

Pancreatic adenocarcinoma (AC), the fourth leading cause of cancer death in the USA with a 5-year survival rate of less than 6%, is often detected at late stages of development when treatment is ineffective. Intraductal papillary mucinous neoplasm (IPMN) is a precursor lesion of pancreatic cancer, characterized by an intraductal proliferation of neoplastic cells with mucin production [367].

Lee et al. [367] measured the human pancreatic malignant precursor, IPMN, using reflectance and fluorescence spectroscopy. They found differences in morphological properties between normal tissue, IPMN, and AC (see Table 1.11).

Table 1.11 Morphological properties of normal tissue, IPMN, and AC

Optical properties of normal and cancerous pancreas were investigated in [368, 369] (see Table 1.12) using integrating sphere technique.

Table 1.12 The optical properties of pancreas tumor tissues measured in vitro [368, 369]

4 Biochemical Cancer Model

To date, most Raman studies of cancerous tissues have used multivariate statistical algorithms, such as PCA-LDA or PLS-DA, to describe the differences in spectral data. However, the principal components and loading vectors are difficult to relate to the biophysical origin of the disease, such as the microstructural organization of proteins and lipids and the functional state of cellular metabolism, which are the key features for the pathologist’s diagnostic decisions on appropriate cancer treatment. Therefore, several research groups have proposed biochemical diagnostic models that extract physiologically relevant markers from Raman spectra of tissues (Table 1.13). The biochemical model derives the morphological and biochemical composition of the modeled tissue from its Raman spectrum. The building blocks of the model are Raman-active components either measured directly from synthetic/purified chemicals [123, 167, 169, 177, 224, 313, 316, 370,371,372,373] or morphologically extracted from tissue sections in situ [98, 122, 222]. In the latter case, a Raman spectrometer is coupled to a microscope and scanned across the tissue section to obtain Raman images that can then be correlated with serial hematoxylin–eosin stained sections to identify relevant morphologic components and their Raman signatures. In situ constituents better represent the milieu of biological tissues, which cannot be recapitulated in a synthetic environment. For example, collagen can be present in human tissue in many different forms, each one having a slightly different Raman spectrum. However, if all of them are included in the model, it may lead to overfitting and unstable results [123, 372]. In addition, skin constituents synthesized in the lab or obtained from commercial sources are not in their natural state. Nevertheless, the advantage of using synthetic/purified chemicals as model components is that they can be easily measured using the same Raman instrument as used to measure the biological specimen, providing evaluation of more specific molecular constituent changes.

Table 1.13 Raman biochemical diagnostics models of tumor tissues

Construction of a biochemical model of the tissue relies on three assumptions: first, that the Raman spectrum of a mixture equals the weighted linear sum of the spectra of the individual components of the mixture; second, that the biological morphological features, such as cells, have the same Raman spectrum from one patient to another; and third, that the basis spectra included in the model are sufficiently distinct to enable their differentiation based on their Raman spectrum [98]. In such an approach, the vector-normalized constituents can be fitted to the mean spectra of the different pathologies by linear least-squares analysis with a nonnegativity constraint (NNLS), according to the following equation:

$$ X=c\;S+E, $$
(1.10)

where X is the measured spectrum of the tissue, c is the matrix of concentrations to be predicted, and S is the matrix of spectral components. This can be used to provide a linear “best fit” of the spectral components with minimum residuals. Here E gives the error or residual, which can be mainly attributed to the noise in the measured signal. The NNLS model presumes that the Raman spectrum measured from the tissue is a linear combination of its biochemical components’ spectra and that the signal intensity scales linearly with the relative concentration of the biochemical components in the tissue. The biochemical model determines the relative concentration profiles of the major tissue biochemical constituents responsible for prominent tissue Raman spectral features and their changes associated with disease progression. If relevant components are not included, omitted-variable bias can introduce errors in the fit [375]. Inspection of the residual E allows the quality of the fit to be assessed and any remaining spectral features to be included in the next iteration of the model.
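The fit of Eq. (1.10) can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in the cited studies: it solves the nonnegative least-squares problem by cyclic coordinate descent (chosen here only to keep the example dependency-free; library solvers such as the Lawson–Hanson routine `scipy.optimize.nnls` are the usual choice), and the three Gaussian "basis spectra" and the concentrations are hypothetical.

```python
import math


def nnls_coordinate_descent(x, S, n_iter=100):
    """Minimize ||x - c S||^2 subject to c >= 0 by cyclic coordinate descent.

    x: measured spectrum, list of n intensities
    S: list of k basis spectra, each a list of n intensities
    Returns (c, residual), where residual = x - c S plays the role of E.
    """
    k, n = len(S), len(x)
    c = [0.0] * k
    residual = list(x)  # with c = 0 the residual equals x
    norms = [sum(v * v for v in s) for s in S]
    for _ in range(n_iter):
        for j in range(k):
            if norms[j] == 0.0:
                continue
            # Unconstrained optimum for c[j] with the others fixed, clipped at 0.
            step = sum(S[j][i] * residual[i] for i in range(n)) / norms[j]
            new_cj = max(0.0, c[j] + step)
            delta = new_cj - c[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * S[j][i]
                c[j] = new_cj
    return c, residual


def gaussian_band(center, width, n_points=200):
    """Hypothetical single-band basis spectrum on an arbitrary channel axis."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_points)]


# Hypothetical basis spectra and a noise-free mixture; the second component is absent.
S = [gaussian_band(50, 10), gaussian_band(100, 10), gaussian_band(150, 10)]
c_true = [0.5, 0.0, 1.5]
x = [sum(c_true[j] * S[j][i] for j in range(3)) for i in range(200)]

c_est, residual = nnls_coordinate_descent(x, S)
print([round(c, 4) for c in c_est])
print(max(abs(r) for r in residual))
```

With real spectra the residual does not vanish; structured features remaining in it, as opposed to noise, suggest that a component is missing from the basis set.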

One important factor that may influence the performance of the model is collinearity of the basis spectra. Collinearity is a common issue in linear regression that can lead to unstable results [123]. Any collinearity in the selected components will skew the fit; an example is the use of amino acids and the proteins containing them in the same model. Hence, the collinearity coefficients of the basis components must be calculated:

$$ R=\frac{x^Ty}{\left({x}^Tx\right)\left({y}^Ty\right)}, $$
(1.11)

where x and y are any two component spectra and T indicates the transpose of the respective spectrum. The matrix of R values for all pairs of components (the orthogonality matrix) represents the degree of orthogonality between the components. For vector-normalized spectra, a value of zero means that the two components are orthogonal, and a value of one means that the two components are identical. For instance, DNA and RNA have an orthogonality value close to 1. Usually, this equation is used for the initial evaluation of the model components.
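A pairwise check of this kind can be sketched as follows. For unit-norm spectra the denominator of Eq. (1.11) equals one and R reduces to the inner product of x and y. The sketch assumes the cosine form with a square root in the denominator, so that identical spectra give R = 1 even without prior normalization; this normalization, the function names, and the synthetic bands are assumptions made for illustration.

```python
import math


def collinearity(x, y):
    """Normalized inner product (cosine similarity) of two spectra."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def collinearity_matrix(spectra):
    """Matrix of pairwise collinearity coefficients for a list of spectra."""
    return [[collinearity(x, y) for y in spectra] for x in spectra]


def gaussian_band(center, width, n_points=200):
    """Hypothetical single-band spectrum on an arbitrary channel axis."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_points)]


# Two strongly overlapping components and one well-separated component.
spectra = [gaussian_band(100, 10), gaussian_band(103, 10), gaussian_band(150, 10)]
R = collinearity_matrix(spectra)
for row in R:
    print([round(v, 3) for v in row])
```

Pairs with R close to 1, such as the first two synthetic components here, are candidates for merging or for dropping one member before the fit.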

The choice of biochemical substances used in the model is mainly based on their known presence in the corresponding tissue and on the contributions they would make to the observed tissue spectrum. For example, Stone et al. [123] diagnosed bladder and prostate cancer by quantifying differences in actin, collagen, choline, triolein, oleic acid, cholesterol, and DNA, assessing the gross biochemical changes in each pathology. De Jong et al. [313] demonstrated that spectra from normal bladder tissue showed a higher collagen content, while spectra from tumor tissue were characterized by higher lipid, nucleic acid, protein, and glycogen content. Haka et al. [122] used a biochemical model to correlate changes in the amounts of fat (adipocytes), collagen, cholesterol, and calcium oxalate in the cell nucleus and cytoplasm, aiming at breast cancer diagnosis in vivo.

Huang et al. [169] developed a biochemical model for effective gastric cancer diagnosis, including eight reference tissue constituents (actin, albumin, collagen type I, DNA, histones, triolein, pepsinogen, and phosphatidylcholine). The authors showed that albumin, nucleic acid, phospholipids, and histones were the most significant features for diagnosing epithelial neoplasia of the stomach, giving rise to an overall accuracy of 93.7%.

In 2011, Bergholt et al. [167], based on NNLS analyses of over 35 basis reference Raman spectra obtained from different biomolecules associated with GI tissue (e.g., actin, albumin, pepsin, pepsinogen, B-NADH, RNA, DNA, myosin, hemoglobin, collagen I, collagen II, collagen V, mucin 1, mucin 2, mucin 3, flavins, elastin, phosphatidylcholine, cholesterol, glucose, glycogen, triolein, histones, beta-carotene, etc.), showed that five biochemicals (actin, histones, collagen type I, DNA, and triolein) were the most significant Raman-active biochemical constituents and could effectively characterize gastric and esophageal tissue with very small fit residuals. In this model, DNA represented nucleic acids within the cell nucleus; triolein represented typical lipid signals; actin and histones resembled proteins of different conformations and were the major components of the cytoskeleton and chromatin, respectively; and collagen type I was a substantial part of the extracellular matrix [95, 123, 169]. In a follow-up study [307], the authors added glycogen, which is present in the squamous epithelium, to the model.

The comparisons of the mean in vivo measured Raman spectra and the reconstructed Raman spectra of different normal and cancerous tissues are shown in Fig. 1.7 and indicate good fitting (with residual less than 10%) [167]. The diagnostic sensitivity of GI cancer was 97.0%, and the specificity was 95.2% [177, 316].

Fig. 1.7 Comparison of the in vivo measured Raman spectra with the tissue Raman spectra reconstructed from the five basis reference Raman spectra: (a) normal esophagus, (b) esophageal cancer, (c) normal gastric, (d) gastric cancer. Residuals (measured spectrum minus fit spectrum) are also shown in each plot. Reprinted with permission from [167]

The six-component biochemical model showed that neoplastic tissue was mainly associated with a decrease in actin, collagen, lipids, and glycogen and an increase in DNA and histone concentrations [316]. More specifically, a significantly increased fit coefficient was found for DNA (highly related to the Raman peak at 1335 cm−1) and histones (associated with amide III at 1265 cm−1 and the amide I (C=O) stretching vibration at 1655 cm−1), together with a relative reduction in actin, which represented a major part of the Raman signal originating from cell cytoplasm. This reflected the increase in the nuclear-to-cytoplasm ratio of neoplastic cells, which is a well-established qualitative indicator of malignancy used by pathologists. The fit coefficient of collagen, representing the extracellular matrix, was lower for cancerous than for normal tissue. The model also revealed a considerable decrease in the fit coefficients of lipids (associated with Raman peaks at 1078, 1302, 1445, and 1745 cm−1) and a reduction in glycogen in cancer tissue, related to the abnormal glucose metabolism in cancer cells [372]. Incorporation of the significant biochemical fit coefficients in LDA-DA statistical analysis provided an inherent separation of different tissue types based on the biomolecular information.

Feng et al. [222] designed a skin biochemical model with eight primary model components: collagen, elastin, triolein, nucleus, keratin, ceramide, melanin, and water, which were collected from human skin in situ and averaged over multiple patients. Those components contained both biochemical and structural information. For instance, nucleus referred to the nuclear material in the cell. Collagen and elastin referred to the dermal extracellular matrix. Keratin represented the epidermal extracellular matrix. Triolein mainly represented subcutaneous fat. The fit coefficients provided the relative concentrations of those components and were used as the input variables of the discriminant analysis. The authors showed that a biophysical model could achieve diagnostic performance consistent with that of the statistical model while simultaneously extracting the relevant biomarkers accounting for the diagnosis [121, 226].

In general, Raman spectral biochemical modeling in conjunction with linear discriminant analysis shows good classification results and also provides new insights into biochemical origins of Raman spectroscopy for tissue diagnosis and characterization.

5 Summary

Numerous experimental studies have shown the capability of Raman spectroscopy for malignant lesion detection and grading based on objective and quantifiable molecular information. Consequently, the use of this technique can reduce the number of unnecessary biopsies and guide precise tumor margin detection, improving surgical outcomes for patients.

Despite its advantages, Raman spectroscopy still faces challenges in cancer diagnosis. First, its measurement speed is low. One way to overcome this is to complement Raman spectroscopy with other techniques, such as autofluorescence imaging and OCT. Image-guided Raman spectroscopy substantially reduces the time spent on redundant or non-relevant Raman measurements.

Second, translation to clinical use requires the development of comprehensive spectral databases and tissue classification methodologies that can extract effective diagnostic information from Raman spectra that usually overlap and show only subtle differences between neoplastic and normal tissues. Validation studies need to be performed to confirm that diagnostic classification algorithms developed on ex vivo specimens are applicable to in vivo tissues. Deep learning models trained on large numbers of spectra can also discriminate between cancer types and may serve as predictors of cancer aggressiveness. Combining statistical and biochemical models can lead to new classification methodologies comparable with current gold standards.

Recently, the optical properties of many kinds of tumors have been investigated using integrating sphere spectroscopy and reflectance spectroscopy. From the analysis of the tables presented above, we can conclude that, as a rule, both the absorption and the scattering properties of tissue increase during tumor development in comparison with normal tissue. Hemoglobin and water content increase, whereas the degree of oxygenation does not change or changes insignificantly. The presented data can be used for the development of novel methods of cancer diagnostics and treatment.