
1 Introduction

This chapter presents an overview of the state of existing and emerging endoscopic technologies and techniques and shows how nanoparticles might be used to overcome some of their limitations in order to improve endoscopic diagnosis and/or treatment. We first present an overview of the principles, technology and clinical applications of current “standard” white-light endoscopic imaging, highlighting some of the major unmet clinical needs. We next summarize some of the novel optical approaches that have been investigated to address these needs, including the use of different light-tissue interactions (fluorescence, Raman scattering, elastic scattering, optical coherence tomography and photoacoustics). Some of these novel approaches involve the use of optically-active agents to provide additional ‘molecular contrast’ to improve the sensitivity and/or specificity of disease detection. One option towards achieving this goal is to use nanoparticles as the means of providing the additional contrast, as well as to enhance local endoscopic treatments. This then leads naturally to a section in which we provide an overview of optical nanoparticles and their potential uses in medicine in general and then, in particular, of how nanoparticles may be used with each of the endoscopic imaging techniques. The chapter concludes with a summary of broad issues in moving nanoparticles into clinical endoscopic practice, including concerns over potential toxicity, regulatory barriers and scientific/technological challenges, as well as opportunities for integrating endoscopy instrumentation with emerging nanotechnologies.

1.1 Principles of Endoscopy and Current Clinical Applications

Endoscopy is the examination of the inner surfaces of organs, most commonly using visible light to illuminate the tissue and collecting images of the light that is diffusely reflected from the surface. This approach dates back over 200 years to the invention of the “Lichtleiter” by Philipp Bozzini, which comprised an expanding device that was placed in the patient’s rectum to view the inner tissue surface under candle-light illumination. It was noted at the time that the Vienna Medical Society “disapproved of such curiosity”. There was some continued development of endoscopic technologies using traditional optical elements throughout the remainder of the 19th and first half of the 20th centuries but the technique did not become widespread until the introduction of fiber optic-based endoscopes in the 1950s. One of the most important developments at this time occurred in 1952 when a group in France revolutionized imaging inside the body using ‘cold light’ fiberglass illumination. Modern fiberscopes utilize a coherent fiber bundle to transmit the image from within the body to an external camera [1]. A major subsequent advance was the development of the video endoscope in the 1980s, in which a CCD camera is placed at the distal end of the endoscope, i.e. within the patient, thereby improving the imaging performance. A flexible fiberoptic or video endoscope, such as that shown in Fig. 1, has the following additional major elements: a channel for fiberoptic delivery of the broad-band white light, usually from a halogen or an LED light source; one or more working channels through which instruments may be placed, such as biopsy forceps to take tissue samples or a therapeutic device (laser-delivery fiber or argon plasma probe for tissue ablation, electrocautery probe, etc.); a means to steer the tip of the endoscope for positioning, usually by guide wires; and image capture and display devices, nowadays usually digital.
Rigid endoscopes are also available and used for some applications and body sites in which steering of the endoscope tip is not required, i.e. the tissue of interest is more directly accessible. In these instruments the image of the tissue is commonly transferred to an exterior camera or directly to the operator’s eye by a series of lenses or a coherent fiber bundle.

Fig. 1

a Schematic showing a distal end of a typical fiberoptic endoscope (reprinted from Ref. [2], with permission from American Society for Microbiology). In videoscopes the fiberoptic imaging bundle is replaced by a CCD chip. b Typical modern flexible endoscope

The most widespread application of endoscopy is in diagnostic examination of hollow organs for the purpose of detecting and localizing cancer, including in the lung, the gastrointestinal tract, the oral cavity and upper neck, the urinary bladder and the cervix. In addition, both flexible and rigid endoscopes are used for visualization in laparoscopic (‘keyhole’) surgery, in examining joints, in the ear canal and in the ducts of the breast to detect early cancer. A new technological advance, an ultrathin (~1 mm) scanning-fiber endoscope, has recently emerged that improves the field-of-view and image resolution in applications where access is very restricted such as in the bile duct or for intravascular imaging [3]. This uses laser sources, a piezoelectrically-driven rapidly-scanning optical fiber to deliver the light point-by-point and one or more stationary fibers to collect the light from the tissue [4].

1.2 Limitations of Conventional White-Light Endoscopy

Standard white-light endoscopy is based on direct or indirect visualization of the diffuse reflectance of visible light from the tissue surface. Here we consider in what ways it does not fully meet significant clinical needs. Since cancer detection and therapeutic guidance have driven many of the developments discussed below, these clinical needs will be used to illustrate the challenges and potential solutions. Tumors arising in the hollow organs of the body represent the majority of solid tumors (excluding skin cancers) and so are a very important contributor to cancer-related mortality and morbidity. These organs comprise essentially an outer muscular layer that provides structural integrity and motility, an inner mucosal layer that provides the biochemical/physiological functions of the organs, and an intervening submucosal layer. Tumors start in the mucosal layer, since this typically has rapid turnover and is exposed to potentially mutagenic chemicals, bacteria and viruses. As illustrated in Fig. 2, following tumor initiation there is progression through a number of stages: dysplasia (premalignant), carcinoma-in situ (malignant but confined to the mucosal layer), sub-mucosal extension and then invasive cancer with risk of metastatic spread to other organs. Hence, as summarized in Table 1, it is very important (a) to accurately detect lesions at the pre-malignant stage so that they can be removed or ablated in a minimally-invasive way, (b) to determine if there is submucosal invasion and if there is tumor spread to local lymph nodes, (c) to guide local interventions such as mucosal resection/ablation, (d) to improve safety and efficacy of treatment and (e) to monitor the treatment response and completeness. 
In many cases there is a very strong correlation between detecting cancers at an early stage and their curability: for example, in esophageal cancer the 5-year survival rate for localized disease is about 40 % but only about 1 in 5 cases of esophageal cancer is detected at this stage. Unfortunately, about 40 % of patients are diagnosed with advanced metastatic disease for which the 5-year survival drops to less than 5 %. The situation is even worse in lung cancer where nearly 60 % of cases are diagnosed at a late stage, also with less than 5 % 5-year survival but increasing to over 50 % if the cancer is still localized at time of diagnosis and treatment [5].

Fig. 2

Schematic of initiation and progression stages in the development of luminal tumors (for the specific case of the colon). Illustration Copyright © 1998–2003 by The Johns Hopkins Health System Corporation and The Johns Hopkins University. Illustration created by Mike Linkinhoker and Cory Sandone. With permission from the Johns Hopkins Division of Gastroenterology and Hepatology (www.hopkinsmedicine.org/gi)

Table 1 Clinical needs for endoscopic cancer diagnosis and treatment and potential roles of nanoparticles

The molecular signature of the particular lesion, i.e. the expression of specific biomarkers that are known to be associated with prognosis and with sensitivity to particular treatments such as chemotherapy, provides valuable information for patient management to complement standard histopathology of tissue biopsy specimens taken at the time of endoscopic examination. This personalization is increasingly important to ensure optimal management of patients based on the characteristics of their individual tumors. There is also an explosion of so-called ‘omics’ information that is potentially useful for this purpose: genomics, epigenomics, proteomics, metabolomics, etc. [6] and we will see how nanoparticles may be particularly useful in translating this knowledge into more accurate endoscopic imaging.

Conventional white-light endoscopy in general has less than optimum performance, especially for detecting early-stage disease and for determining if there is sub-mucosal invasion or lymph-node involvement. For early detection, there is a trade-off between sensitivity (the ability to detect a lesion) and specificity (the ability to identify the lesion as being malignant or premalignant). Depending on the organ, there can be a high rate of false negative results (FN, missing a malignancy) and false positive results (FP, incorrectly identifying normal or benign lesions as malignant). The sensitivity is defined as Sens = TP/(TP+FN) and the specificity as Spec = TN/(TN+FP), where TP and TN are the numbers of true positive and true negative findings, respectively. For example, in standard-definition white-light colonoscopy the sensitivity and specificity for identifying so-called pre-malignant polyps (adenomas) are around 52 % and 91 %, respectively [7]. Missing an adenoma means that it will not be resected and so places the patient at risk of progression to more dangerous invasive cancer, while false positive findings result in unnecessary resection with associated risk and costs. The severity of this situation is compounded by the fact that some patients may have many such polyps on a complex background of chronic inflammation. Note also that the “gold standard” histopathologic assessment is itself far from perfect, especially for early-stage disease.
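
The two definitions above can be illustrated with a minimal Python sketch. The function names and the counts are hypothetical: the counts are chosen only so that the results match the approximate 52 % and 91 % figures quoted for white-light colonoscopy, and are not data from Ref. [7].

```python
def sensitivity(tp, fn):
    """Sens = TP / (TP + FN): fraction of true lesions that are detected."""
    if tp + fn == 0:
        raise ValueError("no positive cases: sensitivity is undefined")
    return tp / (tp + fn)


def specificity(tn, fp):
    """Spec = TN / (TN + FP): fraction of non-lesions correctly called negative."""
    if tn + fp == 0:
        raise ValueError("no negative cases: specificity is undefined")
    return tn / (tn + fp)


# Hypothetical counts (illustration only): 100 adenomas, 100 non-adenomas.
tp, fn = 52, 48   # adenomas detected / adenomas missed
tn, fp = 91, 9    # non-adenomas correctly dismissed / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

In this illustration a sensitivity of about 0.52 corresponds to roughly half of the adenomas being missed, which is the clinical concern raised in the text.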

1.3 Methods to Improve Conventional White-Light Endoscopy

Figures 3 and 4 illustrate three of the several different approaches that have been explored or implemented in order to improve conventional white-light endoscopy (WLE). The first major division is between techniques based on improving white-light reflectance-based imaging and those that exploit other light-tissue interactions. WLE is based on detecting photons that have undergone multiple elastic scattering in the tissue. The spectrum of this diffuse reflectance then depends on the absorption of specific molecular species in the tissue, whose concentration and spatial distribution may be altered by disease. Hemoglobin is a dominant chromophore in the visible spectral range, so that the blood content of the tissue is an important biomarker and this may change due to the proliferation of new blood vessels (angiogenesis) or to altered blood flow. The morphological features of the tissue also contribute to the diagnosis, e.g. thickening of the mucosa or changes in collagen content. A number of different chromogenic stains, i.e. molecular dyes of distinct color, are used to improve the contrast of lesions by altering the local optical absorption spectrum and are usually applied topically by spraying or painting onto the tissue surface. These have different tissue-binding characteristics and help highlight suspicious regions on the relatively unstained tissue background and/or enhance specific morphological features of the lesion (Fig. 3c). In some cases acetic acid is used to improve early cancer contrast by causing temporary nuclear condensation that alters the refractive index of the tissue and so provides optical scatter contrast in the reflectance signal. While these stains are widely used in endoscopy, in general they yield limited improvements in diagnostic sensitivity or specificity [8].

Fig. 3

Endoscopic identification of the squamous islands in Barrett’s esophagus under a white light imaging, b narrow-band imaging and c iodine chromoendoscopy (Reprinted from Ref. [10], with permission from John Wiley and Sons)

Fig. 4

Probe-based confocal laser endomicroscopy showing a normal esophageal tissue, b Barrett’s esophagus without dysplasia, c high-grade dysplasia and d carcinoma (Reprinted from Ref. [11], with permission from Elsevier)

Two main approaches have been used on the instrumentation side to improve white-light endoscopy. The first is narrow-band imaging (Fig. 3b), in which the red, green and blue spectral bands of the detected light are separated. Since these three colors have different effective penetration depths in tissue and are absorbed to different extent by hemoglobin, the resulting images have different appearance from the composite white-light image. The microvascular pattern of the tissue is particularly highlighted in the short-wavelength images. A different concept for spectroscopic imaging in endoscopy is the use of hyperspectral detection, in which the fine structure of the elastic scattering spectrum is utilized and is sensitive to the nuclear size distribution in cells, which may be altered in malignant transformation [9]. The second instrumentation approach is high-magnification endoscopy (endomicroscopy or microendoscopy), in which the objective is to generate images with spatial resolution comparable to microscopy (Fig. 4). While revealing microscopic features can be diagnostic, the limitation of this approach is that the field-of-view is very small (~300–600 µm), so that it is not practical for examining large surface areas.

There have been many different attempts to improve endoscopic diagnosis by utilizing other light-tissue interactions besides absorption and elastic scattering. The advantages and limitations of each are summarized in Table 2. The most fully developed is the use of fluorescence, i.e. the emission of longer wavelength light by molecules following electronic-state excitation by absorption of shorter-wavelength photons. This includes both the endogenous fluorescence of tissues (autofluorescence) and the use of exogenous fluorescent contrast agents. Tissues contain many fluorescent components in the UV-visible wavelength range, including structural proteins (e.g. collagen, elastin), and metabolic compounds (e.g. NADH, flavins), each of which may be altered either in concentration or distribution with disease. In addition, relatively simple changes such as thickening of the mucosal layer can cause differential wavelength-dependent absorption of fluorescence excitation and/or emission light, thereby altering the spectrum of detected fluorescence and providing disease-specific contrast. Autofluorescence endoscopy, with blue-light excitation and red-green ratiometric detection, was first developed for bronchoscopy [12, 13] and significantly improved the sensitivity for detection of precancerous lesions, but had relatively poor specificity. Hence, it was combined with narrow-band imaging in a commercial trimodal system to address both requirements [14, 15]. A further variant on autofluorescence detection is the use of fluorescence lifetime [16–18], exploiting the fact that this depends on the tissue microenvironment, providing a complementary contrast mechanism. This has been developed to date in point-spectroscopy mode using optical fiber probes but there are technical and cost barriers to performing rapid time-resolved fluorescence endoscopic imaging. Exogenous fluorophores, analogous to chromogenic stains, have been investigated widely in preclinical cancer models.
The basic concept is to provide disease-specific fluorescence contrast by using a ‘probe’ comprising a fluorescent reporter molecule and a biomarker-targeting moiety such as an antibody or peptide. This approach has been demonstrated in various clinical studies, as reviewed by Kim and Myung [19]. One of the fundamental issues is the extent to which the biomarker targeting or binding is affected by conjugation to the fluorophore. In addition, such approaches to ‘molecular endoscopy’, including nanoparticle-based methods as discussed below, require that the cancer biomarker be expressed on the cell surface so that they are easily accessible to the optical contrast agent. This can be a significant limitation. A variant on this approach has been the use of the fluorescent molecule protoporphyrin IX (PpIX) that is produced in cells upon oral administration of the prodrug aminolevulinic acid (ALA) and that has shown very high tumor-to-normal tissue contrast [20].

Table 2 Summary of other light-tissue interactions applied to endoscopy (without nanoparticles)

An alternative to fluorescence detection is to use inelastic (Raman) scattering of tissue. While very weak (~10⁷–10⁸-fold less than elastic scattering and ~10³–10⁴-fold less than fluorescence), Raman spectra have very narrow and multiple peaks corresponding to vibrational states of common molecular bonds in biomolecules, i.e. they provide very distinct spectral ‘fingerprints’ or ‘signatures’. Several studies using fiberoptic probes placed through the instrument channel of endoscopes have shown high sensitivity and specificity (typically >~90 %) for early cancer detection in several different organs, including the lung, GI tract and urinary bladder. Spectrally-resolved Raman microscopy of human tissues ex vivo has confirmed the molecular specificity of the Raman signals. Since a few seconds of signal integration are typically required to collect spectra of adequate signal-to-noise from tissue in vivo, endoscopic imaging is not practical. Coherent anti-Stokes Raman spectroscopy (CARS) or stimulated Raman spectroscopy (SRS) exploiting nonlinear light-molecule interactions may provide markedly greater signal-to-noise so that imaging, or at least coarse tissue surface mapping, may be feasible but have not been demonstrated clinically in endoscopic mode to date.

The third novel approach to endoscopy that has been developed to the stage of clinical trials [21] is optical coherence tomography (OCT), the optical analogue of high-frequency ultrasound, in which sub-surface cross-sectional images in tissue are generated using interferometric detection of singly-backscattered light. As in diffuse reflectance imaging, the signals depend on the tissue elastic scattering and absorption but in this case subsurface depth information is also obtained at high spatial resolution (~10 µm axially), albeit usually at a single (near-infrared) wavelength. Fiberoptic-based OCT probes for endoscopy have been developed that are used through an instrument channel of a conventional fiberscope or videoscope. The image is typically generated by rotating the probe tip at high speed as it is drawn back, generating a volumetric image along a length of the lumen, where the image is up to about 1.5–2 mm deep below the surface (Fig. 5a). Diagnosis then depends on visualizing microstructural changes in the tissue. Doppler-OCT has also been demonstrated to reveal the pattern of microvasculature and blood flow during GI endoscopy [22]. The ability of OCT to image below the mucosal surface is, in principle, a significant advantage, since it would allow direct visualization of whether or not there is submucosal invasion, which is a major determinant of whether minimally-invasive endoscopic mucosal resection can be used for curative treatment: the jury is still out on whether the maximum depth of OCT is in practice adequate for this purpose.

Fig. 5

Illustration of the concept of OCT-guided endoscopic biopsy (Reprinted from Ref. [23], with permission from Elsevier). OFDI: optical frequency domain imaging

Another emerging optical technology that is applicable endoscopically is photoacoustic imaging (PAI), in which acoustic waves (usually in the tens of MHz range) are generated by the thermoelastic expansion of tissue following absorption of ns laser pulses. The resulting ultrasound images represent the spatial distribution of optical absorption at the laser wavelength, which may be scanned to image the hemoglobin and oxyhemoglobin in the tissue (micro)-vasculature: an example is shown in Fig. 6. The spatial resolution is set by the ultrasound frequency, while the depth of imaging and the image contrast are determined by the diffuse (multiply-scattered) light field and optical wavelength, respectively. Endoscopic ultrasound is already widely available, so that we can expect that clinical endoscopic PAI will be developed by incorporating pulsed laser light delivery into the latter or by developing integrated photoacoustic probes de novo, as has been reported in preclinical studies [24, 25]. Exogenous contrast may be provided as for conventional (optical absorption-based) endoscopy.

Fig. 6

Example of photoacoustic imaging of microvasculature at two different wavelengths in (porcine) rectal tissue ex vivo, imaged from the mucosal surface. The microvasculature signal is stronger at 750 nm (a) where the absorption of hemoglobin is higher than at 824 nm (b). The tissue layers are evident. The white square outlines a region beyond the rectal wall containing porphysome nanoparticles that have high absorption at 824 nm but low absorption at 750 nm [26]. Supported by Terry Fox Research Institute grant # 1022

Finally, there are a number of non-linear optical imaging techniques that may provide complementary information on tissue structure or status and can, in principle, be implemented in endoscopic mode. These include second- and third-harmonic generation, CARS and SRS (as mentioned above) and multi-photon fluorescence. These have all been demonstrated or are in semi-routine use in microscopy mode [27] but are only recently starting to penetrate into endoscopy. A system based on the scanning-fiber endoscope and incorporating the two types of coherent Raman scattering imaging has been reported and demonstrated chemical selectivity in distinguishing lipid and protein-rich areas in a biological sample using the SRS method by tuning the laser system to match the CH₂ and CH₃ stretching modes [28]. A particular challenge with all these non-linear techniques is to ensure high-fidelity insertion and propagation of ultrashort (fs) laser pulses in optical fibers. In addition, the specialized laser sources are still relatively bulky and expensive for routine clinical use.

It is important to emphasize that none of the above established or emerging endoscopy techniques, either alone or in combination,

  • has fully addressed some critical unmet clinical needs in early cancer detection, staging, and treatment delivery/guidance/monitoring, or

  • fully utilizes the emerging omics revolution in personalized (cancer) medicine.

Hence, in the remainder of this chapter we will examine the extent to which nanoparticles may contribute to addressing each challenge, not only by providing high image contrast and so improving diagnostic sensitivity or specificity, but also for enhancing therapeutic endoscopic techniques.

2 Optical Nanoparticles for Bio-applications

In this section we will give a brief overview of optically-active nanoparticles (NPs) before considering how these may be used in the endoscopic setting. According to the US National Institutes of Health, nanoparticles are defined as having (biologically-effective) diameters in the range between 1 nm (about the size of a glucose molecule) and 100 nm (about the size of a virus and 100 times smaller than the diameter of a typical cancer cell). A key characteristic of nanoparticles is that their physical properties (mechanical, electrical, magnetic, optical) are often distinctly different from those of the corresponding bulk material. Thus, for example, gold NPs show optical absorption spectra that depend markedly on the NP size and shape and so can be fine-tuned across the visible spectrum to provide optimum optical contrast. For biomedical applications, NPs are conveniently divided into ‘hard’ and ‘soft’, depending primarily on whether they comprise inorganic (e.g. metal, semiconductor, ceramic, carbon) or organic (e.g. lipid, polymer, lipoprotein) materials. For optical NPs, i.e. those exhibiting detectable interactions with light, examples of the former are quantum dots (QDs) and gold NPs, while examples of the latter are liposomes, co-block polymers, nanocapsules and dendrimers incorporating fluorescent or Raman-active molecules. Table 3 summarizes the potential advantages of optical NPs compared to molecular “dyes” (i.e. optical agents that are not NP based) for clinical applications, including but not limited to endoscopy. These advantages and their general implications are as follows.

Table 3 Benefits of optical nanoparticles compared to molecular dyes for clinical applications

High payload. High loading can be achieved with several types of NPs, i.e. the NPs can carry a significant amount of active diagnostic or therapeutic agents such as molecules, genes or even viruses. For example, molecules may be loaded into liposomal nanoparticles, either by incorporating them into the core or by attaching them to the surface, with a large number of molecules per NP. The NPs can then be targeted (e.g., by antibodies or peptides), so that the delivery of the active agent depends on the nanoparticle rather than on the pharmacological and targeting properties of the agent itself. Thereby, the delivery of the active agent to the target cells or tissues and its biological activity are uncoupled so that they may be independently optimized. Consequently, the dose of the agent can be reduced, minimizing the cost and potential toxicity while retaining sensitivity and specificity. Liposomal delivery is already in clinical use for some therapeutic drugs, including photosensitizers for light-activated photodynamic treatments [29] and, as previously mentioned, also overcomes the problem of poor water solubility of many drugs. For endoscopy, one can envisage using such NPs to deliver optical contrast agents to the tissue, either by systemic (e.g. intravenous) or local topical administration (sprayed onto the tissue surface as with existing chromoendoscopy stains).

Strong and tunable intrinsic optical signal. In some optically-active NPs the optical signal is very strong and can be tuned by varying the material properties such as size and composition. This is the case with QDs, for example, that have very bright fluorescence emission compared to most molecular fluorophores. Furthermore, their emission spectrum depends strongly on the NP size and material: e.g. for QDs comprising a CdSe core with a ZnS shell the peak luminescence emission wavelength is around 480 nm for a 2.5 nm diameter core and increases to around 640 nm for a 6.3 nm diameter core. It is also important for quantitative imaging and diagnosis using biomarker information that the NPs are unaffected by the tissue environment.

Optical signal amplification. With metal NPs the optical signal of reporter molecules attached to their surface may be significantly amplified compared to their native signal, due to plasmonic effects, as illustrated in Fig. 7. In essence the electric field of the light induces motion of the free valence electrons in the metal which, in turn, increases the local electric field and amplifies the optical signal from the molecules. This effect is particularly strong when the wavelength of the light is comparable to the diameter of the NPs, since a resonant condition is then set up. There is an optimal distance between the metal NP surface and the biomolecule (typically ~10 nm) to achieve the greatest amplification. NP-enhanced fluorescence has been reported [30–32], although this may not be so useful for endoscopic imaging, since generally the fluorescence signal of molecular dyes is strong enough. However, plasmonic amplification is a distinct advantage for Raman imaging, where the Raman signal from biomolecules is typically a thousand-fold weaker than fluorescence. Hence, for example, attaching a biomolecule to the surface of a metal nanoparticle can increase the Raman signal of the biomolecule by a very large factor: this has been cited as up to 10¹⁴–10¹⁵ under special conditions [33] but is more typically ~10⁴-fold. Nevertheless, this is still substantial enough that SERS (surface enhanced Raman scattering) imaging in vivo becomes possible with clinically-practical integration time and light exposure. In fact, SERS NPs have been reported to produce a Raman signal in vivo that is 200 times stronger than the fluorescence signal from the same number of near-infrared-emitting QDs [34].
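
As a rough order-of-magnitude check of this argument, the short Python sketch below divides a typical plasmonic enhancement (about 10⁴) by the approximately thousand-fold native deficit of Raman relative to fluorescence. The function name is hypothetical, and the calculation is a deliberate simplification: it ignores differences in particle concentration, excitation wavelength, tissue attenuation and detection efficiency.

```python
def sers_to_fluorescence_ratio(raman_deficit, sers_enhancement):
    """Estimate the enhanced Raman signal relative to fluorescence.

    raman_deficit    -- factor by which native Raman is weaker than fluorescence
    sers_enhancement -- plasmonic amplification factor of the Raman signal
    """
    if raman_deficit <= 0 or sers_enhancement <= 0:
        raise ValueError("both factors must be positive")
    return sers_enhancement / raman_deficit


# Values taken from the text: ~10^3 deficit, ~10^4 typical enhancement.
ratio = sers_to_fluorescence_ratio(raman_deficit=1e3, sers_enhancement=1e4)
print(f"enhanced Raman / fluorescence ~ {ratio:.0f}x")
```

Under these assumptions the enhanced Raman signal is of the same order as, or somewhat above, a fluorescence signal, which is consistent with SERS imaging becoming practical in vivo. The simple ratio is not intended to reproduce the 200-fold in vivo figure reported in [34].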

Fig. 7

Schematic of the plasmonic enhancement effect in metal nanoparticles where the free electrons experience oscillations when excited by an electric field (Courtesy of Vicken Boyrazian)

Lack of photobleaching. Many optically-active biomolecules degrade under light exposure, so that their optical signal fades. This is a particular problem with many exogenous fluorophores and limits the excitation power and/or imaging time that can be used. However, photobleaching is minimal with many nanoparticles. For example, NPs such as QDs or luminescent nanocrystals do not photodegrade (indeed, they may even get brighter), while in SERS NPs the reporter molecules are minimally affected by the near-infrared light that is typically used. This photostability allows the imaging to be quantitative, i.e. the optical signal is proportional to the concentration of the NPs in the tissue, so that either the relative concentration (between differently-targeted NPs or for the same NPs over time) can be compared or, with suitable calibration procedures, the absolute concentration can be measured. This capability becomes particularly valuable when the NPs are targeted to specific biomarkers whose level of expression in the tissue (e.g. tumor) is predictive of the response to specific treatments.

Broad excitation spectrum. Several types of nanoparticles, including QDs and SERS NPs, have the additional advantage that a single wavelength can be used for excitation, independent of the emission (detection) wavelength selected. In practice this means that a single excitation light source can be used for multiplexed imaging, thereby reducing the instrumentation cost and complexity and ensuring similar effective tissue-sampling depth of the light.

Narrow line widths. A further characteristic of some NPs that can be exploited for biomedical imaging is that the emission spectra can be narrow. For example, for QDs the intrinsic fluorescence peaks are typically ~20–30 nm wide, compared to 50–100 nm for most organic fluorophores. Similarly, SERS NPs provide typical Raman spectra with multiple, molecular bond-dependent peaks that are only 1–2 nm wide. This opens the possibility of image multiplexing, i.e. the use of a cocktail of NPs of different ‘color’, each targeted to a different biomarker, representing one approach to imaging omics. We will illustrate this below for the case of multiplexed SERS endoscopy, while Fig. 8 shows an example using different QDs for multiplexed staining of tumor tissue: combined with the ability to quantify the QD signal because of lack of photobleaching, this enables the expression levels of several biomarkers to be imaged simultaneously [35].
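
The multiplexing idea can be sketched as linear spectral unmixing: a measured spectrum is modeled as a weighted sum of known reference spectra, one per NP ‘color’, and the weights are recovered by least squares. The Python example below is only an illustration under stated assumptions: the two Gaussian emission peaks (centered at 525 nm and 605 nm with 25 nm full width at half maximum), the function names and the noise-free synthetic measurement are all hypothetical and are not taken from Ref. [35].

```python
import math


def gaussian_peak(wavelengths, center, fwhm):
    """Unit-height Gaussian emission peak sampled at the given wavelengths (nm)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((w - center) / sigma) ** 2) for w in wavelengths]


def unmix_two(measured, ref_a, ref_b):
    """Least-squares weights (a, b) such that measured ~ a*ref_a + b*ref_b."""
    saa = sum(x * x for x in ref_a)
    sbb = sum(y * y for y in ref_b)
    sab = sum(x * y for x, y in zip(ref_a, ref_b))
    sam = sum(x * m for x, m in zip(ref_a, measured))
    sbm = sum(y * m for y, m in zip(ref_b, measured))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("reference spectra are not linearly independent")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (sam * sbb - sbm * sab) / det
    b = (sbm * saa - sam * sab) / det
    return a, b


# Hypothetical QD reference spectra, sampled every 2 nm from 450 to 700 nm.
wavelengths = list(range(450, 701, 2))
ref_green = gaussian_peak(wavelengths, center=525.0, fwhm=25.0)
ref_red = gaussian_peak(wavelengths, center=605.0, fwhm=25.0)

# Synthetic, noise-free "measurement": 3.0 parts green QD, 1.5 parts red QD.
measured = [3.0 * g + 1.5 * r for g, r in zip(ref_green, ref_red)]

est_green, est_red = unmix_two(measured, ref_green, ref_red)
print(round(est_green, 6), round(est_red, 6))
```

Because the assumed peaks are narrow and well separated, the two reference spectra barely overlap and the weights are recovered almost exactly; with broader, overlapping emission bands the same calculation becomes more sensitive to noise, which is one reason narrow line widths favor multiplexing.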

Fig. 8

Quantification of multiplexed QD signal in tissue microarrays. a Conventional H&E staining. b Fluorescence staining for the disease markers EGFR and E-cadherin using QD-antibody conjugates, with each target defined by different spectral emission peaks. DAPI (4′,6-diamidino-2-phenylindole) staining locates the cell nuclei in blue. c Fluorescence intensity histogram for EGFR and E-cadherin, normalized to epithelial cytokeratin expression, showing the ability to quantify the biomarker expression (Reprinted from Ref. [35], with permission from American Chemical Society, Copyright 2006)

Multimodality and multifunctionality. An important characteristic of some NPs is that they can serve multiple purposes at the same time, namely either two or more imaging modalities or a combination of imaging with therapy (‘theranostics’). Thus, one can utilize the intrinsic properties of a nanoparticle, such as its fluorescence, in order to image its uptake and biodistribution or to monitor its targeting to tumor for detection or localization, together with its ability to carry a (non-fluorescent) drug for treatment. It should be noted that the functionality can be both optical and non-optical and further examples will be given below. In the context of endoscopy in cancer, this translates into the notion of <detect, decide, destroy> i.e. detect/localize the tumor at its earliest stage, decide if and how it should be treated and, if suitable for localized endoscopic therapy, destroy it in situ. For each step NPs provide specific capabilities to improve the efficacy and/or safety.

3 Applications of Nanoparticles to Endoscopy

Many of the concepts above are applicable across a range of biomedical imaging applications, both preclinical and clinical. Here, we will consider these in the context of the limitations of endoscopy and the different approaches that are being explored to overcome these. It will be seen that introducing NPs into endoscopy may not be as simple as just administering them to the patient, since significant modifications may also be required to the endoscopic instrumentation, both hardware and analysis software, in order to match the optical characteristics of the NPs. The principles and practice will be illustrated by three examples of ongoing developments that will be presented in some detail. First, however, we will consider how NPs may be used with each of the above optical imaging modalities that, in turn, might be implemented endoscopically.

3.1 Nanoparticles in Optical Imaging Modalities

3.1.1 NPs in White-Light Endoscopy

Optically-active nanoparticles, with or without biomarker targeting, could be used as alternatives to conventional, dye-based chromogenic stains and potentially could offer higher lesion contrast through either having narrower absorption line width and/or higher optical cross-section. However, to our knowledge, this has not been pursued to date.

3.1.2 NPs in Fluorescence Endoscopy

There are many studies using fluorescent NPs for bioimaging in cells, in tissues ex vivo and in vivo. These include NPs that are intrinsically fluorescent such as QDs [36, 37], as well as a range of different organic and inorganic NPs carrying fluorophores within them or attached to the NP surface. The primary advantages compared with fluorescent dyes are as above, namely potential for high loading, high brightness and/or low photobleaching, and decoupling of the fluorophore targeting/delivery from the fluorophore molecular structure and properties. Fluorescence endoscopy is already an established and commercialized technology based on tissue autofluorescence [14], although it is not widely used, due in large part to its relatively poor specificity for early cancer detection. Fluorescent dyes such as indocyanine green (ICG) and fluorescein are also already used in the clinic for multiple non-endoscopic applications. Extension to fluorescent nanoparticles would be most straightforward in terms of endoscopic instrumentation if the NPs were activated and emitted in the visible spectral range. NP fluorescence multiplexing should be possible, within the limitations of the spectral bandwidths, but would require either adding extra cameras to the endoscope or fast switching between wavelengths [38].

3.1.3 NPs in Endoscopic OCT

The primary mode of action that has been explored to date for NP-based OCT image contrast is to enhance the light absorption in the tissue using gold NPs [39, 40]. This compensates for the fact that the intrinsic tissue contrast in OCT can be rather subtle and difficult to interpret, so that exogenous contrast could be helpful, especially if it provided a degree of tumor selectivity. However, from the endoscopic perspective, increasing the local absorption in order to increase contrast comes at the price of reducing the depth of tissue that can be imaged. This is already marginal for some endoscopic applications, particularly in cancer staging where a major clinical requirement is to determine if there is submucosal tumor invasion. Hence, it is not clear if there is a role for NP-enabled endoscopic OCT, except for special clinical problems where the imaging depth is not limiting.

3.1.4 NPs in Photoacoustic Endoscopy

The primary mechanism of photoacoustic imaging is optical absorption. At present, intrinsic photoacoustic imaging based on the endogenous Hb and HbO2 contrast uses visible laser pulses, so that adding NP contrast is technologically easiest if the NP absorption is also in this wavelength range. The NP absorbance (the product of the absorption coefficient and concentration in tissue) then needs to be higher than that of the intrinsic tissue in order to distinguish the NP contrast from the tissue background. Narrow absorption peaks would then be advantageous. Photoacoustic contrast has been reported using gold nanospheres [41] and nanorods [42], porphysomes (all-organic NPs, discussed below) [43], carbon nanotubes [44] and magnetic iron oxide NPs [45]. Nanomicelles composed of hydrophobic naphthalocyanine dyes have recently been shown to withstand the conditions in the digestive tract and avoid systemic absorption to provide good endoscopic contrast [46]. Since the acoustic signals are generated by absorption of the diffusely-scattered light, the photoacoustic imaging depth can be up to several cm in tissue, depending on the acoustic frequency range used. Hence, the loss of depth by adding NP absorption contrast is not likely to be limiting for endoscopic applications, and the primary challenge lies in designing and fabricating photoacoustic light delivery and transducers packaged in a small-footprint endoscopic format [47].
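The absorbance condition above can be made concrete with a short calculation. The following is a minimal sketch, assuming the NP absorption coefficient is obtained from a decadic molar extinction coefficient and a local molar concentration; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def absorption_coefficient(molar_extinction, concentration):
    """Absorption coefficient mu_a (cm^-1).

    molar_extinction: decadic molar extinction coefficient (cm^-1 M^-1)
    concentration:    local NP concentration in tissue (M)
    The factor ln(10) converts from base-10 to natural-log attenuation.
    """
    return math.log(10) * molar_extinction * concentration

def np_to_tissue_contrast(molar_extinction, concentration, tissue_mu_a):
    """Ratio of NP absorption to background tissue absorption (dimensionless)."""
    return absorption_coefficient(molar_extinction, concentration) / tissue_mu_a

# Illustrative (assumed) values: extinction 1e8 cm^-1 M^-1, tissue mu_a 0.5 cm^-1.
low = np_to_tissue_contrast(1e8, 1e-9, 0.5)    # 1 nM of NPs
high = np_to_tissue_contrast(1e8, 1e-8, 0.5)   # 10 nM of NPs
```

With these assumed numbers the ratio is below 1 at 1 nM and above 1 at 10 nM. This simply restates that the NP contrast becomes distinguishable only once the product of extinction and concentration exceeds the tissue background.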

3.1.5 NPs in Non-linear Optical Endoscopy

Upconverting nanoparticles (UCNPs) have favorable characteristics to be used as nanoprobes in medical imaging applications. Upconversion results from the sequential absorption of multiple photons, followed by the emission of a photon with higher energy (shorter wavelength) through a non-linear optical process. As an example, lanthanide (Ln3+)-doped nanoparticles can absorb in the near infrared (NIR) and emit in the visible. This takes advantage of lower absorption and elastic scattering of tissues in the NIR, allowing for deeper imaging. There is also a significant advantage over molecular fluorescent dyes that are excited in the visible range where the tissue autofluorescence background can be significant, so requiring higher concentration of the fluorescent tracer. Compared to QDs and gold nanorods, which can also be excited using two-photon absorption, upconverting NP excitation occurs via real electronic rather than virtual states, thereby increasing the activation cross section. This reduces the excitation power needed to achieve adequate signal and eliminates the need for short-pulse lasers, so that inexpensive CW diode lasers can be used [48]. Current research towards safe human use of lanthanide UCNPs has shown low toxicity compared to heavy metal-based QDs [49].

We will now discuss three specific NP-endoscopy topics in more detail to illustrate some of the principles, opportunities and challenges.

3.2 Example: Surface Enhanced Raman Scattering (SERS) Endoscopy

Several groups are developing SERS-based endoscopic imaging. The primary motivation is to achieve biomarker-targeted, multiplexed imaging by exploiting the distinctive spectral signatures of Raman-active molecules. The concept is illustrated in Figs. 9 and 10 and comprises the following steps:

  (i) Gold NPs are prepared at a specific size and with surface modification or coating suitable for the attachment of a large number of both Raman-active molecules and tumor-targeting moieties such as antibodies.

  (ii) The NPs are coated with specific molecules, M1, having a known Raman spectrum and then conjugated to specific targeting antibodies, A1.

  (iii) This is repeated for other reporter-antibody pairs, M2+A2, M3+A3, etc.

  (iv) These are mixed in proportions that vary inversely with the expected expression levels of the antigens in the target tumor tissue, taking into account also the relative Raman signal strength and the SERS amplification factor of each reporter molecule.

  (v) This cocktail is administered systemically or, for endoscopy, more likely sprayed topically onto the tissue surface through the endoscope instrument channel.

  (vi) After allowing time for the antibodies to bind to the tumor cells, the tissue surface is washed.

  (vii) Multiplexed imaging is performed (see below).
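The mixing rule in step (iv) can be sketched numerically. This is a minimal illustration, assuming that the expected signal from each reporter scales as the product of antigen expression, relative Raman signal strength, amplification factor and mixing fraction, and that the aim is a roughly equal expected signal from each reporter. The function name and the relative values are hypothetical.

```python
def cocktail_fractions(expression, raman_strength, amplification):
    """Mixing fractions that aim to equalize the expected signal per reporter.

    Assumption (not from the source): signal_i is proportional to
    expression[i] * raman_strength[i] * amplification[i] * fraction[i].
    """
    if not (len(expression) == len(raman_strength) == len(amplification)):
        raise ValueError("all inputs must have the same length")
    # Signal generated per unit of NP concentration for each reporter-antibody pair.
    per_unit = [e * r * a for e, r, a in zip(expression, raman_strength, amplification)]
    if any(p <= 0 for p in per_unit):
        raise ValueError("all factors must be positive")
    # Inverse weighting, then normalization so that the fractions sum to 1.
    weights = [1.0 / p for p in per_unit]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical relative values for three pairs M1+A1, M2+A2, M3+A3.
fractions = cocktail_fractions(
    expression=[4.0, 2.0, 1.0],
    raman_strength=[1.0, 1.0, 2.0],
    amplification=[1.0, 1.0, 1.0],
)
```

In this example the most strongly expressed antigen receives the smallest share of the cocktail. The second and third pairs receive equal shares because the weaker expression of the third is offset by its stronger Raman signal.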

Fig. 9

Schematic of the preparation steps for conjugation of Raman-active gold nanoparticles to antibodies. In this example, a fluorescent reporter is also conjugated to the gold NPs for validation of the SERS-NP binding by flow cytometry (Reprinted from Ref. [50], with permission from World Scientific Publishing Co Pte Ltd, Copyright 2014)

Fig. 10

Example of SERS imaging in tissue. Left Photograph of tumors (SkBr3-breast adenocarcinoma, A431-epidermoid carcinoma). Right Ratiometric SERS images of tumor and surrounding normal tissue using two different antibody-conjugated SERS NPs targeting EGFR and HER2 and a control NP showing corresponding antigen expression on the tumor surfaces. Green—EGFR targeted versus non targeted NPs, red—HER2 targeted versus non targeted NPs (Reprinted from Ref. [50], with permission from World Scientific Publishing Co Pte Ltd, Copyright 2014)

We note that there is considerable materials science and chemistry involved in steps (i) and (ii) to ensure that (a) the maximum SERS signal is achieved, (b) the attached Raman reporters and antibodies remain optically active and retain their antigen targeting, respectively, (c) the formulation is stable and does not aggregate and, for clinical applications, (d) the formulation is suitable for in vivo administration, can be sterilized and has acceptable toxicity. The cost is also a significant factor, particularly if systemic administration is required, as discussed below. The antibodies (or other targeting moieties such as peptides or aptamers) are selected for their target tumor specificity, including both positive and negative controls.

The SERS spectral signal, S(ω), where ω is the wavenumber, from any point (x, y) on or near the imaged tissue surface (ignoring any depth dependence, as would be valid, for example, with topical application of the NPs) is then proportional to the sum over the reporters, i, of the intrinsic Raman spectrum, Ri(ω), of each reporter molecule, multiplied by its SERS amplification factor, AFi(ω), and weighted by the concentration, Ci(x, y), of the corresponding NPs bound to the target cells:

$$ {\text{S}}(\omega,{\text{x}},{\text{y}}) \propto \sum\limits_{\text{i}} {\text{R}}_{\text{i}} (\omega) \cdot {\text{AF}}_{\text{i}}(\omega) \cdot {\text{C}}_{\text{i}} \left( {{\text{x}},{\text{y}}} \right) $$
(1)

Equation (1) assumes that the measured spectrum, S(ω), has been corrected for the spectral sensitivity of the imaging system or that the basis spectra, Ri(ω), are measured using the same instrument. In the case of full-spectral imaging, in order to analyze the data and obtain images of the separate constituents of the cocktail, the measured spectrum in each image pixel is fitted to the weighted sum of the basis spectra of each reporter, varying the weights to get the best fit. These weights then represent the relative concentrations of the NPs, which in turn relate to the binding of these NPs to the tissue biomarkers (e.g. the antigen corresponding to each antibody). A ratiometric approach must be used to account for factors such as random pooling of the nanoparticles during staining and the varying working distance of the endoscope in order to quantitatively compare the detection of the bound antibody from the acquired signal [50, 51]. The ratio of the targeted NPs to the untargeted NPs is used as diagnostic information that determines the distribution of biomarkers in the tissue, classifying it as normal or diseased.
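The per-pixel fitting and the ratiometric step can be illustrated with a short sketch. It assumes ordinary (unconstrained) least squares solved through the normal equations, written in plain Python. A practical implementation might instead use a non-negative solver and a background term, neither of which is shown here. The basis spectra, function names and pixel values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("basis spectra are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def unmix(spectrum, basis):
    """Weights w minimizing |spectrum - sum_i w[i] * basis[i]|^2."""
    k = len(basis)
    ata = [[sum(p * q for p, q in zip(basis[i], basis[j])) for j in range(k)]
           for i in range(k)]
    atb = [sum(p * s for p, s in zip(basis[i], spectrum)) for i in range(k)]
    return solve_linear(ata, atb)

def targeted_ratio(weights, targeted, control):
    """Ratio of targeted to untargeted (control) NP weights for one pixel."""
    return weights[targeted] / weights[control]

# Hypothetical 5-channel basis spectra measured on the same instrument.
basis = [
    [0.0, 1.0, 0.2, 0.0, 0.0],  # reporter on the targeted NPs
    [0.0, 0.0, 0.3, 1.0, 0.1],  # reporter on the untargeted control NPs
]
# Synthetic noise-free pixel: 3.0 parts targeted, 1.5 parts control.
pixel = [3.0 * t + 1.5 * c for t, c in zip(*basis)]
weights = unmix(pixel, basis)
ratio = targeted_ratio(weights, targeted=0, control=1)
```

Because both weights scale with the same illumination and collection efficiency, their ratio is insensitive to working distance, which is the motivation for the ratiometric approach. For this synthetic pixel the recovered ratio is close to 2.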

An alternative to measuring the full SERS spectrum from the tissue and performing this spectral decomposition is to identify, for each reporter molecule, a Raman peak that is significant in that reporter only. Sequential images are then taken using a different narrow-band filter corresponding to a separate peak for each Raman reporter molecule. In practice, the accuracy of measuring the relative concentration of each constituent can be improved by also imaging in a narrow band on either side of each peak in order to subtract the background signal. Figure 11 shows this approach.
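The background subtraction using flanking bands can be sketched as follows. The sketch assumes that the background under each peak is well approximated by the mean of the two neighboring narrow-band images and that negative results are clipped to zero. Both are illustrative choices rather than the published method, and the function names and pixel values are hypothetical.

```python
def background_subtracted_peak(peak, left, right):
    """Peak-band intensity minus the mean of the two flanking bands, clipped at zero."""
    background = 0.5 * (left + right)
    return max(peak - background, 0.0)

def peak_image(peak_img, left_img, right_img):
    """Apply the subtraction pixel by pixel to three images of equal size."""
    return [
        [background_subtracted_peak(p, l, r) for p, l, r in zip(p_row, l_row, r_row)]
        for p_row, l_row, r_row in zip(peak_img, left_img, right_img)
    ]

# Hypothetical 1 x 2 pixel images for one reporter peak and its two side bands.
corrected = peak_image(
    peak_img=[[10.0, 4.0]],
    left_img=[[3.0, 4.0]],
    right_img=[[5.0, 6.0]],
)
```

In this example the first pixel keeps a net signal of 6.0, while the second pixel, whose peak band lies below the estimated background, is set to 0.0.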

Fig. 11

Images of a surgically-exposed lung xenograft tumor, simulating SERS endoscopy. The tumor was stained by topical application of a cocktail containing SERS NPs with three different Raman reporter molecules and either EGFR antibodies or a control antibody. a White-light image, analogous to conventional endoscopy: part of the tumor is apparent. b, c EGFR-targeted NP false-color images using two different Raman spectral peaks unique to each reporter. Note the similarity in the images and the extension of the tumor boundary that is not seen in (a). d SERS image using non-specific control NPs (Reprinted from Ref. [52], with permission from Future Medicine Ltd, Copyright 2014)

To date, up to 10-plex SERS imaging has been demonstrated in model systems and detection of NP concentrations in vivo as low as tens of picomoles has been achieved, including endoscopically [53]. This is in the range corresponding to realistic levels of biomarker expression, i.e. there are enough antigens on tumor cell surfaces and enough tumor cells per tissue volume/surface area that multiplexed endoscopic SERS imaging of small tumors should be feasible. It has also been shown that this multiplexed imaging can be quantitative [54], i.e. the relative concentrations of the different NP constituents can be determined as a measure of the concentration of NPs bound to each target biomarker: again this would be expected for non-photobleachable NPs where the background signal subtraction is accurate and there is no saturation of the detector.

In terms of the corresponding instrumentation for multiplexed SERS endoscopy, novel instruments or significant modifications to standard white-light endoscopes are needed. The first reason for this is that SERS typically employs near-infrared wavelengths around 800 nm for excitation and detection, but standard endoscopes have poor performance in this spectral range. Secondly, as indicated above, either full spectral detection of the scattered light or multiple narrow-band detection is required, neither of which is employed in current clinical endoscopes. Examples of prototype instruments that have been reported are summarized below. It should be emphasized that these approaches are at the technological cutting edge of endoscopic imaging, so that they are not yet in clinical practice.

  (a) A conventional fiberoptic endoscope has been modified to improve the near-infrared transmission, with the addition of a narrow-band tunable filter and a near-infrared laser source. The filter can rapidly ‘hop’ between the selected peaks of each SERS reporter to generate the corresponding images of each constituent [54]. The challenge will be in correctly accounting for the background fluorescence signal, which can fluctuate depending on tissue type.

  (b) A scanning-fiber endoscope [4] is fitted with a near-infrared laser source and a high-performance Raman spectrometer and CCD detector to perform full-spectrum imaging in the stationary mode. It can be equipped with ultra-narrow tunable filters for light collection at distinct Raman bands as the central fiber is scanned rapidly by a piezoelectric actuator across the tissue surface in a spiral pattern to form an image. Only a small fraction of the Raman scattered light is collected by the stationary ring of optical fibers, which might become a limitation in achieving adequate signal intensity.

  (c) An endoscope is equipped with a central single-mode fiber for illumination, surrounded by multimode fibers for collection of the scattered Raman light. Full-spectrum imaging is performed and spectral demultiplexing is done using a least-squares algorithm. A glass prism can be employed at the tip of the probe to maintain a fixed working distance and angle [50]; alternatively, it has been shown that collimating the excitation light eliminates the dependence of the resolution and illumination power on the working distance [51]. The limitation of this technique is that only point measurements are taken and image formation is through raster scanning.

3.3 Example: Theranostic Endoscopy Using All-Organic Nanoparticles

SERS endoscopy, as above, illustrates the use of hard NPs with specific optical signatures for the purpose of tumor detection with multiple biomarker sensitivity. It has required the development of both the nanoparticle technology and the optical instrumentation. The second example discussed here exploits a different novel class of soft multifunctional nanoparticles that can utilize existing endoscopic devices. As mentioned previously, liposomes are a well-known class of NPs comprising a lipid bilayer surrounding a liquid core. They offer a two-fold clinical benefit. Firstly, they can be targeted by conjugating antibodies or other agents to the outer surface, allowing for higher concentration at the target. Secondly, they can carry different therapeutic payloads within the core or attached to the surface. While the payload can confer optical activity, liposomes are generally optically silent. This can be markedly changed by using lipid-porphyrin conjugates that self-assemble into spherical NPs (porphysomes [43]) that have very high porphyrin content (~80,000 molecules per NP). Porphyrins comprise a large class of optically-active molecules, many of which perform essential biological functions. For example, exogenously administered or endogenously synthesized, they are used in cancer detection by their characteristic red fluorescence under blue-green excitation and in photodynamic therapy of cancer and other pathologies through the photo-generation of excited singlet-state oxygen [55]. These imaging and treatment methods frequently include endoscopic implementation.

Porphysomes have both optical and non-optical functionality. The latter includes chelation of ⁶⁴Cu or Mn ions to the porphyrin molecules prior to self-assembly of the NPs, enabling positron emission tomography (PET) or magnetic resonance imaging (MRI), respectively, to track the uptake and biodistribution [56]. This can help guide the endoscopic imaging or treatment by identifying the optimum time interval for tumor uptake of the NPs. Porphysomes have optical functionality through their extremely strong red/near-infrared absorption that is comparable to that of metal NPs (~10⁸ cm⁻¹ M⁻¹). This provides high photoacoustic imaging contrast (Fig. 12b). The high optical absorption can also be used to enhance and spatially confine photothermal treatments (Fig. 12a). The second optical function comes from the fluorescent properties of the porphysomes. Fully-formed porphysomes are not fluorescent due to the quenching from the high packing density of the porphyrin molecules. Upon cell uptake, the NPs are dissociated and the porphyrins are unquenched, so that fluorescence imaging can be used to visualize cellular uptake (Fig. 12c, d).

Fig. 12

Examples of the use of porphysome NPs. a Photothermal transduction: thermal imaging of solutions irradiated with a 673 nm laser showing comparable temperature rise with porphysomes and gold NPs. b Ratio of photoacoustic amplitudes for porphysomes and methylene blue (control) as a function of wavelength, showing the porphysome absorption peak. c Photoacoustic images of tubing containing porphysomes and PBS. Upon the addition of detergent the porphysomes dissociate and the photoacoustic signal decreases. d In vivo photoacoustic and fluorescence images with porphysomes: top lymph node imaging before and after intradermal injection of 2.3 pM of porphysomes with secondary lymph vessels (cyan), lymph node (red) and inflowing lymph vessel (yellow); bottom fluorescence activation at 15 min and 2 h after i.v. injection of porphysomes (7.5 pmol) in a tumor-bearing mouse (Reprinted from Ref. [43], with permission from Macmillan Publishers Ltd, Copyright 2011)

These optical functions can be implemented endoscopically. Thus, for example, endoscopic photoacoustic imaging has been demonstrated pre-clinically [57] and is under development for clinical applications. Compared to other endoscopic techniques, it would allow imaging to significant depths beyond the lumen, either to determine if there is submucosal extension of disease or to detect local lymph node involvement, both of which are critical for accurate staging of patients to enable appropriate treatment decisions. Figure 6b showed the ability to image these NPs beyond the full-thickness of a hollow organ. Fluorescence endoscopy could then be used to confirm intracellular uptake of the NPs by the tumor, while subsequent photothermal treatment could also be delivered endoscopically. For example, optical fibers placed through the instrument channel of a thin fiber endoscope could be inserted into a tumor mass, say in the peripheral lung, followed by delivery of near-infrared laser energy to destroy the tumor mass by heat-induced coagulative necrosis [58]. If the porphysomes, or other NPs with high NIR absorption, are preferentially located in the target tumor, then they would serve to enhance the local laser energy deposition and confine the thermal damage to the tumor.

3.4 Example: Nanoparticles in Endoscopic Photodynamic Therapy

Photodynamic therapy (PDT) is a technique for treating tumors and non-malignant conditions using light and a photosensitizing agent in the presence of oxygen. It is the first drug-device combination approved by the US Food and Drug Administration and comprises systemic or local (e.g. topical) application of a photosensitizing agent that is able to form reactive oxygen species such as singlet oxygen, ¹O₂, upon excitation by a specific wavelength. Fiber-optic devices are often used for light delivery to the tumor region, with various forms of light distributor to match the target tumor size and shape, making PDT well suited to endoscopic applications through the accessory port. PDT has been used endoscopically to treat cancers of different stages, from pre-malignant lesions through to invasive and obstructing tumor masses, in most hollow organs, including the gastrointestinal tract, bronchus, bladder and cervix. It is an approved modality for several of these indications. For example, in patients with Barrett’s esophagus, the normal mucosal lining is inflamed and transformed due to chronic gastric acid reflux, which may lead to the formation of precancerous lesions (dysplasia). PDT light is delivered through an inflatable balloon device with a central diffusing fiber placed through the endoscope and offers a minimally-invasive way to destroy the abnormal mucosal layer to allow regrowth of normal mucosa. Clinical trials using porfimer sodium photosensitizer showed complete remission in 87 % of patients, with 5-year survival of 76 % [59].

For PDT in general—not only for endoscopic applications—a variety of nanoparticles have been investigated to achieve different objectives [60], as follows. Firstly, improving the delivery of photosensitizer to the tumor is the most fully developed application of nanoparticles in PDT; for example, like many drugs that are poorly soluble in water, the commercial photosensitizer Visudyne is formulated in liposomes [61]. Secondly, NPs have the advantage that they can carry a high payload of photosensitizer and be targeted to specific biomarkers: for example, gold NPs have been conjugated to phthalocyanine photosensitizers and then coupled to antibodies [62]. Multifunctional silica NPs have been developed that carry photosensitizer at the same time as allowing magnetic resonance imaging to track their uptake and biodistribution [63].

PDT is limited by the tissue penetration depth of the specific wavelength of light required for photosensitizer activation. There are NPs that convert one wavelength of light to another to maximize tissue penetration and treatment effect, as in upconverting lanthanide-containing ceramic NPs. The NIR light can penetrate more deeply into tissues than visible light, while at the same time photosensitizers can be used that are efficiently activated at short wavelengths. Another option is to use scintillation crystals that absorb X-rays and emit photoactivating light [64]: in principle this would allow very deep tissue penetration, although it is not yet clear if a significant level of photodynamic cell killing can be achieved at X-ray doses that would not also be cytotoxic. Carbon-60 NPs and semiconductor QDs have also been used as energy transducers in PDT: for example, the photosensitizer can be coupled to QDs to exploit the fact that the latter have very high light absorption cross section (particularly for 2-photon absorption) and can then transfer the absorbed energy to the photosensitizer via resonant energy transfer [65].

To date, little work has been done to explore how these various nanoparticle-based strategies to improve PDT can be applied specifically for endoscopic approaches. Any strategy that improves and targets photosensitizer delivery to the tumor is relevant to PDT treatment of advanced-stage cancers such as obstructing tumors in the GI tract or bronchus. Indeed, most of the studies of NPs in tumors to date have involved solid tumor masses. However, early-stage tumors or pre-malignant dysplasias are often the target for endoscopic PDT, for which topical application of photosensitizer can be considered. This results in much lower total doses of photosensitizer. A caveat is that the NPs must penetrate the full thickness of the mucosal layer and possibly into the submucosa, while not penetrating into the underlying muscle layer that provides mechanical integrity to the organ. In terms of other limitations, a specific issue with NPs for PDT is that the so-called dark toxicity, i.e. the degree of cell killing or tissue damage that occurs in the absence of light due to the intrinsic biological activity of the NPs, must be low compared to the photodynamic activity; otherwise the advantage of light targeting is lost. This is equally true in endoscopic PDT. In terms of X-ray activation, since by definition endoscopy involves delivering light directly into deep-seated organs, it is not likely that this would be particularly useful for endoscopic PDT. However, NIR activation using upconverting NPs may be relevant in some circumstances, e.g. treating local lymphatic spread of tumor by endoscopic delivery of the penetrating NIR light. In terms of NPs as photosensitizers, the endoscopic approach may be advantageous for early lesions where topical application would be possible. Finally, an important point is that endoscopy always involves real-time optical imaging, so that an endoscopic therapy such as NP-enabled PDT is intrinsically theranostic. 
Optical diagnostics and treatment guidance are particularly relevant to endoscopic PDT of early-stage lesions, while radiological imaging is useful as a theranostic component for larger tumors of hollow organs but is of limited use for early-stage disease.

3.5 Other Practical Factors in Translating NP-Enabled Endoscopy to the Clinic

As with any other nanoparticle platform, NPs for endoscopic applications face several substantial barriers to their use in patients. These include scaling up the synthesis of the NPs and, if required, their conjugation to targeting moieties to quantities sufficient for human use, which are approximately 10³-fold larger than for preclinical studies in mouse models. The synthesis and conjugation must be carried out under good manufacturing practice (GMP) conditions, which include detailed tracking of all source materials, reproducibility, purity, and sterility of the pharmaceutical formulation. Toxicological evaluation must be performed, usually in at least one large animal model. Demonstration of safety, efficacy and utility compared to existing clinical agents through Phase I, II and III clinical trials is then required in order to gain regulatory approvals (e.g. from the US FDA or equivalent bodies in other countries).

Some of these barriers may be lower with endoscopic uses of NPs if it is possible to use topical rather than systemic (e.g. intravenous) delivery. This markedly reduces the dose of NPs required and thereby minimizes the cost and potential toxicity. Consider the example of SERS NPs. Studies in mouse tumor models [66] show that an intravenous dose of approximately 125 mg per kg body weight is required for SERS imaging in a clinically-realistic time and with acceptable image signal-to-noise. Even assuming that the effective dose of NPs per kg body weight is 10-fold higher in mice than in humans, then the estimated total dose for SERS endoscopic imaging in a 70 kg adult patient is about 1000 mg. At the current commercial price this would cost ~$50,000, which is clearly unrealistic for a diagnostic agent. Dose and cost reduction could be achieved if the NPs could be made ‘stealth’, i.e. coated so as to evade the reticuloendothelial system (liver, spleen) that traps a high percentage of the NPs. Nevertheless, the costs and potential toxicity remain very high. Compare this to topical administration to the target tissue region, as could be done endoscopically. In this case, mouse studies [67] required a total NP dose of only ~10–100 µg, corresponding to a cost reduction of 100–1000 fold in humans compared with systemic administration. Of course, for therapeutic applications, topical administration may not allow the NPs to diffuse to adequate depth into the tumor but for treatment the acceptable costs and risks can be much higher than for diagnosis.
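The dose and cost estimate above can be reproduced with a few lines of arithmetic. This sketch uses only the figures quoted in the text (125 mg per kg in mice, a 10-fold mouse-to-human factor, a 70 kg patient, ~$50,000 for ~1000 mg, and a 100–1000 fold reduction for topical delivery). The function and variable names are illustrative, and the implied price per mg is derived here rather than stated in the source.

```python
def human_total_dose_mg(mouse_dose_mg_per_kg, mouse_to_human_factor, body_mass_kg):
    """Scale a per-kg mouse dose to a total human dose (mg)."""
    return mouse_dose_mg_per_kg / mouse_to_human_factor * body_mass_kg

# Systemic administration: 125 mg/kg in mice, assumed 10-fold lower per kg in humans.
systemic_mg = human_total_dose_mg(125.0, 10.0, 70.0)  # 875 mg, i.e. "about 1000 mg"

# Price implied by the text: ~$50,000 for ~1000 mg.
systemic_cost_usd = 50_000.0
implied_price_usd_per_mg = systemic_cost_usd / 1000.0

# Topical administration: cost reduced by a factor of 100 to 1000.
topical_cost_usd = (systemic_cost_usd / 1000.0, systemic_cost_usd / 100.0)
```

On these figures the topical route would bring the agent cost down from tens of thousands of dollars to roughly $50–500 per procedure.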

In terms of potential toxicity of NPs for endoscopy, many of the issues that apply to non-endoscopic uses are relevant, at least for the case of systemic administration. As reviewed by Buzea et al. [68], there is potential toxicity from a wide range of NPs found in the human environment from both natural (volcanoes, forest fires, atmospheric chemistry) and man-made sources and these share many common features with NPs made for medical use. The possibility of harm arises from the chemical/elemental composition of the NPs, as well as from the fact that they are small enough to enter cells and may remain there for a very long time. Factors such as the NP shape, size, elemental/chemical composition, biostability, surface charge and tendency to aggregate all impact the potential for harmful effects. NPs may have much higher bioreactivity than the equivalent bulk materials due to their extremely high surface-to-volume ratio. For medical uses, additional factors such as the dose and route of administration and the presence of specific targeting moieties also come into play, as do the health status and life expectancy of the patient and the anticipated medical benefit from the procedure.

As general statements, we note the following.

  (a) Many drugs are already delivered by nanoparticles (e.g. liposomes).

  (b) ‘Hard’ NPs are likely of more concern for long-term effects because of the presence of inorganic materials (e.g. heavy metals in QDs) and their high stability during long retention times. All-organic NPs are more likely to dissociate into low-risk organic molecules that are cleared by the body, especially if they are composed of natural ingredients.

  (c) Potentially harmful biological effects have been observed with various NPs in cell and animal studies but at this stage the clinical significance of this is not known.

  (d) Topical administration, such as could be done endoscopically, is likely to be of much less concern than systemic administration, since the total NP dose is much smaller, there is less likelihood of the NPs entering the blood stream and being carried to other sensitive organs, and the clearance is likely to be much faster. This has been tested preclinically, for example, with SERS NPs [69].

4 Summary and Future Directions

Endoscopy is an established and important medical modality across a range of diseases and clinical applications. However, particularly in the area of early cancer detection and treatment guidance, there are significant clinical needs that are not fully addressed by conventional white-light endoscopic imaging. Hence, in the past two decades there has been a substantial research effort to improve endoscopy, particularly by exploiting different forms of light-tissue interactions, such as fluorescence, elastic and inelastic (Raman) scattering, optical coherence tomography and photoacoustic imaging. These advances have been enabled by the emergence of new optical technologies, including novel laser and non-laser light sources, specialized optical fibers and high-sensitivity imaging photodetectors. The various approaches are still not fully mature but have demonstrated, in different ways, improved sensitivity and/or specificity for endoscopic diagnosis and staging. At the same time, as genomics and molecular biology identify an increasing range of potential biomarkers of disease, the concept of molecular endoscopy has emerged, in which probes can be targeted to or activated by these biomarkers in order to provide novel image contrast. This matches well with the technology developments, for example, the convergence of fluorescence imaging systems and targeted fluorescent probes.

As this form of personalized endoscopy develops, nanoparticle-based probes are likely to play increasingly important roles because of the additional capabilities that they provide, including: enabling targeted delivery of high concentrations of imaging and therapeutic agents; the ability to perform multiplexed imaging of several biomarkers at the same time; and multi-functionality whereby endoscopic and radiological/radionuclide imaging may be combined.

Endoscopic imaging and treatment with nanoprobes present several different challenges and opportunities compared with non-endoscopic imaging. In particular, topical delivery of the agents favorably changes the balance of risk and cost versus benefit compared with the systemic delivery that is usually required for non-endoscopic imaging and treatment. At the same time, volumetric imaging is not usually required in endoscopy, where the focus is on surface or sub-surface disease, a domain in which optical imaging excels over other modalities. The challenges will be to select, optimize and validate the various possible combinations of endoscopic instrumentation and nanoparticle-based probes for each clinical application, to undertake scale-up and GMP certification of the probes and to demonstrate real clinical benefit through systematic clinical trials so that these approaches are adopted into routine clinical practice.

The optimal nanoparticle-based endoscopic technique will depend on the disease type and the corresponding clinical needs in terms of increased sensitivity or specificity for overall diagnosis or staging. For example, in esophageal endoscopy of early cancer it is particularly important to determine the stage of the disease so as to select the best treatment strategy, whether surgical removal (esophagectomy) or much less invasive endoscopic resection of the diseased mucosa. In this case, nanoparticles with a single molecular target might suffice for diagnosis and disease localization, but a multimodality nanoparticle (e.g. porphysomes) would also provide information on the spread of the disease below the mucosa by combining fluorescence with photoacoustic contrast. On the other hand, for lung cancer it is important to classify the tumor in situ based on multiple biomarkers in order to make the correct therapeutic decision. In this case, nanoparticles with multiplexing capabilities (e.g. SERS) have an advantage. In each case, theranostics exploits the full potential of the optical properties of nanoparticles as simultaneous imaging and therapy agents.