Introduction

According to the 2014 World Cancer Report from the World Health Organization (WHO), cancers figure among the leading causes of morbidity and mortality and caused 8.2 million deaths worldwide in 2012. About 14 million new cases were reported that year, and this number is expected to increase by 70 % over the next two decades.

Detecting cancer at an early stage is often critical to successful treatment of the disease [1]. To diagnose and stage cancer, various techniques and tools have been applied, including X-ray [2], blood tests, colonoscopy [3], mammography [4], computed tomography (CT) [5], positron emission tomography (PET) [6], magnetic resonance imaging (MRI) [7], and ultrasonography [8]. However, most of these techniques give only limited information about the presence, size, and location of abnormalities, and some inconsistencies have been reported between the tests. Take lung cancer as an example: the detailed images taken by CT scanning often show many small lung nodules [9], but it is currently difficult to determine which of these nodules are benign and which represent lung cancer at a very early stage [10]. Therefore, in many cases, a biopsy of the abnormal tissue is still necessary to confirm the cancer, which is inconvenient, complicated, and costly, and carries a risk of morbidity and even mortality due to bleeding [11, 12].

Against this background, the detection and analysis of volatile organic compound (VOC) biomarkers has recently been recognized as a new frontier in cancer diagnostics owing to its potential for developing rapid, noninvasive, and inexpensive cancer screening tools [13–17]. VOCs are organic compounds with relatively low molecular weight and high vapor pressure. Cancer-related VOC biomarkers can be detected in blood, urine, feces, skin or sweat, exhaled breath, and the headspace of cancer cells and tissues (the VOC mixture trapped above the cancer cells in a sealed vessel) from cancer patients [18–22].

Most analyses of VOC biomarkers have been reported for exhaled breath samples [23–27], because the samples are simple to collect and analyze; the exhaled-breath VOC test can therefore be performed frequently and may reflect cancer progression. This advantage is helpful in monitoring clinical treatment [28]. Furthermore, the exhaled-breath test is painless and noninvasive, and therefore suitable for children and critically ill patients. Analysis of VOCs in exhaled breath has therefore been recognized as a useful method for diagnosing various types of cancer [29–31].

The principle behind the exhaled-breath VOC test rests on the fact that VOCs reflect the condition of the cells at the locations of disease. VOCs can derive from both exogenous and endogenous volatiles [32, 33]. For exogenous volatiles, the compounds can be inhaled or absorbed through the skin from the external environment, or can be produced from the oral ingestion of food [1]. For endogenous volatiles, the compounds can be produced from the physiological or metabolic processes. In the case of cancer, the pathophysiology causes metabolic changes, leading to the alteration of VOC compositions and concentrations [8]. The cancer’s development is related to one or a combination of the following factors: boosted oxidative stress, induction of CYP450 (a group of oxidase enzymes) [34], high rate of glycolysis [35], excessive lactate production [36], gene changes [37], protein changes [38], and lipid metabolism [13]. As a result, the tumor cells will generate a unique cancer VOC profile reflecting the disease conditions.

Once produced, the disease-related VOCs can be excreted into the body fluids, migrate throughout the tissue, and may be stored in fat compartments [39, 40]. These specific VOCs can be further released into the bloodstream and circulate in the vascular system. The VOCs in the blood can then be exchanged into the breath in both the airways and the alveoli, depending on the blood–air partition coefficient (λb:a) [8]. It has been reported that nonpolar VOCs with low solubility in blood (λb:a < 10, i.e., having a low blood–air partition coefficient) exchange almost exclusively in the alveoli. On the contrary, polar VOCs that are more soluble in the blood (λb:a > 100) tend to exchange in the airways. VOCs with 10 < λb:a < 100 can exchange in both the airways and the alveoli [41]. As a result of the blood–air partition coefficient, the VOC profile is also influenced by its concentration in blood and the retention time of the compound in the lung [42].
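The thresholds described above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `exchange_site` is hypothetical, the example coefficients are made-up values rather than measured ones, and assigning the boundary values 10 and 100 to the intermediate class is an assumption, since only strict inequalities are given.

```python
def exchange_site(partition_coefficient: float) -> str:
    """Return the main gas-exchange site for a VOC, given its
    blood-air partition coefficient (dimensionless)."""
    if partition_coefficient <= 0:
        raise ValueError("partition coefficient must be positive")
    if partition_coefficient < 10:
        # Nonpolar VOCs with low solubility in blood
        return "alveoli"
    if partition_coefficient > 100:
        # Polar VOCs that are more soluble in blood
        return "airways"
    # 10 <= value <= 100 (boundary handling is an assumption)
    return "airways and alveoli"


# Made-up coefficients, for illustration only
for value in (0.8, 50.0, 300.0):
    print(value, "->", exchange_site(value))
```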

Collectively, endogenous VOCs can be transported from organs through blood to the lungs and are subsequently exchanged into exhaled breath [43]. When pathological processes occur, the body’s biochemistry is altered, leading to a change of endogenous VOCs and a shift in the exhaled air composition, which gives a unique breath-print profile pattern as a ‘mirror reflection’ of the disease states [44]. Therefore, detection of endogenous VOCs can provide discrimination between various diseases including cancers and give insights into health, whereas the assessment of exogenous VOCs would suggest the exposure to a drug or environmental compounds [45].

However, it has been reported that the exhaled breath contains hundreds of VOCs, with low concentrations ranging from a few parts per billion (ppb) to hundreds of parts per trillion (ppt). Therefore, it is challenging to distinguish exogenous VOCs from endogenous VOCs and to identify stable and unique VOCs which only exist in disease states rather than healthy states [40].
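To give a sense of scale for these mixing ratios, the sketch below converts a value in ppb (by volume) to a number density and to a mass concentration using the ideal gas law. It assumes 25 °C and 1 atm, which are illustrative conditions and not those of any cited study; the function names are hypothetical, and the molar mass of acetone (about 58.08 g/mol) serves only as an example.

```python
BOLTZMANN = 1.380649e-23     # J/K
GAS_CONSTANT = 8.314462618   # J/(mol K)


def ppb_to_molecules_per_cm3(ppb, temperature_k=298.15, pressure_pa=101325.0):
    """Analyte number density (molecules/cm^3) for a mixing ratio in ppb."""
    total_per_m3 = pressure_pa / (BOLTZMANN * temperature_k)  # ideal gas
    return ppb * 1e-9 * total_per_m3 * 1e-6  # m^-3 -> cm^-3


def ppb_to_ug_per_m3(ppb, molar_mass_g_mol, temperature_k=298.15,
                     pressure_pa=101325.0):
    """Analyte mass concentration (micrograms/m^3) for a mixing ratio in ppb."""
    mol_per_m3 = pressure_pa / (GAS_CONSTANT * temperature_k)  # ideal gas
    return ppb * 1e-9 * mol_per_m3 * molar_mass_g_mol * 1e6  # g -> ug


# 1 ppb and 100 ppt (= 0.1 ppb) of an acetone-like compound
for ppb in (1.0, 0.1):
    print(ppb, "ppb:",
          f"{ppb_to_molecules_per_cm3(ppb):.3e} molecules/cm^3,",
          f"{ppb_to_ug_per_m3(ppb, 58.08):.3f} ug/m^3")
```

Even at 1 ppb there are on the order of 10^10 analyte molecules per cubic centimeter, but they are diluted roughly a billion-fold by the bulk gas, which is why preconcentration or very sensitive detectors are needed.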

To detect specific VOCs of low concentration from exhaled breath, and to enhance the accuracy of early diagnosis, many breath collection and analysis approaches have been developed. This review will summarize exhaled-breath VOC-related sampling and detection methods, especially the recent development of VOC sensors.

Exhaled VOCs sampling and collection

Breath sampling and collection methods have not yet been standardized, which contributes to the variability of analytical results among different studies [46, 47]. A breath sample can be divided into the first 150 mL of air from the upper respiratory airways and the next 350 mL of alveolar air from deeper lung regions. Samples can be collected and analyzed directly in a single step, or first stored in a container before being delivered to the measuring instruments [23]. Disposable mouthpieces and bacterial filters can be used to prevent patient-to-patient contamination, and in-line spirometers can be employed in the sampling devices to control the breath volume. As breath VOCs are present at very low concentrations, capture and preconcentration techniques, such as solid-phase microextraction (SPME), have also been used in some cases [48].

Sampling bags

Sampling bags, such as Tedlar® gas sampling bags (PVF), Mylar gas sampling bags, and polyester aluminum sampling bags (PEA), can be connected with SPME or thermodesorption (TD) tubes for gas chromatography–mass spectrometry (GC–MS) analysis, and can also be linked directly with other systems, e.g., sensor arrays, for ‘online’ analysis. These sampling bags are usually cheap and chemically stable, and can interface with clinical respiratory equipment. However, these bags may have the risk of leakage or VOC sorption (e.g., Tedlar® bags may be permeable to formaldehyde) and may suffer from ultraviolet (UV) degradation. So to avoid sample contamination, these bags must be handled and stored with care. Furthermore, when the bags are used with GC–MS systems, water in breath air samples may condense inside and thus interfere with downstream analysis [49, 50].

Flow reactor

This method is realized by exhaling air into a glass cylinder (Fig. 1). To avoid back flush of ambient air, a glass sieve can be included at the bottom of the cylinder. After each measurement, the cylinder can be purged with nitrogen for cleaning. The flow reactor can be linked to analysis systems, such as proton transfer reaction–MS (PTR–MS). As the sample volumes can be examined precisely every time, the reproducibility of this method is ensured. The cylinder is inert and can avoid water condensation. However, this equipment is expensive, and it requires a constant flow of inert gas, such as N2. Additionally, it is not suitable for sample storage [51].

Fig. 1 Schematic diagram of the flow reactor, in which TVOC refers to total volatile organic compounds. Reprinted from Ref. [51] with permission of Elsevier

Bio-VOC™ breath sampler

This method is realized by exhaling into a one-way valve that is connected to a Teflon® bulb (Fig. 2). After breath collection, the internal standard (IS) can be added into the device, and the exhaled VOCs and IS can be extracted by SPME: the SPME fiber is placed in the Bio-VOC™ for a certain period of time and then thermally desorbed in the GC injection port. Bio-VOC™ is cheap, inert, and user-friendly. It can trap the last portion of exhaled air and avoid upper respiratory or oral contamination. However, it can only collect 150 mL of end-tidal breath, so breath samples may vary according to patients’ lung volume [52].

Fig. 2 The use of the Bio-VOC™ breath sampler. (a) The breath sample was collected in a Teflon® bulb. (b, c) The VOCs can be extracted by inserting a Carboxen/PDMS SPME fiber into the bulb. Reprinted from Ref. [52] with permission of Springer

Breath collection apparatus (BCA)

Phillips et al. reported an example of a breath collection apparatus in 2003. It has a long tube as the breath reservoir and a small tube affixed at the end as the sorbent trap to capture the VOCs. A flowmeter and a digital timer are also incorporated into the apparatus. It can have separate traps and thus can collect different portions of the exhaled breath, and it is compatible with analysis systems such as GC–MS. Although the apparatus is portable and user-friendly, it is quite large and its cost might be high [53].

Gastight syringe (GTS)

The GTS is a widely used transfer medium for VOC collection and analysis. The sorptive loss of highly volatile compounds, such as aldehydes, ketones, esters, alcohols, and aromatic hydrocarbons, is very low. Conversely, the GTS is not suitable for collecting semivolatile compounds, such as carboxyls and phenols, because sorptive loss may occur through contact with the inner surfaces of the GTS, and this loss increases with the molecular weight and boiling point of the VOCs [54].

VOC extraction

Solid-phase microextraction (SPME)

SPME is a widely used sample preconcentration and storage technique. The storage device consists of an upper part, a sealing part, and an SPME fiber. The fiber is coated with an extracting phase (liquid or solid, such as Carboxen®/PDMS, which can extract VOCs from the exhaled air). For preconcentration, the collected exhaled air can be transferred from the sampling bag to a glass vial, and then the SPME fiber is inserted into the vial and exposed to the gaseous sample for a certain period. This technique is simple, fast, and solvent-free. The stored samples can be analyzed later with systems such as GC–MS, without significant loss of VOC compounds. Thus SPME is suitable for clinical applications [55–59] (Fig. 3).

Fig. 3 SPME storage device [60]. The SPME fiber should first be screwed into the upper part, and then inserted into the sealing part. Reprinted from Ref. [60] with permission of Elsevier

VOCs detection and measurement

The VOC analytical techniques can be classified into several categories. The first group is based on gas chromatography (GC) or mass spectrometry (MS). As the most common method, GC and MS have been coupled to various detection methods.

GC or MS-based techniques

Most of these methods are highly standardized, such as GC–MS, and have been widely used for VOC detection and analysis. These techniques are commonly compatible with preconcentration methods, e.g., SPME, for better sensitivity. But many of them are expensive and require a skilled operator [61].

Gas chromatography–mass spectrometry (GC–MS)

GC–MS can be used to identify unknown VOCs from complex gaseous mixtures, and it is currently recognized as the gold standard in breath VOC tests [62–75]. Many of the cancer-related VOC biomarkers published thus far were found by using GC–MS in exhaled breath analysis (Table 1). For GC–MS, the exhaled breath sample is first injected into the GC system for separation, and then the separated molecules are ionized in the mass spectrometer (Fig. 4). The most commonly used mass spectrometer in GC–MS methods is the quadrupole mass spectrometer. Other systems, such as time-of-flight MS (TOF–MS) and tandem quadrupole MS (MS–MS), have also been used.

Table 1 VOCs from exhaled breath samples identified as biomarkers of various cancers

Fig. 4 Schematic diagram of GC–MS. Reprinted from Ref. [14] with permission of IOP Publishing

The GC–MS technique has high sensitivity in the ppb range and can achieve reproducible results. Quantification of VOCs is also possible when the compound is already known. However, this system is slow, expensive, and currently immobile, and the samples often need to be preconcentrated and dehydrated. Real-time measurement is not possible with GC–MS. Therefore, this technique is not suitable for point-of-care use [85, 86].

Ion mobility spectrometry (IMS)

Compared with GC–MS, IMS systems are mobile and cheaper, and a preconcentration process is not needed (Fig. 5). This technique is based on separation of ions according to the gas phase mobility. The sample molecules are first ionized and then drift in the flight tube. The ions are separated as a result of the difference in their shapes and charges, and the velocity is influenced by both the electric field and drift gas.
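The dependence of ion velocity on the electric field can be illustrated with the standard low-field relation v = K·E, where K is the ion mobility, so that the drift time over a tube of length L is t = L/(K·E). This relation is textbook ion-mobility physics rather than something taken from the cited works, and the mobilities, tube length, and voltage below are hypothetical values chosen only to show that ions with different mobilities arrive at different times.

```python
def drift_time_ms(mobility_cm2_per_vs, tube_length_cm, drift_voltage_v):
    """Drift time in milliseconds in a uniform field, using v = K * E."""
    if min(mobility_cm2_per_vs, tube_length_cm, drift_voltage_v) <= 0:
        raise ValueError("all inputs must be positive")
    field_v_per_cm = drift_voltage_v / tube_length_cm
    velocity_cm_per_s = mobility_cm2_per_vs * field_v_per_cm
    return 1000.0 * tube_length_cm / velocity_cm_per_s


# Two hypothetical ions in the same 10 cm tube at 3000 V
for label, mobility in (("ion A", 2.0), ("ion B", 1.6)):
    print(label, round(drift_time_ms(mobility, 10.0, 3000.0), 2), "ms")
```

In this simplified picture the ion with the lower mobility (for example, a bulkier ion that collides more often with the drift gas) arrives later, which is the basis of the separation.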

Fig. 5 Schematic diagram of IMS. Reprinted from Ref. [14] with permission of IOP Publishing

IMS is particularly useful in isomer separation, and the sensitivity is quite high, in the ppm range. But with IMS, compound identification is not possible, and it is also not suitable for real-time measurements. To obtain more information about VOCs, IMS is often coupled with GC or MS [87]. A recent development in IMS has allowed breath samples to be analyzed reliably, rapidly, and robustly. In 2015, Brodrick et al. developed a protocol by coupling a multicapillary column (MCC) with the ion mobility spectrometer for preseparation, and they successfully applied it for breath analysis. This MCC–IMS protocol was reportedly fast, accurate, and cost-effective, and may help in the standardization of breath analysis [88].

Field asymmetric ion mobility spectrometry (FAIMS)

FAIMS, sometimes called differential ion mobility spectrometry (DMS), is also based on separation according to the different mobilities of ions. In this technique, ions are subjected to different electric field strengths for various periods, so that only ions with certain mobilities remain (Fig. 6). Compared with IMS, FAIMS separates the ions by asymmetric tuning of the control voltage, instead of using drift gas and an electric field gradient. FAIMS can reach sensitivity at the ppb level. The system is robust, portable, and miniaturized, and it can work at atmospheric pressure and room temperature. This portability and directed application give FAIMS huge potential in clinical use [49, 89]. It is commonly applied for various analytical purposes, including VOC detection from human samples. However, FAIMS is not suitable for measuring unknown compounds, so MS is needed to confirm and quantify VOCs [90]. In 2008, Molina et al. used GC–DMS for the analysis of human exhaled breath condensate (EBC) and reported that this method could be used for noninvasive disease diagnostics. In addition, acetone, a reported biomarker in breath for lung cancer detection (Table 1), was used to spike the samples, and the acetone signal was recorded, which suggested the potential of DMS in VOC cancer diagnosis [91]. In 2010, Basanta et al. used GC–DMS to analyze exhaled breath and separate chronic obstructive pulmonary disease (COPD) subjects from healthy controls who smoke cigarettes, suggesting that this system could be very useful in the diagnosis of respiratory diseases including cancer [92]. Moreover, several studies have developed sensors based on FAIMS with a UV source for photo-ionization, and used this method to detect trace amounts of VOC gases, including acetone, toluene, and benzene, which are reported cancer biomarkers in breath (Table 1), with detection limits on the order of 1–100 ppb [89, 93, 94].

Fig. 6 Schematic diagram of FAIMS. Reprinted from Ref. [49] with permission of Elsevier

Selected ion flow tube mass spectrometry (SIFT–MS)

SIFT–MS can achieve real-time measurement and quantification of trace VOCs in humid air. This approach is based on chemical ionization of VOC molecules by using H3O+, NO+, and O2+ precursor ions during an accurately defined period along a flow tube. The precursor ions are produced from moist atmospheric air and corona discharge (Fig. 7). The specific precursor ions are selectively separated by the first MS process and then react with molecules coming from the breath sample. The precursor ions and product ions can be detected and counted by the second mass spectrometer, and then the concentrations of trace VOCs can be calculated. This system is fast, mobile, has a high sensitivity level in the ppb range, and thus has potential for online testing. As it allows real-time detection and quantification of trace VOCs in exhaled breath without sample pretreatment, in 2013, Kumar et al. used SIFT–MS to investigate 17 VOCs and found that the concentrations of four VOCs (phenol, hexanoic acid, ethylphenol, and methylphenol) were significantly different between patients with esophagogastric cancer and positive control groups. This real-time measurement without sample preparation provides SIFT–MS with particular advantages in the clinical environment owing to minimal delay and negligible concern for sample degradation. But this technique is expensive and not suitable for VOC chemical identification and broad profiling; therefore other chemical analytical platforms are required to identify VOCs which may be important in cancer diagnosis [80, 95].
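The step from ion counts to a concentration is often approximated as [M] ≈ (product-ion signal / precursor-ion signal) / (k·t), where k is the rate coefficient of the ion–molecule reaction and t is the reaction time in the flow tube. The sketch below implements this first-order approximation. It assumes that only a small fraction of the precursor ions react and it ignores mass discrimination and diffusion corrections; it is not a description of any instrument's software, the function name is hypothetical, and all numbers are made up.

```python
def analyte_number_density(product_counts, precursor_counts,
                           rate_coefficient_cm3_per_s, reaction_time_s):
    """Approximate analyte number density in the flow tube (molecules/cm^3).

    First-order estimate: [M] = (I_product / I_precursor) / (k * t).
    """
    if precursor_counts <= 0:
        raise ValueError("precursor counts must be positive")
    if rate_coefficient_cm3_per_s <= 0 or reaction_time_s <= 0:
        raise ValueError("k and t must be positive")
    ratio = product_counts / precursor_counts
    return ratio / (rate_coefficient_cm3_per_s * reaction_time_s)


# Hypothetical count rates, rate coefficient, and reaction time
density = analyte_number_density(
    product_counts=1.0e3,
    precursor_counts=1.0e6,
    rate_coefficient_cm3_per_s=3.0e-9,
    reaction_time_s=5.0e-3,
)
print(f"{density:.3e} molecules/cm^3 in the flow tube")
```

The value refers to the analyte density inside the flow tube; relating it to the concentration in breath additionally requires the sample and carrier gas flow rates, which are left out of this sketch.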

Fig. 7 Schematic diagram of SIFT–MS. Reprinted from Ref. [14] with permission of IOP Publishing

Proton transfer reaction–mass spectrometry (PTR–MS)

In this method, reagent ions (H3O+) are produced from water vapor by a hollow cathode (Fig. 8). The ion source is connected to a drift tube. In the drift tube, proton transfer reactions occur between H3O+ and any molecule whose proton affinity exceeds that of water. Therefore, components of the moist air, namely N2, O2, CO2, and water, will not affect the test. The ions will then reach the analyzer [96]. PTR–MS has high specificity and high sensitivity (down to parts per trillion). It is a powerful online tool for monitoring VOCs, and there is no need for sample preconcentration. Preconcentration steps are commonly time-consuming and may influence the breath samples because the adsorption and desorption of gas usually depend on the properties of the adsorption medium [97, 98]. Owing to these advantages, in 2007, Wehinger et al. used PTR–MS to detect primary lung cancer through analysis of VOCs in exhaled breath samples, and they found two new potential biomarkers to best discriminate between primary lung cancer subjects and healthy controls [79]. In 2009, Bajtarevic et al. analyzed exhaled breath using PTR–MS and SPME–GC–MS for the detection of lung cancer and discussed the advantages and shortcomings of both techniques. Compared with SPME–GC–MS, PTR–MS does not need preconcentration and is relatively more sensitive; thus, it can give more reliable quantitative results. As PTR–MS is easier to handle and saves time, the number of subjects investigated by PTR–MS was reported to be much higher than that investigated by GC–MS, which makes PTR–MS attractive and valuable for larger clinical evaluations and cancer diagnosis. However, PTR–MS did not provide as much information as GC–MS, because it could not precisely identify the VOCs. Therefore, Bajtarevic et al. suggested that PTR–MS and SPME–GC–MS complement each other [62].
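The proton-affinity criterion can be sketched as a simple filter. The proton affinities below are approximate values in kJ/mol that should be checked against a current reference table before use; the dictionary, the function name, and the choice of example compounds are assumptions made for illustration.

```python
WATER_PROTON_AFFINITY = 691.0  # kJ/mol, approximate

# Approximate gas-phase proton affinities (kJ/mol); verify before use
PROTON_AFFINITY_KJ_MOL = {
    "N2": 493.8,
    "O2": 421.0,
    "CO2": 540.5,
    "acetone": 812.0,
    "benzene": 750.4,
    "toluene": 784.0,
}


def protonated_by_h3o(proton_affinities):
    """Names of compounds whose proton affinity exceeds that of water,
    i.e. those expected to accept a proton from H3O+."""
    return sorted(
        name
        for name, affinity in proton_affinities.items()
        if affinity > WATER_PROTON_AFFINITY
    )


print(protonated_by_h3o(PROTON_AFFINITY_KJ_MOL))
```

With these values the bulk air components fall below the threshold and are not ionized, whereas the three example VOCs are, which is why the major constituents of air do not interfere with the measurement.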

Fig. 8 Schematic diagram of PTR–MS. Reprinted from Ref. [14] with permission of IOP Publishing

Gas chromatography–flame ionization detection (GC–FID)

GC coupled with FID is also a widely used technique for VOC analysis. FID is based mainly on the detection of ions produced during the combustion of compounds. This approach can detect VOCs with a linear response, and it is relatively inexpensive to acquire and operate [99]. Although GC–FID has been reported as a common analytical method for detection of VOCs in gases, FID does not readily detect inorganic substances, and the measurements usually require internal standards and calibration curves. In addition, the limit of detection of this technique is often around 10 ppb. Thus, for detecting and analyzing low-concentration biomarkers in exhaled breath samples, this system usually requires a preconcentration step [100]. In 2014, Zaric et al. developed a method to analyze breath samples by combining automated thermal desorption (TD), GC, FID, and an electron capture detector (ECD). It was reported that this TD/GC/FID/ECD method was able to identify VOCs including ethylbenzene, 1,2,4-trimethylbenzene, isopropyl alcohol, and acetone, which have been suggested to be cancer VOC biomarkers in previous publications (Table 1) [101].

Comprehensive two-dimensional gas chromatography (GC × GC)

As mentioned above, GC–MS has been frequently used to analyze exhaled breath samples. However, the separation efficiency of this method may not be high enough for complex breath samples, even with long narrow capillary columns. To improve the separation efficiency, GC × GC has been developed. In GC × GC, two capillary columns with different separation mechanisms are joined together via a modulator. The modulator samples the fractions eluted from the first capillary column and re-injects them into the second column rapidly and with high repeatability. The separation achieved in the first column is maintained, and the separation in the second column is very fast. It has been reported that GC × GC configured with flow modulators can increase the peak capacity and peak resolution significantly compared with conventional one-column GC. Owing to these advantages, in 2014, Ma et al. developed a method by combining SPME for VOC preconcentration and GC × GC–FID for VOC analysis, and they successfully applied it to analyze human exhaled breath and determine biomarkers for lung cancer. The average concentrations of propanol, acetone, and methanol found with this technique were significantly higher than those in patients with lung cancer, which suggests that this method may have potential as a screening tool for cancer diagnosis [55].

Sensor-based techniques

Various types of sensors have been developed for exhaled-breath VOC analysis. VOC sensors are generally inexpensive, portable, programmable, and easy to use. They can obtain data in real time with high sensitivity. Therefore, many sensor-based VOC detection techniques have huge potential in clinical point-of-care use.

Metal oxide chemoresistive sensors

Metal oxide chemoresistive sensors have been widely studied for VOC detection. These sensors rely on the electrical conductivity of metal oxide semiconductors, such as SnO2, ZnO, and TiO2, which can change according to the surrounding breath air samples. During breath detection, the target VOC gas can react with adsorbed surface oxygen, leading to a change in the transducer ability of the metal oxide semiconductors [102, 103]. Therefore, the sensing materials need to be carefully selected and modified, and the sensor film needs to be properly structured, to obtain a high efficiency of the catalytic reactions at the sensor surface, to increase the selectivity of the reaction for the target VOCs [104], to shorten the response time, and to lower the operating temperature (usually between 200 and 500 °C) [105, 106]. In 2014, Malagù et al. developed an array of metal oxide semiconducting chemoresistive sensors, and they successfully discriminated biomarkers of colorectal cancer, including 1-iodononane and benzene, with high selectivity from interfering VOCs in the gut, such as nitric oxide and methane. The array of sensors was obtained by combining different sensing materials. For each sensor, the best working temperature was determined and the responses to the target VOCs were analyzed. Additionally, the measurements were performed in the background of realistic concentrations of CH4, NO, and H2. For dry conditions, in the background of methane, the most selective sensors for benzene were TiTaV (TiO2, Ta2O5, vanadium oxide) and STN (mixed SnO2, TiO2, and Nb2TiO7). As for the NO background, the most selective sensors were ST25 650 (SnO2 and 25 % TiO2) and STN. Humidity, which needs to be considered because of the properties of human breath samples, was reported to lower the responses. In a wet ambient environment, the best sensors for the detection of 1-iodononane were ST20 650 (SnO2 and 20 % TiO2), ST25 650, and ST25 + Au1 % (SnO2, 25 % TiO2, and 1 % Au). It was suggested that these metal oxide chemoresistive sensors may represent a potentially inexpensive and noninvasive preliminary screening method for the diagnosis of colorectal cancer [107].

Nanomaterial-based chemoresistive sensors

The sensing mechanism of chemoresistive sensors is based on changes in electrical conductivity caused by alteration of the surrounding air. The detection is mainly driven by the reactions of the VOCs with the surface groups [103]; thus, the sensitivity depends on the surface area and the surface-to-volume ratio of the sensing particles [108]. During the last two decades, many efforts have been made to optimize the geometry of sensing particles and to increase the surface-to-volume ratio, and many chemoresistive VOC sensors based on nanomaterials have been produced [77, 84, 109–112]. These sensors are commonly formed by capping a conductive nanomaterial, such as Au/Pt nanoparticles and carbon nanotubes, with organic functional groups [81, 83, 113, 114]. During measurements, VOCs will contact and react with tailored organic functional groups, leading to an alteration of the connections between the conductive inorganic nanomaterials, or in some cases leading to a charge transfer between the organic functional groups and the inorganic nanomaterials. These alterations will cause changes in the measured conductivity [115, 116]. For example, in 2009, Peng et al. produced an array of sensors by using functionalized gold nanoparticles. This sensor array can test exhaled breath samples without the need for dehumidification or preconcentration, and it can distinguish between lung cancer patients and healthy subjects. This result showed that gold nanoparticle VOC sensors could provide a simple, portable, inexpensive, and noninvasive screening and diagnosis technology for lung cancer [110] (Fig. 9).

Fig. 9 VOC sensors made from functionalized gold nanoparticles. (a) Photograph of the array of chemiresistors. (b) Scanning electron microscopy image of a chemiresistor. (c) Scanning electron microscopy image of a gold nanoparticle film placed between two adjacent electrodes. (d) Transmission electron micrograph of the monolayer-capped gold nanoparticles. Reprinted from Ref. [110] with permission of Nature Publishing Group

Another kind of nanomaterial-based sensor is the polymer composite sensor, which can be composed of a non-conducting polymer film with the addition of a conductive material such as carbon black. The selective absorption of VOCs can change the volume of the polymer composite, leading to an alteration of the resistance between the electrical contacts [61].

The nanomaterial-based chemoresistive sensors usually have a rapid response with high sensitivity, and there is no need for preconcentration of the breath samples. But these sensors are often sensitive to humidity or temperature [117].
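A common way to express the output of such sensors is the relative resistance change, (R − R0)/R0, where R0 is the baseline resistance and R the resistance during exposure; collecting this value from every element of an array gives a response pattern for the sample. The sketch below assumes this definition, which the cited works may or may not use, and the resistance values are hypothetical.

```python
def relative_response(baseline_ohm, exposed_ohm):
    """Relative resistance change (R - R0) / R0 of one sensor."""
    if baseline_ohm <= 0:
        raise ValueError("baseline resistance must be positive")
    return (exposed_ohm - baseline_ohm) / baseline_ohm


def array_pattern(baselines_ohm, exposed_ohm):
    """Response pattern of a sensor array: one relative change per sensor."""
    if len(baselines_ohm) != len(exposed_ohm):
        raise ValueError("both lists must have one entry per sensor")
    return [relative_response(r0, r)
            for r0, r in zip(baselines_ohm, exposed_ohm)]


# Hypothetical three-sensor array before and during exposure to breath
pattern = array_pattern([1000.0, 2000.0, 500.0], [1100.0, 1900.0, 500.0])
print([round(value, 3) for value in pattern])
```

Normalizing by the baseline makes sensors with very different absolute resistances comparable, so that the pattern across the array, rather than any single value, can be used for discrimination.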

Piezoelectric sensors

Piezoelectric sensors are usually based on the response to applied mechanical stress. The most widely used piezoelectric sensor in VOC analysis is probably the quartz crystal microbalance (QCM) [76, 118, 119]. In a QCM, the surface of the quartz can be coated with an appropriate molecular recognition membrane or layer. When exhaled VOCs are adsorbed onto the surface, the change in mass alters the fundamental oscillating frequency of the quartz crystal resonator.
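The link between adsorbed mass and frequency is commonly described by the Sauerbrey equation, Δf = −2·f0²·Δm / (A·√(ρq·μq)), where f0 is the fundamental frequency, A the active area, and ρq and μq the density and shear modulus of quartz. The equation is not named in the text above and is used here as an assumption; it holds only for thin, rigid, evenly distributed films. The function name and the example values are illustrative.

```python
import math

QUARTZ_DENSITY = 2.648           # g/cm^3
QUARTZ_SHEAR_MODULUS = 2.947e11  # g/(cm s^2)


def sauerbrey_shift_hz(fundamental_hz, mass_change_g, area_cm2):
    """Frequency shift (Hz) for a mass change on a QCM (Sauerbrey equation).

    A positive mass change (adsorption) gives a negative frequency shift.
    """
    if fundamental_hz <= 0 or area_cm2 <= 0:
        raise ValueError("frequency and area must be positive")
    stiffness_term = math.sqrt(QUARTZ_DENSITY * QUARTZ_SHEAR_MODULUS)
    return -2.0 * fundamental_hz ** 2 * mass_change_g / (area_cm2 * stiffness_term)


# Example: 1 microgram adsorbed on 1 cm^2 of a 5 MHz crystal
shift = sauerbrey_shift_hz(5.0e6, 1.0e-6, 1.0)
print(f"{shift:.1f} Hz")
```

For soft or viscoelastic coatings the measured shift deviates from this rigid-film estimate, so the result should be treated as a first approximation.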

Another well-known type is the surface acoustic wave (SAW) sensor. In a SAW sensor, acoustic waves propagate along the surface of an elastic substrate, with the amplitudes decaying exponentially with depth into the substrate. The surface can be coated with various selective materials. The adsorption and desorption of the exhaled VOCs from the coated film can result in a change in its mass and in the electrical conductivity (electric field of the SAW, associated with the acoustic field) of the chemical interface. These alterations may influence the amplitude and phase velocity of a SAW sensor [120].

In 2015, Speller et al. developed the concept of using a QCM-based virtual sensor array (VSA) to discriminate a wide range of VOCs. Sensor arrays commonly require multiple sensor elements with different binding affinities for different VOCs. Instead of using chemical affinity, Speller et al. used various material properties, such as viscoelasticity and film thickness, as the discriminating factors. In this way, a single sensor can act as a VSA and provide multiple responses per analyte. This sensor was produced by depositing a thin film of ionic liquid onto the surface of a QCM-D transducer, because ionic liquids are highly tunable, have viscoelastic properties, and can reversibly capture organic vapors. When the sensor was exposed to different VOCs, the changes in frequency (Δf) were measured at multiple harmonics. This method allowed the VOCs to be classified with nearly 100 % accuracy. These results suggested the potential of the QCM-D sensor and the VSA strategy in the detection of VOCs [121]. As the VOCs measured in this study included 1-propanol, 1-butanol, toluene, p-xylene, and cyclohexane, which are previously reported cancer markers from exhaled breath (Table 1), this QCM-D sensor may be of particular value in cancer diagnosis.

Piezoelectric sensors can have high sensitivity in the ppt range, and they can be tailored to precisely measure specific VOC compounds. The selectivity of the sensor can be controlled because the resonators can be functionalized with different coating materials [122, 123]. However, piezoelectric sensors are usually sensitive to humidity, temperature, and vibration, which may affect the resonant frequency of the sensor; these parameters should therefore be precisely controlled in exhaled breath testing to minimize their effect during exposure to the samples [118].

Colorimetric sensors

Many materials can change color in response to their chemical environment, making them attractive for applications as VOC sensors [8, 124–126]. Because of the diversity of these indicators, a wide range of VOCs can be selectively detected, and the sensor array may also be suitable for identifying highly complex mixtures. The colorimetric sensor output can be read by a spectrometer or even by the naked eye [124, 125]; moreover, many of these sensors can be easily fabricated and printed on various substrates. Owing to these advantages, colorimetric sensors have been used in lung cancer breath testing. However, the sensitivity of colorimetric sensors is often relatively low, in the parts per million volume (ppmv) range for many VOCs. Most of the indicators are not reversible and not suitable for humid air.

In 2012, Mazzone et al. developed a colorimetric sensor array to analyze exhaled breath for the identification of lung cancer. The exhaled breath samples from 92 patients with lung cancer and from 137 controls were analyzed by the disposable colorimetric sensor array. The array in this study applied a diverse range of chemically responsive dyes, which can change their colors as a result of dye–analyte interactions. These dyes can be classified into three categories: dyes containing metal ions which can respond to Lewis basicity, dyes with large permanent dipoles which can respond to local polarity, and pH indicators which can respond to proton acidity and hydrogen bonding. Therefore, this method can produce high-dimensional data with various color changes, which can provide facile discrimination for complex gas mixture samples. The sensitivity of the array used in this study varied with the specific compound, and many VOCs could be detected in the range of parts per million. According to the color changes, logistic prediction models, incorporating age, sex, smoking history, and COPD, were developed and statistically validated. This array is reported to be capable of identifying biosignatures of lung cancer from the exhaled breath, and the accuracy can be optimized by combining clinical risk factors and by evaluating specific histologies [125].
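A colorimetric array turns dye color changes into a feature vector, and a logistic model then maps those features to a probability. The Python sketch below shows that pipeline in its simplest form. It is an illustration only, not the model of Ref. [125]: the RGB representation of each spot, the function names, and any weights passed in are assumptions, and real coefficients would have to be fitted to clinical data.

```python
import math


def color_difference(before, after):
    """Flatten per-spot RGB changes (after minus before) into one vector.

    `before` and `after` are lists of (R, G, B) tuples, one per dye spot.
    """
    if len(before) != len(after):
        raise ValueError("before and after must list the same spots")
    return [
        a - b
        for spot_before, spot_after in zip(before, after)
        for b, a in zip(spot_before, spot_after)
    ]


def logistic_probability(features, weights, intercept):
    """Evaluate a logistic model: p = 1 / (1 + exp(-(b0 + sum(w * x))))."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

Clinical covariates such as age or smoking history could be appended to the color-difference vector before calling `logistic_probability`, which mirrors the way the reported models combined sensor output with clinical risk factors.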

In 2014, Oh et al. introduced a new idea for the detection of gases and developed colorimetric sensors based on a genetically engineered virus (M13 phage) (Fig. 10). These sensors mimic the collagen structures in turkey skin and are composed of phage-bundle nanostructures. When the sensors are exposed to various volatile organic chemicals, these structures swell rapidly and undergo viewing-angle-independent color changes. Large-area multicolor sensing matrices are cheap and easy to fabricate for this method, because the matrices are made of virus through a one-step self-assembly process. According to this report, the sensor array can detect several VOCs, including isopropyl alcohol, which has been reported as a biomarker for lung cancer (Table 1). The most intriguing point is that the function of the phage matrices can be tailored by evolving the virus for specific target molecules and by incorporating target recognition motifs through genetic engineering. Thus, these sensitive and selective virus-based colorimetric sensing matrices may have great potential in developing rapid, portable, and simple VOC sensing devices for cancer diagnosis from exhaled breath [126].

Fig. 10

Genetically engineered virus-based colorimetric sensors composed of phage-bundle nanostructures. a Phages genetically engineered to recognize target molecules and self-assemble into colored matrices composed of quasi-ordered bundled structures. b The chemical stimuli cause color shifts due to structural changes such as bundle spacing (d1 and d2) and coherent scattering. The target molecules can be identified in a selective and sensitive manner by using an iPhone and homemade software. c Photographs of the sensors after exposure to hexane, diethyl ether, isopropyl alcohol, ethanol, methanol, and deionized water, respectively. Reprinted from Ref. [126] with permission of Nature Publishing Group

Metal–organic frameworks (MOFs)

Metal–organic frameworks (MOFs), also called porous coordination polymers, have many advantages over conventional inorganic porous materials, because their structures and functions can be designed and readily modulated [127]. Owing to their unique characteristics, MOFs have been reported for a wide range of applications in gas storage, separation, catalysis, photonics, and drug delivery [78, 128–130]. MOFs are crystalline hybrid coordination polymers with metal ions or clusters as nodes, and organic ligands as linkers [131–133]. As a result of their hybrid structures which can offer tunable fluorescence [134–136], MOFs have demonstrated huge potential in probing VOCs.

In 2014, Zhang et al. reported a responsive turn-on fluorescent MOF based on the aggregation-induced emission (AIE) mechanism, built from Zn4O-like secondary building units and a special angular ligand, 4,4′-(2,2-diphenylethene-1,1-diyl)dibenzoic acid (DPEB) (Fig. 11) [131]. DPEB contains partially fixed tetraphenylethene (TPE) units and bears two freely rotating phenyl rings which can spread along the wide channels of the staggered framework. The special DPEB and MOF structures play a crucial role in the responsive fluorescence upon interactions with VOCs. The motion of the two dangling rings is restricted when the framework interacts with various VOCs, which produces the responsive turn-on fluorescence [131]. Since the MOF sensor reported in this study can detect VOCs including cyclohexane, benzene, toluene, and p-xylene, which were previously reported as cancer biomarkers, this method may provide a new way of developing cancer diagnostic sensors.

Fig. 11

Responsive turn-on fluorescent MOFs. a Chemical structure of DPEB. b Hydrogen bonding interactions (azure dotted lines) and C–H···π interactions (blue dotted lines) of DPEB. c, d Secondary building units of NUS-1 and NUS-1a, respectively (black C, red O, azure Zn). e C−H···π interactions between H and adjacent phenyl ring centroids. f Crystal structure viewed along the [010] direction; red and blue represent two neighboring layers, and yellow capsules represent hollow channels. g Crystal structure viewed along the [001] direction. Reprinted from Ref. [131] with permission of the American Chemical Society

In 2014, Dong et al. also synthesized a luminescent MOF based on cadmium nanotube channels bridged by an (E)-4-(2-carboxyvinyl)benzoic acid (H2L) ligand, and further developed a dye@MOF sensor by loading Rhodamine B molecules into the pores (Fig. 12) [127]. This sensor was named Rho@CZJ-3. In this system, Rhodamine B emits red light around 595 nm upon excitation at 340 nm, and the L ligand emits blue light around 420 nm. This platform may probe various VOCs, as it showed a good fingerprint correlation between the VOCs and the emission peak-height ratio of the ligand to the dye moieties. A mechanism was suggested in which the emission of the Rhodamine B dye moiety is mainly sensitized by the L moiety within the same framework; thus, the interaction between Rho@CZJ-3 and VOC molecules may tune the energy transfer efficiency between the excited state of the L ligand and the Rhodamine B moieties. This dye@MOF sensor was reported to be self-calibrating, stable, and instantaneous, and was suggested as a promising luminescent platform with wide applications [127]. In this study, various VOCs which were previously reported as cancer biomarkers could be measured, including acetone, acetophenone, phenol, p-xylene, benzene, toluene, and ethylbenzene (Table 1). Therefore, this dye@MOF sensor is worthy of further development for cancer diagnosis.
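The self-calibrating character of a ratiometric readout can be shown with a short Python sketch: dividing the ligand peak height by the dye peak height cancels any factor that scales both peaks equally, such as excitation intensity. The reference ratios, the function names, and the nearest-ratio lookup below are hypothetical and are not taken from Ref. [127].

```python
# Hypothetical fingerprints: ligand-to-dye emission peak-height ratios for a
# few vapors. Illustrative numbers only, not values from Ref. [127].
REFERENCE_RATIOS = {
    "acetone": 0.40,
    "benzene": 1.10,
    "toluene": 1.60,
    "p-xylene": 2.30,
}


def peak_ratio(ligand_peak, dye_peak):
    """Return the ligand (about 420 nm) to dye (about 595 nm) peak-height ratio."""
    if dye_peak <= 0:
        raise ValueError("dye peak height must be positive")
    return ligand_peak / dye_peak


def identify(ligand_peak, dye_peak, references=REFERENCE_RATIOS):
    """Return the reference vapor whose fingerprint ratio is closest."""
    ratio = peak_ratio(ligand_peak, dye_peak)
    return min(references, key=lambda name: abs(references[name] - ratio))
```

In this sketch, peak heights of (3.2, 2.0) and (32.0, 20.0) give the same ratio and therefore the same assignment, which is the sense in which a ratio-based signal does not depend on the absolute emission intensity.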

Fig. 12

The luminescent dye@MOF sensor. a Structure of (E)-4-(2-carboxyvinyl)benzoic acid (L) and b coordination mode of L in CZJ-3. c Side view of the partial nanotube wall in CZJ-3. d Perspective view of the 3D framework structure of CZJ-3. e Emission peak heights of L (dark bars) and dye (light bars) moieties. f Emission peak-height ratios between L and dye moieties in Rho@CZJ-3-f. Reprinted from Ref. [127] with permission of John Wiley and Sons

Silicon nanowire field-effect transistors (SiNW FETs)

Silicon nanowire (SiNW) field-effect transistor (FET)-based sensors are reported as promising candidates for VOC detection [82, 137–142]. This approach is based on molecularly modified SiNW FETs that supply a collection of independent features, each responding differently to various VOCs. Compared with other sensing strategies, SiNW FETs offer several advantages, including low power consumption, extreme miniaturization of the device dimensions, detection of VOCs at low ppb concentrations, multiple parameters in one test, and the ability to control the sensing signals by varying the gate voltage. Several studies in recent years have sought to control the interactions between VOCs and SiNW FETs and to improve the sensitivity of the device. In 2014, Wang et al. reported a method which can selectively detect 11 VOCs with high accuracy, including octane and decane, two previously reported cancer breath biomarkers, and can estimate the VOC concentrations in both single-component and multicomponent mixtures [137]. This method is based on the use of a specific molecularly modified SiNW FET device (Fig. 13). The structural properties of the modifications are crucial to selective detection. The multiple independent parameters of this SiNW FET device, including threshold voltage, hole mobility, and subthreshold swing, were applied as inputs for artificial neural network (ANN) models to provide targeted detection. This method combines SiNW FETs with ANNs, and it may have great potential in real-world applications.
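One of the device parameters named above, the subthreshold swing, has a standard definition: the change in gate voltage needed to change the drain current by one decade. The Python sketch below estimates it from transfer-curve points by a least-squares fit of log10|Id| against Vg. The function name, the choice of fitting all supplied points, and the synthetic curve in the usage note are assumptions made for illustration; they do not describe the extraction procedure of Ref. [137].

```python
import math


def subthreshold_swing(gate_voltages, drain_currents):
    """Estimate the subthreshold swing in mV per decade.

    Fits log10(|Id|) against Vg by ordinary least squares and inverts the
    slope. The caller is assumed to pass only points from the subthreshold
    region, with nonzero currents and gate voltages in volts.
    """
    n = len(gate_voltages)
    if n != len(drain_currents) or n < 2:
        raise ValueError("need at least two matching (Vg, Id) points")
    logs = [math.log10(abs(current)) for current in drain_currents]
    mean_v = sum(gate_voltages) / n
    mean_log = sum(logs) / n
    sxx = sum((v - mean_v) ** 2 for v in gate_voltages)
    sxy = sum((v - mean_v) * (l - mean_log) for v, l in zip(gate_voltages, logs))
    if sxx == 0 or sxy == 0:
        raise ValueError("data do not define a finite, nonzero slope")
    slope = sxy / sxx  # decades of current per volt
    return 1000.0 / abs(slope)
```

For a synthetic p-type curve in which the current rises by one decade for every 0.1 V decrease in gate voltage, the function returns a value close to 100 mV per decade. A vector of such parameters, measured before and after exposure, is the kind of input that could then be passed to a classifier.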

Fig. 13

Scheme of the molecularly modified SiNW FET sensor. Reprinted from Ref. [137] with permission of the American Chemical Society

In 2015, Shehada et al. also reported an ultrasensitive SiNW FET, which was modified with trichloro(phenethyl)silane (TPS), for use in the diagnosis of gastric cancer from exhaled breath [82]. This TPS-SiNW FET sensor has a detection limit down to 5 ppb, and it can distinguish gastric cancer-related VOCs from environmental VOCs. The high selectivity with greater than 85 % accuracy was validated in a clinical study by using breath samples from gastric cancer patients and from healthy volunteers, although an increased sample size is still required to further confirm the results. This sensor has provided a simple, noninvasive, portable, and inexpensive way to diagnose and predict cancer [82].

Olfactory receptor (OR)-based sensors

The sensing of vapor odorants is widespread among living organisms, and the olfactory receptor (OR) gene family was reported to encode the most sophisticated protein-based chemical sensors in nature [138]. An animal may have approximately 100 to 1000 functional OR proteins, and each OR protein can recognize multiple ligands in an overlapping pattern [139]. During olfactory sensing, the vapor odorant molecules first diffuse and penetrate into a thin layer of olfactory mucus or lymph which covers the surface of peripheral receptor neurons. The odorant molecules then bind to the ORs located on the surface of OR neurons, which triggers neural electrical activity and signal transmission to the higher nervous system. To apply the powerful OR proteins in biomedical and environmental sensing, much work has been done to develop artificial OR-based biosensors [140–142]. In 2014, Sato and Takeuchi built a functional OR expression platform and developed OR-based sensors by using gene expression techniques and bioinspired electrophysiological techniques, and successfully measured the olfactory response of the OR sensors to VOCs (Fig. 14) [138]. They reconstituted insect OR proteins into human embryonic kidney cells (HEK293T), because insect OR proteins consist of odor-gated ion channels which can convert odorant signals into cation currents. They then used these OR-expressing cells to produce spheroids by applying microfluidic techniques. To mimic the interface between olfactory mucus and ORs, and to protect the cells from drying, the formed spheroids were integrated into a hydrogel microchamber system. When these insect OR-expressing spheroids were stimulated with chemical vapors, such as benzaldehyde, 2-methylphenol, and pentyl acetate, a negative extracellular field potential shift was observed and recorded, which suggests the efficiency and reliability of the sensors. This method may be very useful in the development of OR-based VOC sensing techniques, and it may provide powerful tools for the identification of VOC receptors. As benzaldehyde has been reported as an exhaled breath biomarker for detecting lung cancer, this OR-based VOC sensor may be worthy of further study for cancer diagnosis.

Fig. 14

An extracellular field potential recording of the olfactory response of OR-expressing spheroids to vapor-phase odorant stimulation. a Experimental procedure. b Principle of extracellular field potential shift evoked by odorants. Reprinted from Ref. [138] with permission of John Wiley and Sons

Conclusion and future perspectives

Analyzing exhaled breath for cancer diagnosis is promising, mainly because the breath samples can be collected simply, safely, and frequently. This review summarized the principle behind the exhaled-breath VOC analysis, as well as the techniques applied during the sample collection, preconcentration, and detection. Among the detection methods, GC–MS is currently recognized as the gold standard, and various sensor-based techniques have been developed. The exhaled VOCs identified as cancer-related biomarkers by these methods thus far were also listed in this review.

For clinical point-of-care use and population-wide screening, an ideal tool for breath VOC testing and cancer diagnosis should be cheap, fast, portable, reusable, easy to use, tailorable to different types of disease, and compatible with a range of temperature and humidity conditions, and it should also have high sensitivity and high specificity. Future development involves not only the innovation or combination of advanced techniques for VOC sampling, detection, and analysis but also the validation and standardization of these methods for real-world clinical use.