Key Points
  • US is useful for the evaluation of thyroid, salivary gland, and nodal disease. US-guided FNA is the most accurate method for nodal staging. Newer US elastography technology is promising for characterization of malignant thyroid nodules and lymph nodes.

  • Contrast-enhanced CT is the best imaging modality for anatomic definition of tumors in the oropharynx, hypopharynx, and larynx. Multidetector CT technology minimizes artifact and allows for easy multiplanar visualization of tumors. Newer dual energy CT technology provides better contrast and differentiation of tissues.

  • Multisequence contrast-enhanced MR provides superior soft tissue detail for tumors in the nasopharynx, oral cavity, sinonasal cavity, and salivary glands. Higher soft tissue contrast in MR provides a window into bone marrow, vascular, and perineural infiltration.

  • Functional MRI can provide noninvasive biomarkers for assessing tumor histology, aggressiveness, and prediction of clinical outcomes.

  • Metabolic imaging with FDG PET/CT is vital for staging, treatment monitoring, and surveillance in advanced head and neck squamous cell carcinoma (HNSCC), is excellent at identifying unknown primary tumor sites, and identifies tumor recurrence earlier than conventional imaging.

Introduction

The anatomic and functional complexity of the head and neck region makes accurate diagnosis challenging. Head and neck cancer management is largely dependent on the locoregional anatomic extension and distant spread of tumors. Imaging provides critical detail that allows for cancer staging in accordance with the tumor-node-metastasis (TNM) system, treatment selection, and prognosis. While mucosal lesions are readily detected by clinical examination, submucosal extension and regional and distant spread cannot be assessed solely by clinical examination and require imaging.

Imaging for head and neck cancers follows a multimodal approach incorporating ultrasonography (US), computed tomography (CT), magnetic resonance imaging (MRI), and combined positron emission tomography and CT (PET/CT) to assess the anatomic and functional status of disease [1]. Modality selection and utilization are tailored to the organ of interest and to whether imaging is acquired for diagnosis, staging, treatment planning, and/or surveillance.

While a thorough review of imaging in head and neck cancers is outside of the scope of this text, this chapter will explore state-of-the-art imaging techniques, the latest evidence, and recent advances in the application of anatomic, functional, and metabolic imaging to head and neck cancers.

Ultrasound

In head and neck cancers, ultrasonography is most often used in the characterization of superficial primary and nodal disease, including evaluation of thyroid nodules, local spread of thyroid cancer, salivary gland neoplasms, and neck lymphadenopathy [2]. The primary advantages of US in comparison to other modalities are low cost and easy accessibility, rapid scan times, high spatial resolution, and absence of ionizing radiation. Disadvantages include the inability to assess deeper neck structures and operator dependence.

Ultrasound is the first-line imaging tool for the evaluation of thyroid nodules. Because of the high prevalence of thyroid nodules and the typically indolent nature of papillary thyroid cancers, non-evidence-based management of thyroid nodules has the potential to place significant burden on healthcare costs and increase patient anxiety without improving outcomes. Therefore, in 2015, a standardized Thyroid Imaging, Reporting, and Data System (TIRADS) was formalized to provide guidance on management of thyroid nodules based on sonographic features, allowing risk-stratified recommendations for follow-up and biopsy [3, 4]. The focus of TIRADS is identifying five categories of imaging features for risk stratification: nodule composition (solid or predominantly solid, cystic or predominantly cystic, and spongiform), echogenicity (hyperechoic, isoechoic, hypoechoic, very hypoechoic), shape (wider-than-tall or taller-than-wide), margins (smooth, ill-defined, lobulated or irregular, extra-thyroidal extension), and presence of echogenic foci (microcalcifications, rim calcifications, or punctate echogenic foci). Points are assigned based on the presence of these features, and recommendations for continued follow-up or tissue diagnosis are made based on the TIRADS level and maximum nodule diameter. Cystic and spongiform composition, hyperechoic echogenicity, wider-than-tall shape, smooth margins, and absence of microcalcifications or punctate echogenicities favor benign cytology. A recent meta-analysis of the accuracy of the TIRADS level 5 (highly suspicious) categorization demonstrated 70% sensitivity and 89% specificity.
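To see what the quoted sensitivity and specificity imply for an individual nodule, they can be converted into predictive values once a pretest probability is chosen. The following is a minimal sketch using Bayes' rule. The function name is hypothetical, and the 10% prevalence of malignancy is an illustrative assumption, not a figure from the meta-analysis.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and pretest probability."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity for TIRADS level 5 as quoted in the text.
# ASSUMPTION: 10% prevalence of malignancy, chosen only for illustration.
ppv, npv = predictive_values(0.70, 0.89, 0.10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under this assumed prevalence, the positive predictive value is roughly 0.41 and the negative predictive value roughly 0.96; both shift substantially if the pretest probability changes.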

Recent advances in piezoelectric crystal technology have enabled higher-frequency, wider-bandwidth sonographic imaging, which allows for better spatial and contrast resolution. For example, newer ultrasound probes can better assess thyroid and salivary gland masses for spiculated and infiltrative margins and thyroid, salivary, and nodal masses for extracapsular extension. With improvements in near-field resolution, US can play a role in detecting extralaryngeal disease in laryngeal neoplasms and in visualizing vocal cord movement [5]. Improved depth penetration has allowed transcervical evaluation of the oropharynx, enabling detection of small tumors [6]. These applications may be useful in patients for whom conventional CT and MR are not feasible.

Ultrasound elastography (USE) aims to characterize tissue based on its elasticity, operating on a principle similar to clinical palpation. Strain USE (sUSE) maps identify tissues that undergo mild deformation from the transducer during ultrasound interrogation, allowing qualitative and semi-quantitative assessment of tissue stiffness. Shear-wave elastography (SWE) allows for quantitative assessment of tissue elasticity by measuring the subtle tissue motion induced by acoustic impulses. Tissue stiffness is measured utilizing the parameters of elasticity score (ES), strain ratio (SR), and SWE indices. USE can increase the accuracy of conventional ultrasound for thyroid nodules [7]. Papillary thyroid carcinomas demonstrate greater stiffness than benign nodules (higher ES, SR, and SWE indices) [8, 9]. A recent meta-analysis of thyroid nodules evaluated using USE demonstrated sensitivities of 83% and 78% and specificities of 83% and 82% for sUSE and SWE, respectively [8]. At present, operator, patient, and tissue variability contributes to frequent false positives and false negatives, precluding widespread adoption of USE; however, USE can play an important role in further characterization of thyroid nodules that are indeterminate on conventional US. USE also shows promise in identifying malignant lymph nodes, which are characterized by increased stiffness. Reported accuracies vary widely, for example, ranging from 62% to 94% for one SWE technique [10].

US also provides real-time imaging guidance for fine needle aspiration cytology (FNAC) and core-needle biopsies (CNB) in the head and neck. US-guided FNAC is the most accurate nodal staging method in most head and neck cancers [11, 12]. However, US-guided CNB is the optimal biopsy technique when lymphoma is clinically suspected; the tissue volume allows for accurate histological and immunohistochemical analysis [13, 14]. It is preferred over excisional biopsies in high-risk patients, including those for whom anesthesia poses a risk or those who have a history of radiation therapy or surgery in the neck due to increased wound healing complications [15]. Additionally, there are growing indications for US-CNB, including identifying the human papillomavirus (HPV) status of head and neck squamous cell carcinoma (HNSCC) tumors [10].

Lastly, there are several emerging applications of ultrasound for head and neck cancer. Transoral ultrasound can assist with tumor staging and anatomical delineation for oral cavity and oropharyngeal masses; tumor thickness in oral cancer is a predictor of nodal metastasis [16]. Contrast-enhanced ultrasound (CEUS) has yet to find a useful role in the management of head and neck cancer but has shown some utility in differentiating benign from malignant lymphadenopathy [17]. Studies are ongoing to assess the potential value of CEUS in diagnosis and staging of head and neck cancers.

Computed Tomography and Magnetic Resonance Imaging

Computed tomography (CT) and magnetic resonance (MR) cross-sectional imaging are invaluable in pretherapeutic staging and treatment planning for head and neck cancers. The two modalities provide complementary information about the location and extent of tumors, the relationship of tumors to surrounding structures, and nodal involvement, allowing for accurate staging, treatment planning, and prognostication. Cross-sectional imaging is also routinely used in treatment monitoring and surveillance.

CT

CT is the most widely used imaging technique for head and neck neoplasms and is most commonly used to evaluate palpable neck masses and lesions within the oropharynx, hypopharynx, and larynx. CT can determine tumor extent and size, identify nodal disease, assess treatment response, and identify recurrence. Compared to MRI, CT has several advantages. Operationally, it is more widely available, costs less, is faster, and is easily reproducible. Reduced scan time is particularly important in the head and neck region, which is heavily susceptible to respiratory and swallowing motion artifact; shorter scans improve accuracy of interpretation and the patient experience. Diagnostically, CT provides superior bone detail and intratumoral calcium detection, and multiplanar reformatting simplifies interpretation. However, disadvantages of CT include inferior soft tissue contrast resolution compared with MR, the need for iodinated contrast (which should be avoided in treatment-naïve thyroid cancer), ionizing radiation exposure, and dental artifacts.

The major development in CT over the past two decades has been the introduction and refinement of multidetector spiral CT (MDCT), which enables rapid acquisition of volumetric data that can be reconstructed into multiple planes to optimize the signal to noise ratio. MDCT results in reduced scan time and patient radiation dose. Patients are scanned with the neck in slight extension during quiet respiration. Slice thicknesses of 0.6–1.25 mm are generally used. Three-dimensional display of volumetric data is also possible with this technique and is primarily used in the setting of surgical planning and virtual endoscopic visualization of tumors. Dynamic maneuvers can be used to improve visualization of certain anatomic structures. For example, a modified Valsalva maneuver dilates the hypopharynx and accentuates the pyriform sinuses and postcricoid region to assess lesions that are otherwise obscured by apposition of mucosal surfaces. Phonation can improve visualization of small lesions in the vocal cords. Open mouth instruction can allow visualization of lesions otherwise obscured by dental artifact [18].

Intravenous iodinated contrast is critical to provide contrast between soft tissue, vasculature, and pathology. A single bolus of 80–100 cc of contrast injected at 1–2 cc/s suffices. The main contraindication to intravenous iodinated contrast is severe renal failure (eGFR <30 mL/min/1.73 m2); however, measures to mitigate the risk of contrast in these patients can be undertaken in cases where imaging is absolutely necessary.

The median radiation dose for multidetector CT scans of the neck is 3.9 mSv [19]. For reference, the average radiation dose that a person living in the United States receives annually is 6.2 mSv [20].

Dual Energy CT

Dual energy CT (DECT) is a newer technique that offers further differentiation of tissue based on composition. In conventional CT, the attenuation of tissues with differing elemental compositions can be similar; for example, iodine and calcium have overlapping CT densities, so it can be difficult to differentiate vascular calcification from vascular contrast. The linear attenuation coefficient of a given CT voxel is related not only to material composition but also to the photon beam energy and mass density of the material [21]. With DECT, different energy spectra can be utilized to differentiate and quantify material composition. For example, iodine and bone have similar linear attenuation coefficients at 100 keV; however, simultaneously obtaining additional images at 50 keV can differentiate the two. Newer DECT protocols operate without increasing radiation dose to the patient. The standard display of DECT images uses virtual monochromatic maps that simulate a CT obtained at a single energy. There are several promising applications of DECT, including the utilization of lower energy spectra to accentuate iodine contrast enhancement, basis material decomposition maps to label the concentration of iodine and other material within tissues, generation of virtual noncontrast images from contrast-enhanced images, and generation of virtual noncalcium maps to assess bone marrow edema. DECT has been used in the diagnosis and staging of head and neck cancers. HNSCC demonstrates improved visibility on virtual monochromatic maps reconstructed at energies lower than 65–70 keV [22]. Similarly, low energy virtual monochromatic maps and iodine maps have been used to differentiate nonossified thyroid cartilage from thyroid cartilage tumor invasion, recurrent tumor from posttreatment change, and malignant from benign lymph nodes [23,24,25].
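The idea behind basis material decomposition can be illustrated with a two-material, two-energy linear system. The sketch below is a simplified illustration only: it assumes that voxel attenuation is a linear mix of two basis materials, the function name is hypothetical, and the attenuation numbers are made-up values in arbitrary units (not measured coefficients) chosen so that the two materials look identical at the higher energy.

```python
def decompose_two_materials(mu_low, mu_high, basis_a, basis_b):
    """Solve mu(E) = x_a * basis_a(E) + x_b * basis_b(E) at two energies.

    basis_a and basis_b are (low_energy, high_energy) attenuation values
    per unit concentration of each basis material.
    """
    a_low, a_high = basis_a
    b_low, b_high = basis_b
    det = a_low * b_high - a_high * b_low
    if abs(det) < 1e-12:
        raise ValueError("basis materials are not separable at these energies")
    # Cramer's rule for the 2 x 2 system
    x_a = (mu_low * b_high - mu_high * b_low) / det
    x_b = (a_low * mu_high - a_high * mu_low) / det
    return x_a, x_b


# HYPOTHETICAL attenuation per unit concentration (arbitrary units).
iodine = (5.0, 2.0)    # (low energy, high energy)
calcium = (3.0, 2.0)

# Two voxels with the same high-energy attenuation (2.0) but different
# low-energy attenuation, so only the second measurement separates them.
voxel_1 = decompose_two_materials(5.0, 2.0, iodine, calcium)
voxel_2 = decompose_two_materials(3.0, 2.0, iodine, calcium)
print(voxel_1, voxel_2)
```

With these illustrative numbers, the first voxel resolves to pure iodine and the second to pure calcium, even though a single high-energy measurement could not tell them apart.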

MRI

MRI is often the imaging modality of choice in regions that benefit from high tissue contrast such as the nasopharynx, oropharynx and oral cavity, sinonasal cavity, and salivary glands. It is particularly important in treatment planning prior to radiation therapy, allowing for precise delineation of radiation fields to spare surrounding structures. MRI is excellent at assessing soft tissue tumor infiltration, bone marrow infiltration, perineural spread, vascular invasion, and nodal disease [26]. Although MRI provides superior soft tissue contrast resolution and avoids ionizing radiation, longer scan times can lead to excessive patient motion and discomfort, and costs are higher. Use is also limited in patients with ferromagnetic implants, claustrophobia, and renal insufficiency, the latter due to the risk of nephrogenic systemic fibrosis.

Anatomical MR images are acquired using standard imaging protocols including fat-saturated T2-weighted, precontrast T1-weighted, and postcontrast T1-weighted sequences in multiple planes. High T1 signal can derive from fat, methemoglobin, melanin, proteinaceous fluid, and some paramagnetic substances. Low T1 signal derives from air, fluid collections, calcifications, scar tissue/fibrosis, and vascular flow voids. T1-weighted images benefit from the increased conspicuity of fat, thereby readily visualizing low signal tumoral tissue that infiltrates or effaces fat planes. Some head and neck lesions demonstrate characteristic signal on T2; for example, fibrous tissue demonstrates low T2 signal, and fluid collections and edema demonstrate high T2 signal. High T2 signal additionally derives from deoxyhemoglobin and sometimes fat. Low T2 signal additionally derives from calcification/mineralization, hemosiderin, paramagnetic substances, and vascular flow voids. Dynamic contrast-enhanced (DCE)-MRI utilizes gadolinium-based contrast agent (CA) administered intravenously that enhances the tissue water protons relaxation rate constant (R1 = 1/T1) [27]. Diffusion-weighted imaging (DWI-MRI) is used to identify areas of restricted diffusion as can be seen in hypercellular tumors and pathologic lymph nodes [28] on the basis of diffusion of water molecules in tissue.

Contraindications to MRI include patients with cardiac implantable electronic devices (although newer MR safe devices are available), metallic intraocular foreign bodies, implantable neurostimulators, cochlear implants, drug infusion pumps, and cerebral aneurysm clips. There are several additional relative contraindications including coronary stents, programmable shunts, intrauterine devices, and IVC filters. The reader is referred to MRIsafety.com for details on individual devices and scenarios. Contraindications to gadolinium contrast administration include severe renal failure (eGFR <30 mL/min/1.73 m2) and pregnancy [29].

Functional Imaging

CT and MRI are optimal for delineating tumor anatomy; however, histopathological identification, detection of small nodal metastases, distinguishing posttreatment change from residual or recurrent tumor, and assessment of treatment response remain major challenges for cross-sectional imaging. Over the past decade, functional MRI techniques including DW- and DCE-MRI have come to the forefront in head and neck cancers. These techniques enable both qualitative and quantitative evaluation of the functional status of tumors and posttreatment response. These emerging technologies will play a growing role in histopathological identification, predicting treatment response to chemotherapy and radiation, identifying recurrent disease, treatment monitoring, and surveillance.

DW-MRI

The diffusion-weighted magnetic resonance imaging (DW-MRI) technique generates the signal contrast by capturing random motion (i.e., Brownian motion) of water molecules in tissue [30]. The development of DW-MRI techniques has resulted in better diagnostic performance in detecting primary and recurrent head and neck (HN) cancer. The cell membranes, intracellular organelles, and macromolecules hinder and restrict water molecules’ movement in tissue. Tissue microstructural restrictions and microcirculation contribute to DW signal attenuation, reflecting abnormalities in tissue organization at the cellular level (e.g., tissue microstructure and cellularity) [31]. Therefore, quantitative DW images are used for lesion characterization, prognosis, and evaluation of treatment response.

DW-MRI techniques are rapidly evolving, with improvements in quality and speed of data acquisition [32]. Single-shot diffusion-weighted echo-planar imaging is commonly used for DWI data acquisition because of its short duration and good signal-to-noise ratio but is limited by lower resolution, especially in the neck region because of air–tissue interfaces. In addition, susceptibility variations across the neck cause local magnetic field inhomogeneities that can lead to image distortions [33].

The degree of water mobility in tissue can be quantified by calculating the apparent diffusion coefficient (ADC, [mm2/s]) from the DW signal acquired with at least two b-values. The DW signal as a function of the b-value can be fitted for each voxel in the image using the following monoexponential model (Eq. 6.1):

$$ {S}_b={S}_0{\mathrm{e}}^{-b\times \mathrm{ADC}} $$
(6.1)

where Sb and S0 denote signal intensities with and without diffusion weighting, respectively; b is the diffusion-sensitizing factor; ADC is a surrogate marker of tumor cellularity.
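As an illustration of how Eq. 6.1 is used in practice, the sketch below estimates the ADC for a single voxel in two ways: directly from two b-values, and by a log-linear least-squares fit over several b-values. The function names and the synthetic signal values are assumptions made for this example; clinical software typically adds noise handling and masking that are omitted here.

```python
import math


def adc_two_point(s_low, s_high, b_low, b_high):
    """ADC from two b-values: ADC = ln(S_low / S_high) / (b_high - b_low)."""
    return math.log(s_low / s_high) / (b_high - b_low)


def adc_loglinear(b_values, signals):
    """Least-squares fit of ln(S_b) = ln(S_0) - b * ADC; returns (S_0, ADC)."""
    n = len(b_values)
    y = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_y = sum(y) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (yi - mean_y) for b, yi in zip(b_values, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_b
    return math.exp(intercept), -slope


# Synthetic, noise-free voxel (ASSUMED values): S_0 = 1000, ADC = 1.0e-3 mm2/s
b_values = [0, 500, 1000]                       # s/mm2
true_s0, true_adc = 1000.0, 1.0e-3
signals = [true_s0 * math.exp(-b * true_adc) for b in b_values]

s0_fit, adc_fit = adc_loglinear(b_values, signals)
adc_2pt = adc_two_point(signals[0], signals[-1], b_values[0], b_values[-1])
print(f"fitted ADC = {adc_fit:.2e} mm2/s, two-point ADC = {adc_2pt:.2e} mm2/s")
```

On noise-free data both estimates recover the simulated ADC; with real data, the multi-point fit is generally less sensitive to noise in any single image.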

At higher b-values, the signal decay fits the multiexponential model in vivo. In contrast, at lower b-values (b < 100 s/mm2), the ADC is a composite metric influenced by tissue microperfusion and can lead to overestimation of the ADC. Thus, the choice of the optimal b-values will affect the signal in the DWI images and modify the ADC value. Le Bihan introduced the intravoxel incoherent motion (IVIM) model to account for thermally driven tissue diffusion and blood flow microcirculation of the randomly oriented capillaries [34]. Le Bihan hypothesized that signal attenuation due to perfusion is involved at low b-values (b ≤ 100 s/mm2), relating to the signal from the fraction of the perfusion space. In contrast, true diffusion signal attenuation dominates at higher b-values (b > 100 s/mm2) [34].

A biexponential model that describes the signal arising from the two-compartment tissue (i.e., intravascular and extravascular space) model as a function of b-values for the IVIM model without injection of CA is given by (Eq. 6.2) [34]:

$$ {S}_b={S}_0\left[f\ {\mathrm{e}}^{-b{D}^{\ast }}+\left(1-f\right)\ {\mathrm{e}}^{- bD}\right] $$
(6.2)

where D is the true diffusion coefficient, D* is the pseudo diffusion coefficient, and f is the perfusion fraction.
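One common way to apply Eq. 6.2 is a segmented fit that exploits the b-value separation described above. The sketch below is a simplified illustration under stated assumptions: the perfusion term is treated as negligible for b > 100 s/mm2, the function names and synthetic parameter values are hypothetical, and D* is not estimated (it would need a further nonlinear fit of the low b-value points).

```python
import math


def ivim_signal(b, s0, f, d_star, d):
    """Biexponential IVIM signal (Eq. 6.2)."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


def segmented_fit(b_values, signals, b_threshold=100.0):
    """Estimate f and D in two steps.

    Step 1: a log-linear fit of the points with b > b_threshold, where the
            perfusion term is assumed negligible, gives D and the
            extrapolated intercept S_0 * (1 - f).
    Step 2: comparing that intercept with the measured b = 0 signal gives f.
    """
    high = [(b, math.log(s)) for b, s in zip(b_values, signals) if b > b_threshold]
    n = len(high)
    mean_b = sum(b for b, _ in high) / n
    mean_y = sum(y for _, y in high) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in high)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in high)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_b
    s0 = signals[b_values.index(0)]
    f = 1.0 - math.exp(intercept) / s0
    return f, -slope


# Synthetic, noise-free voxel (ASSUMED values for illustration only)
b_values = [0, 10, 20, 50, 100, 200, 400, 600, 800, 1000]   # s/mm2
signals = [ivim_signal(b, 1000.0, 0.10, 20e-3, 1.0e-3) for b in b_values]

f_fit, d_fit = segmented_fit(b_values, signals)
print(f"f = {f_fit:.3f}, D = {d_fit:.2e} mm2/s")
```

The estimates are close to, but not exactly equal to, the simulated f and D, because a small residual perfusion signal remains just above the threshold; this is the usual trade-off of the segmented approach.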

The complex cellular structures and membranes of tissue alter the displacement of water molecules, so that diffusion deviates substantially from Gaussian behavior (non-Gaussian [NG]); this deviation is readily observable at high b-values (b > 100 s/mm2). Jansen et al. applied the concept of diffusion kurtosis imaging (DKI) to an oncological setting, in particular HNSCC [35].

The DW signal as a function of b-value is fitted to the DKI model as follows (Eq. 6.3) [36, 37]:

$$ S(b)={S}_0\ \left[{\mathrm{e}}^{-b\times {D}_{\mathrm{app}}+\frac{1}{6}{K}_{\mathrm{app}}\ {\left(b\times {D}_{\mathrm{app}}\right)}^2}\right] $$
(6.3)

where Dapp and Kapp are the apparent diffusion (mm2/s) and kurtosis (unitless) coefficients, respectively.
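Because the exponent in Eq. 6.3 is quadratic in b, Dapp and Kapp can be solved exactly from signals at b = 0 and two nonzero b-values. The sketch below shows this closed-form solution; the function names, the chosen b-values, and the synthetic tissue parameters are assumptions for illustration, and practical DKI fitting normally uses more b-values with a least-squares fit.

```python
import math


def dki_signal(b, s0, d_app, k_app):
    """DKI signal model (Eq. 6.3)."""
    return s0 * math.exp(-b * d_app + (k_app / 6.0) * (b * d_app) ** 2)


def dki_from_three_b_values(s0, b1, s1, b2, s2):
    """Solve Eq. 6.3 for D_app and K_app from signals at b = 0, b1, and b2.

    Dividing ln(S / S_0) by b gives a straight line in b:
        ln(S / S_0) / b = -D_app + (K_app * D_app**2 / 6) * b
    """
    y1 = math.log(s1 / s0) / b1
    y2 = math.log(s2 / s0) / b2
    slope = (y2 - y1) / (b2 - b1)      # equals K_app * D_app**2 / 6
    d_app = slope * b1 - y1            # negative of the line's intercept
    k_app = 6.0 * slope / d_app ** 2
    return d_app, k_app


# Synthetic, noise-free voxel (ASSUMED values): D_app = 1.0e-3 mm2/s, K_app = 0.8
s0 = 1000.0
s1 = dki_signal(1000.0, s0, 1.0e-3, 0.8)
s2 = dki_signal(2000.0, s0, 1.0e-3, 0.8)

d_app, k_app = dki_from_three_b_values(s0, 1000.0, s1, 2000.0, s2)
print(f"D_app = {d_app:.2e} mm2/s, K_app = {k_app:.2f}")
```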

Hindered and restricted diffusion can be described simultaneously by incorporating diffusion kurtosis into the IVIM model, yielding the non-Gaussian IVIM (NG-IVIM) model. The NG-IVIM model provides estimates of the quantitative metrics f, D, D*, and K. The DW signal vs. b-value is fitted to NG-IVIM as follows (Eq. 6.4) [34, 38]:

$$ S(b)={S}_0\ \left[f\ {\mathrm{e}}^{-b{D}^{\ast }}+\left(1-f\right)\ {\mathrm{e}}^{-b\times D+\frac{1}{6}K\ {\left(b\times D\right)}^2}\right] $$
(6.4)

where K is the kurtosis coefficient.
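The relationship between Eq. 6.4 and the simpler models can be checked numerically: setting K = 0 recovers the IVIM model (Eq. 6.2), setting f = 0 recovers the DKI model (Eq. 6.3), and setting both to zero recovers the monoexponential model (Eq. 6.1). The sketch below performs this check; the function name and the parameter values are assumptions chosen only for illustration.

```python
import math


def ng_ivim_signal(b, s0, f, d_star, d, k):
    """NG-IVIM signal (Eq. 6.4)."""
    perfusion = f * math.exp(-b * d_star)
    diffusion = (1.0 - f) * math.exp(-b * d + (k / 6.0) * (b * d) ** 2)
    return s0 * (perfusion + diffusion)


# HYPOTHETICAL parameter values for one voxel
b = 800.0                                   # s/mm2
s0, f, d_star, d, k = 1.0, 0.12, 15e-3, 0.9e-3, 0.7

# Reference signals computed directly from the simpler models
mono = s0 * math.exp(-b * d)                                             # Eq. 6.1
ivim = s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))   # Eq. 6.2
dki = s0 * math.exp(-b * d + (k / 6.0) * (b * d) ** 2)                   # Eq. 6.3

print(ng_ivim_signal(b, s0, 0.0, d_star, d, 0.0), mono)   # f = 0, K = 0
print(ng_ivim_signal(b, s0, f, d_star, d, 0.0), ivim)     # K = 0
print(ng_ivim_signal(b, s0, 0.0, d_star, d, k), dki)      # f = 0
```

Each printed pair should agree, which is a quick sanity check when implementing the nested models in a fitting pipeline.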

DCE-MRI

DCE-MRI acquires sequential images before, during, and after injection of a CA [39]. The DCE signal enhancement is associated with the spin-lattice or longitudinal relaxation time (T1) of water protons by a short-range interaction (nm). The DCE signal can be modeled either semi-quantitatively or quantitatively. The semi-quantitative analysis uses the signal enhancement curve to calculate the upslope or washout phase, an initial area under the curve (AUC), and time to peak [39]. In contrast, the quantitative analysis utilizes the commonly used Tofts pharmacokinetic model to estimate the quantitative metrics that characterize underlying tumor physiology such as perfusion and permeability [40]. Accurate quantification of the multiple kinetic parameters requires selection of an appropriate pharmacokinetic model [41,42,43]. The extended Tofts model (ETM) estimates the volume transfer constant of a CA, Ktrans (min−1), the volume fraction of the extravascular extracellular space (EES), ve, and the volume fraction of blood plasma space, vp. Ktrans represents plasma flow (Fp) when Fp ≪ PS and the permeability surface area product (PS) when Fp ≫ PS [41]. The flow-limited and permeability-limited conditions are seen in organs with leaky vasculature, such as the liver, and in tissue with a largely intact blood–brain barrier, respectively.
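The semi-quantitative descriptors mentioned above can be computed directly from a sampled enhancement curve. The sketch below is one possible implementation under stated assumptions: the function name, the 60 s window for the initial AUC, the definition of washout as the mean slope from peak to the last sample, and the example curve are all illustrative choices, since exact definitions vary between studies.

```python
def semi_quantitative_metrics(times, enhancement, auc_window):
    """Peak, time to peak, maximum upslope, washout slope, and initial AUC."""
    peak_index = max(range(len(enhancement)), key=lambda i: enhancement[i])
    peak = enhancement[peak_index]
    time_to_peak = times[peak_index] - times[0]

    # Steepest rise between consecutive samples before the peak
    upslope = max(
        ((enhancement[i + 1] - enhancement[i]) / (times[i + 1] - times[i])
         for i in range(peak_index)),
        default=0.0,
    )

    # Mean slope from the peak to the last sample (negative means washout)
    if peak_index < len(times) - 1:
        washout = (enhancement[-1] - peak) / (times[-1] - times[peak_index])
    else:
        washout = 0.0

    # Trapezoidal area under the curve within auc_window of the first sample
    auc = 0.0
    for i in range(len(times) - 1):
        if times[i + 1] - times[0] > auc_window:
            break
        auc += 0.5 * (enhancement[i] + enhancement[i + 1]) * (times[i + 1] - times[i])

    return {"peak": peak, "time_to_peak": time_to_peak,
            "upslope": upslope, "washout": washout, "initial_auc": auc}


# HYPOTHETICAL relative enhancement curve sampled every 10 s
times = [0, 10, 20, 30, 40, 50, 60]                  # seconds
enhancement = [0.0, 0.2, 0.8, 1.0, 0.9, 0.85, 0.8]   # arbitrary units

metrics = semi_quantitative_metrics(times, enhancement, auc_window=60)
print(metrics)
```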

A linear relationship is assumed between the change in longitudinal relaxation rate, ΔR1 (R1 = 1/T1), and the total amount of CA in the tissue, such that water exchange is in the fast exchange limit (FXL) [40]:

$$ {R}_1(t)={R}_{10}+{r}_1{C}_t(t)\to \Delta {R}_1(t)={r}_1{C}_t(t) $$
(6.5)

where t is time, R1(t) is the time course of tissue R1, R10 (s−1) is the precontrast R1, r1 (mM−1 s−1) is the longitudinal relaxivity of the CA, and Ct(t) is the tissue concentration of CA (mM). The longitudinal relaxivity is assumed to be constant and independent of the CA's location in the tissue. It has been reported that relaxivity is dependent on CA macromolecular content [44, 45].
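Equation 6.5 can be rearranged to convert measured T1 values into a tissue CA concentration, Ct = (1/T1 − 1/T10)/r1. The short sketch below shows the arithmetic; the function name is hypothetical, and the T1 values and the relaxivity of 4.0 mM−1 s−1 are assumed numbers for illustration, since the actual relaxivity depends on the agent and field strength.

```python
def concentration_from_t1(t1_post, t1_pre, relaxivity):
    """Tissue CA concentration (mM) from Eq. 6.5 under the fast exchange limit.

    t1_post and t1_pre are in seconds; relaxivity is in mM^-1 s^-1.
    """
    delta_r1 = 1.0 / t1_post - 1.0 / t1_pre    # change in R1, s^-1
    return delta_r1 / relaxivity


# ASSUMED example values: T1 shortens from 1.0 s to 0.5 s after contrast
ct = concentration_from_t1(t1_post=0.5, t1_pre=1.0, relaxivity=4.0)
print(f"C_t = {ct:.2f} mM")
```

Here ΔR1 is 1.0 s−1, so the estimated concentration is 0.25 mM.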

The relaxation rate constant of the EES for FXL is given by:

$$ {R}_{1\mathrm{e}}(t)={R}_{10\mathrm{e}}+{r}_1{C}_{\mathrm{e}}(t) $$
(6.6)

where R1e(t) is the time course of EES R1, R10e (s−1) is the precontrast R1e, and Ce(t) is the EES concentration of CA (mM).

The ETM assumes that the CA exchanges between the vascular space and EES. The total tissue CA concentration is given by [40] (Eq. 6.7):

$$ {C}_t(t)={K}^{\mathrm{trans}}{\int}_0^t{C}_{\mathrm{p}}\left(\tau \right)\ {\mathrm{e}}^{-{k}_{\mathrm{ep}}\left(t-\tau \right)}\mathrm{d}\tau +{v}_{\mathrm{p}}{C}_{\mathrm{p}}(t) $$
(6.7)

where kep = Ktrans/ve is the rate constant of CA transport from the EES back to the vascular space, ve is the volume fraction of EES, and Cp is the concentration–time course of contrast agent in the blood plasma, known as the arterial input function.

For the standard Tofts model (TM), the tissue CA concentration can be readily obtained from Eq. 6.7 for a weakly vascularized tissue (i.e., vp ~ 0) [40]:

$$ {C}_t(t)={K}^{\mathrm{trans}}{\int}_0^t{C}_{\mathrm{p}}\left(\tau \right)\ {\mathrm{e}}^{-{k}_{ep}\left(t-\tau \right)}\mathrm{d}\tau $$
(6.8)
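The convolution integrals in Eqs. 6.7 and 6.8 are usually evaluated numerically on the sampled time points. The sketch below is a minimal forward model using the trapezoidal rule; the function name, the constant plasma concentration used as a test input, and the parameter values are assumptions for illustration. A constant input is used only because it has a simple closed-form solution to compare against; real arterial input functions are measured or taken from population models.

```python
import math


def extended_tofts(times, cp, ktrans, ve, vp=0.0):
    """Tissue concentration from Eq. 6.7 (or Eq. 6.8 when vp = 0).

    times are in minutes, cp is the plasma concentration at each time point,
    and ktrans is in min^-1. The integral uses the trapezoidal rule.
    """
    kep = ktrans / ve
    ct = []
    for i, t in enumerate(times):
        integral = 0.0
        for j in range(1, i + 1):
            dt = times[j] - times[j - 1]
            left = cp[j - 1] * math.exp(-kep * (t - times[j - 1]))
            right = cp[j] * math.exp(-kep * (t - times[j]))
            integral += 0.5 * (left + right) * dt
        ct.append(ktrans * integral + vp * cp[i])
    return ct


# ASSUMED test case: constant plasma concentration of 1 mM for 5 minutes
times = [i * 0.01 for i in range(501)]        # minutes
cp = [1.0] * len(times)                       # mM
ktrans, ve, vp = 0.2, 0.4, 0.05

ct_etm = extended_tofts(times, cp, ktrans, ve, vp)   # Eq. 6.7
ct_tm = extended_tofts(times, cp, ktrans, ve)        # Eq. 6.8 (vp = 0)

# Closed-form solution for a constant input: ve * (1 - exp(-kep * t)) + vp
analytic = ve * (1.0 - math.exp(-(ktrans / ve) * times[-1])) + vp
print(ct_etm[-1], analytic, ct_tm[-1])
```

In a fitting workflow, this forward model would be evaluated repeatedly inside an optimizer that adjusts Ktrans, ve, and vp to match the measured concentration curve.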

T1w DCE-MRI accounts for equilibrium water exchange across the vascular wall (between intravascular space and extravascular space) and the cellular wall (between intracellular space [ICS] and EES). Notably, CAs do not enter the cell. Therefore, the relation between the relaxation rate constant and CA concentration is not so straightforward. The intercompartmental equilibrium water exchange kinetics can be described by a linear three-site two-exchange (3S2X) model for the longitudinal relaxation rate of tissue water protons, which is adapted from the Bloch–McConnell equations [46, 47]. The full three-compartment model has five parameters, including Ktrans, ve, vp, and the two rate constants of water exchange across the vascular endothelium and the cellular wall [48]. The solution of the Bloch–McConnell equations for the two-site water exchange model (i.e., ICS and EES) yields two eigenvalues of a biexponential signal, representing the two longitudinal relaxation rate constants that provide estimates of three parameters, Ktrans, ve, and the mean lifetime of intracellular water protons, τi [49]. One of the eigenvalues represents the longitudinal relaxation rate constant for the fast exchange regime (FXR) model, also called the shutter speed model (SSM). The observable R1t(t) for the FXR is given by [49]:

$$ {R}_{1t}(t)=\frac{1}{2}\left[\left({R}_{1\mathrm{i}}+{k}_{ie}+{R}_{1\mathrm{e}}+{k}_{ei}\right)-\sqrt{{\left({R}_{1\mathrm{i}}+{k}_{ie}-{R}_{1\mathrm{e}}-{k}_{ei}\right)}^2+4\ {k}_{ie}\ {k}_{ei}}\right] $$
(6.9)

where R1i and R1e are the relaxation rates of ICS and EES, and kie (kie = 1/τi) and kei are the rates of water exchange from the ICS to the EES and vice versa.
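Equation 6.9 can be evaluated directly once the exchange rates are specified. The sketch below combines Eq. 6.6 with Eq. 6.9 and checks the two limiting cases. It makes assumptions that are not stated in the text: kei is obtained from kie by detailed balance with the plasma fraction neglected, kei = kie (1 − ve)/ve, and the function name and numerical values are hypothetical.

```python
import math


def fxr_r1(r1i, r10e, relaxivity, ce, tau_i, ve):
    """Observable tissue R1 in the fast exchange regime (Eq. 6.9)."""
    r1e = r10e + relaxivity * ce           # EES relaxation rate, Eq. 6.6
    kie = 1.0 / tau_i                      # ICS -> EES exchange rate
    # ASSUMPTION: detailed balance with vp neglected, so the ICS fraction is 1 - ve
    kei = kie * (1.0 - ve) / ve            # EES -> ICS exchange rate
    total = r1i + kie + r1e + kei
    diff = r1i + kie - r1e - kei
    return 0.5 * (total - math.sqrt(diff ** 2 + 4.0 * kie * kei))


# HYPOTHETICAL values: R1i = R10e = 0.7 s^-1, r1 = 4.0 mM^-1 s^-1, Ce = 0.325 mM
params = dict(r1i=0.7, r10e=0.7, relaxivity=4.0, ce=0.325, ve=0.3)

# Very short tau_i (fast exchange limit): population-weighted average of R1i and R1e
fast = fxr_r1(tau_i=1e-6, **params)
# Very long tau_i (no exchange): the slower of the two relaxation rates
slow = fxr_r1(tau_i=1e9, **params)
# An intermediate, physiologically plausible tau_i lies between the two limits
mid = fxr_r1(tau_i=0.5, **params)
print(fast, mid, slow)
```

With these numbers R1e is 2.0 s−1, so the fast exchange limit approaches 0.7 × 0.7 + 0.3 × 2.0 = 1.09 s−1, while the no-exchange limit approaches 0.7 s−1. The intermediate case shows how finite τi reduces the observed relaxation rate relative to the fast exchange limit assumed in Eq. 6.5.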

Molecular Spectroscopy

Magnetic resonance spectroscopy (MRS), including both phosphorus-31 (31P) and proton (1H), characterizes tumor tissue metabolism at a cellular level [50]. MRS has the unique ability to assess tumor physiology at the molecular level by evaluating the presence of specific metabolites [51]. Phosphorus MRS (31P MRS) is used to assess tissue bioenergetics and metabolism of membrane phospholipids [52]. In contrast, proton MRS provides information about cellular metabolism, describing the underlying biologic and pathophysiologic events associated with tumors [53]. The biochemical pathways involved in 1H MRS of choline may be different from the phospholipid metabolites seen on 31P MRS, and thus the two MRS techniques may provide complementary information on tumor metabolism. MRS has shown promise in differentiating nonmalignant from malignant tumors and lymph nodes and in differentiating residual malignancy from postradiation changes in head and neck cancers [53].

Tumor Characterization with Quantitative DWI and DCE-MRI

Monoexponential and NG-IVIM-derived ADC/D maps correlate with tumor cellularity due to the restricted free diffusion of water in tumors with increased cellular density; these tumors tend to have lower ADC values. Less dense tumors with necrotic and cystic elements have higher ADC values [54].

Several studies have demonstrated that DWI can distinguish HN tumor types, including differentiating nasopharyngeal squamous cell carcinoma from nasopharyngeal lymphoma, head and neck cysts from tumors, and benign from malignant head and neck tumors [55,56,57]. DWI has also been applied in the salivary glands to distinguish pleomorphic adenoma from carcinomas [58]. Warthin’s tumors show overlap with carcinomas, possibly due to the presence of lymphoid tissue. There is limited evidence for the utility of DWI in distinguishing benign and malignant thyroid nodules; this limited utility may be related to heterogeneity in the histologic composition of abnormal thyroid nodules. In general, head and neck malignancies demonstrate low ADC values due to hypercellularity, enlarged nuclei, and hyperchromatism. Head and neck lymphomas have the lowest ADC values, and benign lesions outside of the thyroid gland tend to have higher ADC values than malignant lesions. DWI has also been used to identify HPV-positive oropharyngeal squamous cell carcinomas, which carry a better prognosis [59, 60]. Nonkeratinizing and basaloid differentiated histology may contribute to lower ADC values in these tumors [61]. Additional studies have demonstrated potential correlates between ADC values and tumoral expression of Ki-67, EGFR, VEGF, p53, p16, and HER2 [62, 63]. Such results may facilitate the use of DWI in creating individually tailored treatment plans with targeted agents. Representative monoexponential and NG-IVIM model-derived parametric maps are shown in Fig. 6.1.

Fig. 6.1
Left: Representative T2-weighted image from a patient with head and neck cancer (65 years, male). Right: (a) diffusion-weighted image (b = 0 s/mm2), (b) apparent diffusion coefficient (ADC × 10−3 [mm2/s]), (c) true diffusion coefficient (D × 10−3 [mm2/s]), (d) pseudo-diffusion coefficient (D* × 10−3 [mm2/s]), (e) perfusion fraction (f), and (f) kurtosis coefficient (K). ADC was derived from a monoexponential model, and D, D*, f, and K were derived from the non-Gaussian intravoxel incoherent motion model. Maps (b), (c), and (f) are scaled from 0 to 2, (d) from 0 to 0.05, and (e) from 0 to 0.5.

Similar efforts have been undertaken to evaluate the role of DCE-MRI in differentiating head and neck tumor types. DCE-MRI has proved useful in distinguishing SCC from lymphoma. SCC demonstrates increased tumoral perfusion and capillary permeability, possibly due to its lower cellularity in comparison to lymphoma [64, 65]. DCE-MRI parameters have been studied in differentiating paragangliomas from schwannomas in the carotid space, with paragangliomas demonstrating decreased perfusion parameters; this may reflect the poor perfusion environment in paragangliomas due to the presence of pathologic vasculature and extensive arteriovenous shunting [66, 67].

Tumor Staging

Nodal disease is a key component of TNM staging and treatment planning. Conventional cross-sectional CT and MR have limited sensitivity and specificity in the detection of malignant lymph nodes, as an evaluation solely based on size and morphology can miss active disease. DWI has been applied to differentiating benign from malignant lymph nodes; nodal staging is more accurate with the addition of DWI to conventional MRI. ADC values tend to be lower in malignant nodes [68,69,70,71]. DCE-MRI has also shown utility here, with malignant nodal tissue demonstrating decreased transit of contrast and reduced volume of the extravascular space; in one study, these nodes demonstrated longer time to perfusion, lower peak enhancement, and slower washout [72].

Therapy Response Assessment

Tumors with lower cellularity, increased necrotic and cystic components with poor oxygenation, higher stromal content, and HPV-negative status demonstrate greater resistance to treatment and poorer outcomes; these tumors tend to demonstrate higher ADC and D values [59, 73]. Higher baseline ADC and D values in HNSCC can predict poor local control and poor treatment response, correlating to increased risk of recurrence [74]. Evaluating changes between baseline ADC and D values during treatment may be more clinically relevant, as it minimizes variability due to differences in individual scanners and site protocols. Greater rise in ADC and D values following treatment is predictive of tumor response [59, 75].

Disordered tumoral angiogenesis with the development of leaky and tortuous vessels mediated by the release of VEGF results in a hypoxic tumoral environment [76]. Tumor hypoxia is associated with treatment resistance, aggressive disease, and poor clinical outcomes [77]. These hypoxic environments can develop in areas of high cellular density in addition to poor blood perfusion [78]. DCE-MRI parameters can be used to characterize tumor hypoxia. Several studies have demonstrated better treatment response in head and neck tumors with high baseline and posttreatment perfusion, likely through improved chemotherapeutic agent and oxygen delivery. These studies have shown higher rates of local control and complete response [38, 79, 80]. Higher perfusion on DCE-MRI also correlates with better lymph node treatment response in metastatic HNSCC [81, 82]. The mean lifetime of intracellular water protons derived from the FXR model has shown promise as a prognostic marker for patients with head and neck cancer [83]. Representative FXR DCE-MRI-derived parametric maps are displayed in Fig. 6.2.

Fig. 6.2
4 images of representative FXR DCE-MRI-derived parametric maps. The images are labeled a to d. Image a is not scaled. Images b, c, and d show FXR DCE-MRI-derived parametric maps scaled from 0 to 0.5, 0 to 0.5, and 0 to 2.0, respectively.

(a) Representative T1-weighted (T1w) image of an early dynamic phase after injection of contrast agent, (b) volume transfer constant (Ktrans [min−1]), (c) volume fraction of the extravascular extracellular space (ve), and (d) mean lifetime of intracellular water molecules (τi [s]) from a patient with head and neck squamous cell carcinoma (65 years, male). Parametric maps generated from the fast exchange model were overlaid on precontrast T1w images

Posttreatment Change

Anatomic distortion and inflammation from surgery and radiation therapy limit the utility of conventional CT and MRI in detecting underlying residual or recurrent tumor. Residual HNSCC after treatment demonstrates lower ADC than posttreatment fibrosis, likely secondary to increased cellularity within residual disease [84, 85]. DCE-MRI has also been used to identify residual disease posttreatment. Posttreatment fibrosis has been found to have higher permeability surface area, longer time to peak, lower relative washout ratio, and greater contrast uptake and enhancement ratio [86,87,88]. Enhancement in residual tumor is earlier and more intense due to differing perfusion microenvironments. Dose de-escalation approaches that have been proposed for HPV-related and p16+ SCC, wherein subclinical and nodal targets receive doses of 30 Gy instead of the standard 70 Gy, may minimize treatment-related toxicity and facilitate improved identification of residual or recurrent tumor [89].

Metabolic Imaging

Imaging of in vivo metabolic pathways and receptor-ligand interactions provides important information in the evaluation of neoplasms; PET achieves this by imaging the biodistribution of positron-emitting radioisotopes. PET/CT is the optimal imaging modality for staging, treatment monitoring, and surveillance of advanced HNSCC [26]. It is also useful for identifying synchronous or metachronous lesions in the head and neck as well as identifying unknown primary sites. Various radioisotopes can be used; however, 18F-fluorodeoxyglucose (FDG) is the most common. Like blood glucose, FDG is transported into cells with high glucose metabolism, and it therefore accumulates in neoplastic, infectious, and inflammatory tissue. Because glycolytic activity in inflammatory tissue can result in false-positive FDG uptake, PET/CT is typically delayed until 8–12 weeks following radiotherapy. The most accurate imaging technique in the head and neck is to combine PET with contrast-enhanced CT, which allows for better anatomic localization. Simultaneous acquisition with CT-based attenuation correction reduces imaging time. Current standard indications for PET/CT for head and neck cancer are the evaluation of T3 and T4 tumors, clinically suspected nodal or distant metastatic disease, treatment monitoring, and surveillance to assess for recurrent tumor [90]. Patients must fast for at least 4–6 h prior to imaging to minimize competition between blood glucose and FDG, which degrades image quality; 14–18 mCi FDG is injected and images are obtained 1 h after injection.

Unknown Primary

Five to ten percent of patients with HNSCC present with metastatic neck lymphadenopathy of unknown primary site [91]. PET/CT can identify primary tumor sites in approximately 25% of these patients [92]. A recent meta-analysis demonstrated 97% sensitivity and 68% specificity for PET/CT in the detection of primary tumor sites [93]. It is important to note that PET/CT frequently results in false negatives for small lesions less than 1 cm in size owing to its limited resolution and volume averaging effects.

Staging and Response Assessment

Accurate T staging requires precise anatomic definition from cross-sectional CT and MR imaging; the limited resolution of FDG PET precludes its application to T staging. Background physiologic FDG activity within the pharyngeal tissue can further obscure identification of the primary lesion. However, PET/CT has demonstrated value in T staging of oral cavity cancers because of its ability to readily detect mandibular involvement, which is an important determinant of surgical planning; one study demonstrated a sensitivity of 100% and specificity of 85% in the detection of mandibular invasion from intraoral squamous cell carcinoma [94].

Several studies have established the utility of PET/CT in nodal staging for HNSCC, with sensitivities ranging from 86% to 98% and specificities ranging from 88% to 99%, and with higher accuracy for nodal staging than CT or MR [95, 96]. CT and MR can fail to detect nonenlarged and morphologically normal lymph nodes with active disease. The recently completed ACRIN 6685 trial demonstrated a high negative predictive value of 94% for FDG PET/CT for T2 to T4 and N0 disease. Findings from PET/CT changed surgical management in 22% of these patients. Thus, FDG PET may be able to spare a subgroup of patients from undergoing elective neck dissections for accurate staging or empiric radiation therapy.

The incidence of distant metastases in HNSCC is approximately 2–18% [97]. Patients with three or more lymph node metastases, lymph nodes larger than 6 cm, bilateral nodal disease, and regional recurrence are at higher risk for distant metastasis [98]. PET/CT has a reported negative predictive value of 99% in the identification of distant metastases [99] and specificity and sensitivity up to 92% and 93%, respectively [100], and therefore is an important pretreatment step to avoid unnecessary treatment.

Postoperative and postradiation edema, fibrosis, and inflammation can mimic residual or recurrent tumor on conventional CT and MR in head and neck cancers, thereby limiting the utility of these modalities in therapy response assessment. Furthermore, many newer treatments including immunotherapeutic agents are cytostatic, and therefore, tumor size may not serve as an adequate marker of response. FDG PET/CT provides utility for response assessment as it can evaluate the presence of viable metabolically active tumor. Both quantitative and qualitative assessments have shown high accuracy and reliability in detection of treatment response. A large meta-analysis of the performance of FDG PET/CT in posttreatment response assessment demonstrated 94% sensitivity and 82% specificity in the detection of residual tumor, with 95% negative predictive value [101]. Positive predictive value was lower at 75%, possibly reflecting a higher rate of false positives from posttreatment inflammatory tissue, which underscores the need for both delayed scans 8–12 weeks after radiation treatment and careful attention to clinical and anatomic imaging findings. Representative CT, PET, and T1-weighted MR images are shown in Fig. 6.3.
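The relationship between the four accuracy measures quoted above can be illustrated with a short sketch. Sensitivity and specificity describe the test itself, whereas positive and negative predictive values also depend on how common residual tumor is in the population being scanned. The function below applies Bayes' rule to make that dependence explicit. The function name and the prevalence values in the loop are hypothetical and are not taken from the cited meta-analysis; only the 94% sensitivity and 82% specificity come from the text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All arguments are fractions; prevalence must lie strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")

    # Expected fraction of the population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    if true_pos + false_pos == 0.0 or true_neg + false_neg == 0.0:
        raise ValueError("predictive value undefined: no positive or no negative results")

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Sensitivity and specificity from the text; prevalences are illustrative only.
    for prevalence in (0.1, 0.3, 0.5):
        ppv, npv = predictive_values(0.94, 0.82, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At a fixed sensitivity and specificity, a lower prevalence of residual disease raises the negative predictive value and lowers the positive predictive value, which is one reason a negative posttreatment scan is generally more reassuring than a positive scan is conclusive.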

Fig. 6.3
Three CT images a, b, and c with mandibular tumor. The CT in image b shows the tumor exposed to radiation therapy. Image c shows the CT after surgery for the mandibular tumor.

A 56-year-old male with left mandibular gingival tumor (* on CT) with osseous erosion of the mandible evident (black arrow on CT), status postsurgery and radiation therapy, with new ill-defined enhancement in the left oropharynx at the flap margin (open white arrows on MRI), indeterminate for posttreatment changes vs. recurrent tumor. Fused images from PET/CT demonstrate hypermetabolic activity in the region of enhancement in the left oropharynx, suspicious for recurrence. Biopsy demonstrated recurrent squamous cell carcinoma

PET/CT is superior to CT for the detection of viable tumor within residual lymphadenopathy. One study reported a negative predictive value of 97% for PET/CT in the detection of residual nodal disease [102]. Several studies support the utility of negative FDG PET findings despite persistent enlarged or morphologically abnormal lymph nodes on CT after definitive CRT [103].

A recent randomized controlled trial of patients with HNSCC and advanced nodal disease demonstrated similar 2-year overall survival rates for patients who underwent PET/CT surveillance vs. planned neck dissection after completion of chemoradiotherapy. This approach can significantly reduce patient morbidity from additional surgeries and is more cost-effective [104]. PET/CT identifies local, regional, and distant recurrence earlier than conventional CT and MR. In a recent prospective trial, FDG PET/CT detected 99% of recurrences in HNSCC treated with curative surgery or definitive chemoradiotherapy [105]. A more recent meta-analysis demonstrated a sensitivity of 92% and specificity of 87% for PET/CT in the detection of recurrences. The accuracy was higher in patients in whom recurrence was not clinically suspected.

PET/MR

Simultaneous acquisition of PET and MR images has been established as an increasingly viable alternative to PET/CT in recent years, performing equally in the staging of head and neck cancer and radiation therapy planning [106]. Its role in the clinical setting has been limited by high costs and logistical requirements.

Conclusion

Multimodality imaging of the head and neck is a necessary tool for optimal management of head and neck cancers. Recent advances in imaging technology have enabled clinicians to provide accurate tumor diagnosis and staging, effective treatment planning, and improved surveillance. US is valuable in the assessment of superficial tumors, including those within the thyroid gland, salivary glands, and lymph nodes. US-guided biopsies are a cost-effective and safer alternative to excisional biopsies. CT and MR are excellent for precise anatomic delineation of tumors and soft tissue detail for staging and treatment purposes. Advanced MRI and PET/CT provide complementary information on functional and metabolic pathways within tumors that can help guide tumor detection and differentiation, treatment planning and monitoring, and surveillance. As technology improves, these tools will continue to develop and are likely to see increasingly widespread clinical use.