Introduction

A wide range of technologies is available for in vivo, ex vivo, and in vitro molecular and cellular imaging. This article focuses on three key in vivo imaging system instrumentation technologies used in the molecular imaging research described in this special issue of Eur J Nucl Med Mol Imaging: positron emission tomography (PET), single-photon emission computed tomography (SPECT), and bioluminescence imaging (BLI). In this context “in vivo” refers to imaging molecular processes in living subjects. All three modalities covered in this article result in the emission of photons. The imaging system’s job is to capture these photons and form images that can be analyzed in order to monitor molecular-based characteristics of processes such as the spread of disease, gene expression, or the effects of a novel drug in vivo. For each modality covered, this article describes the basics of how it works, important performance parameters, and the state-of-the-art instrumentation. It also discusses comparisons and integration of multiple modalities. Examples of in vivo imaging applications of these modalities are presented in the other articles of this issue. The principles discussed in this article apply to both human and small animal imaging. The goals of new molecular imaging technology developments are to substantially increase the ability to detect, visualize and quantify low concentrations of molecular probe, its target, or their interaction, which we will term the molecular sensitivity. The imaging system’s photon sensitivity, spatial resolution, and contrast resolution all work together to define its molecular sensitivity.

Positron emission tomography

What is a positron emitter?

PET requires a molecular probe that is labeled with a radionuclide that emits positrons. A positron is a particle that has the same mass as an electron, but has opposite charge. Proton-rich nuclei emit positrons. Common examples are 18F, 15O, 13N, and 11C. Proton-rich radionuclides (e.g., 18F) may be synthesized by accelerating protons using a particle accelerator such as a cyclotron, and directing the resulting proton beam into an appropriate target (e.g., H2O with isotopically enriched 18O). Proton-rich nuclei may also be created using an appropriate nuclear generator, which creates short-lived positron-emitting radionuclides (e.g., 82Rb) from the decay of a long-lived parent (e.g., 82Sr).

What are the signals detected in PET?

In PET, a positron-emitting radionuclide is attached to atoms of the molecule of interest (probe molecule) in order to track its biodistribution in vivo, within tissues of the imaging subject. Positrons are ejected at high speeds from each radioactive nucleus. The emitted positrons encounter and interact with electrons and nuclei of nearby atoms of the tissue. During its trajectory, the positron elastically scatters off the atomic nuclei, and loses energy and slows down through excitation and ionization of the atoms. Near the end of the positron’s trajectory its velocity is low and it may combine with a nearby electron, after which the pair annihilates and their mass is converted into electromagnetic energy in the form of high-energy photons. If the positron and electron are at rest when they annihilate, the result is almost always two photons emitted simultaneously in opposite directions (180° apart, Fig. 1), each with an energy of 511 keV, the rest-mass energy of an electron (and, equally, of a positron). The most common PET system configuration surrounds the subject with a complete cylindrical shell comprising contiguous rings of many position-sensitive 511-keV photon detectors. A PET acquisition consists of detecting and positioning millions of oppositely directed 511-keV coincident photon pairs emitted within the system. A PET scan can require 5–60 min, depending on parameters such as the system photon sensitivity, the mode of acquisition, the size of the imaging subject region of interest, and the amount of injected activity.

Fig. 1

Depiction of one positron emission event from a probe molecule. The result is two 511-keV photons emitted simultaneously in opposite directions toward position-sensitive photon detectors in a PET system
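The 511-keV photon energy quoted above follows directly from the electron rest mass. The short sketch below checks that arithmetic; the physical constants are CODATA values supplied here for illustration and are not given in the article.

```python
# Rest-mass energy of the electron (and positron), E = m_e * c**2.
# The constants below are CODATA values, added for this illustration.
ELECTRON_MASS_KG = 9.1093837015e-31
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
JOULES_PER_EV = 1.602176634e-19


def rest_mass_energy_kev(mass_kg: float) -> float:
    """Return the rest-mass energy m * c^2 of a particle, in keV."""
    energy_joules = mass_kg * SPEED_OF_LIGHT_M_PER_S ** 2
    return energy_joules / JOULES_PER_EV / 1e3


# Each of the two annihilation photons carries one rest-mass energy.
photon_energy_kev = rest_mass_energy_kev(ELECTRON_MASS_KG)
print(f"Energy of each annihilation photon: {photon_energy_kev:.1f} keV")
```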

What are the types of coincident events recorded in a PET system?

Figure 2 depicts three different situations in which two photons are considered to arrive simultaneously and are thus recorded as coincidence events. The good events are called true, where the line drawn between the two hit detector elements for that event passes through the point of emission of both photons. In scatter events, one or both 511-keV photons undergo Compton scatter before they are detected and the line drawn between the hit elements (dotted line in Fig. 2) does not pass through the point of emission. Random events occur when two distinct radionuclei each contribute one detected photon within the time resolution of the system and the line drawn between the two hit elements does not pass through the point of emission of either photon. Both scatter and random events are an undesirable source of background counts that will cause grossly mispositioned events and therefore loss in contrast resolution and quantitative accuracy. These undesired events may be reduced by narrowing the energy and coincidence time window settings and limiting the field of view (FOV) activity, but any residual scatter and random counts may also be subtracted from the data set using other techniques. Note that Compton scatter events can account for as many as 70% of all detected events in a 3D-acquired whole-body PET study, even after energy window discrimination is applied, and so it is important that the scatter correction method is highly accurate.

Fig. 2

Types of events that lead to coincident photon detection in PET
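To see why narrowing the coincidence time window and limiting FOV activity both reduce random events, it can help to use the standard textbook estimate of the random coincidence rate for a detector pair, namely the product of the coincidence window width and the two singles rates. This relation is not stated in the article, and the singles rates and window widths below are hypothetical values chosen only for illustration.

```python
def randoms_rate_cps(singles_a_cps: float, singles_b_cps: float,
                     window_width_s: float) -> float:
    """Estimate the random coincidence rate for one detector pair.

    window_width_s is the full width of the coincidence window
    (often written 2*tau). Conventions differ between texts, so this
    parameter name is an assumption of this sketch.
    """
    return window_width_s * singles_a_cps * singles_b_cps


# Hypothetical example: 10,000 singles per second on each detector.
wide = randoms_rate_cps(1e4, 1e4, 12e-9)     # 12 ns window
narrow = randoms_rate_cps(1e4, 1e4, 6e-9)    # 6 ns window
doubled = randoms_rate_cps(2e4, 2e4, 12e-9)  # twice the activity

print(f"12 ns window: {wide:.2f} randoms/s")
print(f" 6 ns window: {narrow:.2f} randoms/s")
print(f"double activity, 12 ns window: {doubled:.2f} randoms/s")
```

Under this estimate the randoms rate scales linearly with the window width but with the square of the singles rate, whereas the true coincidence rate scales only linearly with activity.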

Important performance parameters for PET

There are several important parameters of PET system performance. The photon sensitivity is the fraction of all coincident 511-keV photon pairs emitted from the imaging subject that are recorded by the system, and is also referred to as the coincidence photon detection efficiency. This parameter determines the statistical quality of image data for a given acquisition time. Photon sensitivity impacts image quality because it influences the noise level of images reconstructed at a desired spatial resolution. The spatial resolution describes a system’s ability to distinguish two closely spaced molecular probe concentrations and is important to detect and visualize subtle molecular signals from minuscule structures. Energy resolution is the precision with which one can measure the incoming photon energy. Since scattered photons lose energy, good energy resolution may allow a narrow energy window to reduce scatter photon contamination in image data without significantly compromising photon sensitivity. A narrow energy window also helps to reduce the rate of random photon contamination since many of these photons also undergo scatter. The coincidence time resolution determines how well one can decide whether two coincident photons truly arrive simultaneously. Analogous to the benefits of good energy resolution, good coincidence time resolution may allow a narrow time window to reduce random events without compromising photon sensitivity. The energy and temporal resolutions work together to define the available system contrast resolution, which is the ability to differentiate two slightly different concentration levels of probe in adjacent targets. The photon sensitivity, spatial resolution, and contrast resolution work together to define the molecular sensitivity of a PET instrument.

Photon sensitivity

Photon sensitivity in PET is improved by (1) increasing the probability that emitted photons will traverse detector material, which is known as the geometric efficiency, and (2) increasing the likelihood that photons traversing detector material will be stopped, termed the intrinsic detection efficiency. The geometric efficiency is enhanced by tightly packing the detector elements together with little or no space between them, bringing the detectors as close as possible to the body, and covering the subject with as much detector area as possible; these factors decrease the chance that photons will escape without traversing detector material. However, bringing the detectors closer to the subject can lead to position-dependent parallax positioning errors (and hence loss of spatial resolution uniformity) owing to photon penetration into the detector elements. The intrinsic detection efficiency is improved by using denser, higher atomic number (Z), and thicker (longer) detector elements to improve the 511-keV stopping power. Annihilation photons interact with the medium they traverse through two processes. In Compton scatter, the photon scatters off a single electron in the outer shell of a traversed atom. The scattered photon changes its energy (frequency) and the outer shell electron is ejected from the atom. In the photoelectric effect, the entire photon energy is absorbed by an inner shell atomic electron, which is ejected from that atom. These two interaction mechanisms work together to attenuate (reduce) the number of photons traveling along a given direction by a photon attenuation factor $e^{-\mu x}$, where $\mu$ is the linear attenuation coefficient, which is related to the interaction probability of a photon with a medium and is a function of the atomic number Z, the attenuating material density, and the incoming photon energy, and $x$ is the material thickness traversed by the photon beam. Ideally one would like minimal attenuation in the subject tissues and maximum attenuation in the sensitive detector materials. Typical PET detector system sensitivities range from <1% (fewer than 1 coincidence photon pair collected for every 100 emitted) for clinical systems to a few percent for small animal systems.
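The exponential attenuation factor can be turned into a rough estimate of intrinsic detection efficiency. The sketch below is a simplified illustration, not a system model: it counts any interaction in the crystal as a detection (ignoring the photopeak energy window), and the linear attenuation coefficients are approximate 511-keV values assumed here for water-like tissue and for a dense scintillator; they do not come from the article.

```python
import math

# Approximate linear attenuation coefficients at 511 keV (assumed values).
MU_WATER_PER_CM = 0.096
MU_DENSE_SCINTILLATOR_PER_CM = 0.87


def transmitted_fraction(mu_per_cm: float, thickness_cm: float) -> float:
    """Fraction of photons that cross the material without interacting."""
    return math.exp(-mu_per_cm * thickness_cm)


def intrinsic_pair_efficiency(mu_per_cm: float, crystal_length_cm: float) -> float:
    """Probability that BOTH photons of a pair interact in their crystals."""
    single = 1.0 - transmitted_fraction(mu_per_cm, crystal_length_cm)
    return single ** 2


for length_cm in (1.0, 2.0):
    eff = intrinsic_pair_efficiency(MU_DENSE_SCINTILLATOR_PER_CM, length_cm)
    print(f"{length_cm:.0f} cm crystal: pair interaction probability {eff:.2f}")

# Survival of a single photon through 10 cm of water-like tissue.
print(f"10 cm tissue: transmitted fraction "
      f"{transmitted_fraction(MU_WATER_PER_CM, 10.0):.2f}")
```

Because a coincidence requires both photons to be stopped, the single-photon efficiency enters squared, which is why crystal length has such a strong effect on photon sensitivity.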

Spatial resolution

PET spatial resolution is limited by the fact that one is trying to precisely determine the location of a positron-emitting nucleus attached to the probe molecule indirectly using the line drawn between the two annihilation photon hits in the detectors. Since this line results from two electronically determined interaction positions, this process is called electronic collimation. The spatial resolution is typically measured by imaging a point-like positron radioactive source and measuring its observed spread in the reconstructed images. The fundamental spatial resolution limit is dictated by the following factors:

  1. The positron range effect, which is due to variations in direction and path length of all the possible positron trajectories (see Fig. 1) created from a given point positron source. The extent of this resolution-degrading effect depends upon the range of energies of the emitted positrons and the medium traversed by the positrons before they annihilate.

  2. The photon acollinearity effect, which is caused by the fact that the two annihilation photons are not always emitted 180 degrees apart since the positron and electron are not always at rest when they combine. Hence the line defined by the two detectors hit will not always pass through the point of the positron–electron annihilation. The acollinearity effect on spatial resolution is worse for larger system diameters.

  3. The size of the photon detector element (i.e., the detector resolution or pixel size), which determines how precisely a system can localize the photon hits. The size of the detector element used in PET has been gradually decreasing over time in order to improve spatial resolution. Typical clinical systems use 4- to 6-mm detector pixels and small animal systems use 1.5- to 2-mm detector pixels.

Figure 3 plots the combined spatial resolution limit from the above three effects as a function of detector pixel size for various system diameters ranging from small animal to clinical PET systems for an 18F point source [1]. We see that in principle spatial resolution may be improved significantly by reducing the 511-keV detector pixel size. The element size dominates spatial resolution for small-diameter (<20 cm) animal PET systems, since the acollinearity effect on spatial resolution is minor for small detector diameters. However, developing 511-keV photon detector arrays with minuscule detector elements is challenging and typically results in performance compromises in other important system parameters. For example, using a point 18F positron source, a 15-cm detector system diameter for small animals, and 1-mm scintillation crystal pixels, Fig. 3 indicates that it is possible in principle to achieve submillimeter full-width at half-maximum (FWHM) spatial resolution at the center of the system, provided there are enough counts in the acquired data (adequate photon sensitivity) to reconstruct images at that desired spatial resolution without requiring significant smoothing. However, it is very difficult to collect a high fraction of the available light out of narrow (1 mm width) and long (>2 cm) scintillation crystals [2]. Furthermore, this light collection efficiency varies as a function of interaction location within the crystal, and so energy and time resolutions suffer as a result. Typically, in order to achieve acceptable light collection with 1-mm crystal pixels, their length is limited to ∼10 mm [3–5], but this significantly compromises the probability of absorbing 511-keV photons and hence limits the overall photon sensitivity performance.

Fig. 3

18F PET spatial resolution limit [FWHM and full-width at tenth-maximum (FWTM)] determined by positron range, photon acollinearity, and detector element width contributions plotted versus detector element size for different system diameters [1]. In theory, for system diameter <20 cm and 1-mm detector pixel size, submillimeter spatial resolution is possible

For standard human systems, the detector pixel dimensions are typically ≥4 mm, other resolution blurring terms enter the equation, the sensitivity is much lower, and so the reconstructed 18F point source resolution is typically 7–10 mm FWHM at the system center, depending upon image reconstruction parameters. Going to a narrower detector pixel dimension (<4 mm) is not practical for standard whole-body human systems since the diameter is large (typically ∼80 cm), which increases the chance that photons will escape undetected. Hence photon sensitivity is too low (<1%) to provide adequate counts for higher resolution image reconstructions.
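The way the three fundamental contributions combine can be illustrated with a common rule-of-thumb: add them in quadrature at the system center. This is only a rough sketch and is not the calculation behind Fig. 3, since the true blurring functions are not Gaussian. The half-pixel detector term, the 0.0022 × diameter acollinearity term, and the 0.1-mm FWHM positron range assumed for 18F are standard approximations supplied here, not values taken from the article.

```python
import math

# Acollinearity blur (FWHM) per mm of system diameter, from an angular
# spread of roughly 0.25 degrees about 180 degrees (assumed approximation).
ACOLLINEARITY_FWHM_PER_MM_DIAMETER = 0.0022


def resolution_limit_fwhm_mm(pixel_mm: float, diameter_mm: float,
                             positron_range_fwhm_mm: float = 0.1) -> float:
    """Quadrature estimate of the resolution limit at the system center."""
    detector = pixel_mm / 2.0  # coincidence response of two equal pixels
    acollinearity = ACOLLINEARITY_FWHM_PER_MM_DIAMETER * diameter_mm
    return math.sqrt(detector ** 2 + acollinearity ** 2
                     + positron_range_fwhm_mm ** 2)


small_animal = resolution_limit_fwhm_mm(pixel_mm=1.0, diameter_mm=150.0)
clinical = resolution_limit_fwhm_mm(pixel_mm=4.0, diameter_mm=800.0)
print(f"1 mm pixels, 15 cm diameter: {small_animal:.2f} mm FWHM")
print(f"4 mm pixels, 80 cm diameter: {clinical:.2f} mm FWHM")
```

Under these assumptions the small animal geometry stays below 1 mm, and the detector term is the largest contribution, while for the clinical geometry the acollinearity term becomes comparable to the detector term. Reconstructed resolutions in practice are coarser because of the additional blurring terms and smoothing mentioned above.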

Energy and coincidence time resolution

Energy and coincidence time resolution are improved by using scintillation crystals that generate brighter and faster light pulses, by using low-noise photodetectors, and by collecting a higher fraction of the scintillation light into the photodetector to create larger, more robust electronic pulses. Typical values for PET are 25% FWHM at 511 keV for energy resolution and 3 ns FWHM for coincidence time resolution.

Count rate performance

Each detector signal recorded in a PET system has a finite processing time. If too many photons hit the detectors in a given time, the front-end photon detectors or subsequent acquisition electronics in the PET system can saturate due to piling up of more than one electronic detector pulse within the required signal processing duration. Typically the degree of pile up is limited by the photon detector signal processing time, which depends upon the decay time of the scintillation crystal, the effective integration time of the electronics, and the photon event rate seen by the detector. For example, suppose a 370-MBq (10-mCi) radionuclide dose is injected into a subject that attenuates (absorbs) photons by an average factor of 10 and the subject is placed in a PET system with 100 detector modules providing a total of 1% coincidence photon detection efficiency (photon sensitivity). Then the average photon event rate per detector module is roughly $3.7 \times 10^{8}$ (radionuclide decays per second) $\times\ 2$ (photons per decay) $\times\ 0.1$ (photon attenuation factor) $\times\ 0.01$ (sensitivity) $\div\ 100$ (photon detector modules) $\approx 7{,}400$ counts per second. If each system detector module required 1 ms of processing time per event, there could be significant pile up of events. For a given system sensitivity, for the best count rate performance, the system should use scintillation crystals with a fast decay time (see Table 1), detectors with excellent time resolution, fast processing electronics, and limited activity within the sensitive FOV.
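The count-rate example can be checked numerically. The sketch below multiplies the factors as listed in the example, which gives about 7,400 events per second per module, and then estimates how often a second event arrives while the first is still being processed. The Poisson-arrival model used for the pile-up estimate is an assumption of this sketch, not something specified in the article.

```python
import math


def event_rate_per_module(activity_bq: float, photons_per_decay: float,
                          attenuation_factor: float, sensitivity: float,
                          n_modules: int) -> float:
    """Average photon event rate seen by one detector module (counts/s)."""
    return (activity_bq * photons_per_decay * attenuation_factor
            * sensitivity / n_modules)


def pileup_fraction(rate_cps: float, processing_time_s: float) -> float:
    """Probability that at least one further event arrives during the
    processing time of a given event, assuming Poisson arrivals."""
    return 1.0 - math.exp(-rate_cps * processing_time_s)


rate = event_rate_per_module(3.7e8, 2, 0.1, 0.01, 100)
print(f"Rate per module: {rate:,.0f} counts/s")
for tau_s in (1e-3, 1e-6):
    print(f"processing time {tau_s:g} s: "
          f"pile-up probability {pileup_fraction(rate, tau_s):.4f}")
```

With a 1-ms processing time almost every event overlaps another, whereas a microsecond-scale processing time keeps the overlap probability below one percent at this rate.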

Table 1 Some important properties of the most common inorganic scintillation crystals used in PET

Instrumentation to detect 511-keV photons

The front-end photon sensors (a.k.a. detectors) are arguably the most important (and expensive) components of a PET system since their characteristics determine important system performance parameters such as photon sensitivity and spatial, energy, and temporal resolutions. The standard configuration for a PET detector utilizes inorganic scintillation crystals, which absorb the 511-keV photons and generate a flash of light. Most state-of-the-art PET systems use arrays of discrete scintillation crystals separated by reflectors [5–10]. 511-keV photons are highly penetrating and in order to stop them efficiently for good photon sensitivity, the array crystals must have high atomic number (Z) and density and be relatively thick (long). For excellent spatial resolution, the crystals must also be very narrow for precise localization of the incoming photon interactions in the detector. Finally, for excellent spatial and temporal resolutions, the scintillation light pulse should be bright and fast. Table 1 lists important properties of the scintillation crystals most commonly used in PET. Typically the crystals are arranged into arrays, coupled to photodetectors, and built into modules. The modules are fixed together to form a ring as depicted in Fig. 2 for a small animal system. Figure 4 shows the scintillation crystal array submodules used to build various Concorde Microsystems/Siemens microPET systems, which are high-resolution PET systems dedicated to preclinical small animal molecular imaging research [5, 6]. The crystal pixel dimension and the detector gantry diameter determine the main differences between human and small animal imaging systems. High-resolution animal imaging systems use ≤2 mm crystal width and <20 cm detector diameter; human systems typically use ≥4 mm crystal width and 80 cm detector diameter.

Fig. 4

Crystal arrays (individual crystal size shown above) used in construction of the Concorde Microsystems/Siemens microPET Focus, R4, and P4 high-resolution PET systems. (Courtesy of Stefan Siegel, Siemens/CTI-Concorde)

In the conventional approach, the discrete crystal arrays are coupled directly to very sensitive light detectors called photomultiplier tubes (PMTs), which collect the light from the crystals and convert it to a robust electronic signal that can be used for spatial, energy, and time information. In order to extract the light out of the tiny array crystals required for high-resolution small animal PET, fiber optics and position-sensitive PMTs have been used [5, 6]; fiber optics cause further light signal loss. During system calibrations the energy spectra are measured and a photopeak window is selected for each individual crystal in each detector module of the system and stored in a calibration table. The electric signals from the PMTs are used to localize the incoming photon event to a given crystal within a given detector module. If that photon event has an appropriate energy (pulse height) corresponding to the photopeak window setting for that given crystal, the event is recorded. Coincidence between two photon interactions recorded on either side of the system is determined by comparing the time stamp that is assigned electronically to each detected hit. These steps highlight the importance of excellent energy and coincidence time resolutions. The resulting event is assigned to the appropriate tomographic line-of-response (LOR) drawn between the two detectors of the pair and that LOR value is incremented by one count. PET images require many such coincident photon pairs to be positioned (assigned to an LOR) in this manner. More data acquisition details are provided in the next section.

How are PET data acquired and organized?

The molecular probe radiolabeled with a positron emitter is introduced into the body of the imaging subject in trace (e.g., picomolar to nanomolar) quantities. After an appropriate waiting period for the probe molecules to reach the desired target molecules, the subject is placed within a PET system that surrounds the subject with many position-sensitive high-energy photon detectors. Photons that are absorbed in detectors create scintillation light pulses. The light pulses are converted into electronic signals in the PMTs coupled to the crystal array. A weighted mean of the pulse heights is used to identify which individual crystal element within an array was hit. The total scintillation pulse height created for that event represents the absorbed photon energy. This energy is compared with a predetermined photopeak energy window setting for that identified crystal. If the total event pulse height is within the window, the event is accepted, and the location of that crystal within the module, the location of the module within the system, and a time stamp for that event are recorded. Photons that are absorbed in the body or are not directed at detectors do not contribute to the data set. Coincident events are selected as pairs of single events with time stamps that match to within the coincidence time window setting for the system. The collected data set comprises the number of coincident photon pair events emitted from the subject and recorded along all system LORs (the response lines between any two detector elements on either side of the PET system).
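The event selection steps described above (energy windowing of single events followed by time-stamp matching) can be sketched in a few lines. This is a deliberately simplified illustration: the `Single` record, the function names, the single 350–650 keV window shared by all crystals, and the 6-ns coincidence window are all hypothetical choices. A real system uses a separate calibrated photopeak window for each crystal and handles multiple coincidences explicitly.

```python
from typing import List, NamedTuple, Tuple


class Single(NamedTuple):
    """One detected photon: time stamp, crystal hit, and measured energy."""
    time_ns: float
    crystal_id: int
    energy_kev: float


def in_photopeak(event: Single,
                 window_kev: Tuple[float, float] = (350.0, 650.0)) -> bool:
    """Accept the event only if its energy lies inside the window."""
    low, high = window_kev
    return low <= event.energy_kev <= high


def find_coincidences(singles: List[Single],
                      time_window_ns: float = 6.0) -> List[Tuple[int, int]]:
    """Pair time-adjacent accepted singles whose stamps match within the window.

    Returns (crystal_a, crystal_b) pairs, each defining one LOR.
    """
    accepted = sorted((s for s in singles if in_photopeak(s)),
                      key=lambda s: s.time_ns)
    pairs = []
    i = 0
    while i < len(accepted) - 1:
        first, second = accepted[i], accepted[i + 1]
        close_in_time = second.time_ns - first.time_ns <= time_window_ns
        if close_in_time and first.crystal_id != second.crystal_id:
            pairs.append((first.crystal_id, second.crystal_id))
            i += 2  # both singles are consumed by this coincidence
        else:
            i += 1
    return pairs


singles = [
    Single(0.0, 12, 505.0), Single(2.0, 140, 498.0),  # valid pair
    Single(50.0, 33, 300.0), Single(51.0, 7, 510.0),  # partner lost to scatter
    Single(120.0, 90, 520.0), Single(300.0, 5, 515.0),  # unrelated singles
]
lors = find_coincidences(singles)
print(lors)
```

In this toy data set the 300-keV event is rejected by the energy window, so its partner remains an unpaired single and only one LOR is recorded.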

There are two modes in which the coincidence photon pairs can be acquired in PET. In 2D PET acquisition, the coincident photon LORs are confined to essentially 2D detection planes corresponding to each ring (or perhaps two adjacent rings) of scintillation crystals within the cylindrical detector system. These detection planes are oriented perpendicular to the system axis. This 2D confinement is accomplished using lead “washers” called interplane septa inserted orthogonal to the system axis and in between every scintillation crystal ring. These septa prevent photons from entering at oblique angles with respect to the septa plane. In three-dimensional (3D) acquisition mode, the lead septa are removed and all LORs formed between any two crystals from any two detector rings are allowed. The advantage of the 3D mode is a significant (∼5–10 fold) increase in photon sensitivity because of the large increase in collected photon flux. The drawback of the 3D mode is a considerable increase in scatter and random coincidence event rate and system dead time owing to possible detector saturation. The acquired data are often organized into sets of parallel LORs, called projections, that give 2D representations of the probe distribution for all angular views about the subject. Often a continuous sequence or cine view of the sequential projection view data is displayed to allow gross visualization of the 3D radionuclide distribution rotating like a top about the system axis.
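Organizing LORs into projections amounts to describing each LOR by its angle and its signed distance from the system axis, so that LORs sharing an angle form one parallel projection. The sketch below shows one way to compute these two coordinates for a single detector ring. The function name, the detector indexing scheme, and the sign convention are assumptions of this illustration.

```python
import math
from typing import Tuple


def lor_to_projection_coords(i: int, j: int, n_detectors: int,
                             ring_radius_mm: float) -> Tuple[float, float]:
    """Return (r, phi) for the LOR joining detectors i and j of one ring.

    Detector k is assumed to sit at angle 2*pi*k/n_detectors on the ring.
    phi is the direction of the LOR's normal, folded into [0, pi), and r is
    the signed distance of the LOR from the ring center along that normal.
    """
    theta_i = 2.0 * math.pi * i / n_detectors
    theta_j = 2.0 * math.pi * j / n_detectors
    phi = 0.5 * (theta_i + theta_j)
    r = ring_radius_mm * math.cos(0.5 * (theta_j - theta_i))
    if phi >= math.pi:  # same line described from the opposite side
        phi -= math.pi
        r = -r
    return r, phi


# Opposing detectors define an LOR through the center of the ring.
print(lor_to_projection_coords(0, 4, n_detectors=8, ring_radius_mm=100.0))
# Detectors a quarter turn apart define an off-center LOR.
print(lor_to_projection_coords(0, 2, n_detectors=8, ring_radius_mm=100.0))
```

Histogramming the recorded coincidences over (r, phi) bins yields the stack of parallel projections from which the images are reconstructed.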

How are PET images formed?

The organization of data facilitates tomographic image reconstruction, which is a process that uses mathematical algorithms to estimate the 3D probe distribution volume from the 2D projection data, and yields cross-sectional slices through the probe distribution. The image reconstruction algorithm is a key component that turns raw hits recorded into 3D images. There are two basic classes of reconstruction schemes, analytic and iterative. Analytic approaches consider the acquisition process, the measurements, and the reconstructed image as continuous functions. The analytic image reconstruction algorithm (e.g., filtered back projection or FBP) is based on direct computation of an inverse transform formula that converts the recorded detector hits into an image [11, 12]. Iterative techniques consider the above functions to be discrete quantities. The iterative process starts with a “guess” at the 3D probe distribution and goes through iterative (successive) modifications of that estimate until a solution is reached [13–17] (Fig. 5). Iterative algorithms differ in the method by which the measured and current estimated projections are compared for a given iteration, and in the correction that is applied to modify the current estimate for a given iteration. Iterative techniques may incorporate statistical methods and possibly accurate system models to find the best solution. Iterative approaches may be appropriate for photon count limited data and for PET systems with non-standard geometry. Analytic methods require spatial frequency filtering in order to reduce statistical noise, resulting in a loss of spatial resolution. Iterative methods allow an improved trade-off between spatial resolution and noise and permit a mechanism to incorporate accurate system modeling, but are more computationally intensive. The analytic methods are linear and typically more computationally efficient.

Fig. 5

The process of iterative image reconstruction for emission tomography. An initial estimate of the true radionuclide distribution is reprojected (converted into fictitious projection data) and successively modified until it compares well with the measured projection data
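The reproject, compare, and update loop can be made concrete with a toy example. The sketch below uses the maximum-likelihood expectation-maximization (MLEM) update, which is one widely used iterative algorithm chosen here for illustration; the article does not single out a particular method. The 2 × 2 pixel "image", the six LORs in `SYSTEM_MATRIX`, and the noise-free data are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def forward_project(system_matrix: Matrix, image: List[float]) -> List[float]:
    """Reproject an image estimate into expected counts per LOR."""
    return [sum(a * x for a, x in zip(row, image)) for row in system_matrix]


def mlem(system_matrix: Matrix, measured: List[float],
         n_iterations: int = 200) -> List[float]:
    """Toy MLEM reconstruction starting from a uniform initial guess."""
    n_pixels = len(system_matrix[0])
    # Sensitivity image: total weight with which each pixel is seen.
    sensitivity = [sum(row[j] for row in system_matrix)
                   for j in range(n_pixels)]
    image = [1.0] * n_pixels
    for _ in range(n_iterations):
        estimated = forward_project(system_matrix, image)
        # Compare measured and estimated projections as a ratio.
        ratios = [m / e if e > 0 else 0.0
                  for m, e in zip(measured, estimated)]
        # Back-project the ratios and apply a multiplicative correction.
        backprojected = [sum(row[j] * r for row, r in zip(system_matrix, ratios))
                         for j in range(n_pixels)]
        image = [x * b / s for x, b, s in zip(image, backprojected, sensitivity)]
    return image


# Pixels are ordered [top-left, top-right, bottom-left, bottom-right].
SYSTEM_MATRIX = [
    [1, 1, 0, 0], [0, 0, 1, 1],  # two horizontal LORs
    [1, 0, 1, 0], [0, 1, 0, 1],  # two vertical LORs
    [1, 0, 0, 1], [0, 1, 1, 0],  # two diagonal LORs
]
TRUE_IMAGE = [1.0, 4.0, 2.0, 3.0]
MEASURED = forward_project(SYSTEM_MATRIX, TRUE_IMAGE)  # noise-free data

reconstructed = mlem(SYSTEM_MATRIX, MEASURED)
print([round(value, 3) for value in reconstructed])
```

With real, noisy data the iterations are usually stopped early or regularized, since running to convergence amplifies statistical noise.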

Corrections and calibrations for PET data

There are several undesired physical effects inherent in the process of detecting annihilation photons in PET. Acquired PET data must be corrected for these physical factors either before or during image reconstruction in order to facilitate image quality and quantitative accuracy. Since acquired PET data are often organized in terms of sets of LORs, in order to understand how these correction factors are applied to the complex 3D volume of data acquired, it is simplest to visualize these correction factors being applied to one LOR at a time. The undesired physical factors include:

  1. 511-keV photon attenuation within the tissue, which causes the probe distribution to appear less intense for deeper structures. Photon attenuation is the largest correction factor applied to PET data. Attenuation correction for PET is accomplished by first observing that for any given LOR the total attenuation factor for any two photons emitted along that line depends only upon the linear attenuation coefficients and the total thickness of the body along that line, and not on the origin of the two-photon emission. Thus, photon attenuation may be corrected for every LOR by measuring the total photon attenuation factor of an external radiation source that transmits activity through every LOR that passes through the subject. This transmission measurement may be accomplished by using an external point, rod, or shell radioactive source geometry, as typically used for small animal PET, or by using X-ray computed tomography (CT), as is the case for a clinical PET/CT system. In this latter case, the 511-keV attenuation coefficients are determined from an appropriate scaling of the attenuation coefficients measured at X-ray energies.

  2. Detector response non-uniformity, which causes the probe distribution to appear artificially non-uniform due to non-uniform photon detector response throughout the system. This artifact is normalized by measuring the non-uniform response for every LOR with an external radiation source and correcting every measured data set with that normalization file.

  3. Detector saturation or dead time, which can cause artificial loss in spatial and contrast resolutions. Saturation is produced when the incoming photon flux is higher than the system processing bandwidth allows. Typically an analytic model of this saturation allows calculation of correction factors that are applied to every LOR.

  4. Random coincidence background, which can cause loss in quantitative accuracy and contrast resolution. Effects of random coincidences are worse for higher detected photon rates, which may result from higher FOV activity or better photon sensitivity, and for poor system coincidence time resolution. Typically, estimates of the random coincidence rate for every LOR are obtained from measurements or calculations and are subtracted.

  5. Scatter coincidences, which can also cause degradation of quantitative accuracy and contrast resolution. Scatter coincidences are also worse for larger subjects, higher detected photon rates, and poor energy resolution. Using a narrow energy window setting allows rejection of some large angle scatter events; however, small angle scatter photons will still be accepted. Small angle scatter estimates are calculated for each line of response and subtracted.

  6. Isotope decay, which leads to an artificially lower measured probe concentration as a function of time. This may be a problem for multi-bed position studies such as clinical whole-body PET, where the subject is slowly translated through the scanner, as the later bed position data will inherently record lower intensities. The decay effect is easily compensated for by knowing the half-life of the isotope used and by keeping track of the time when each piece of sequential data is acquired.
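Two of the corrections in this list lend themselves to a short numerical illustration: the fact that the attenuation factor along an LOR does not depend on where the emission occurred, and the decay correction. The sketch below assumes a uniform water-like medium with an approximate 511-keV attenuation coefficient of 0.096 per cm and an 18F half-life of 109.77 min; both values are supplied here for illustration, and the function names are hypothetical.

```python
import math

MU_WATER_PER_CM = 0.096    # approximate value at 511 keV (assumed)
F18_HALF_LIFE_MIN = 109.77  # 18F half-life (assumed)


def lor_survival_probability(mu_per_cm: float, depth_a_cm: float,
                             depth_b_cm: float) -> float:
    """Probability that BOTH photons escape, given the tissue depth each
    one must cross on its way out along the LOR."""
    return math.exp(-mu_per_cm * depth_a_cm) * math.exp(-mu_per_cm * depth_b_cm)


def decay_correction_factor(elapsed_min: float,
                            half_life_min: float = F18_HALF_LIFE_MIN) -> float:
    """Factor that scales counts acquired elapsed_min after the reference
    time back to the activity at the reference time."""
    return 2.0 ** (elapsed_min / half_life_min)


# Emission near the edge and at the middle of a 20 cm thick body:
edge = lor_survival_probability(MU_WATER_PER_CM, 5.0, 15.0)
middle = lor_survival_probability(MU_WATER_PER_CM, 10.0, 10.0)
print(f"survival, edge: {edge:.4f}  middle: {middle:.4f}")
print(f"attenuation correction factor for this LOR: {1.0 / middle:.2f}")

# A bed position acquired 30 min after the first one:
print(f"decay correction after 30 min: {decay_correction_factor(30.0):.3f}")
```

Because the two depths always sum to the total thickness along the line, the product of the two exponentials is the same for every emission point, which is what allows a single transmission measurement per LOR to supply the correction.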

Quantitative accuracy of PET image data further relies upon proper calibration of image counts to true isotope activity. This can be accomplished by scanning a rod source of known activity concentration and measuring the total image counts within a region of interest encompassing the rod. Quantification also relies on correction for spatial resolution blurring (a.k.a. partial volume effect), which can artificially reduce intensity for structures that are on the order of the system resolution, or smaller. This correction factor can be accomplished by measuring the intensity reduction effects versus structure size in a phantom with known activity concentration in spheres of various known diameters.

Single-photon emission computed tomography

What is a single-photon emitter?

Molecular imaging using SPECT requires a molecular probe that is labeled with a radionuclide that results in the emission of gamma ray photons or high-energy X-ray photons. In contrast to PET, only a single photon is detected per event and that photon is emitted directly from the radioactive atom. Gamma ray photons are emitted from the nucleus as a result of relaxation of neutrons and protons that are in an excited energy state of the nucleus. Common examples of gamma ray emitters are 123I, 125I, and 99mTc. X-ray photons may result from alternate nuclear relaxation or decay processes that involve the removal of an inner shell atomic electron. When an outer shell electron fills this inner shell vacancy, X-rays may be emitted from the atom. Common examples of radionuclides that result in X-ray emissions used in single-photon imaging are 111In and 201Tl, which undergo electron capture decay, whereby an inner shell electron is “captured” (absorbed) by a proton within the nucleus. This leaves an inner electron shell vacancy of the atom that results in the emission of a characteristic X-ray. Single-photon-emitting radionuclides can be created using a nuclear reactor by bombarding high-Z targets with reactor-generated neutrons or as a product of the fission process itself. Single-photon emitters can also be produced using a nuclear generator, which creates short-lived single-photon-emitting radionuclides (e.g., 99mTc) from the decay of a long-lived parent isotope (e.g., 99Mo).

What are the signals detected in SPECT?

Single photons are ejected at the speed of light from every radioactive atom attached to the molecules of the SPECT probe distributed throughout the body of the subject. The emitted photons interact with electrons and nuclei of nearby atoms of the tissue through Compton scatter or photoelectric absorption. Unlike a beam of positrons traversing matter, the energetic photons do not “slow down” from interactions in body tissues or external detector materials, but rather the photon beam is attenuated (the number of photons traveling along a given line is reduced). The photons that escape from the body can be used for SPECT imaging. In the most common SPECT system configuration, the subject is surrounded by one or more position-sensitive gamma ray photon detector panels, which typically are large scintillation detectors. A SPECT acquisition consists of detecting and positioning many single photons traversing the detectors and can take 20 min to an hour depending upon parameters such as the collimator used, the size of the imaging subject region of interest, and the amount of activity available.

Physical collimation of single photons

In order to determine the correct LOR for SPECT, just as was required for PET, it is crucial to be able to precisely fix the photon direction of incidence into the system and determine its interaction location in the position-sensitive detectors. In PET, the two photon hits on either side of the system give the correct LOR assignment through electronic collimation. In SPECT, since there is only one photon per event, electronic collimation is not possible and physical collimation must be used to determine the photon’s incident direction. Physical collimation is accomplished using a structure made of high density, high Z material such as lead or tungsten with a well-defined configuration of holes for the photons to enter. Photons that hit the holes at the wrong angle do not make it through the collimator, are absorbed in the collimator material, and thus do not contribute to the image. Photons that do make it through the collimator holes have a well-defined direction of entrance into the camera. The LOR is determined by the hole the photon entered and the interaction location within the position-sensitive detector. Figure 6 depicts two common types of gamma ray collimator employed in molecular imaging with SPECT, the parallel-hole and pinhole collimators. Figure 7a depicts single photons entering a parallel-hole collimator. For a parallel-hole collimator the possible photon response lines are orthogonal to the surface of a given position-sensitive scintillation detector panel and yield a 2D projection of the single-photon radionuclide distribution onto the face of the camera.

Fig. 6

Depiction of a point source projecting gamma rays through a parallel-hole (left) and a single pinhole (right) collimator. The important parameters determining the collimator efficiency and spatial resolution are defined in the drawings

Fig. 7

a Depiction of example high-energy photon events that can take place in SPECT imaging with a parallel-hole collimator. The good events are those where the emitted photons make it through the collimator holes and are fully absorbed in the detector crystal without interacting in tissue or the collimator. b, c Spectra of pulse heights created in an NaI(Tl)-PMT scintillation camera from a 122-keV photon point source in air (b) and 140-keV photon sources distributed in water (c)

What types of events are recorded in a SPECT system?

Figure 7a depicts several possible single photon events that can occur. The good events occur when the line drawn between the interaction location within the scintillation crystal and the photon emission location within the subject passes through a collimator hole. For a scatter event, the photon undergoes a Compton scatter in the tissues before it enters the collimator and the line drawn between the crystal interaction point and the collimator hole does not pass through the point of photon emission. As was the case for PET, scatter events are background events that can cause photon positioning errors and a background “haze” in the image, and therefore reduce contrast resolution. Since scatter photons lose energy when they change direction, these undesired events are reduced, without significant loss in instrument sensitivity, by having good energy resolution and using a narrow energy window setting around the photopeak. As seen in Fig. 7a, photons may be absorbed in the patient or collimator or not be directed toward a detector at all. Photons may interact in the crystal but lose some of their energy owing to escape of a resulting characteristic X-ray that results from photoelectric absorption, they may scatter first and then leave the crystal, they may backscatter off external materials and re-enter the crystal, or they may pass through the crystal undetected. To demonstrate some of these effects on the pulses created in a scintillation camera, Fig. 7b and c show typical pulse height spectra (a.k.a. energy spectra) measured in a NaI(Tl) scintillation camera using a collimated 57Co (122-keV) point source in air and 99mTc distributed in a water phantom, respectively. Similar detection principles hold for 511-keV photons interacting in scintillation crystals.

Important performance parameters for SPECT

Similar performance parameters such as photon sensitivity (efficiency), spatial resolution, and energy resolution that are used to characterize a PET system are also used for SPECT. The combined effects of these performance parameters dictate a SPECT system’s molecular sensitivity.

Photon sensitivity

SPECT system sensitivity is a product of the collimator geometric efficiency and the intrinsic detector efficiency. The collimator efficiency is the probability that an emitted photon will pass through the collimator and depends upon the collimator type and material, and collimator properties such as hole size, thickness of the septa between holes, length of the holes, and distance of the activity source(s) from the collimator. The intrinsic detection efficiency quantifies how well the scintillation detector absorbs incoming photons, which depends upon the photon energy and the crystal material effective density, Z, and the thickness along the photon’s path. Typically, the collimator efficiency determines the overall system efficiency. The efficiency and spatial resolution for parallel-hole and pinhole collimators are given by well-known expressions [18] involving the parameters defined in Fig. 6. Figure 8 plots efficiency for a parallel-hole collimator versus hole length, and for a single pinhole collimator versus source to collimator distance for various aperture sizes. Unlike the pinhole collimator, parallel-hole collimator efficiency is insensitive to source–collimator distance. Typical collimator efficiencies for SPECT imaging conditions range from 10⁻⁴ (1 photon collected per 10⁴ emitted) to 10⁻⁵, and thus photon sensitivity for SPECT is relatively low compared with that for PET, limiting statistical quality of data for the same study duration. For SPECT systems that utilize pinhole collimation, efficiency may be significantly improved by increasing the number of pinhole apertures used [19, 20] and, if appropriate, by bringing the tissue(s) of interest in close proximity to the pinhole apertures. This multi-pinhole SPECT approach has been applied successfully for small animal imaging [19, 20], but not for human imaging.
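The text does not reproduce the expressions of [18], so the sketch below uses the textbook approximations commonly quoted for geometric collimator efficiency: g ≈ K²(d/l)²·d²/(d+t)² for parallel holes and g ≈ d²/(16x²) for an on-axis point source in front of a pinhole. These forms, the shape constant K ≈ 0.26 for hexagonal holes, the neglect of septal and edge penetration unless an attenuation coefficient is supplied, and the collimator dimensions are all assumptions made for illustration and may differ from the expressions used to generate Fig. 8.

```python
K_HEX = 0.26  # shape constant often quoted for hexagonal holes (assumption)


def parallel_hole_efficiency(d, l, t, mu=None, k=K_HEX):
    """Geometric efficiency of a parallel-hole collimator.

    d: hole diameter, l: hole length, t: septal thickness (same length unit).
    mu: optional linear attenuation coefficient of the collimator material
        (inverse of that unit), used for the effective length l - 2/mu.
    Note that the source distance does not appear in this expression.
    """
    l_eff = l - 2.0 / mu if mu else l
    return (k * d / l_eff) ** 2 * (d / (d + t)) ** 2


def pinhole_efficiency(d, x):
    """Geometric efficiency of a pinhole of diameter d for an on-axis
    point source at distance x (penetration through the edges ignored)."""
    return d ** 2 / (16.0 * x ** 2)


# Hypothetical high-resolution parallel-hole collimator (dimensions in mm)
print(parallel_hole_efficiency(d=1.3, l=30.0, t=0.2))
# 1-mm pinhole with the source at 30 mm and at 15 mm
print(pinhole_efficiency(1.0, 30.0), pinhole_efficiency(1.0, 15.0))
```

With these example dimensions the parallel-hole value comes out close to 10⁻⁴, and halving the source distance quadruples the pinhole efficiency, which is the distance dependence the paragraph describes.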

Fig. 8

Calculated efficiency for a high-resolution, lead, hexagonal-hole, parallel-hole collimator as a function of collimator thickness (hole length, l) (left) and for a single pinhole collimator versus source to pinhole distance, x, for various pinhole diameters, d (right). See Fig. 6 for definition of parameters

Spatial resolution

Since the single photons in SPECT are emitted directly from the radioactive atom of interest (unlike in PET), spatial resolution is mainly limited by how well the incoming photons can be collimated. Spatial resolution in SPECT is a convolution of the collimator resolution and the intrinsic detector resolution, but is typically limited by the collimator hole size. The collimator spatial resolution depends upon the collimator type (e.g., parallel-hole or pinhole), and collimator properties such as hole size(s), length of the holes, and distance of the activity source(s) from the collimator. Typical high-resolution collimator hole sizes are ∼1.3 mm (parallel holes) for human systems and 0.5–1.0 mm (pinholes) for small animal systems. The intrinsic detector resolution depends on the shape of the light distribution created from each interaction, which mainly depends on the crystal design and properties, and how that distribution is sampled by the photodetector array. Typical intrinsic (detector only) spatial resolutions are ∼3.5 mm FWHM for human SPECT systems that use continuous sheet crystals (Fig. 9) and ∼1.5 mm FWHM for small animal systems that use arrays of discrete crystals (similar to that shown in Fig. 4 for microPET). The intrinsic resolution is convolved with collimator resolution to obtain the total system resolution. Spatial resolution in SPECT is typically measured by imaging fine line or point sources placed on top of the collimator and measuring the observed spread in the reconstructed images. In general, since SPECT system sensitivity is typically significantly lower than that of PET, the statistical quality of the data is lower for the same acquisition duration and more smoothing is required during the image reconstruction process, leading to lower spatial resolution. However, if small-aperture pinhole collimators and long acquisition times are used, the reconstructed spatial resolution may be superior.

Fig. 9

Depiction of the standard position-sensitive scintillation detector configuration for a clinical SPECT system. A scintillation light flash results from the absorption of a high-energy photon. A weighted mean of the pulse heights produced in the PMT array very accurately determines the center of the light flash, which is assumed to be the photon interaction point. An image is formed by positioning many such flashes of light

Figure 10 plots SPECT system point source spatial resolution for high-resolution parallel-hole and pinhole collimators as a function of source to collimator distance. We see that SPECT spatial resolution can be significantly improved by using smaller hole size (e.g., smaller pinholes). For pinhole collimation there is also a magnification factor, given by the pinhole to detector crystal distance divided by the activity source to pinhole distance, that allows further improvement of spatial resolution. For close proximity SPECT imaging, where the source is <5 cm from the collimator, the intrinsic spatial resolution of the detector plays a larger role in determining the spatial resolution. For a 1-mm pinhole, 1-mm crystal pixels, and 5-mm source to collimator distance, <2 mm FWHM spatial resolution may be easily achieved. In human systems using a low-energy high-resolution (LEHR) parallel-hole collimator, a typical value for the best resolution achievable is 7 mm FWHM at 10 cm source to collimator distance. In reality SPECT spatial resolution depends strongly upon the statistical quality of the data, the reconstruction algorithm used, and the degree of smoothing required, and clinical spatial resolutions can be well over 1 cm FWHM.
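The interplay of collimator resolution, intrinsic detector resolution, and pinhole magnification can be illustrated numerically. The sketch below assumes the commonly quoted geometric approximations R ≈ d(l + x)/l for a parallel-hole collimator and R ≈ d(h + x)/h for a pinhole, combines components in quadrature (which treats both as Gaussian), and ignores penetration. The symbols follow the caption of Fig. 10, but the formulas and the example dimensions are assumptions, not the exact expressions of [18].

```python
import math


def parallel_hole_resolution(d, l, x):
    """Collimator FWHM for hole diameter d, hole length l, source distance x."""
    return d * (l + x) / l


def pinhole_resolution(d, h, x):
    """Collimator FWHM (in object space) for pinhole diameter d,
    pinhole-to-detector distance h, and source-to-pinhole distance x."""
    return d * (h + x) / h


def system_resolution(collimator_fwhm, intrinsic_fwhm, magnification=1.0):
    """Quadrature sum; the intrinsic term shrinks by the magnification
    factor when referred back to the object."""
    return math.hypot(collimator_fwhm, intrinsic_fwhm / magnification)


# Hypothetical clinical case: 1.3-mm holes, 30 mm long, source at 100 mm,
# 3.5 mm intrinsic resolution (all lengths in mm)
clinical = system_resolution(parallel_hole_resolution(1.3, 30.0, 100.0), 3.5)

# Hypothetical small animal case: 1-mm pinhole, h = 100 mm, x = 25 mm,
# 1-mm crystal pixels; magnification is h / x
pinhole = system_resolution(pinhole_resolution(1.0, 100.0, 25.0), 1.0,
                            magnification=100.0 / 25.0)
print(clinical, pinhole)
```

Under these assumptions the clinical example gives about 6.6 mm FWHM at 10 cm, and the magnified pinhole example stays below 2 mm FWHM because the detector contribution is divided by the magnification factor.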

Fig. 10

Calculated total system spatial resolution (convolution of collimator and detector components) for a high-resolution, lead, hexagonal-hole, parallel-hole collimator as a function of source to collimator distance, x, for different collimator thicknesses, l (left), and for a single pinhole collimator versus source to pinhole distance, x, for various pinhole aperture-to-detector distances, h (right). Scintillation detector resolution of 2 mm FWHM was assumed for both plots. See Fig. 6 for definition of parameters

Energy resolution

Typical energy resolutions are superior in SPECT vs PET because SPECT systems use NaI(Tl) as a scintillation crystal (see Table 1), which produces brighter scintillation light flashes and creates larger, more robust electronic pulses. Clinical systems use a continuous sheet crystal geometry (Fig. 9), which provides a high aspect ratio for light collection into the PMTs. A typical energy resolution for a SPECT system that uses NaI(Tl) scintillation is 10% FWHM at 140 keV, which allows excellent rejection of scatter photons. Compton scatter is a more significant problem for PET owing to the facts that two photons must be detected per decay, at 511 keV essentially all interactions in tissue are due to scatter, and the system has relatively wide acceptance angles for scatter photons.
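The link between scatter angle, energy loss, and energy windowing can be made concrete with the standard Compton relation E′ = E / [1 + (E/511 keV)(1 − cos θ)]. The sketch below is illustrative only: the symmetric 20% window centered on the photopeak is an assumed setting, and the blurring of measured energies by the finite energy resolution is ignored.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0


def scattered_energy(e_kev, angle_deg):
    """Photon energy after a single Compton scatter through angle_deg."""
    theta = math.radians(angle_deg)
    return e_kev / (1.0 + (e_kev / ELECTRON_REST_ENERGY_KEV)
                    * (1.0 - math.cos(theta)))


def in_energy_window(e_kev, photopeak_kev, fractional_width=0.20):
    """True if e_kev lies in a symmetric window of the given total
    fractional width around the photopeak (assumed window setting)."""
    half_width = 0.5 * fractional_width * photopeak_kev
    return photopeak_kev - half_width <= e_kev <= photopeak_kev + half_width


for angle in (30, 90, 180):
    e_out = scattered_energy(140.0, angle)
    print(angle, round(e_out, 1), in_energy_window(e_out, 140.0))
```

For 140-keV photons, a 90° scatter falls well below the assumed window, whereas a 30° scatter loses only about 5 keV and is still accepted, which is why better energy resolution (and hence a narrower usable window) improves scatter rejection.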

Timing and count rate performance

Note that unlike for PET, timing resolution is not important in SPECT since this image modality requires only one single photon to be detected per event. Owing to the presence of a highly inefficient collimator, count rate performance is also not critical for a SPECT camera unless an enormous amount of high-energy photon (e.g., 131I) activity is injected into the patient, one is performing first-pass blood pool imaging, or there is some other application where a large amount of activity is present in a relatively focal region in front of the camera. Of course, count rate performance is critical if the collimator is removed and coincidence imaging is to be performed with the SPECT system.

Instrumentation to detect single photons

Many of the instrumentation requirements for PET apply for SPECT, with a few exceptions:

  1. Since collimators are required for SPECT, the detectors are not configured in rings but rather in flat panels called heads that must rotate around the patient to view the photon activity from all angles. Only one rotating head is required for SPECT, but most systems have two heads to improve system sensitivity.

  2. Since photon energies are lower in SPECT, crystals do not have to be as thick, dense, and high Z in order to have high intrinsic detection efficiency [see NaI(Tl) entry in Table 1]. Most of the world’s SPECT systems use NaI(Tl) scintillation crystals.

  3. Since the collimator (e.g., Fig. 6) determines the geometric efficiency and spatial resolution performance of SPECT, the detector crystal design requirements are somewhat relaxed. For example, most clinical systems use a continuous sheet of NaI(Tl) scintillation crystal (see Fig. 9) rather than discrete crystal arrays. Another example is that high spatial resolution small animal SPECT can be accomplished with standard clinical scintillation detector heads by substituting a special high-resolution multi-pinhole insert for a standard clinical collimator [19, 20]. However, special minuscule discrete crystal arrays are most common for small animal SPECT system detector designs [21, 22].

A different type of detector material that has become popular in recent years as a candidate to replace the scintillation crystal in SPECT is cadmium zinc telluride (CZT) [23, 24]. CZT is a semiconductor crystal, not a scintillation crystal. Photons that are absorbed in CZT result in a cloud of electron-hole pairs, as is the case for scintillation crystals, but instead of producing light, these electrons and holes drift in opposite directions in a strong electric field applied across the material, and the electronic pulse is directly extracted from the crystal (Fig. 11), rather than having to go through the intermediate steps of scintillation light creation, collection, and photoelectric conversion in a photodetector. The main advantages of this direct conversion of the incoming photon energy to an electronic signal are that (1) the energy resolution (∼5–6% FWHM at 140 keV) is typically much better than for scintillation detectors [23, 24], and (2) high intrinsic spatial resolution may be achieved using fine “pixels” defined by the electrode pattern deposited on the detector faces, rather than requiring arrays of minuscule discrete crystals. Since semiconductor crystals cannot be formed to be as large as scintillation crystals such as NaI(Tl), a CZT-SPECT detector panel typically comprises many submodules tiled together. A potential disadvantage of the pure semiconductor approach is that the detector signals are relatively small in comparison to those of scintillation detectors, so special low-noise electronics are required for electronic readout of each electrode of each module, which significantly increases electronic readout complexity.

Fig. 11

Depiction of a basic CdZnTe detector. A high-energy photon interacts in the CdZnTe and creates electron-hole pairs that drift in a strong electric field (∼1 kV/cm) established across the detector. Opposite polarity signals are induced on the anode and cathode planes with pulse height directly proportional to the amount of charge liberated in the photon interaction. Because the electrons drift over an order of magnitude faster than holes, the signals induced are dominated by electron motion only

How are SPECT data acquired and organized?

In principle, SPECT data from a distribution of a single-photon emitting molecular probe are acquired and organized into 2D projections in a very similar manner to that described previously for PET, with a few exceptions. First, the camera heads are not rings of open crystal but typically one or more rectangular plates covered with collimator that do not completely cover the subject and must be rotated around the patient in order to acquire the full angular sampling/projection data sets required for tomographic reconstruction. Second, if the detector plates use continuous scintillation crystal slabs rather than discrete pixels, as is the case for most clinical systems (Fig. 9), each interaction in the detector is positioned using the light recorded from several PMTs, and the positioned event is typically assigned to fictitious bins in order to have well-defined LOR assignment of events. Data from discrete crystal designs of SPECT cameras are processed very differently. If a discrete crystal design is used, the event position is assigned to the individual crystal that was hit, similar to the process used in PET. Third, only single photons are recorded per positioned event and so there is no coincidence event processing step. Similar to PET, the total pulse height is measured for each event and is compared with the energy window setting for the crystal (see vertical blue lines in Fig. 7c for 99mTc photons). However, if continuous sheet crystals are used, there is just one energy window for the entire crystal. If the measured energy is within the window for the crystal, the event xy position is recorded and assigned to the appropriate position bin. The collected data set comprises the number of photons emitted from the subject and recorded along all system LORs, which are the response lines formed by the collimator and the assigned positioning bin.
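For a continuous sheet crystal, the acquisition steps just described (position each event from the PMT signals, apply one energy window, assign the event to a position bin) can be sketched as follows. This is a simplified illustration under stated assumptions: the PMT signals are taken to be already calibrated so that their sum is the deposited energy in keV, positioning uses a plain signal-weighted mean, and the PMT layout, bin size, and window limits are hypothetical.

```python
def position_event(pmt_xy, pmt_signals):
    """Signal-weighted mean position and summed pulse height of one event."""
    total = sum(pmt_signals)
    x = sum(px * s for (px, _), s in zip(pmt_xy, pmt_signals)) / total
    y = sum(py * s for (_, py), s in zip(pmt_xy, pmt_signals)) / total
    return x, y, total


def acquire_projection(events, pmt_xy, window, bin_size, n_bins):
    """Build a 2D projection: keep events inside the energy window and
    histogram their xy positions into square bins."""
    image = [[0] * n_bins for _ in range(n_bins)]
    low, high = window
    for signals in events:
        x, y, energy = position_event(pmt_xy, signals)
        if not (low <= energy <= high):
            continue  # outside the energy window, e.g. a scattered photon
        ix, iy = int(x // bin_size), int(y // bin_size)
        if 0 <= ix < n_bins and 0 <= iy < n_bins:
            image[iy][ix] += 1
    return image


# Hypothetical 2 x 2 PMT layout (mm) and three events (signals in keV)
PMT_XY = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
EVENTS = [
    [35.0, 35.0, 35.0, 35.0],  # 140 keV at the center
    [20.0, 20.0, 20.0, 20.0],  # 80 keV, rejected by the window
    [70.0, 70.0, 0.0, 0.0],    # 140 keV along the lower edge
]
projection = acquire_projection(EVENTS, PMT_XY, window=(126.0, 154.0),
                                bin_size=2.0, n_bins=5)
print(projection)
```

A discrete crystal design would replace the centroid and binning steps with a look-up of the crystal that was hit, and could store a separate energy window per crystal.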

How are SPECT images formed?

As for PET, both iterative and analytic approaches are used for SPECT. Owing to the relatively low statistical quality of data and the need to incorporate models of collimator effects such as resolution blurring, iterative reconstruction is widely used for both clinical and preclinical SPECT imaging. The iterative image reconstruction process for SPECT is analogous to that depicted for PET in Fig. 5.
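The text does not name a specific algorithm, so the sketch below uses maximum likelihood expectation maximization (MLEM), one widely used iterative method, purely as an illustration of the forward-project, compare, back-project, and update loop. The tiny system matrix and count data are hypothetical; in a real SPECT system the matrix would encode the collimator response, attenuation, and detector blurring.

```python
def mlem(system_matrix, counts, n_iter=50):
    """MLEM reconstruction.

    system_matrix[i][j]: probability that a photon emitted in pixel j is
    recorded in projection bin i. counts[i]: measured counts in bin i.
    """
    n_bins, n_pix = len(system_matrix), len(system_matrix[0])
    sensitivity = [sum(system_matrix[i][j] for i in range(n_bins))
                   for j in range(n_pix)]
    image = [1.0] * n_pix  # uniform initial estimate
    for _ in range(n_iter):
        # Forward project the current estimate into projection space
        projection = [sum(a * f for a, f in zip(row, image))
                      for row in system_matrix]
        # Compare measured and estimated projections
        ratio = [c / p if p > 0 else 0.0 for c, p in zip(counts, projection)]
        # Back project the ratios and apply the multiplicative update
        back = [sum(system_matrix[i][j] * ratio[i] for i in range(n_bins))
                for j in range(n_pix)]
        image = [f * b / s if s > 0 else 0.0
                 for f, b, s in zip(image, back, sensitivity)]
    return image


# Hypothetical 3-bin, 2-pixel system with noise-free data from activity [4, 2]
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
measured = [4.0, 2.0, 3.0]
estimate = mlem(A, measured)
print(estimate)
```

With noisy data the iterations are usually stopped early or regularized, since running to convergence amplifies noise.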

Corrections and calibrations for SPECT data

SPECT also has several corrections that must be implemented for best qualitative and quantitative accuracy of image data. The method of correction is in general different for continuous sheet (Fig. 9) versus discrete crystal designs. Some of the most common corrections required are as follows:

  1. Uniformity correction is used to account for the variations in intrinsic detection efficiency that can cause hot or cold spot artifacts to appear. Typically a high statistics flood field image is acquired for each head to form a flood table indicating the interpixel efficiency variations and any acquired image data are normalized (pixel by pixel) by this flood table.

  2. Energy (pulse height) correction is used in continuous sheet crystal detector head designs to account for the different total pulse magnitude created across a given detector head owing to variations in properties such as crystal light yield and PMT quantum efficiency (probability of turning light into electrons through the photoelectric effect) as a function of the position across the detector head. A good energy correction will allow the camera to achieve the best energy resolution, which is useful to reject scatter photons. This correction is typically determined using a collimated point source that is moved to positions across the crystal that are directly over the center of the PMTs, and amplifier settings or PMT bias are adjusted until the measured photopeak pulse height is the same for all positions. Discrete crystal detector designs do not typically require this energy correction step, as a distinct energy window for each discrete crystal may be stored in a look-up table that is referred to when deciding whether to accept an event assigned to a given individual crystal location.

  3. Spatial linearity correction is used in continuous sheet crystal designs in order to correct for spatial distortions that misrepresent the true radionuclide distribution. A table of corrections versus detector bin location may be obtained using a rectangular grid of equally spaced holes placed on top of the scintillation crystal, which provides reference points from which to measure the degree of mispositioning over the head. The grid is flood irradiated and a shadow image of the holes is formed, which provides a map of the displacement in both directions from the correct location across the head. For all future acquisitions, positioned events are translated by an amount predetermined by the distortion table. Spatial linearity correction for discrete crystal array designs is easily implemented since one knows the true position of each individual array crystal to which an incoming event is assigned.
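As an illustration of the first correction in the list, the sketch below builds a flood table from a high statistics flood image and normalizes an acquired image pixel by pixel. Scaling the flood table to a mean of 1 is one possible convention and is an assumption here, as are the small example images.

```python
def flood_table(flood_image):
    """Relative efficiency of each pixel: flood counts divided by their mean."""
    pixels = [v for row in flood_image for v in row]
    mean = sum(pixels) / len(pixels)
    return [[v / mean for v in row] for row in flood_image]


def uniformity_correct(image, table):
    """Divide each pixel by its relative efficiency; pixels with zero
    efficiency (dead pixels) are set to zero rather than divided."""
    return [[p / t if t > 0 else 0.0 for p, t in zip(img_row, tab_row)]
            for img_row, tab_row in zip(image, table)]


# Hypothetical 2 x 2 head: one pixel 10% less and one 10% more efficient
flood = [[900.0, 1000.0], [1100.0, 1000.0]]
raw = [[90.0, 100.0], [110.0, 100.0]]  # uniform source seen through that head
corrected = uniformity_correct(raw, flood_table(flood))
print(corrected)
```

The flood image needs many more counts per pixel than a typical study; otherwise its own statistical noise is propagated into every corrected image.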

Bioluminescence imaging

What is bioluminescence?

Molecular imaging using BLI requires cellular expression of an enzyme known as luciferase that is responsible for making some insects, jellyfish, and bacteria glow [25]. The gene for this enzyme is incorporated into DNA of cells, micro-organisms, or animal models of disease. If an appropriate substrate is available for the enzyme to act upon, the result is a reaction that emits a subtle glow of visible light called bioluminescence (BL) that can be used to monitor cellular and genetic activity of every cell that expresses the luciferase enzyme [26, 27]. In the case of firefly luciferase, the substrate D-luciferin must be present (introduced into the subject) as well as oxygen and adenosine triphosphate (ATP) in order for the reaction to occur. The peak wavelength of that glow for naturally occurring firefly luciferase is at ∼560 nm. For bacterial luciferase, the substrate is produced endogenously and the peak emission wavelength is ∼490 nm. Recently, luciferase genes isolated from insects and sea organisms have been genetically modified to be efficiently expressed in mammalian cells [28]. The result of two of these mutations has been a shift in the BL emission peak to ∼615 nm (Fig. 12).

Fig. 12

BL emission spectra from four modified luciferases that have been used for molecular imaging. These spectra were obtained at 35°C. hRlu Codon usage humanized Renilla luciferase, CBGr green click beetle luciferase, CBRe red click beetle luciferase, Fluc modified firefly luciferase. The modified firefly enzyme spectrum shows a red shift at body temperature. The emission spectra of all other luciferases are not affected by temperature. Photons emitted in the shaded gray region of the plot (>600 nm) have the best tissue transmission for imaging. Courtesy of Hui Zhao and Christopher Contag, Stanford University [28] (with permission)

What are the signals detected in BLI?

Since BL emissions result from an ongoing chemical reaction with inherent kinetic variations, the resulting light is emitted in a continuous glow of visible light photons of varying frequencies (color), known as a spectrum, that is peaked at a most probable emission wavelength (see Fig. 12). Various forms of luciferase and associated substrates are available which produce light in the visible range (wavelengths of 400–700 nm or energies of 1.5–3 eV). In contrast, radionuclides of interest emit high-energy photons of very specific energies and the photons are emitted essentially one at a time (for SPECT) or in a coincident pair (for PET) as a result of decay of a single atomic nucleus. Thus, in radionuclide imaging, photons are detected and processed one at a time, but in BLI, a continuous current of photons is collected (integrated) and processed in a single or multiple exposures of the optical camera sensor. Typically, extremely low levels of light (low quantum yield) are generated from BL reactions and significantly less light escapes the subject; hence exposure times are required to be relatively long compared with ordinary photography, and so the electronic noise level of the sensor should be very low.

What types of events are recorded in a BLI system?

BL is emitted from every cell throughout the body of the subject that expresses the luciferase enzyme and that reacts with the appropriate substrate. Figure 13 depicts several visible light photons emitted from a BL source within a mouse and migrating through tissue. Like high-energy photons, emitted BL photons (a.k.a. the fluence) propagate through and interact with tissue molecules through scatter and absorption and the photon beam is attenuated. However, tissue is a turbid medium for visible light propagation, and the scatter and absorption attenuation coefficients (μs, μa) are typically about 100 to 1,000 times larger for visible light than for high-energy photons (e.g., μs∼10–20 cm⁻¹ and μa∼0.5 cm⁻¹ for red light propagating in tissue vs μs∼0.1 cm⁻¹ and μa∼0.0005 cm⁻¹ for 511-keV photons). It is often convenient to quantify the average scatter probability in terms of the reduced scattering coefficient μs’ = (1 − g)μs, where g is the average cosine of the photon scatter angle over many scatters [29, 30]. The reduced scattering coefficient is roughly the equivalent scattering coefficient assuming scatter was isotropic. In most tissues, the average scatter angle is typically small, with g∼0.9. At the microscopic level, light scatter in tissue is mostly due to refractive index variations encountered at the cellular membrane and cellular constituents. The extent of absorption varies with the cell and tissue type encountered; absorption is especially significant for tissues with high hemoglobin content, which strongly absorb blue-green emissions, but tissue is relatively transparent for red wavelengths [29, 30]. Thus, depending upon the depth of the BL source(s), most of the emissions may not escape the subject’s body for imaging.
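The reduced scattering coefficient defined above, μs’ = (1 − g)μs, is simple to evaluate. The short sketch below applies it to the example values quoted in the paragraph and also compares the visible light and 511-keV coefficients; the function name is hypothetical, and the inputs are simply the figures given in the text rather than new measurements.

```python
def reduced_scattering(mu_s, g):
    """Reduced scattering coefficient mu_s' = (1 - g) * mu_s, where g is the
    average cosine of the scatter angle (g = 0 means isotropic scatter)."""
    if not -1.0 <= g <= 1.0:
        raise ValueError("g must lie between -1 and 1")
    return (1.0 - g) * mu_s


# Values quoted in the text (cm^-1), with g ~ 0.9 for most tissues
for mu_s in (10.0, 20.0):
    print(mu_s, reduced_scattering(mu_s, 0.9))

# Ratio of red light to 511-keV coefficients quoted in the text
scatter_ratio = 10.0 / 0.1
absorption_ratio = 0.5 / 0.0005
print(scatter_ratio, absorption_ratio)
```

Because g is close to 1, many strongly forward-peaked scatters are needed before a photon loses memory of its original direction, which is why the reduced coefficient is an order of magnitude smaller than μs itself.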

Fig. 13

Depiction of a spectrum of photons emitted from a BL focus near the center of a mouse and migrating through tissue. Blue-green photons tend to be absorbed. Red photons tend to scatter. The result is a diffuse distribution of light emitted from the surface that is viewed by the photon imaging system

Figure 14 depicts the dependence of μa and μs’ in liver tissue on wavelength. Since the lower wavelength light photons of a BL emission spectrum (<600 nm) tend to be absorbed in tissue >1 mm thick, and the higher wavelength photons (>600 nm) are much more likely to scatter (μs≫μa), the remaining photons that do escape the subject’s body are highly diffuse and reddish in color and have undergone multiple scatters before radiating from the surface (see Fig. 13). This high degree of scatter causes the photons to take long and highly irregular (i.e., diffusive) paths through tissue. Figure 15a plots the approximate BL radiance signal intensity versus BL source depth within tissue for red (∼650 nm), orange (∼590 nm), and yellow-green (∼550 nm) light [31]. We see that red and yellow-green light sources 1 cm deep within soft tissue are attenuated by a factor of approximately 100 and 10⁹, respectively. Thus, in vivo applications of BLI systems are most useful for small mouse (typically nude or shaved) models of disease since most of the organs of interest are found at most 1–2 cm deep within turbid tissue. It is also clear that for best depth sensitivity, the camera system should be very sensitive in the red and near-infrared (NIR) portion of the BL emission spectrum (700–900 nm).

Fig. 14

Plot of light absorption and reduced scatter coefficients, μa and μs’, versus wavelength in liver tissue. Light of wavelength below 600 nm is strongly absorbed in soft tissue. Above 600 nm, light is strongly scattered rather than absorbed. The absorption of light is particularly high in liver tissue owing to the high vascular content (presence of oxyhemoglobin and deoxyhemoglobin). Courtesy of Bradley Rice, Xenogen Corporation (with permission)

Fig. 15

a Intensity of transmitted light (arbitrary units) and b resolution (FWHM) of surface radiance spot versus the light source depth within tissue for yellow-green, orange, and red wavelength light. The absorption and reduced scatter coefficients assumed for the plots are in the legend of b. Adapted from [31], with permission from Bradley Rice, Xenogen Corporation

Important performance parameters for BLI

As for PET and SPECT systems, performance parameters dictate a BLI system’s molecular sensitivity. However, because BL signals comprise a continuous glow of a spectrum of light rather than individual, monoenergetic photon events, the performance parameters for BLI are quite distinct from those for radionuclide imaging systems, and BLI systems have much more in common with ordinary digital cameras.

Photon sensitivity

Analogous to radionuclide imaging, the BLI photon sensitivity could be defined as the fraction of light emitted from a source that is collected, detected, and used by the imaging system to form images. But since optical imaging systems run in photon integration mode, a more practical and useful definition of “sensitivity” is the minimum light signal that may be detected, which is ultimately limited by the level of background signal present. Since BL emissions comprise visible light, the BLI system sensitivity depends strongly upon the light sensor quantum efficiency, or probability that light impinging upon the sensor will be converted to electric charge through the photoelectric effect. For BL sources >1 mm depth, mainly >600 nm light radiates from the body [31] and so the BL sensor quantum efficiency should be high for red and NIR light. The sensitivity to very low levels of light (perhaps from BL sources deep in tissue) also depends upon the level of the background dark current (the level of electronic current present in the sensor without a light source), which is mainly due to the thermionic emission and electronic readout noise contributions of the sensor, and should be low for highest sensitivity. The minimum detectable number of BL photons must exceed this inherent noise level of the light sensor system. Finally, the BLI system sensitivity also depends upon the geometric light collection efficiency, which in turn is dependent upon the light collection efficiency of the BLI system optics and the focal distance to the surface radiance of interest. The minimum detectable signal of a BLI system is a radiance that is just above the effective observed background radiance level, which depends upon the total noise intensity measured for a given pixel size and exposure time; typical minimum values are 20–100 photons/pixel/second [31].

Spatial resolution

The BL photons that escape the body through a given area and direction on the surface of the animal (i.e., the surface radiance in Watts/cm2/steradian, where the steradian is the unit of solid angle) are used for BL imaging. Spatial resolution in BLI is mainly limited by the surface radiance spatial resolution (spot size), which is in turn a function of the BL source depth. The BL camera spatial resolution, which is a function of the camera lens magnification factor and the light sensor pixel size, is typically high (in the micron range), and so contributes insignificantly to the overall measured surface radiance spot size compared with the BL source depth effect. Figure 15b plots the approximate BL surface radiance signal spot size (i.e., spatial resolution) versus BL source depth within tissue for red (∼650 nm), orange (∼590 nm), and yellow-green (∼550 nm) light [31]. Owing to the highly diffusive nature of red and NIR light propagation, the measured spatial resolution is worse (surface radiance spot size is larger) for light in the long-wavelength portion of the BL emission spectrum. A good rule of thumb is that the measured spatial resolution (FWHM) of the surface radiance signal from a red light source is roughly equal to (slightly less than) the depth below the surface at which it resides.

Instrumentation to image BL emissions

For radionuclide photon imaging (PET and SPECT) one is able to precisely fix the direction of photon incidence into the system in order to determine a response line through the imaging subject that is correlated with the location of the radionuclide that emitted the photon. In BLI, such collimation and response line identification are not possible since the signal comprises many photons that scatter multiple times before exiting the tissue. Furthermore, since tissue and air have different indices of refraction, the light reaching the tissue surface will undergo significant refraction before entering the camera. Therefore what remains for imaging at a particular single projection view is a diffuse “blob” of photons, and imaging systems used for BL are typically more similar to ordinary digital cameras. In addition, considerations of how the light from BL surface radiance is collected and focused onto the light sensor are important. Most often a standard optical lens system is used to collect light from the animal (optically focusing on the surface of interest) and focus the image onto the relatively small light sensor that is used to create a digital image. Finally, for highest sensitivity to low levels of BL light, there should be no contamination from ambient laboratory light.

Figure 16 depicts an in vivo imaging system (IVIS) configuration offered by Xenogen Corporation [32] for BLI. BLI requires a light-tight box that isolates the animal from ambient light. A highly sensitive digital camera system is carefully coupled to this box. The camera system comprises an optical lens system that collects light from the animal with a focus on the surface of interest. The animal stage can translate toward and away from the lens in order to vary the FOV. The lens transmits the collected light image and focuses it onto a highly sensitive, low-noise, charge-coupled device (CCD) that converts the light image into a digital image. A typical CCD chip used may have a 2.5×2.5 cm2 sensitive area with a 1,024×1,024 array of 24-μm pixels. The optics facilitate collection of light from a relatively large FOV (e.g., 10×10 cm2) into the smaller (e.g., 2.5×2.5 cm2) CCD sensitive area. Each pixel collects photons and converts them into photoelectrons for a selectable time (the frame duration or integration time) before readout. The pixels in a CCD are connected in such a manner that during readout the charge passes through each pixel, with the output of one pixel serving as the input to the next one. This yields a sequential pattern of the charge collected for every pixel from a given exposure, corresponding to the incoming light intensity, that is read out into an output register, amplifier, and digitizer that converts the pattern to a corresponding digital image pixel intensity. The CCD display software in turn converts this intensity pattern into an image.
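The geometry quoted above can be checked with a few lines of arithmetic. The sketch below uses only the example numbers given in the text (1,024 pixels of 24 μm across, 10 cm FOV); the function name `ccd_geometry` is hypothetical, and the lens is idealized as a uniform demagnification with no distortion.

```python
def ccd_geometry(n_pixels, pixel_pitch_um, fov_cm):
    """Return sensor width (cm), optical demagnification, and the size of
    one CCD pixel projected back onto the object plane (micrometers)."""
    sensor_cm = n_pixels * pixel_pitch_um * 1e-4   # micrometers -> centimeters
    demagnification = fov_cm / sensor_cm           # FOV width / sensor width
    object_pixel_um = pixel_pitch_um * demagnification
    return sensor_cm, demagnification, object_pixel_um

sensor_cm, demag, object_pixel_um = ccd_geometry(1024, 24.0, 10.0)
print(f"sensor width        : {sensor_cm:.2f} cm")
print(f"demagnification     : {demag:.2f}x")
print(f"pixel at the animal : {object_pixel_um:.0f} um")
```

With these numbers the sensor width comes out close to the quoted 2.5 cm, and each unbinned pixel views roughly a tenth of a millimeter at the animal surface, much finer than the surface radiance spot size for sources at millimeter depths.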

Fig. 16 Depiction of the IVIS system available from Xenogen Corporation [32]

The CCD chip used in a typical BLI system is different from that used in an ordinary digital camera in a few ways. First, the CCD sits in a cryogenic chamber for cooling to very low temperatures (e.g., −105 to −120°C). Cooling reduces the dark noise by significantly reducing the background rate of thermionic emission (constant thermal release of electrons from the silicon crystal lattice) in the silicon CCD chip. The dark current falls by a factor of 10 for every 20°C drop in temperature. Second, the quantum efficiency of the CCD can be significantly improved for low light detection by thinning the conventional backside of the CCD, turning the chip around so that light illuminates the backside, and using an antireflective coating. The quantum efficiency of such a high-sensitivity back-thinned, back-illuminated CCD camera is typically greater than 80% for red and NIR light. A variable optical filter is employed to allow the user to select certain regions of the BL emission spectrum, if desired. A red light-emitting diode (LED) is also present to allow ordinary reflected light photographic images to be taken of the surface of the animal in order to have a natural anatomical framework on which to fuse the BL emission images. If the imaging system includes a powerful excitation source that can excite a fluorescent center inside the animal, optical fluorescence emissions can also be imaged with a typical BL system configuration.
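The benefit of cooling follows directly from the factor-of-10-per-20°C rule quoted above. The sketch below simply extrapolates that rule; the 20°C starting temperature is an assumed room-temperature reference, the function name is hypothetical, and the rule is assumed to hold over the whole temperature range, which a real sensor need not satisfy.

```python
def dark_current_reduction(t_warm_c, t_cold_c, deg_c_per_decade=20.0):
    """Factor by which dark current falls when cooling from t_warm_c to
    t_cold_c, assuming a tenfold drop for every deg_c_per_decade degrees."""
    return 10.0 ** ((t_warm_c - t_cold_c) / deg_c_per_decade)

# Cooling from an assumed 20 C room temperature to -105 C spans 125 C,
# i.e. 6.25 decades under the rule of thumb.
factor = dark_current_reduction(20.0, -105.0)
print(f"dark current reduced by a factor of about {factor:.2e}")
```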

How are BLI data acquired and organized?

If firefly luciferase is the BL-producing enzyme under study, the substrate luciferin is introduced into the subject an appropriate time before the study begins and the animal subject is positioned on a shelf within the dark box. A typical BLI acquisition consists of first activating the LEDs, acquiring a photographic image with the CCD camera system, and displaying the image with a gray-scale color scheme. The LEDs are then turned off and the camera is exposed to the BL surface radiance within the dark box. The exposure duration can range from seconds to minutes depending on the amount of active luciferase available within the subject, the depth of the active region of interest, the amount of light escaping the subject, and the number of exposures desired. Owing to the very low light level emitted in a typical BL reaction, a typical exposure duration might be 30 s to a minute. For very low light levels, the image signal-to-noise ratio may be improved by summing the charge generated from a square group of multiple small pixels (e.g., 10×10 pixels each 24×24 μm) into one big pixel [31]. This operation increases the electronic signal per image pixel, but reduces the image resolution along any direction by the same factor (e.g., a factor of 10). Since the spatial resolution is limited by the surface radiance spot size rather than the CCD pixel size, as long as the grouping is not too large, this resolution–sensitivity trade-off is typically acceptable. The acquired image results in a picture of the BL surface radiance in units of photons/second/cm2/steradian. The BL image intensity is often displayed using a red-green-blue color scale (with red indicating the highest and violet the lowest intensity, as depicted in Fig. 13) and the BL image is overlaid onto the gray-scale photographic image for anatomical correlation.
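The pixel grouping (binning) operation described above amounts to summing the charge in square blocks of pixels. The following sketch shows only the arithmetic on an already digitized frame; the function name `bin_pixels` and the small test frame are illustrative assumptions. In an actual camera the charge is typically summed on the chip before readout, so that readout noise is incurred once per binned pixel, which this software version does not model.

```python
def bin_pixels(image, factor):
    """Sum factor x factor blocks of a 2D list-of-lists image.

    The output has 1/factor as many rows and columns as the input, and
    each output pixel holds the total charge of its block.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of the bin factor")
    binned = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            total = sum(image[r][c]
                        for r in range(r0, r0 + factor)
                        for c in range(c0, c0 + factor))
            out_row.append(total)
        binned.append(out_row)
    return binned

# A 4x4 frame of hypothetical charge counts, binned 2x2 into a 2x2 image.
frame = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
binned = bin_pixels(frame, 2)
print(binned)  # [[14, 22], [46, 54]]
```

The total charge is conserved while the number of pixels along each direction drops by the bin factor, which is the resolution–sensitivity trade-off noted in the text.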

BL tomography?

Unlike in PET and SPECT, the field of BLI up to now has mainly used single-view, non-tomographic, planar imaging to estimate the luciferase-producing cell distribution within a mouse. Limitations of planar compared with tomographic imaging are: (1) planar images are a superposition of emissions from all depths, which limits image contrast resolution, especially if there is more than one BL focal source; (2) lack of precise depth localization; (3) strong depth-dependent resolution blurring; and (4) lack of quantification due to significant photon attenuation. 3D BLI is needed because there are applications for which the BL surface radiance signal is weak from certain views, but stronger for others, and it is simply important to sample the signal from different projection angles. For 3D projection imaging capabilities, the BL system should include a mechanism to acquire multiple planar views from several orientations about the animal.

For a given view, the surface radiance is often very diffuse and the depth of the BL source is uncertain. BL tomography (BLT) would give the ability to reconstruct cross-sectional slices through the BL site(s) and recover depth and spatial resolution of BL sources and could significantly enhance contrast resolution and quantitative accuracy. However, BLT requires the incorporation of an accurate photon migration model into the reconstruction process to account for light absorption and scatter along any given path of the photons [33, 34]. For tomographic imaging capabilities, since the surface radiance strongly depends upon the tissue thickness that the light traverses, a special light source should be available to measure the 3D contours of the surface of the animal. This map of the animal’s surface boundaries together with an accurate model of light propagation through the tissue along many different paths and/or measured or estimated optical scatter and absorption parameters may be incorporated into a 3D image reconstruction algorithm to estimate the 3D distribution of cells expressing active luciferase in the animal. As stated, BLT may allow better resolution of structures deep within tissues, improve quantification of image data, and give a more faithful visual representation of the BL source(s). These properties will provide more useful correlation to images from other modalities, such as PET. However, the very low BL quantum yield, high scatter and absorption, and significant tissue heterogeneity make in vivo BLT an extremely challenging problem [33, 34].

Corrections and calibrations for BLI data

For background correction, images without luminescent sources present and with the camera shutter closed are typically acquired daily for various acquisition times, and all subsequent studies of a given duration are corrected using the appropriate background file. Reliability of system measurements is checked weekly with a calibrated light source of known intensity. Since a significant portion of the BL emission spectrum is strongly attenuated in tissue, the intensity of light detected at the surface from a given population of BL cells is strongly dependent upon its depth. Thus, for planar BLI, absolute quantification of signals in vivo from an unknown distribution of BL-emitting cells is not possible, especially if any of the cells are ≥1 mm deep in tissue below the surface from which a radiance measurement is taken. However, it is possible to perform an absolute intensity calibration of the BL imaging system in order to convert the measured image intensity into units of photon radiance (photons/second/cm2/steradian). To determine this conversion factor, typically a calibrated, very weak light source of known photon radiance is placed in front of the camera and imaged. The conversion factor of image intensity into radiance is determined as a function of the lens position, the imaging FOV, the selected imaging pixel bin size, and the wavelength of the calibration source light.
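The background correction and absolute intensity calibration described above can be sketched as two small steps. This is a simplified illustration, not the vendor's procedure: the function names and all count and radiance values are hypothetical, a single scalar conversion factor is assumed for one fixed lens position, FOV, bin size, and wavelength, and the detector response is assumed to be linear.

```python
def calibration_factor(source_radiance, source_counts,
                       background_counts, exposure_s):
    """Radiance per (count/second), derived from a calibrated source image.

    background_counts must come from a dark frame of the same duration.
    """
    net_rate = (source_counts - background_counts) / exposure_s
    if net_rate <= 0:
        raise ValueError("calibration source is not above background")
    return source_radiance / net_rate

def counts_to_radiance(counts, background_counts, exposure_s, factor):
    """Convert a background-corrected pixel value to photon radiance."""
    return factor * (counts - background_counts) / exposure_s

# Hypothetical calibration: a source of known radiance 1.0e6 (in the
# radiance units of its certificate) gives 5200 counts in 10 s, while
# the 10 s dark frame gives 200 counts.
factor = calibration_factor(1.0e6, 5200, 200, 10.0)

# Hypothetical study pixel: 1200 counts in 60 s, 60 s dark frame at 600.
radiance = counts_to_radiance(1200, 600, 60.0, factor)
print(factor, radiance)
```

Because the factor depends on the acquisition settings, in practice one such factor would be stored per combination of lens position, FOV, bin size, and filter.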

Bioluminescence versus fluorescence imaging

Optical fluorescence imaging (FLI) is capable of imaging a variety of in vivo processes such as protein function, enzyme biodistribution, and gene expression occurring deep within tissues of live small laboratory animal subjects (mainly mice), by observing the surface distribution of FL signals. FL molecules may be genetically engineered into a mouse, for example by incorporating the gene for an FL protein as a reporter gene [35], or by using fluorophores or fluorescent particles known as quantum dots to label a biologically interesting molecule [36]. BLI has an advantage over FLI in that an external excitation light source is not required. Because the external excitation light required for FLI is significantly attenuated in tissue and produces significant amounts of auto-fluorescence, especially for FL sources deep within tissues that emit at wavelengths <600 nm, the background light levels are much lower for BLI than for FLI. However, auto-fluorescence is less of a problem if one uses NIR fluorochromes, because hemoglobin and water, the major absorbers of visible and IR light, respectively, have their lowest absorption coefficients in the NIR. One may use an appropriate long-wavelength-pass (long-pass) filter in front of the photodetector (e.g., CCD) to allow only the NIR light to pass. Another advantage of imaging BL-tagged cells in tissue is that the data are more easily quantified because the signal measured on the surface of the animal subject is simply proportional to the number of luminescent cells. For fluorescence, the signal level is proportional to both the number of FL cells and the intensity of the external excitation light, which is strongly attenuated in the tissue in front of the target fluorophore. This problem is especially evident if the target fluorophore absorption spectrum peaks at shorter wavelengths. These factors make it more difficult to quantify the fluorescent probe concentration distribution. Nevertheless, photon migration models have been developed to account for these effects in an attempt to restore quantitative accuracy for FLI [37].

FLI has some major advantages over BLI. The FL quantum yield is orders of magnitude higher than for BL. Genetically modified luciferases that emit at longer wavelengths [28] are therefore critical to improve tissue penetration of the weak BL light signal, and high-sensitivity, low-noise imaging detectors such as cooled CCD imaging systems are required for BLI [31]. In contrast, there is much work on developing red- and NIR-emitting fluorescent molecular probes with high quantum yields that will facilitate more robust signals emitted from deeper within tissue, which somewhat relaxes the noise requirements of the photodetector. FLI can be performed in both live and fixed cells and no substrate is required. Fluorochromes can be coupled to peptides and antibodies and fluorescence signals may be activatable or switched on and off by the presence or absence of specific molecules or molecular events [38], which can help to further reduce background signal. In contrast, the generation of BL is specific to cells that contain the luciferase reporter gene, and is thus limited to studying genetically manipulated cells, transgenic mice, or infectious agents such as bacteria or viruses.

FLI has other distinct advantages over BLI for in vivo tomography. Because there is no control or modulation of BL intensity once the reaction begins, the image reconstruction problem for BLT is more ill-posed than for fluorescence tomography (FLT). As a consequence, most BL imaging studies are still acquired in non-tomographic, 2D planar projection mode. In contrast, FLT has appeared in research literature for years [39, 40]. FLT images molecular processes in 3D by reconstructing the distribution of molecular probes tagged with fluorescent proteins, preferably emitting in the NIR for better tissue transmission. In FLT typically an intense laser excitation source propagates light into the subject and the emitted FL signals are collected from multiple views. Three approaches are used to probe deep tissue volumes in FLT. The time-domain (TD) approach [41] uses the fact that those paths of light propagation that arrive first at the photodetector have undergone the least scatter, and have therefore on average interacted with less diffusive tissue than photon tracks arriving at later times. The TD method requires extremely fast NIR laser pulses, detectors, and electronics to be able to measure the time-of-flight distribution over the scanned tissues. In addition to the intensity distribution, the temporal information gives the fluorescence lifetime, independent determination of absorption and scatter coefficients, and fluorophore depth. The frequency domain approach [42, 43] uses an intensity-modulated excitation light source wave. The wave is distorted in optically heterogeneous tissue, resulting in reductions in amplitude and phase shifts of the detected light wave with respect to the incident excitation wave. These changes are measured in photodetectors placed at the surface of the body. The resulting information is converted into maps of the tissue interior. In the continuous wave approach [42, 44] the excitation and FL emission light are steady-state light sources and the distribution of FL emitters throughout the subject is reconstructed from intensity measurements at the subject boundaries. All three approaches to FLT require an accurate photon transport model.

Optical versus radionuclide tomography

In non-diffusive imaging techniques such as PET, SPECT and X-ray computed tomography (CT), the photons are assumed to travel along straight lines between the source and detector and image reconstruction does not critically rely on incorporating a model of photon transport through the tissue. In optical tomography methods (BLT and FLT), it is extremely unlikely that light photons will arrive at a photodetector without scattering multiple times. Thus, accurate BLT or FLT image reconstructions rely on including an accurate photon migration model (the forward model). Modeling may be performed numerically using the general light transport equation, referred to as the Boltzmann equation, analytically using the diffusion approximation to this equation, or through Monte Carlo simulation techniques that treat light as a collection of discrete particles migrating through tissue. Owing to strong attenuation of the light signals in tissue, BLT and FLT are most sensitive to shallow sources, but can perhaps resolve light sources up to ∼1 cm deep. Thus, in vivo BLT and FLT are confined to small laboratory animal imaging, and are not easily translated to the clinic with the exception of imaging thin extremities such as fingers, toes, and perhaps compressed breast tissue [37]. In contrast, PET and SPECT are clinically used to image structures deep within the human body. Quantitative radionuclide tomography is facilitated with accurate photon attenuation correction methods and various calibrations described earlier.
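To give a feel for why optical tomography favors shallow sources, the diffusion approximation mentioned above has a simple closed-form solution for a steady-state point source in an infinite, homogeneous medium. The sketch below evaluates that textbook solution; it is not the forward model of any cited work, and the optical coefficients used (absorption 0.3 cm^-1, reduced scattering 10 cm^-1) are assumed, order-of-magnitude values chosen only for illustration.

```python
import math

def effective_attenuation(mu_a, mu_s_prime):
    """Effective attenuation coefficient (1/cm) in the diffusion
    approximation: mu_eff = sqrt(3 * mu_a * (mu_a + mu_s'))."""
    return math.sqrt(3.0 * mu_a * (mu_a + mu_s_prime))

def point_source_fluence(power, r_cm, mu_a, mu_s_prime):
    """Steady-state fluence rate at distance r_cm from an isotropic point
    source in an infinite homogeneous medium:
    phi(r) = power * exp(-mu_eff * r) / (4 * pi * D * r)."""
    diffusion_coeff = 1.0 / (3.0 * (mu_a + mu_s_prime))   # D, in cm
    mu_eff = effective_attenuation(mu_a, mu_s_prime)
    return power * math.exp(-mu_eff * r_cm) / (4.0 * math.pi * diffusion_coeff * r_cm)

# Assumed tissue-like optical properties (illustrative only).
MU_A, MU_S_PRIME = 0.3, 10.0
reference = point_source_fluence(1.0, 0.1, MU_A, MU_S_PRIME)
for depth_cm in (0.1, 0.5, 1.0):
    relative = point_source_fluence(1.0, depth_cm, MU_A, MU_S_PRIME) / reference
    print(f"{depth_cm:.1f} cm: relative fluence {relative:.2e}")
```

A real reconstruction would additionally need boundary conditions at the tissue surface and heterogeneous optical properties, which is where numerical transport solutions or Monte Carlo simulation come in.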

Combining multiple molecular imaging modalities

There is no single best modality for molecular imaging and one may have to use a combination of more than one imaging modality and molecular contrast strategy to answer the questions of interest [45]. Combining complementary modalities can, for example, add anatomical and/or physiological information to molecular imaging studies, correlate two distinct biological measurements in time, or allow simultaneous imaging of multiple molecular targets or probes. Software fusion of data from two separate imaging modalities is possible with the help of anatomical or fiducial markers that allow spatial registration of the two image volumes, but such efforts are most successful for studies of organs and tissues that do not move with time, such as the brain [46]. The other approach for multimodality imaging is to develop a system that combines more than one modality into one instrument. Such a hybrid system might allow simultaneous and/or sequential acquisitions with the different modalities. A clinical example of the power of multimodality imaging is the combination of PET and CT to localize primary, recurrent, and metastatic cancer throughout the body [47]. PET/CT is an ideal combination since the result is a tool that provides information that cannot be obtained as easily using the two modalities separately. PET is used to measure the increased metabolic or cellular activity of the cancer and CT is used to provide high-resolution visualization of the corresponding anatomy where the cancer resides. Adding CT to PET has the additional benefit of enhancing PET’s accuracy and throughput by facilitating a rapid, low-noise, accurate estimate of photon attenuation coefficients [48, 49]. Furthermore, the integrated PET/CT system does not compromise the performance of either system. Clinical SPECT/CT systems that have become available recently will likely also play important roles in characterizing diseases for which SPECT has strong imaging capabilities [50, 51]. The greater flexibility offered by small animal imaging research has resulted in the development of several high-resolution dual- and tri-modality systems such as SPECT/CT [52–54], SPECT/optical [55], PET/optical [56], PET/SPECT/CT [52, 57], and PET/magnetic resonance imaging (MRI) [58, 59]. Such multimodality systems facilitate a range of in vivo strategies to obtain rich, correlative information about the molecular basis of disease, and enhance interpretation and quantification capabilities of data from the individual modalities involved.

Table 2 lists the general properties of several commercially available, non-invasive, in vivo imaging modalities that may be combined to enhance the information obtained in molecular imaging studies. A distinct advantage of radionuclide methods is that the small probe mass and labeling strategies do not significantly perturb the biological processes under study. PET has high molecular sensitivity and strong quantitative potential. SPECT can image multiple probes simultaneously provided they each emit distinct photon energies. BLI and FLI have the strengths of high molecular sensitivity (BL has the highest listed in the table), low cost, and no requirement for ionizing radiation. MRI and CT combine high-resolution morphological capabilities with physiological information. MRI can also provide molecular information using mass quantities of probe [60]. MRI’s morphological contrast resolution is high in soft tissue. CT’s contrast resolution is best for bone and lung. Ultrasound has the advantages of being widely available clinically, relatively inexpensive, and capable of acquiring real-time physiological information. Ultrasound also shows promise for targeted imaging [61].

Table 2 Properties of several commercially available non-invasive in vivo imaging modalities that are used in molecular imaging studies