11.1 Introduction

The ability of fluorescence microscopy to simultaneously image multiple specific molecules of interest has allowed biologists to infer macromolecular organization and, in the case of live-cell imaging, even transient molecular interaction. Many consequential conclusions are drawn from the various approaches used to display or quantify these images, including merged colorimetric display of two monochrome images and colocalization analyses. Collectively, these techniques may yield information about relative molecular abundance, spatiotemporal co-occurrence of molecules within a given cellular space, biological function (in the case of biosensors or ionic probes), and other, more complex examples of coupled variables. Unfortunately, when implemented without in-depth understanding, these approaches are often fraught with problems. Advances in computer technologies and software development have made the implementation of these techniques appear, at first glance, so deceptively straightforward and intuitive that the various caveats and the underlying quantitative aspects of these methods are frequently overlooked. This chapter discusses the underlying principles of how these techniques quantify their corresponding coefficients, along with their strengths and limitations, and then explores the practical applications of these methods.

11.2 Colocalization: The Analysis of Similarity in Two Grayscale Images

One of the most common questions in the life sciences concerns the extent of biological association – whether a biomolecule or structure of interest is associated with a given organelle, compartment, protein, or other structure within a cell (Dunn et al. 2011). This analysis of coupled variables forms the foundation of a “colocalization” study. Colocalization is often used by biologists as a proxy for molecular interaction. However, this analytical approach is fraught with potential problems. One rather surprising fact is that none of the so-called colocalization indices actually measures “colocalization” per se in the strictest biological sense (Ramírez et al. 2010). This is especially the case when the biological question involves interaction at the molecular level, as colocalization is a technique for measuring relative proximities within the limits of the spatial resolution of the component images. In short, “colocalization” is a misnomer, and use of the term should be discouraged and phased out. In general, colocalization analysis methods tackle the problem from one common angle – they compare “image similarity” between the coupled variables, namely the signals from two monochrome channels. The outcome of these analyses is affected by the resolution of the optical instrument, the observer’s color perception, autofluorescence and other background noise in the images, and image processing strategies. The accuracy of image similarity analysis ultimately hinges on implementing the method best suited to the biological question at hand. We will begin by addressing these issues individually.

11.3 Resolution

One of the most important factors that immediately affects the measurement of image similarity is the resolution of the optical instrument. In fact, image similarity analysis can only be as precise as the resolution of the imaging instrument. This limitation, unfortunately, is often not well considered, leading biologists to erroneously equate any readout from these quantitative indices with “colocalization” or even molecular interaction. There is a fundamental mismatch between normal optical resolution (on the order of 300 nm or larger) and the truly meaningful associative distance between biomolecules as bounded by the Pauli exclusion principle (Pauli 1925), which is usually 10 nm or less. Yet, because of diffraction, even a single fluorescent molecule will appear as an “Airy disk,” as described by the point spread function, in a conventional optical image (Sheppard 2017). Figure 11.1 shows an idealized representation of this concept. While there is clearly signal overlap between the green and red channels, as shown by the yellow pixels, the actual positions of the two molecules, indicated by white + and × signs, respectively, are separated by >150 nm in both the lateral and axial planes – far beyond any meaningful molecular interaction distance. We therefore cannot claim molecular interaction based on pixel overlap, and the appearance of a yellow pixel (indicative of overlapping red and green pixels) should never be used as a quantitative measure. The only commonly used optical techniques that directly report molecular-scale proximity are Förster resonance energy transfer (FRET) (Chew and Chisholm 2005), which has an effective range of <10 nm, and fluorescence cross-correlation spectroscopy. Yet these techniques have their own limitations, and FRET calculations are themselves frequent victims of poorly performed channel bleed-through correction.
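For orientation, the ~300 nm figure follows from the Rayleigh criterion. Taking, as an assumed example, green emission (λ ≈ 520 nm) imaged through a 1.4-NA oil-immersion objective (n ≈ 1.515):

$$ {d}_{xy}=\frac{0.61\,\lambda }{\mathrm{NA}}=\frac{0.61\times 520\ \mathrm{nm}}{1.4}\approx 230\ \mathrm{nm},\qquad {d}_z\approx \frac{2\,\lambda\, n}{{\mathrm{NA}}^2}=\frac{2\times 520\ \mathrm{nm}\times 1.515}{1.4^2}\approx 800\ \mathrm{nm} $$

Both values exceed the ≤10 nm scale of direct molecular contact by one to two orders of magnitude.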

Fig. 11.1

A single molecule is imaged in the green channel, while another single molecule is imaged in the red channel. Each molecule is imaged in three dimensions, resulting in both a lateral plane image and an axial plane image. Their merged display is shown in the right column

Likewise, the development of localization-based super-resolution fluorescence microscopy capable of resolving molecular separations in the range of 10–20 nm (reviewed by Schermelleh et al. 2010) has highlighted one of the most glaring limitations of image similarity studies. At low magnification, objects as large as single cells can sometimes overlap, yet in PALM/STORM images with resolution of <20 nm, even single molecules rarely show any real spatial co-occurrence (i.e., the absence of “yellow” pixels when a super-resolved red image is digitally merged with a super-resolved green image). This observation challenges the notion of using colorimetric analysis – in which yellow is often rudimentarily interpreted as overlap when the corresponding pixel pair contains signals from both the green and red monochrome channels – as a reliable proxy for molecular interaction.

11.4 Color Perception and Colorimetric Display

Arguably the most common colocalization analysis is the visual perception of a secondary color, such as when the simultaneous presence of green and red in a pixel makes it appear yellow. However, color perception is nonlocal: it is influenced by the colors of surrounding regions in the image as well as by the brightness and color of the lighting in the room. To illustrate these issues, Fig. 11.2a–c shows three pseudo-colored variations of the same three-channel biological image. In each variation, the monochrome intensity values of each channel are identical; only the pseudo-color assigned to each channel has been changed. If human perception of color were local and accurate, each image would look about the same, apart from the differences in color. In fact, each variation appears to be almost an entirely different image. Objects easily visible in one variation are virtually invisible in others and vice versa, depending on the colors and combinations of colors present. More disturbingly, objects that appear “colocalized” in one image seem totally non-correlated in another. Thus, it is impossible to accurately judge if and how the objects in the three channels overlap. For these reasons, colorimetric methods should generally be avoided for all but the most qualitative analyses.

Fig. 11.2

Human visual perception can be misleading. The optic lobes from third-instar Drosophila melanogaster larvae were triple-stained to show the intricate spatial relationship of photoreceptor axons and glia. In panels a–c, the same combination of three monochrome images is displayed without manipulation of pixel intensities; however, the look-up table (LUT) assignments for the three channels were scrambled. Image courtesy of Dr. Vikki Weake, Purdue University

Among the many underlying factors contributing to the display discrepancy in Fig. 11.2 is the fact that most of the color schemes conventionally used to display merged images from two monochrome channels do not have uniform luminosity. As a result, even though the pixel values of a merged image are faithfully maintained (and can easily be extracted in most image processing software simply by pointing the cursor at any pixel), ratiometrically accurate pseudo-color and grayscale intensity values are not necessarily perceptually equivalent for human vision (Taylor et al. 2017) (Fig. 11.3).

Fig. 11.3

Comparing the RGB and PUP color spaces. (a) RGB spectrum and the corresponding luminosity (perceived brightness) of the color scheme. The profile plot highlights the irregularity of the luminosity in the RGB color space. (b) Perceptually uniform hues used in the PUP display. Note the uniform luminosity level throughout the color space

In fact, in a standard 24-bit RGB (red, green, and blue) merged image, the most intense green (RGB value 0, 255, 0) is perceived as approximately twice as bright as the most intense red (RGB value 255, 0, 0). While human visual perception should never be trusted to perform image quantification, the decision to explore certain biological features or molecular relationships is still frequently based on perceptual impressions; quite often, the decision to pursue further quantitative image analysis is driven by a visually perceived outcome. We therefore argue that, underappreciated as it is, a perceptually accurate display should be treated as an important component of the investigative process in biological science. As a first exploratory step, it is therefore important to turn to a color scheme in which the luminosity values of the hues in the spectrum are equalized: the PUP (Perceptually Uniform Projection) display method (Taylor et al. 2017). This color scheme is available as a download from https://tinyurl.com/yc9daskb.
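The red–green brightness disparity can be checked numerically. The short sketch below is our own illustration (not part of the PUP package) and, for simplicity, treats RGB channel values as linear intensities; it computes the approximate CIE L* lightness of the pure red and green primaries from their Rec. 709 relative luminance:

```python
def lightness(r, g, b):
    """Approximate CIE L* lightness (0-100) of an RGB color whose
    channel values in [0, 1] are treated as linear intensities."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # Rec. 709 relative luminance
    return 116 * y ** (1 / 3) - 16            # CIE lightness (valid for Y > ~0.009)

print(lightness(1, 0, 0))  # pure red   -> ~53
print(lightness(0, 1, 0))  # pure green -> ~88, markedly brighter than red
```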

11.5 Optimization of Image Quality

Most image analysis methods are pixel-based and, as a result, blind to the biological structures readily apparent to the biologist looking at the same set of images. In other words, in the absence of appropriate object segmentation, image analysis software cannot differentiate a real biological object from its surroundings. These programs are therefore sensitive to any contaminating signals in the digital images, including noise, shading errors, saturated pixels, shifts in image registration, channel cross talk, and improper (or absent) object segmentation.

In addition to the guidance provided in Chap. 9 of this book, several excellent tutorials exist to aid readers in collecting high-quality digital fluorescence images (North 2006; Waters 2009). North describes the many optical considerations that must be optimized for high-quality microscopy images (North 2006), and Waters outlines the important acquisition parameters for obtaining quantitative data (Waters 2009). This chapter builds on these key points; the end user should ensure that any digital images undergoing further analysis (i) are optimized to have the highest signal-to-noise ratio (SNR) possible given the experimental constraints (Stelzer 1998) and (ii) fall well within the linear dynamic range of the microscope detector, with no image saturation (Nakamura 2005; Stelzer 1998).

There are several image corrections and manipulations that may be necessary before further analysis can be performed. First, it is important to subtract the image offset, as failing to do so will inflate the apparent signal level. Offset refers to the constant intensity value added to all pixels regardless of the detected signal; it can generally be provided by the camera manufacturer or measured directly. Furthermore, many microscopes do not evenly illuminate the entire field of view. In such cases, it is imperative to obtain a correction image of a highly homogeneous sample with which to account for such imperfections. This so-called shading correction (Leong et al. 2003) is especially important when performing ratiometric imaging (discussed later). It is also important to assess the amount of signal from each fluorophore that is detected in the opposite color channel. The presence of such channel “bleed-through” requires subtraction of a proportion of one image from the other; the exact proportions are measured using control samples containing each fluorophore individually, imaged under conditions identical to the experimental samples (Piston and Kremers 2007). Other image corrections are equally imperative. For instance, all images should be corrected for fluorophore photobleaching in time-course experiments (Vicente et al. 2007). Microscopes with multiple cameras, as well as the chromatic aberrations present in nearly any optical system, may require researchers to align one image with another. Multicolor sub-diffraction-sized fluorescent microspheres, such as TetraSpeck® beads (Life Technologies, T-7280), can be used for this purpose in combination with affine transformation or image correlation techniques, among many other methods (Zitová and Flusser 2003).
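As an illustration of the offset, shading, and bleed-through corrections just described, the following numpy sketch shows one plausible implementation; the function names and the bleed-through fraction are hypothetical, and the dark, flat-field, and single-fluorophore control measurements must come from your own instrument:

```python
import numpy as np

def flat_field_correct(raw, dark, flat):
    """Offset and shading correction: subtract the camera dark/offset
    image, then divide out the illumination profile measured from a
    homogeneous sample, rescaling to preserve the mean intensity."""
    flat_net = flat.astype(float) - dark
    return (raw.astype(float) - dark) / flat_net * flat_net.mean()

def bleed_through_correct(ch2, ch1, alpha):
    """Subtract the fraction alpha of channel-1 signal detected in
    channel 2; alpha is measured from single-fluorophore controls."""
    return np.clip(ch2.astype(float) - alpha * ch1, 0.0, None)
```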

Moreover, it is vital to subtract unwanted background signal from the images. This signal usually arises from fluorescence of endogenous cellular components (Andersson et al. 1998), although mounting media or even the glass coverslip can contribute. One of the most commonly used background removal techniques is the “rolling-ball” method (Dickinson et al. 2001; Sternberg 1983), which looks for minimum pixel intensity values within small neighborhoods throughout the image (Dickinson et al. 2001). More sophisticated methods based on spectral unmixing and apodization can also be employed (Haaland et al. 2009; Ojeda-Castaneda et al. 1988).
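Outside of ImageJ, a rolling-ball correction can be sketched with scikit-image, assuming a recent version that provides `restoration.rolling_ball`; the radius of 50 px is a hypothetical value and should exceed the size of the structures of interest:

```python
import numpy as np
from skimage import restoration

def subtract_rolling_ball(image, radius=50):
    """Estimate the slowly varying background with the rolling-ball
    algorithm and subtract it, clipping negative residuals to zero."""
    background = restoration.rolling_ball(image, radius=radius)
    return np.clip(image.astype(float) - background, 0.0, None)
```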

Finally, as will be discussed further, it may be necessary to determine an unbiased threshold intensity value below which the image signal is not considered. Numerous techniques have been proposed toward this end, utilizing image intensity distributions, image entropy, morphological features, and combinations thereof (Glasbey 1993; Kapur et al. 1985; Otsu 1975; Peters 1995). Readers should explore multiple methods to determine which algorithm produces the desired results for the samples and structures being imaged. To aid in performing the preprocessing steps outlined above, we have collected a list of software plug-ins available for the open-source ImageJ/FIJI image processing package, summarized in Table 11.1. Similar functionality is provided in many commercially available software packages; we encourage readers to consult the manufacturer for more information. In any case, readers should always describe in detail any manipulations performed on image data featured in publications, and these manipulations should always fall well within the guidelines set forth by the publisher or funding agency, such as those given by the NIH discussed in Chap. 12.
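As a minimal example of unbiased thresholding, Otsu's method (one of the algorithms cited above) is a one-liner with scikit-image; `threshold_li`, `threshold_yen`, and related functions can be swapped in to compare algorithms:

```python
from skimage import filters

def otsu_mask(image):
    """Binarize with Otsu's method, which selects the threshold that
    minimizes intra-class intensity variance; always overlay the
    resulting mask on the raw image to verify the segmentation."""
    return image > filters.threshold_otsu(image)
```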

Table 11.1 List of freely available FIJI/ImageJ plug-ins for image preprocessing, colocalization analysis, and intensity-modulated display of ratiometric images

11.6 Object-Based Overlap Analysis

When the objects of interest are significantly larger than the diffraction-limited spot size, object-based overlap analysis can be useful and is far more reliable than visual perception of secondary colors. The process of defining objects within an image is termed segmentation (Solomon and Breckon 2011), whereby a threshold is applied to create a binary image that distinguishes structures of interest from background signal. The binary images are often further morphologically filtered (selected based on size or shape) until only the objects of interest remain. As a quality control check, we strongly recommend overlaying the final binary images onto the original images to verify the accuracy of the segmentation procedure. Once the binary images from each channel are acceptable, a Boolean AND operation is used to create an image of objects representing the overlap between the two channels. Figure 11.4 gives a schematic representation of this general algorithm.

Fig. 11.4

Object-based overlap study. (a) Green- and/or red-labeled neurons in situ. Note that only one neuron is visually yellow. Image processing and segmentation operations are used to define the objects of interest on each channel (here, cell bodies). (b) The resulting binary images can be automatically counted to determine the number of green objects and the number of red objects. (c) Combining these binaries with an AND operation creates a new image containing objects that are both green and red, which again can be automatically counted

The number and/or size (or shape) of these overlapped objects can be measured automatically. If the objects in the input channels are expected to be entirely coincident, the number of overlapped objects will be the most useful measurement. Other definitions of object overlap, such as measuring the distance between the input objects’ centroids, have also been devised (Lachmanovich et al. 2003).
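A bare-bones sketch of this counting scheme uses scipy's connected-component labeling; the fixed `min_size` filter is a hypothetical stand-in for the morphological filtering described above:

```python
import numpy as np
from scipy import ndimage as ndi

def count_overlapping_objects(mask1, mask2, min_size=20):
    """Count objects in each binary mask and in their Boolean AND,
    keeping only connected components of at least min_size pixels.
    Returns [n_channel1, n_channel2, n_overlap]."""
    counts = []
    for mask in (mask1, mask2, mask1 & mask2):
        labels, _n = ndi.label(mask)             # connected components
        sizes = np.bincount(labels.ravel())[1:]  # object areas (skip background)
        counts.append(int((sizes >= min_size).sum()))
    return counts
```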

Additional steps should be taken, however, to determine whether the amount of overlap observed is greater than that expected by chance alone. One way to tackle this problem is to leverage the power of repeated random sampling in Monte Carlo simulations to estimate the distribution of chance outcomes. Costes et al. devised what is essentially a block-scrambling technique (Costes et al. 2004): blocks with dimensions of the full width at half maximum (FWHM) of the optical point spread function are randomly shifted and the overlap recalculated many times, yielding a distribution of chance outcomes. The Costes method will be discussed in greater detail in Sect. 11.8.3, when we delve further into image similarity analysis. However, one important shortcoming of the Costes method for object-based overlap analysis is its assumption that the pixels composing the objects in each channel are independently distributed in space, when in fact their positions are often highly dependent because they are grouped into a small number of larger objects. The Costes method thus tends to over-randomize the fluorescence pattern. Rather than block-scrambling, a better strategy for object-based overlap analysis is to randomize the locations of the input objects with the confined displacement algorithm (Ramírez et al. 2010). To implement this method properly, the random locations should be restricted to image areas that are physically plausible; e.g., the objects should always fall within cell boundaries if subcellular objects are being measured. This method can be complex to implement in practice and requires programming knowledge; however, a FIJI/ImageJ plug-in that performs this analysis is available, as summarized in Table 11.1.
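To convey the flavor of such a randomization test, the sketch below is a deliberately simplified stand-in for the confined displacement algorithm: it randomly translates one binary mask with wrap-around rather than confining objects within a cell mask, and it returns the fraction of random placements whose overlap meets or exceeds the observed one:

```python
import numpy as np

def overlap_p_value(mask1, mask2, n_iter=1000, seed=0):
    """Monte Carlo estimate of how often a random translation of mask2
    overlaps mask1 at least as much as the observed configuration."""
    rng = np.random.default_rng(seed)
    observed = (mask1 & mask2).sum()
    null = np.empty(n_iter)
    for k in range(n_iter):
        dy = rng.integers(0, mask2.shape[0])
        dx = rng.integers(0, mask2.shape[1])
        shifted = np.roll(mask2, (dy, dx), axis=(0, 1))  # wrap-around shift
        null[k] = (mask1 & shifted).sum()
    return float((null >= observed).mean())
```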

11.7 Scatterplot Analysis

Intensity correlation analysis offers an added dimension over color-based or object-based colocalization analysis. By plotting the intensity distribution of corresponding color 1 and color 2 pixels in a scatterplot, the degree of colocalization can be qualitatively assessed (Fig. 11.5a–d). In the case of a high degree of colocalization (Fig. 11.5a), an increase in the color 1 pixel intensity is accompanied by a proportional increase in the color 2 pixel intensity. The opposite situation is shown in Fig. 11.5b, whereby regions of high color 1 signal are accompanied by little or no color 2 signal and vice versa. A third case is illustrated in Fig. 11.5c, displaying no intersection between color 1 and color 2; a practical example of such behavior can be seen when labeling two molecules that occupy separate cellular compartments. Lastly, Fig. 11.5d shows two signals with zero correlation, indicating that the relationship between the two signals is random, with no discernible pattern.

Fig. 11.5

Scatterplots for assessing theoretical pixel correlation. These four scatterplots represent four theoretical correlative relationships between signals from two channels. (a) Linear correlation. An increase in the image 1 signal intensity is accompanied by a proportional increase in the image 2 signal intensity at each pixel. (b) An opposite situation is illustrated. In this case, high image 1 intensity is accompanied by low image 2 intensity and vice versa. This indicates that signals in each image tend toward mutual exclusion, often termed molecular repulsion. (c) Zero intersection. The two signals do not interact. (d) No correlation between pixels in image 1 and image 2. In this case, no clear relationship between the molecules of interest can be surmised

Practically speaking, a scatterplot will nearly always display at least two, and sometimes all three, of these cases due to factors such as non-specific labeling and image noise. Subtle changes can thus be hard to observe qualitatively. Therefore, numerical coefficients have been proposed to better quantify changes in colocalization.
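Generating such a scatterplot takes only a few lines; the sketch below (our illustration, assuming two background-corrected channels of equal shape) uses a 2D histogram rather than a raw scatter because the millions of pixel pairs in a typical image overplot heavily:

```python
import matplotlib.pyplot as plt

def intensity_scatterplot(ch1, ch2):
    """Density scatterplot of paired pixel intensities from two channels."""
    plt.hist2d(ch1.ravel(), ch2.ravel(), bins=128, cmap="viridis")
    plt.xlabel("Channel 1 intensity")
    plt.ylabel("Channel 2 intensity")
    plt.show()
```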

11.8 Image Similarity Coefficients

An image similarity coefficient describes, in numerical terms, the degree of overlap or correlation between two image channels. Two types of indices are commonly used for this purpose: the first measures the degree of synchrony (or correlation); the second quantifies the extent of contribution (which measures co-occurrence). These phenomena should not be confused with each other; indeed, each can occur in the absence of the other. Our discussion of image similarity analysis hereafter assumes that the two images to be analyzed have been properly background-corrected and that an appropriate intensity threshold has been applied.

11.8.1 Pearson’s Correlation Coefficient

Pearson’s correlation coefficient (PCC) evaluates image similarity by measuring the intensity correlation between two channels (Pearson 1896). It asks: when a pixel in channel 1 deviates from the mean intensity value, how likely is the corresponding pixel intensity in channel 2 to deviate in the same manner? PCC can be expressed as:

$$ \mathrm{PCC}=\frac{\sum \limits_i\left(C{2}_i-\overline{C2}\right)\left(C{1}_i-\overline{C1}\right)}{\sqrt{\sum \limits_i{\left(C{2}_i-\overline{C2}\right)}^2\,\sum \limits_i{\left(C{1}_i-\overline{C1}\right)}^2}} $$

where C2i and C1i refer to the intensity of pixel i in the “color 2” and “color 1” images, respectively, while \( \overline{C2} \) and \( \overline{C1} \) denote the mean pixel intensity of each channel over the entire image. Note that intensities enter the calculation as deviations from their mean values, and the coefficient ranges from −1 to 1. A coefficient of 1 indicates complete synchrony, while a value of −1 indicates complete anticorrelation.
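A direct numpy transcription of this formula might look like the following sketch; the optional mask anticipates the thresholding issues discussed below, and the function deliberately returns nan when either channel lacks intensity variation:

```python
import numpy as np

def pearson_cc(c1, c2, mask=None):
    """Pearson's correlation coefficient between two channels,
    optionally restricted to a boolean mask (e.g., above-threshold
    or intersecting pixels)."""
    if mask is not None:
        c1, c2 = c1[mask], c2[mask]
    d1 = c1.astype(float) - c1.mean()
    d2 = c2.astype(float) - c2.mean()
    denom = np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
    return np.nan if denom == 0 else float((d1 * d2).sum() / denom)
```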

It is important to note that Pearson’s calculation applies only to the pixels in which the two channels intersect, as shown in Fig. 11.6; it does not consider pixels that appear in only one of the channels. This has a significant consequence. As can be seen in Fig. 11.6, Pearson’s coefficient is insensitive to the percentage area of intersection; it concerns itself only with how well the pixel intensities of the two channels “correlate” within the area of intersection. While it is immensely powerful for determining how well the intensity signals of the two channels are correlated (an important proxy for “interaction” or “association”), it does not measure the area of overlap. It is therefore important to use PCC carefully and appropriately.

Fig. 11.6

Pearson’s coefficient and object intersection. Consider two objects (red and green), each with an area of 100 square pixels, with a quarter of each object intersecting the other. Pearson’s coefficient would be applicable only in the area in which the two segmented objects intersect

The power of Pearson’s coefficient thus lies in the pixel-by-pixel covariance between the two channels (Adler and Parmryd 2010). Practically speaking, Pearson’s coefficient can be particularly sensitive to changes in colocalization patterns when one or both images contain relatively sparse signals across the field of view. But this sensitivity can create alarming situations for unsuspecting biologists. Put simply, if the intensity in either image channel does not vary greatly (such as when labeling a large and homogeneous biological structure), the PCC will likely return a nonintuitive result. Consider the situation presented in Fig. 11.7, a good (albeit extreme) example of how the PCC can return an unexpected result. The diagram shows a red object overlapping with a green object, with an area of intersection of 50% for each object. The most important feature of this hypothetical situation is that the two objects are saturated in intensity; thus both objects show no intensity variation. Even though by all biological definitions these two objects are “colocalized,” PCC will fail mathematically to return a coefficient: since there is no variation in intensity, the calculation fails due to division by zero. This extreme, hypothetical situation serves not only as a cautionary example of why image similarity analysis should not be performed on images with saturated pixels but also as a good example of strong co-occurrence in the absence of correlation. It is important to remember that PCC relies on each image containing a wide range of pixel values over which to correlate. Small variances in signal (or their complete absence) can produce problematic PCC values, even when there is strong signal overlap. Likewise, the inclusion of background pixels will artificially inflate PCC, because the background pixels of the two images can be highly correlated with each other, not to mention that they can deviate significantly from the mean intensity values. Excluding the image background by applying a threshold will generally return a more intuitive PCC value.

Fig. 11.7

Effects of intensity variability on Pearson’s correlation coefficient. Two identically sized objects (red and green) are shown here with 50% area overlap. These two objects have homogeneous pixel intensity with no variability, thus contributing a zero value to the denominator of Pearson’s correlation coefficient and rendering the calculation impossible. This is an example where co-occurrence of signals does not translate into correlation

These situations, while extreme, show how a powerful analytical tool can be wrongly interpreted. In short, the PCC is most valuable to biologists when the images vary widely in pixel intensity and the background can be excluded by applying a threshold. If the main aim of the analysis is to quantify overlapping area, then PCC is not the right algorithm. However, if the goal of the experiment is to quantify how the two signals correlate with one another in the area in which they intersect, then PCC is the right tool. Another drawback is that the PCC provides no channel-specific information. For that, we turn to Manders’ coefficients.

11.8.2 Manders’ Overlap Coefficients (MOCs)

It is common to encounter situations wherein most of the pixels from one channel contribute to colocalization while those of the other channel do not. For example, almost all the signal from labeled transcription factor molecules will colocalize with a DAPI-stained nucleus, but not vice versa. A more pertinent measurement in this case may be to quantify the contribution of both fluorescence intensity and area from each channel toward the overlapping region. In such situations, one may prefer Manders’ overlap coefficients (MOCs) (Manders et al. 1993). The MOC examines the ratio of intensity-weighted intersecting volume to total object volume; in other words, what percentage of color 1 (and color 2) pixels and cumulative intensity contributes to the overlap? The MOC is defined as follows:

$$ \mathrm{MOC}=\frac{\sum \limits_iC{2}_i\cdot C{1}_i}{\sqrt{\sum \limits_iC{2}_i^2\,\sum \limits_iC{1}_i^2}} $$

where C2i and C1i are defined as previously described. Notice that pixel intensity values now enter the calculation in absolute terms, not as deviations from the mean as in the PCC. Thus, pixels with zero intensity are intrinsically omitted, eliminating the possibility of negative values. Manders also proposed individual channel coefficients to determine the overlap of each channel with the other:

$$ {M}_1=\frac{\sum \limits_iC{1}_{i,\mathrm{coloc}}}{\sum \limits_iC{1}_i} $$
$$ {M}_2=\frac{\sum \limits_iC{2}_{i,\mathrm{coloc}}}{\sum \limits_iC{2}_i} $$

where

$$ C{1}_{i,\mathrm{coloc}}=\begin{cases}C{1}_i & \text{if } C{2}_i>0\\ 0 & \text{if } C{2}_i=0\end{cases} $$

and

$$ C{2}_{i,\mathrm{coloc}}=\begin{cases}C{2}_i & \text{if } C{1}_i>0\\ 0 & \text{if } C{1}_i=0\end{cases} $$
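Translated into numpy, these definitions might read as follows; the sketch assumes background-corrected images in which sub-threshold pixels have already been set to zero, per the convention of the formulas above:

```python
import numpy as np

def manders_overlap(c1, c2):
    """Intensity-weighted Manders' overlap coefficient (MOC)."""
    c1, c2 = c1.astype(float), c2.astype(float)
    return float((c2 * c1).sum() / np.sqrt((c2 ** 2).sum() * (c1 ** 2).sum()))

def manders_split(c1, c2):
    """Split coefficients M1 and M2: the fraction of each channel's
    total intensity found where the other channel is nonzero."""
    c1, c2 = c1.astype(float), c2.astype(float)
    m1 = float(c1[c2 > 0].sum() / c1.sum())
    m2 = float(c2[c1 > 0].sum() / c2.sum())
    return m1, m2
```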

Unlike PCC, which focuses on the signal correlation within the image intersection (Fig. 11.6), MOC is calculated over the union of the two channels, as shown in Fig. 11.8.

Fig. 11.8

Manders’ coefficient and object union. Consider two objects (red and green), each with an area of 100 square pixels, with a quarter of each object intersecting the other. Manders’ coefficient is implemented over the total area covered by both signals, which is referred to as the “union” of the two signals

Consequently, this creates a significant mathematical difference between MOC and PCC. First, MOC is sensitive to the number of pixels that are above threshold; in other words, the area covered by above-threshold pixels affects MOC but not PCC. MOC can therefore report the percentage of total intensity contributed by each channel to the overlap. This is something PCC cannot deliver. Consider the situation in Fig. 11.9. Figure 11.9b, c contain extra green objects in the merged image that are not found in Fig. 11.9a. The extent of colocalization has decreased with the addition of these extraneous, nonoverlapping objects. Yet, since the intersection has not changed, PCC remains identical throughout Fig. 11.9a–c. However, since the additional objects change the “union” of the two signals, the MOC value drops accordingly.

Fig. 11.9

Effects of adding “non-colocalizing” signals on PCC and MOC . Consider two objects (channel 1, red, and channel 2, green), each with 10 × 10 pixels. In panel a, 50% of these pixels overlap. This gives rise to an overall and channel-specific MOC of 0.500 and PCC of 0.445. In panels b and c, extra nonoverlapping green objects are added. While the degree of “colocalization” should decrease in these latter cases, only MOC reports the decrease in signal overlap accordingly. PCC, which only reports signal correlation within the area of intersection, is not affected by these extraneous objects. This is because the area of intersection has not changed in these scenarios

In addition, since MOC is intensity-weighted, it is more than merely a calculation of area overlap: the brighter the pixels in the overlap, the higher the MOC score. Likewise, it has an intrinsic propensity to diminish the contribution of dim background noise, making MOC somewhat refractory to SNR fluctuations. Yet this feature is a double-edged sword. Pixels containing very high intensity values (such as non-specific antibody binding, shading artifacts, or out-of-focus light) will inflate the MOC readout. A word of caution: none of these features of MOC precludes the need for background subtraction. It is important to remember that pixel intensity is still affected by unwanted signals.

Therefore, the fundamental difference between the MOC and the PCC is how each image pixel contributes to the overall coefficient value. The MOC is based on the absolute magnitude of fluorescence intensity, while the PCC is based on deviation from the mean intensity. Thus, as the intensity of a given pixel decreases, its contribution to the total Manders’ coefficient is likewise reduced. By the same token, if the background/offset in either image is significant, it will severely skew the resulting MOC toward a higher value. In addition, an abundance of high-intensity co-occurring pairs can produce a Manders’ coefficient that is insensitive to other, low-intensity pairs, whether the latter are colocalized or not.

To further understand the relationship between the PCC and MOC, we will examine how they assign weight to each pixel intensity pair. Figure 11.10a, b shows images of α-actinin and actin, respectively, in a cultured murine embryonic fibroblast. There are areas where the two proteins show strong colocalization and other locations where α-actinin decorates focal adhesion complexes devoid of actin (Fig. 11.10c). An example of low colocalization is highlighted by yellow arrows in panels b and c. Figure 11.10d shows the relative contribution of each pixel pair to the total MOC. Areas with poor overlap (yellow arrows) are assigned minimal weight. Similarly, the coverslip area makes no contribution to Manders’ algorithm due to its low intensity. Most visible cytoskeletal structures, on the other hand, receive equal weight except for a few high-intensity spots where color 1 and color 2 pixels co-occur.

Fig. 11.10

How PCC and MOC weigh a biological image. Panels a and b show α-actinin and actin immunostaining, respectively, in a murine embryonic fibroblast. (c) Overlapped pixels are highlighted in white. Yellow arrows indicate areas where the two proteins show no overlap. (d) Relative contribution of each pixel pair to the total MOC. (e) Relative contribution of each pixel pair to PCC prior to threshold application. (f) The same as e but with a threshold applied. White arrows show how pixels with negative correlation respond to the intensity threshold. Intensity scale bars on the right of d–f show the relative weights assigned to the pixels in each calculated scenario

Figure 11.10e shows areas of both positive (red) and negative (black) correlation as calculated by PCC – offering information that is unavailable using the MOC. The white arrowhead points to an example area of negative correlation. Interestingly, however, the PCC assigns a non-negligible weight to the coverslip area. This is because, while dim, these pixel intensities deviate from the mean image intensity and thus increase the PCC through their mutual correlation. To avoid this, a threshold should be applied to the images (Fig. 11.10f). Refer to Sect. 11.5 above for guidelines on selecting an appropriate threshold value. Note that, regardless of the method employed, it is important to apply the same methodology to all images being compared within a set of experiments.

A question arises, however, as to which pixel intensity pairs contribute most to either the PCC or MOC. To elaborate this important distinction, scoring matrices for each coefficient are plotted in Fig. 11.11. This illustration, adapted from Adler and Parmryd (2010), indicates the relative “importance” of any given pixel intensity combination within an 8-bit range. These plots show that pixel pairs with nearly identical intensities receive the highest relative weighting from both algorithms. However, as the color 2 and color 1 pixel intensities diverge from each other, so do the behaviors of the two coefficients. Figure 11.11b shows that the importance assigned by the MOC decreases proportionately as one channel’s intensity changes relative to the other. The weightings of the Pearson’s coefficient (Fig. 11.11c), on the other hand, follow a more complex pattern determined by two factors: (i) the intensity difference between pixel pairs and (ii) the deviation from each channel’s mean intensity.

Fig. 11.11

Correlation of individual pixel pairs with the scoring schemes of MOC and PCC. (a) Scatterplot derived from the two channels used in Fig. 11.10. The pink area shows pixel pairs that do not fall into the linear correlation. (b) The underlying “heat map” indicates how MOC assigns its scores. High-intensity areas (white and yellow) receive high MOC scores, while the darker areas where the two channels show decreasing overlap receive low MOC scores. The scatterplot in panel a is transposed onto panel b, showing that the pixel pairs with linear correlation receive comparable MOC scores. (c) The underlying heat map shows how PCC assigns its scores. The scatterplot from panel a is likewise transposed onto panel c. Note that in this example, pixels with linear correlation receive varying scores from PCC; the blue arrow shows pixels receiving high PCC scores, while the white arrow shows pixels within the same linear correlation receiving low PCC scores

Figure 11.11a shows a scatterplot derived from the images in Fig. 11.10a, b. It indicates that a portion of the α-actinin signal (color 2 channel) correlates well with actin (color 1 channel). However, a portion of the α-actinin does not correlate with actin, as highlighted by the pink bounding box. Now let us superimpose this scatterplot onto the scoring matrices of PCC and MOC (Fig. 11.11b, c). It is clear that the pixels outside of the bounding box receive similar importance from the Manders algorithm (Fig. 11.11b), regardless of their absolute intensities. This linear relationship is intuitive: as long as the molar ratio of actin and α-actinin remains consistent, an equal importance is assigned toward the final MOC.

However, if one traces the linearly correlated pixels within the scatterplot from low-intensity pairs to high-intensity pairs, one notices that PCC assigns strong significance to the very dim and the very bright pixel pairs (blue arrow) but low significance to the pixels near the mean intensity value (white arrow), as shown in Fig. 11.11c.

11.8.3 Setting Appropriate and Unbiased Intensity Threshold Level

Pearson’s coefficient is therefore highly sensitive to a pixel pair’s deviation from the respective mean intensity values and also to the difference in pixel intensity between the two channels. To minimize artifacts from this effect, ensure that (i) the dynamic range of each image is well filled and (ii) offset/background subtraction and thresholds are applied. The dependence of PCC on the SNR of the images highlights one of its weaknesses: a decreased SNR decreases the predictability of the relationship between the intensities of the two images, making it difficult to set a proper intensity threshold without introducing observer bias.

To provide a quantitative and unbiased method for setting the intensity threshold for correlative analysis, Costes et al. (2004) devised a progressive method that calculates PCC scores across a range of threshold values. In this approach, thresholds for the two images are first set near the maximum pixel value of each. The PCC is computed for pixels both above and below the threshold values, and the process is reiterated with incrementally lower thresholds that fall along a linear regression of the scatterplot. This process is repeated until the PCC for the subthreshold pixels approaches zero; this is taken as the Costes threshold value for segmenting signal from background. It is important to note that this method may not work optimally when the signal of interest is not well correlated relative to the background, as the Costes method is then unable to identify a clear transition in the PCC values between the two.
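A stripped-down sketch of this search is given below; it uses a single regression over all pixels and a fixed number of steps, whereas production implementations (e.g., the Coloc 2 plug-in in FIJI) add safeguards that this toy version omits:

```python
import numpy as np

def costes_thresholds(c1, c2, steps=200):
    """Walk the channel-1 threshold down along the c2-on-c1 regression
    line until the PCC of the sub-threshold pixels falls to zero or
    below; returns the paired thresholds (t1, t2)."""
    c1, c2 = c1.astype(float).ravel(), c2.astype(float).ravel()
    a, b = np.polyfit(c1, c2, 1)                      # c2 ~ a*c1 + b
    for t1 in np.linspace(c1.max(), c1.min(), steps):
        below = (c1 < t1) & (c2 < a * t1 + b)
        if below.sum() < 2:
            continue
        r = np.corrcoef(c1[below], c2[below])[0, 1]
        if r <= 0:
            return t1, a * t1 + b
    return c1.min(), a * c1.min() + b                 # fallback: no zero crossing
```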

11.8.4 Expanding Correlation Analysis with Spearman’s Rank Correlation Coefficient

PCC makes a frequently underappreciated assumption: it expects the signals in the intersection to exhibit a linear correlation, and it assigns its highest-magnitude score (+1 or −1) only when the pixel-intensity relationship is linear. As a result, when two signals are clearly correlated but with varying proportionality, PCC tends to underestimate the degree of correlation. This shortcoming is addressed by Spearman’s rank correlation coefficient (SRCC) (Adler et al. 2008). In essence, the SRCC computation is equivalent to PCC, except that it is applied to pixel intensity ranks, whereas PCC is applied to the intensities themselves (Spearman 1904). SRCC converts pixel values to ranks by giving the lowest above-threshold pixel intensity in the image a rank of 1, the next lowest a rank of 2, and so on until every intensity value in the image has received a rank. Where multiple pixels share the same intensity, that intensity value is assigned the average rank; for instance, if two pixels are tied for the fifth and sixth lowest values, both are ranked 5.5. This ranking approach linearizes the scatterplot, making the PCC machinery applicable to nonlinear correlation.
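In practice there is no need to implement the ranking by hand; scipy provides it directly (a minimal sketch, assuming background-corrected inputs and an optional boolean mask):

```python
from scipy import stats

def spearman_cc(c1, c2, mask=None):
    """Spearman's rank correlation: PCC applied to intensity ranks,
    with ties assigned averaged ranks as described above."""
    if mask is not None:
        c1, c2 = c1[mask], c2[mask]
    rho, _pvalue = stats.spearmanr(c1.ravel(), c2.ravel())
    return rho
```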

Examples of linearly and nonlinearly correlated image pairs are shown in Fig. 11.12. In Fig. 11.12a, two nearly identical images are displayed in columns 1 and 2; their linear correlation is reflected in the intensity scatterplot in column 3, and the ranked scatterplot used by SRCC is shown in column 4. In this example, SRCC and PCC give almost identical results: the original intensity values are linearly correlated, so the ranking makes negligible impact. However, when two signals are well correlated but in a nonlinear fashion, as in Fig. 11.12b, the benefit of SRCC becomes apparent. In Fig. 11.12b, the channel 2 image (red) has been altered from that in panel a, producing a very well-correlated, albeit nonlinear, relationship between the two images, as indicated by the scatterplot in column 3. As a result of the nonlinearity, the PCC value is lower than expected, even though the two signals show near-perfect correlation. By linearizing the relationship through ranking of the pixel intensities, the SRCC recovers the near-perfect correlation. Because SRCC does not rely on the assumption of linearity, it is more practically versatile for assessing biological signals, whose associations rarely exhibit clean, straightforward linearity. SRCC should therefore be preferred over PCC for all practical purposes.

Fig. 11.12

Pearson’s and Spearman’s rank correlation coefficients. (a) Two nearly identical images (except for random noise introduced into the images) are shown in columns 1 and 2. The images in panel a exhibit a linear correlation that produces comparable results for PCC and SRCC. (b) Another pair of nearly identical images is shown in columns 1 and 2. The green image is identical to the one in panel a, while the red image has been slightly altered compared to its counterpart in panel a to generate a highly correlated but nonlinear relationship with the green image. This relationship is displayed in column 3 of panel b. Due to the nonlinear correlation, the PCC score decreases to 0.875, demonstrating the tendency of PCC to underestimate perfectly correlated but nonlinear signals. However, SRCC (column 4) converts the pixel intensity values of the two images into ranks, essentially “linearizing” the correlation and thereby reflecting the near-perfect correlation more accurately

11.9 Global Factors Affecting Molecular Clustering

The study of image similarity, however, is more complex than obtaining a readout from colocalization coefficients, especially when these indices are taken to represent the spatial relationship of biomolecules. Even with appropriate implementation of the optimal quantitative index, these measurements – which are statistical, probabilistic analyses – do not consider other biological variables that impact the outcome. It is not uncommon for a global factor to vastly affect the pattern of molecular clustering. Such global factors can result from (but are not limited to) drastic cell shape changes, confinement of molecules to structural constraints during transport, cellular polarization, macromolecular realignment, and rapid variations in signal intensity due to expression levels. These factors, while biologically significant, may skew apparent image similarity measurements such as colocalization and ratiometry, leading to misinterpretation of the intrinsic, local molecular interaction.

Unfortunately, these “global biases” are rarely decoupled from the local molecular relationship before data interpretation (Adler and Parmryd 2010; Bolte and Cordelieres 2006; Costes et al. 2004; Dunn et al. 2011; Tambe et al. 2011; Yannis et al. 2015). These confounding global effects often skew, inflate, or conceal the underlying molecular interaction at the regional level. For example, a metastatic cancer cell squeezing through tight spaces between the extracellular matrix and the endothelium during invasion tends to “indiscriminately” crowd many biomolecules within its narrowest part as it actively changes shape to overcome the size constraint of the obstacle. This process inevitably increases the “colocalization” readout due to molecular crowding, regardless of which image similarity coefficient (PCC, MOC, SRCC, etc.) is used. The increase may not result from underlying local molecular forces but merely from the drastic change in cell shape. Some mathematical, heuristic approach must therefore be devised to uncouple this global bias. Several approaches to this problem exist (Helmuth et al. 2010; Lagache et al. 2015); one is to normalize the readout to simulated data and identify the global bias as a confounding factor for subsequent elimination from the calculation (Vander Weele and Shpitser 2013).

Yet eliminating the global effect outright would also mean throwing away equally essential biological data: these global biases are frequently real biological effects (e.g., cell shape changes) that define the very molecular events being interrogated. A more desirable method would simultaneously decouple the global bias from the local interaction while scoring both factors. To this end, Zaritsky et al. (2017) have proposed an elegant algorithm, DeBias. Its underlying assumption is that the apparent spatial relationship between two variables is the summed effect of a global bias and a local interaction component. Briefly, to decouple the two factors, the algorithm randomizes the two variables that carry orientation information (Drew et al. 2015; Nieuwenhuizen et al. 2015) and then resamples the distribution of interactions (“resampled”). In a scenario with neither global bias nor local interaction, the randomized alignment would be uniformly distributed (“uniform”). The power of the randomization step is that it decouples the effect of any local interactions from the global bias, allowing the latter to be extracted easily: the global bias is defined as the difference between the “uniform” and the randomized (“resampled”) distributions, while the local interaction is represented by the difference between the dissimilarity of the observed and “uniform” distributions and the dissimilarity of the “resampled” and “uniform” distributions. This freely available (https://debias.biohpc.swmed.edu) mathematical tool should be in the repertoire of anyone interested in analyzing image similarity. It is also important to note that, in addition to biological factors, global bias may be introduced by non-biological factors, including spatially correlated noise and/or detector offset.

11.10 Take-Home Message

The seemingly simple concept of measuring image similarity is, in practice, complex. As discussed, no colocalization coefficient is perfect, as none truly measures the misnamed quantity “colocalization” per se. While various arguments exist in favor of one coefficient over another, such discussions may fail to consider practical experimental issues: unequal antibody affinities, naturally imbalanced protein stoichiometry, and the association of low-abundance proteins with large, bright biological structures. Overall, it is important to remember that image similarity analysis never directly measures molecular interaction; it provides a scoring system for evaluating the relationship between different molecular images of the same sample. Image similarity studies are only meaningful when these coefficients reproducibly show changes that can be related to experimental intervention or compared to good controls.

Another important point discussed in this chapter is how the resolution of the microscope impacts the analysis of image similarity. While recent advances may offer biological details unattainable before the advent of super-resolution microscopy (discussed in Chap. 8), they may also create confusion for end users. Commercial instruments that enhance resolution approximately 1.5- to 2-fold – a category that includes structured illumination microscopy (SIM), image scanning microscopy, and the closely related pixel reassignment techniques (Sheppard et al. 2013; York et al. 2013) – would indeed improve the accuracy of image similarity analyses such as SRCC and MOC. Beyond that, however, localization-based super-resolution techniques (Betzig et al. 2006; Rust et al. 2006) and, to a lesser degree, stimulated emission depletion (STED) microscopy (Hell and Wichmann 1994; Klar et al. 2000) offer resolution near the molecular level and call into question the entire exercise of analyzing colocalization at the scale governed by the Pauli exclusion principle. Indeed, super-resolution microscopy raises the question of whether conventional image similarity analyses can be applied to images with sub-diffraction resolution at all. An in-depth discussion of the new techniques for characterizing and quantifying signal relationships at this scale is beyond the scope of this chapter; we point readers to several original papers and reviews that deal specifically with leveraging spatial statistics to quantitatively assess molecular interactions (Coltharp et al. 2014; Lagache et al. 2015; Nicovich et al. 2017).

In summary, the analysis of image similarity hinges on five important decisions on the part of the user: (i) appropriate image acquisition and processing parameters, (ii) correct object segmentation, (iii) informed use of the optimal coefficient for image similarity analyses, (iv) confidence in the statistical significance in the readout of the coefficient, and (v) decoupling of global bias from local interaction. Only when all five decisions are made wisely is one able to deduce meaningful biological information from the analysis of image similarity.