1 Introduction

Matrix-assisted laser desorption/ionization mass spectrometric imaging (MALDI MSI) [1, 2] is a versatile technique that has gained special interest in proteomics for discovering the spatial distribution of proteins, mainly those below 20 kDa [3]. The spatial analysis of small molecules such as lipids by MALDI MSI is far simpler [4], since small molecules ionize more easily by MALDI. For such analytes, a mass spectrometer with high mass accuracy and resolution, such as a Fourier-transform ion cyclotron resonance (FT-ICR) instrument [5, 6], can be employed, which allows hypotheses on chemical formulas to be made directly from the exact masses of the detected peaks. However, because of the high mass resolving power and mass accuracy, FT-ICR MSI produces large datasets, especially when MSI is performed at the high spatial resolution that is beneficial for many applications in biology and medicine. One way to reduce the dataset size is to re-bin the data, similar to the procedure incorporated in the software package FlexImaging (Bruker Daltonics). Re-binning replaces the original values measured on a fine m/z grid with a single value per larger m/z interval. Nonetheless, analyzing the raw data in a non-targeted, discovery-driven manner, by visualizing every peak of the average spectrum as an extracted ion image alongside the co-registered hematoxylin and eosin (H&E) image, is very time-consuming, considering that a dataset typically comprises millions of signals.
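The re-binning idea mentioned above can be illustrated with a minimal sketch. It is not the FlexImaging implementation; the helper name `rebin_spectrum`, the evenly spaced coarse grid, and the choice to sum the intensities within each bin are assumptions made for illustration only.

```python
import numpy as np

def rebin_spectrum(mz, intensity, bin_width):
    """Sum intensities on a coarser, evenly spaced m/z grid (hypothetical helper)."""
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    start = mz.min()
    # Index of the coarse bin that each original m/z value falls into
    idx = np.floor((mz - start) / bin_width).astype(int)
    n_bins = idx.max() + 1
    # Assumption: each coarse bin holds the summed intensity of its members
    binned = np.bincount(idx, weights=intensity, minlength=n_bins)
    centers = start + (np.arange(n_bins) + 0.5) * bin_width
    return centers, binned
```

Summing is only one possible reduction; taking the maximum or the mean per interval would trade peak height fidelity against noise differently.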

To overcome this challenge, we applied a workflow previously developed for medium mass resolution MALDI-TOF MSI datasets [7] to spatially segment the MALDI FT-ICR MSI datasets. Specifically, after the MSI run, a serial tissue section is stained, histopathologically annotated, and finally superimposed onto the segmentation map, which is generated automatically from the imported raw dataset. The segmentation maps color regions of distinct molecular composition, which significantly facilitates data interpretation. Applying this approach to MALDI FT-ICR MSI data to determine the in situ lipidome of head and neck tumors revealed, in most cases, good agreement between histopathologically relevant features and certain molecular characteristics. The follow-up analysis of the lipid signals in selected clusters provided insights into the functional lipidomic organization of head and neck tumors.

2 Experimental

2.1 Materials and Reagents

HPLC grade acetonitrile, ethanol, water, and trifluoroacetic acid (TFA) were purchased from Thermo Fisher Scientific (Fair Lawn, NJ, USA). Alpha-cyano-4-hydroxycinnamic acid (CHCA) was obtained from Bruker Daltonics (Bremen, Germany).

2.2 Tissue Collection and Preparation

Four head and neck tissue samples (two cancer samples, one benign tumor, and a normal salivary gland tissue) were obtained after surgical resection at the Department of Otorhinolaryngology of the Jena University Hospital. After resection, the specimens were snap-frozen in liquid nitrogen, and stored at −80°C until further analysis. The tumor specimens were categorized according to the WHO classification [8]. Ethical approval was obtained from the Ethics Committee of the Jena University Hospital.

Each frozen clinical sample was attached with water on a cryostat steel plate and left for 1 h to adapt to the cryomicrotome cutting temperature of −20°C (Leica CM 1950; Leica Biosystems, Nussloch, Germany). Two 12 μm thick sections were thaw-mounted onto an indium-tin oxide (ITO) glass slide (Bruker Daltonics), and a third section on a regular glass slide for H&E staining. The two tissue sections on the ITO glass slide were subsequently desiccated for 30 min and immediately covered with CHCA at a concentration of 7 mg/mL in 50:50 (v/v) acetonitrile:TFA (0.2%) using the ImagePrep deposition device (Bruker Daltonics).

2.3 MALDI FT-ICR MSI Measurements

MALDI MSI was performed on a Solarix 12 T FT-ICR mass spectrometer (dual ion source ESI, MALDI; Bruker Daltonics) using a smart beam laser (1000 Hz). The spatial resolution was set to 50 μm, and 20 laser shots per spectrum were used. The mass spectra were acquired in the positive ion mode with an m/z range of 350 to 1,500. The mass resolution at m/z 412.278 was 206,139.
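As a rough guide to what the reported resolving power implies, the corresponding peak width can be estimated from the common definition R = m/Δm. That Δm denotes the full width at half maximum is an assumption here; the text does not state which definition was used.

```latex
\[
R = \frac{m}{\Delta m}
\quad\Longrightarrow\quad
\Delta m = \frac{m}{R} = \frac{412.278}{206{,}139} \approx 0.0020
\]
```

Under this assumption, peaks around m/z 412 are about 0.002 m/z units wide.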

2.4 Processing of MALDI FT-ICR MSI Datasets

The ion images were visualized using either FlexImaging (ver. 4.0; Bruker Daltonics) or SCiLS Lab (ver. 2014b; SCiLS GmbH, Bremen, Germany). The latter was also used for processing the datasets as follows: peak picking was performed on the mean spectra of the datasets, and 1000 peaks were selected by modeling each mean spectrum as a sum of Gaussian-shaped peaks plus noise, as described by Alexandrov et al. [7].

2.5 Spatial Segmentation of MALDI FT-ICR MSI Datasets

For the unsupervised mining of the large MALDI FT-ICR MSI datasets, spatial segmentation was employed by clustering the spectra into distinct groups according to their similarities. For the clustering, an efficient method called “bisecting k-means” was selected, which is optimized for hierarchical clustering of large MALDI MSI datasets in order to find hidden structures in the unlabeled data [9, 10]. The algorithm has the advantage that the number of clusters need not be defined a priori by the user, since it recursively partitions the data into two clusters at each step. The clustering was conducted with farthest-distance cluster initialization and with the correlation distance as the distance metric. Spectrum-to-spectrum variation was reduced by edge-preserving spatial denoising, which was performed prior to the segmentation. All described steps were carried out within the “segmentation pipeline” of the SCiLS Lab software. Finally, the generated segmentation maps with two to seven clusters were overlaid on the corresponding histologic images.
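The recursive two-way partitioning described above can be sketched as follows. This is a simplified illustration and not the SCiLS Lab implementation: the function names, the rule of always splitting the largest cluster, and the exact form of the farthest-distance seeding (first seed farthest from the cluster mean, second seed farthest from the first) are assumptions, and the spatial denoising step is omitted.

```python
import numpy as np

def correlation_distance(X, c):
    """1 - Pearson correlation between each row of X and the vector c."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cc = c - c.mean()
    denom = np.linalg.norm(Xc, axis=1) * np.linalg.norm(cc)
    denom = np.where(denom == 0, 1.0, denom)  # guard against flat spectra
    return 1.0 - (Xc @ cc) / denom

def two_means(X, n_iter=20):
    """Split the rows of X into two groups (2-means with correlation distance)."""
    # Assumed farthest-distance seeding: see the lead-in paragraph
    seed_a = X[np.argmax(correlation_distance(X, X.mean(axis=0)))]
    seed_b = X[np.argmax(correlation_distance(X, seed_a))]
    centers = np.stack([seed_a, seed_b])
    labels = None
    for _ in range(n_iter):
        d = np.stack([correlation_distance(X, c) for c in centers], axis=1)
        new_labels = d.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def bisecting_kmeans(X, n_clusters):
    """Repeatedly bisect the largest cluster until n_clusters labels exist."""
    X = np.asarray(X, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for new_id in range(1, n_clusters):
        sizes = np.bincount(labels, minlength=new_id)
        members = np.flatnonzero(labels == sizes.argmax())
        if len(members) < 2:
            break
        split = two_means(X[members])
        labels[members[split == 1]] = new_id
    return labels
```

Because each step only adds one split, the label arrays for two, three, and more clusters form a hierarchy, which is what allows maps at several cluster counts to be compared against the histology.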

2.6 Tentative Lipid Assignments

To make hypotheses on the detected lipid species, an average peak list from each dataset was created in the SCiLS Lab software and imported as a txt file into mMass [11], and the peak centroids were manually queried against the monoisotopic masses of lipids using the online database www.lipidmaps.org. The database was searched with a tolerance of 2 ppm.
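The 2 ppm tolerance search can be sketched as below. This is not the LIPID MAPS query interface; the function names, the candidate dictionary, and the neutral mass used in the usage note are illustrative assumptions. The adduct mass shifts are the standard values for protonated, sodiated, and potassiated ions.

```python
# Mass shifts (Da) for singly charged positive-ion adducts
ADDUCT_SHIFT = {"[M+H]+": 1.007276, "[M+Na]+": 22.989218, "[M+K]+": 38.963158}

def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def match_peak(observed_mz, neutral_masses, tol_ppm=2.0):
    """Return (name, adduct, ppm error) for candidates within the tolerance.

    neutral_masses is a hypothetical dict of candidate name -> neutral
    monoisotopic mass, e.g. exported from a database search.
    """
    hits = []
    for name, mass in neutral_masses.items():
        for adduct, shift in ADDUCT_SHIFT.items():
            err = ppm_error(observed_mz, mass + shift)
            if abs(err) <= tol_ppm:
                hits.append((name, adduct, round(err, 2)))
    return hits
```

For example, `match_peak(782.567, {"candidate_x": 759.5778})` returns a single hit for the sodium adduct, where `candidate_x` and its mass are placeholder inputs. At m/z 782.567 a tolerance of 2 ppm corresponds to roughly ±0.0016 m/z units.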

2.7 Identification of Discrete Molecular Signals in Different Biological Compartments within the Tissue

The agreement between the histologic regions of the tissue sample and the biochemical differences determined by clustering the FT-ICR MSI datasets was checked by an experienced pathologist, who made the final decision on the number of meaningful clusters. Afterwards, each tentatively assigned lipid signal was attributed to a tissue region, which is defined by a cluster. Subsequently, the Pearson correlation coefficient (PCC) between each tentatively assigned lipid species in the dataset and the corresponding cluster was calculated using SCiLS Lab, considering only significant correlations (P-value <0.05). A PCC ≥0.5 was taken as an indication that the specific lipid signal can be correlated with the cluster; below 0.5, the association between lipid signal and cluster was considered inconclusive.
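One way to picture the correlation between a lipid signal and a cluster is to correlate the per-pixel ion intensity with a binary mask of the cluster. The sketch below rests on that assumption; how SCiLS Lab computes the PCC internally is not described here, the function name is hypothetical, and the significance test (P-value) is omitted.

```python
import numpy as np

def cluster_pcc(ion_intensity, cluster_labels, cluster_id):
    """Pearson correlation between an ion image and a cluster mask.

    ion_intensity:  per-pixel intensity of one m/z signal (1-D)
    cluster_labels: per-pixel cluster label from the segmentation (1-D)
    """
    x = np.asarray(ion_intensity, dtype=float)
    # Binary mask: 1 inside the cluster of interest, 0 elsewhere
    mask = (np.asarray(cluster_labels) == cluster_id).astype(float)
    return float(np.corrcoef(x, mask)[0, 1])

def is_cluster_specific(pcc, threshold=0.5):
    """Apply the PCC >= 0.5 criterion described in the text."""
    return pcc >= threshold
```

A signal that is intense only inside the cluster gives a PCC near +1, whereas a signal confined to the rest of the tissue gives a PCC near −1.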

3 Results and Discussion

Head and neck cancer comprises malignant tumors located in the upper aerodigestive tract, for instance in the oral cavity or in the throat. The typical histology of head and neck tumors is squamous cell carcinoma (HNSCC), which is fatal without radical therapy. Nevertheless, half of the patients develop a locoregional recurrence within the following 2 years. Hence, spatial molecular characterization is crucial to improve individualized therapy. MALDI MSI, a molecular imaging technique that complements the histopathologic analysis and enables the simultaneous analysis of hundreds of molecular compounds in a single measurement, has been applied to head and neck cancer to discover the spatial functional proteome [12, 13] as well as the lipidome [14]. The lipidomic analysis of head and neck cancer in particular is crucial for understanding the cellular physiology and pathology of this type of malignancy and, hence, was selected for further examination. In the current study, a collection of four head and neck tissue samples was analyzed by MALDI FT-ICR MSI (see histologic images in Figure 1a). The challenges encountered during the FT-ICR MSI measurements were handling the large datasets and extracting the important information. First, we performed an in-depth analysis of the lipidomic signatures by manually inspecting the spatial distribution of each individual signal in the dataset and its agreement with the corresponding H&E stain. Because this tedious data analysis took over 2 months for one sample, we decided on a computational approach, namely to represent the full dataset as a segmentation map by clustering mass spectra according to their similarity and representing the clusters with different colors [7, 13]. This was performed with the commercially available SCiLS Lab software, which converts the datasets generated by any Bruker Daltonics mass spectrometer into an h5 file via a FlexImaging data importer.
Since the current version of the software is designed for MALDI-TOF MSI datasets, the following parameters were modified to enable the analysis of high-resolution MSI datasets. The first two steps, normalization to the total ion current (TIC) followed by baseline correction of each spectrum, were skipped, since they were not essential for the FT-ICR MSI datasets. Because FT-ICR mass spectra typically contain significantly more peaks than TOF mass spectra, the peak picking method was adapted to search for a larger number of peaks, namely 1000 in total. Edge-preserving spatial denoising and spectra clustering by means of “bisecting k-means” [9, 10] were performed identically to the MALDI-TOF workflow. The segmentation maps generated accordingly for the four head and neck tissue samples are presented in Figure 1. The first HNSCC sample shows a complex morphologic structure, as indicated in the H&E stain in Figure 1a, including tumor stroma, epithelial tumor components, and connective tissue. The automatic segmentation analysis resulted in up to seven clusters. Because a specific color is assigned to each cluster, the margins of each cluster within the investigated HNSCC sample are clearly visible (see Figure 1b). To present these findings, three clusters were selected for detailed examination. The first cluster, presented in yellow, consists of epithelial tumor components and central parts, as is clearly observable in area A1 when the cluster is set transparent and the co-registered H&E stain appears (see Figure 1c). The second cluster, shown in light blue, is defined by epithelial tumor cell components, as highlighted in area A2 when the cluster is set transparent. The segmentation map is shifted to the right, since a consecutive section was used for the H&E stain.
The third cluster, shown in red, includes pre-existing connective (hyalinized) and fat tissue, which becomes obvious when the cluster is set transparent and the H&E stain appears, as displayed in area A3. In the subsequent clustering of the three remaining tissue samples, the resulting segmentation maps were not always in perfect accordance with the histology, for which there are several possible explanations. The first is the quality of the tissue, which is very important. It depends on the biopsy area selected by the physician and on fast cooling of the specimen to avoid any degradation [15]. The second possible explanation is the spatial resolution selected for the acquisition of the MSI data. Ideally, the pixel size should be as small as possible so that all histologically relevant features can be seen. Currently, a spatial resolution of 5 to 10 μm is possible in MALDI MSI [16]; however, it considerably prolongs the acquisition time and greatly increases the dataset size. We chose a spatial resolution of 50 μm, which on the one hand caused some difficulties in visualizing very small tissue compartments but, on the other hand, allowed the experiments to be performed overnight, resulting in datasets on the order of 100 gigabytes. The third explanation for this not “ideal” correspondence between molecular distribution and histologic features could be the use of a consecutive section for H&E staining, as shown earlier, which was necessary because the hydrophobic character of the ITO glass slides led to disruption of the tissue during matrix removal. Consequently, meaningful data could only be generated by choosing the segmentation maps with two clusters for the three remaining tissue samples (second HNSCC tissue sample, parotid gland, and adenolymphoma), as presented in Figure 1. Additionally, the clusters were typically not as clearly defined as in the first HNSCC sample, so that numerous tissue compartments were grouped into one cluster.
For example, the first cluster of the second HNSCC sample, shown in blue in Figure 1, is defined by epithelial tumor components that are strongly keratinized. In the third tissue sample, the normal parotid gland tissue, the first cluster, shown in blue, includes mainly interstitial connective tissue but also fat tissue and parts of the adjacent glandular parenchyma. Finally, the first cluster of the adenolymphoma section, also shown in blue, encloses epithelial, luminal, and lymphoid parts of the adenolymphoma, with the lymphoid parts predominating in this cluster.

Figure 1

Digital histological analysis of the two considered head and neck carcinoma samples in comparison to normal parotid gland tissue, and a benign parotid tumor performed by means of spatial segmentation of MALDI FT-ICR MSI data: (a) Microscopic image taken from a serial section stained with H&E, (b) segmentation map generated from the MALDI FT-ICR MSI dataset, (c) zoom-in visualizing the associated histological features of one selected cluster, which was set transparent

The advantage of performing MALDI MSI measurements on an FT-ICR mass spectrometer is the high mass resolving power, which allowed us to make hypotheses on the chemical formulas of numerous lipids. Figure 2 depicts the average mass spectrum obtained from the MALDI FT-ICR MSI analysis of the first HNSCC tissue (see Figure 1). The measured m/z values that matched possible lipid entities in the LIPID MAPS database within an allowable mass error of ±2 ppm are marked with grey dots in Figure 2. It has to be kept in mind that the in situ analysis of intact lipids based on exact masses is often complicated by isomeric interferences. Hence, further confirmation by LC-ESI MS/MS of lipid extracts from the investigated tissue would be ideal, which was not possible in our case because of the limited amount of available tissue. Consequently, the maximum number of lipid species obtained on the basis of exact mass measurements was 243 for the considered HNSCC tissue. Among these, the categories glycerophospholipids [GP], glycerolipids [GL], and sphingolipids [SP] made up the majority. The complete list of the tentative lipid assignments for the evaluated HNSCC tissue can be found in the Supplementary Material.

Figure 2

Average spectrum obtained by MALDI FT-ICR MSI in an HNSCC tissue section (first section in Figure 1). Peaks matched to monoisotopic molecular masses of +H, +Na, and +K ions of lipids from the database LIPID MAPS are annotated with grey dots. The list of the tentative lipid assignments can be found in the Supplementary Material

To gain deeper insight into the complex lipidomic composition of the analyzed head and neck tissue in the different histomorphologic regions, the average spectrum of each cluster was created from the MALDI FT-ICR datasets using the software SCiLS Lab, which generates the region for each cluster and calculates the mean mass spectrum from the group of spectra assigned to this region. Figure 3 shows the average mass spectra for the first three clusters of the first HNSCC section (see Figure 1). This section was chosen for in-depth analysis, since it showed the best accordance between segmentation results and histology.
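The per-cluster averaging amounts to taking the mean over all spectra that carry the same cluster label. A minimal sketch follows, assuming the spectra are stored as a pixels-by-m/z-bins array with one label per pixel; the function name and data layout are assumptions, not the SCiLS Lab interface.

```python
import numpy as np

def cluster_mean_spectra(spectra, labels):
    """Mean spectrum per cluster (hypothetical helper).

    spectra: array of shape (n_pixels, n_mz_bins)
    labels:  array of shape (n_pixels,) with one cluster label per pixel
    """
    spectra = np.asarray(spectra, dtype=float)
    labels = np.asarray(labels)
    return {int(k): spectra[labels == k].mean(axis=0) for k in np.unique(labels)}
```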

Figure 3

Comparison between the average spectra of selected clusters obtained by MALDI FT-ICR MSI in an HNSCC tissue section (first section in Figure 1). The region of each cluster is shown as an inset

In Figure 3, the area of each cluster within the HNSCC tissue is presented using the same color scheme as in Figure 1. The first two clusters show high signals between m/z 700 and 800, which are characteristic of different phospholipid (PL) species. The strongest signal in both clusters is at m/z 782.567, which can be assigned to phosphatidylcholine (PC) as well as phosphatidylethanolamine (PE), the most abundant PL classes in mammalian cell membranes [17]. It is plausible that the first two clusters share the same main lipid species, since they differ only slightly in their tumor environment; both consist mainly of epithelial tumor components. In contrast, the third cluster has a completely different morphology, as revealed by the histologic examination (see Figure 1c): it consists mainly of fatty tissue. This morphologic difference is clearly reflected in the lipidomic profile, as shown in Figure 3 (bottom average spectrum in red). The average spectrum of this third cluster shows abundant signals between m/z 850 and 950, which are characteristic of triacylglycerols (TG). Since TGs represent the storage form of fatty acids in the human body, it is to be expected that they are mainly localized in adipose tissue (fat tissue).

To identify differences in the lipid biochemistry within the HNSCC tissue according to the underlying morphology on the basis of single lipid species, the Pearson correlation coefficient (PCC) between each tentatively assigned lipid species in the HNSCC tissue and the corresponding cluster was calculated (P-value <0.05) using SCiLS Lab, as described in the Experimental section. The resulting PCCs for each cluster (seven clusters in total, as shown in Figure 1) can be found in the Supplementary Material. The PCC can vary between +1 and −1, where +1 is total positive correlation, 0 is no correlation, and −1 is total negative correlation, the latter meaning that the specific signal is located, with high confidence, in the remaining tissue region. Furthermore, a value between 0.5 and 1.0 indicates a high correlation with the considered cluster, and a value between 0.3 and 0.5 a medium correlation. The tentatively assigned lipid species with the highest PCC for each of the first three clusters are summarized in Table 1 together with the corresponding elemental composition.

Table 1 Elemental Composition of the Compound with the Highest Pearson Correlation Coefficient (PCC) in the Considered Cluster in a HNSCC Tissue (First Section in Figure 1)

It should be noted that signals with a high intensity in a certain cluster are not necessarily suitable candidates for discriminating this tissue region from the surrounding area. For example, the signal at m/z 782.567 (Table 1, first row) has a high intensity in the first cluster, as indicated in Figure 3 (first average spectrum in yellow); however, the PCC of 0.441 reveals only a medium association between the signal and the first cluster. In fact, this signal also has high intensities in the second cluster, as indicated in Figure 3 (second average spectrum in blue). This behavior is clearly shown in Figure 4a: high signal intensities of m/z 782.567 are observed in the constructed ion image outside the first cluster. For a better view, the regions of the first cluster are outlined in yellow.

Figure 4

Extracted MALDI FT-ICR MSI ion images of PL species at (a) m/z 782.567 and (b) m/z 788.616. The region of the first cluster is outlined in yellow and that of the second cluster in blue

Within the second and third clusters, several specific m/z values with high correlations (>0.6) could be identified, which can be considered suitable candidates for discriminating between the considered cluster and the remaining tissue. The ion species with the highest PCC in the second cluster, m/z 788.616 (Table 1, second row), was selected to demonstrate the discriminative potential. The extracted ion image of this signal, presented in Figure 4b, clearly illustrates an overexpression of the PL signal in the second cluster (outlined in blue), which consists mainly of epithelial tumor parts, and little to no expression in the pre-existing tissue. From this it can be speculated that m/z 788.616 may serve as a suitable candidate for defining epithelial tumor components. Of course, this finding is currently still a hypothesis, and a comprehensive validation in a follow-up study is required, including MALDI FT-ICR MSI analysis of a statistically meaningful number of head and neck cancer samples with defined epithelial tumor regions, followed by LC-ESI MS/MS assignment of the lipid signals as evidence. Related published work is in good agreement with our hypothesis, demonstrating the spatial confinement of PCs to specific cancer regions. Examples are the MALDI MSI study on colorectal cancer, which shows elevated levels of PC (16:0–18:1), LysoPC (16:0), and LysoPC (18:1) in the cancerous tissue [18], and the MALDI MSI study on breast tumor xenografts, which shows elevated levels of PC (16:0/16:0), PC (16:0/18:1), PC (18:1/18:1), and PC (18:0/18:1) in the viable tumor regions [19]. Furthermore, the review article by Ridgway [20] proposes that PC contributes to proliferative growth and programmed cell death.

Finally, to identify the global lipidomic changes associated with the specific histopathologic entities in the first HNSCC tissue, each of the 243 tentatively assigned lipid species was allocated to the cluster with the highest PCC, provided that the PCC was ≥0.5; otherwise, no assignment was performed. Using this approach, 73 tentatively assigned lipids could be categorized into the first three clusters, as shown in Figure 5. To simplify the interpretation, the lipid species were divided into their functional categories. Overall, it can be concluded that the first cluster cannot be differentiated from the second and the third cluster by a single lipid species. The second cluster contains all four categories, GL, GP, PR, and SP, highlighting the complexity of this tumor region. In the third cluster, GLs represent the main category, which reflects the tissue morphology, mainly fat tissue, as discussed earlier in this paper.
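The allocation rule described above (highest PCC wins, but only if it reaches 0.5) can be written compactly. The sketch below assumes the PCC values are available as a nested dictionary of lipid name to cluster to PCC; the function name, data layout, and the placeholder names in the usage note are hypothetical.

```python
def assign_to_clusters(pcc_table, threshold=0.5):
    """Allocate each lipid to its best-correlated cluster, if PCC >= threshold.

    pcc_table: dict mapping lipid name -> {cluster id: PCC}
    Lipids whose highest PCC is below the threshold are left unassigned.
    """
    assignment = {}
    for lipid, pccs in pcc_table.items():
        best_cluster = max(pccs, key=pccs.get)
        if pccs[best_cluster] >= threshold:
            assignment[lipid] = best_cluster
    return assignment
```

For instance, a placeholder entry such as `{"lipid_a": {1: 0.44, 2: 0.31}}` yields no assignment, because its highest PCC stays below 0.5.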

Figure 5

Graphical display of the tentatively assigned lipids, divided into their categories, in the first three clusters in an HNSCC tissue section (first section in Figure 1). Each lipid species was allocated to the cluster with the highest PCC, provided the PCC was ≥0.5 to ensure a high correlation. No assignment was performed if the PCC was below 0.5. GL = glycerolipids; GP = glycerophospholipids; PR = prenol lipids; SP = sphingolipids

4 Conclusion

In summary, the results indicate that the morphological changes observed in the investigated head and neck tumor samples are largely reflected in the MALDI FT-ICR MSI data. This gives hope that the developed approach will help in gaining further knowledge in the area of cancer research and will enable a broader community to extract useful information from very large high mass resolution MSI datasets. However, our experience shows that the spatial segmentation maps need to be evaluated by an experienced pathologist.

Additionally, matching the discovered peaks against the lipidomics database LIPID MAPS provided hundreds of hypotheses on the detected molecules, which is a valuable reduction from millions of original signals to a set of possible candidates. These candidates can be validated by MS/MS using microdissection and “Folch” extraction [21] on fresh biopsy material.