Introduction

Menière’s disease (MD) is a disorder of inner ear homeostasis causing vertigo, hearing loss and tinnitus in about 0.2–0.5 % of the general population [1, 2]. Its pathological hallmark is a distension of the endolymphatic space, termed endolymphatic hydrops (ELH) [3, 4], resulting in an increased proportion of endolymph (EL) within the total fluid space of the inner ear. The diagnosis of Menière’s disease is based upon the typical clinical syndrome and the demonstration of ELH [5]. Since the latter could previously be obtained only by histological post-mortem examination, clinical practitioners and researchers had to rely upon the clinical manifestations alone in order to establish the diagnosis of MD.

After Zou et al. first reported separate visualization of endolymph and perilymph spaces in living humans by magnetic resonance (MR) imaging [6], Nakashima et al. [7] first provided evidence for the feasibility of visualization of ELH in living MD patients by locally enhanced inner ear MR imaging (LEIM). Using a three-dimensional (3D) fluid-attenuated inversion recovery (FLAIR) sequence, a high signal intensity within the perilymph was achieved after application of gadolinium-based contrast medium (GBCM) into the middle ear. This method could separate perilymph from both endolymph and bone, but not endolymph from bone. Naganawa et al. [8] then introduced the 3D Real reconstruction inversion recovery (Real-IR), which was able to differentiate endolymph, perilymph and surrounding bone for the first time by a single sequence. More recent work used modifications of these sequences to demonstrate endolymphatic hydrops even after single dose i.v. administration of GBCM [9].

So far, however, in vivo measurements of ELH have yielded only semi-quantitative and subjective data, based on 2D or 3D manual segmentation of endolymph and perilymph spaces [8, 10–12]. Objective volumetric quantification of the endolymph / total fluid space (EL/TFS) ratio in patients with MD is a necessary next step to allow systematic pathophysiological and therapeutic studies of ELH in MD. Several factors make an objective volumetric assessment of inner ear fluid spaces a challenging task: (1) Intra-tympanic GBCM application achieves a higher signal intensity than does i.v. application, but the signal distribution within the inner ear is less uniform [13]. An uneven signal distribution impedes precise automated segmentation using a single threshold value. (2) The Real-IR sequence provides a very good contrast between endolymph and perilymph. However, the border between perilymph and surrounding bone is too fuzzy for precise automated segmentation. We aimed to volumetrically quantify endolymph and perilymph spaces of the inner ear in order to establish a methodological basis for further investigations into the pathophysiology of Menière’s disease and to potentially monitor future therapeutic approaches.

Methods

Subjects

This prospective study was approved by the local Institutional Review Board / University Ethics Committee (Protocol No. 093-09). All patients provided oral and written informed consent. Sixteen consecutive patients with unilateral MD (eight female; mean age, 55 years; range, 38–71 years) were included in the study. Inclusion criteria were the clinical diagnosis of definite MD according to the guidelines of the American Academy of Otolaryngology–Head and Neck Surgery [5] and an age above 18 years. Exclusion criteria consisted of MR-related contraindications such as cardiac pacemakers or claustrophobia, a history of allergies to GBCM and an inability to provide informed consent, as well as middle ear pathology that could impede local contrast uptake.

Contrast agent application

Gadopentetate dimeglumine (Magnograf, Marotrast, Jena, Germany) was diluted eightfold in saline solution and injected intra-tympanically (0.4 ml) under microscopic control. The patients remained in a supine position for a further 30 min with the head turned 45 degrees toward the contralateral side, instructed not to speak or chew during this period. MR imaging was performed 24 hours after application of the contrast agent on a 3 Tesla MR scanner (Magnetom Verio, Siemens Healthcare, Erlangen, Germany), using a commercially available 32-channel head coil (Siemens Healthcare, Erlangen, Germany).

MRI acquisition

We used a Real-IR sequence to differentiate EL from perilymph (PL) and bone and a T2-SPACE sequence to delineate the total inner ear fluid space from the surrounding bone.

T2-SPACE

We used a heavily T2-weighted 3D “Sampling Perfection with Application-optimized Contrasts using different flip angle Evolutions” (SPACE) turbo spin echo sequence, with long spin-echo trains with a duration of 477 ms each. The matrix was 384 × 384 pixels for a field of view of 192 × 192 mm²; 56 slices were reconstructed and interpolated to a thickness of 0.5 mm, resulting in a voxel size of 0.5 × 0.5 × 0.5 mm³. Phase oversampling of 100 % and slice oversampling of 14.3 % were applied. Parallel imaging with the GRAPPA algorithm was used with an acceleration factor of two. The echo time (TE) was 135 ms, the repetition time (TR) 1000 ms, the refocusing RF angle 110 degrees, and the receiver bandwidth 283 Hz/pixel. A 90-degree restore pulse was used after each echo train to recover the remaining magnetization. The acquisition was averaged four times to improve the signal-to-noise ratio (SNR). The resulting total scan time for this sequence was 8:34 min.
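As a quick consistency check on the stated geometry, the in-plane voxel edge follows from the field of view divided by the matrix size. The sketch below is illustrative arithmetic only; the function and variable names are hypothetical and not part of the scanner protocol.

```python
def voxel_edge_mm(fov_mm, matrix):
    """In-plane voxel edge length: field of view divided by matrix size."""
    return fov_mm / matrix


# T2-SPACE values quoted in the text: 192 mm field of view, 384 matrix.
edge = voxel_edge_mm(192, 384)

# 56 reconstructed slices of 0.5 mm each give the slab thickness in mm.
slab_mm = 56 * 0.5
```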

3D Real-IR

An isotropic 3D real-part reconstruction inversion recovery (Real-IR) turbo-spin echo sequence was acquired, based on the publication by Naganawa et al. [8]. The parameters were: TR 6000 ms, TE 155 ms, TI 1500 ms, fat saturation, constant flip angle of 180 degrees, echo train length of 35, echo train followed by a 90-degree restore pulse, matrix size of 320 × 320, 36 acquired slices (with 11.1 % slice oversampling), 0.5 × 0.5 mm² in-plane resolution at 0.5 mm slice thickness, receiver bandwidth 195 Hz/pixel, and number of excitations 1. The examination time was 15:14 min.

Image processing

Image processing (Fig. 1) comprised resampling and coregistration of the T2-SPACE volume to the Real-IR volume, Contrast Limited Adaptive Histogram Equalization (CLAHE), segmentation of the inner ear total fluid space using Random Forest Classification machine learning, fusion of this template with the 3D Real-IR image, endolymph/perilymph segmentation using a Niblack local threshold algorithm, and 3D reconstruction. Details of these procedures are given in the Appendix. Four months after the original image processing algorithm was applied to the raw image data sets of the 16 patients, the procedure was performed again on all raw image data by the same researchers (AB, DK).

Fig. 1 Workflow of image processing

Pre-processing

In a first step, we re-sampled the Real-IR and T2-SPACE sequences with cubic interpolation using Analyze 11.0 software (AnalyzeDirect Inc., Kansas City, KS), resulting in a voxel size of 0.1 × 0.1 × 0.1 mm³. The interpolation was done for each voxel by using continuous and smooth cubic polynomials. Coregistration of the T2-SPACE volume to the Real-IR volume was achieved by using a normalized entropy measure for multimodality image alignment [14], incorporated in the image processing software Analyze 11.0. Then, 3D Gaussian and median filtering (5 × 5 × 5) [15] were applied in order to improve the SNR while preserving edges [16]. CLAHE [17] was used to enhance local contrast by performing pixel value equalization on a local basis in a 120 × 120 pixel neighbourhood [18]. This results in an evenly linearized Cumulative Distribution Function (CDF) of grey-scale values within the dynamic range, unveiling areas that are overexposed and underexposed while avoiding over-amplification of noise.
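The resampling step can be illustrated in one dimension. Going from 0.5 mm to 0.1 mm voxels corresponds to an upsampling factor of five per axis. The exact cubic kernel used by the software is not stated in the text, so the sketch below assumes a Catmull-Rom cubic purely for illustration; the function names are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Cubic polynomial through p1 (t = 0) and p2 (t = 1), assumed Catmull-Rom form."""
    return 0.5 * (
        2 * p1
        + (p2 - p0) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t * t
        + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3
    )


def resample_1d(samples, factor):
    """Upsample a 1D intensity profile by an integer factor with cubic interpolation."""
    n = len(samples)
    out = []
    for i in range((n - 1) * factor + 1):
        k, r = divmod(i, factor)
        t = r / factor
        # Clamp neighbour indices at the borders of the profile.
        p = [samples[min(max(k + d, 0), n - 1)] for d in (-1, 0, 1, 2)]
        out.append(catmull_rom(p[0], p[1], p[2], p[3], t))
    return out


# Factor 5 mirrors the 0.5 mm to 0.1 mm change; the profile values are made up.
fine = resample_1d([0, 1, 2, 3], 5)
```

In three dimensions the same interpolation would be applied along each axis in turn, so every original voxel is represented by 5 × 5 × 5 = 125 resampled voxels.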

Total fluid space segmentation

Generation of the T2-SPACE template, representing the total fluid space of the inner ear, was achieved through interactive segmentation of the T2-SPACE sequence by an ensemble learning classification method named Random Forest (RF) Classification and introduced by Breiman [19], as implemented in the Interactive Learning and Segmentation Toolkit (ILASTIK) [20]. RF combines Breiman’s bagging [21] with randomized decision trees proposed by Amit and colleagues [22]. The RF ensemble algorithm consists of a number of unpruned, randomized decision trees, and is capable of capturing highly nonlinear decision boundaries. RF is inherently parallel, fast and robust against noise [23–25]. From the cohort of 16 patients, six patients were randomly selected as the learning data set for this training procedure, which served as a basis for the RF Classifier segmentation of the T2-SPACE sequence in all patients.

Fusion

The external borders of the inner ear in the 3D Real-IR sequence were defined by fusing the previously segmented MR cisternography T2-SPACE template (Fig. 4.2) and the gadolinium-contrasted 3D Real-IR sequence volume (Fig. 4.1). Fusion was performed using the minimum function, Min (T2-SPACE, Real-IR), resulting in a volume termed “Real-IR/T2-SPACE-Temp” (Fig. 1c and Fig. 4.3).
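The minimum-based fusion can be sketched on a single 2D slice. The example assumes that the segmented template is encoded as a mask with the maximum grey value inside the fluid space and zero outside, which is an assumption made here for illustration and is not stated in the text; the function name and the intensity values are hypothetical.

```python
def fuse_min(template, real_ir):
    """Voxel-wise minimum of two equally shaped 2D slices given as nested lists."""
    if len(template) != len(real_ir) or any(
        len(a) != len(b) for a, b in zip(template, real_ir)
    ):
        raise ValueError("template and Real-IR slice must have the same shape")
    return [
        [min(a, b) for a, b in zip(row_t, row_r)]
        for row_t, row_r in zip(template, real_ir)
    ]


# Assumed encoding: 255 inside the total fluid space, 0 in bone (8-bit data).
template = [[0, 255], [255, 0]]
real_ir = [[40, 90], [200, 130]]
fused = fuse_min(template, real_ir)
```

Under this encoding the minimum keeps the Real-IR intensities inside the template and sets everything outside it to zero, so the fuzzy perilymph/bone border of the Real-IR image no longer influences the later segmentation.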

Endolymphatic and perilymphatic space segmentation

The heterogeneous gadolinium contrast medium distribution results in considerable fluctuations of the EL/PL contrast. The small organ size and the relatively low in vivo SNR impede automated EL/PL segmentation across the inner ear using a global threshold. Therefore, local threshold algorithms (LTA) were used, which adapt the threshold value of each pixel to the local neighborhood characteristics within a defined radius r. The threshold values are spatially determined based on the local image variance [26]. The parameter R, introduced by Sauvola and Pietikainen [27], is the maximum value of the standard deviation (R = 128 for a greyscale image). It can be thought of as a global normalization of the local standard deviation. Trier and Jain found that Niblack local thresholding is one of the best performing local thresholding methods [28]. The Niblack method computes the threshold T(i, j) for each pixel from the local mean m(i, j) and the local standard deviation σ(i, j) within a window of radius r around it [39], as follows:

$$ T\left(i,j\right)=m\left(i,j\right)+k\sigma \left(i,j\right) $$

The size of the local neighbourhood parameter r should be small enough to preserve local details, but at the same time large enough to suppress noise. The default value of the weight k for bright phase GBCM enriched perilymph is 0.2 [29]. According to Trier and Jain, a radius of r = 15 pixels (px) and a bias setting of k = 0.2 were therefore considered suitable.
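A minimal sketch of the Niblack rule T(i, j) = m(i, j) + kσ(i, j) on a small 2D image is given below. It is not the ImageJ plugin code: it assumes a square window clipped at the image border and the population standard deviation, and the function name and the test image are hypothetical.

```python
from statistics import mean, pstdev


def niblack_mask(image, r, k=0.2):
    """Return a boolean mask of pixels brighter than their local Niblack threshold."""
    rows, cols = len(image), len(image[0])
    mask = []
    for i in range(rows):
        mask_row = []
        for j in range(cols):
            # Square window of radius r, clipped at the image border (assumption).
            window = [
                image[y][x]
                for y in range(max(0, i - r), min(rows, i + r + 1))
                for x in range(max(0, j - r), min(cols, j + r + 1))
            ]
            threshold = mean(window) + k * pstdev(window)
            mask_row.append(image[i][j] > threshold)
        mask.append(mask_row)
    return mask


# Made-up 5 x 5 image: uniform background with one bright pixel in the centre.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 200
bright = niblack_mask(image, r=1, k=0.2)
```

With a positive k the mask selects the bright phase, which corresponds to contrast-enriched perilymph; the remaining voxels inside the fluid-space template would then be attributed to endolymph.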

In addition, the EL/TFS area ratios of the mid-modiolar cochlear section and the vestibular section were determined by manual selection of regions of interest (ROIs) for endolymphatic and perilymphatic spaces (Table 3).

3D Reconstruction

The post-processing included median smoothing (3 × 3 × 3) and separation of the inner ear from the eighth cranial nerve / internal auditory meatus using Analyze 11 (3D-Volume Module; AnalyzeDirect Inc., Kansas City, KS). This was done manually by a radiologist and a neurotologist (BEW, RG; both with more than 10 years of experience in temporal bone image evaluation), in a consensus manner. The inner ear was then three-dimensionally reconstructed with different colour codes for the total fluid space and the endolymphatic space.

Four months after the original image processing algorithm was applied to the raw image data sets of the 16 patients, the procedure was performed again on all raw image data by the same researchers (AB, DK). The two calculated sets of data for cochlear and vestibular EL/TFS volume ratios were analyzed for test–retest reliability.

Neurotologic assessment

All patients underwent a neurotologic evaluation including otomicroscopy, audiometry and tympanometry. Cerebellopontine angle tumors were ruled out by i.v. contrast-enhanced T1-weighted cranial MR imaging. Caloric vestibular responses were tested with videonystagmography according to Hallpike [30], the mean maximal slow phase velocity was determined, and the degree of horizontal canal paresis was evaluated according to Jongkees et al. [31]. All patients were diagnosed by two neurotologists with more than 10 years of experience (R.G., E.K.).

Statistical analysis

Test–retest reliability of the semi-automated volumetry and the manual ROI segmentation of cross sections was analyzed using the intra-class correlation coefficient (ICC) and the Pearson correlation coefficient for two trials [32]. The correlation analysis between hearing loss and cochlear EL/TFS ratio was performed with the Pearson correlation coefficient. Statistical analyses were carried out with the IBM SPSS Statistics 20 software package (SPSS Inc., Chicago, IL, USA).
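The two reliability measures can be sketched for a pair of trials as follows. The ICC form used in the statistics package is not specified in the text, so the example assumes the one-way random-effects, single-measurement form, ICC(1,1); the function names and the ratio values are illustrative and are not study data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_oneway(trial1, trial2):
    """ICC(1,1) for two trials per subject (assumed form, one-way random effects)."""
    n, k = len(trial1), 2
    subject_means = [(a + b) / 2 for a, b in zip(trial1, trial2)]
    grand = sum(subject_means) / n
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2
        for a, b, m in zip(trial1, trial2, subject_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Made-up EL/TFS ratios (%) for four subjects; trial 2 has a constant offset.
t1 = [2.0, 10.0, 15.0, 25.0]
t2 = [3.0, 11.0, 16.0, 26.0]
r_value = pearson(t1, t2)
icc_value = icc_oneway(t1, t2)
```

The constant offset in the made-up second trial leaves the Pearson coefficient at 1 but lowers the ICC, which is why reporting both measures is informative for test–retest agreement.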

Results

Clinical characteristics of the study population are summarized in Table 1. The study cohort comprised patients in all stages of Menière’s disease. We applied six different LTAs (Fig. 4.4–4.9; Bernsen, Mean, Median, MidGrey, Niblack and Sauvola [27, 33–37]), as implemented in the ImageJ 8-bit Auto Local Threshold plugin [29, 38], to our Real-IR/T2-SPACE-Temp test data sample set (six patients). Niblack’s method [39] was least vulnerable to fluctuations of the EL/PL contrast and noise in comparison with other LTA algorithms (Fig. 4). A window radius of r = 4 px in Niblack’s local thresholding algorithm led to many small clusters and over-amplification of noise (Fig. 4.10), whereas a window radius of r = 16 px led to cluster agglomeration and loss of details (Fig. 4.12). The intermediate local radius r = 8 px performed best across the Real-IR/T2-SPACE-Temp test data sample (Fig. 4.11).

Table 1 Characteristics of the study population (n = 16)

The 3D reconstruction of the inner ear fluid space according to the method described above resulted in a clear delineation of endolymphatic space within the total fluid space in all patients. In general, the endolymphatic space, even in the presence of severe endolymphatic hydrops, could hardly be visualized within the semicircular canals (Fig. 2a, c). 3D reconstruction in cases with only moderate ELH resulted in discontinuous visualization of the cochlear duct (Fig. 2a), whereas in cases with severe ELH, the cochlear duct is visualized continuously (Fig. 2c). Vestibular endolymphatic space, due to its more spherical geometry and relatively large dimension, is best visualized. While the saccular and the utricular subcompartments are depicted separately on 2D images (Fig. 2b) in cases with moderate ELH, they appear confluent in cases with severe ELH (Fig. 2d). Three-dimensional images depict the delicate shape of the vestibular EL space in cases with moderate ELH (Fig. 2a) and the rather blunt appearance of vestibular EL space in cases with severe ELH (Fig. 2c). Furthermore, the recently described herniation of vestibular membranous labyrinth into the semicircular canals [40] is easily appreciated on 3D images (Fig. 2c). A video rendering of the 3D reconstruction is provided in the appendix.

Fig. 2 Locally enhanced inner ear MR imaging (LEIM) of two patients with definite Menière’s disease of the right ear after co-registration of T2-SPACE and Real-IR images and automated segmentation of total inner ear fluid space (cyan) and endolymph space (red). A and C depict the 3D reconstruction, B and D depict axial cross-sections. The three semicircular canals are continuously visualized. The interscalar septum is correctly excluded from the inner ear fluid space (thin filled arrows in B and D). A and B show the right inner ear of a patient with moderate endolymphatic hydrops. The cochlear duct cannot be visualized continuously (thin transparent arrow), due to its minute dimensions. The saccular and the utricular subcompartments of the vestibulum are separately depicted (thick filled arrow) on the 2D images. C and D show the right inner ear of a patient with severe endolymphatic hydrops. The cochlear duct is markedly distended and is continuously visualized within all three cochlear turns (thin transparent arrow). The saccular and the utricular subcompartments of the vestibulum are confluent (thick filled arrow) on 2D images. The herniation of vestibular membranous labyrinth into the semicircular canals is easily appreciated on the 3D images (thick transparent arrows)

The results of volumetric measurements of inner ear fluid spaces are summarized in Table 2. The endolymph / total fluid space (EL/TFS) ratio, as a measure of the severity of endolymphatic hydrops, ranged from 2 to 25 % (mean 15 %) for the cochlea and from 12 to 40 % (mean 28 %) for the vestibulum. Thus, a wide range of degrees of endolymphatic hydrops was volumetrically quantified. The cochlear EL/TFS ratio was significantly correlated with hearing loss (Pearson coefficient = 0.747; p = 0.001) (Fig. 3).
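How a volume and an EL/TFS ratio follow from segmented voxel counts can be sketched as below. The 0.1 mm isotropic voxel edge is taken from the resampling step; the function names and the voxel counts are hypothetical and are not study data.

```python
VOXEL_EDGE_MM = 0.1  # isotropic voxel edge after resampling


def volume_mm3(n_voxels, edge_mm=VOXEL_EDGE_MM):
    """Volume of a segmented compartment from its voxel count."""
    return n_voxels * edge_mm ** 3


def el_tfs_ratio(n_endolymph, n_perilymph):
    """Endolymph volume as a fraction of the total fluid space."""
    total = n_endolymph + n_perilymph
    if total == 0:
        raise ValueError("empty segmentation")
    return n_endolymph / total


# Made-up counts: 15,000 endolymph and 85,000 perilymph voxels.
ratio = el_tfs_ratio(15_000, 85_000)
total_volume = volume_mm3(15_000 + 85_000)
```

Because both compartments share the same voxel size, the ratio depends only on the voxel counts, whereas absolute volumes scale with the cube of the voxel edge.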

Table 2 Inner ear fluid spaces as measured by Locally enhanced inner ear MR (LEIM) volumetry in 16 patients with definite Menière’s disease. The severity of endolymphatic hydrops is expressed as endolymph/total fluid space ratio. Endolymph space within the semicircular canals is too small to be separately visualized by MR imaging; therefore, only the total fluid space of the SCC is listed.

Fig. 3 Correlation between hearing loss, expressed as the four-tone pure-tone average (PTA) at 0.5, 1, 2 and 3 kHz in dB, and the cochlear endolymphatic hydrops severity, expressed as the EL/TFS volume ratio

The results of a second trial of image processing are summarized in Table 3. Test–retest reliability was statistically analyzed for the two primary outcome variables, cochlear EL/TFS volume ratio and vestibular EL/TFS volume ratio. Both the intra-class correlation coefficient (ICC) and the Pearson correlation coefficient for the EL/TFS volume ratio were 0.99 for both the cochlea and the vestibulum. In contrast, the ICC of the manually segmented EL/TFS area ratios (Table 3) was 0.87 for the cochlea and 0.91 for the vestibulum.

Table 3 Test–retest reliability analysis

Discussion

More than 150 years after its initial description, MD still represents a major diagnostic and therapeutic challenge. Only recently has the visualization of ELH, its pathological correlate, been achieved. Endolymphatic hydrops imaging has been used to link ELH to the deterioration of audiovestibular functions in Menière’s disease [41–46]. First evidence from a cross-sectional study supports the notion that ELH is a morphological phenomenon that progresses with the duration of the disease [41], suggesting that ELH is a valuable disease marker for MD.

A critical review of the clinical evidence reveals a lack of evidence from controlled trials for the effectiveness of any therapeutic approach towards MD. The strongest evidence is in support of gentamicin therapy, allowing the statement that it “seems to be an effective treatment” [47–51]. Gentamicin is a destructive treatment aiming for a partial ablation of vestibular functions. It inherently carries a risk of hearing deterioration as a side effect [52]. A therapy with a proven beneficial effect on the natural course of the disease is still lacking. Ideally, a future treatment for Menière’s disease should reverse the progression of ELH as its primary pathological feature. A method of objectively quantifying the extent of endolymphatic hydrops is therefore needed. A 3D Real-part Reconstruction Inversion-Recovery sequence with an inversion time of about 1700 ms allowed, for the first time, a separate visualization of endolymph, perilymph and surrounding bone with a single sequence [8]. The delineation between endolymph and perilymph and between endolymph and bone, however, is much more distinct than the contrast between perilymph and bone. This and the variability of contrast uptake [53] render the segmentation between perilymph and bone difficult.

We sought to solve this problem by combining a sequence with excellent demarcation of endolymph from perilymph and bone (3D Real-IR) and a sequence with optimized demarcation between the inner ear fluid space and bone (T2-SPACE). The endolymph signal generated in one sequence was subsequently subtracted from the total inner ear fluid space as defined by the other sequence. Heavily T2-weighted sequences, like CISS or FIESTA, have been used for many years to evaluate the fluid space of the inner ear, e.g., for the diagnosis of inner ear malformations or for the evaluation of cochlear fibrosis in cochlear implant candidates. These sequences are known to be prone to band-like susceptibility artifacts due to local field heterogeneities that mainly affect the vestibulum and the semicircular canals [54]. Therefore, we used a SPACE sequence to reduce imaging artifacts without significantly increasing the imaging time [55].

We used a Random Forest (RF)-based method to define the outer border of the inner ear fluid space. Numerous studies (e.g., in the fields of dementia research, mammography, pre-operative assessment of the femur, or cardiology) have demonstrated the performance of RF-based systems to be comparable to or better than that of experienced radiologists [56–59].

For the segmentation of EL/PL spaces within the inner ear, the Niblack method [29] was, in our hands, the most robust against fluctuations of the EL/PL contrast and noise in comparison with other LTA algorithms (Fig. 4), which is in line with the findings of Trier and Jain [28].

Fig. 4 Comparison of different automated local threshold segmentation algorithms applied to the Real-IR/T2-SPACE-Temp dataset (4.3), which results from the fusion of the previously segmented MR cisternography T2-SPACE template (4.2) and the gadolinium-contrasted 3D Real-IR sequence volume (4.1). The Niblack algorithm (4.8) was chosen for the segmentation of endolymph from perilymph space. A window radius of r = 8 px was chosen (4.11). For comparison, Fig. 4.10 shows segmentation with a window radius of r = 4 px, Fig. 4.12 shows segmentation with a window radius of r = 16 px

Buckingham and Valvassori [60] estimated the average total inner ear fluid volume of the bony labyrinth from histological cross-sections to be 193 mm³. Average inner ear fluid volumes have been measured by MR volumetric assessments in 29 healthy subjects [61] at 195 mm³ (range 150–279 mm³). The authors of this study outlined the borders of inner ear structures manually to obtain a volume. Our 3D semi-automatic inner ear reconstruction (total fluid space ranged from 155 to 212 mm³ with an average of 181 mm³) is consistent with these previously reported data.

Liu et al. [62] used a 3D FLAIR sequence after intratympanic application of GBCM in order to quantify ELH in six MD patients. The mean EL/TFS area ratios were 31 % (range 18–48 %) for the cochlea and 44 % (range 33–58 %) for the vestibulum. The major disadvantage of a FLAIR-based ELH quantification is the limited distinction of endolymphatic space from the surrounding bone. Furthermore, segmentation of EL space was performed manually.

A very recent study [63] presented another approach to quantification of ELH in a sample of patients with various inner ear disorders. These authors also manually drew the outer border of the inner ear fluid space on MR cisternography and fused this region of interest to an i.v. contrast-enhanced 3D Real-IR sequence-like contrast image. Then, endolymph / total fluid space segmentation was performed with a fixed global threshold. The authors reported low inter-observer variability for their method. However, both of these approaches were done on one selected section through the vestibulum and the cochlea only, whereas the 3D volumetric assessment presented here assesses the entire inner ear. I.v. application of GBCM is less invasive than the intratympanic route, at the cost of a lower perilymph signal intensity. Nevertheless, the segmentation method presented here may also be applicable to MR image data obtained after i.v. GBCM application in the future.

Hearing loss in MD patients has been shown to correlate with the severity of cochlear ELH in 2D MR imaging studies [41]. This correlation was not found between semicircular canal paresis and vestibular ELH [64]. In our study, the volumetrically determined cochlear EL/TFS ratio, as an expression of ELH severity, was significantly correlated with hearing loss, supporting the validity of our volumetry method (Fig. 3).

There are limitations to our study that need to be taken into account when interpreting the data. The volumetric method presented here is not exclusively automated, but contains a user-defined input during (1) the interactive Random Forest Classifier based segmentation of the inner ear fluid space, (2) the subdivision of the inner ear into cochlea and vestibulum and (3) the delineation of the inner ear from the internal auditory canal / vestibulocochlear nerve. Nevertheless, we have found excellent test–retest results for our data set, demonstrating that the subjective input to the method is likely to only have a minor effect on the overall results.

Secondly, the volumetric method presented here does not by itself allow the establishment of the diagnosis of MD in an individual patient. For example, patient No. 2 in this series had an EL/TFS ratio of only 2 % in the volumetric evaluation. Despite the clinical diagnosis of definite MD, this patient had only slight functional audiovestibular deficits, most likely due to the early stage in the disease course. Due to the small values for the EL/TFS ratio both in healthy subjects and in the earliest MD disease stage, it may be difficult to define a clear cutoff value between the upper limit of normal variability and the pathologically enlarged EL space in an individual subject, even when larger samples of healthy subjects are screened with the currently achievable MR image resolution. Such a cutoff is not necessary for the purpose of longitudinal volumetric monitoring of ELH during therapeutic studies in MD, but future studies should investigate this important topic.

In summary, this study reports for the first time an in vivo, computer-aided volumetric quantification of endolymphatic hydrops. By applying the Random Forest Classification machine learning algorithm to MR inner ear cisternography and a Niblack segmentation algorithm to a 3D real reconstruction inversion-recovery sequence, the endolymph / total fluid space ratio was reliably measured over a wide range. This computer-aided volumetric approach is a promising tool to monitor ELH in therapeutic trials in Menière’s disease.