Introduction

Since the groundbreaking work of Yousem et al. [26, 29], magnetic resonance imaging (MRI)-based measurements of the olfactory bulb (OB) have become an established tool in the investigation of olfactory function. They have been used in healthy subjects, people with isolated congenital anosmia, patients with olfactory dysfunction due to chronic rhinosinusitis (CRS), head trauma (TR), or postinfectious olfactory dysfunction (PIOD), patients with epilepsy, schizophrenia, Parkinson’s disease, or Alzheimer’s disease, and children after chemotherapy (for review see [22]). At the same time, the olfactory sulcus (OS) has also been described as a correlate of olfactory function, although it has been studied less rigorously [21, 23].

The aim of this study was to investigate the OB and OS in a large group of patients who were well characterized in terms of olfactory function, with a specific focus on the comparison between patients with olfactory loss due to CRS, TR, or PIOD, as these are the most frequent causes of olfactory loss apart from age [2, 3]. The dataset from the Department of Otorhinolaryngology at the St. Luc Hospital in Brussels provided ideal prerequisites for such a study, as patients have been investigated for years with the same diagnostic tools and, importantly, the same MR scanner and software.

Materials and methods

Subjects

This study was conducted at the Department of Otorhinolaryngology of the Saint Luc University Hospital in Brussels. Data from 378 consecutive patients were included. All of them had received magnetic resonance imaging scans of the olfactory bulb (OB) and/or olfactory sulcus (OS) and/or brain, an otorhinolaryngological investigation including nasal endoscopy, a detailed interview with an experienced otorhinolaryngologist (PR, CH), and psychophysical tests of orthonasal olfactory function.

Psychophysical testing of olfaction performance

Psychophysical testing of olfactory function was performed with the validated “Sniffin’ Sticks” test [10]. Odors were presented to the patients in felt-tip pens. For birhinal stimulation, the pen’s tip was placed approximately 2 cm in front of both nostrils. The test comprises three subtests. First, odor thresholds (T; Threshold) are assessed for n-butanol in a series of 16 stepwise dilutions. Thresholds are determined using the single staircase technique based on a 3-alternative forced choice (3-AFC) task. Second, patients are asked to discriminate odors (D; Discrimination). For each discrimination trial, three pens are presented, two containing the same odor and the third containing the target odorant (3-AFC task); the target has to be recognized in each of 16 trials. Third, a series of 16 odors is presented to the patients, each together with a list of 4 verbal descriptors (I; Identification), and patients are asked to identify the odors using this multiple forced choice approach. The three subtest scores are summed to give the TDI score. For healthy subjects, the TDI score at the 10th percentile is 30.5 for ages from 15 to 35 years. Functional anosmia (further termed “anosmia”) is diagnosed if the TDI score is less than 16.5. With a TDI score between 16.5 and 30.5, patients are considered hyposmic.
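The cut-offs above can be written as a small classification rule. The following is a minimal sketch, not the procedure used in the study: it assumes that the TDI score is the sum of the threshold, discrimination, and identification scores (the usual convention for this test), that a score of exactly 30.5 still counts as hyposmic, and that scores above 30.5 are labelled normosmic. The function names are hypothetical, and the 30.5 limit is the value quoted for ages 15 to 35 years only.

```python
ANOSMIA_CUTOFF = 16.5    # TDI below this value: functional anosmia
NORMOSMIA_CUTOFF = 30.5  # 10th percentile quoted for healthy subjects aged 15-35


def tdi_score(threshold, discrimination, identification):
    """Composite score, assumed here to be the plain sum of the three subtests."""
    return threshold + discrimination + identification


def classify_tdi(tdi):
    """Map a TDI score to a diagnostic label using the cut-offs quoted in the text."""
    if tdi < ANOSMIA_CUTOFF:
        return "anosmia"
    # Assumption: the upper boundary (30.5) is treated as hyposmic.
    if tdi <= NORMOSMIA_CUTOFF:
        return "hyposmia"
    return "normosmia"


# Hypothetical patient: T = 5.0, D = 10, I = 11
label = classify_tdi(tdi_score(5.0, 10, 11))
```

For other age groups a different upper limit would have to be substituted for `NORMOSMIA_CUTOFF`.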

Magnetic resonance imaging (MRI) protocol and measurements

Patients were examined on a 1.5 Tesla MRI system (Signa Echospeed, GEMS, Milwaukee, WI, USA) using a standardized protocol for OB analysis (see Figs. 1, 2). The protocol included: (1) 4-mm-thick standard T2-weighted fast spin-echo (FSE) images covering the whole brain without interslice gap to rule out any organic brain disorder; (2) 4-mm-thick standard FSE fluid-attenuated inversion recovery images covering the whole brain and 4-mm-thick T2*-weighted gradient-echo images using the echo-planar imaging technique (EPI-GRE-T2*) covering the whole brain to rule out the presence of any parenchymal or meningeal post-traumatic hemosiderin deposit; and (3) 1.5-mm-thick T1- and T2-weighted fast spin-echo images without interslice gap in the coronal plane covering the anterior and middle segments of the base of the skull. The analysis followed a standardized method [16]. In summary, OB volumes were calculated by planimetric manual contouring (surface in mm²); all surfaces were added and multiplied by appropriate factors accounting for the slice thickness to obtain a volume in mm³.
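The planimetric volume estimate described above amounts to summing the contoured area of each slice multiplied by the thickness of that slice. The sketch below illustrates this arithmetic only; the function name and the example areas are hypothetical, and a default thickness of 1.5 mm without interslice gap is assumed from the coronal sequence described in the protocol.

```python
def ob_volume_mm3(areas_mm2, thickness_mm=1.5):
    """Estimate a volume (mm^3) from per-slice contoured areas (mm^2).

    thickness_mm may be a single number applied to every slice or a
    sequence with one thickness per slice.
    """
    if isinstance(thickness_mm, (int, float)):
        thicknesses = [thickness_mm] * len(areas_mm2)
    else:
        thicknesses = list(thickness_mm)
    if len(thicknesses) != len(areas_mm2):
        raise ValueError("need exactly one thickness per contoured slice")
    # Each slice contributes area x thickness; contributions are summed.
    return sum(area * t for area, t in zip(areas_mm2, thicknesses))


# Hypothetical contoured areas of one bulb on three consecutive coronal slices
example_areas = [4.0, 6.0, 5.0]
example_volume = ob_volume_mm3(example_areas)
```

Passing a list of thicknesses covers the case in which slices of different thickness are combined.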

Fig. 1

Coronal T2-weighted image of a normosmic, 60-year-old woman with normal OB volumes (depth of left-sided OS is indicated by white line; right-sided OB is indicated by 3 white arrows)

Fig. 2

Coronal T2-weighted images of a normosmic person. OBs are detected and encircled in slices 5–12. In slice 6 (the plane directly behind the eye bulbs), the OS is measured on both sides

Statistical analyses

All statistics were performed using SPSS software version 23.0 (SPSS Inc., Chicago, Ill., USA). Results were submitted to analyses of variance (ANOVA) for repeated measures. Pearson correlations were computed between volumetric measures of the OB and functional measures. The level of significance was set at 0.05.

Results

A total of 378 people were included in the study: 99 had chronic rhinosinusitis (CRS), 201 had post-traumatic olfactory loss (TR), and 78 had postinfectious olfactory loss (PIOD); 193 patients were women, 185 were men. Average age was 49 years (SD 14 years). A detailed list of the measured parameters separately for the causes of olfactory loss is presented in Table 1. Results from partial correlations controlling for age between the obtained parameters are shown in Table 2. Interestingly, depth of the right, but not the left OS correlated with odor thresholds (threshold: r 297 = 0.15, p = 0.012).
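For readers unfamiliar with partial correlations, the following sketch shows the standard first-order formula for the correlation between two variables after controlling for a third (here, age). It is an illustration only and not the SPSS procedure used in the study; the function names and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: OS depth (mm), odor threshold score, and age (years)
os_depth = [7.5, 6.8, 7.9, 6.1, 7.0, 6.4]
threshold = [6.0, 4.5, 7.25, 2.0, 5.0, 3.5]
age = [34, 52, 29, 67, 45, 60]
r_controlled = partial_r(os_depth, threshold, age)
```

When the control variable is uncorrelated with both other variables, the partial correlation equals the ordinary Pearson coefficient.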

Table 1 Descriptive statistics of the results from olfactory testing (orthonasal and retronasal) and measurements of olfactory bulb volume, separately for causes of olfactory loss [chronic rhinosinusitis (CRS), head trauma, infections of the upper respiratory tract]
Table 2 Partial correlations between obtained variables across all participants controlling for age (n = 297, r coefficients of correlations, p level of significance, n.s. not significant, italic p < 0.05, bold p < 0.01)

When comparing the OB volume between the left and right side in relation to etiology (CRS, TR, and PIOD), analyses did not show side-related differences [F(1,348) = 1.90; p = 0.17], but OB volumes differed significantly in relation to the cause of the olfactory disorder [F(2,348) = 9.07; p < 0.001]. Post hoc tests showed that OBs were smaller in TR patients than in CRS and PIOD patients (p < 0.007). In contrast to the OB, OS depth exhibited side differences [F(1,325) = 19.5; p < 0.001], with the right OS being deeper than the left OS. Again in contrast to OB measures, OS depth did not vary in relation to the cause of olfactory loss [F(2,325) = 0.15; p = 0.86].

The average OS was 6.7 mm deep on the left and 7.0 mm deep on the right side. Across all subjects, the left OS depth decreased with the subjects’ age; for the right side, only a tendency toward such a relation was seen (left OS: r 332 = −0.12, p = 0.024; right OS: r 340 = −0.09, p = 0.093).

We also compared results from the three olfactory tests between the three major etiologic groups. TR patients performed worse than CRS and PIOD patients [F(2,364) = 14.9, p < 0.001; post hoc tests: TR vs CRS/PIOD: p < 0.002]. Importantly, there was also a significant interaction between the factors “olfactory test” and “cause of olfactory loss” [F(2,364) = 3.77, p = 0.024], indicating that the groups differed in their pattern of performance across the tests. TR patients performed worst on all three tests; CRS and PIOD patients had, on average, identical scores at threshold level, while their scores differed most at odor identification, with CRS patients performing worse than PIOD patients.

To judge the quality of the technique used, we performed a number of additional measurements. First, we looked at the inter-observer reliability. These measurements were made in 50 bulbs by the authors AU and CH. The inter-observer reliability was very high, for the left OB: r 48 = 0.98, p < 0.001; and the right OB: r 50 = 0.98, p < 0.001. Average differences between measures by the two observers were 0.27 mm³ (SD 3.04) for the right OB and 0.07 mm³ (SD 2.46) for the left OB. With regard to intra-observer reliability, repeated measures in 224 OBs by one observer produced the following results: left OB: r 224 = 0.945, p < 0.001; right OB: r 224 = 0.949, p < 0.001. The average differences between measures by one observer were 0.84 mm³ (SD 4.77) for the right side and 0.61 mm³ (SD 4.36) for the left side.
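The average differences and standard deviations reported above can be obtained from paired measurements as sketched below. This is an illustration under stated assumptions rather than the analysis actually run: differences are taken as signed values (first minus second measurement), the SD is the sample standard deviation, and the function name and example volumes are hypothetical.

```python
from statistics import mean, stdev


def paired_difference_summary(first, second):
    """Return (mean, sample SD) of the signed differences first - second."""
    if len(first) != len(second):
        raise ValueError("both measurement series must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs), stdev(diffs)


# Hypothetical OB volumes (mm^3) of the same three bulbs measured twice
observer_1 = [50.0, 42.0, 61.0]
observer_2 = [49.0, 44.0, 60.0]
mean_diff, sd_diff = paired_difference_summary(observer_1, observer_2)
```

A mean difference close to zero together with a small SD indicates that neither measurement series is systematically larger than the other.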

In an additional analysis, we asked whether two observers select the same number of slices for measurements or, in other words, whether two observers use the same criteria for separating OBs from olfactory tracts (OTs). Results obtained from measurements by AU and CH in 50 OBs with a median number of 7 slices (range 3–10 slices) indicated that the number of slices chosen for measurements of the right OB was the same in 52 %, differed by 1 slice in 34 %, and differed by 2 slices in 14 %; for the left OB it was the same in 54 %, differed by 1 slice in 36 %, by 2 slices in 8 %, and by 3 slices in 2 %.
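The percentages above describe how often the two observers' slice counts differed by 0, 1, 2, or 3 slices. A minimal sketch of such a tabulation follows; the function name and the example slice counts are hypothetical and are not taken from the study data.

```python
from collections import Counter


def slice_count_agreement(counts_a, counts_b):
    """Percentage of bulbs for each absolute difference in number of slices."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both observers must have rated the same bulbs")
    diffs = Counter(abs(a - b) for a, b in zip(counts_a, counts_b))
    n = len(counts_a)
    # Keys are slice differences (0 = full agreement), values are percentages.
    return {d: 100.0 * c / n for d, c in sorted(diffs.items())}


# Hypothetical numbers of slices selected for four bulbs by two observers
slices_observer_a = [7, 6, 8, 5]
slices_observer_b = [7, 7, 6, 5]
agreement = slice_count_agreement(slices_observer_a, slices_observer_b)
```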

Discussion

Major results of the present study in patients with olfactory loss were as follows: (1) olfactory function correlated with OB volume across the various pathologies in a very large group of subjects; (2) the three functional olfactory tests exhibited a similar degree of correlation with OB volume; (3) the right, but not the left, OS correlated with olfactory function; (4) OS depth was negatively correlated with age; (5) in contrast to the OS, no side differences were found for the OB; and (6) the three different causes of olfactory loss exhibited different patterns of results for the three olfactory tests used. A limitation of the present study is that olfactory testing was performed in both nostrils simultaneously, so differences in olfactory function between the right and left nostril cannot be ruled out.

Magnetic resonance imaging-based volumetric measures of the OB have been studied in various groups of patients with different causes of olfactory loss (e.g., [18, 20, 24, 25]). Overall, these studies indicated that OB volume is a morphological indicator of olfactory function. The present study confirmed this view in a large group of subjects with mixed causes of olfactory loss: across this group of patients, OB volume correlated with psychophysical tests of olfactory function, again strongly confirming previous work. This analysis also showed that none of the three olfactory tests used (odor threshold, odor discrimination, or odor identification) exhibited a “best” correlation with OB volume. This corresponds to a recent review showing, across numerous studies from various laboratories, that OB volume is correlated with odor threshold and odor identification scores with similar strength [22]. Thus, OB volume is an indicator of general olfactory function.

At the level of the OS, but not at the level of the OB, side differences were found, with the right OS being deeper than the left OS. In addition, the depth of the right OS, but not of the left OS, correlated with odor thresholds and odor discrimination. Why was this seen for the right side? One idea relates to the potentially higher significance of the right hemisphere for olfactory function, which has been indicated by numerous studies [7, 9, 12, 28, 30].

Another question relates to the fact that this side difference was not seen for the OB. Here it has to be kept in mind that the OB is subject to relatively short-term changes following changes in olfactory function [4]. Such changes have not been observed for the OS [19]. It appears that the depth of the OS is determined more by long-term changes, e.g., aging and corresponding changes such as brain atrophy. This negative correlation with age (which has also been shown recently [5]) may also explain why the average OS depth was below the depth limit of 8 mm that has been suggested in a group of younger subjects to separate people with isolated congenital anosmia from healthy controls [1, 10]. Because mostly older people were investigated in the present study, the OS probably became shallower with age. Accordingly, normative data for the OS depth should be related to age.

As an additional outcome, the present study showed that the cause of olfactory loss is related to different patterns in scores for odor threshold, odor discrimination, and odor identification (compare [13]). TR patients had the lowest scores, which is consistent with the fact that TR patients typically have relatively low recovery rates, considering that recovery is related to residual olfactory function [8, 14]. In contrast, CRS and PIOD patients had, on average, identical scores at threshold level, while their scores differed most at odor identification, with CRS patients performing worse than PIOD patients. This seems to indicate in a clinical setting that the three olfactory tests tap into different dimensions of olfactory function, as shown previously [6, 15]. Although the current findings are presented at a group level, it still seems worthwhile to examine patients with olfactory dysfunction with different tools to obtain comprehensive information at different levels.

The current study also showed that both inter- and intra-observer reliability are very high (r = 0.98 and r = 0.94, respectively). Considering these high coefficients of correlation, it seems of minor importance that observers agree in only about 50 % of the cases on the number of slices to be included in the volumetric measure; apparently, the volumes at the proximal end of the OB (in patients with olfactory loss) are too small to produce a major difference in overall OB volume, regardless of whether they are included or not. Interestingly, the current results compare very well to previous research. Yousem et al. [27] reported inter-observer coefficients of correlation of r = 0.97 and intra-observer coefficients of correlation of r = 0.92; similar figures were reported a few years later, with coefficients of correlation between measures (inter-observer reliability) of r = 0.96 for the right OB and r = 0.98 for the left OB [17].

Changes in relation to the duration of olfactory loss were not investigated in the present study. However, previous work has shown that OB volume is negatively correlated with the duration of olfactory loss [20]. Specifically, in 23 patients there was a negative correlation between OB volumes and duration of the olfactory loss (left OB: r = −0.57, p = 0.003; right OB: r = −0.59, p = 0.002). In contrast, no such correlation has been reported for the depth of the OS.

In the present study, the olfactory sulcus was found to be deeper on the right side than on the left. This has already been observed in people with isolated congenital anosmia [1], and also in healthy, normosmic subjects [11]. However, as this is not found in all studies (e.g., [20]), it is difficult to speculate about possible causes of such lateralized differences.

To summarize, the present data suggest that the morphological assessment of the OB volume and OS depth is a relatively simple but very useful clinical measure to assess olfactory dysfunction.