Introduction

The sense of smell broadly affects quality of life: it determines the palatability of foods and beverages and the enjoyment of flowers and perfumes, evokes memories, and warns of dangers such as spoiled food, fire, and gas leaks (Murphy 1985; Landis et al. 2005). Olfactory disorders may be caused by a number of factors, such as sinus or nasal disease, head trauma, toxic chemical exposure, certain drugs, neurodegenerative disease, and aging (Murphy 1985; Landis et al. 2005; Doty 2001). Clinical evaluation of olfactory performance is an essential step in the diagnosis and treatment of olfactory dysfunction. Psychophysical assessment methods, which comprise odor detection and recognition threshold, odor detection, odor identification, odor discrimination, and odor memory tests, provide an effective and practical means of rapidly assessing olfactory function (Harper et al. 1968; Cain et al. 1992; Doty 1995; Eibenstein et al. 2005).

The best-known olfactory test, the University of Pennsylvania smell identification test (UPSIT), is widely used as a diagnostic tool in research and clinical settings in the USA (Doty et al. 1984a, b). The UPSIT focuses on the comparative ability of individuals to identify various odors. It is effective for detecting malingering as well as some olfactory disorders (Doty 1995). Because the sense of smell depends strongly on the social and cultural lifestyle of a population, familiarity with the odors in a smell test is an important consideration. Sniffin’ Sticks is another olfactory assessment test, adapted for European populations, that combines odor identification, odor discrimination, and olfactory threshold tests (Kobal et al. 1996; Hummel et al. 1997). Even so, some odors in Sniffin’ Sticks are unfamiliar to people in some European countries (Cătană et al. 2012). In recent years, many efforts have been made to standardize olfactory tests according to cultural features (Nordin et al. 2002; Thomas-Danguin et al. 2003; Cardesín et al. 2006; Saito et al. 2006; Cho et al. 2009; Silveira-Moriyama et al. 2010; Cătană et al. 2012; Oniz et al. 2013). In other words, there is no universal gold standard test for the assessment of olfactory function.

The motivation for this study was the absence of an accepted smell test in Iran. Iranian otolaryngologists have difficulty determining the degree of olfactory disorders. Although most otolaryngologists use handmade traditional bottles filled with different fragrances, some use the original version of the UPSIT or Sniffin’ Sticks. However, neither the original tests nor the traditional bottles have been validated for the Iranian population.

In this study, we aimed to design a reliable and valid test for the clinical assessment of olfactory function, based on odors familiar to the Iranian population and named the Iran smell identification test (Iran-SIT). Here, we report the results of the pilot, main, and test-retest studies.

Methods

Determining Familiar Odors for Iranian Population

Two well-known tests, the University of Pennsylvania smell identification test (UPSIT) and Sniffin’ Sticks, are widely used as the main references for smell identification tests developed in different countries (Doty et al. 1984a, b; Kobal et al. 1996; Hummel et al. 1997). In this study, we took the 40 odors of the UPSIT as the basis of our test. Iran is a large country with many ethnicities, so the odors must be familiar to all Iranians. Thousands of students from all around the country live in Tehran (mostly in dormitories), and they represent the cultural diversity of Iran well. Ninety students were asked to list the UPSIT odors that were familiar to them. To aid comprehension, we prepared a Farsi translation of the original version of the UPSIT. To replace unfamiliar odors, the students were also asked to propose odors that people from different regions of Iran commonly encounter in daily life. Taking into account the odor categories proposed by Castro et al. (2013) and the feasibility of supplying the odorants, we designed the first version of Iran-SIT with 40 items.

Preparing the First Version of Iran-SIT

To prepare the first version of Iran-SIT, natural or synthetic odorants were obtained, and fragrance microcapsules were produced when they were not commercially available. To make the scratch-and-sniff stickers, the microcapsules were then mixed with varnish ink and printed on sticker paper using a silk-screen printing machine. Finally, a questionnaire containing these stickers was designed in a four-alternative forced-choice (4-AFC) format.

Pilot Study

Before the main experiment, a pilot study using the first version of Iran-SIT was carried out with 43 subjects (23 female and 20 male) aged 20 to 40 years. The aim of the pilot study was to reveal deficiencies in the procedure and to select the best odors, and their alternatives, from the 40 items. The subjects were asked to scratch the stickers, sniff them, and choose one of the four alternatives. The data from this experiment were analyzed; the 16 odors with the lowest identification scores were omitted, and some alternatives were switched.

Main Study

Five hundred seventy-seven healthy subjects (mean age 32.46, standard error of the mean (SEM) 0.681) from different regions of Iran, 298 female (mean age 32.76, SEM 0.97) and 279 male (mean age 32.13, SEM 0.96), aged 6 to 68 years, were selected to participate in the main study. They were classified into thirteen 5-year age groups. The number of people in each age group was based on the demographic pattern of Iran (Asia-Pacific Population Journal 2006). The number of subjects (female, male, and total) in each age group is shown in Table 1. All subjects gave their informed consent, and the project was approved by the Ethics Committee for research of Tehran University of Medical Sciences.

Table 1 The number of subjects (female, male, and total) of each age group

To select subjects qualified to participate in the main study, we defined inclusion and exclusion criteria. All subjects underwent a physical examination that checked for conditions such as deviated nasal septum (DNS), tight nasal valve, dried nasal mucus, and nasal adhesion (Snow et al. 1991). They were also interviewed about their past medical history, which was evaluated using the questionnaire developed at the University of Pennsylvania Smell and Taste Center (Deems et al. 1991). The questions covered sinus or nasal disease, history of pre- and post-operative radiotherapy and/or chemotherapy, history of head trauma, toxic chemical exposure, serious upper respiratory problems, history of head and neck surgery, nasal allergies, and family history of smell problems. Based on the results of the physical examination and medical history, we selected subjects to participate in the main study. People who smoked (Doty et al. 1984a, b; Frye et al. 1990; Ishimaru and Fujii 2007) and/or were taking medicines that affect olfaction (Mair and Harrison 1991; Henkin 1994) were also excluded. In total, 577 healthy subjects were selected to participate in the main study.

After the 16 problematic odors were omitted, the main experiment was carried out using the modified 24-item test, the Iran smell identification test (Iran-SIT). It was designed as a four-alternative test in a forced-choice paradigm. The procedure was fully explained to all 577 subjects as follows. The subjects were asked to scratch each sticker with a pencil tip to release the odor. They were encouraged to sniff the scratched sticker immediately and choose one of the four alternatives. If they reported that the odor they smelled was not among the alternatives, they were asked to mark the answer closest to their experience. The interval between successive sniffs was 30 s. In some cases, the examiner helped administer the test to subjects who could not read or who had impaired eyesight. The Iran-SIT score was defined as the number of items answered correctly.
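The scoring rule described above can be sketched in a few lines of Python. This is an illustration only: the function name `iran_sit_score`, the 0-3 encoding of the four alternatives, and the answer key are hypothetical and are not taken from the actual test booklet.

```python
def iran_sit_score(responses, answer_key):
    """Count the items answered correctly in a forced-choice test.

    Both arguments are sequences of alternative indices (0-3), one per item.
    Because the test is forced-choice, a missing answer (None) is rejected.
    """
    if len(responses) != len(answer_key):
        raise ValueError("one response is required for every item")
    if any(r is None for r in responses):
        raise ValueError("forced-choice format: every item must be answered")
    return sum(r == k for r, k in zip(responses, answer_key))


# Hypothetical 24-item answer key (NOT the real Iran-SIT key).
answer_key = [0, 1, 2, 3] * 6

# A hypothetical subject who answers two items incorrectly.
responses = list(answer_key)
responses[0] = 3  # correct alternative is 0
responses[5] = 0  # correct alternative is 1

score = iran_sit_score(responses, answer_key)
print(score)
```

Rejecting unanswered items mirrors the instruction that subjects mark the closest alternative even when they believe the odor is not listed.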

Test-Retest Study

The reliability and stability of Iran-SIT over time were assessed by administering the Iran-SIT again 5 months after the main study. Ninety-six of the 577 subjects (44 female and 52 male), aged 10 to 60 years and with a range of identification scores, were selected to participate in the retest study. The retest was administered using the same procedure as the main study.

Results and Discussion

All analyses were performed with Stata software version 12, and a p value of less than 0.05 was considered statistically significant.

Determining Familiar Odors for Iranian Population

The aim of this study was to develop a standardized olfactory test for the Iranian population that takes their cultural background into account. To design a reliable and valid test, it was important to choose familiar odors that cover the different smell categories. Because 21 of the 40 UPSIT odors were mostly unfamiliar to the subjects, the first version of Iran-SIT was produced by replacing the UPSIT odors that were unfamiliar to the Iranian population. We tried to preserve the main odor categories in every replacement; for instance, jasmine and tuberose were substituted for lilac and clove, cake for gingerbread, and vanilla for licorice. We also had to change some of the alternatives offered for the odors. The resulting 40-item list is shown in Table 2.

Table 2 Forty odors and alternatives of each odor used in UPSIT and the first version of Iran-SIT

Pilot Study

To assess the quality of the odors and to choose suitable alternatives, we analyzed the data obtained from the 43 subjects in the pilot study. The identification percentage (with 95 % confidence interval) for each odor of the first version of Iran-SIT is presented in Table 3. Sixteen odors had identification percentages below 70 %. Most of these were difficult to identify correctly because of manufacturing difficulties, so we omitted them from the main experiment. We also switched some alternatives that had misled participants.
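As an illustration of how an identification percentage and its 95 % confidence interval might be computed for one odor, the sketch below uses the Wilson score interval. The text does not state which interval method was used, so the choice of Wilson is an assumption, and the count of 30 correct answers out of 43 is a made-up example, not a value from Table 3.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95 %)."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical odor: 30 of the 43 pilot subjects identify it correctly.
n_subjects = 43
n_correct = 30
rate = n_correct / n_subjects
low, high = wilson_interval(n_correct, n_subjects)

# The 70 % cutoff is the one reported for the pilot study.
below_cutoff = rate < 0.70

print(f"{100 * rate:.1f} % (95 % CI {100 * low:.1f} to {100 * high:.1f})")
print("omit from main study" if below_cutoff else "keep")
```

With only 43 subjects the interval is wide, which is one reason an odor close to the cutoff may deserve a second look before it is dropped or kept.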

Table 3 Identification rate of each odor used in the first version of Iran-SIT

Main Study

All 24 odors used in the final version of Iran-SIT had identification percentages above 70 % (Table 4). A photograph of the final version of Iran-SIT, comprising 24 odors, is shown in Fig. 1. Humans can smell thousands of aromas. Castro et al. (2013) used a computational technique to reduce odors to their most basic qualities, classifying them into 10 basic categories: floral, fruity (non-citrus), woody, chemical, minty, sweet, pungent, popcorn, citrus, and decayed. The 24 odors used in the final version of Iran-SIT fall into eight of these categories (Table 5); none belongs to the chemical or decayed categories. All odors are assumed to stimulate the first cranial nerve (olfactory nerve); however, a few subjects reported that some odors, such as garlic and mint, caused mild irritation in the nose.

Table 4 Identification rate of each odor used in the final version of Iran-SIT

Fig. 1 Iran smell identification test

Table 5 Classification of 24 odors used in the final version of Iran-SIT

The mean identification scores were 20.06 for all subjects, 20.22 for females, and 19.87 for males. To examine the effect of gender on olfactory function, an unpaired t test was conducted on the identification scores. The t test showed no significant difference between females and males (t(575) = 1.15). To examine the effect of aging on olfactory function, we conducted a one-way analysis of variance (ANOVA) on identification score with age group as the between-subject factor. The ANOVA revealed a significant main effect of age group (F(12, 564) = 58.24). Multiple comparisons by Tukey’s method showed significant differences between some pairs of age groups, as shown in Table 6. It is well known that human chemosensory function declines with aging (Ship and Weiffenbach 1993; Doty et al. 1984a, b; De Jong et al. 1999; Hummel et al. 2003). Murphy et al. (2002) reported that 24.5 % of people over 53 years of age and 62.5 % of those aged 80–97 years suffered from olfactory impairment. In our study, a decrease in olfactory function was observed above 50 years of age. Adults aged 20–50 years had significantly higher olfactory function than children or elderly people. Children, especially those under 10 years of age, obtained markedly low scores on Iran-SIT owing to their limited experience with odors.

Table 6 Identification score of each age group

Test-Retest Study

The test and retest scores are plotted as a bubble chart in Fig. 2. To assess the reliability and stability of the final version of Iran-SIT over time, we calculated Pearson’s correlation coefficient (r = 0.93) and Spearman’s rank correlation coefficient (ρ = 0.89) between the test and retest identification scores. Tests of no correlation showed that both coefficients were significant (p < 0.001). In previous studies, the Pearson’s correlation coefficients between test and retest identification scores were 0.949 for UPSIT with an interval of 2 weeks (Doty et al. 1985), 0.918 for UPSIT with an interval of 6 months (Doty et al. 1984a, b), 0.71 for the cross-cultural version of UPSIT (CC-SIT) (Doty et al. 1996), and 0.73 for Sniffin’ Sticks (Kobal et al. 1996). Compared with these previously reported olfactory identification tests, the Pearson’s correlation coefficient of the final version of Iran-SIT is acceptable. We therefore consider the final version of Iran-SIT to be reliable and stable over time.
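For readers who want to see how the two coefficients differ, the sketch below computes Pearson’s r and Spearman’s ρ from paired test and retest scores using only the Python standard library. The analysis in this study was run in Stata; the code, the function names, and the eight score pairs are illustrative assumptions and do not reproduce the study data.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired identification scores (0-24) for eight subjects.
test_scores = [22, 20, 15, 9, 24, 18, 21, 12]
retest_scores = [21, 20, 16, 10, 23, 17, 22, 13]

r = pearson(test_scores, retest_scores)
rho = spearman(test_scores, retest_scores)
print(f"r = {r:.3f}, rho = {rho:.3f}")
```

Pearson’s r measures linear agreement of the raw scores, whereas Spearman’s ρ depends only on their rank order, so reporting both guards against a result driven by a few extreme scores.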

Fig. 2 Bubble chart of the relation between the test and retest studies with a 5-month interval

Diagnostic Criterion of Olfactory Disorder Using the Final Version of Iran-SIT

Based on the identification scores of adults aged 20–50 years in the main study, we determined diagnostic criteria for olfactory disorder using the final version of Iran-SIT (Table 7). Ninety-five percent of these subjects correctly identified more than 18 odors, so subjects with identification scores from 19 to 24 were considered to have normosmia, with an error rate of 5 %. In a four-alternative forced-choice test with 24 odors, the number of correct answers given by pure guessing follows a binomial distribution with a success probability of 0.25 per item, and the cumulative probability of correctly identifying fewer than 10 odors at chance level is 94.5 %. Hence, subjects with identification scores from 0 to 9 were considered to have anosmia, with an error rate of 5.5 %. Subjects with identification scores from 10 to 18 were subdivided into two levels: severe microsmia (scores from 10 to 13) and mild microsmia (scores from 14 to 18). This heuristic classification has been applied to UPSIT data obtained from approximately 4000 subjects (Doty 1995).
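The chance-level figure and the score bands above can be checked with a short calculation. The sketch below assumes that a subject with no olfactory function guesses independently on each of the 24 items with a 1-in-4 chance of being correct; the function names `chance_cdf` and `classify` are illustrative and not part of the test.

```python
from math import comb


def chance_cdf(k, n=24, p=0.25):
    """P(X <= k) for X ~ Binomial(n, p): at most k correct by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def classify(score):
    """Map an Iran-SIT score (0-24) to the diagnostic level described above."""
    if not 0 <= score <= 24:
        raise ValueError("score must be between 0 and 24")
    if score <= 9:
        return "anosmia"
    if score <= 13:
        return "severe microsmia"
    if score <= 18:
        return "mild microsmia"
    return "normosmia"


p_at_most_9 = chance_cdf(9)
print(f"P(score <= 9 by guessing)  = {100 * p_at_most_9:.1f} %")
print(f"P(score >= 10 by guessing) = {100 * (1 - p_at_most_9):.1f} %")
print(classify(12))
```

Under these assumptions the first probability comes out at about 94.5 %, so roughly 5.5 % of purely guessing subjects would score 10 or more and fall outside the anosmia band, which corresponds to the stated error rate.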

Table 7 Diagnostic criterion of olfactory disorder using the final version of Iran-SIT

Conclusion

In the present study, we developed a standardized 24-item smell identification test, culturally adapted to assess the olfactory function of the Iranian population. Iran-SIT is adequate for classifying adult patients into four levels: normal olfactory function (normosmia), mildly decreased olfactory function (mild microsmia), severely decreased olfactory function (severe microsmia), and loss of olfactory function (anosmia).