Introduction

Chronic rhinosinusitis (CRS) is a highly prevalent disease that significantly impairs patients’ quality of life. It is estimated to affect 5–15 % of the European population; it is therefore important to have instruments that measure this impact so that appropriate healthcare can be provided.

Following the definition published in 2012 in the European Position Paper on Rhinosinusitis (EPOS) [1], CRS (with or without nasal polyps) is defined as inflammation of the nose and the paranasal sinuses characterized by two or more symptoms, one of which should be either nasal blockage/obstruction/congestion or nasal discharge (anterior/posterior nasal drip), ±facial pain/pressure, ±reduction or loss of smell; and either endoscopic signs of polyps and/or mucopurulent discharge primarily from the middle meatus and/or edema/mucosal obstruction primarily in the middle meatus and/or CT changes showing mucosal changes within the ostiomeatal complex and/or sinuses. Symptoms must have been present for more than 12 weeks.

Among the validated and published instruments, either the 22-item sino-nasal outcome test (SNOT-22) or the 31-item rhinosinusitis outcome measure (RSOM-31) is recommended to assess the impact of CRS on quality of life in adults. The 20-item sino-nasal outcome test (SNOT-20), which is a modification of RSOM-31, has also been used previously. SNOT-22 is a modification of SNOT-20 that adds two specific rhinological symptoms: (a) nasal obstruction and (b) loss of sense of taste and smell [2].

These questionnaires, besides being validated to evaluate the impact of the disease on quality of life, also allow us to follow the evolution of the disease over time and its variation after medical or surgical interventions; they are therefore needed to estimate patients’ response to the treatments applied. However, these tools were developed in English for use in English-speaking countries. To apply them in another cultural and linguistic environment, they must be adapted to the setting in which they are intended to be used.

Even though the SNOT-22 is recommended for use in CRS, it is not available in Spanish. The purpose of this work was to adapt the questionnaire into Spanish and to validate the Spanish version of the SNOT-22 for use with Spanish-speaking patients.

Materials and methods

We performed a prospective study with patients diagnosed with CRS at the Otolaryngology clinic in our hospital. The study was previously approved by the Hospital’s research ethics committee.

To perform our study, we followed the guidelines proposed for cross-cultural adaptation of health-related quality of life measures [3, 4], including: (a) translation into Spanish from the original English version, (b) back translation by bilingual translators, (c) review by a translation and retranslation committee, and (d) a pilot study to test comprehension and suitability of the questionnaire. After proper translation of the questionnaire had been verified, the study was performed in CRS patients, with healthy individuals as controls.

The general inclusion criteria required participants to be over 18 years old, to have Spanish as their native language, to understand the purpose of the study and to be available to repeat the questionnaire after approximately 3 weeks. Individuals included in the case group were patients who met the clinical criteria for CRS according to EPOS 2012; additionally, patients were classified as having CRS with or without nasal polyps after clinical examination. A control group of healthy volunteers, who met all the inclusion criteria but had no nasal pathology, was also selected.

The study included 119 individuals divided into two groups: 60 cases, patients who met the diagnostic criteria for CRS according to EPOS 2012; and 59 controls, healthy adults who reported no sino-nasal disease.

In total, there were 51 women and 68 men, with a mean age of 54 years among cases and 41 years among controls. Among the cases, 40 individuals (66 %) were diagnosed with CRS with nasal polyps and the remaining 20 (33 %) with CRS without nasal polyps; in addition, 20 of them received medical treatment, while 40 underwent surgery, in all cases following the EPOS 2012 guidelines.

All individuals (60 patients diagnosed with CRS and 59 controls) completed the questionnaire in full and repeated it approximately 3 weeks later; for patients, this was 3 weeks after treatment.

To assess sensitivity to change, patients completed the SNOT-22 questionnaire in the office at the first consultation, when they were diagnosed with CRS, and again 3 weeks after the surgical or medical treatment recommended by EPOS 2012. The overall subjective sensation of change 3 weeks after surgical or medical treatment (external criterion) was also collected using a 5-point Likert scale: much worse, worse, unvarying, better or much better.

The reliability of the test was analyzed through its internal consistency and the reproducibility of responses on repeating the test. Internal consistency refers to the homogeneity of the questions comprising the questionnaire and was measured using Cronbach’s alpha coefficient; a value of 0.7 is regarded as the minimum acceptable [4]. Reproducibility was assessed by repeating the test at 3 weeks and evaluating the agreement of responses between the two time points with the Kappa coefficient and the intraclass correlation coefficient (ICC), both computed for the overall result and for each item, with values above 0.7 taken to indicate good agreement [5, 6].
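As an illustration of the internal consistency measure, the following minimal sketch computes Cronbach’s alpha from a respondents-by-items table using the standard formula (number of items, sum of item variances, variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study, which used STATA.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 respondents x 4 items scored 0-5 (not study data).
toy = [
    [0, 1, 0, 1],
    [2, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(toy)
print(f"Cronbach's alpha = {alpha:.2f}")
```

With real data, each row would hold one participant’s 22 item scores, and the result would be compared against the 0.7 threshold mentioned above.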

Validity indicates the ability of the test to detect differences between known groups. This difference between cases and controls was analyzed using the Mann–Whitney U test.
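The known-groups comparison can be sketched as follows. This is a minimal illustration of the Mann–Whitney U test that assumes the normal approximation with a tie correction and no continuity correction; the source does not state which variant STATA applied. The function names and the toy scores are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u1, p


# Hypothetical total scores (not study data).
cases = [47, 52, 30, 66, 41, 58]
controls = [2, 0, 5, 9, 1, 3]
u, p = mann_whitney_u(cases, controls)
print(f"U = {u}, p = {p:.4f}")
```

For small samples an exact test would normally be preferred over the normal approximation used in this sketch.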

Responsiveness is the capacity of the test to detect clinical changes over time. To estimate responsiveness, we used the Wilcoxon test and sensitivity to change (effect size), calculated as the ratio between the mean change in scores and the standard deviation of the initial scores, within each of the groups formed by the external criterion (Likert scale). An effect size between 0.2 and 0.5 was considered a slight change, between 0.5 and 0.8 a moderate change, and above 0.8 an important change.
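The effect size defined above can be written as a short calculation. The sketch below assumes that the change is taken as the baseline score minus the follow-up score, so that a positive value means improvement (lower SNOT-22 scores indicate fewer symptoms), and that the sample standard deviation is used. The handling of the exact boundary values and the label "negligible" for values below 0.2 are assumptions, since the text does not specify them. All names and data are hypothetical.

```python
from statistics import mean, stdev


def effect_size(before, after):
    """Mean change in score divided by the SD of the baseline scores."""
    changes = [b - a for b, a in zip(before, after)]
    return mean(changes) / stdev(before)


def interpret(es):
    """Map an effect size to the categories used in the text."""
    size = abs(es)
    if size > 0.8:
        return "important"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "slight"
    return "negligible"  # label assumed; the text gives no name below 0.2


# Hypothetical scores for one Likert group (not study data).
before = [40, 50, 60, 30, 70]
after = [20, 25, 35, 15, 40]
es = effect_size(before, after)
print(f"effect size = {es:.2f} ({interpret(es)})")
```

In the study design, this calculation would be repeated separately for each group defined by the external criterion (unvarying, better, much better).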

STATA v.13.0 was used for statistical analysis. Statistical significance was set at p < 0.05.

Results

After translation, back translation, revision and cultural adaptation, a final version of the Spanish SNOT-22 was obtained (Fig. 1).

Fig. 1 Translation of the SNOT-22 questionnaire into Spanish

The mean score for each item, for both patients and controls, is shown in Table 1. In the first assessment, the mean SNOT-22 score in the case group was 47.18, with a standard deviation of 20.99 and a median of 47. In the control group, the mean score was 4.49, with a standard deviation of 7.35 and a median of 2.

Table 1 SNOT-22 scores for cases and controls

In cases, Cronbach’s alpha was 0.91 both before and after treatment; in controls, it was 0.90 in their first assessment and 0.88 at 3 weeks.

The Kappa coefficient was calculated for each item, with an average value of 0.69, reflecting good agreement. The ICC was also calculated, giving 0.87 for the overall score and an average of 0.71 across items.

The validity of the questionnaire was assessed with the Mann–Whitney U test, comparing the scores of cases and controls. The median (percentiles 25 and 75) score was 47 (6; 90) for cases and 2 (0; 39) for controls, a highly significant difference (p < 0.0001) (Table 1).

Regarding responsiveness, differences in median score were found among treated patients (47 (6; 90) before and 13.5 (0; 65) after treatment; Wilcoxon, p < 0.001), but not among controls (same score before and after) (Table 1). In our study, the difference before and after medical or surgical treatment was statistically significant (p < 0.0001).

The effect size, evaluated in each group defined by the Likert scale, was 0.14 in treated patients who described their status at 3 weeks as unvarying, 1.03 in those who were better and 1.89 in those who were much better. All controls described their status as unvarying, and their effect size was 0.05. No individual was worse or much worse after treatment. The effect sizes for the diagnostic and therapeutic groups are presented in Table 2. Surgical treatment appears to have a greater effect.

Table 2 Effect size for diagnostic and therapeutic groups

Discussion

The development of health-related quality of life measures allows us to better understand the impact of health interventions on our patients. Measuring the impact of the disease, and of the therapeutic measures applied to it, from the patient’s perspective probably reflects the situation better than an evaluation made only from the physician’s point of view.

Measuring instruments are usually scales or questionnaires whose items can be quantified and combined into an overall score. Moreover, they let patients assess how the disease affects different aspects of their lives.

Although generic questionnaires can be used to evaluate different pathological situations, therapeutic interventions, or even costs on a certain group of patients, they are usually not able to detect the impact on quality of life due to a specific disease.

Currently, there are several specific instruments to assess the impact of rhinosinusitis on patients’ quality of life. Based on the published literature, the EPOS 2012 consensus recommends the following tools to measure outcomes: SNOT-22 or RSOM-31 in adults with chronic rhinosinusitis, SNOT-16 in adults with acute rhinosinusitis, SN-5 in the pediatric population with chronic rhinosinusitis and S-5 in the pediatric population with acute rhinosinusitis.

Measures of health-related quality of life are often developed in English, for use in English-speaking populations. However, similar tools are also needed for groups who speak other languages. There are two ways to achieve this: (a) creating a new questionnaire from scratch in the language in which it is intended to be used or (b) taking a measuring instrument previously developed in another language and performing a cross-cultural adaptation. The first option is a slow and expensive process. The second option cannot be achieved by simply translating the questionnaire, because the perception of quality of life and the way health problems are expressed vary among cultures. Therefore, a systematic process of translation and cross-cultural adaptation of health-related quality of life measures is required.

SNOT-22 was developed and validated in English [2] and has already been translated and adapted to other languages such as Danish [7], Czech [8], Swedish [9], Chinese [10], Lithuanian [11], Portuguese [12], Persian [13] and Greek [14]; nevertheless, it has not been adapted and validated for Spanish-speaking patients until now.

For the translation and cross-cultural adaptation of SNOT-22 into Spanish, we followed the formal procedure proposed and published by Guillemin et al. [3]. This comprised translation, back translation by bilingual translators, review of the different versions by a translation and retranslation committee, and a pilot study to test comprehension and suitability of the questionnaire. Once we had obtained an apparently equivalent instrument, its characteristics were evaluated in the present prospective study, to assess its internal consistency, reliability, validity and sensitivity to change.

Internal consistency can be measured with Cronbach’s alpha index, which estimates whether a set of items measures the same construct or theoretical dimension; the closer the index is to 1, the greater the internal consistency of the questionnaire items. A value above 0.7 is considered acceptable, above 0.8 good and above 0.9 excellent [4, 15]. In our study, it was 0.91 and remained the same after treatment, results similar to those of other cultural adaptations of the SNOT-22 questionnaire [7, 12]; therefore, we can consider the internal consistency of the questionnaire to be excellent.

The test–retest reliability of the questionnaire was assessed by means of the intraclass correlation coefficient (ICC) and the Kappa coefficient. Although other published validations have used Pearson correlation, the ICC is the recommended index for measuring the reliability of measurements of quantitative variables [4, 5]. It assesses the agreement between two measurements on the same individual, as well as the general agreement between two different observations. The ICC measures the proportion of total variability that is due to differences between patients. Its values range between 0 and 1, with 0 indicating no agreement and 1 indicating absolute consistency or reliability of the data. Although the threshold for satisfactory reliability is arbitrary and depends on the intended use, values above 0.4 are generally considered acceptable and values above 0.75 excellent [5]. In our case, we obtained an ICC of 0.87, reflecting excellent reliability.
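To make the idea of "proportion of total variability due to patients" concrete, the following minimal sketch computes a one-way random-effects ICC, often written ICC(1,1), from test and retest scores. The text does not state which ICC form was used, so this choice is an assumption, and the function name and data are hypothetical. Note that this sample estimate can fall slightly below 0 when within-subject variability exceeds between-subject variability.

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    scores: list of per-subject lists, each with k repeated measurements.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(scores, subject_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest total scores for 5 subjects (not study data).
pairs = [[47, 45], [12, 15], [80, 78], [30, 33], [5, 4]]
icc = icc_oneway(pairs)
print(f"ICC = {icc:.3f}")
```

Other ICC forms (for example two-way models) treat systematic differences between the two occasions differently and can give different values on the same data.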

Another aspect of reliability is the agreement between two different time points for the same observer (intra-observer agreement). For categorical variables, the Kappa coefficient is the recommended index. A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance. Considering the scales accepted for interpreting this coefficient, and the similar or slightly lower values reported in other cultural adaptations of the SNOT-22 [7, 14], the value of 0.69 obtained in our study represents substantial agreement.
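As an illustration of item-level agreement between the two time points, the sketch below computes an unweighted Cohen’s kappa for one item. Whether the study used an unweighted or a weighted kappa is not stated, so the unweighted form is an assumption; the function name and responses are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two sets of categorical ratings."""
    n = len(first)
    # Observed proportion of identical answers at the two time points.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical answers to one item (0-5) at test and retest (not study data).
test_answers = [0, 1, 2, 3, 4, 5, 2, 3, 1, 0]
retest_answers = [0, 1, 2, 3, 4, 5, 3, 3, 1, 1]
kappa = cohen_kappa(test_answers, retest_answers)
print(f"kappa = {kappa:.2f}")
```

Because the item responses are ordinal, a weighted kappa that penalizes near misses less than distant ones would be a reasonable alternative.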

To check the validity of the questionnaire, the ability to reflect differences between known groups has been measured. We evaluated the difference between the scores from patients diagnosed with CRS and the control group using the Mann–Whitney U test. The result was highly significant (p < 0.0001), and indicates the validity of the Spanish SNOT-22 to detect the difference between the two groups.

Responsiveness is the ability of the questionnaire to detect clinical changes. It can be evaluated by comparing SNOT-22 scores before and after treatment. We used the Wilcoxon test, which is the non-parametric alternative to the paired t test. In our study, the difference before and after medical or surgical treatment was statistically significant (p < 0.0001), which is similar to other authors’ results and reflects good responsiveness of the Spanish version of the SNOT-22 questionnaire. To quantify the responsiveness, we calculated the effect size.
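The paired before and after comparison can be sketched as a Wilcoxon signed-rank test. This minimal version assumes the normal approximation with a tie correction, drops zero differences, and applies no continuity correction; these choices are assumptions and may differ from what STATA applied in the study. The function name and scores are hypothetical.

```python
import math


def wilcoxon_signed_rank(before, after):
    """Return (W+, two-sided p) via the normal approximation."""
    # Zero differences carry no sign information and are dropped.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    # Rank absolute differences, giving tied values the mean of their ranks.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    # Sum of ranks belonging to positive differences (score decreased).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48)
    z = (w_plus - mu) / sigma
    return w_plus, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical paired total scores for 10 patients (not study data).
pre = [47, 60, 35, 52, 70, 41, 66, 58, 49, 30]
post = [13, 20, 30, 15, 25, 38, 22, 18, 10, 31]
w, p_value = wilcoxon_signed_rank(pre, post)
print(f"W+ = {w}, p = {p_value:.4f}")
```

A significant result only shows that scores changed; the size of the change is what the effect size is meant to quantify.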

We would expect a near-zero effect size in the unvarying group of the Likert scale, and a larger effect size in the groups that were better or much better, and this is what we found. The effect size in those subjects who perceived changes in their quality of life according to the external criterion (Likert scale) was greater than 0.8. This is an important change, that is, a quantitatively large difference before and after treatment (Table 2), with a great improvement in quality of life, as has been reported in other cross-cultural adaptations such as the Greek [14], Lithuanian [11] and Portuguese [12] versions.

Conclusion

The Spanish version of the SNOT-22 has the internal consistency, reliability, reproducibility, validity and responsiveness required of a valid instrument for use in clinical practice and for assessing quality of life in Spanish-speaking patients with chronic rhinosinusitis.