Introduction

Sleep quality is a clinically relevant dimension of sleep-wake function. It encompasses several parameters, such as sleep onset latency, sleep duration, sleep efficiency and depth, and number of awakenings [1]. Poor sleep quality is associated with increased health care costs, absenteeism from work, increased risk for psychiatric disorders and, more generally, a negative impact on health measures [2]. A possible secondary effect of impoverished sleep quality is diurnal sleepiness which, in turn, can increase the risk of motor vehicle accidents and related mortality, worsen psychophysical health, reduce memory and learning abilities, and impair school and academic outcomes [3–5].

The gold standard for sleep quality assessment remains polysomnography (PSG) and/or electroencephalographic (EEG) spectral content analysis, even though behavioural measures (e.g., actigraphy) can be used effectively in specific cases. Unfortunately, both kinds of measures present practical limits in large-sample studies as well as in field studies [1]. To overcome these limitations and gather preliminary information on possible sleep disorders in both normal and pathological individuals, self-report retrospective and prospective questionnaires have been developed.

The 19-item Pittsburgh Sleep Quality Index (PSQI) is probably the most commonly used retrospective self-report questionnaire that measures sleep quality over the previous month [6–8]. The questionnaire assesses seven clinically derived domains of sleep difficulties (sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleeping medications, and daytime dysfunction). Taken together, these sleep domains are scored as a single factor of global sleep quality. Usually, a global score higher than 5 is considered an indicator of relevant sleep disturbances in at least two components or of moderate difficulties in more than three components [6]. More recently, an overlap between some components has been observed and three distinct factors have been extracted [9]: sleep efficiency (including sleep duration and habitual sleep efficiency), perceived sleep quality (including subjective sleep quality, sleep latency and use of sleeping medication), and daily disturbances (including sleep disturbances and daytime dysfunction).

Several psychometric aspects of the PSQI have been investigated and reported, such as internal consistency [6, 7], reliability and construct validity [7, 8], and stability over 1 year among a population-based sample [10]. Moreover, it has been successfully used in both normal [11] and pathological aging [12].

The concurrent validity of PSQI and PSG has been repeatedly assessed [6, 7]. All the studies reported a low concurrent validity, confirming the well-known reduced correlation between objective and subjective measures of sleep, usually found in both normal and pathological individuals [1].

This extensive analysis of the statistical and psychometric aspects of the PSQI has developed together with an effort to adapt it to different cultures and languages [13–16]. Despite its widespread use in Italy in both clinical and healthy populations [17–20], no validation has been performed so far. In some cases, the authors took the first step of a validation, translating the PSQI from English into Italian and then back-translating it for comparison with the original version [17, 20].

The aim of this study is to provide a contribution to the Italian validation of the PSQI by comparing questionnaire scores and PSG parameters in five different groups of individuals [young healthy controls (HC), healthy elderly (HE), patients with dementia (DEM), obstructive sleep apnoea syndrome (OSAS) patients, depressed patients (DEP)]. Furthermore, the PSG measures were correlated to the corresponding dimensions provided by the PSQI. Hence, the study is mainly aimed at evaluating the discriminative and concurrent validity of the questionnaire in its Italian version.

Methods

Participants

Fifty individuals participated in the study, 10 for each group: young HC, HE, DEM, OSAS patients, DEP. The demographic characteristics of each group are reported in Table 1.

Table 1 Characteristics of groups participating in the study

Based on a clinical interview, we selected the healthy young and aged controls ascertaining that they did not suffer from neurological, psychiatric or any other serious medical condition. Moreover, the aged group was tested for the global cognitive status showing a normal score at Mini-Mental State Examination (MMSE; mean 26.7 ± 1.34, range 25–29). Further requirements for inclusion were: normal sleep duration and schedule, no daytime nap habits, no excessive daytime sleepiness and no other sleep disorder, as assessed by a 1-week sleep log.

The other groups were composed of patients treated for specific and isolated diseases (dementia, OSAS or depression). The group with Alzheimer’s dementia was diagnosed following the international NINCDS-ADRDA guidelines [21] and had a mean MMSE score of 21.8 (±1.93; range 19–25).

Sleep apnoea was defined as a complete or almost complete cessation of airflow, indicated by a reduction to 25 % or less of baseline amplitude for 10 s or more, and hypopnoea was defined as a clear decrease in airflow to 50 % of baseline for at least 10 s. Only apnoeas and hypopnoeas that were followed by a 4 % or greater decrease in oxygen saturation were counted. OSAS severity was classified by the frequency of apnoeic and hypopnoeic events per hour of sleep [apnoea–hypopnoea index (AHI)] [22]. The OSAS group in this study had a diagnosis of severe obstructive syndrome, thus presenting an AHI ≥30/h. None of them had a history of hypertension or other cerebrovascular disturbances.
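As an illustration of the index described above, the following minimal sketch computes the AHI as the number of qualifying events per hour of sleep and checks it against the severe threshold of 30/h reported for the OSAS group. The function names and the example counts are hypothetical and are not taken from the study data.

```python
def apnoea_hypopnoea_index(n_apnoeas, n_hypopnoeas, total_sleep_time_min):
    """Return the number of apnoeic plus hypopnoeic events per hour of sleep."""
    if total_sleep_time_min <= 0:
        raise ValueError("total sleep time must be positive")
    return (n_apnoeas + n_hypopnoeas) / (total_sleep_time_min / 60.0)


def is_severe_osas(ahi, threshold=30.0):
    """Severe OSAS as used for the OSAS group in the text: AHI >= 30/h."""
    return ahi >= threshold


# Hypothetical night: 120 apnoeas and 90 hypopnoeas over 360 min of sleep.
ahi = apnoea_hypopnoea_index(120, 90, 360)
print(ahi, is_severe_osas(ahi))
```

Only events that meet the scoring criteria given above (duration, amplitude reduction and desaturation) would be passed to such a function; the event detection itself is not sketched here.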

The group with depression was assessed by means of the Hamilton Depression Rating Scale (HDRS), showing a mean score of 26.3 (±2.74; range 22–31); none of the DEP suffered from other neurological or serious medical condition.

The protocol of the study was approved by the local Institutional Ethical Committee, and the participants gave their written informed consent, according to the Declaration of Helsinki.

Italian version of the PSQI

The PSQI was translated into Italian by two experienced sleep researchers. The translation was then evaluated by two independent experts in sleep disorders. Finally, one translator back-translated the questionnaire from Italian to English. After this procedure the Italian version of the PSQI was obtained (see “Appendix”).

Procedure

All the participants in the study slept for two consecutive undisturbed nights (the first being an adaptation night) in a sound-proof, temperature-controlled room. During both nights, each of them underwent a polysomnographic (PSG) recording but, to avoid the inconveniences related to the first-night effect [23], only the second night was considered in the subsequent analyses. EEG signals were high-pass filtered with a time constant of 0.3 s and low-pass filtered at 30 Hz. Five unipolar EEG derivations of the International 10-20 system (Fz, Cz, Pz, O1, O2) were recorded from scalp electrodes with an averaged mastoid reference. The submental EMG was recorded with a time constant of 0.03 s. Bipolar horizontal eye movements were recorded with a time constant of 1 s, from electrodes placed about 1 cm from the medial and lateral canthi of the dominant eye. Impedance of all the electrodes was kept below 5 kΩ.

Two days before the PSG recording, each participant underwent a clinical interview during which the PSQI was also administered by a trained experimenter.

Data analysis

Sleep measures

Sleep stages were visually scored in 20 s epochs, according to the standard criteria [24]. The following were considered as dependent variables: stage 1 latency, stage 2 latency, SWS latency, REM latency, percentage of stage 1, percentage of stage 2, percentage of SWS, percentage of stage REM, number of awakenings (i.e., 20-s epochs with >50 % of wakefulness/activation), total sleep time (TST), total bedtime (TBT), sleep efficiency index (SEI = TST/TBT × 100).
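The sleep efficiency index defined above can be sketched as follows. The helper that converts 20-s epochs into minutes and the example epoch counts are illustrative assumptions, not values from the study.

```python
EPOCH_SECONDS = 20  # epoch length used for visual scoring in the text


def epochs_to_minutes(n_epochs, epoch_seconds=EPOCH_SECONDS):
    """Convert a number of scored epochs into minutes."""
    return n_epochs * epoch_seconds / 60.0


def sleep_efficiency_index(tst_min, tbt_min):
    """SEI = TST / TBT x 100, with both times in the same unit."""
    if tbt_min <= 0:
        raise ValueError("total bedtime must be positive")
    return tst_min / tbt_min * 100.0


# Hypothetical night: 1440 epochs in bed, 1260 of them scored as sleep.
tst = epochs_to_minutes(1260)
tbt = epochs_to_minutes(1440)
print(tst, tbt, sleep_efficiency_index(tst, tbt))
```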

The PSG measures were submitted to one-way analyses of variance (ANOVAs), comparing the five different groups (HC, HE, DEM, OSAS, DEP). All analyses were performed using Statistica 6.0 for Windows (StatSoft Inc.). Post hoc comparisons were carried out by means of Fisher Least Significant Difference (LSD) tests.
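For readers who want to see the logic of this analysis, the sketch below computes the one-way ANOVA F ratio and the Fisher LSD t statistic, which scales a pairwise mean difference by the pooled error term of the ANOVA. It is a plain-Python illustration, not the Statistica procedure used in the study; the function names and the toy data are hypothetical, and the p values (which require the F and t distributions) are deliberately left out.

```python
from math import sqrt
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within, mse) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, df_between, df_within, mse


def lsd_t(group_a, group_b, mse):
    """Fisher LSD t statistic, using the pooled ANOVA error term (mse)."""
    se = sqrt(mse * (1 / len(group_a) + 1 / len(group_b)))
    return (fmean(group_a) - fmean(group_b)) / se


# Toy data for three hypothetical groups (not study data).
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w, mse = one_way_anova(toy)
print(f_stat, df_b, df_w, lsd_t(toy[0], toy[2], mse))
```

The LSD statistic is evaluated against a t distribution with the within-group degrees of freedom, which is what distinguishes it from an ordinary two-sample t test.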

PSQI scores

To assess the PSQI internal consistency, Cronbach’s-α coefficients were computed for the whole sample.
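A minimal sketch of the coefficient follows, assuming that the items are the seven PSQI component scores and that each row holds one respondent. The function name and the example rows are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same k >= 2 items")
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical component scores (C1..C7) for five respondents.
example = [
    [0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 0, 1],
    [2, 2, 1, 1, 1, 0, 1],
    [2, 3, 2, 2, 2, 1, 1],
    [3, 3, 3, 2, 2, 3, 2],
]
print(round(cronbach_alpha(example), 3))
```

The coefficient approaches 1 when the items rise and fall together across respondents, which is the sense in which it indexes internal consistency.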

The Italian version of the PSQI was administered and evaluated following both the original scoring system [6] and the revised 3-factor scoring model proposed by Cole et al. [9]. Thus, the following were considered as dependent variables: global score, subjective sleep quality (C1), sleep latency (C2), sleep duration (C3), habitual sleep efficiency (C4), sleep disturbances (C5), use of sleep medications (C6), daytime dysfunction (C7), perceived sleep quality (F1), sleep efficiency (F2), daily disturbances (F3).
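The two scoring schemes can be sketched as below. The global score is the sum of the seven component scores (each scored 0 to 3 in the PSQI), and the factor membership follows the grouping of components given in the Introduction. Computing each factor as a simple sum of its components is an assumption made for illustration; the text does not specify how the factor scores were derived. All names are hypothetical.

```python
COMPONENTS = ("C1", "C2", "C3", "C4", "C5", "C6", "C7")

# Component grouping of the 3-factor model as described in the text.
FACTORS = {
    "F1_perceived_sleep_quality": ("C1", "C2", "C6"),
    "F2_sleep_efficiency": ("C3", "C4"),
    "F3_daily_disturbances": ("C5", "C7"),
}


def score_psqi(components, cutoff=5):
    """Return global score, factor scores and a cut-off flag.

    components: dict mapping "C1".."C7" to an integer score from 0 to 3.
    Factor scores are plain sums of their components (an assumption).
    """
    for name in COMPONENTS:
        if components[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer from 0 to 3")
    result = {"global": sum(components[name] for name in COMPONENTS)}
    for factor, members in FACTORS.items():
        result[factor] = sum(components[m] for m in members)
    # A global score higher than the cut-off flags relevant sleep disturbances.
    result["above_cutoff"] = result["global"] > cutoff
    return result


# Hypothetical respondent.
print(score_psqi({"C1": 1, "C2": 2, "C3": 1, "C4": 0, "C5": 1, "C6": 0, "C7": 1}))
```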

As for the PSG measures, the PSQI scores were submitted to one-way ANOVAs, comparing the five different groups (HC, HE, DEM, OSAS, DEP).

The psychometric properties of the PSQI were evaluated in the whole sample of participants by calculating the receiver operating characteristic (ROC) curve and the area under the curve (AUC) [25] for the different possible cut-off scores.
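The quantities involved can be sketched in a few lines. The example assumes that the reference classification is patient versus control group membership and that a score above the cut-off counts as a positive test; neither choice is stated explicitly in the text, and the function names and data are hypothetical.

```python
def sensitivity_specificity(scores, is_patient, cutoff):
    """Sensitivity and specificity when score > cutoff is a positive test."""
    tp = fn = tn = fp = 0
    for score, patient in zip(scores, is_patient):
        positive = score > cutoff
        if patient and positive:
            tp += 1
        elif patient:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, is_patient):
    """Area under the ROC curve as the probability that a randomly chosen
    patient scores higher than a randomly chosen control (ties count 0.5)."""
    patients = [s for s, p in zip(scores, is_patient) if p]
    controls = [s for s, p in zip(scores, is_patient) if not p]
    wins = sum(
        1.0 if p > c else 0.5 if p == c else 0.0
        for p in patients
        for c in controls
    )
    return wins / (len(patients) * len(controls))


# Hypothetical global scores: four controls followed by four patients.
scores = [2, 3, 4, 6, 5, 7, 9, 12]
is_patient = [False, False, False, False, True, True, True, True]
for cutoff in (4, 5, 6):
    print(cutoff, sensitivity_specificity(scores, is_patient, cutoff))
print(auc(scores, is_patient))
```

Scanning the cut-offs in this way shows the trade-off that the ROC curve summarizes: lowering the cut-off raises sensitivity at the expense of specificity.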

With respect to concurrent validity, a correlation analysis (Pearson’s r) was carried out between PSQI global scores and PSG sleep variables (stage 1 latency, stage 2 latency, SWS latency, REM latency, percentage of stage 1, percentage of stage 2, percentage of SWS, percentage of stage REM, number of awakenings, TST, TBT, SEI) recorded in the whole sample.
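As a sketch of this step, Pearson's r between the PSQI global score and one PSG variable can be computed as below, together with the t statistic (n − 2 degrees of freedom) commonly used to test it against zero. The function names and the example values are hypothetical and are not study data.

```python
from math import copysign, inf, sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pearson_t(r, n):
    """t statistic for H0: r = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return copysign(inf, r)
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical PSQI global scores and sleep efficiency (%) for six people.
psqi_global = [3, 4, 6, 8, 11, 13]
sei = [93.0, 90.5, 91.0, 84.0, 86.5, 79.0]
r = pearson_r(psqi_global, sei)
print(round(r, 3), round(pearson_t(r, len(sei)), 3))
```

Repeating the computation for each PSG variable yields one coefficient per row of a correlation table.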

Results

Sleep measures

The one-way ANOVA showed statistically significant differences between groups for all PSG measures with the exception of stage 2 latency and TBT (Table 2). As expected, SWS latency was higher for DEP than for the other participants, number of awakenings was higher for OSAS patients, while sleep efficiency was dramatically reduced in all the groups compared to the healthy young controls.

Table 2 Means and standard deviations of the PSG variables, during the baseline night

PSQI scores

The PSQI showed an overall reliability coefficient (Cronbach’s α) of 0.835, indicating a high internal consistency.

The one-way ANOVA demonstrated significant differences in PSQI global scores between the five groups (p < 0.000001, see Fig. 1). Post hoc tests revealed that the global PSQI scores for the dementia, OSAS and depressed groups were significantly higher than for the healthy young controls (p < 0.02) and the healthy elderly controls (p < 0.01). On the other hand, the scores for the OSAS group did not differ from those of the depression group, nor was any significant difference found between the two control groups (young and elderly).

Fig. 1 Mean (±SD) of PSQI global scores for each group of controls and patients. The horizontal dotted line indicates the cut-off for sleep disturbances

The mean PSQI single-component scores again showed sharp differences between groups (Table 3). All the components except C7 (daytime dysfunction) showed statistically significant differences between groups. The group means show a similar pattern: higher scores (i.e., worse reported sleep) in the patient groups than in the young and elderly controls. Post hoc tests confirmed that the PSQI component scores were also significantly higher for the OSAS and depressed groups than for the healthy young and elderly controls (p values ranging from 0.01 to 0.000001).

Table 3 Means and standard deviations of PSQI global scores and PSQI components and factors

The mean PSQI single-factor scores [9] also indicated statistically significant differences between the five groups (Table 3). Again, the HC and HE groups showed fewer sleep disturbances than the other groups, and post hoc comparisons indicated that the three PSQI factors were significantly higher for the OSAS and depressed groups than for the healthy young and elderly controls (p values ranging from 0.02 to 0.0000001).

The results of the ROC curve and AUC are reported in Table 4. They show that the best cut-off score is 5 with a relatively high sensitivity (0.939), the highest specificity (0.471) and the highest and statistically significant AUC (0.705; p = 0.0019).

Table 4 Psychometric properties of the PSQI at different cut-off scores

As far as the correlational analysis is concerned, PSQI global score correlated significantly with stage 2 latency, SWS latency, percentage of stage 1 and stage 2 (Table 5).

Table 5 Pearson’s r (and related level of significance) between PSQI global scores and PSG sleep dependent variables

Discussion

To our knowledge, this is the first attempt to validate the PSQI in the Italian language. The present study shows that (1) the Italian version of the PSQI has a high degree of internal consistency, (2) the PSQI discriminates between healthy controls and patients with different sleep complaints and (3) the traditional cut-off of 5 (or higher) can be reliably applied to the Italian population as well.

The internal consistency was very high (Cronbach’s α of 0.835), indicating that the seven classical components of the questionnaire (sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleeping medications and daytime dysfunction) each effectively evaluate a particular domain of sleep quality. The value of Cronbach’s α was comparable with those of several previous studies [8, 16], including the first one reported by Buysse et al. [6].

The PSQI global score allowed discrimination between healthy controls and patients, and between groups of patients affected by different pathologies, with an impaired overall sleep quality in OSAS and DEP (and, secondarily, in patients with dementia) compared to both the healthy elderly and young groups. Such an impoverished self-reported sleep quality corresponds to the alterations in the objectively measured sleep variables in the different groups, as shown in Table 2, where SWS latency was higher for DEP than for the other participants, number of awakenings was higher for OSAS patients, and sleep efficiency was dramatically reduced in all the groups compared to the healthy young controls. These results indicate that the participants were representative of groups with a significantly different sleep quality. In fact, patients with a well-known impairment of self-reported sleep quality (OSAS, DEP and DEM) differed from healthy controls but did not substantially differ from one another. Similarly, healthy young and elderly controls showed largely overlapping results, and both differed from the pathological groups. Such effects were statistically significant despite the choice to exclude all patients with a frank sleep disorder. Most of the previous validation studies [6, 8, 16] also took insomniacs into consideration: here, we decided to rule out such patients to directly test the PSQI reliability in distinguishing between healthy controls and individuals with slight sleep complaints.

The mean PSQI scores were slightly higher than those of the original version of the questionnaire [6] or recent validation studies in young controls, OSAS and depression groups [16]. This difference may depend on the fact that our participants were all young university students, who usually present small alterations in sleep-wake patterns. It should be stressed that none complained of sleep problems and their PSG did not present any sign of objective alteration. With respect to the pathological groups, both OSAS and DEP suffered from chronic pathologies, and this could explain their extreme feeling of disease and thus their evaluation of a strongly reduced sleep quality.

Differences between groups were also found on all the components except C7 (daytime dysfunction), which did not differ significantly between groups. This omnibus difference probably does not reach significance because this component mainly assesses daytime sleepiness, an impairment that characterizes OSAS patients [26] far more than depressed or dementia patients. In fact, the comparison between OSAS patients and the two control groups was significant. More robust differences between pathological and control groups can be observed on components C2 (sleep latency), C3 (sleep duration) and C6 (use of sleep medications).

Furthermore, the sharp differentiation between groups reported in Table 3 points to the fact that the 3-factor scoring model of Cole et al. [9] is also applicable to the Italian version of the PSQI. All three factors indicate wide discrepancies between groups, but the strongest effects were found for perceived sleep quality (F1) and sleep efficiency (F2). Again, the highest scores were shown by depressed and OSAS patients.

The ROC curve and the AUC are usually used when it is necessary to dichotomize a continuous scale, making it possible to establish the test’s ability to discriminate between groups and to choose the optimal cut-off point [25]. Our results indicate that in the Italian version of the PSQI, too, the best cut-off score can be placed at 5 (or above), with a high sensitivity and specificity, i.e. reducing both false negatives (pathological cases judged to be normal) and false positives (normal cases placed in the pathological group). Finally, regarding the AUC, a value higher than 0.70 can be accepted as an index of moderate-to-high accuracy [27].

In agreement with several previous studies [6, 11, 16], the correlational analysis showed weak results. Some correlations were observed, indicating that a higher PSQI global score tends to correspond to a generally worse objective sleep quality (increased latency and reduced percentage of NREM sleep). Such a reduced level of concurrent validity between PSG and the subjective assessment of sleep quality can be explained in several ways. First, daily fluctuations of sleep cannot be adequately described by a questionnaire that aims to investigate sleep quality over the past month. Second, some kind of dissociation between objective and subjective measures of sleep is a very common issue in sleep research [1]. This is true in both normal and pathological individuals [28, 29], and it seems to be influenced by different aspects such as sleep setting, personality traits and constitutional factors [28]. It should also be remembered that self-report estimates are very vulnerable to memory processes, misperception, and an overt or covert tendency to exaggerate the number and gravity of symptoms [1], and that this vulnerability affects retrospective measures far more than prospective ones (i.e., sleep diaries). It is nevertheless noteworthy that some associations were found between PSQI scores and most of the measures of NREM sleep.

A possible limitation of the study is the relatively small sample size, but this is counterbalanced by the fact that all the participants underwent complete and repeated PSG recordings. Moreover, the selection of participants (with rigorous inclusion/exclusion criteria) was very careful, and any participant presenting comorbidities was excluded from the final analysis. The inclusion of an elderly control group and of the group with dementia is a further strength of the present work. Future studies could clarify the specific cut-off scores for different sleep and sleep-related pathologies, and the possibility of extending the usefulness of the PSQI to different age groups.

In conclusion, the results of the present study indicate that the PSQI is a useful, valid and reliable tool for the assessment of sleep quality also in the Italian language, and that it appears equivalent to the original version proposed by Buysse et al. [6]. Because of its simplicity, PSQI can be easily understood and answered also by patients with dementia.

Moreover, the Italian version of the questionnaire also provides a good and reliable differentiation between normal and pathological groups, with higher scores reported by people characterized by more impaired sleep quality (i.e. OSAS and DEP).

Finally, the PSQI also presents a similar cut-off of 5 as a boundary between “good” and “impoverished” sleep quality.