Introduction

Lack of insight is a prevalent feature affecting approximately 50 to 80 % of patients with schizophrenia [1], and insight is a major objective of pharmacological and psychological treatment [2]. Thus, the understanding of insight in relation to psychopathology and clinical outcomes has important implications for the development of effective and efficient treatment strategies [3–5]. However, previous studies have been unable to describe consistently the relationship between insight and severity of psychopathology [6–9], depressive symptoms [8, 10], compliance with therapy [11], quality of life [12, 13], or neuropsychological functions [14, 15]. The lack of a consistent definition of insight has been singled out as an explanation of these inconsistencies [16, 17]. Therefore, a consensus has been progressively reached in recent years on the definition of insight, which is now considered to be a continuous and multidimensional construct that includes the following points: (1) awareness of having a mental illness, (2) understanding the need for treatment, (3) awareness of the social consequences of mental disorder, (4) awareness of symptoms, and (5) attribution of symptoms to a mental disorder [18]. Thus, when investigating the role of insight in schizophrenia, studies should incorporate these dimensions, making it possible to compare results across studies [18].

The Scale to Assess Unawareness of Mental Disorder (SUMD) is one of the most widely used instruments to measure insight, given the aforementioned continuous and multidimensional approach [18, 19]. However, there seems to be some uncertainty regarding the appropriate use of insight measures, including the SUMD. This uncertainty may have serious consequences for the type and amount of evidence found, and such evidence is essential in determining the best prevention and therapeutic strategies. Several issues should be considered when using the SUMD. First, there are two SUMD versions (the long form and the short form), and they vary in content, scoring, and interpretation of insight scores [8, 20], which may be the basis for some of the confusion among researchers and clinicians. Moreover, biased interpretations and findings may result from methodological problems such as the use of the SUMD in the absence of cross-cultural validation, the use of small and heterogeneous samples (e.g. mixing schizophrenia and other mental disorders), and analysis using inappropriate statistical methods [21]. To the best of our knowledge, a detailed and critical review of the use of the SUMD has never been systematically performed. The aim of our study was to retrieve and review all studies using the SUMD that were published in the last 20 years (i.e. since the initial validation of this scale) [20], with special attention to the characteristics of the SUMD (version, rating scale, scoring, and item/dimension used), the methodological aspects (country, language, subject-inclusion criteria, and sample size), and the statistical methods used to analyse insight.

Methods

The SUMD

The SUMD long version [20] is a 20-item scale that attempts to assess current and past awareness of illness. The first three items, which assess general awareness of mental illness, are (1) awareness of mental disorder, (2) effects of medication, and (3) social consequences of mental disorder. Items 4-20 pertain to specific symptoms. If the subject shows awareness of a symptom, he or she is then asked about the attribution of this symptom. Awareness and attribution items are rated from 1 to 5, with higher scores indicating poorer awareness or attribution. The 17 symptom items render four subscale scores: current awareness, past awareness, current attribution, and past attribution. Each subscale score is calculated by dividing the sum of the Likert scale scores by the number of symptoms. Scores on each general item are interpreted separately. This version has been validated for schizophrenia and schizoaffective disorders.
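The subscale calculation described above can be sketched in a few lines of Python. This is only an illustration, not the scale's official scoring procedure: the function name `subscale_score` is hypothetical, the ratings are made-up values, and we assume that "the number of symptoms" means the number of symptom items that were actually rated (unrated items are marked `None`).

```python
from typing import Optional, Sequence


def subscale_score(ratings: Sequence[Optional[int]]) -> Optional[float]:
    """Sum of the 1-5 ratings divided by the number of rated symptom items.

    None marks a symptom item that was not rated (an assumed convention).
    Returns None when no item was rated at all.
    """
    rated = [r for r in ratings if r is not None]
    if not rated:
        return None
    for r in rated:
        if not 1 <= r <= 5:
            raise ValueError(f"rating {r} is outside the 1-5 range")
    return sum(rated) / len(rated)


# Made-up ratings for six symptom items, one of which was not rated.
current_awareness = [1, 3, 5, None, 2, 4]
print(subscale_score(current_awareness))  # 3.0
```

Higher values indicate poorer awareness or attribution, in line with the direction of the rating scale.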

The short version of the SUMD [8] consists of nine items (three general items and six symptom items) assessing the current awareness of mental illness. Eleven symptom items, the past awareness subscale and the attributional subscales that are included in the long version are omitted in the short version. Each item is examined separately without calculation of subscale scores. The items are rated from 1 to 3, with higher scores indicating poorer awareness. The short version has been used with patients with schizophrenia, schizoaffective disorder, and bipolar and unipolar mood disorders with or without psychotic features.

In addition to English, the SUMD has been adapted and validated in French [22, 23], Spanish [24], and Portuguese [25] (Brazilian sample).

Search Strategy

We performed an electronic search of MEDLINE via PubMed to identify all studies published from June 1, 1993, to June 30, 2012. The following search equation was used: ‘Scale to assess Unawareness of Mental Disorder’ OR ‘SUMD’.
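For readers who wish to reproduce a comparable search programmatically, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint using the search equation and date limits given above. It is an assumption that the E-utilities would return the same records as the PubMed interface used for this review; the function name `build_search_url` and the `retmax` value are hypothetical choices, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(retmax: int = 200) -> str:
    """Return an esearch URL for the review's search equation and period."""
    params = {
        "db": "pubmed",
        "term": '"Scale to assess Unawareness of Mental Disorder" OR "SUMD"',
        "datetype": "pdat",       # filter on publication date
        "mindate": "1993/06/01",  # June 1, 1993
        "maxdate": "2012/06/30",  # June 30, 2012
        "retmax": retmax,         # assumed upper bound on returned IDs
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_search_url())
```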

Selection Criteria

One of the authors (R.D.) read the titles and abstracts of all retrieved articles. All English language studies using the SUMD, whatever their design or methodology (cross-sectional, case-control, cohort studies, or clinical trials), were included. Letters to the editor, case reports, case series, validation or metrological studies, studies not assessing insight with the SUMD, and non-English language studies were excluded. A second author (K.B.) read all articles of uncertain eligibility, and the final decision for inclusion was obtained by consensus between the two reviewers.

Data Extraction

To analyse the content of the articles, we generated a standardised data collection form based on a review of the literature and a priori discussion. As a calibration exercise prior to data extraction, two members of the team (R.D., L.B.) evaluated a random set of ten studies. All disagreements were resolved by consensus, and the form was modified accordingly. The following data were extracted from each article:

  1. General characteristics of the selected studies: first author, year of publication, and country.

  2. Characteristics of the population: inclusion criteria (schizophrenia and/or schizoaffective disorders, bipolar or unipolar disorder, psychosis other than schizophrenia or schizoaffective disorders), and sample size.

  3. Characteristics of the SUMD: version (short version, long version, or not specified), rating scale (3-point Likert scale (1-3), 5-point Likert scale (1-5), or not specified), and items/subscales used.

  4. Statistical methods to analyse insight: analysis of insight scores (separate analysis of items, analysis using the sum total or mean scores of items), use of categorical or continuous variables, and the definition of impaired insight.

The same reviewer (R.D.) independently completed all of the data extractions.
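The data collection form described above could be represented as a simple record type. The sketch below is an illustration only: the class name `ExtractionRecord`, the field names, and the coded values are assumptions rather than the form actually used, and the example record is fictitious.

```python
from dataclasses import dataclass, field
from typing import List, Optional

VERSIONS = {"short", "long", "not specified"}
RATING_SCALES = {"1-3", "1-5", "not specified"}


@dataclass
class ExtractionRecord:
    # 1. General characteristics of the study
    first_author: str
    year: int
    country: str
    # 2. Characteristics of the population
    inclusion_criteria: str
    sample_size: int
    # 3. Characteristics of the SUMD
    sumd_version: str = "not specified"
    rating_scale: str = "not specified"
    items_used: List[str] = field(default_factory=list)
    # 4. Statistical methods to analyse insight
    score_analysis: str = "not specified"  # e.g. "separate", "sum", "mean"
    variable_type: str = "not specified"   # "categorical" or "continuous"
    impaired_insight_definition: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the coding scheme assumed above.
        if self.sumd_version not in VERSIONS:
            raise ValueError(f"unknown SUMD version: {self.sumd_version}")
        if self.rating_scale not in RATING_SCALES:
            raise ValueError(f"unknown rating scale: {self.rating_scale}")


# Fictitious study, for illustration only.
record = ExtractionRecord(
    first_author="Example",
    year=2010,
    country="France",
    inclusion_criteria="schizophrenia and/or schizoaffective disorders",
    sample_size=120,
    sumd_version="long",
    rating_scale="1-5",
    items_used=["general items", "symptom items"],
    score_analysis="mean",
    variable_type="continuous",
)
print(record.sumd_version, record.rating_scale)
```

Coding "not specified" explicitly, rather than leaving a field empty, makes it possible to count unreported characteristics in the same way as reported ones.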

Statistical Analysis

A descriptive analysis was conducted. The data were summarised as numbers and percentages for qualitative variables. This statistical analysis was performed using the SPSS version 17.0 software package (SPSS Inc., Chicago, IL, USA).
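As an illustration of this kind of descriptive summary, the short Python sketch below tabulates numbers and percentages for a qualitative variable. The analysis in this review was run in SPSS; the Python code and the function name `summarise` are hypothetical stand-ins, and the example values reproduce the split between SUMD versions reported in the Results (65 long, 35 short).

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def summarise(values: Iterable[Hashable]) -> Dict[Hashable, Tuple[int, float]]:
    """Return {category: (count, percentage)} for a qualitative variable."""
    values = list(values)
    total = len(values)
    counts = Counter(values)
    return {k: (n, round(100 * n / total, 1)) for k, n in counts.items()}


# Version referenced by each of the 100 selected studies.
versions = ["long"] * 65 + ["short"] * 35
print(summarise(versions))  # {'long': (65, 65.0), 'short': (35, 35.0)}
```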

Results

Selection of Relevant Studies

A flow chart of the selected studies assessing the SUMD is presented in Fig. 1. Briefly, the electronic search yielded 133 citations, 117 articles were selected for further evaluation, and a final 100 studies were selected after reading the full text.

Fig. 1 Flow diagram of publications identified in the PubMed database with the keywords “Scale to assess Unawareness of Mental Disorder” or “SUMD”

Characteristics of the Selected Studies

The characteristics of the selected studies are presented in Tables 1 and 2. The number of studies increased over the past 20 years; 52 studies (52 %) were published over the past five years. The SUMD was preferentially used in Europe (43 %) and North America (27 %), and more rarely in Asia (17 %). Several studies used the SUMD in a language for which no cultural validation has been published to our knowledge, including studies performed in Turkey (3), South Korea (2), and Iran (1).

Table 1 Characteristics of studies and population
Table 2 Characteristics of the SUMD

Characteristics of the Population

The studies included a broad range of mental disorders, but schizophrenia or schizoaffective disorders were the most prevalent (60 %). More rarely, studies focused exclusively on patients with mood disorders (bipolar and/or unipolar disorder, with or without psychotic features) (10 %), or other psychoses such as brief psychotic disorder, schizophreniform disorder, or delusional disorder (2 %). Finally, some studies included a range of mental disorders, including schizophrenia/schizoaffective disorder, mood disorder, anxiety disorder and other psychoses (20 %), or schizophrenia/schizoaffective disorder and mood disorder (8 %).

A majority of studies had relatively small samples; only 28 (28 %) had more than 100 patients, and 12 studies (12 %) included fewer than 30 patients.

Characteristics of the SUMD

Sixty-five studies (65 %) referenced the long version, while 35 (35 %) used the short version of the SUMD. The use of the SUMD varied in terms of response modalities, number of items and subscales. Of the 65 studies referencing the long version, four (6.2 %) used the short version 3-point scale [14, 43, 66, 67] instead of the 5-point scale. In addition, three studies (4.6 %) used a modified scoring system: Goodman et al. [56] combined scores of 1 to 3 into a score of 1 and scores of 4 to 5 into a score of 2; Karow et al. [80] used a reversed scale with a score of 1 for “poor insight” and 5 for “full insight”; and Kemp et al. [27] used a 4-point scale corresponding to no, partial, moderate, and full awareness or correct attribution. Four articles (6.2 %) did not specify the rating scale used [50, 59, 75, 110]. Of the 35 studies referencing the short version, nine (25.7 %) used the 5-point scale [64, 70, 78, 79, 93, 105•], and nine (25.7 %) did not specify the rating scale used [12, 42, 47, 55, 62, 65, 72, 89, 114].
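The two recodings described above can be made explicit with a short Python sketch. The function names are hypothetical, and the reversal assumes a simple mirror of the standard 5-point scale (on which higher scores indicate poorer awareness); the original articles should be consulted for the exact procedures used.

```python
def collapse_to_two_points(score: int) -> int:
    """Recode a 1-5 score as described for Goodman et al.: 1-3 -> 1, 4-5 -> 2."""
    if not 1 <= score <= 5:
        raise ValueError(f"score {score} is outside the 1-5 range")
    return 1 if score <= 3 else 2


def reverse_five_point(score: int) -> int:
    """Mirror a 1-5 score so that 1 means poor insight and 5 full insight.

    Assumes the reversed scale is the exact mirror image of the standard one.
    """
    if not 1 <= score <= 5:
        raise ValueError(f"score {score} is outside the 1-5 range")
    return 6 - score


for s in range(1, 6):
    print(s, collapse_to_two_points(s), reverse_five_point(s))
```

The same underlying rating therefore yields three different numerical values depending on the scoring system, which illustrates why the rating scale needs to be reported.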

Regarding the assessment of current and past insight, all studies considered current awareness. Of the 65 studies referencing the long version, 34 (52.3 %) assessed current attribution, ten (15.4 %) assessed past awareness and five (7.7 %) assessed past attribution. Of the 35 studies referencing the short version, three studies (8.6 %) assessed current attribution, and one (2.8 %) assessed past awareness and attribution.

In terms of the items selected for use, of the 65 studies referencing the long version, 56 studies (86.2 %) considered all of the general items. Fifty-two studies (80 %) assessed three items, three studies (4.6 %) assessed only one or two general items, and one study (1.6 %) did not specify the number of general items used. Forty-three studies (66.2 %) considered the symptom items: 17 of these studies (26.2 %) assessed the complete 17-item version, 12 (18.5 %) assessed a number of items ranging from 1 to 8, and 14 (21.5 %) did not specify the number of symptom items used. Of the 35 studies referencing the short version, all considered the general items. Thirty-one of these 35 studies (88.6 %) assessed three items, one study (2.9 %) assessed only two items, and three did not specify the number of items assessed (8.6 %). Of the 13 studies (37.1 %) that considered the symptom items, nine (25.7 %) assessed the complete 6-item version, one study (2.9 %) assessed only one item, and it was impossible to deduce the number of items assessed in three studies (8.6 %).

Statistical Methods to Analyse Insight

Of the 56 studies referencing the long version and assessing general items, 48 (85.7 %) performed a separate analysis for each item, eight (14.3 %) used the mean score of all general items, and nine (16.1 %) used a sum total of all the general items. Of the 43 studies assessing the symptom items, 33 studies (76.7 %) performed an analysis using a mean of these items, and three studies (7 %) used a sum total of the symptom items. Nine articles (20.9 %) performed a separate analysis for each item. Of note, 14 studies (33.3 %) used the subscale scores as described by Amador et al. [20]. Three studies (7.1 %) [31, 48, 66] used alternative subscales, placing symptoms in a positive, a negative or a disorganised category. Of the 35 studies referencing the short version, 24 (68.6 %) performed a separate analysis of each item, 10 (28.6 %) used a sum total of items, and two (5.7 %) used the mean of the items. For the 13 studies assessing symptom items, a separate analysis for each item was made in nine studies (69.2 %). Three studies (23.1 %) used the mean of these items and one used the sum total (7.7 %).

Twenty-five studies (25 %) created a “poor” or “good” insight variable to categorise their sample [15, 30, 31, 36, 44, 48, 51, 57, 61, 66, 69, 70, 78–81, 85, 87, 88, 90, 91•, 104, 111••]. Cut-off scores for the level of insight varied across the studies. Using the 5-point scale, impaired insight was defined as a score ranging from >3 to >27 (see Table 2 for methods of calculation). Using the 3-point scale, impaired insight was defined as a score ranging from >1 to ≥5. Using a reversed scale, Karow et al. [80] defined impaired insight as a score of ≤3. Using a modified scoring (a 2-point scale), Goodman et al. [56] defined poor insight as a score of 2 (which combines scores of 4 and 5 on the 5-point scale). Nine studies (9 %) categorised their samples into three categories (fully aware, somewhat aware, unaware) [5, 11, 14, 28, 40, 46, 63, 82, 103], and Aspiazu et al. [94] used five categories (fully aware, partially aware, somewhat aware, scarcely aware, unaware).
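To show how the choice of cut-off and calculation method can change the classification of the same patient, the sketch below applies two rules to one set of ratings on the 3-point scale. The two thresholds (>1 and ≥5) are the extremes reported above, but pairing them with an item-level rule and with the sum of the three general items is our assumption for illustration; the function names and the ratings are hypothetical.

```python
from typing import Sequence


def impaired_by_item(ratings: Sequence[int], cutoff: int = 1) -> bool:
    """Impaired insight if any single item is rated above the cut-off."""
    return any(r > cutoff for r in ratings)


def impaired_by_sum(ratings: Sequence[int], cutoff: int = 5) -> bool:
    """Impaired insight if the sum of the items reaches the cut-off."""
    return sum(ratings) >= cutoff


# Made-up ratings for the three general items on the 1-3 scale.
ratings = [2, 1, 1]
print(impaired_by_item(ratings))  # True
print(impaired_by_sum(ratings))   # False
```

The same patient is labelled as having impaired insight under the first rule and not under the second.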

Discussion

This paper provides a systematic review of studies using the SUMD and delineates important differences in the version used, the methods of calculation, and the interpretation of scores. Several issues need to be considered and discussed.

Results of some studies may be erroneous because of the possibly unsatisfactory psychometric properties of the ‘modified’ SUMD. The use of modified versions of the SUMD (number of items, number of sub-scales, or use of different rating scales) may affect psychometric properties such as validity (i.e. the extent to which an instrument measures what it purports to measure), which could lead to erroneous conclusions. Indeed, it has been suggested that multi-dimensional questionnaires should be used in their entirety and that the use of selected items could, by taking them out of context, compromise reliability and validity in addition to eliminating the option of comparing scores across studies or with population norms [115]. In addition, shorter versions of certain multidimensional questionnaires have been introduced to improve response rates and save time and resources, but these shorter versions may attenuate the original scales and have inferior performance [116, 117]. On the other hand, some studies have suggested that the use of selected scales from a multi-scale health-status questionnaire seems to yield results similar to those obtained with the use of the entire questionnaire [118]. However, little research has been done on the validity of the remaining scales of the SUMD when some items or subscales are excluded. Research demonstrating the psychometric properties of selectively used items and subscales of the SUMD is necessary. Future studies should evaluate whether the scores obtained when using selected items or subscales are similar to those obtained when the entire questionnaire is administered. This issue is important because such a similarity would allow for interpretation of scores whether selected items/subscales or the entire scale was used, and it would allow comparison across studies.

The choice of different Likert scales also raises issues. Using a 3- or 5-point Likert scale can introduce problems of comparability across studies, because the two scales produce different scores and thus make score interpretation difficult. Of note, 13 % of studies did not specify the rating scale used. A short statement on the rating scale and score calculation in the description of the methods is necessary. In addition, several studies suggest that the response scale may affect the reliability and validity of questionnaires [119, 120], and several authors suggest that an unbalanced 5-point Likert scale is more informative and discriminative than a 3-point Likert scale [121]. Further research is required to determine whether a 3- or 5-point Likert scale should be used with the SUMD.

Difficulties may arise from using the SUMD in different cultures and populations. The SUMD was developed and validated in the United States and in the English language. However, we noted that the SUMD was used in countries for which linguistic or cross-cultural validations are not available [6, 87, 90]. The definition of what constitutes a sign of mental disorder may vary from one culture to another [20]. Cross-cultural adaptation is necessary to validate the collection of information in other cultures. Furthermore, the SUMD was used in psychotic disorders other than schizophrenia and schizoaffective disorder (which are the target populations of the scale). Although lack of insight is found in all psychotic disorders [122], it is necessary to confirm that the SUMD has satisfactory psychometric properties in non-target populations.

Finally, results of studies may not be comparable because of the absence of agreement in the calculation of insight scores and the lack of a consistent insight impairment threshold. The absence of a unique method to calculate insight score raises a problem in the interpretation of insight severity scores. In addition, several authors used a cut-off to distinguish “poor” and “good” insight. This cut-off is problematic for several reasons. The SUMD considers insight to be a continuous construct, and using a cut-off (i.e. considering insight a dichotomous phenomenon) does not include or acknowledge partial insight. Moreover, the absence of a similar cut-off across studies causes variations in the interpretation of insight scores. Adopting a widely accepted standard for the computation and the interpretation of scores on the SUMD is necessary.

Limitations

This review has limitations that warrant consideration. The literature search terms were selected to be as inclusive as possible, but some relevant articles may have been omitted, including studies that did not mention “Scale to assess Unawareness of Mental Disorder” or “SUMD” in their title, abstract, or keywords. Due to the language criteria, relevant information published in languages other than English may have been missed. Literature relevant to the present review was identified through MEDLINE; inclusion of other databases may have led to the identification of additional papers that matched the inclusion criteria. However, the main finding of our review is the heterogeneous use of the SUMD, and we may assume that a more exhaustive review would not significantly change this result.

Conclusion

The SUMD is one of the most widely used instruments to measure insight, and it has satisfactory psychometric properties. The SUMD also incorporates the continuous and multidimensional approaches. This measure is unique in its detailed assessment of patients’ awareness of, and attribution for, a wide range of signs and symptoms. However, the use of a modified SUMD may compromise the psychometric properties of this scale, lead to erroneous conclusions and prevent comparison across studies. Our review underlines the need for the standardised use of the SUMD.