Introduction

Studies examining quality of life (QoL) and health-related quality of life (HRQL) have typically used standardised measures [1]. Such instruments are routinely used in clinical trials and make a valuable contribution by taking patients’ perspectives into account when evaluating treatment and care [2]. Standardised instruments usually include a pre-defined set of domains, often with a focus on health status [2], and have therefore been criticised for possibly missing domains important to the individual patient, while at the same time including other domains that might be of less importance. Moreover, they assume that physical limitations, by default, must lead to decreased QoL [3, 4]. Individualised measures have been developed that allow the respondent to choose the most important domains to be evaluated, which overcomes this limitation of standardised measures [5, 6]. Hence, this approach is proposed to be more suitable than standardised measures for capturing the patient’s perspective. Furthermore, monitoring individual patients’ QoL has been proposed as a useful tool for care planning and follow-up of individuals in clinical practice [7]. Individualised measures are derived from an idiographic approach, in contrast to standardised measures, which are based on a nomothetic approach [8]. The nomothetic approach focuses on general laws concerning biology and human behaviour. When adopted for QoL measurement, it gives rise to a dominant psychometric tradition, on which most existing measures are based. The idiographic approach, on the other hand, focuses on the study of individual human beings in order to understand and interpret the uniqueness of the individual in a historical and social context. When adopted for QoL, this approach is influenced by phenomenology and focuses on the psychological processes involved when individuals construct/appraise their QoL [8]. These different approaches are, therefore, not interchangeable, but they may complement one another in research and practice [9].

The most widely used individualised measures are the Patient Generated Index (PGI) [6] and the schedule for the evaluation of individual quality of life—direct weighting (SEIQoL-DW) [10], an abbreviated form of the SEIQoL [5]. Both instruments use semi-structured interviews to collect data and allow the individual to freely nominate areas, followed by a rating and weighting procedure. They differ with regard to their focus, in that the PGI is used to address the impact of the disease on patients’ QoL, while SEIQoL enquires about QoL in general.

When performing the standard assessment using the SEIQoL-DW, participants are first invited to nominate the five domains they currently consider to be the most important in their life. If someone finds it difficult to nominate five domains, a standard list of prompts is used [10, 11]. Secondly, the person is asked to rate how she/he is doing in each of these domains on a visual analogue scale (SEIQoL-VAS), and in the third stage, the patient is asked for the relative importance of each area by a weighting procedure. The two instruments, the original SEIQoL and its abbreviated version, differ in the way in which the weighting is performed. Weighting by the original SEIQoL is based on judgment analysis (JA) and will hereafter be referred to as SEIQoL-JA, whereas the SEIQoL-DW uses a direct and simpler technique for weighting the importance of the nominated domains [10, 11]. Respondents are asked to quantify the relative importance of each area, represented by five differently coloured areas in a pie chart, by adjusting the sizes of the identified life areas. All areas add up to 100, and the area perceived to be of greatest importance should be assigned the largest pie area. Both versions produce an overall QoL Index score to enable comparisons at group level. The Index score is calculated by multiplying the rating of each area with the same domain’s weight and then summing the products. The completion of the PGI is similar to the SEIQoL procedure but differs in how the weighting is performed.
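The Index score calculation described above can be illustrated with a minimal sketch. It assumes that ratings are recorded on a 0–100 scale and that the weights (which add up to 100) are converted to proportions, so that the Index also falls between 0 and 100; this scaling, the function name `seiqol_dw_index` and the example values are illustrative assumptions, not taken from the reviewed papers.

```python
def seiqol_dw_index(ratings, weights):
    """Return the weighted sum of domain ratings.

    ratings: one rating per nominated domain (assumed 0-100 scale).
    weights: relative importance per domain, summing to 100.
    """
    if len(ratings) != len(weights):
        raise ValueError("each rating needs exactly one weight")
    if abs(sum(weights) - 100) > 1e-6:
        raise ValueError("weights must add up to 100")
    # Multiply each rating by its domain's weight (as a proportion) and sum.
    return sum(r * w / 100 for r, w in zip(ratings, weights))


# Hypothetical respondent with five nominated domains.
example_ratings = [80, 60, 40, 90, 50]
example_weights = [30, 25, 20, 15, 10]
print(seiqol_dw_index(example_ratings, example_weights))  # 65.5
```

Because the weights are constrained to sum to 100, increasing the importance of one domain necessarily reduces the weight available to the others.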

Since the two most widely used individualised measures, the PGI and the SEIQoL-DW, have now been available for a decade, there is a need for a systematic evaluation of their strengths and weaknesses [12]. Recently, Martin et al. [13] reviewed the PGI for its psychometric performance. The measure was found to be adequately reliable for group comparisons, and it yielded adequate levels of validity; however, the findings regarding responsiveness were inconclusive. To date, psychometric properties of the SEIQoL-JA have been reported to some extent and show acceptable levels of reliability and internal validity [5, 14, 15]. Regarding the SEIQoL-DW, some papers have included psychometric results of the instrument. However, no systematic evaluation has been performed regarding either version of the SEIQoL [12]. In PubMed, the number of publications using the SEIQoL-JA has decreased over the last 5 years, whereas the SEIQoL-DW is increasingly being used, both in research and in clinical practice. Hence, this report will focus on the SEIQoL-DW. The aim of the present study is to review the published results regarding the use, feasibility and psychometric performance of the SEIQoL-DW in clinical research.

Methods

Selection of papers

A systematic literature review was conducted to identify published articles using the SEIQoL-DW. Papers were included if they reported empirical data of the SEIQoL-DW employed in sample sizes over 30 and if they were published in English. The Medline, CINAHL and PsycINFO databases were searched up to May 2007 for abstracts and articles including the keyword “SEIQoL”, as this was found to be the most comprehensive search term. This search generated 88 papers published in peer-reviewed journals. Fifty-one papers were excluded since they did not report empirical results using the SEIQoL-DW (n = 18), did not present results of the SEIQoL-DW despite having used the instrument empirically (n = 2), were not published in English (n = 8), had a purely qualitative design (n = 1), or used sample sizes of <30 (n = 22). Two additional papers were identified through the reference lists of the reviewed papers, resulting in 39 papers for review. The papers were scrutinised according to a pre-defined checklist based on the criteria described in the following section. For two reasons, authors were not contacted for additional information or clarifications when papers lacked information included in the checklist. First, such a procedure could have introduced a bias by giving an opportunity for some authors to clarify and provide additional information not presented in the paper. Secondly, we decided to include only data that had been peer-reviewed.

Review checklist

The checklist was compiled based on the literature and consensus discussion in the research group. The following information regarding the SEIQoL-DW was extracted from the articles, if explicitly reported by the authors. When this information was not provided, it was registered as not reported (NR). Quality of the reviewed papers was not explicitly graded. As the aim of the study was to report on the use, feasibility and psychometric characteristics of the instrument, and since many studies do not report any psychometrics at all, we would have had to exclude studies had we assigned grades based on psychometric properties. Hence, we decided to present the information available, giving readers access to all the information and the opportunity to judge the quality of each paper themselves. However, we did decide to exclude studies with sample sizes of <30 individuals to increase the quality of the reviewed papers. Further, to make comparisons easier across the studies, we excluded all qualitative studies.

Application

Population (number/age/sex/diagnosis/prognosis if applicable/treatment); setting (recruitment of respondents, e.g. population-based/via in-patient clinics/outpatient departments/general practices/specialised care).

Design

Study objective; hypotheses; study design (e.g. cross-sectional/longitudinal); response rate.

Procedure

SEIQoL-DW version used (standard procedure/modified approach); use of prompt list (yes/no); mode of administration (semi-structured interview/touch screen/telephone interview/postal questionnaire/group setting/other).

Feasibility

Time needed for completion of interview; missing data due to difficulties in understanding the procedure; self-reported or objective data regarding the instrument’s acceptability and feasibility.

Analytical approach to qualitative SEIQoL data

Description of content analysis; check of the content analysis (e.g. consensus discussions within research team/assessing inter-rater agreement); presentation of nominated cues (e.g. percentage of most commonly nominated cues/categories of cues).

Construct validity

Internal scale structure (correlation between different components within the instrument, e.g. between ratings and weights); convergent validity and discriminant validity (i.e. correlation between the SEIQoL-DW Index score or the SEIQoL-DW VAS, and other measures). Following Cohen [2], correlation coefficients of <.49 are interpreted as lack of convergent validity and coefficients of >.49 as evidence for convergent validity. Regarding discriminant validity, coefficients of <.49 are interpreted as evidence for discriminant validity and coefficients of >.49 as lack of discriminant validity. Based on the original intention of the instrument [16], we hypothesised that the SEIQoL Index would relate moderately to strongly to other global or overall QoL scales, for example the EORTC QLQ-C30 Global QoL scale, QoL VAS, and measures of life satisfaction, mental health and social functioning (convergent validity). Conversely, the SEIQoL Index score was hypothesised to relate weakly to measures of physical health and functional status (discriminant validity). Correlation coefficients will be reported according to these hypotheses. Correlation coefficients of <.29, .30–.49 and >.49 are interpreted as small, moderate and large, respectively [2].
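The size categories for correlation coefficients given above can be sketched as a small helper. The function name `correlation_size` is hypothetical; the sketch also assumes that coefficients are reported to two decimals (so that everything below .30 counts as small) and that the sign of the coefficient is ignored when judging magnitude.

```python
def correlation_size(r):
    """Classify a correlation coefficient as small, moderate or large.

    Thresholds follow the checklist: <.30 small, .30-.49 moderate,
    >.49 large. Using the absolute value is an assumption.
    """
    magnitude = abs(r)
    if magnitude > 0.49:
        return "large"
    if magnitude >= 0.30:
        return "moderate"
    return "small"


# Coefficients of the kind reported in the reviewed papers.
for r in (0.26, -0.46, 0.50):
    print(r, correlation_size(r))
```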

Criterion-based validity

Associations with demographic, clinical and other non-self-reported data; known-group comparisons, that is, comparisons of subgroups known to differ in QoL or health.

Assessment over time

Test–retest reliability (i.e. of cues, ratings, weights and/or Index scores when circumstances can be assumed to have remained stable during the study time; thus, in the absence of intervention and/or disease progression). Responsiveness (i.e. sensitivity to change over time due to an intervention or disease progression in content of areas, number of areas, ratings of areas, weights of areas and/or Index scores).

Results

A detailed description of the information extracted from the reviewed papers is presented in an “Electronic supplementary material (Appendix)” linked to this paper.

Application

Most studies (31/39) examined adult patients or former patients [10, 15, 17–45], with two studies additionally investigating the caregivers’ perceptions [46, 47]. Five studies reported data of exclusively non-patient populations [11, 48–51], and one study investigated children [52]. Diagnoses, prognoses and settings varied across studies. The most common diagnosis was cancer (n = 10) [15, 19, 20, 24, 30, 31, 40–42, 45], followed by neurological disorders (n = 8) [17, 19, 21, 23, 27, 28, 43, 46].

Design

The majority of papers (n = 31; 79%) presented cross-sectional results [10, 15, 17–21, 23–42, 46, 48, 50–52], and eight papers reported results from longitudinal studies including at least two assessments [11, 22, 32, 43–45, 47, 49].

Procedure

The standard procedure of the SEIQoL-DW was used in a majority of the presented studies (n = 26) [10, 15, 19–21, 23–30, 34–38, 43, 44, 46–49, 51, 52]. Three papers presented results from an extended version including a disease-related part (SEIQoL-DR) assessing the domains most affected by disease and treatment at an individual level [40–42].

The wording of the questions for rating the nominated domains varied (Table 1). Thirty-eight percent of the studies (n = 15) asked participants to rate their level of functioning or their current status with regard to the nominated domains [15, 18, 23–25, 30–35, 37, 45, 47, 52]. In line with this, some studies asked more specifically for the quality of each nominated domain: for example, “… rate each domain between best possible and worst possible …” [10, 11, 21, 22, 49, 50]. Ten of the reports explicitly asked participants to rate the nominated domains regarding satisfaction [19, 20, 26, 28, 36, 39–42, 53], and four additional studies asked for a rating of functioning and/or satisfaction [27, 44, 46, 48].

Table 1 Framing of question for rating nominated domains within the SEIQoL-DW, total number of reviewed papers N = 39

Eighteen percent of the papers (n = 7) reported using a prompt list for patients who were not able to spontaneously nominate five areas of importance [10, 15, 20, 35, 37, 45, 52]. One study permitted participants to nominate fewer than the recommended five areas if they had difficulties spontaneously coming up with five areas [40–42].

The mode of administration was reported in 30 papers. A majority (n = 28) of these had used semi-structured interviews [10, 15, 19, 20, 23–31, 34–38, 40–43, 45–48, 50, 52], while one study tested a touch-screen version of the instrument [49] and one study administered the instrument in written form [51].

Feasibility

The time for completing the interviews was presented in 36% of the papers (n = 14) and the reported mean ranged from <5 to 50 min. Missing data were reported in 10 studies [22, 24, 25, 29, 32, 34, 35, 37, 43, 52] and ranged from 8 to 83% of participants failing to complete the procedure. Reported reasons for this included confusion, distress, fatigue and difficulty understanding the task. The paper reporting the highest rate of missing data (83%) investigated frail older people living in nursing homes [29]. Five studies reported failure to nominate five domains [24, 40–42, 45].

Difficulties using the disc due to power loss in arms and hands for patients with amyotrophic lateral sclerosis (ALS) were reported in one study [23]. For these patients, the disc was adjusted by the interviewer. Many papers commented that patients found the SEIQoL-DW acceptable [10, 11, 23, 30, 36, 40, 44, 49, 50, 52], whereas one study assessing stroke survivors described that participants had difficulty following the instructions [25]. One study reported that the majority of nursing home residents, frail old people, were unable to complete the assessment due to poor physical condition or confusion [29]. For the disease-related SEIQoL-DW, included in three of the studies, the number of nominated areas influenced by disease varied from none to five [40–42]. However, no missing data due to failure to understand the procedure were found for the SEIQoL-DR version [40].

Analytical approach to qualitative SEIQoL data

The majority (n = 25; 64%) of the papers did not describe the method for analysing the qualitative data, whereas 11 papers gave some description of the analysis [18, 20, 22, 28, 33, 34, 37, 40, 42, 43, 50]. However, these descriptions were, in general, very brief: for example, “cues aggregated into groups” [20, 22, 28, 43]. Only one paper reported inter-rater agreement [37].

Construct validity

Internal scale structure was examined in four papers. Moons et al. [32] investigated the association between actual status (rating) and relative importance (weighting) of the nominated domains and found the correlation coefficient to be small (r = .26). Another study correlated the individually assessed weights and ratings of the five cues when ordered from the most to the least important, and found a stronger association between weights and ratings of domains weighted with less importance [50]. Evaluation of the weighting procedure by analysing correlation coefficients of weighted versus unweighted Index scores with scale scores of the SF-36 showed no evidence of any impact of the weighting procedure on the Index score [42]. In another paper reporting on the relation between the Index score produced by the standard version, and the corresponding score produced by the disease-related version, the coefficient was found to be large (.50) [41].

Convergent validity, examined by correlations between the SEIQoL Index score and other self-reported measures of overall QoL, mental health and social function, was presented in 36% of the papers (n = 14) [17, 20, 21, 25, 26, 30, 33, 35, 36, 39, 41, 45, 46, 48]. In some cases, the magnitude of the coefficients was not reported and the results were only presented as non-significant. However, 25 correlation coefficients were provided for convergent validity (Table 2). All but four of these coefficients were either moderate (n = 13) or strong (>.49; n = 8), indicating evidence for convergent validity. Two studies investigated associations based on regression analyses [17, 23]. Lee et al. [23] found the SEIQoL Index score to be highly related to depression (beta = −.46), which explained more than 30% of the variance (R² = .34). In another study, 44% of the variance of the SEIQoL Index score was found to be explained by perceived social support, religiosity, depression and social status [17].

Table 2 Convergent validity as measured by associations between the SEIQoL-DW Index score and other self-reported measures of emotional function, quality of life, life satisfaction and social function and support

Discriminant validity was measured by correlations between the SEIQoL Index score and self-reported measures of health and functional status in eight studies (Table 3) [20, 25, 26, 28, 32, 35, 42, 43]. Four of the coefficients were moderate (.30–.49), while eight were weak (r < .30). One paper presented the correlation between the SEIQoL Index and measures of physical and functional status to be non-significant [21].

Table 3 Discriminant validity as measured by associations between the SEIQoL-DW Index score and self-reported measures of health and functional status

Criterion-based validity

Eleven papers presented results of correlations between the SEIQoL Index score and clinical characteristics [20, 22–24, 28, 33, 34, 39–41, 43] and five presented results of relations to demographic parameters [24, 26, 28, 40, 51]. These results were most often presented as non-significant, with the exception of three papers which reported small coefficients between the Index score and clinical variables, that is, S-albumin [39]; disease stage, treatment modality and time since diagnosis [41]; and disease severity and heart functional status [33]. Furthermore, patients with malignant cord compression evaluated to have low Karnofsky performance scores (poorer function) had significantly lower SEIQoL-DW Index scores [24]. Another study that compared patients diagnosed with ALS with cancer patients found that the latter had significantly lower Karnofsky and SEIQoL Index scores than the ALS patients [19]. The results regarding demographics (age and sex) were mostly presented as non-significant [24, 28, 39, 40, 48, 51]. Only one paper reported coefficients for age in two samples, which were weak and negligible [26]. Criterion validity based on the content of the nominated areas was examined in three studies [32, 34, 42]. Moons et al. [32] found that unemployed respondents reported lower functioning of the cues ‘job/education’ and ‘financial means’ than their employed counterparts. In another paper, the same author reported that the cues ‘health’ and ‘family’ were more frequently nominated with increasing age, while the cue ‘friends’ was less frequently nominated by older patients [34]. When it comes to known-group comparisons, the median Index score was found to be higher for healthy couples than for ALS patient–caregiver couples (P = .001) [28]. One of the papers based on the disease-related SEIQoL-DW showed that those who reported several life domains affected by the disease also rated their physical and mental health, as measured with a standardised questionnaire, as worse than those reporting only one or no area being negatively influenced by disease [42].

Assessment over time

Test–retest reliability was assessed in only three studies, where circumstances were assumed to have remained stable [11, 32, 49]. Browne et al. [11] studied change in weights and found a mean change of 4.5 points over a 1-week period. Moons et al. [32] did not find any change of the SEIQoL Index scores over 1 year in medically and psychosocially stable patients with congenital heart diseases. Ring et al. [49] studied change in the content of cues in a stable non-patient sample and found that 35% of the interviewed students picked a new area at the second appointment.

Three papers reported on responsiveness by examining change of the SEIQoL Index score [22, 32, 45], and two papers investigated change of cues [44, 45]. Contrary to the hypothesis, the SEIQoL Index score was not found to have increased 2 years after receiving a pacemaker, even though a change was detected at 1 month [22]. In patients with metastatic cancer, the Index score was hypothesised to improve over 6 months, due to response shift, which was verified [45]. Moons et al. [32] hypothesised that patients with congenital heart diseases experiencing complications leading to a change in health status would not necessarily report a corresponding decrease in the Index score. In line with this, the authors found that a deterioration in health status corresponded to an increase in Index score [32]. Two of the papers reported on change of nominated cues over time [44, 45]. One of these studies reported that as many as 81% of the patients nominated at least one new cue between two assessments, at 3 months compared to baseline [44], and the corresponding figure reported for metastatic cancer patients was ~50% over a period of 6 months [45].

Discussion

We reviewed empirical studies using the SEIQoL-DW to assess QoL, focusing on the instrument’s use, feasibility and psychometric performance. The SEIQoL-DW has been included in studies of a variety of populations, with samples of both healthy individuals and patients, including those who are severely ill. Several papers commented that those completing the instrument were generally positive towards it, including those who were quite disabled and not able to conduct the weighting procedure without assistance. According to the findings, the SEIQoL-DW appears to be a feasible and valid instrument for use in quantitative research in persons with the required cognitive capabilities, with a limited burden on participants and, overall, with few missing data. However, one of the papers reported that more than 70% of the participating frail old people living in a nursing home failed to complete the procedure. Hence, the SEIQoL-DW does not seem suitable for use in this population [29]. A recent qualitative study confirms the instrument’s feasibility outside the research setting as well. Patients and doctors who had tried the instrument in routine care believed that it was practical and may support monitoring of patients’ QoL in relation to care and treatment [54]. To date, few studies have evaluated the qualitative elements of the instrument, that is, the nomination of cues, making it difficult to draw firm conclusions on this aspect. Cues are most often reported descriptively on a group level. Despite the individual approach being emphasised when describing the SEIQoL-DW, most published reports present results based on the quantitative parts of the instrument. Issues that need to be further evaluated include what the nominated areas mean to the interviewees; for example, are the most important areas the ones reported, and how should cues be categorised without losing the individual’s notion of the nominated area? Interviewing respondents according to the CASM (Cognitive Aspects of Survey Methodology) [55] may contribute important information about the nomination procedure, as well as about how respondents reason when performing the rating and weighting procedure.

We hypothesised that the SEIQoL-DW would correlate moderately to strongly with measures of global QoL, life satisfaction and mental health, and weakly with measures of functional status and health. The results of convergent and discriminant validity support these hypotheses. This lack of relation between the Index score and health and functional status as well as demographic parameters may be explained by the instrument being idiographic and reflecting the capacity of a patient to appreciate and value important areas in life, despite health problems [15, 56]; for example, due to effective coping behaviour. Only a few studies examined criterion validity through study of the content of areas [32, 34, 42]. In contrast to the non-significant results regarding the Index score, the studies analysing criterion validity through nominated cues found that the SEIQoL-DW and the disease-related SEIQoL-DW reflected poor health [42], unemployment [32] and expected differences related to age [34]. The lack of relation to clinical criteria and functional status may be explained by the context in which patients have been assessed. Whereas many of the reviewed studies included patients with severe physical conditions, such as cancer and ALS, these patients were approached later in the disease trajectory, allowing adjustment to their situation. It would be of interest to follow newly diagnosed patients over time to see what information the SEIQoL-DW can add in situations of unexpected changes in QoL and health.

As only a few of the included studies employed a longitudinal design, it is difficult to draw firm conclusions about the test–retest reliability and responsiveness of the SEIQoL-DW. The two papers that assessed test–retest reliability of the Index score and weights found it acceptable, which is promising [11, 32]. The only study that had analysed test–retest reliability of cues found that more than a third of a sample of students picked a new area at a second assessment 3 months later [49]. The change in cues is in line with results found in patients with metastatic cancer [45], patients undergoing stem cell transplantation [56] and receiving endotolous implantation [44], as well as in a study using the SEIQoL-JA [57], where a significant proportion of patients were found to pick a new domain when re-assessed. Those who have investigated whether change of cues is related to change of ratings or Index scores have not found any evidence for such a relation [11, 32, 45, 49, 56, 57]. Content of nominated domains appears to change over time irrespective of whether the situation is stable or not. If a patient, for example, nominates ‘relationship to a partner’, the couple may break up, and at the next assessment the same patient may nominate ‘dating on the Internet’. This is a new domain; nevertheless, it is based on a similar area of life. One may suspect that a publication bias concerning change of cues (content) exists due to the described difficulties in analysing it, that is, in how to define a change, and how to analyse change over time [45, 56]. Only three of the studied papers had predicted a change of the SEIQoL Index score, and for two of these the stated hypotheses were verified [44, 45]. Furthermore, no reviewed paper had evaluated responsiveness by standardised methods such as the standardised response mean (SRM) [2]. Thus, continued evaluation of responsiveness is recommended.
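To illustrate the standardised response mean mentioned above, the following is a minimal sketch using its common definition: the mean change score divided by the standard deviation of the change scores. The function name and the paired Index scores are hypothetical and are not data from any reviewed study.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline, follow_up):
    """Mean of paired change scores divided by their sample SD."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    spread = stdev(changes)  # sample SD; needs at least two pairs
    if spread == 0:
        raise ValueError("SRM is undefined when all changes are identical")
    return mean(changes) / spread


# Hypothetical Index scores for four respondents at two assessments.
baseline_scores = [60, 55, 70, 65]
follow_up_scores = [66, 57, 78, 69]
print(round(standardised_response_mean(baseline_scores, follow_up_scores), 2))  # 1.94
```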

The wording used in the rating step varied between the reviewed papers, and the potential framing effects of different wording warrant further attention. The paper by Hickey et al. [10], introducing the SEIQoL-DW, uses the phrasing: “How would you rate yourself on each of these areas at the moment, on a scale from the worst possible to the best possible?” Hence, the focus of the rating is not clearly specified. Further evaluation is needed to assess whether differences in wording have any impact on the Index score or whether functioning, satisfaction and quality can be used interchangeably, as is now the case. The weighting procedure is another area of growing interest. One of the papers revealed that the Index scores are largely ‘driven’ by the satisfaction ratings, with the importance ratings (e.g. weighting with the disc) having only a minor impact [42], thereby not adding any extra information. Another paper found that the cues given the least weighting had the highest correlations with the ratings of the same domains [50]. The weighting procedure also limits the respondent’s freedom to weight the domains, since the weights must add up to 100. In a few of the reviewed studies, participants were allowed to choose fewer than five areas of importance, and it was shown that the number of nominated cues did not have an impact on the overall Index score [40, 45]. The difficulties in interpreting combined satisfaction and importance ratings have previously been pointed out [58, 59], and the impact of the weighting procedure on the overall score needs further exploration. Another aspect to consider is that the weighting procedure does not only contribute to the Index score. It may also be of great value at the individual level by identifying areas that are especially important to the individual patient. This may be of special importance if used in a clinical practice setting, where the instrument may support prioritisation in relation to clinical decision-making.

Originally, the SEIQoL-DW and the SEIQoL-JA were administered through individual semi-structured face-to-face interviews. To date, the instrument has been administered in group settings (unpublished data, O’Boyle et al.), by telephone interviews [60], and by self-administration [18]. Furthermore, a computer-administered version has been developed [49]. As presented in the “Results” section, an extended form of the SEIQoL-DW, focusing on disease-related QoL, has been developed, showing promising results [40–42]. New versions of the instruments and new means of data collection will allow wider applicability. The disadvantage of this multiplicity of approaches is, however, that it may compromise direct comparability between studies.

Study limitations

The total number of reviewed papers is small (n = 39), reflecting the overall limited number of studies using the SEIQoL-DW compared to traditional standardised instruments such as the short form health survey (SF-36) [61]. Further, even though only studies with sample sizes of more than 30 participants were included, these studies may still have been underpowered, especially in cases where sub-samples were analysed.

Conclusion

The SEIQoL-DW has been included in studies of a variety of populations, in samples of both patients, including those who are severely ill, and non-patients. The instrument appears to be feasible even among those who are quite disabled, and the overall internal attrition is low in cognitively capable respondents. Construct validity assessed by means of convergent and discriminant validity was shown to be acceptable. Taken together with the results of the assessment of criterion-based validity, the SEIQoL Index score seems to tap into a different construct than physical health and functional status, which supports the instrument’s intended focus on individualised QoL. Responsiveness of the instrument remains unclear at this stage because few studies have examined it; continued psychometric evaluation, including SEMs, in larger populations with a longitudinal design is therefore recommended. The qualitative part of the SEIQoL-DW, for example, the content analysis of cues, and the forms of administration are other issues that need to be further studied. In conclusion, although some aspects require further investigation, the SEIQoL-DW is found to be a feasible and valid complement to standardised measures for use in clinical research.