1 Introduction

In what later became a citation classic from Social Indicators Research (Michalos 2005), Tom Atkinson (1982) published results from a panel study involving “a representative sample of 2,162 Canadians interviewed in 1977 and again in 1979”. Apart from some results dealing with a subset (N = 285) of the national sample used in Campbell et al. (1976), this was the first published report “on the stability of QOL [quality of life] measures over time”. The particular measures employed included a single-item life satisfaction measure with 11 response categories running from 1 = completely dissatisfied to 11 = completely satisfied, similar to those used in Great Britain and in the European Community; an 11-step (0–10) self-anchoring ladder scale adapted from Cantril (1965); the Gallup 3-step (“very happy, fairly happy, or not too happy”) happiness scale; and single-item domain satisfaction scales for five domains: job, finances, housing, health and marriage/romance. A General Quality of Life Index was formed by combining responses for the life satisfaction, happiness and Cantril scales, and a Domain Satisfaction Index was formed by combining domain satisfaction scores. In order to assess levels of change in people’s lives from the first to the second survey, respondents were asked in the second wave if their current status was the same, better or worse, for life in general and for specific domains. They were also asked which of 16 significant life events they had experienced in that period, e.g., divorce/separation, serious injury, new job or house.

His most important conclusions were summarized as follows.

“Opinions have been expressed in some quarters that subjective measures such as satisfaction are poor social indicators because they were so conditioned by expectations and restricted awareness as to be insensitive to changing circumstances. It is also argued that expectations and aspirations adjust very quickly to new situations and that satisfaction and other measures revert to their original levels immediately. This position would have led to predictions that (a) very few individuals would indicate any change in their situation, and/or (b) virtually no adjustment in subjective indicators would occur when changes did occur.

Our findings contradict both hypotheses in that significant numbers of respondents perceive changes in their lives and those changes were reflected, for better or worse, in their satisfaction levels. The fact that these changes took place over a two year period indicates that, while adaptation probably does occur, it is not instantaneous and will be detected by an indicator series which utilizes fairly frequent measurements” (Atkinson 1982, pp. 128–129).

Because the questionnaire included respondents’ perceived changes (subjective indicators) and numbers of specific life events experienced (objective indicators), Atkinson was able to compare levels of association between each of these measures and QOL measures. Among other things, he found that

“The hypothesis that the No Change group would have more stable QOL scores than those reporting change is supported when the perceptual indicator of change is used as the independent variable but not when the event measures are involved…[As well] Relationships between self-reported change and satisfaction levels are always higher than between events and changes in satisfaction” (Atkinson 1982, pp. 122, 126).

About a decade after the Atkinson article appeared, Headey and Wearing (1992) published results of their Australian/Victorian Quality of Life Panel Study that monitored the lives of “…a representative sample of 942 people in Australia’s most densely populated State, Victoria”. Panel members were interviewed in five waves for 1981, 1983, 1985, 1987 and 1989, with the “central question posed” being “What causes change in people’s levels of well-being?”. Given the 5 waves of data and a diverse array of particular measures, these authors were able to develop and test some very sophisticated models to explain changes in respondents’ reported wellbeing. Their main preference was for something they called “a dynamic equilibrium model”. In a sequel to this paper we will address the causal question, the model of Headey and Wearing, some others and especially Multiple Discrepancies Theory (MDT; Michalos 1985). In this paper our focus is on measuring stability and sensitivity to change following Atkinson’s work. We are concerned here to show that appropriate changes in variable values and correlations did or did not occur, leaving the question of why to the sequel.

Headey and Wearing had a list of 93 kinds of life events that respondents might experience, including all the discrete, relatively objective events used by Atkinson (e.g., respondent got married, lost a close family member through death, got a new job) and a variety of continuing or possibly repetitive occurrences (e.g., respondent had arguments with children, increased the number of family outings, took courses that “seemed pointless”). For each event, respondents were asked if it occurred in the past 2 years, when, if it was continuing, if not then when it ended and its level of satisfaction or distress on a 10-point scale (Headey and Wearing 1992, pp. 30–31, 193–208). The dependent variable that was tracked over time with the occurrence of various life events was called the “life satisfaction (2) index”. It was an average score based on two single-item, 10-point measures, one of life satisfaction and another of life fulfillment. The correlation of the values of this index for 1981–1983 was r = 0.64 and for 1981–1989 it was r = 0.43. On the basis of several analyses using their “dynamic equilibrium model” the authors concluded that “life events in fact have a significant impact over and above personality”, that “favourable events seem to have at least as much impact on well-being as adverse events” in the expected directions and that, therefore, whatever adaptation there is to new circumstances is not so rapid as to put people on “a hedonic treadmill” (pp. 138–143).

In another citation classic from Social Indicators Research, Veenhoven (1994) summarized overtime correlations of a variety of happiness-related variables in 26 longitudinal studies. (Atkinson’s study was not one of the 26.) The variables were all employed as dependent and included Bradburn’s Affect Balance Scale, some single-item measures of happiness, life satisfaction, contentment with life, appreciation of life and cheerfulness. The time-spans involved ranged from 4 months to 40 years, although the latter were based on interviewer assessments rather than self-reports of happiness. In a nutshell, he concluded that those studies plus others reviewed in the article showed that “happiness is no trait…[because]…happiness is not temporally stable…is not situationally consistent…and not entirely an internal matter” (Veenhoven 1994, pp. 145–146). Considering the 26 studies as a whole, he wrote,

“It appears that the stability of happiness is a short-term matter. The highest correlations concern mainly time lags of several months; overtime correlations are around +0.60. Over the years the correlations drop considerably. After five years overtime-correlation is almost halved and varies around +0.30. Over periods of ten to fifteen years the correlation shrinks back to about +0.15. Extrapolation of this trend predicts complete disappearance of all overtime-correlation after twenty years” (p. 109).

Taking into consideration measurement errors, he thought that “the true stability of happiness is probably somewhat higher than the correlations suggest…about +0.77 after one year, about +0.50 after 5 years, and about +0.40 after 10 years” (p. 110).

Given the mixed bag of happiness-related measures used in the 26 studies, these are fairly bold estimates. Only two of the 26 studies provided somewhat comparable data for our purposes. Using a single-item measure of happiness, Eels (1985) reported a two-year correlation of r = 0.41 for an adult sample of N = 1,188 in Nebraska, and using a single-item, 10-step life satisfaction measure, Landua (1992) reported a two-year correlation of r = 0.46 for an adult German sample of N = 7,091. More to the point, based on results from 7 longitudinal studies (2 from the set of 26 plus 5 new ones), Veenhoven found that the values of some sort of happiness-related dependent variables did decrease for people who became unemployed, had a spouse die, got divorced or had a net balance of negative versus positive events in their lives between surveys taken in time-spans of from 6 months to 11 years (pp. 121–124).

Ehrhardt et al. (2000) studied changes and stability in values of a 0-10-point single-item scale of life satisfaction over the eleven-year period from 1984 to 1994 using the German Socio-Economic Panel. Respondents were interviewed annually, beginning with an N = 9,000 and including N = 5,483 people who responded to the life satisfaction question in each survey. For present purposes the following results should be noted.

“Year-to-year correlation started at +0.45 and increased gradually to +0.54. The correlation between the first and later reports declined through the years, the correlation between the 1st and the 11th report was only +0.29…life-changes explained 30% of the variance…These results mark a considerable mobility along the life-satisfaction ladder in a modern society: over a lifetime less than 30% of the original rank order in life-satisfaction will be left. That outcome is at odds with common theories of class and personality” (Ehrhardt et al. 2000, p. 177).

The correlation between the first and third year values of their dependent variable was r = 0.38.

So far as we know, Richard E. Lucas and colleagues have made the most extensive use of the German Socio-Economic Panel in explorations of stability and change over time, using a 0-10-point single-item scale of happiness. Lucas et al. (2003) traced the impact of “marital transitions” on values of this variable over 15 years with an N = 1,761 respondents who “began the study unmarried, became married at some point during the study, and stayed married until the most recent wave of data (or until they were unreachable)”. (Over 24,000 individuals participated in the surveys.) Their general “take-home message” follows.

“Brickman and Campbell (1971) were correct that adaptation to events does occur. People initially react strongly to both good and bad events, but then their emotional reactions dampen. Headey and Wearing (1989) were also correct that people return to a positive rather than to a neutral baseline, and this baseline is probably influenced by one’s personality. Our study adds to the understanding of adaptation to marital transitions by following individuals for many years before and after events have occurred. Our results show that (a) selection effects appear to make happy people more likely to get and stay married, and these selection effects are at least partially responsible for the widely documented association between marital status and SWB [Subjective Wellbeing]; (b) on average, people adapt quickly and completely to marriage, and they adapt more slowly to widowhood (though even in this case, adaptation is close to complete after about 8 years); (c) there are substantial individual differences in the extent to which people adapt; and (d) the extent to which people adapt is strongly related to the degree to which they react to the initial event – those individuals who reacted strongly were still far from baseline levels years after the event…Thus, life circumstances are necessary for our understanding of long-term SWB; all happiness is not due to temperament” (Lucas, Clark, Georgellis and Diener 2003, p. 538).

In another study using the German Panel over the same 15-year period, Lucas et al. (2004) showed that

“In accordance with set-point theories, individuals reacted strongly to unemployment and then shifted back toward their baseline levels of life satisfaction. However, on average, individuals did not completely return to their former levels of satisfaction, even after they became reemployed. Furthermore, contrary to expectations from adaptation theories, people who had experienced unemployment in the past did not react any less negatively to a new bout of unemployment than did people who had not been previously unemployed. These results suggest that although life satisfaction is moderately stable over time, life events can have a strong influence on long-term levels of subjective well-being” (Lucas et al. 2004, p. 8).

Using the German Panel to track changes over 18 years, Lucas (2005) showed that

“…there is considerable variability in the extent to which people react and adapt to major life events…just as people vary in their average levels of happiness, people vary in the degree to which their happiness changes following life events. Many people actually experience an increase in satisfaction following their divorce, whereas many others experience a greater drop in satisfaction…This study adds to a small but growing body of research showing that adaptation is not always quick and complete…Reactions vary for different events and even for different individuals who experience the same event…some people may never adapt to some life events, at least not without intervention. The challenge for future research will be to identify which events cause lasting changes and why” (Lucas 2005, pp. 948–949; see also the summary paper by Diener et al. 2006 and Chapter 6 of Diener et al. 2009).

Reflecting on all these longitudinal studies, then, it is fair to say that at least for some single-item measures of life satisfaction, there is compelling evidence that variable values change with some significant changes and/or perceived changes in people’s life circumstances, i.e., these life satisfaction measures are appropriately sensitive to changes in life circumstances over considerable lengths of time. In the present article, we provide some evidence from 3 relatively small panels tracked over 3 years showing that among 27 frequently used quality of life variables and indexes there is considerable variety in their sensitivity to changes in life circumstances measured in different ways. Efforts put into exploring the sensitivity of a few single-item, life satisfaction measures must be extended to these frequently used other variables and indexes.

The structure of this paper is as follows. In the next section we describe our questionnaires and sampling technique. Sect. 3 describes our sample characteristics. Descriptive statistics and test–retest correlations of our substantive variables are presented in Sect. 4. Sect. 5, the longest, describes a variety of bivariate relationships, tracking changes in the values of our variables and correlations among them over time and with different measures of changing conditions. A conclusion follows.

2 Questionnaires and Sampling Technique

Two questionnaires were used in our study. The first questionnaire was 5 pages long. It began with 33 items asking respondents to assess their levels of satisfaction with 30 aspects or domains of life (e.g., housing, job, family relations, air quality) and with their life as a whole, their overall standard of living and the overall quality of their life. The scale used had 7 response categories running from 1 = very dissatisfied, 2 or 3 = dissatisfied, 4 = even balance of satisfaction and dissatisfaction, 5 or 6 = satisfied, to 7 = very satisfied. These items were followed by a happiness item also containing 7 response categories running from 1 = very unhappy to 7 = very happy. A self-reported general health item came next, with 5 response categories running from 1 = poor to 5 = excellent. There were then 7 items designed to test some of the basic hypotheses of Multiple Discrepancies Theory (MDT, Michalos 1985), e.g., Considering your life as a whole, how does it measure up to your general aspirations or what you want out of life?, How does it measure up to the best in your previous experience? Eleven items were included to allow us to construct the Satisfaction With Life Scale (SWLS) from Diener et al. (1985) and the Contentment with Life Assessment Scale (CLAS) from Lavallee et al. (2007). Finally, there were 6 demographic items, e.g., gender, age, highest level of education completed.

The second questionnaire had all of the items in the first one plus items borrowed from Atkinson (1982) designed to measure changes in respondents’ life circumstances since the previous survey. First there was an item asking “Since the last survey, would you say that your life has (1) become better, (2) become worse or (3) stayed pretty much the same?”. Following that, there was a list of 16 questions about events that may or may not have occurred in the lives of respondents since the last survey, e.g., Did you get married?, Become separated or divorced?, Did you have a close friend die? For each event, respondents were asked to indicate whether it occurred or not; if so, in what month and year, and whether it made their life (1) better, (2) worse or (3) had little or no effect. An event was only counted if it occurred within the last year of the survey and counted as negative or positive as indicated in the complete list of events given in the Appendix.
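The counting rule for events can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions: the names Event, months_between and net_balance, the example events and their positive/negative classifications are hypothetical, and reading "within the last year" as a window of 12 calendar months before the survey date is our assumption rather than something specified in the questionnaire.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    name: str        # illustrative label, e.g. "got married"
    occurred: date   # reported month and year; day fixed at 1 by assumption
    positive: bool   # classification assumed here; the paper uses its Appendix


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def net_balance(events: list[Event], survey: date) -> int:
    """Positive minus negative events within 12 months before the survey."""
    balance = 0
    for e in events:
        gap = months_between(e.occurred, survey)
        if 0 <= gap <= 12:   # assumed reading of "within the last year"
            balance += 1 if e.positive else -1
    return balance


events = [
    Event("got married", date(2005, 8, 1), True),
    Event("close friend died", date(2005, 11, 1), False),
    Event("new job", date(2004, 6, 1), True),   # outside the window
]
balance = net_balance(events, survey=date(2006, 2, 1))
```

A respondent with a balance of zero would fall in the "equal number of positive and negative events" group, and one with a non-zero balance in the net positive or net negative group.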

Two random samples of households were taken from the whole province of British Columbia and one was taken from the city of Prince George. One British Columbia sample (BF) and the Prince George sample (PG) were taken in February 2005, and a second British Columbia sample (BM) was taken in May 2005. These initial surveys contained a separate sheet asking respondents if they would be willing to complete the same questionnaire plus a page of new items two more times in the future, one in February 2006 and a final one in February 2007. Respondents who agreed to complete two more questionnaires became the initial members of one of 3 panels, but only those who actually completed questionnaires at three points of time were retained in panels.

3 Sample Characteristics

A total of 462 people distributed across three different panels (BF, PG, BM) completed questionnaires at three points in time. Members of panels BF and PG completed questionnaires at two 12-month intervals (February 2005, 2006, 2007) and members of panel BM completed questionnaires at one 9-month interval (May 2005–February 2006) and one 12-month interval (February 2006–February 2007). For all analyses, we treated panel BM the same as the other two.

Table 1 summarizes some demographic features of the total sample and each of the three panels. Two hundred and forty-five (53.0%) of the total sample were females and the mean age for the sample was 54. One hundred and sixty-eight (38.0%) of the total sample completed university, 171 (40.0%) were employed full-time and 268 (60.0%) were married.

Table 1 Demographics for panel survey 2005 to Feb./May 06 to Feb. 07 (in PanelTable_2007_Summary_Demographics.doc)

Panel BF was the largest with N = 192 or 41.6% of the total sample, panel BM was second with N = 151 (32.7%) and panel PG was smallest with N = 119 (25.7%). The largest panel (BF) had the smallest percentage of females (50.5%) and full-time employed respondents (34.2%), and the highest mean age of respondents (55). The middle-sized panel (BM) had the largest percentage of females (57.0%) and respondents who had completed university (43.7%), the smallest percentage of married respondents (54.5%) and the lowest mean age of respondents (52). The smallest panel (PG) had the largest percentage of full-time employed (48.7%) and married respondents (64.7%), and the smallest percentage of respondents who had completed university (30.4%). The panels are not representative of the province or the city.

4 Descriptive Statistics and Test–Retest Correlations

Table 2 lists the 27 substantive variables of primary concern in this essay with their mean scores of three waves for each panel, their mean scores and standard deviations for the total sample, and the Ns available for the total score calculations. Each variable was standardized to have seven response categories and each one was put into one of five clusters, with subtotal means and standard deviations recorded for each cluster.

Table 2 Panel survey 2005 to Feb./May 06 to Feb. 07 (in PanelTable_2007_SummaryTot_3Waves_Std)

There are seven overall life assessment variables, including four single-item measures (satisfaction with life as a whole and with the overall quality of life, happiness and general health) and three indexes (SWLS, CLAS and SWB). SWB is a 4-item index of Subjective Wellbeing calculated by combining scores on four single-item measures (satisfaction with life as a whole, with the overall quality of life and one’s standard of living, and happiness). Following these two clusters, there is a cluster of seven single-item discrepancy measures for the gap between what respondents have and want (self-wants), others of the same sex and age have (self-others), deserve (self-deserve), need (self-need), expected 3 years ago to have (self-progress), expect to have in 5 years (self-future), and the best they ever had (self-best). Then, there is a cluster of ten domain satisfaction items for home, neighbourhood, family relations, living partner, job, friendships, religion/spirituality, financial security, recreation activities and self-esteem. Finally, there are three domain satisfaction indexes for physical/mental health, environment and government. The measure for health is the mean of the physical and mental health satisfaction items, the measure for the environment is the mean of the air, land and drinking water satisfaction items and the measure for the government is the mean of the federal, provincial and local government satisfaction items.
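The construction of the three domain satisfaction indexes as means of their component items can be sketched as follows. This is a minimal sketch with hypothetical data: the key names, the example responses and the rule of returning no score when a component item is missing are assumptions made for illustration, not details reported for the original analysis.

```python
from statistics import mean

# Hypothetical responses from one respondent on the 7-point satisfaction scale.
respondent = {
    "physical_health": 5, "mental_health": 6,
    "air": 6, "land": 5, "drinking_water": 7,
    "federal_gov": 3, "provincial_gov": 4, "local_gov": 5,
}

# Item groupings as described in the text; the key names are illustrative.
INDEX_ITEMS = {
    "health": ["physical_health", "mental_health"],
    "environment": ["air", "land", "drinking_water"],
    "government": ["federal_gov", "provincial_gov", "local_gov"],
}


def domain_index(responses: dict, items: list):
    """Mean of the component items; None if any item is missing."""
    values = [responses.get(item) for item in items]
    if any(v is None for v in values):
        return None   # assumed missing-data rule, not stated in the text
    return mean(values)


indexes = {name: domain_index(respondent, items)
           for name, items in INDEX_ITEMS.items()}
```

The same function could be applied to the four SWB items if one assumes that "combining" scores means averaging them, which the text does not state explicitly.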

Examining the table beginning with the total sample score column in the lower right-hand corner, one finds that the mean score (5.19) for all 27 items is clearly on the positive side of the 7-point scale, with a standard deviation of 1.16. Looking at the mean scores for the five clusters, one finds that the cluster of single-item overall life assessment measures has the highest mean (5.56) and that the cluster of three domain satisfaction indexes has the lowest mean (4.68). The latter is low as a result of the very low government satisfaction score (3.65). All of the measures in the first three clusters involve some kind of an overall life assessment, while those of the last two clusters involve assessments of distinct domains of life. While the mean score of the three overall life assessment clusters (5.10) is lower than the mean score of the two domain satisfaction clusters (5.28), it is the mean scores of the cluster of single-item discrepancy measures (4.87) and the overall life assessment indexes (5.01) that drag the mean score of the three overall life assessment clusters down. On average, the four single-item overall life assessment scores are higher than the ten single-item domain satisfaction scores. As well, on average, there is less variability (1.15) in the four single-item overall life assessment scores than in the single-item domain satisfaction scores (1.23). Among single-item domain satisfaction scores for the total sample, on average, living partner scores are highest (5.90) and financial security scores are lowest (4.91).

Although none of the panels is representative of the populations from which they were drawn, the two provincial panels are relatively similar compared to the city sample. The mean scores of the total set of 27 variables and of each of the five clusters for the PG panel are lower than those of the corresponding mean scores for the BF and BM panels. In fact, mean scores of 21 of the 27 variables are lower for the PG panel than for the other two panels. On average, among the ten single-item domain satisfaction scores for the three panels, neighbourhoods have the highest scores for the BF and BM panels, while living partner has the highest score for PG. Government satisfaction scores are the lowest of all 27 variables in all three panels.

Table 3 lists the 27 substantive variables with their average test–retest correlations for two 1-year intervals and a test–retest correlation for one 2-year interval, for each panel and for the total sample, and the average Ns available and their ranges for the total score calculations. All correlations are significant at P < 0.05 or better. The 27 variables are again grouped into five clusters with average values calculated for each cluster.

Table 3 Correlations for panel survey 2005 to Feb./May 06 to Feb. 07 (in PanelTable_2007_SummaryCorrelationsTot_3Waves)

Beginning with the total sample score column in the lower right-hand corner, one finds that the average year-to-year correlation (r = 0.63) for all 27 variables is slightly higher than the 2-year correlation (r = 0.61). This is also true for each of the average scores for the five clusters. However, it is not true for all 27 variables. There are 7 (25.9%) cases in which the 2-year correlation is higher than the average year-to-year correlation, i.e., happiness, SWB, self-deserve, job, friendships, health and self-esteem. Besides the fairly high average correlations for the total set of variables in both columns, it is worth noticing that while the range runs from r = 0.41 to r = 0.80, there are only six cases out of 54 (11.1%) in which the correlations are in the 0.40s. Three of these six cases occur with single-item discrepancy measures, including two with the self-future variable.
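For readers who want to see how the two kinds of test–retest figures relate, the following minimal sketch computes an average year-to-year correlation and a 2-year correlation for a single variable. The scores are invented, the function names are hypothetical, and taking the simple arithmetic mean of the two 1-year Pearson correlations is an assumption about how the averages in Table 3 were formed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_summary(w1, w2, w3):
    """Average of the two 1-year correlations, and the 2-year correlation."""
    year_to_year = (pearson(w1, w2) + pearson(w2, w3)) / 2
    two_year = pearson(w1, w3)
    return year_to_year, two_year


# Hypothetical scores for six respondents on one 7-point variable.
wave_2005 = [5, 6, 4, 7, 3, 5]
wave_2006 = [5, 7, 4, 6, 3, 6]
wave_2007 = [6, 6, 5, 6, 2, 5]

avg_1yr, r_2yr = retest_summary(wave_2005, wave_2006, wave_2007)
```

Only respondents with scores in all three waves are used here, which mirrors the rule that panel members had to complete all three questionnaires.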

Examining the column of average year-to-year correlations and the column of 2-year correlations for the five clusters, one finds that the cluster of three overall life assessment indexes has the highest mean in both columns, r = 0.77 and r = 0.76, respectively. Since multi-item measures tend to have less random error than single-item measures, one might have expected the average correlations for the two clusters of indexes to be higher than the averages for the other three clusters. In fact, closer examination of the two columns reveals that the correlations for each overall life assessment index are higher than those for any other measure. The range of correlation values for the three overall life assessment indexes is also narrower and higher than that for any other cluster, running from r = 0.75 to r = 0.80. However, the scores for the domain satisfaction indexes are not particularly extraordinary compared to many of the single-item scores in those two columns.

The pattern of average year-to-year correlations being higher than 2-year correlations for the means of the five clusters is the same for each of the three panels as for the total sample, with four exceptions. In the BF panel, the subtotal average year-to-year correlations and 2-year correlations are equal for the mean of the four single-item overall life assessment measures and for the mean of the three overall life assessment indexes. The subtotal average year-to-year correlation is lower than the 2-year correlation for the mean of the four single-item overall life assessment measures in the PG panel and for the three domain satisfaction indexes in the BF panel. Close examination of the two columns for each of the three panels reveals that the correlations for each overall life assessment index are higher than those for any other measure, with five exceptions in 162 cases (3.1%). The exceptions are the 2-year correlations for satisfaction with general health and self-esteem in the PG panel, the average year-to-year correlation for satisfaction with financial security in the BF panel, and the 2-year correlations for satisfaction with financial security and self-esteem in the BF panel.

5 Bivariate Relationships

Table 4 is a correlation matrix for the 27 substantive variables for the total sample (N ≥ 372). Respondents’ mean scores for the three waves for all three panels (Table 2) were used for the calculations. All correlations are significant at P < 0.01 except for three that are significant at P < 0.05. As one would expect, SWB has the highest correlations with three of its component parts (satisfaction with life as a whole (r = 0.92) and with the overall quality of life (r = 0.94), and happiness (r = 0.90)). Of the four single-item overall life assessment variables, general health has the lowest correlations with the other three, running from r = 0.44 to r = 0.52, and indicating again that respondents recognize a difference between assessments of their health and of their life as a whole. This recognition is carried over to the correlations among the three overall life assessment indexes and general health, which run from r = 0.44 to r = 0.51.

Table 4 Panel Survey 2005 to Feb./May 06 to Feb. 07 correlations with N ≥ 372, P < 0.01 except for 3 entries with *, where P < 0.05 (in PanelTable_2007_CorrelationsTot_3Waves_Std.doc)

On average, the single-item discrepancy measures correlate at r = 0.59 with the single-item life satisfaction measure, r = 0.63 with the overall quality of life satisfaction measure, r = 0.62 with happiness and r = 0.37 with general health, r = 0.67 with SWLS, r = 0.61 with CLAS and r = 0.67 with SWB. The correlations between the average of the 13 domain satisfaction scores and the 7 overall life assessment scores are r = 0.59 for the single-item life satisfaction measure, r = 0.59 for the overall quality of life satisfaction measure, r = 0.51 for happiness and r = 0.33 for general health, r = 0.49 for SWLS, r = 0.51 for CLAS and r = 0.60 for SWB. It is worthwhile to notice that, with the exception of the single-item life satisfaction measure, on average the single-item discrepancy measures have stronger associations than the domain satisfaction measures with the overall life assessment measures.

A basic assumption of those using measures of the sort listed in our tables is that these measures are sensitive to changing life circumstances. As Diener, Lucas, Schimmack and Helliwell (2009, pp. 93, 97) emphasized, “if no gold-standard measure exists” to match against another’s performance, one way to validate the latter is to “compare the results that we get from the new measure to our expectations for how the measure should behave based on our theories of the underlying construct, as well as with other established measures…well-being measures will be most valuable when they behave in ways that often match our intuition…”. While most of the measures considered here are not new, the validating strategy those authors suggest is still appropriate. Strictly speaking, as those authors also explained (p. 84), the strategy involves testing two things at the same time, namely, the assumption that individuals’ views about life as a whole or some domain of life actually change with some change or other in their life circumstances and the further assumption that our measures are sensitive enough to capture changes in their views. Here we have generally proceeded as if the first assumption is true and only the second is being tested.

Table 5 contains results addressing the question of whether or not respondents who perceived changes in their overall life circumstances for the better or worse since the last survey would have variable mean scores increasing or decreasing more than respondents who did not perceive such changes. (This summary table and the following 5 (Tables 6, 7, 8, 9, 10) are based on 115 pages of detailed tables covering the three waves of data for the three panels, which are available on request.) We separated the cases in which perceived changes were for the better from those in which perceived changes were for the worse in order to see if on average the two sorts of changes had different impacts compared to no-change cases.
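The kind of comparison summarized in Table 5 can be sketched for a single variable, panel and wave. This is an illustrative sketch only: the records are invented, the function name group_mean is hypothetical, and treating a case as supportive when the change-group mean is simply higher (for better) or lower (for worse) than the no-change mean, without a significance test, is our reading of the table caption.

```python
from statistics import mean

# Hypothetical records: (perceived change since last survey, score in 2006).
# Codes follow the questionnaire item: 1 = better, 2 = worse, 3 = same.
records = [(1, 6), (1, 5), (2, 3), (2, 4), (3, 5), (3, 4), (3, 5)]


def group_mean(rows, code):
    """Mean score of respondents who reported the given change code."""
    return mean(score for change, score in rows if change == code)


no_change = group_mean(records, 3)

# A case counts as consistent with the sensitivity assumption when the
# change group differs from the no-change group in the expected direction.
better_case_supports = group_mean(records, 1) > no_change
worse_case_supports = group_mean(records, 2) < no_change
```

Repeating this comparison for every variable, panel and interval, separately for changes for the better and for the worse, yields the counts of supportive cases that the summary table reports.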

Table 5 Summary of 3 waves for 3 panels, numbers and percents of cases in which variable mean scores in 2006 and 2007 are higher or lower given perceived changes in overall life circumstances for the better or worse since 2005 and 2006, respectively, than they are given no perceived changes
Table 6 Summary of 3 waves for 3 panels, numbers and percents of cases in which variable test–retest correlations 2005–2006 and 2006–2007 are higher given no perceived changes in overall life circumstances for the better or worse since 2005 and 2006, respectively, than they are given perceived changes
Table 7 Summary of 3 waves for 3 panels, numbers and percents of cases in which variable mean scores in 2006 and 2007 are higher or lower given a net balance of selected positive or negative events since 2005 and 2006, respectively, than they are given an equal number of positive and negative events
Table 8 Summary of 3 waves for 3 panels, numbers and percents of cases in which variable test–retest correlations 2005–2006 and 2006–2007 are higher given an equal number of selected positive and negative events since 2005 and 2006, respectively, than they are given a net positive or negative balance of events
Table 9 Summary of final column results from Tables 5, 6, 7, 8, percentages of cases in which assumptions of year-by-year variable mean score changes and variable correlation changes were consistent with changes in perceived life circumstances or changes in the net balance of life events
Table 10 Summary of 3 waves for 3 panels, numbers and percents of cases in which variable test–retest correlations 2005–2006 and 2006–2007 are higher given no events since 2005 and 2006, respectively, than they are given some positive and/or negative events

Beginning in the lower right-hand corner of the last column of Table 5, one finds that on average there is pretty good support for the basic assumption. Mean scores are higher in the change groups than in the no-change groups for 71.3% (231/324) of the cases compared. The percentage of cases favourable to our basic assumption is higher when perceived changes are for the worse (77.8%) than when they are for the better (64.8%). For respondents in our samples, on average negative changes in life circumstances have a greater impact than positive changes, at least over a 12-month period. This is true for four of the five clusters as well as for the total set of measures. For the three domain satisfaction indexes, on average positive changes have a greater impact than negative changes.
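The counting behind such a success rate can be sketched as follows. This is a hedged illustration, not the procedure actually used: it assumes, following the caption of Table 5, that a case is favourable when the change group's mean is higher than the no-change group's mean after a change for the better, and lower after a change for the worse. The function names and the toy scores are invented for the example.

```python
from statistics import mean

def favourable(change_scores, no_change_scores, direction):
    """True when the change group's mean differs from the no-change group's
    mean in the expected direction ('better' -> higher, 'worse' -> lower)."""
    diff = mean(change_scores) - mean(no_change_scores)
    return diff > 0 if direction == "better" else diff < 0

def success_rate(cases):
    """cases: iterable of (change_scores, no_change_scores, direction)."""
    outcomes = [favourable(c, n, d) for c, n, d in cases]
    return sum(outcomes), len(outcomes), 100 * sum(outcomes) / len(outcomes)

# Invented toy scores on a 7-step scale, for illustration only.
toy_cases = [
    ([6, 7, 5], [5, 5, 6], "better"),  # favourable: 6.0 is above 5.33
    ([4, 3, 5], [5, 5, 6], "worse"),   # favourable: 4.0 is below 5.33
    ([5, 5, 5], [5, 6, 6], "better"),  # unfavourable: 5.0 is below 5.67
]
print(success_rate(toy_cases))

# 27 variables x 3 panels x 2 wave pairs x 2 directions of change.
max_cases = 27 * 3 * 2 * 2
print(max_cases)
```

The product 27 × 3 × 2 × 2 equals 324, the denominator reported for Table 5. That is consistent with one comparison per variable, panel, wave pair and direction of change, although the text does not spell out the enumeration.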

Examining the subtotal percentage figures for the five clusters, one finds that the cluster of three overall life assessment indexes has the highest value (88.9%) and the cluster of three domain satisfaction indexes has the lowest value (58.3%). In fact, the cluster of 13 domain satisfaction variables has a lower value (61.5%) than the 3-group subtotal including all three clusters pertaining to some sort of overall life assessment (80.4%) as well as to each of the three clusters in that group; i.e., the clusters of four single-item overall life assessment variables (79.2%), three indexes (88.9%) and seven single-item discrepancy measures (77.4%). What’s more, this low status of the two domain satisfaction clusters compared to the other three clusters in the two kinds of change situations is the same as that in the total change column, with one exception. On average, given change for the better, the three domain satisfaction indexes have a higher success rate (72.2%) than the four single-item overall life assessment variables (70.8%). While the exact values of these clusters were not expected, we did expect to find more support for our basic assumption from the overall life assessment variables than from the domain satisfaction variables because the perceived changes respondents were asked about concerned their life in general (i.e., life as a whole), not any specific domain of life. It is quite possible for one to perceive that one’s life got better or worse, on the whole or all things considered, over the past year, although some aspects or domains of life got worse or did not change at all. In fact, it is also possible to perceive that some particular domain got better or worse on the whole even though some domain-relevant event occurred that had an opposite impact, e.g., one might judge that on the whole relations with one’s friends improved over the past year although a close friend passed away.

The variety of percentage values among the 27 variables is remarkable, ranging from 33.0 to 100.0%. The sample size for each variable is of course small, but that would not imply success or failure of our basic assumption. For our most successful cluster, the success rates run from 100.0% for SWLS, to 91.7% for CLAS and 75.0% for SWB. For the cluster of single-item overall life assessment measures, the happiness variable has the greatest success (91.7%), followed by satisfaction with the overall quality of life (83.3%), general health (75.0%) and life satisfaction (66.7%). Since some sort of single-item life satisfaction measure is probably the most frequently used overall life assessment scale in the literature, these results should provoke some concern and second thoughts about the wisdom of having so many eggs in one basket. For the cluster of discrepancy measures, the range is from 58.3% for self-others to 100.0% for self-progress. In previous studies with MDT (e.g., Michalos 1991a, b, 1993a, b), the self-others measure was usually one of the two strongest predictors in the set of seven discrepancy measures, self-wants being the other one. So, it is encouraging to see the self-wants variable with a high success rate of 91.7% but not encouraging to see the relatively low success rate of the self-others variable. For the cluster of three domain satisfaction indexes, it is perhaps not surprising to find the health satisfaction measure with a relatively high success rate of 83.3% because it usually has fairly strong associations with overall life assessments, but it is unclear why the single-item neighbourhood satisfaction measure would be as successful as the health satisfaction index.

The assumption leading to the exploration of results displayed in Table 5 has a variant leading to the exploration of results displayed in Table 6. Insofar as our measures are sensitive to changing life circumstances, variable test–retest correlations for 2005–2006 and 2006–2007 should be higher given no perceived changes in overall life circumstances for the better or worse since 2005 and 2006, respectively, than they are given perceived changes. As before, we separated the cases in which perceived changes were for the better from those in which perceived changes were for the worse in order to see if on average the two sorts of changes had different impacts compared to no-change cases. The only correlations counted were those significant at least at P < 0.05.
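A single comparison of this kind might be sketched as below. Several details are assumptions made for the illustration: the source does not say which significance test was used, so a two-sided Fisher z approximation stands in for it, and the function names and toy scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (an assumption)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def favourable_case(no_change, change, alpha=0.05):
    """Each argument is a (wave1_scores, wave2_scores) pair for one group.
    Returns None when either correlation is not significant, so that the
    case is left out of the count; otherwise True when the no-change
    group has the higher test-retest correlation."""
    correlations = []
    for wave1, wave2 in (no_change, change):
        r = pearson_r(wave1, wave2)
        if approx_p_value(r, len(wave1)) >= alpha:
            return None
        correlations.append(r)
    return correlations[0] > correlations[1]

# Invented toy scores on a 7-step scale, for illustration only.
wave1 = [1, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5, 6]
no_change = (wave1, [1, 2, 3, 4, 5, 6, 7, 3, 3, 4, 5, 5])
change = (wave1, [2, 1, 4, 3, 6, 5, 7, 4, 2, 5, 4, 7])
weak = ([1, 2, 3, 4, 5], [3, 1, 4, 2, 3])

print(favourable_case(no_change, change))
print(favourable_case(no_change, weak))
```

Dropping non-significant correlations in this way would explain why fewer cases are compared for correlations than for mean scores.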

Beginning in the lower right-hand corner of the last column of Table 6, one finds that on average there is just barely support for our assumption. Test–retest correlations are higher in the no-change groups than in the change groups for 52.2% (133/255) of the cases compared. Contrary to results found in Table 5, the percentage of cases favourable to our assumption is higher when perceived changes are for the better (65.8%) than when they are for the worse (32.0%). For respondents in our samples, on average given no perceived changes in life circumstances, the year-by-year test–retest correlations for our variables tend to be higher than the test–retest correlations given perceived changes for the better but not given perceived changes for the worse. This is true for four of the five clusters as well as for the total set of measures. The overall life assessment index cluster is the exception. For this cluster, on average 76.5% of cases are favourable to our assumption when perceived changes are for the worse, and 72.2% of cases are favourable when perceived changes are for the better.

Examining the subtotal average percentage figures for the five clusters in the total sample, one finds that (like Table 5) the cluster of three overall life assessment indexes has the highest value (74.3%) and the cluster of ten single-item domain satisfaction variables has the lowest value (46.1%). In fact, the cluster of ten single-item domain satisfaction variables has a lower average value than the 3-group subtotal including all three clusters pertaining to some sort of overall life assessment (57.0%) as well as to each of the three clusters in that group; i.e., the clusters of four single-item overall life assessment variables (52.8%), three indexes (74.3%) and seven single-item discrepancy measures (50.0%). This low status of the ten single-item domain satisfaction cluster compared to the other four clusters in the two kinds of change situations is the same as that in the total change column, with one exception. On average, in the changes-for-worse column, the seven single-item discrepancy measures had correlations favourable to our assumption for only 16.0% of cases. On average, the three domain satisfaction indexes scored marginally better (48.4%) than the ten single-item domain satisfaction measures.

The variety of percentage values among the 27 variables in the total sample column is again remarkable, ranging from 22.2 to 83.3%. For our most successful cluster, the success rates run from 83.3% for SWLS, to 72.7% for SWB and 66.7% for CLAS. For the cluster of single-item overall life assessment measures, the happiness variable again has the greatest success (62.5%), followed by general health (55.6%), satisfaction with the overall quality of life (50.0%), and life satisfaction (44.4%). As we remarked concerning results in Table 5, these results should provoke some concern about the wisdom of relying heavily on a single-item measure of life satisfaction. For the cluster of discrepancy measures, the range is from 27.3% for self-others to 77.8% for self-deserves. From the point of view of MDT, it is not particularly encouraging to see the self-wants variable with a success rate of 70.0%, but it is encouraging to see the relatively higher success rate of the self-deserves variable. For the cluster of three domain satisfaction indexes, it is surprising to find the health satisfaction measure with a relatively low success rate of 44.4% and a lower success rate than the single-item general health variable (55.6%). It is also surprising and unclear why the job satisfaction (75.0%) and financial security satisfaction (72.7%) measures would be so much more successful than the other eight variables in the ten-item cluster.

Tables 7 and 8 were constructed like Tables 5 and 6, by counting cases favourable or not to the basic assumption that our 27 measures are sensitive to changing life circumstances. The main difference between the earlier tables and these is that to create these tables we counted the net balance of reported positive and negative events in respondents’ lives instead of perceived changes for the better or worse in respondents’ overall life circumstances. Following the patterns of Tables 5 and 6, Table 7 concerns changes in variable mean scores and Table 8 concerns changes in variable test–retest correlations.

Table 7 contains results addressing the question of whether or not respondents who experienced a net balance of positive or negative life events since the last survey would have variable mean scores increasing or decreasing more than respondents who experienced an equal number of positive and negative events. Assuming that the chances of producing a variable mean score change would increase as the year-by-year net balance increased either positively from +1 to +2 to +3 or +4 or negatively from −1 to −2 or −3, we counted positive changes for three kinds of cases and negative changes for two kinds of cases. There were not enough cases with four negative events to get complete symmetry, and even for positive events there were often relatively few cases of four events.
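The grouping of respondents by net balance of events can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, respondents with no events at all are treated as having an equal number of positive and negative events, and balances outside the ranges named in the text are left unclassified.

```python
def net_balance_group(n_positive, n_negative):
    """Map counts of positive and negative events to the comparison
    groups described in the text; None for balances outside them."""
    net = n_positive - n_negative
    if net == 0:
        return "no net change"
    if net == 1:
        return "+1"
    if net == 2:
        return "+2"
    if net in (3, 4):
        return "+3 or +4"
    if net == -1:
        return "-1"
    if net in (-2, -3):
        return "-2 or -3"
    # Anything else, e.g. a net balance of -4, for which the text
    # reports too few cases.
    return None

print(net_balance_group(3, 1))  # three positive events, one negative
print(net_balance_group(1, 1))  # equal numbers of each
print(net_balance_group(0, 3))  # three negative events only
```

Each net-change group is then compared with the no-net-change group, which yields three kinds of positive cases and two kinds of negative cases.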

Beginning in the lower right-hand corner of the last column of Table 7, one finds that on average there is just barely support for our basic assumption. Mean scores are appropriately higher or lower in the net-positive-or-negative-balance-of-events groups (‘net-change groups’ for short) than in the equal-balance-of-positive-and-negative-events groups (‘no-net-change groups’ for short) for 51.6% (416/806) of the cases compared. Contrary to results concerning perceived changes in Table 5, the percentage of cases favourable to our basic assumption is higher when there is a net balance of positive events (54.2%) than when there is a net balance of negative events (47.7%). The progressive increases in variable mean scores that we anticipated to follow net increases in the balance of positive events were only partly achieved. Mean scores are higher given a net balance of +1 positive event in 56.2% of cases, higher given a net balance of +2 positive events in only 49.4% of cases and higher given a net balance of +3 or +4 positive events in 57.1% of cases. For negative events, anticipated results hardly followed. Mean scores are lower given a net balance of −1 negative event in only 45.1% of cases, and lower given a net balance of −2 or −3 negative events in 50.3% of cases.

Examining the subtotal average percentage figures for the five clusters in the total sample column, one again finds that the cluster of three overall life assessment indexes has the highest value (64.4%). The cluster of ten single-item domain satisfaction variables has the lowest value (46.3%). As expected the 3-group subtotal average including all three clusters pertaining to some sort of overall life assessment was higher (55.3%) than the average for the cluster of 13 domain satisfaction scores (47.7%), for the 10 single-item scores (46.3%) and the 3 domain satisfaction indexes (52.2%). On average, the seven single-item discrepancy measures performed relatively well at 56.3%.

Turning to the column summarizing results for cases with a net balance of positive events, we find nearly the same pattern as in the total sample results column. On average, the three overall life assessment indexes have the greatest success (66.7%), the four single-item overall life assessment measures have the least success (44.4%), the seven discrepancy measures come in second (61.6%), the three domain satisfaction indexes third (53.7%) and the ten single-item domain satisfaction measures fourth (49.4%). This pattern is altered in the column summarizing results for cases with a net balance of negative events. While the three overall life assessment indexes still perform best (61.1%), the ten single-item domain satisfaction measures perform worst (41.7%), with the four single-item overall life assessment measures and the three domain satisfaction indexes coming second (50.0%) and the seven discrepancy measures third (48.1%).

The anticipated progressive changes in mean scores for the positive and negative net-change versus no-net change groups were not achieved in most clusters. Considering the subtotal average scores for the positive net-change groups first, one finds that as one moves from the +1 column to the +2 column, for three clusters there is less rather than more success, i.e., for the three overall life assessment indexes, seven discrepancy measures and three domain satisfaction indexes. For the four single-item overall life assessment variables there is no change and for the ten single-item domain satisfaction measures there is the anticipated increase in scores. As one moves from the +2 column to the +3 or +4 column, one finds the anticipated progressive changes in four of the five clusters, with the cluster of four single-item overall life assessment variables out of line. Our assumption of progressive changes was supported for three clusters in the negative net-changes groups. As one moves from the −1 column to the −2 or −3 column, on average our assumption is supported for the cluster of three overall life assessment indexes, three domain satisfaction indexes and seven discrepancy measures, but not for the four single-item overall life assessment measures or the ten single-item domain satisfaction measures.

Returning to the total sample column, there is relatively less variety than we have seen before in the percentage values among the 27 variables, ranging from 30.0 to 76.7%. Most success for our assumption was achieved by CLAS, at 76.7%, and least success was achieved by the satisfaction with religion or spiritual fulfillment variable (30.0%).

Table 8 contains results addressing the question of whether or not variable test–retest correlations would be higher for respondents who experienced an equal number of positive and negative events since the last survey than for respondents who experienced a net balance of positive or negative life events. Again assuming that the chances of producing a variable test–retest correlation change would increase as the year-by-year net balance increased either positively from +1 to +2 to +3 or +4 or negatively from −1 to −2 or −3 versus the no-net change situation, we counted positive changes for three kinds of cases and negative changes for two kinds of cases. As before, one should view the results in this table with caution because of the small sample sizes for each variable.

Beginning in the lower right-hand corner of the last column of Table 8, one finds that on average there is very little support for our basic assumption. Test–retest correlations are higher in the no-net-change groups than in the net-change groups for only 24.1% (107/444) of the cases compared. The percentage of cases favourable to our basic assumption is practically the same whether there is a net balance of negative events (24.4%) or a net balance of positive events (23.9%). Considering only positive net-change groups versus no-net-change groups, changes in correlations are exactly opposite to expectations. Test–retest correlations are higher for the no-net-change groups versus groups with a net balance of +1 positive event in 35.8% of cases, higher for the no-net-change groups versus groups with a net balance of +2 positive events in only 13.4% of cases and higher for the no-net-change groups versus groups with a net balance of +3 or +4 positive events in 2.4% of cases. Considering only negative net-change groups versus no-net-change groups, results are equally disappointing. Test–retest correlations are higher for the no-net-change groups versus groups with a net balance of −1 negative event in only 26.2% of cases, and higher for the no-net-change groups versus groups with a net balance of −2 or −3 negative events in 19.6% of cases.
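The pooled success rates reported for Tables 5, 6, 7 and 8 follow from the case counts given in the text. The short check below re-enters those counts; the variable and function names are illustrative only.

```python
# (favourable cases, cases compared, percentage reported in the text)
reported_counts = {
    "Table 5": (231, 324, 71.3),
    "Table 6": (133, 255, 52.2),
    "Table 7": (416, 806, 51.6),
    "Table 8": (107, 444, 24.1),
}

def percent(favourable, total):
    """Success rate as a percentage rounded to one decimal place."""
    return round(100 * favourable / total, 1)

for table, (favourable, total, reported) in reported_counts.items():
    print(table, percent(favourable, total), reported)
```

Each recomputed percentage agrees with the reported one to one decimal place.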

Examining the subtotal average percentage figures for the five clusters in the total sample column, surprisingly one finds that the cluster of ten single-item domain satisfaction variables has the highest success ratio (31.4%) and the cluster of seven discrepancy measures has the lowest success ratio (18.2%). The average success ratio for the four single-item overall life assessment variables (22.7%) is practically the same as that of the three overall life assessment indexes (22.4%). The three domain satisfaction indexes have the second lowest success rate (18.8%).

Turning to the column summarizing results for cases with a net balance of positive events, we find the same pattern as that in the total sample column. On average the ten single-item domain satisfaction variables have the greatest success (30.1%), the seven discrepancy measures have the least success (16.4%), the four single-item overall life assessment measures come in second (29.4%), the three overall life assessment indexes third (21.6%) and the three domain satisfaction indexes fourth (17.9%). The pattern in the column summarizing results for cases with a net balance of negative events is different from the pattern in the total sample and positive net balance columns. While the ten domain satisfaction measures still perform best (32.9%), the four single-item overall life assessment measures perform worst (15.6%), with the three overall life assessment indexes coming second (23.3%), the seven discrepancy measures third (20.4%) and the three domain satisfaction indexes fourth (20.0%).

The anticipated progressive changes for the positive and negative net-change versus no-net-change groups were not achieved in most clusters. Considering the subtotal scores for the positive net-change groups first, one finds that as one moves from the +1 column to the +2 column and from that column to the +3 or +4 column there is less rather than more success for each cluster except that of the three domain satisfaction indexes. However, our assumption of progressive changes was supported for three clusters in the negative net-changes groups. As one moves from the −1 column to the −2 or −3 column, our assumption is supported for the cluster of four single-item overall life assessment measures, the cluster of three overall life assessment indexes and seven discrepancy measures, but not for the two clusters of domain satisfaction measures.

Returning to the total sample column, there is again considerable variety in the percentage values among the 27 variables, ranging from 6.3 to 55.6%. Most success for our assumption was achieved by the financial security satisfaction (55.6%) and housing satisfaction variables (50.0%), and least success was achieved by the satisfaction with one’s overall quality of life (6.3%).

Given the great variety of results in Tables 5, 6, 7, 8, it is worthwhile to provide a summary table. Table 9 was constructed by calculating the average percentage figures from the final column results of Tables 5 and 6 for perceived changes and Tables 7 and 8 for net balances of life events, then taking the average of these two mean scores to form the final column of Table 9. Our basic question has been this: On average, are the 27 frequently used quality-of-life measures sensitive to (1) respondents’ reported year-by-year perceived changes in their life circumstances for the better or worse and (2) calculated changes in the net balance of their reported positive and negative life events? Besides answering this most basic question for the 27 measures collectively, we have addressed the question from the point of view of average scores for five different clusters of measures in the total set and of each measure taken individually.

Beginning with the last figure in the lower right-hand column of Table 9, one finds that on average, the 27 measures were appropriately sensitive to year-by-year changes in respondents’ life circumstances in 49.7% of the cases examined. In other words, measuring year-by-year changes in respondents’ life circumstances by reports of their own perceptions and experienced life events, on average the values of the 27 variables changed in ways that were consistent with respondents’ reported changes in about half of the cases examined. Clearly, a success rate struggling to equal a random chance rate is not a spectacular achievement for the total set of 27 measures. However, if one examines the bottom line of the other two columns in the table, one finds a substantial difference between the results in the perceived changes and net balance of life events changes columns. The former has a success rate of 61.7% while the latter has a rate of only 37.3%.

Considering only the final column of results, one finds that on average, measures in the three overall life assessment clusters had a higher success rate (53.0%) than measures in the domain satisfaction clusters (46.1%). In fact, the average success rate for each of the three former clusters was higher than that of the latter clusters. The three overall life assessment indexes had the highest average (62.5%), followed by the seven discrepancy measures (50.5%) and then the four single-item overall life assessment measures (50.4%). Within the three overall life assessment index cluster, SWLS had the highest success rate (68.2%). Within the discrepancy cluster, self-progress was highest (59.8%) and for the single-item overall life assessment cluster it was happiness (57.7%). For the ten single-item domain satisfaction cluster, satisfaction with one’s financial security (59.2%) and self-esteem (56.9%) were highest. For the three domain satisfaction index cluster, satisfaction with one’s health was highest (50.3%).

Considering the column of worst results (column two), one finds again that on average, measures in the three overall life assessment clusters had a higher success rate (37.8%) than measures in the domain satisfaction clusters (36.8%). The three overall life assessment indexes again had the highest average (43.4%), followed by the ten single-item domain satisfaction measures (37.3%) and then the seven discrepancy measures (37.1%). The three domain satisfaction indexes came in fourth (35.4%) and the four single-item overall life assessment measures average came in last (34.7%). Within the three overall life assessment index cluster, CLAS had the highest success rate (52.0%). Within the discrepancy cluster, self-best was highest (49.2%) and for the four single-item overall life assessment cluster, happiness was tied with life satisfaction (38.4%). For the ten single-item domain satisfaction cluster, satisfaction with one’s financial security (52.8%) had the best score, followed by housing satisfaction (51.7%). The latter figure was the only one in the first two columns of the table in which a value of a variable in the second column (51.7%) was greater than its corresponding value in the first column (42.8%).

Considering finally the column of best results (column one), one finds again that on average, measures in the three overall life assessment clusters had a higher success rate (68.3%) than measures in the domain satisfaction clusters (54.6%). The three overall life assessment indexes again had the highest average (81.6%), followed by the four single-item overall life assessment measures (66.2%). The seven discrepancy measures came in third (63.8%), the ten domain satisfaction measures came fourth (55.0%) and the three domain satisfaction indexes came last (53.5%). Within the three overall life assessment index cluster, SWLS had the highest success rate (91.7%). Within the four single-item overall life assessment cluster, happiness had the highest rate (77.1%). For the discrepancy cluster, self-progress was highest (81.3%). For the cluster of ten domain satisfaction measures, job satisfaction was highest (70.9%), followed by self-esteem satisfaction (69.3%). Within the three domain satisfaction indexes cluster, health satisfaction was highest (63.9%).
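The two-step averaging used to build Table 9 can be illustrated with the happiness figures quoted in the text. This is a sketch under the assumption that each Table 9 entry is a simple unweighted mean of the corresponding percentages; the variable names are illustrative, and small differences from the published values are to be expected because the inputs used here are already rounded.

```python
def average(a, b):
    """Unweighted mean of two percentages."""
    return (a + b) / 2

# Happiness success rates (%) as reported in the text.
happiness_t5 = 91.7          # Table 5, mean-score comparisons
happiness_t6 = 62.5          # Table 6, test-retest comparisons
happiness_net_events = 38.4  # Table 9, net balance of life events column

# Column one of Table 9: perceived changes.
perceived = average(happiness_t5, happiness_t6)
# Final column of Table 9: both kinds of change measure.
overall = average(perceived, happiness_net_events)
print(round(perceived, 2), round(overall, 2))

# The same first step for SWLS (100.0% in Table 5, 83.3% in Table 6).
swls_perceived = average(100.0, 83.3)
print(round(swls_perceived, 2))
```

The results are close to the reported 77.1% and 57.7% for happiness and 91.7% for SWLS, within the rounding of the inputs.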

6 Conclusion

Recall that Atkinson’s most important conclusions directly challenged the hypotheses that satisfaction measures “are poor social indicators because they were so conditioned by expectations and restricted awareness as to be insensitive to changing circumstances” and that “expectations and aspirations adjust very quickly to new situations and that satisfaction and other measures revert to their original levels immediately”. Predictions following these hypotheses were contradicted by his results “in that significant numbers of respondents perceive changes in their lives and those changes were reflected, for better or worse, in their satisfaction levels”. The changes he tracked “took place over a two year period” and he thought they showed that “while adaptation probably does occur, it is not instantaneous and will be detected by an indicator series which utilizes fairly frequent measurements”. Using two different measures of changing life circumstances, he found that “The hypothesis that the No Change group would have more stable QOL scores than those reporting change is supported when the perceptual indicator of change is used as the independent variable but not when the event measures are involved”.

We have some similar and some dissimilar results. Since about 46% of our respondents reported changes in their life circumstances using each of the two different measures of change, it is fair to say that we also found “significant numbers of respondents perceive changes in their lives”. Considering nearly three times as many relevant variables as Atkinson (i.e., 27 compared to 10) and one-year change periods, we also found that they behaved more often than not (61.7%) as expected using “the perceptual indicator of change” but less often than not (37.3%) as expected using “event measures”.

We did not find as he did that single-item domain satisfaction measures performed better than overall life assessment measures. As reported in Table 9, using both kinds of measures of change, on average the ten single-item domain satisfaction measures had a 46.6% success rate, compared to 53.0% for the 14 overall life assessment measures. Using “the perceptual indicator of change”, on average the ten single-item domain satisfaction measures had a 55.0% success rate, compared to 68.3% for the 14 life assessment measures. It should also be noticed that within the cluster of all domain satisfaction measures, on average the three indexes did not perform better than the ten single-item measures, i.e., 53.5% compared to 55.0%, respectively. Generally speaking, then, considering the relative performance of overall life assessment measures versus domain satisfaction measures, our results are consistent with those summarized by Diener, Lucas, Schimmack and Helliwell (2009, pp. 72–73). They found “no systematic differences” among such measures and reported that “Overall, the content of well-being measures has surprisingly little influence on the reliability of well-being measures”.

Regarding single-item overall life assessment measures, Atkinson’s worst performer was the Gallup 3-step happiness measure. He suspected that “the item’s problems stem from the use of a three-point response scale”. As indicated in Table 9, our 7-step happiness measure had the highest success rate of the four single-item overall life assessment measures, both using both measures of change (57.7%) and using the perceived changes measure alone (77.1%). The two-year test–retest correlation for his 3-step happiness measure was r = 0.39 (p. 120, Table II), while our 7-step happiness measure scored r = 0.64 (Table 3). His 11-step life satisfaction scale had a two-year test–retest correlation of r = 0.41, while our 7-step scale scored r = 0.65. Concerning response categories, Diener, Lucas, Schimmack and Helliwell (2009, p. 74) claimed that “the evidence suggests that, at least in the range from 2 to 11 response categories, the more response categories produce higher reliability…We strongly urge researchers to use scales with more response options and to phase out measures with three or four response categories that are still in use”.

Atkinson’s General QOL Index had a two-year test–retest correlation of r = 0.53, while the average of our three overall life assessment indexes was r = 0.76. In fact, each of our three indexes displayed greater stability than Atkinson’s General QOL Index, with SWB having the highest two-year test–retest correlation of r = 0.77, followed by SWLS and CLAS at r = 0.75 each. Most importantly, considering the general stability of our three indexes regardless of changing life circumstances (Table 3) and the average superior sensitivity of these indexes to perceived changes in life circumstances (81.6%; Table 9), a fairly strong case has been made for preferring any of these indexes to any single-item overall life assessment measure. It is, therefore, worth emphasizing that any general survey of the perceived quality of life and especially the World Values Survey and the Gallup World Poll would be strengthened if they included any of these indexes. Since Michalos and Kahlke (2008) showed that our single-item measure of satisfaction with the overall quality of life was more sensitive to arts-related activities than any of the three overall life assessment indexes, we still believe that the wisest course to follow in any assessment of perceived quality of life is to use more than one measure for one’s dependent variable. One of the strongest messages we have had from over 40 years of quality of life research is that different measures are more or less sensitive to different features of people’s lives (Michalos 2005), and the best way to avoid oversimplification in our assessments is to use more rather than fewer measures. In Michalos (2008) additional arguments were presented against the use of any single life satisfaction measure as a criterion variable for overall quality of life studies. Of course, if one must choose between having a single life satisfaction (index or single-item) measure and none at all in such studies, one should opt for such a measure. However, the use of any such single measure as a criterion variable for assigning importance weights to aspects or domains of life should be resisted or undertaken with great caution.

Considering the average stability of our seven discrepancy measures regardless of changing life circumstances (r = 0.59 and r = 0.57 for one- and two-year test–retest correlations, respectively; Table 3) and their average sensitivity to perceived changes in life circumstances (63.8%; Table 9), these measures should be used with caution. Since they are essential components of MDT, we are aware of the significance of this finding for that theory. While the success rates of the measures of self/wants (80.9%), self/deserves (80.6%) and self/progress (81.3%) are fairly acceptable, those of the other four, especially self/others (42.8%), are problematic. The questionable reliability and usefulness of these four variables could prove to be a fatal blow to the theory as currently crafted. Results of our causal analyses in the next paper will help us decide what role, if any, these variables should continue to have in MDT.

Finally, in one of our email exchanges with Lucas, he suggested that “when looking at stability coefficients, it might be worth comparing people with more events (regardless of valence) to those with fewer events”. Table 10 summarizes the results of our exploration of this suggestion. Beginning with the last figure in the lower right-hand column of Table 10, one finds that on average for the stability coefficients of the 27 variables under consideration, the hypotheses that correlations are higher given no events versus one, two and three or more events are supported in only 37.6% of the cases examined. Reviewing the sub-total average figures for the five main clusters, one finds that the ten single-item domain satisfaction measures performed most consistently with the hypotheses (46.3%), but still more often than not failed to support them. The three overall life assessment indexes performed no better than average (38.0%). Comparing the bottom lines of the other three columns, one would expect to find increasing support for the basic hypothesis, i.e., that more events produce smaller stability coefficients. In fact, we find decreasing support, from 39.1% (0 events vs 1) to 38.8% (0 vs 2) to 34.3% (0 vs 3 or more). Among the subtotals in the three columns, there are no cases of a steady increase across all three columns in support of the basic hypotheses. There are, however, two cases in which there are increases consistent with the hypotheses, namely, for the four single-item measures and the three indexes of overall life assessment as one moves from 0 vs 1 to 0 vs 2 events. All things considered, we find little support for the hypothesis that, regardless of valence, respondent experiences of more versus fewer salient life events produce lower stability coefficients.
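For readers who wish to run a comparison of this kind on their own panel data, the logic can be sketched roughly as follows. This is only an illustrative sketch, not the procedure used to produce Table 10: the function names (pearson_r, stability_by_events, support_rate), the record layout (number of events, wave-1 score, wave-2 score) and the minimum group size of three cases are all assumptions introduced here for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists.

    Assumes both lists have some variance (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stability_by_events(records):
    """Test-retest correlation per event group for one variable.

    records: iterable of (events, wave1_score, wave2_score) tuples
    (a hypothetical layout). Respondents are grouped as 0, 1, 2 and
    3-or-more events; groups with fewer than three cases are skipped.
    """
    groups = {0: ([], []), 1: ([], []), 2: ([], []), 3: ([], [])}
    for events, w1, w2 in records:
        xs, ys = groups[min(events, 3)]  # key 3 stands for "3 or more"
        xs.append(w1)
        ys.append(w2)
    return {k: pearson_r(xs, ys) for k, (xs, ys) in groups.items() if len(xs) > 2}


def support_rate(stabilities_per_variable):
    """Percentage of comparisons in which r(0 events) exceeds r(k events).

    stabilities_per_variable: list of dicts returned by stability_by_events,
    one per variable. Comparisons with a missing group are ignored.
    """
    hits = total = 0
    for r in stabilities_per_variable:
        for k in (1, 2, 3):
            if 0 in r and k in r:
                total += 1
                hits += r[0] > r[k]
    return 100.0 * hits / total if total else float("nan")
```

Under the hypothesis discussed above, support_rate applied to all variables would be expected to fall well above 50%; a simple count of this kind does not, of course, test whether any individual difference between two correlations is statistically significant.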