To understand the dynamics of mental health, it is essential to develop measures for the frequency and the patterning of mental processes in everyday-life situations. The Experience-Sampling Method (ESM) is an attempt to provide a valid instrument to describe variations in self-reports of mental processes. It can be used to obtain empirical data on the following types of variables: (a) frequency and patterning of daily activity, social interaction, and changes in location; (b) frequency, intensity, and patterning of psychological states, i.e., emotional, cognitive, and conative dimensions of experience; (c) frequency and patterning of thoughts, including quality and intensity of thought disturbance. The article reviews practical and methodological issues of the ESM and presents evidence for its short- and long-term reliability when used as an instrument for assessing the variables outlined above. It also presents evidence for validity by showing correlation between ESM measures on the one hand and physiological measures, one-time psychological tests, and behavioral indices on the other. A number of studies with normal and clinical populations that have used the ESM are reviewed to demonstrate the range of issues to which the technique can be usefully applied.

Sampling of Experience

In recent years a growing number of investigators have sought information on the daily events and experiences that make up people’s lives. Pervin (1985) identified the “increasing use of beeper technology” as a research methodology in which signaling devices carried by respondents are used to elicit self-report data at randomized points in time. One of the earliest lines of investigation using pagers to stimulate self-reports began at the University of Chicago in 1975, under the name of “Experience Sampling Method” (Csikszentmihalyi et al. 1977). The general purpose of this methodology is to study the subjective experience of persons interacting in natural environments, as advocated by Lewin (1936) and Murray (1938), in a way that ensures ecological validity (Brunswick 1952). The need for this kind of approach arises from dissatisfaction with a large body of research demonstrating the inability of people to provide accurate retrospective information on their daily behavior and experience (Bernard et al. 1984; Mischel 1968; Yarmey 1979). Its goal is similar to the one Fiske (1971, p. 179) set out for psychology as a whole: “to measure … the ways a person usually behaves, the regularities in perceptions, feelings and actions.”

The objective of the research described in this paper is to sample experience systematically, hence the name Experience-Sampling Method (ESM). The present article describes ESM and reports on its reliability and validity, using findings from a number of studies. (Readers may also wish to refer to two earlier and more restricted reviews of the methodology, i.e., Hormuth 1986; Larson and Csikszentmihalyi 1983.)

The development of experience sampling responds to a number of currents within psychology and the social sciences. Two methodological traditions are the most direct ancestors of the present approach: first, research in the allocation of time to everyday activities (time budgets); and second, research measuring psychological reactions to everyday activities and experiences. Time budget studies have typically assessed time investment in different activities by different categories of persons (Altschuller 1923; Bevans 1913; Robinson 1977; Sorokin and Berger 1939; Szalai 1972; Thorndike 1937; Zuzanek 1980). Another approach has been that of the Kansas School of Ecological Psychology, which observationally investigated behavioral settings and focused on time use in the socialization of children (Barker 1968; Barker and Wright 1955; Barker et al. 1961). These studies were later extended cross-culturally (Johnson 1973; Munroe and Munroe 1971a, b; Rogoff 1978).

A second tradition of research focused on the impact of everyday life situations on psychological states, such as “psychopathology and coping” (Gurin et al. 1960); and “well-being” (Bradburn 1969). These studies provided data on global psychological states in representative populations. Researchers in the field of social psychology, on the other hand, performed experimental studies in which respondents were asked to imagine themselves in various life situations and then to report their psychological reactions (Endler and Magnusson 1976; Magnusson and Endler 1977).

Although these methods have enriched our understanding of individual lives, they have several shortcomings. Imagery evoked in laboratory studies is not necessarily typical of experience encountered in real-life situations. In quality-of-life studies, only global assessments of extremely complex phenomena are presented. The data are gathered in retrospect, outside of the context of the situation, thus permitting distortions and rationalizations to become important. Time budget studies have been obtained from observer data or from diaries that do not provide direct access to the subjects’ internal states. Nor in these studies is it clear what the link is between behavior and psychological states or between time use and experience. The ESM, which assesses subjects in real time and context, attempts to overcome some of these shortcomings.

Conceptually, the ESM “exposes” regularities in the stream of consciousness, such as states of heightened happiness or self-awareness, extreme concentration experienced at work, and symptoms of illness. The research aim is to relate these regularities to characteristics of the person (e.g., age, aptitude, physiological arousal, medical diagnosis), of the situation (e.g., the challenges of a job, the content of a TV show), or of the interaction between person and situation (e.g., the dynamics of a conversation with a friend, the circumstances that lead to a specific event). The objective is to identify and analyze how patterns in people’s subjective experience relate to the wider conditions of their lives. The purpose of using this method is to be as “objective” about subjective phenomena as possible without compromising the essential personal meaning of the experience.

Methods

Instruments

Signaling device. To obtain representative self-reports of experiential states, the ESM relies on an electronic instrument that emits stimulus signals according to a random schedule. Different sound or vibration signaling devices have been used by different research groups: pagers such as those used to page doctors in the hospital (e.g., Csikszentmihalyi and Graef 1980; Csikszentmihalyi and Figurski 1982; Larson and Csikszentmihalyi 1983), programmed pocket calculators (e.g., Massimini, this issue, p. 545), programmed wrist watch terminals (e.g., deVries 1983; Brandstatter 1983), and other devices that signal at random intervals, such as those used in thought-sampling research (Hurlburt 1979; Klinger et al. 1980) and occupational studies (Divilbliss and Self 1978; Spencer 1971). Some studies have had the subjects themselves set watches according to predetermined timetables (Brandstatter 1983; Diener et al. 1984), but most program the signal devices for the respondents. The signal devices can also be programmed simultaneously, and this provides special opportunities for analysis of the interdependence of experience in studies of couples, families, or friendship groups. Pawlik and Buse (1982) have pioneered data-recording units that, in addition to signaling, are also able to record coded responses directly on tape. At the University of Heidelberg, Hormuth (1985, 1986) built a device that can be programmed from a portable Epson computer to signal 128 times over a period of 8 days. Different devices have different advantages and disadvantages. The choice also depends on the groups of subjects under investigation (e.g., the vibration option is more practical for research with adolescents, who spend part of their day at school; the sound of wrist watch terminals can be too soft to page the elderly, etc.). The function of the signaling device is always to cue respondents to report their activities, thoughts, and inner states at unexpected random times in natural settings. Any technique providing this function is adequate.

Experience-Sampling Form (ESF)

At the signal, the respondent writes down information about his or her momentary situation and psychological state on the ESF self-report questionnaire (see Appendix). This record becomes the basic datum of the ESM. The ESF is typically designed so that it will take no more than 2 min to complete. Respondents usually carry a full packet of the forms in a booklet.

Items contained in the form vary depending on the investigator’s goals. In the authors’ research, the objective has been to seek comprehensive coverage of the respondent’s external and internal situation at the time of the signal. Hence, the ESF includes open questions about location, social context, primary and secondary activity, content of thought, time at which the ESF is filled out, and a number of Likert scales measuring several dimensions of the respondent’s perceived situation including affect (happy, cheerful, sociable, and friendly), activation (alert, active, strong, excited), cognitive efficiency (concentration, ease of concentration, self-consciousness, clear mood), and motivation (wish to do the activity, control, feeling involved). The sample ESF used in a study of adolescents is reproduced in the Appendix.

Variants of this form have been used with other samples. Some researchers have focused exclusively on thought content (Hurlburt 1979, 1980; Klinger et al. 1980) or on emotional experience (Diener et al. 1984). Other investigators have developed specialized item sets on self-image and self-awareness (Franzoi and Brewer 1984; Savin-Williams and Demo 1983), adjustments to changes in residence (Hormuth 1986), intervening daily events (Greene 1985), binge eating (Johnson and Larson 1982), alcohol and drug consumption (Filstead et al. 1985), thought disorders (deVries 1983, 1984; deVries et al. 1986), and the special problems of physically disabled children and adolescents (Footnote 1), among others.

Procedures

Scheduling of signals. In the majority of ESM studies respondents received 7 to 10 signals per day for 7 consecutive days. Usually the scheduling of signals is a variant of what Cochran (1953) describes as “systematic sampling,” a procedure that typically obtains more precise estimates of population characteristics than a purely random sample. Within the 15–18 target hours each day, one signaling time is selected for every block of 90–120 min, using a table of random numbers, with the provision that no signals should occur within 15 min of each other. The length of the reporting period and the timing of reports again depend on the scope and aims of the investigation. Filstead and colleagues (1985) asked alcoholics coming out of treatment to respond to a schedule of four signals per day for 3 months. In a study of the menstrual cycle, LeFevre et al. (1985) signaled married couples three times a day for 1 month. At the other extreme, studies of thought content have had signals with a mean delay of only 30 min for 3 days (Hurlburt 1979). A concentration of signals during crucial points of the day or in different situations has also been used (deVries 1983). However, the more that researchers have demanded of respondents, the fewer people have been willing to take part in the research. Generally, when signal frequency has been increased, shortening the total sampling period has improved compliance.
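The blocked scheduling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `schedule_signals`, the 08:00 to 23:00 target window, and the 120-min block length are hypothetical choices, and a pseudo-random generator stands in for the table of random numbers. The minimum gap is enforced by narrowing the range from which the next signal is drawn, which is one of several possible ways to honor the 15-min provision.

```python
import random


def schedule_signals(start_min=8 * 60, end_min=23 * 60, block_min=120,
                     min_gap=15, rng=None):
    """Pick one random signal time (minutes after midnight) per block.

    All default values are illustrative assumptions, not the authors' settings.
    """
    rng = rng or random.Random()
    times = []
    block_start = start_min
    while block_start < end_min:
        block_end = min(block_start + block_min, end_min)
        # The earliest admissible minute keeps at least `min_gap` minutes
        # between this signal and the previous one.
        earliest = block_start
        if times:
            earliest = max(block_start, times[-1] + min_gap)
        if earliest < block_end:
            times.append(rng.randrange(earliest, block_end))
        block_start = block_end
    return times


# One day's schedule, formatted as HH:MM for programming the device.
day = schedule_signals(rng=random.Random(1))
print([f"{t // 60:02d}:{t % 60:02d}" for t in day])
```

With these assumed defaults the 15-hour window yields eight signals per day, which falls inside the 7 to 10 range reported for most studies.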

Before beginning experience sampling, respondents receive instructions on the use of the signaling device and are instructed on how to fill out the ESF. They are told that they should fill out the form as soon as possible after each signal; typical situations in which this might be difficult (driving a car, playing sports) are discussed. Subjects are asked to keep the beeper turned on and to respond to all signals received, unless they go to bed or they “really need privacy.” The general purposes of the research are described, and the necessity for the respondents to report their life as it actually is, with all its joys and problems, is stressed.

Respondents should be given a chance to fill out a practice ESF. Subsequently, each subject is provided with a bound booklet of about 40–60 ESFs and a telephone number where someone can be reached in case any question or complication arises. When the week is completed, each subject is debriefed and the ESF booklets are collected. Additional interview or questionnaire data may be obtained before or after the test week.

Coding

Most of the data consists of self-scoring rating scales. The open-ended items may be coded in different ways, depending on the goals of the study. In some studies (Buse and Pawlik 1984; Hormuth 1984), activity and thought categories are provided on each ESF for the respondent to check, thus eliminating the need for the researchers to code open-ended responses. In the Chicago research each variable was coded in fine detail, although codes in most analyses are aggregated in larger categories.

Activity codes. For the adult sample the answers to the item “What was the main thing you were doing?” were initially coded in one of 154 activity categories (e.g., “operating a typewriter,” “playing with a child,” “planning a meal”). In most analyses, however, activities were combined in 16 larger groups (e.g., “working at work,” “other at work,” “transportation”). Sometimes only three global activity areas are contrasted: work, maintenance, and leisure. Agreement between two coders at the level of the 154 categories was 88 %; at the level of the 16 variable groupings it was 96 %.
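The gain in agreement after aggregation can be illustrated with a short sketch. It assumes that agreement was computed as simple percent agreement (the text does not state the index used), and the fine-to-group mapping `GROUP_OF` and the two coders' code lists are invented for illustration only.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of reports to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of reports")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical mapping from fine-grained codes to broader activity groups.
GROUP_OF = {
    "operating a typewriter": "working at work",
    "filing papers": "working at work",
    "driving": "transportation",
    "riding bus": "transportation",
}

coder_a = ["operating a typewriter", "driving", "filing papers", "riding bus"]
coder_b = ["filing papers", "driving", "filing papers", "driving"]

fine = percent_agreement(coder_a, coder_b)
grouped = percent_agreement([GROUP_OF[c] for c in coder_a],
                            [GROUP_OF[c] for c in coder_b])
print(fine, grouped)
```

Disagreements between neighboring fine codes disappear once both codes fall into the same larger group, which is why agreement at the grouped level can only be equal to or higher than agreement at the fine level.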

Of course, subjects may vary in how they report an activity. The same activity can be described by one person as “I was typing at my desk,” by another as “I was helping out my boss,” and by still another as “I was waiting for 5 o’clock to come so I could leave the office.” Action identification theorists (Harré and Secord 1972; Wegner and Vallacher 1977) find important differences associated with different ways of segmenting and labeling behavior. In our studies activity reports were collapsed into functional categories (e.g., work or leisure) without concern for structural characteristics such as the “level of identity” of the response, although such coding could also be attempted.

Thoughts. We used the same codes as for activities to code the content of thought, with only a few additions. The categories were then aggregated into fewer functionally equivalent groups (e.g., thoughts about work, family, or self). Other researchers have developed schemes for coding thoughts that are specifically designed to study the stream of consciousness and psychopathology (deVries et al. 1986; Hurlburt, 1979; Klinger et al. 1980).

Data Structure

Data are typically stored in two major computer files: a “beeper” and a “person” file. The first contains the data from each separate ESF in its entirety. The second contains percentages, means, standard deviations, and other aggregate scores for different variables, compiled by respondent, as well as information from interviews and questionnaires. The person file may also contain various aggregate scores for each individual, such as the percentage of time in various activities or mean self-rating in different social contexts or locations. For each new analysis the relevant data are aggregated and then transferred from the beeper to the person file. A “day level” file has been added to the beeper and person files by deVries et al. (1986) to study psychopathology. The day level file contains an overall assessment of the daily routine and so-called “Zeitgebers” (time waking up, going to sleep, meals).
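The relationship between the two files can be sketched as a simple aggregation step. The field names (`person`, `activity`, `happy`), the sample rows, and the function `build_person_file` are hypothetical; the sketch only shows how beeper-level records might be rolled up into person-level means, standard deviations, and percentages of time per activity.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical beeper-level records: one row per completed ESF.
beeper_file = [
    {"person": "p01", "activity": "work", "happy": 4},
    {"person": "p01", "activity": "work", "happy": 6},
    {"person": "p01", "activity": "leisure", "happy": 5},
    {"person": "p01", "activity": "leisure", "happy": 5},
    {"person": "p02", "activity": "leisure", "happy": 3},
]


def build_person_file(beeper_rows, rating="happy"):
    """Aggregate beeper-level rows into one summary record per respondent."""
    by_person = defaultdict(list)
    for row in beeper_rows:
        by_person[row["person"]].append(row)

    person_file = {}
    for pid, rows in by_person.items():
        ratings = [r[rating] for r in rows]
        counts = defaultdict(int)
        for r in rows:
            counts[r["activity"]] += 1
        person_file[pid] = {
            "n_reports": len(rows),
            f"mean_{rating}": mean(ratings),
            # A standard deviation needs at least two reports.
            f"sd_{rating}": stdev(ratings) if len(ratings) > 1 else None,
            "pct_time": {a: 100.0 * n / len(rows) for a, n in counts.items()},
        }
    return person_file


person_file = build_person_file(beeper_file)
print(person_file["p01"])
```

Keeping the beeper file untouched and deriving the person file from it on demand matches the description above, in which aggregates are recomputed for each new analysis.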

Because the ESM obtains random samples of daily experience, the data base in these files is a representative record that can be accessed over and over again, to test any number of hypotheses formulated at the time of collection or 20 years later. One can think of the data base as a permanent laboratory in which an almost unlimited number of relationships may be tested (e.g., Graef et al. 1983). To the extent that new records are continuously being added and the number of observations in each cell increases, ever more refined questions can be asked of the data.

Compliance

Volunteer rates. The ESM can be used with a variety of populations, provided that they can write and that a viable research alliance can be established. To date the youngest respondents have been 10 years old, and the oldest 85. ESM research has been carried out on people with schizophrenia (Delespaul and deVries, this issue, p. 537; deVries et al. 1986), anxiety disorders (Dijkman and deVries 1987, this issue, p. 550), multiple personality disorders (Loewenstein et al. 1987), bulimia (Johnson and Larson 1982), alcoholism (Filstead et al. 1985), and paraplegia (Footnote 2). In our studies we have been able to include adults who spoke little English and had only a few years of grammar school education, but the rate of volunteering from unskilled blue collar workers was extremely low (12 % of the target sample), and of those who volunteered only half completed 30 ESFs or more. It was clear that for them the task was unusual and difficult to handle. At the other extreme, among clerical workers and technicians, 75 % of the eligible population volunteered and completed the study. Among randomly selected fifth and eighth graders in a current study the rate was 91 %. Among high school students and adults, females have been more willing participants than males (Footnote 3).

Response frequency. Respondents varied in their rate of compliance with the method. Blue-collar workers have responded to 73 % of the signals on the average, clerical and managerial workers to 85 and 92 %, respectively, for an overall average of 80 %. High school students had a median response rate of 70 %. Pawlik and Buse (1982) and deVries et al. (1986) reported an 86 % completion rate, and Hormuth’s study (1985) had a median completion rate of 82 %.

Missing signals occur for a variety of reasons. From the debriefing interviews these appear to be due primarily to technical problems such as beeper malfunction or reception difficulties. The second most frequent source of attrition was forgetting the pager or the ESFs at home. A third source was related to the nature of the activity at the time the signal was transmitted, such as being in church, in the swimming pool, or in bed.

The frequency of delay in response to the pager is typically quite small. In our study of 75 adolescents (Csikszentmihalyi and Larson, 1984), 64 % responded immediately when they were signaled and 87 % responded within 10 min. In Hormuth’s (1985) study of 101 adult Germans (N = 5145 observations), 50 % of the signals were responded to immediately, 80 % within 5 min of the signal, and 90 % within 18 min. Similar results were obtained in other studies. Delays are usually due to being engaged in activities that cannot be interrupted, for example, taking a test in school, driving a car, or talking to a customer. In most studies, ESFs filled out more than 20 min after the signal were discarded.
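The latency figures and the discard rule above amount to a cumulative tabulation followed by a filter. The following sketch assumes each report carries a hypothetical `delay_min` field (minutes between signal and completion); the sample values are invented and the 20-min cutoff is taken from the text.

```python
def share_within(delays_min, limit):
    """Percentage of reports completed within `limit` minutes of the signal."""
    return 100.0 * sum(d <= limit for d in delays_min) / len(delays_min)


def drop_late(reports, max_delay=20):
    """Keep only reports filled out no more than `max_delay` minutes late."""
    return [r for r in reports if r["delay_min"] <= max_delay]


# Invented example reports.
reports = [
    {"id": 1, "delay_min": 0},
    {"id": 2, "delay_min": 4},
    {"id": 3, "delay_min": 12},
    {"id": 4, "delay_min": 35},
]
delays = [r["delay_min"] for r in reports]

immediate = share_within(delays, 0)
within_10 = share_within(delays, 10)
kept = drop_late(reports)
print(immediate, within_10, [r["id"] for r in kept])
```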

Experimental effects. Analyses of debriefing responses suggest that the intrusiveness of the method is not felt to be excessive. Among U.S. adults, 32 % said that the beeper was getting disruptive or annoying by the end of the week; 22 % of the German adults complained that it disrupted daily routine (Hormuth 1985). Ninety percent of the Americans and 80 % of the Germans felt that the reports captured their week well. When asked whether they would participate again in such a study, 75 % of Hormuth’s subjects answered yes.

Reliability of ESM Measures

Sampling Accuracy

Because the ESM obtains a systematic random sample of daily life, it provides a measure of how people spend their time during a typical week. This is both an important measure in its own right, indicating the frequency of different activities, and a means for determining sampling accuracy of the ESM reports. A comparison with diary records from time budget studies (Robinson 1977; Szalai 1972) shows that the frequency of activities measured with the ESM correlates well with the rank of time budget activity frequencies (r = 0.93). Although diaries and the ESM provide very similar measures of activity frequency, a few discrepancies between the two are worth mentioning because they seem to be exceptions that prove the accuracy of the ESM. Respondents reported “idling” over 5 % of the time, whereas this category does not even appear in the time budgets. Idling was coded when the respondents reported staring out of the window or standing about without doing or thinking anything. Apparently this type of behavior is drastically underrepresented in retrospective reports. Overall, however, the two measures produce almost identical values of time allocation for different daily activities. The duration and sequence of activities, however, are more clearly calculated with diary approaches.

Stability of Activity Estimates

How stable are these measures over time? To answer this question, the week has been divided into two halves, and the frequency of activities for the group was computed within each period. By and large, activities are reported with the same frequency in the two halves of the reporting period. For a sample of 107 adults the difference was not significant (χ²[15] = 8.1), in spite of the large number of observations. For a sample of 75 students, on the other hand, the differences were significant (χ²[13] = 33.4; p < 0.01). The primary difference was the greater percentage of time at work during the second half of the week, and it is attributable to the fact that more people, especially in the adolescent sample, started the experiment toward the end of the week (Thursday and Friday), so that the first half included more weekend self-reports.
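A comparison of this kind can be sketched as a chi-square test of homogeneity on a table with two rows (first and second half of the week) and one column per activity category. The text does not spell out the exact procedure, so this is an assumption; it is at least consistent with the reported degrees of freedom, since 16 activity groups give 15. The function name and the counts are invented, and the p value would still have to be looked up in a chi-square table.

```python
def chi_square_homogeneity(first_half, second_half):
    """Chi-square statistic and degrees of freedom for a 2 x k count table.

    Both arguments list report counts per activity category, in the same
    order. Every category must have at least one report overall.
    """
    if len(first_half) != len(second_half):
        raise ValueError("both halves need the same categories")
    n1, n2 = sum(first_half), sum(second_half)
    total = n1 + n2
    chi2 = 0.0
    for a, b in zip(first_half, second_half):
        column = a + b
        if column == 0:
            raise ValueError("empty category; merge or drop it first")
        for observed, row_total in ((a, n1), (b, n2)):
            expected = row_total * column / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(first_half) - 1


# Invented counts for three activity areas: work, maintenance, leisure.
chi2, df = chi_square_homogeneity([120, 90, 150], [140, 85, 135])
print(round(chi2, 2), df)
```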

Stability of Psychological States

Another issue is whether self-reports of affect, activation, motivation, and cognitive efficiency are stable over the testing period, or whether the pattern of responses changes over time as a result of the measurement procedure. To answer this question the mean scores for each individual from the first half of the week were compared with those from the second half.

None of the averaged individual mean response variables showed a significant change from the week’s beginning to its end. It should be added that for both adolescents (N = 75) and adults (N = 107) the most extreme change was on the variable free vs. constrained: adolescents (difference = 0.26, p = 0.002) and adults (difference = 0.16, p = 0.006) felt more constrained in the second half of the week, probably in reaction to the method itself.

The variance in responses around an individual’s mean diminished from the first to the second half of the week. For adults, the decrease in variance was always significant, whereas for adolescents the variance in affect and in strength was not.

Does this mean that in the course of the reporting period people become stereotyped in their responses and fail to differentiate between situations? To answer that question, we compared the first and the second halves of the week to determine whether the amount of variance accounted for by activities diminished with time, by calculating the variance attributable to the person’s own response pattern and that to his or her activities (see Table 3.1). To save space, only one variable for each of the four dimensions is presented here. Personal effects were in general more powerful, accounting for between one fifth and one third of the variance, whereas activity effects explained only about 5 %. In the second half of the week all the personal effects increased, indicating that with time individual responses become more predictable. However, activity effects did not show a comparable decline over time for either adolescents or adults. Thus the reduction in variance does not imply a lessened sensitivity to environmental effects, but a more precise self-anchoring on the response scales.

Table 1 Proportion of variance in psychological states accounted for by persons and activities in the first and second half of the week of ESM self-reporting

Individual Consistency Over the Week

Contrary to a one-time measure, the ESM is not based on the assumption that people are going to be entirely consistent in their responses. Person A might be happier than person B on Monday, but on Thursday B could be happier than A, depending on intervening experiences. Because the technique was devised to measure the effects of life situations on psychological states, perfect reliability would in fact defeat its purpose. In general, however, it was expected that relative differences between respondents would tend to persist over time. To check on individual response consistency, each subject’s mean and standard deviation in the first half of the week were correlated with the means and standard deviations in the second half.

All correlations were significant, for the means as well as for the standard deviations, indicating that levels of both response and variability are fairly stable individual characteristics. The median correlation coefficient on the eight variables was 0.60 for the adolescents and 0.74 for the adults, suggesting that better anchored psychological states develop with age.
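The consistency check described above can be sketched as follows. The split here is made at the midpoint of each person's chronologically ordered reports, which is an assumption; the studies divided the week itself into halves. The ratings are invented, and `pearson_r` is a plain textbook implementation written out to keep the sketch free of external dependencies.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def split_half_means(ratings):
    """Mean rating in the first and second half of one person's reports."""
    half = len(ratings) // 2
    return mean(ratings[:half]), mean(ratings[half:])


# Invented ratings on one item, in chronological order, for three people.
ratings_by_person = {
    "p01": [2, 2, 3, 3],
    "p02": [4, 4, 5, 5],
    "p03": [6, 6, 6, 6],
}
halves = [split_half_means(r) for r in ratings_by_person.values()]
first = [h[0] for h in halves]
second = [h[1] for h in halves]
r = pearson_r(first, second)
print(round(r, 2))
```

The same function applied to per-person standard deviations instead of means would give the stability of response variability.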

One investigator (Footnote 4) added five items intended to measure self-esteem to each ESF and collected 2287 observations from 49 mothers of small children. The Cronbach alpha for the set of items was 0.94, and the correlation between the mean self-esteem scores in the first half of the week and the second half was 0.86 (p = 0.0001).

Pawlik and Buse (1982), studying 135 high school students in Hamburg, also correlated the frequency of responses between the 3651 protocols in the first half of the week and the 3729 protocols in the second half. Their subjects reported only the presence or absence of various subjective states, instead of using Likert scales. The correlation coefficients were 0.57 for locations, 0.76 for moods, and 0.80 for motives, quite similar to the ones reported above despite the difference in methods.

Individual Consistency Over Two Years

Test-retest data are available for 28 adolescents (Freeman et al. 1980) who took part in ESM for a week first in their freshman or sophomore years in high school, and again, 2 years later, in their junior or senior years. The stability in their responses ranged from r = 0.45 (p = 0.05) for the variable active, to r = 0.77 (p = 0.001) for happy (Table 3.2).

Table 2 Changes in immediate experience from time 1 to time 2. The table is based on the average of the approximately 31 ESM self-reports for each of 27 adolescents at two points in time. From Freeman, Larson, and Csikszentmihalyi, 1966. Reprinted with permission

Internal Consistency

Because individual ESM items are administered many times, it is less important for most analyses to have multiple items measuring a single construct as in a traditional paper-and-pencil measure. Indeed, it would be stretching respondents’ patience to ask them to fill out a 20-item instrument 50 times in 1 week. For data reduction, however, researchers have factor-analyzed the ESM mood items and constructed scales by summing small numbers of items. Alpha levels for scales of affect (0.57) and arousal (0.48) are acceptable for measures computed from only four items.
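For readers who want to reproduce such coefficients, Cronbach's alpha for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies that standard formula to invented data; treating each row as one completed ESF with four item ratings is an assumption, since the text does not say at which level the alphas were computed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per observation)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two observations")
    # Sample variance of each item across observations.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented ratings on four items from four completed forms.
rows = [
    [4, 5, 4, 5],
    [2, 3, 3, 2],
    [6, 6, 5, 6],
    [3, 4, 4, 3],
]
alpha = cronbach_alpha(rows)
print(round(alpha, 2))
```

Because alpha rises with the number of items, coefficients from four-item scales are expected to be lower than those from long questionnaires measuring the same construct.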

Validity of ESM Measures

The reliability data on the convergence of diary and ESM measures also provide information on the validity of the ESM for measuring time usage. Here we will focus on its use for assessing internal states.

In general, the data suggest (a) that ESM reports of psychological states covary in expected ways with the values for physical conditions and with situational factors such as activity, location, and social context; (b) that measures of individual differences based on the ESM correlate with independent measures of similar constructs; and (c) that the ESM differentiates between groups expected to be different, e.g., patient and nonpatient groups or gifted and average mathematics students.

Situational Validity

Eight subjects wearing heart rate and activity monitors as well as ESM pagers were asked to supplement each self-report with a rating on a 10-point scale on “How physically active have you been in the past 3 min?” (Footnote 5). This self-reported item predicted heart rate as well as readings from the ankle and wrist activity monitors: the correlation of self-ratings with heart rate was r = 0.41 (p < 0.0001), and with monitor readings, r = 0.36 (p < 0.0001). Substantial individual differences in this relationship, ranging from r = 0.61 to r = 0.16, were significantly correlated with other personality variables that suggested important individual differences in how physically aware the subjects were.

In addition, the physical activity self-ratings differentiated very highly between four body positions. When respondents were lying down, the mean z-score for self-reported physical activity was −1.47, when sitting, it was 0.34, when standing, it was 0.43, and when walking, it was 1.03 (analysis of variance, F[3,268] = 41.7, p < 0.0001). Heart rate also varied in relation to body position (F[3,268] = 6.95, p < 0.0001).

Another example of an expected relationship between activity and experience is the association between what people do and their level of motivation. When activity categories are arranged on an obligatory-discretionary axis, the productive ones (work) are rated as obligatory 80 % of the time. Maintenance activities are obligatory less often, from cleaning house (54 %) to shopping (37 %), and leisure is rarely seen as obligatory, from socializing (15 %) to watching TV (3 %).

In the study of working women with small children mentioned above, it was found that the self-esteem of mothers was much higher when they were working or involved in leisure than when they were taking care of the house or of their children (analysis of variance, F[1,46] = 13.77, p < 0.0005). Additional expected relationships between activity and experience can be found over a wide range from states related to drug and alcohol use or binge eating (Johnson and Larson 1982; Larson et al. 1984) to emotions when alone on Friday or Saturday night (Larson et al. 1982), or to stress experienced in anxiety-provoking situations (Dijkman and deVries 1987, this issue, p. 550; Margraf et al. 1987, this issue, p. 558).

Individual Characteristics and Variation in Experience

ESM data can be used not only to assess regularities in how people experience different daily situations but also in analyses in which the person is the unit of analysis. For example, a number of researchers have found correlations between participants’ responses on the ESM and their scores on other psychometric instruments.

Giannino et al. (1979) entered ESM scores of a workers’ sample into a regression analysis with 27 predictor items. The variable that accounted for the largest proportion of variance in the affect dimension (13 %, p < 0.0001) was the alienation-from-self subscale of Maddi’s Alienation Test (Maddi et al. 1979). In other words, the best predictor of positive affect on the ESM was the absence of alienation from the self. Moreover, workers who scored high on a work satisfaction test scored much higher on the item “involved” when they were alone and actually working on their jobs than did subjects who scored low on work satisfaction. Satisfied workers had higher scores on concentration, skills, alertness, and motivation (in each case, p < 0.0001).

The strength of the need for intimacy was measured by McAdams and Constantian (1983) using projective techniques and comparing them with ESM measures. People with a high need for intimacy reported more thoughts about people and relationships (r = 0.52, p < 0.001), more conversation with others (r = 0.40, p < 0.01), higher affect when with others (p < 0.001), and a lower rate of wishing to be alone when with others (r = –0.32, p < 0.05).

Hamilton et al. (1984) developed a questionnaire scale for measuring the amount of enjoyment subjects report in their daily lives. The amount of intrinsic enjoyment was related to several ESM variables such as motivation, wish to be doing the activity (p < 0.001), concentration, ease of concentration, control, and activity/potency (p < 0.05 in each case).

One-time assessments on the Rosenberg Self-Esteem scale (RSES) were compared with the average of a repeated self-esteem scale (four items from the RSES included in the ESF) and with the average of five ESM items related to self-esteem. The one-time RSES correlated with the repeated RSES items at r = 0.62 (N = 49, p < 0.0001) and with the ESM self-esteem items at r = 0.42 (p < 0.002). This latter correlation varied considerably depending on the social context: When subjects were alone, their responses on the ESM self-esteem items correlated with the one-time RSES at only r = 0.26 (NS); when only children were present, at r = 0.36 (p < 0.05); and when adults were also present, at r = 0.50 (p < 0.001). In other words, self-esteem as measured by a one-time traditional test corresponds most closely to the self-esteem people report when they are in public.
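The comparison described above, a one-time test score set against person-level averages of repeated ESM reports, overall and within each social context, can be sketched as follows. This is only an illustration under stated assumptions: the function names (`pearson_r`, `person_means`, `trait_state_r`), the data layout, and all numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def person_means(reports, context=None):
    """Mean ESM score per person, optionally restricted to one social context."""
    totals = defaultdict(lambda: [0.0, 0])  # person -> [sum of scores, count]
    for person, ctx, score in reports:
        if context is None or ctx == context:
            totals[person][0] += score
            totals[person][1] += 1
    return {person: total / count for person, (total, count) in totals.items()}


def trait_state_r(one_time, reports, context=None):
    """Correlate one-time test scores with person-level ESM means."""
    means = person_means(reports, context)
    people = sorted(p for p in means if p in one_time)
    return pearson_r([one_time[p] for p in people], [means[p] for p in people])


# Invented illustrative data, NOT values from the study.
one_time = {"A": 30, "B": 22, "C": 26, "D": 18}  # hypothetical one-time scores
reports = [  # (person, social context, ESM self-esteem score)
    ("A", "alone", 6), ("A", "alone", 5), ("A", "adults", 8), ("A", "adults", 9),
    ("B", "alone", 6), ("B", "alone", 7), ("B", "adults", 6), ("B", "adults", 5),
    ("C", "alone", 4), ("C", "alone", 5), ("C", "adults", 7), ("C", "adults", 7),
    ("D", "alone", 5), ("D", "alone", 6), ("D", "adults", 4), ("D", "adults", 5),
]

for context in (None, "alone", "adults"):
    label = context or "all reports"
    print(label, round(trait_state_r(one_time, reports, context), 2))
```

Averaging within each person before correlating keeps the person as the unit of analysis, so respondents who answered more signals do not weigh more heavily in the coefficient.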

Clinically, Loewenstein et al. (1987) reported the use of ESM with a woman with multiple personality disorder. They found that the alternates displayed quantitative differences on the affect and motivational scales comparable to those observed between separate individuals.

Differences in Experience Between Groups

The ESM has also differentiated well between the responses of groups with distinctive behavioral patterns and groups with psychopathology. For example, deVries and associates (deVries 1983; deVries et al. 1986) coded the thoughts reported on the ESFs of Dutch schizophrenic and nonschizophrenic mental patients; the schizophrenics generally suffered more severe thought disorders, whereas the other patients suffered mainly from affective disorders. They found that nonschizophrenic patients evidenced congruence between thoughts and actions 75 % of the time, whereas the schizophrenics showed such congruence only half of the time (t = 2.82, p < 0.005). The thoughts reported by the schizophrenics were also coded as disordered much more often than were those of the control group (t = 9.13, p < 0.0005). On the other hand, the schizophrenics reported a more positive average affect (t = 1.78, p < 0.05).

In their investigation of women with eating disorders, Johnson and Larson (1982) found that bulimic women were involved in food-related behavior or thought for an average of 38 % of their waking time, as opposed to an average of 14 % for a comparison group of women. They also found that the overall level of positive affect was lower for bulimics than for normal women (happy, t = 4.66, p < 0.001; cheerful, t = 4.14, p < 0.001; sociable, t = 4.12, p < 0.001).

Overview

Since its introduction 10 years ago, the ESM has proved to be a useful tool for psychological research. Its main contribution has been to make the variations of daily experience, long outside the domain of objectivity, available for analysis, replication, and falsification, thus opening up a whole range of phenomena to systematic observation.

The greatest heuristic usefulness of the ESM lies in its description of the patterns of an individual’s daily experience. Because the method yields repeated measurements of a person’s activities, feelings, thoughts, motivations, and medical symptoms over time, questions such as the following can be answered: How much of the person’s variation in happiness (or any other state) is related to what the person does; to the company he or she keeps; to the time of day; to intervening events? By the same token, the ESM can reveal subjective effects of major life changes that might otherwise be hidden from consciousness by distortion or inaccurate recollection. The comparison of responses before and after a job or family change, a clinical intervention, or other changes in life situations can reveal what impact such transitions have on a person’s daily life.
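One simple way to approach the question of how much of a person's variation in a state is related to what the person does is to partition that person's variance by activity category and take the ratio of between-category to total sum of squares (eta squared). The sketch below is a minimal illustration of that idea, not a procedure reported in the studies reviewed here; the function name `eta_squared` and the example scores are hypothetical.

```python
from collections import defaultdict


def eta_squared(observations):
    """Share of variance in a score associated with a grouping factor.

    observations: list of (category, score) pairs from ONE respondent,
    e.g. (activity, happiness rating) at each signal.
    Returns SS_between / SS_total, a value between 0 and 1.
    """
    scores = [score for _, score in observations]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((score - grand_mean) ** 2 for score in scores)
    if ss_total == 0:
        raise ValueError("scores do not vary, so the ratio is undefined")

    groups = defaultdict(list)
    for category, score in observations:
        groups[category].append(score)

    # Each category contributes n * (category mean - grand mean)^2.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Invented ratings for one hypothetical respondent.
week = [("work", 2), ("work", 4), ("leisure", 6), ("leisure", 8)]
print(eta_squared(week))
```

The same function can be applied with companionship or time of day as the grouping factor. Because ESM reports from one person are not independent observations, the ratio is better read as a description than as a basis for significance testing.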

Adding up patterns within a person, it becomes possible to use ESM to evaluate the common experience of situations. For instance, is solitude, housework, or marijuana smoking experienced similarly by different individuals? It follows that the ESM can be used to compare the subjective experience of different events. Do men and women differ in their daily emotions? What states, feelings, attitudes differentiate talented achievers from talented nonachievers? How can we compare the experience of physically handicapped children with that of other children? Likewise, ESM data can reveal whether changes in life situations elicit consistent changes in experiences of people in general.

A final use of ESM is to study the dynamics of emotions and other subjective states. The study of consciousness has lagged behind other fields of psychology. We know little about the structure of emotions and less about how other dimensions of our psychological state (e.g., concentration, involvement, motivation) ebb and flow in daily experience. ESM data allow examination of the magnitude, duration, and sequences of states, as well as an investigation of correlations between the occurrences of different experiences. For example, one can examine whether concentration is typically associated with positive affect, how long it lasts, and what factors are related to its ending.
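As a rough illustration of the last example, the duration of a state and the circumstances of its ending can be approximated from one respondent's ordered reports by finding runs of consecutive signals on which a rating exceeds a cutoff. The sketch below rests on several assumptions: the cutoff, the function names (`episode_lengths`, `activities_at_episode_end`), and the data are all hypothetical, and because signals arrive at random intervals, a run length counted in reports is only a coarse proxy for elapsed time.

```python
def episode_lengths(series, threshold):
    """Lengths of consecutive runs of reports strictly above threshold."""
    lengths = []
    current = 0
    for value in series:
        if value > threshold:
            current += 1
        elif current:
            lengths.append(current)  # the run has just ended
            current = 0
    if current:  # a run still open at the last report
        lengths.append(current)
    return lengths


def activities_at_episode_end(series, activities, threshold):
    """Activity recorded at the first report after each high run ends."""
    endings = []
    in_run = False
    for value, activity in zip(series, activities):
        if value > threshold:
            in_run = True
        elif in_run:
            endings.append(activity)
            in_run = False
    return endings


# Invented concentration ratings and activities for one hypothetical day.
concentration = [1, 5, 6, 2, 7, 7, 7, 1]
activity = ["tv", "work", "work", "phone", "work", "work", "work", "meal"]

print(episode_lengths(concentration, threshold=4))
print(activities_at_episode_end(concentration, activity, threshold=4))
```

Tallying the activities returned by the second function across many days gives a first, purely descriptive view of what tends to accompany the end of a period of high concentration.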

The major limitation of the ESM is the obvious one: its dependence on respondents’ self-reports. This limitation becomes a concern in situations in which it is conceivable that a large segment of one’s sample provided inaccurate or distorted data. For example, if an employer used the method to study his employees’ productivity, the accuracy of self-reports related to working would be suspect, as would the ESM results in an investigation of private, sensitive, or illegal activities.

When self-reports deal with the immediate, however, they have been found to be a very useful source of data (Ericsson and Simon 1980; Mischel 1981). In this paper we have presented ample evidence indicating that they typically provide a plausible representation of reality.