Introduction

Use of real-time or momentary assessment of mood, symptoms, and behavior has increased in recent years in medical settings for clinical and research purposes. This development stems from concerns about recall bias in typical paper-and-pencil recall instruments (Gorin & Stone, 2001), an interest in studying within-person relationships (Affleck et al., 1999; Zautra, Smith, Affleck, & Tennen, 2001), and technological advances creating electronic diaries to collect ratings (Shiffman, 2000). Concurrent with this development have been concerns about reactivity to frequent measurements, often 3–12 times per day (Cruise, Broderick, Porter, Kaell, & Stone, 1996). It is important for the welfare of the patient to determine whether frequent reporting about negative symptoms induces negative mood, and it is also important from a psychometric point of view, since there is evidence that negative mood results in increased retrieval of negative valence information and might bias symptom reports (Clark & Teasdale, 1982). Several studies have demonstrated that the phenomenon being assessed does not appear to be altered by frequent measurement (Peters et al., 2000; Stone et al., 2003; von Baeyer, 1994). However, only a few studies in the literature report data that address the effect of frequent measurement of symptoms on mood. This may be particularly important in studies involving many days of assessment in a sample that is prone to depressed mood.

Three studies have data that address this issue. Barge-Schaapveld and colleagues assessed 21 patients in a depression medication trial 10 times per day for 6 days at both pre-treatment and at post-treatment phases (Barge-Schaapveld, Nicolson, van der Hoop, & De Vries, 1995). At each assessment, patients rated mental state, mood, and activity on a paper diary. Momentary negative affect decreased in treatment responders (medical team determination), and remained constant in non-responders, indicating no evidence of increased negative mood from momentary assessment. In a second study, reactivity to momentary assessment of pain and positive and negative mood was examined in a 7-day study of 17 rheumatology patients involving 7 random assessments per day (Cruise et al., 1996). Analyses indicated no systematic effect on mood with on-going momentary assessment. A third study conducted electronic diary assessment of pain, pain interference and limitations, and mood in 71 patients with chronic temporomandibular disorder 3 times daily for 2 weeks (Aaron, Turner, Mancl, Brister, & Sawchuk, 2005). On a post-assessment debriefing questionnaire, many patients indicated that they believed that using the electronic diary affected their pain, coping, mood, and activities. In all cases, the majority rated the effect as positive. However, objective analysis of these ratings over the 14 days indicated no significant effect either positive or negative.

While these three studies are all consistent with the conclusion that momentary assessment of symptoms does not increase depressed mood, they have several limitations. First, the lengths of the three study protocols were 6, 7, and 14 days; it is possible that longer periods of assessment could evoke a mood change not observed in shorter protocols. Second, only one of the studies (Barge-Schaapveld et al., 1995) described the level of pre-assessment depression in its sample. It is possible that level of depressed mood moderates the effect of momentary assessment on subsequent mood such that more depressed individuals might be more vulnerable to the effects of frequent rating of negative symptoms. Third, although each of these studies examined mood ratings, none incorporated a more comprehensive assessment of depressed mood.

The current study reports an analysis of the effect of 30 days of momentary assessment of pain and fatigue on depressed mood among rheumatology patients with a range of pre-assessment levels of depression. It is a secondary analysis of data from a study examining the correspondence between momentary assessment and several recall periods of different lengths.

Method

Participants

Patients were recruited in two community rheumatology clinic waiting rooms to participate in the study. Telephone screening was conducted on 279 patients and 86 (31%) people were ineligible due to visual or hearing difficulties, inability to hold a pen, atypical sleep-wake schedule, no rheumatoid-related chronic illness diagnosis, or previous participation in a momentary assessment study in our laboratory. Of the 193 eligible patients, 76 (39%) declined participation, and 117 (61%) participated. The most common rheumatoid-related chronic diseases among participants were osteoarthritis (49%), rheumatoid arthritis (29%) and lupus (17%). Diagnoses were confirmed by the participants’ physicians.

A majority of participants were female (86%) and married (64%) with a mean age of 56 years (range 28–88). Most were high school graduates (97%), and a majority of participants had completed some college (71%). The protocol was approved by the Stony Brook University Institutional Review Board. All patients completed informed consent and received $100 compensation for their participation.

Measures

Depression was assessed with the Beck Depression Inventory-II (BDI-II) (Beck, Steer, & Brown, 1996). It consists of 21 items that assess the cognitive, affective, and somatic symptoms of depression in the past two weeks. The BDI-II has good internal consistency and test-retest reliability across various populations (Beck, Steer, & Garbin, 1988), and has been described as an efficient screening measure in pain clinic samples (Turner & Romano, 1984). The clinical cutoff scores for level of depression suggested in the BDI-II manual were applied: 0–13 = minimal, 14–19 = mild, 20–28 = moderate, 29–63 = severe (Beck et al., 1996).
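The cutoffs above amount to a simple lookup. The following is a minimal Python sketch for illustration only, not part of the study's procedures; the names `BDI2_CUTOFFS` and `classify_bdi2` are hypothetical and introduced here.

```python
# Hypothetical helper: maps a BDI-II total score to the severity label
# defined by the manual's cutoffs (Beck et al., 1996) as quoted in the text.
# Each entry is (inclusive upper bound, label).
BDI2_CUTOFFS = [
    (13, "minimal"),
    (19, "mild"),
    (28, "moderate"),
    (63, "severe"),
]


def classify_bdi2(total_score: int) -> str:
    """Return the severity label for a BDI-II total score (0-63)."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI-II total scores range from 0 to 63")
    for upper_bound, label in BDI2_CUTOFFS:
        if total_score <= upper_bound:
            return label
    raise AssertionError("unreachable: 63 is the maximum score")
```

Under these cutoffs, a total score of 13 is classified as minimal and a score of 14 as mild.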

Momentary ratings of pain and fatigue were collected for 29–34 days on a hand-held computer (Palm Zire 31). The electronic diary (ED) utilized a software program provided by Invivodata Inc. (Pittsburgh, PA) which featured auditory tones to signal the participant to complete a set of ratings. The ED was programmed to generate seven randomly scheduled signals across the participant’s waking hours, determined by when the participant put the ED to sleep at night and a specified wakeup time for the next morning. An end-of-day set of ratings was also completed before bed. Several user-friendly features, such as delay and suspend, were incorporated into the electronic diary to decrease the intrusiveness of the assessment protocol. Each ED assessment began with the participant responding to questions about their location, activity, and whether they were alone or with others. These were followed by visual analogue scale items from several established pain and fatigue instruments.
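The text does not describe how the ED software placed the seven signals within the waking day. As one plausible illustration only, the Python sketch below assumes a stratified scheme: the waking period is split into equal blocks and one signal time is drawn uniformly within each block. The function name `schedule_signals`, its parameters, and the stratification itself are assumptions, not details of the Invivodata program.

```python
import random


def schedule_signals(wake_minute, sleep_minute, n_signals=7, rng=None):
    """Draw one random signal time per equal-length block of the waking day.

    Times are minutes since midnight. ASSUMPTION: stratified random
    sampling; the actual ED scheduling algorithm is not given in the text.
    """
    if sleep_minute <= wake_minute:
        raise ValueError("sleep time must be later than wake time")
    rng = rng or random.Random()
    block = (sleep_minute - wake_minute) / n_signals
    # Signal i falls somewhere inside block i, so the list is already sorted.
    return [wake_minute + (i + rng.random()) * block for i in range(n_signals)]


# Example: waking at 07:00 and going to sleep at 23:00.
signals = schedule_signals(7 * 60, 23 * 60, rng=random.Random(0))
```

Stratifying by block spreads signals across the day, whereas fully independent draws could cluster several signals close together; which approach the study's software used is not stated.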

Procedure

Participants completed the first BDI-II during the initial visit to the laboratory when they were trained in the use of the ED. The standardized training was done individually or in small groups of 2–3 patients and involved instruction and practice in making ratings and using the other features of the ED. Participants took the ED home, were telephoned within 48 h to trouble-shoot any difficulties or answer questions, and completed assessments for at least 28 days. In addition, patients completed 6 recall pain and fatigue questionnaires via interactive voice recording during the 30-day protocol (not reported in this study). Participants returned to the laboratory at the end of the protocol and completed a second BDI-II and a debriefing questionnaire and interview.

Results

Nine percent (11 of 117) of patients did not complete the protocol, feeling it required too much time or was too burdensome, and 1 patient was excluded due to completing less than 50% of the assessments on the ED. Thus, the analyses were conducted on 105 patients (90%) whose overall compliance with ED assessments was 95%. On average, patients completed 5.6 momentary assessments per day plus the end-of-day rating. Over the course of the protocol, each patient completed ratings of pain and fatigue approximately 158 times.
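The participant flow from screening to analysis can be checked with simple arithmetic. The short Python sketch below uses only the counts reported in the text; the variable names and the `pct` helper are introduced here for illustration.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


screened = 279
ineligible = 86
eligible = screened - ineligible           # 193 eligible patients
declined = 76
enrolled = eligible - declined             # 117 participants
dropped_out = 11
excluded_low_compliance = 1
analyzed = enrolled - dropped_out - excluded_low_compliance  # 105 analyzed

flow = {
    "ineligible_pct": pct(ineligible, screened),  # 31
    "declined_pct": pct(declined, eligible),      # 39
    "enrolled_pct": pct(enrolled, eligible),      # 61
    "dropout_pct": pct(dropped_out, enrolled),    # 9
    "analyzed_pct": pct(analyzed, enrolled),      # 90
}
```

Each computed percentage matches the value reported in the text.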

Table 1 displays the distribution of patients by depression severity classification for both pre- and post-protocol assessments. Approximately a third of the patients had scores indicating none to mild depression, and 20% scored in the moderate to severe range at both time points. On average, depression scores decreased from pre- to post-protocol. The mean BDI-II score prior to engaging in the momentary assessment protocol was 13.1 (SD = 10.8), and it was 11.4 (SD = 10.1) at the end of the protocol, which was significantly lower (t(104) = 3.60, p < .001).
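The pre-post comparison is a paired-samples t test on each patient's two BDI-II scores. The Python sketch below shows the computation using only the standard library; the function name `paired_t` is hypothetical and the four score pairs are made-up illustrative values, not study data.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t test on pre minus post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical illustrative scores (NOT the study's data).
example_pre = [10, 12, 14, 16]
example_post = [9, 10, 13, 13]
t_value, df = paired_t(example_pre, example_post)
```

With 105 patients the degrees of freedom are 105 - 1 = 104, consistent with the reported t(104). A positive t here reflects higher scores before the protocol than after it.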

Table 1 Percent of patients at each depression level prior to and following 30 days of momentary assessment (N = 105)

To determine if change in depression scores varied by initial level of depression, change (pre–post) in BDI-II scores was computed for each patient and analyzed by depression classification. These change scores are displayed in Table 2. Analysis of variance indicated a significant main effect of level of depression (F(3, 104) = 7.58, p < .001), and Tukey HSD post hoc tests found that the minimally depressed patients’ BDI-II scores were reduced the least at the end of the protocol relative to the mild and severely depressed patients (p = .005 and .001, respectively). A more detailed examination of changes in depression indicated that 10% of patients (10/105) moved to a higher depression classification at the end of the protocol, and 20% (21/105) moved to a lower depression classification. The 10 who shifted into a more severe classification level of depression had a mean increase on the BDI-II of 6.0 points, and the 21 who shifted to a lesser level had a mean decrease of 7.3 points.

Table 2 Mean (SD) change in BDI-II scores from pre- to post-protocol by depression classification at pre

Subjective information about patients’ reactions to the protocol was also examined. Two questions on the debriefing questionnaire and interview were most relevant. “How willing would you be to participate in a study like this again?” received similar responses regardless of whether the patients improved, worsened, or stayed in the same depression classification (χ²(2, N = 105) = 3.36, p > .05). Twelve percent of those whose depression level did not change rated their willingness as “not at all” or “slightly willing,” and 29% of those who improved and 20% of those who worsened gave these ratings. In response to the question, “How much did rating your symptoms help you better understand your pain and/or fatigue?” there were no significant differences by level of depression (χ²(2, N = 105) = 0.17, p > .05); 50–57% of patients endorsed “moderately” or “extremely helpful.” Finally, when prompted during the interview to list things not liked about the protocol, only 6 patients (6%) said they did not like being forced to think about their symptoms. None of these patients had an increase in depressed mood across the study.

Discussion

Repeated focusing on the negative symptoms of pain and fatigue during a momentary assessment protocol showed no evidence of systematically increasing depressed mood. This study tested the hypothesis in a long and demanding protocol (approximately 30 days) with approximately 6 completed assessments per day in a sample of chronic illness patients, a third of whom had some degree of depression. On average, depression scores were lower at the end of the protocol compared with the scores prior to starting the momentary assessment. Although this study did not have the benefit of a control group to monitor changes in depression in the absence of momentary assessments, in a chronic illness sample not initiating new treatment, it is unlikely that depression scores would substantially change. Consequently, these data provide reasonable support for a conclusion of no iatrogenic effect of intensive momentary assessment of negative symptoms on depressed mood.

Given that the study had a 10% drop-out rate, the possibility that very depressed patients found the task unpleasant and dropped out had to be ruled out. The mean BDI-II score measured before the momentary protocol for the 11 study drop-outs was 9.45 (7.2) indicating a low level of depressed mood. Therefore, this eliminates the concern that drop out was driven by very depressed patients electing out of the study due to unpleasantness of the protocol.

In contrast to the concern about an iatrogenic effect of repeated monitoring that instigated this study, the results suggest that there may be a positive effect of frequent assessment of symptoms. Patients with mild to severe levels of depressed mood prior to the study had BDI-II scores that were 3–5 points lower at the end of a month of symptom monitoring. It is possible that by frequently monitoring pain and fatigue, patients noticed the inherent variability in their symptom levels, including periods with lower symptom intensities. Similar to Aaron and colleagues (Aaron et al., 2005), we solicited observations from participating patients about their experience in the protocol. Consistent with Aaron et al. and our own previous studies, there was a range of reactions. Anecdotal comments of participants suggested that momentary assessment of pain often results in recognition of the amount of time with no or little pain. Participants also observe causal relationships (real or imagined) between activities, time of day, or emotions and pain levels, thus conferring some sense of increased understanding or control over pain. These experiences may reduce depressed mood. Furthermore, patients may also have experienced a psychological benefit by expressing or communicating the experience of their illness. There is a body of literature on emotional expression as a therapeutic intervention in chronic illness that lends some support to this hypothesis (Smyth, True, & Souto, 2001).

Nevertheless, some patients also made negative observations about the burden of carrying the ED with them and having to make frequent ratings that are sometimes inconvenient. Although the statistical analysis of our data and that of prior studies (Aaron et al., 2005; Barge-Schaapveld et al., 1995; Cruise et al., 1996) indicates no ill effect of momentary assessment on mood and perhaps even a potential positive effect, the personal observations and beliefs of a small minority of patients should not be dismissed. Thus, it would seem prudent to include a statement in the informed consent for these protocols that most patients do not experience any negative effect on their mood, but a small number may find frequent reporting on their disease symptoms unpleasant.

Caution should be exercised in generalizing these findings to all negative symptoms that could be the focus of assessment. For our medical patients, the reporting of pain and fatigue may not have been very different from their typical periodic reflections about their symptom state. Thus, it did not induce a detectable increase in depressed mood. However, one could imagine that the outcome might be different for symptoms or behaviors that are associated with shame (e.g., expression of hostility or bowel incontinence) or with fear or dread (e.g., symptoms of cancer recurrence, post-traumatic stress flashbacks). Consequently, we would advise clinicians and researchers to consider the emotional response of patients to reporting the phenomenon of interest when considering use of momentary assessment. If there is an indication from clinical interviews that patients are upset or embarrassed by reporting it, then caution may be warranted. However, we believe that for the vast majority of outcomes of interest to clinicians and researchers, there will not be an iatrogenic effect of momentary assessment. Furthermore, in the one study that examined the effect of increasing numbers of momentary assessments per day, there was no evidence that a higher sampling density leads to an increase in reactivity (Stone et al., 2003).

This study had several strengths. First, the sample provided an opportunity to examine the potential effects of repeated assessment of negative symptoms in a patient population that may well be generalizable to other chronic conditions. Second, since this sample was typical of patients with chronic illness for whom depression rates are elevated, it was possible to examine changes in depressed mood in a sample that is vulnerable to depression. Thus, this sample would be likely to show the effect if it were operative. Third, the sampling density of the momentary assessment is typical of, and thus generalizable to, studies utilizing this assessment methodology. Fourth, the length of the protocol is long enough to generalize to the assessment period of the vast majority of clinical trials. There are also limitations. As depressed mood was not measured on a momentary basis, it is possible there could have been an increase in negative mood due to the assessments that was not detected by the Beck Depression Inventory-II.