Introduction

Though physicians in every specialty must relay adverse medical information to patients, this task is particularly common in the oncology setting, where news regarding a life-threatening diagnosis, treatment failure, and recurrence is frequently given. Discussing bad news is a primary communication task that practitioners must engage in when discussing a cancer diagnosis, cancer progression, treatment failure, or cancers for which there are no effective anti-cancer treatments. Thus, it is a task that must be faced thousands of times during the course of an oncology clinician’s career, and it has been the most frequently studied communication task [3].

The need to convey unfavorable news is frequent when interacting with patients with advanced cancers, particularly for those with metastatic cancer of unknown primary (CUP), which is defined by the presence of metastatic cancer in the absence of a documented primary site [1, 2]. In addition to presenting a diagnostic and therapeutic challenge, CUP is a potential source of stress for both patients and physicians since the management of a patient with cancer typically begins with a specific diagnosis that provides the foundation for treatment decisions. In the absence of an identified primary, CUP patients often feel that the diagnosis is incomplete and that their evaluation may be inadequate.

Since a diagnosis of cancer is associated with a number of potentially unfavorable events including debilitating and/or disfiguring treatment, pain, loss of function, and death, these discussions may be particularly stressful and difficult for both patients and healthcare providers [6, 17]. Physicians may experience anticipatory stress or anxiety when preparing to give the news [6]. The amount of stress experienced may be associated with factors such as the type of news and the physician’s perceptions about his or her ability to effectively convey the news. Ptacek and Eberhardt [17] describe a dynamic model of the stress associated with bad news which suggests that physicians may be more stressed prior to and while giving the news, whereas the height of patient stress may occur after the news has been given. Giving bad news has also been shown to be associated with increases in blood pressure and heart rate [5]. In addition, delivering bad news was shown to increase natural killer cell function in medical students in a simulated physician–patient scenario of breaking bad news [7].

When receiving bad news, the impression of physician empathy may be of particular importance to the patient [15, 19]. Perceived physician empathy has been shown to positively impact patient satisfaction and compliance [13].

Empathy has been defined as “a cognitive attribute that involves an ability to understand the patient’s inner experiences and perspective and a capability to communicate this understanding,” [11] and may be communicated through both nonverbal and verbal avenues. Verbal expressions of empathy have received most research attention.

Nonverbal communication is less frequently addressed in the empathy literature, and yet it is critical to understanding and conveying emotion [21]. Emotions such as sadness and fear can be identified with high accuracy based only on prosodic cues of pitch, loudness, and speaking rate [4, 8, 12, 21].

This study was designed to explore two hypotheses, one addressing speech production, and the other, perception. The speech production goal was to determine if differences existed in the speaking rate and pitch of healthcare providers when bad news topics versus neutral topics were discussed. It was hypothesized that the delivery of bad news would be characterized by a slower rate and lower pitch than neutral utterances. The perception goal was to assess the ability of listeners to perceive differences between bad news and neutral comments based on prosody alone. It was hypothesized that listeners would perceive differences in the absence of speech content.

Methods

Speech samples collection

The speech samples used for this study were obtained from healthcare providers (medical oncologists, oncology fellows, physician assistants) who were seeing patients with CUP. The outpatient consultation sessions between providers and patients with CUP were audio recorded as part of a larger study of uncertainty, communication, and psychosocial adjustment in patients with CUP.

Speech samples classification as neutral or bad news

The transcripts of the provider–patient interactions were independently reviewed by a medical oncologist, a psychiatrist, and a clinical psychologist, and a consensus was reached on the identification of the various portions of the interviews as neutral or bad news. Bad news included the disclosure of unfavorable developments in the patient’s illness such as the confirmation of a diagnosis of cancer, cancer recurrence, treatment failure to control the growth/spread of the cancer, unavailability of further anticancer therapies, or need to transition to palliative care only.

Examples of provider utterances that were considered neutral were greetings, communication of information related to the scheduling of test or treatment administration, expected time frame for obtaining test results, and other statements with logistic non-emotional content such as “maybe Dr. C has the chart,” “I have not had a chance to talk to Dr. C yet,” or “they are waiting for us”.

Thirty-three available transcripts of the medical visits were reviewed to identify segments of interactions in which bad news was delivered. Of these, 16 did not contain utterances meeting the criteria for classification as bad news and therefore were not evaluable for the purpose of this study. We identified clear bad news interactions in 17 transcripts, representing 12 different healthcare providers. For each transcript, we also identified segments of the visit that were neutral in content (e.g., introductory comments).

Preparation of speech samples for acoustic analysis

Questions were eliminated from the analysis because the upward inflection would impact the measure of pitch. At least 30 seconds of both bad news and neutral comments were identified for analysis. Because the neutral comments were often brief, they were accumulated across the interactions until at least 30 seconds were obtained. The mean number of seconds for the neutral comments was 33.2 (SD = 5.7), while the mean number of seconds for the bad news comments was 40 (SD = 12).
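The accumulation of brief neutral comments up to the 30-second minimum can be sketched as follows. This is only an illustration under stated assumptions: the function name `accumulate_segments` and the example durations are hypothetical and are not taken from the study data.

```python
def accumulate_segments(durations_s, minimum_s=30.0):
    """Collect segments in order until their total duration reaches minimum_s.

    durations_s: segment durations in seconds (hypothetical values below).
    Returns (selected_durations, total_seconds). If the segments never reach
    the minimum, everything available is returned.
    """
    selected = []
    total = 0.0
    for duration in durations_s:
        if total >= minimum_s:
            break
        selected.append(duration)
        total += duration
    return selected, total


# Hypothetical durations of short neutral comments from one visit.
neutral_durations = [8.0, 12.5, 6.0, 9.0, 4.0]
selected, total = accumulate_segments(neutral_durations)
print(len(selected), total)
```

Because whole comments are added until the threshold is crossed, the accumulated duration will generally exceed 30 seconds by a variable amount, which is consistent with sample lengths that differ from one provider to the next.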

Acoustic analysis of speech samples

Utterances were segmented after identification as characterizing bad news or neutral comments. The time for each utterance was measured in seconds using an acoustic analysis program (Computerized Speech Lab, CSL, KayPENTAX). The utterance times were totaled and divided by 60 to obtain a time in minutes. The total number of words spoken in each utterance was counted and totaled. To obtain a measure of speaking rate, the total number of words was divided by the time in minutes. To obtain the measures of pitch, the same segments were extracted and analyzed using the Multi-Dimensional Voice Program, a component of the CSL. Mean fundamental frequency in Hertz (the objective measure of pitch) was obtained for each utterance and averaged across the bad news utterances and the neutral ones.
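The speaking rate and mean pitch calculations described above reduce to simple arithmetic, sketched here in Python. The function names and all numeric values are hypothetical illustrations. The per-utterance fundamental frequency values are assumed to have already been measured by the acoustic analysis software, which this sketch does not reproduce.

```python
from statistics import mean


def speaking_rate_wpm(utterances):
    """Words per minute over a set of utterances.

    utterances: list of (word_count, duration_seconds) pairs.
    Follows the steps in the text: total the times, divide by 60 to get
    minutes, total the words, then divide words by minutes.
    """
    total_words = sum(words for words, _ in utterances)
    total_minutes = sum(seconds for _, seconds in utterances) / 60
    return total_words / total_minutes


def mean_f0_hz(per_utterance_f0):
    """Average of per-utterance mean fundamental frequency values (Hz)."""
    return mean(per_utterance_f0)


# Hypothetical bad news utterances: (word count, duration in seconds).
bad_news = [(25, 10.0), (40, 15.0), (20, 8.0)]
print(round(speaking_rate_wpm(bad_news), 1))

# Hypothetical per-utterance mean F0 values in Hz.
print(mean_f0_hz([118.0, 122.0, 120.0]))
```

Note that dividing total words by total time weights longer utterances more heavily than averaging a separate rate for each utterance would.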

To determine if a statistical difference existed between the speaking rate and pitch associated with bad news and neutral utterances, paired t tests were performed across all healthcare providers. If a provider was represented more than once, his or her data were averaged before statistical analysis was completed.
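A minimal sketch of this analysis step is given below: repeated recordings of the same provider are averaged first, and a paired t statistic is then computed on the per-provider values. The function names, provider labels, and data are hypothetical. The sketch returns only the t statistic and degrees of freedom; obtaining a p value would require a t distribution, which is assumed to come from a statistics package and is not implemented here.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def average_by_provider(records):
    """Average repeated measurements so each provider contributes one pair.

    records: iterable of (provider_id, neutral_value, bad_news_value).
    Returns {provider_id: (mean_neutral, mean_bad_news)}.
    """
    grouped = defaultdict(list)
    for provider_id, neutral, bad_news in records:
        grouped[provider_id].append((neutral, bad_news))
    return {
        pid: (mean(n for n, _ in pairs), mean(b for _, b in pairs))
        for pid, pairs in grouped.items()
    }


def paired_t(pairs):
    """Paired t statistic for (neutral, bad_news) pairs.

    Returns (t, degrees_of_freedom). A positive t means the neutral values
    are higher on average than the bad news values.
    """
    diffs = [neutral - bad_news for neutral, bad_news in pairs]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical speaking rates (words per minute); provider "A" appears twice.
records = [
    ("A", 180.0, 160.0),
    ("A", 170.0, 150.0),
    ("B", 200.0, 190.0),
    ("C", 150.0, 155.0),
    ("D", 190.0, 170.0),
]
per_provider = average_by_provider(records)
t_stat, df = paired_t(list(per_provider.values()))
print(df, round(t_stat, 2))
```

Averaging before testing keeps the observations independent at the provider level, so the degrees of freedom equal the number of providers minus one.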

Preparation of speech samples for perceptual analysis

To determine if listeners could perceive differences in neutral and bad news conditions, the same utterances on which acoustic analyses were performed were subjected to low pass filtering at 30 Hz using Adobe Audition 1.5 software. This filtering strategy maintained pitch contours and speaking rate, but eliminated acoustic energy associated with consonants, and is referred to as content-filtered speech [10]. Thus, the samples were unintelligible, but with unchanged intonation. Loudness was normalized across samples.
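The principle behind content filtering can be illustrated with a simple low-pass filter written in plain Python. This is a sketch under stated assumptions, not the algorithm used by Adobe Audition: it uses a windowed-sinc FIR design, and the sample rate, cutoff, and test tones in the demonstration are illustrative values rather than the study's settings. To preserve pitch contours, a cutoff must lie above the speakers' fundamental frequency while remaining below most consonant energy.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate_hz, num_taps=101):
    """Windowed-sinc (Hamming) low-pass FIR coefficients with unity DC gain.

    num_taps should be odd so the filter has a well-defined center tap.
    """
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def apply_fir(signal, taps):
    """Convolve with zero padding, keeping the output aligned with the input."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out


def rms(values):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Illustrative demo: a 150 Hz tone stands in for voice pitch and a 3000 Hz
# tone stands in for high-frequency consonant energy (both hypothetical).
fs = 8000
n_samples = 1600
pitch_like = [math.sin(2 * math.pi * 150 * i / fs) for i in range(n_samples)]
consonant_like = [math.sin(2 * math.pi * 3000 * i / fs) for i in range(n_samples)]
taps = lowpass_fir(400, fs)
core = slice(100, 1500)  # skip the edges, where zero padding distorts the output
print(rms(apply_fir(pitch_like, taps)[core]) / rms(pitch_like[core]))
print(rms(apply_fir(consonant_like, taps)[core]) / rms(consonant_like[core]))
```

In this demonstration the low-frequency tone passes nearly unchanged while the high-frequency tone is strongly attenuated, which is the property that leaves intonation and timing audible while making the words unintelligible.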

Listeners

Listeners were 27 students in a voice disorders class. They were offered extra credit for completion of the listening task. Three students declined to participate. All students had a bachelor’s degree and were in their second year of a Master’s Program in Communication Sciences and Disorders. All students were female, ranging in age from 23 to 47 years (mean = 26.9, SD = 5.5).

Procedures

The speech samples were completely randomized across healthcare providers and condition. Students listened to the samples on their home computers. They were instructed to listen to each sample once and asked to rate it immediately after listening. They rated the speech samples on three features: caring, sympathetic, and competent. The ratings ranged from 1 = not at all, to 7 = extremely.
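One way to produce a fully randomized presentation order is sketched below. It assumes one sample per provider per condition, and the function name, condition labels, and seed are hypothetical. How the order was actually generated in the study is not stated.

```python
import random


def randomized_order(provider_ids, conditions=("neutral", "bad_news"), seed=None):
    """Return every (provider, condition) sample in shuffled order.

    A fixed seed makes the order reproducible; seed=None gives a new
    order on each call.
    """
    samples = [(provider, condition) for provider in provider_ids for condition in conditions]
    rng = random.Random(seed)
    rng.shuffle(samples)
    return samples


# Twelve providers, two conditions each, shuffled together.
order = randomized_order(range(1, 13), seed=2024)
print(len(order))
```

Shuffling the pooled list, rather than shuffling within each provider, prevents listeners from hearing a provider's two conditions back to back in a predictable pattern.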

Results

Acoustic analysis

Table 1 displays the individual and mean speaking rate and pitch data for the voice samples of each healthcare provider. For speaking rate, all but one provider reduced rate in the bad news condition, albeit some to a much greater degree than others. The majority also reduced pitch in the bad news condition. It should be noted that providers were both male and female, thus accounting for the wide variation in average pitches. Only one provider increased both speaking rate and pitch when delivering bad news. Results of the statistical analysis revealed significant differences between neutral and bad news conditions for both pitch (t = 3.17, df = 11, p = 0.009) and speaking rate (t = 2.88, df = 11, p = 0.015). In both cases, the values in the bad news condition were significantly lower than those in the neutral condition.

Table 1 Individual demographics and individual and mean speaking rate (in words per minute) and pitch (fundamental frequency in Hertz) for bad news and neutral utterances

Perceptual analysis

To determine if listeners perceived differences between the neutral and bad news conditions, a paired t test was performed across all healthcare providers. Mean data and statistical results are in Table 2. There was a significant difference between the neutral and bad news conditions for the characteristics of caring and competent, while sympathetic approached significance.

Table 2 Mean ratings for physician characteristics for neutral and bad news conditions

Because statistics averaged across providers mask individual differences, additional analyses were performed. Individual paired t tests were performed to assess differences in perceptual ratings for each physician’s neutral and bad news conditions. There were no significant differences for half of the providers, specifically 1, 2, 4, 5, 6, and 9. Physician 6 is of interest. She is a female who increased both speaking rate and pitch in the bad news condition, but was not perceived by listeners to have demonstrated a significant difference between them.

Physician 3 is a female who was perceived to be significantly more caring and sympathetic in the bad news condition than in the neutral condition. Her speaking rate and pitch were virtually unchanged between the two conditions. The other physicians for whom a significant difference between conditions was perceived demonstrated more predictable changes, with a reduction in speaking rate, pitch, or both in the bad news condition. Physician 12, also a female, received the most distinctive ratings between the two conditions for the characteristics of caring and sympathetic, seemingly based solely on decreasing her rate by half. It should be noted that her rate in the neutral condition is somewhat faster than the typical rate of around 220 words per minute, but she drastically reduced rate in the bad news condition. Also of note is that her ratings of competence remained very high regardless of speaking rate.

Discussion

When comparing the delivery of bad news versus neutral comments, healthcare providers in this study significantly decreased speaking rate and pitch. Listeners perceived a significant difference in the providers’ nonverbal communication when performing the two different tasks, typically rating the reduced rate and pitch as more caring and sympathetic. These findings are supported by work assessing the effect of reduced pitch and speaking rates on relaxation. A decrease in therapists’ loudness, pitch, and speech rate was found to reduce EMG and was rated as more relaxation-inducing, compared to unmodified therapists’ voices (that were not associated with EMG changes), by subjects with high anxiety undergoing progressive relaxation training [14]. A reduction in rate and pitch has also been described in association with an expression of sadness [18], although the association is not always strong [9].

While clearly the content of the news to be communicated (neutral versus bad news) had an influence on speech production, the determinants of the observed speech changes in the health providers when giving bad news are not known. They were not instructed to do so for the purpose of this study and we do not know whether they consciously effected those changes.

Some of the voice changes, specifically lower pitch, that have been observed by others under experimental conditions of induced stress [20] are similar to those that we observed; it is conceivable, given the nature of the information to be discussed, that some of the health providers experienced a stress reaction that affected their nonverbal characteristics. No monitoring of providers’ stress parameters was done for this study.

Other nonverbal characteristics that may contribute to a listener’s impression of empathy need to be considered: provider 3, who demonstrated virtually no differences in rate and pitch, was perceived as more caring and sympathetic in the content-free bad news portion of her interview. It is possible that vocal quality contributed to this perception; however, informal listening to provider 3 revealed a normal voice quality that would be unlikely to contribute to the rating. Other prosodic features such as the timing and duration of pauses, intonation patterns such as the extent of pitch changes, as well as vowel duration, resulting in word lengthening, may also contribute to the perception of the provider as caring. In addition to the advantage for the patient of being provided with a supplemental path to an empathic connection, a reduced speaking rate and/or word lengthening may give the listener more time to prepare to process adverse information.

Studies of physician–patient communication in oncology have traditionally focused on the analysis of the verbal content of the interactions. Similarly, the main emphasis of training curricula is on the recognition and use of empathic opportunities (mostly defined on the basis of verbal content) [16] to introduce specific verbal statements aimed at conveying alignment and empathic understanding on the part of the health provider.

Among the advantages of this type of analysis is that transcripts of the content of such verbal exchanges lend themselves to statistically quantifiable and reproducible assessments using widely available and established instruments. Clearly, however, other aspects of verbal communication besides verbal content (e.g., tone of voice, speaking rate, and loudness) carry relevant information and profoundly affect the listener. Since available studies of physician–patient communication have not been carried out using a parallel analysis of verbal content and of the nonverbal aspect of the communication, the relative contributions of verbal content and nonverbal voice characteristics to the outcome of the interactions are unknown.

It is reasonable to expect that usually (but not necessarily always) an empathic statement such as would be used in disclosing adverse medical news would be uttered in a soft, “caring” tone of voice, with a slow rate of speech. Exclusively nonverbal analysis does not reflect the complete communicative effect, however. Separate assessments of the verbal and nonverbal components of the communication would provide a more detailed understanding and possibly more complete recommendations for the conduct of these interviews. For example, it may not always be necessary to explicitly make comments such as “This must be very difficult for you to hear.” It may be equally effective to convey empathy through a nonverbal expression of support and understanding.

There are several limitations of this study. First, only graduate students in communication disorders, as opposed to patients or a less sophisticated sample of the general population, assessed the healthcare providers’ voices. Second, we only analyzed speaking rate and pitch, to the exclusion of verbal content. Third, we did not obtain any objective measures of stress in the providers, and therefore we are unable to begin to address the question of whether stress induced changes in their speech production, or if it was a volitional, behavioral modification. Finally, since the number of voice samples analyzed was small and derived from encounters with a very selected population of cancer patients with an uncommon diagnosis, whether the results of this study are generally applicable to oncology patient/health provider encounters remains unknown.

Future research should focus on simultaneous assessment of verbal content, multiparameter analysis of speech, and observation of other associated nonverbal behaviors (such as leaning towards or away from the patient, presence or absence of eye contact, appropriate touching versus lack thereof). We would anticipate that information thus acquired would contribute to a more thorough understanding of the complex processes involved in the expression and perception of empathy, and eventually to the enhancement of communication curricula development and of providers’ communication effectiveness.

Conclusions

This study demonstrates a change in rate and pitch of providers’ speech when delivering bad news compared to neutral news. These speech changes were perceived by listeners as significantly more caring under experimental conditions of preserved rate and pitch but with an unintelligible verbal content. The expression of empathy has been determined to be strongly associated with important favorable outcomes. It would be important to understand to what extent speech prosody influences those results independently of verbal content.