For more than fifteen years, Germer (2009), Gilbert (2009), and Neff (2003b) have captured compassionate elements of Buddhist philosophy and articulated them in a way conducive to use, observation, and study using Western psychological principles. Empirical work to date has revealed encouraging results pointing to the apparent benefits of self-directed compassion in several domains (see Barnard & Curry, 2011, for a review). Self-directed compassion consistently has been related to lower levels of psychopathology (MacBeth & Gumley, 2012), as well as to positive affect (Leary, Tate, Adams, Allen, & Hancock, 2007), higher emotional intelligence, coping, and stability (Neff, 2003b), more health-promoting and pro-social behaviours (see Neff, 2012, for a review), and feelings of well-being and life-satisfaction (Neely, Shallert, Mohammed, Roberts, & Chen, 2009; Neff, 2003a).

Self-directed compassion involves relating to the self in a compassionate manner, in the same way that one would relate to others compassionately. In North American society, the idea of relating kindly to oneself is fairly novel; however, Buddhist philosophy has long recognized the importance of compassion directed inwards (Buddhaghosa, 1975; Neff, 2003a). At their core, definitions of compassion share a focus on loving-kindness and the desire to help alleviate another’s suffering (Germer, 2009; Gilbert, 2009; Neff, 2003b). Some researchers emphasize mindful awareness as a component of, or a necessary requirement for, self-compassion (Gilbert, 2009; Neff, 2003b; see also Germer, 2009); Neff’s definition also includes recognition of suffering as a shared human experience. Care has been taken to distinguish self-directed compassion from self-pity and passive self-indulgence (Germer & Neff, 2013; Neff, 2003a, 2003b). Likewise, although self-directed compassion and self-esteem measures typically are correlated, self-esteem has ties to performance-based self-evaluation and social comparison, rather than nonjudgmental acceptance of oneself within the context of the human condition (Neff, 2003a, 2003b; Neff & Vonk, 2009). Furthermore, self-compassion has been reliably, inversely related to shame and self-criticism (Barnard & Curry, 2011; Costa, Marôco, Pinto-Gouvei, Ferreira, & Castilho, 2016; López et al., 2015; Neff, 2003a).

There is debate in the literature regarding the conceptualization of the inverse of self-compassion. The widely used Self-Compassion Scale (Neff, 2003b) includes three negative subscales: self-judgment (the inverse of self-kindness), isolation (the inverse of common humanity), and over-identification (the inverse of mindfulness). Higher scores on these subscales are thought to indicate lower levels of self-compassion (Neff, 2003b). This perspective informed the coding in the current study, and therefore uncompassionate thoughts were operationalized as thoughts reflecting self-judgment, isolation, and over-identification. However, we acknowledge that the theory and research behind the concept of self-compassion (and uncompassion) is evolving (e.g., see Muris, Otgaar, & Pfattheicher, 2019, or Pfattheicher, Geiger, Hartung, Weiss, & Schindler, 2017, for alternative perspectives).

The recent proliferation of research on self-directed compassion (Neff, 2015; Neff & Dahm, 2014) has been facilitated, but also limited, by widespread reliance on the Self-Compassion Scale as a single self-report measure (SCS; Neff, 2003a). Its use in disparate populations has generated substantial data with which to evaluate its reliability and factorial validity (Neff et al., 2019), and researchers’ adoption of a common instrument facilitates comparison across studies; however, broadening our understanding of self-compassion as a construct requires examining it in divergent, complementary ways (Falconer, King, & Brewin, 2015; MacBeth & Gumley, 2012; Neff, 2003a; Neff, 2015; Williams, Dalgleish, Karl, & Kuyken, 2014). Reliance on a questionnaire-based measure of self-compassion arguably has influenced our conceptualization of the construct, such that we think of it more as a trait-like characteristic rather than a process that occurs within individuals in real time (MacBeth & Gumley, 2012; Neff, 2015).

Although there are limits to our capacity to observe self-directed compassion as an internal process, some qualitative researchers have interviewed people about their self-compassionate experiences and thoughts: either in general (Campion & Glover, 2017), or in a specific domain, such as self-perception of their body (Berry, Kowalski, Ferguson, & Mchugh, 2010; Sutherland et al., 2014b; Woekel & Ebbeck, 2013). Others have interviewed participants about how they experienced self-compassion and mindfulness-based interventions (Köhle et al., 2017; L’Estrange, Timulak, Kinsella, & D’Alton, 2016). This research adds to our understanding of what self-compassionate thoughts and ideas might look like through a retrospective lens.

A pioneering study demonstrated the potential to examine spontaneously arising speech for self-directed compassion. Observing that comments posted by internet users recovering from self-injury contained self-compassionate content, Sutherland, Dawczyk, et al. (2014a) differentiated themes and related them to Neff’s (2003a) proposed components of self-compassion. They considered self-kindness to be reflected in comments that conveyed self-understanding and warmth toward oneself, acknowledged strength and recovery, described self-caring behaviour, and portrayed themselves as more than simply someone who self-injures. In contrast, an understanding of common humanity (shared human experience) was shown in comments about receiving or feeling compassion from or for others, normalizing self-injury, opposing judgment, and being seen and understood by others. Statements considered to exemplify mindfulness conveyed a balanced perspective, acceptance of experiences, distress as manageable, and hope. These themes were found in content that likely was written after personal reflection and thoughtful deliberation by people who developed the strength to overcome a personal struggle.

As yet unknown, however, are the frequency and content of compassionate self-directed thoughts as they arise in everyday life. Relating compassionately to oneself is thought to be particularly beneficial when responding to difficult and stress-inducing experiences, as it buffers individuals from some of the negative emotions associated with such experiences (Blackie & Kocovski, 2019; Leary et al., 2007; Terry, Leary, & Mehta, 2013). We developed a vignette-based measure to inquire about immediate thoughts arising in response to difficult situations that participants might be expected to encounter in their everyday lives. Our goal was to code these self-directed thoughts to determine how often they were compassionate, to observe what compassionate self-talk looked like within our sample, and to situate compassionate (and un-compassionate) thoughts within the full, broad range of responses that the hypothetical scenarios engendered. Use of a variety of vignettes allowed for investigation of self-directed compassion at a situation-specific, rather than generalized and trait-like, level.

Our research strategy entailed developing a broad range of content codes using a relatively diverse sample of emerging adults from Amazon Mechanical Turk (MTurk; i.e., the ‘development’ sample), and then applying the coding system to a second sample of university students (i.e., the ‘application’ sample). Whereas diversity in the development sample was intended to elicit a wide range of potential responses to the vignettes, the application sample was limited to university students to specify the population to which findings would apply. The mental health crisis among students at North American universities necessitates continued study of self-directed compassion within this population, given its strong association with mental health (MacBeth & Gumley, 2012). According to the American College Health Association (2017), 52.7% of university students reported feeling hopeless within a one-year period, 39.1% felt so depressed it was difficult to function, and 61.9% felt overwhelming anxiety. Although young people have reported less self-directed compassion than more mature populations (Neff & Vonk, 2009), self-compassion has been associated with beneficial outcomes among university students (Arslan, 2016; Breines et al., 2015; Iskender, 2009; Neff & McGehee, 2010; Sharma & Davidson, 2015; Sirois, 2015; Yamaguchi, Kim, & Akutsu, 2014).

We sought to describe the range and relative frequency of self-directed compassionate and uncompassionate thoughts that may be generated by university students in difficult moments. Furthermore, we used exploratory factor analysis to describe how content categories tended to co-occur, and related these category groupings to constructs of shame, self-criticism, self-esteem, and self-compassion, to get a better idea of their function (guilt was included for purposes of discriminant validity, since it has not been associated with self-directed compassion; Fisher & Exline, 2010). This systematic exploration was intended to complement and inform questionnaire-based research.

Method

Participants

Development Sample

The sample used for initial development of content codes was recruited through Amazon Mechanical Turk (MTurk). MTurk samples have proven to be relatively demographically diverse, and obtained data are at least as reliable as from more traditional samples (Buhrmester, Kwang, & Gosling, 2011). Although 178 participants initially opened the survey, 75 were excluded for completing less than 50% of items, leaving a final N of 103. Included participants were 24 years old on average (range from 18 to 35 years; SD = 2.94). Sixty-three (61.2%) participants identified as White/European, 10 (9.7%) as Black/African/Caribbean, 6 (5.8%) as Southeast Asian, 14 (13.6%) as South Asian, 7 (6.8%) as Latin American, and 3 (2.9%) as “other.” Regarding gender, 41 (39.8%) participants identified as female, 59 (57.2%) as male, and 3 (2.9%) as non-binary, transgender, or undisclosed.

Application Sample

A total of 586 undergraduate students participated in exchange for credit toward an introductory psychology course. Of the total sample, data from 107 participants ultimately were excluded from factor analyses because they provided fewer than 12 codeable responses (i.e., fewer than one response per vignette), along with one outlier (response frequency > 3 SD above the mean). The mean age of remaining participants (N = 478) was 18.88 years (range 16–36 years; SD = 1.26). The majority of participants (369; 77.2%) identified as White/European, 29 (6.1%) as Southeast Asian, 29 (6.1%) as South Asian, 15 (3.1%) as Black/African/Caribbean, 6 (1.3%) as Arab, 6 (1.3%) as Latin American, 5 (1%) as West Asian, 1 (0.2%) as Aboriginal/First Nations/Métis, and 18 (3.8%) as “other.” Regarding gender, 342 (71.5%) participants identified as female, 133 (27.8%) as male, and 3 (0.6%) as agender or genderless.

Materials

Vignettes

Twenty-four illustrated scenarios were created to evoke compassionate (or uncompassionate) responses (see Figs. 1 and 2 for examples; see also Footnote 1). Each scenario portrayed a difficult event that emerging adults typically might experience. The vignettes were accompanied by illustrations to increase participants’ engagement with the scenarios, in order to facilitate access to relevant automatic thoughts. We attempted to cover a broad range of difficult experiences (e.g., at school, during leisure activities, and in social situations; involving romantic partners, friends, teammates, etc.). Items could be grouped into three broader content domains: achievement (7 items), social rejection (10 items), and social transgressions (4 items); the content of the three remaining items contained elements of more than one domain. For the development sample, responses to all 24 vignettes were qualitatively coded and organized. For the application sample, we identified six vignettes in each of two domains, social rejection and failure, selecting a variety of exemplars within each context.

Fig. 1 “Please list three (3) automatic thoughts that you may have when failing a test” (female version depicted)

Fig. 2 “Please list three (3) automatic thoughts that you may have when getting picked last for teams at a soccer game” (male version depicted)

Vignettes were presented individually and in random order; no time limit was imposed. Participants were provided with images that matched their specified gender (including an androgynous character for participants who indicated their gender as neither female nor male). Participants were given the following instructions: “For the following scenarios, we ask that you please tell us what your automatic thoughts would be in the given situations. We are looking for phrases or statements that describe your thoughts, not individual words like ‘good’ or ‘ok’ or a single emotion word.” Participants were asked for three thoughts per vignette, and a codeable thought could range from a two-word phrase to a single multiple-phrase sentence, with an average of 16.6 words per participant per vignette (i.e., a mean of approximately six words per response).

The Self-Compassion Scale (SCS; Neff, 2003a). The SCS is a 26-item scale that evaluates the three components of self-compassion through six subscales: self-kindness, self-judgment, common humanity, mindfulness, isolation, and over-identification. Using a 5-point Likert scale, participants rate the frequency with which they respond to difficult circumstances according to these domains. Higher scores indicate greater self-compassion. The SCS has been demonstrated to have a high degree of validity and reliability (e.g., Castilho, Pinto, & Duarte, 2015; Neff, 2003a). In the application sample of the current study, internal consistency was as follows: Cronbach’s α = .72 for Mindfulness, .73 for Over-Identification, .84 for Self-Kindness, .81 for Self-Judgment, .78 for Common Humanity, and .77 for Isolation.

Test of Self-Conscious Affect for Adolescents (TOSCA-A; Tangney, Wagner, Gavlas, & Gramzow, 1991). This widely used scale includes ten negative and five positive vignettes of difficult situations. For each vignette, participants rate the likelihood that they would respond in each of four different ways on a 5-point Likert scale. Validity was supported by expected relations between the subscales and empathy (associated with shame-free guilt), and anger (associated with shame; Tangney et al., 1991). Subscales reflecting shame and guilt (each 15 items) were used in the current study. In the application sample of the current study, internal consistency was as follows: Cronbach’s α = .82 for Shame, and .84 for Guilt.

Level of Self-Criticism (LOSC) scale (Thompson & Zuroff, 2004). This 22-item self-report scale measures two forms of negative self-evaluation, Comparative Self-Criticism (CSC) and Internalized Self-Criticism (ISC), using a 7-point Likert scale. It has demonstrated good convergent validity (Thompson & Zuroff, 2004). In the application sample of the current study, internal consistency was Cronbach’s α = .78 for CSC and .88 for ISC.

The Rosenberg Self-Esteem Scale (SES; Rosenberg, 1965). This widely used 10-item self-report scale uses a 4-point Likert scale to assess positive and negative feelings about the self in order to measure global self-worth. In the application sample of the current study, Cronbach’s α = .91.
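The internal-consistency coefficients reported for the questionnaires above are Cronbach’s α values. As a minimal sketch of how such a coefficient is computed from item-level ratings, the Python function below applies the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the ratings are hypothetical and purely illustrative; they are not the study’s data, and the original analyses were conducted in R.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five respondents on a four-item subscale (not study data).
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Reverse-keyed items would need to be recoded before being passed to such a function.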

Procedures

Development Sample

Participants were recruited through MTurk. After reading a brief description of the study, those who proceeded provided informed consent and were linked to the study housed on Qualtrics online survey software. In a single session (approximately one hour), participants completed a brief demographics questionnaire, responded to the vignettes, and completed several questionnaires (the latter as part of an unpublished dissertation study; Redden, 2019). They completed vignettes before questionnaires to avoid influencing vignette responses. Participants were compensated $0.75 USD for completing the study.

Qualitative Analytic Approach

A team composed of a faculty advisor, a graduate student, and three fourth-year undergraduate research interns was assembled to identify content codes. It was imperative to the process to include undergraduate team members because, as emerging adults, they provided perspective on the function of vignette responses for their cohort. For several months, the research team met weekly for approximately three hours to discuss responses and look for patterns. Taking a complete coding approach (see Braun & Clarke, 2013), we sorted all responses into categories, grouping together those that appeared to serve a similar function. Responses were not grouped based simply on wording; rather, we considered the context surrounding each response and inferred its most likely function. We assigned these initial categories temporary placeholder names (e.g., “guilt,” “questioning,” “self-care,” etc.). These categories evolved as additional data were added. For example, categories were broadened to encompass similar but non-redundant responses, or split when multiple, fairly coherent sub-themes were identified within a category. Additional categories were generated when we noticed a recurring theme in the data that did not fit the pre-existing categories. Considerable attention also was given to capturing the intended meaning of subtle nuances in language within this population. To ensure the correct interpretation of age-relevant colloquialisms, the research team would occasionally consult external sources such as online resources and young adults unrelated to the research project.

For approximately a quarter of the dataset, responses were categorized by the team; we then progressed to coding independently. Responses not clearly fitting into a category were discussed during weekly team meetings to consider multiple perspectives on how best to categorize them. A proportion of thoughts were deemed uncodeable (as described below). Each categorized thought was entered into a shared online document, which was reviewed frequently to ensure consistency.

In the next developmental phase, we considered how categories related to self-compassion theory. We re-named the response categories to capture the coding team’s interpretations of implicit meanings in the data. For example, a response such as “Someday this will be a funny story” would have originally been placed in a “positive thinking” category. Subsequently, we used our interpretive lens to name this category “self-encouragement” under the superordinate category, “self-kindness.” Thus, category labels reflected theory rather than terms used by participants.

Application Sample

Participants were recruited through the university’s undergraduate participant pool. Informed consent procedures and Qualtrics survey composition were essentially identical to the development sample.

Applying the Coding System

Four new undergraduate coders received approximately nine hours of training from the original team, covering theory, the specific content codes, and practice coding responses together as a group. They referred to the exhaustive coding manual and passed a test with at least 70% agreement before proceeding. For three months, challenging items were discussed during weekly three-hour team meetings (including the faculty advisor, graduate student, and original long-term undergraduate coders). The new coders were encouraged to code individually only those items that they could code with certainty. All challenging responses were recorded for future reference (however, this list was unavailable during reliability coding). Codes were recorded using NVivo software.

Results

Development Sample: Qualitative Analysis

Our iterative coding process ultimately resulted in 29 content categories (see Table 7 in Appendix 1 for detailed descriptions and examples of each category). There were three observed forms of self-kindness: (a) making self-encouraging statements, (b) expressing liking or fondness for oneself, and (c) proposing self-care activities. Three other categories appeared to characterize mindful acceptance: (a) accepting personal limitations, often in the form of acknowledging a personal shortcoming without over-identifying with it, (b) accepting experience, such as labelling the unpleasant experience while tolerating or welcoming it, and (c) accepting personal responsibility, by recognizing one’s role in creating the situation. Common humanity was observed as (a) taking another’s perspective in a difficult situation, (b) wishing others well, and (c) normalizing or generalizing to human experience.

Three conceptual opposites of self-compassion, proposed by Neff (2003a), are self-judgment, over-identification, and isolation. Two categories appeared to serve the function of self-judgment, as they portrayed unnecessary punitive or derogatory thoughts towards oneself without a constructive focus. These were (a) unequivocal, broadly self-critical statements and (b) annoyance or critical feelings toward oneself. Seven categories conceptually appeared to exemplify over-identification: (a) grasping, or wanting things to be different than they were; (b) choosing to avoid unpleasant experiences or feelings through behaviours such as drinking or sleeping; (c) avoidant devaluing, denigrating or dismissing something or someone; (d) self-protective externalizing, assigning responsibility for a problem to someone or something else; (e) directed hostility, verbal aggression toward something or someone; (f) catastrophizing, irrationally inferring terrible consequences; and (g) a catch-all category, termed “other over-identification,” that included a variety of thoughts that functionally would hinder mental flexibility or curiosity (e.g., “I hate this,” or “I can’t do this at all.”). Isolation was exemplified in three distinct ways: (a) a general sense of isolation and being alone in one’s experience, (b) focusing on oneself as a (solitary or unique) victim, and (c) narcissism, assuming superiority over others (see Footnote 2).

In addition, some inverse categories did not readily fit into a discrete theorized facet of self-compassion. The category “Fear of Others’ Reactions,” which encompasses fears, worry, or anxiety associated with the anticipation of others’ reactions, appeared to involve both over-identification and isolation. Similarly, the category “Pressure to Achieve” appeared to reflect both over-identification and self-judgment. A third category was developed that reflected elements of Isolation and Self-Judgment, which was Internalization of Others’ Judgment.

Several response categories did not appear to be overtly compassionate or uncompassionate. We grouped reasoning, information seeking, and problem-solving together in an overarching category called “Reasoning,” as all three were logical, pragmatic responses. Responses assigned to the reasoning category differed depending on the type of vignette: in response to rejection, they usually involved social reasoning. We made no assumptions about association of these categories with self-compassion. An additional category, venting, captured brief displays of intense emotion that functioned to release one’s initial frustration; these displays were not directed at a specific target.

A final and frequent category, “acknowledging experience,” involved statements about emotions participants were feeling. Although these responses could indicate mindful awareness of emotional state, we lacked sufficient evidence to discern whether people were mindfully aware of, versus caught up with, or carried away by, their perceived emotions. Thus, this category likely was heterogeneous with respect to whether it served a compassionate function.

Application Sample: Replication and Quantitative Analysis

In total, 95.85% (14,438) of participant responses could be assigned a category. The remainder were deemed uncodeable for the following reasons: misinterpretations of the vignette, ambiguous wording that could justify two categories contradictory in nature, thoughts intended to be humorous (since humour can serve different functions), and single-word responses with insufficient context to interpret the function. Table 1 provides frequencies as a function of vignette type (expressed as a proportion of total responses for response categories). The type of adverse experience (failure versus social rejection) elicited proportionately different responses. The three most common codes in response to failure vignettes were problem-solving, grasping, and pressure to achieve, whereas the most common codes for social rejection were information seeking, acknowledging experience, and (social) reasoning. Across vignette types, only a small proportion of responses appeared to serve the function of self-kindness, mindful awareness, or common humanity (14.3% for failure, 7.3% for rejection). Self-encouragement was the most frequently observed compassionate category, followed by acceptance of personal limitations. Most of the remaining compassionate categories, including all with a common humanity theme, were rare (less than 1%); in comparison, particularly uncompassionate categories tended to occur more often (e.g., for failure and rejection, respectively: self-judgment 4 and 3.7%, isolation 1.8 and 6.6%).

Table 1 Intraclass correlations and frequencies (expressed as a proportion of total responses) for response categories

Inter-rater reliability for each response category was calculated on 20% of vignettes completed by the application sample using two-way mixed effects intra-class correlations (see Table 1). Reliability was calculated separately for failure and social rejection vignettes, since vignette type was associated with subtle content-related differences in some response categories that impacted coding. Four reliability coders each were randomly assigned 5% of the sample. Inter-rater reliability was calculated for categories observed with sufficient frequency to support their estimation (i.e., allowing for sufficient response variability to detect reliability; Goodwin & Leech, 2006; Shoukri, Asyali, & Donner, 2004): our liberal cut-off was 1% of total codeable responses, which eliminated nine categories in each of the failure and social rejection domains. According to frequently cited guidelines (Cicchetti, 1994), 38 of the 54 intra-class correlations for specific categories were high enough to be considered excellent (> .75), 11 were good (between .60 and .74), three were fair (between .40 and .59), and two were poor (< .39). Those with poor and fair reliability were observed fairly infrequently; thus, lower variability likely limited the strength of reliability estimates (Goodwin & Leech, 2006).
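To make the reliability computation concrete, the following Python sketch estimates a two-way mixed effects intra-class correlation from a small table of per-participant category counts assigned by two coders, and then attaches a descriptive label. The text does not state whether a consistency or absolute-agreement definition, or single or average measures, was used; the sketch assumes the single-measure consistency form, often written ICC(3,1). The function names and counts are hypothetical, and the label boundaries follow the usual reading of the Cicchetti (1994) guidelines.

```python
def icc_3_1(ratings):
    """Single-measure, consistency ICC for n targets (rows) rated by k fixed raters (columns)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: targets, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def cicchetti_label(icc):
    """Descriptive label, assuming cut-offs of .40, .60, and .75."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Hypothetical counts of one response category per participant, from two coders.
counts = [[2, 2], [0, 1], [3, 3], [1, 1], [4, 3], [0, 0]]
icc = icc_3_1(counts)
print(f"ICC = {icc:.2f} ({cicchetti_label(icc)})")
```

Because the consistency form ignores constant differences between coders, a coder who systematically assigns one more code than a colleague would still yield a high coefficient; an absolute-agreement definition would penalize that offset.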

Exploratory Factor Analyses

Participants’ response frequencies in each category were subjected to factor analysis. All analyses were conducted using the most recent version of R (R Core Team, 2019). Any response category with a frequency lower than 1%, or an intra-class correlation coefficient less than .69, was excluded from analysis. In addition, participants who provided fewer than 12 responses (and one outlier with responses > 3 SD above the mean) were excluded from analysis.
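The participant-level exclusion rules can be sketched as a simple filter. The Python example below drops participants with fewer than 12 codeable responses and then drops anyone whose response count exceeds the mean by more than 3 SD. The order of these two steps, and the choice to compute the mean and SD on the remaining participants, are assumptions made for illustration; the text does not specify them. All identifiers and counts are hypothetical.

```python
from statistics import mean, stdev


def select_participants(response_counts, min_responses=12, sd_cutoff=3.0):
    """Return the participants retained for analysis.

    response_counts maps a participant id to the number of codeable responses.
    """
    # Step 1 (assumed order): minimum-response rule.
    eligible = {pid: n for pid, n in response_counts.items() if n >= min_responses}
    # Step 2: flag unusually high response frequencies among those remaining.
    upper = mean(eligible.values()) + sd_cutoff * stdev(eligible.values())
    return {pid: n for pid, n in eligible.items() if n <= upper}


# Hypothetical data: 20 typical participants, two low responders, one outlier.
counts = {f"p{i}": 30 + i % 7 for i in range(20)}
counts.update({"low1": 5, "low2": 11, "outlier": 120})

kept = select_participants(counts)
print(len(kept), "participants retained")
```

Note that a single extreme value can only exceed a 3 SD cut-off in samples of roughly a dozen or more, because the outlier itself inflates the SD.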

A number of principal components analyses were conducted to guide the authors in their delineation of common types of responding. In the case of responses to the failure vignettes, it was determined that a six-component solution, with some degree of cross-loading on similar components, was the best solution. In the case of the rejection vignettes, it was concluded that a five-component solution, again with some degree of cross-loading on similar components, was the best solution. In both cases, the solutions were obtained under varimax rotation, followed the Kaiser criterion (Kaiser, 1958), and were distinguishable using parallel analysis (see Fabrigar & Wegener, 2012). Solutions accounted for 47 and 45% of the total variance for failure and rejection, respectively; each of the factors accounted for 7 to 10% of the variance after rotation. Please see Tables 2 and 3 for the solutions. As can be seen from the tables, there are pronounced similarities between the two solutions, though there are some differences that likely are attributable to the nature of the vignettes. Factors were assigned labels that were relatively theory-neutral to avoid over-interpretation.
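Two of the quantities mentioned above are straightforward to illustrate. Assuming the components were extracted from a correlation matrix (so that total variance equals the number of response categories), the share of variance attributable to a rotated component is the sum of its squared loadings divided by the number of categories, and the Kaiser criterion retains components whose eigenvalues exceed 1. The Python sketch below uses hypothetical loadings and eigenvalues, not the values in Tables 2 and 3.

```python
def variance_explained(loadings):
    """Proportion of total variance per component.

    loadings is a list of rows (one per variable), each a list of loadings
    (one per component); variables are assumed to be standardized.
    """
    n_variables = len(loadings)
    n_components = len(loadings[0])
    return [
        sum(row[j] ** 2 for row in loadings) / n_variables
        for j in range(n_components)
    ]


def kaiser_retained(eigenvalues):
    """Number of components with eigenvalue greater than 1."""
    return sum(1 for value in eigenvalues if value > 1.0)


# Hypothetical rotated loadings: four categories on two components.
example_loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]
shares = variance_explained(example_loadings)
print([round(s, 3) for s in shares], "total:", round(sum(shares), 3))
print(kaiser_retained([2.1, 1.3, 0.9, 0.4]), "components retained")
```

An orthogonal rotation such as varimax redistributes variance among the retained components but leaves their combined share unchanged, which is why the total can be reported alongside per-component values after rotation.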

Table 2 Factor loadings for Failure categories
Table 3 Factor loadings for Social Rejection categories

Three factors were identified across both failure and social rejection contexts. (1) A factor termed “Strong Negative Responses” included self-judgment and seeing the self as a victim; catastrophizing and general over-identification (e.g., “I hate this,” “I can’t do this”; for failure); and self-isolation (for social rejection). This collection of responses appeared unequivocally unsupportive of the self and of self-regulation. (2) A factor termed “Positive Responses” included encouragement and problem-solving, self-care (for failure), and reasoning (for social rejection). Thus, it included some responses seen as self-directed compassion, and some solution-oriented responses. In response to failure, acknowledgment of the experience loaded inversely on this factor, whereas in response to rejection, a mild form of self-judgment loaded inversely. (3) An “Externalization” factor included blaming others, overt hostility toward others (for social rejection), and devaluing people or activities. This factor appeared to capture an uncompassionate focus on others.

Other factors were identified in only one context. In response to failure, three such factors were identified: “Situation-Specific Shame” (F4) consisted of avoidance, fear of others’ reaction, and feelings of isolation; and “Non-Acceptance” (F5) involved wishing for, and asking questions about, how to achieve a different outcome (accepting personal limitations loaded inversely). A final factor was termed “Tension Release” (F6): venting (e.g., swearing), a response that would serve to release pressure, loaded highly, whereas increased pressure to succeed loaded inversely. The two additional social rejection factors were “Self Protection” (R4), which included recognition of one’s emotional response to the rejection, coupled with problem-solving and plans to avoid the experience (seeking information loaded inversely), and “Accepting Social Limitations” (R5), which included accepting personal limitations and negative implications about the self (pressure to achieve loaded inversely).

Correlations of Response Types Across Vignette Types

Component scores were generated for the six components for failure and the five components for rejection. Table 4 shows the correlations across vignette types for the response component scores. Confidence intervals (95%) are also given. Given that a varimax (i.e., orthogonal) rotation was applied, the correlations amongst the components for failure and amongst the components for rejection were essentially zero. As can be seen in the table, CIs for the cross-context correlations ranged from negligible to medium in magnitude. In particular, the CI for the correlation between Strong Negative responses in failure and rejection contexts corresponded to a medium effect size, as did the CI for the correlation involving Positive responses. However, Positive and Negative responses were more modestly, inversely related across contexts (CIs consistent with a negligible to small association). The CI for the correlation for Externalization across contexts ranged from small to medium in size. A few other correlations with CIs ranging from small to medium effects were theoretically consistent. For example, Externalization (to failure) was inversely associated with Self Protection, and Externalization (to rejection) was directly associated with Tension Release. Notably, Situation-Specific Shame (to failure) was not meaningfully associated with Strong Negative Response (to rejection).
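Interpreting a correlation by the effect sizes spanned by its confidence interval can be illustrated with the Fisher z transformation, a common way to obtain a 95% CI for a Pearson correlation. The text does not state how the CIs were computed or which effect-size benchmarks were applied, so both the method and the Cohen-style thresholds (.10, .30, .50) in the Python sketch below are assumptions. The correlation value is hypothetical; the sample size matches the application sample (N = 478).

```python
import math


def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def effect_label(r):
    """Descriptive label, assuming thresholds of .10, .30, and .50."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"


# Hypothetical cross-context correlation, with the application sample's N.
lower, upper = pearson_ci(0.35, 478)
print(f"95% CI [{lower:.2f}, {upper:.2f}]: "
      f"{effect_label(lower)} to {effect_label(upper)}")
```

Describing the interval’s endpoints, rather than the point estimate alone, conveys how much uncertainty remains about the size of an association.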

Table 4 Correlations between factor scores for Failure and Social Rejection contexts

Correlations between Response Types and Validation Measures

Table 5 provides intercorrelations among questionnaire measures of related constructs, and Table 6 provides correlations between these questionnaires and response component scores. As seen in Table 6, Positive Responses to rejection scenarios were moderately associated with shame, self-criticism, self-esteem, and self-compassion in theoretically expected directions, with CI’s for correlations ranging from small to medium effects; however, these correlations were somewhat attenuated for Positive Responses to failure vignettes (range from negligible to moderate effects). Strong Negative Responses to failure and rejection were moderately associated with self-reported shame, self-criticism, self-esteem, and self-compassion, with CI’s for correlations generally ranging from small to medium effects. In contrast, Situation-Specific Shame (to failure) was less strongly associated with these measures, with CI’s for correlations ranging from negligible to moderate effect sizes. Externalizing was fairly negligibly associated with these measures. However, unlike most other factors, Externalizing was inversely associated with guilt, particularly in the rejection context (CI consistent with a small to medium effect).
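Describing a confidence interval as ranging "from small to medium" can be made concrete with a short sketch. It assumes the Fisher z-transformation for the interval and Cohen's conventional benchmarks (.10, .30, .50) for the labels; neither choice is stated in this section, and the function names and the example values of r and n are hypothetical rather than taken from the tables.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def effect_label(r):
    """Label |r| using Cohen's conventional benchmarks (an assumption here)."""
    size = abs(r)
    if size < 0.10:
        return "negligible"
    if size < 0.30:
        return "small"
    if size < 0.50:
        return "medium"
    return "large"


def describe_ci(r, n):
    """Return labels for the smallest and largest magnitude inside the CI."""
    low, high = fisher_ci(r, n)
    # If the interval spans zero, the smallest plausible magnitude is zero.
    smallest = 0.0 if low <= 0.0 <= high else min(abs(low), abs(high))
    largest = max(abs(low), abs(high))
    return effect_label(smallest), effect_label(largest)


# Hypothetical inputs, not values from Table 4 or Table 6.
print(describe_ci(0.30, 200))
print(describe_ci(-0.05, 200))
```

Labelling both ends of the interval, rather than the point estimate alone, keeps the description honest about the precision that the sample size allows.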

Table 5 Means, standard deviations, and correlations amongst theoretically related variables with confidence intervals
Table 6 Correlations between response component scores and questionnaires measuring related constructs

Discussion

What Did Self-Directed Compassion Sound like?

In this study we reviewed university students’ immediate responses to a variety of difficult hypothetical situations, with the goal of identifying thoughts that appeared compassionate and uncompassionate toward the self. We also sought to situate these thoughts within the full, broad range of responses that such scenarios engendered. Since our inner thought processes are verbally mediated, any theory centred on relating to the self is supported by a foundational understanding of what people are actually saying to themselves. Thus, the current descriptive research provides a complementary approach to questionnaire-based assessment methods that rely on summative judgments about the way people relate to themselves.

Our observations converged with self-report measures in some respects, while paradoxically diverging in ways that may call into question the meaning of some questionnaire-based self-compassion scores. We identified several themes that fit conceptually within a broad theoretical construct of self-directed compassion, such as self-encouragement and acceptance of personal limitations. These themes tended to group together in factor analysis and to relate to self-reported shame, self-criticism, and self-compassion in the expected directions, adding credence to their presumed compassionate function. However, the relative infrequency with which we observed compassionate responses was itself a surprising and central finding: about 14% and 7% of responses to failure and rejection scenarios, respectively. Furthermore, even these responses varied in the degree of compassion that might be inferred: for example, the motivation for phrases such as “Practice makes perfect” could range from gentle acceptance of one’s imperfections to a less compassionate, generic self-direction. Some expected types of self-compassionate thoughts were virtually absent: for example, thoughts that appeared to reference the common human condition comprised less than 1% of codable responses, leading us to wonder whether the concept of common humanity was understood or acknowledged within this population (Table 7).

Although this paucity may be surprising, it fits with the general observation that within conventional society the concept of the “inner critic” is salient and readily understood, yet we do not recognize the parallel concept of an “inner friend” or “inner support person.” Questionnaires may provide an equal number of items assessing both self-compassionate and uncompassionate tendencies, but if some respondents have less experiential understanding of self-compassion, this could limit their ability to assess accurately their own levels of self-directed compassion (see also Davidson & Kaszniak, 2015). If this were true, then in addition to traditional self-report questionnaires, assessment of self-directed compassion could be enhanced by assessing the depth of one’s experiential understanding of what a “5” on a 5-point Likert scale item assessing self-directed compassion might look like.

It is possible that other response categories, particularly social reasoning and problem-solving, served a compassionate role to some degree as they tended to co-occur alongside self-directed compassionate thoughts. In response to social rejection vignettes, reasoning generally involved imagining the social other’s perspective, thus demonstrating a motivation to understand others that is foundational to other-directed compassion. In general, a pragmatic, solution-oriented focus may be experienced as comforting or encouraging, thus serving an implicit self-care function. Consistent with Gilbert’s (2009) conceptualization of self-directed compassion, it may reflect taking action to mitigate one’s own suffering. Associations have been reported between self-compassion and action-oriented coping (e.g., Neff, Rude, & Kirkpatrick, 2007).

‘Acknowledging experience’ was a heterogeneous category that awaits further study. It gives us pause that, in response to failure vignettes, acknowledgment of experience loaded inversely on the factor that included problem-solving, self-encouragement, and self-care. For social rejection vignettes, acknowledging experience loaded onto a factor that included avoiding experience and not seeking further information. Perhaps in some cases these responses may have served to minimize or invalidate experienced vulnerability or discomfort, an approach antithetical to self-directed compassion.

What Did a Lack of Self-Compassion Sound like?

Potentially uncompassionate responses were more frequent and varied. Although originally grouped according to Neff’s theorized domains of self-judgment, isolation, and over-identification, factor analysis suggested alternate groupings for the response categories based on their co-occurrence. Across both scenario types, two factors were identified that characterized strong, uncompassionate responses to self and others, respectively. Loading highly on the Strong Negative factors were extreme response categories that the coding team considered to be acutely uncompassionate: critical self-judgment and the two forms of isolation that reify a narrative that one is alienated from, and victimized by, others. Catastrophizing responses were by definition extreme, and would particularly undermine self-regulation. Taken together, these categories accounted for 6% and 19% of responses to failure and social rejection scenarios, respectively. The Strong Negative factors were most strongly associated with self-report measures of shame, self-criticism, low self-esteem, and low self-compassion.

Externalizing factors generally indicated a clear lack of compassion toward others in both failure and rejection contexts. They were negligibly associated with shame, self-criticism, self-esteem and self-compassion, but were uniquely, inversely associated with guilt; this is consistent with research relating externalization to a lack of guilt (Muris et al., 2016; van Tijen, Stegge, Terwogt, & van Panhuis, 2004). Indeed, a central function of externalization is to deflect and minimize personal accountability, and its accompanying disavowal of personal vulnerability is a barrier to self-directed, as well as other-directed, compassion.

Other factors were more situation-specific. Of particular note, Situation-Specific Shame in response to failure involved avoidance, fear of others’ reaction, and perceived isolation. For example, a participant’s responses to receiving a failing grade on a test might include worrying about how their parents would react, wanting to hide the grade from them, and believing they had let them down. Such responses characterize the experience of shame in response to a specific event, which has been distinguished from more trait-like shame-proneness involving global negative self-judgments (Goss & Allan, 1994; Tangney, 1996) such as those included in the Strong Negative factor. Situation-Specific Shame and Strong Negative factor scores were negligibly associated, suggesting minimal conceptual or functional similarity. Furthermore, Situation-Specific Shame was only modestly related to self-report measures of shame, self-criticism, low self-esteem, and low self-compassion. Considering the expressed fear of others’ reactions, we speculate that responses conveying elements of situation-specific shame might be influenced by perceived pressure from others regarding the individual’s performance, at either a societal or relationship-specific level.

Other factors were less easily evaluated as generally compassionate or uncompassionate. Non-acceptance of failure is considered uncompassionate from a philosophical and theoretical perspective, in that compassion is said to entail acceptance of experiences (Neff, 2003a). However, from a functional perspective, it could be argued that some degree of situational non-acceptance of failure (e.g., wishing the outcome were different; seeking information about what went wrong and how to do better) would be pragmatic, depending on one’s goals. Future research could investigate whether such non-acceptance is associated with conscientiousness, competitiveness, agency, and/or achievement. In a similar vein, acceptance of social rejection and its potential negative implications might be seen as compassionate due to the theoretical connection between acceptance and self-compassion, but it is also possible that these responses might be influenced by low perceived social agency. Ambiguities such as these suggest that the degree to which some of these thoughts might be uncompassionate may depend on their frequency and severity. They also argue for the importance of considering responses in their broader context when appraising whether they serve a compassionate or uncompassionate function. For instance, although rigid avoidance of distressing emotions typically is seen as problematic, temporary avoidance in order to get through a discrete difficult experience has been considered adaptive (Herman-Stahl, Stemmler, & Petersen, 1995; Lazarus & Folkman, 1984) and could be a compassionate response, depending on one’s immediate needs and internal resources. Similarly, exerting a certain degree of pressure to achieve might be consistent with one’s goals, and depending on the broader context, might differentiate self-directed compassion from passive self-indulgence.

Strengths, Limitations, and Future Directions

Taken together, a variety of commonly experienced failure and rejection scenarios generated a wide array of codable responses. Use of a diverse development sample, and a large application sample, allowed us to achieve good interrater reliability on most frequently observed categories and increased confidence in the generalizability of our findings to similar university samples. Although some categories occurred infrequently, future research may find that these categories are more relevant to other populations. For instance, narcissism (self-protective isolation) was an extremely rare category in our sample but might be more frequently observed in clinical or forensic populations. Moreover, infrequently observed forms of self-directed compassion could indicate specific targets for intervention.

Use of self-directed compassion as a theoretical perspective to interpret responses necessarily restricted our findings; other researchers may choose to identify and group response categories differently depending on their theoretical lens. We also were limited by the brevity of individual responses. Although we were able to make our best guess regarding a number of salient functional distinctions, we cannot be certain of these functions (Russell, 1982). Categories are based on the coding team’s best guess about statement function, following careful group discussion of difficult-to-code statements, and erring on the side of caution by considering responses uncodable when conflicting interpretations were possible. Future research addressing the function of these responses could validate or elucidate the functional distinctions we inferred.

Of theoretical interest was the differentiation of self-directed compassion and self-esteem within the “liking self” category. Despite the intriguing conceptual distinctions between these constructs (Neff, 2003b), they share a great deal of overlap since both are a form of positive self-regard. Clear references to achievement-based self-judgment and social comparison were excluded from the category, due to their theoretical association with self-esteem; however, we cannot rule out that some responses coded as compassionately “liking self” also implicitly involved these processes.

Particularly ambiguous were responses categorized as acknowledging experience, one of the most common types of automatic thoughts. Thoughts assigned to this category typically involved labeling one’s emotion (e.g., “I feel angry”) or simply restating the situation (e.g., “They don’t want to go out with me”). The first step in a mindful response is to acknowledge moments of suffering in just this way. To the extent that feelings and situations were acknowledged with acceptance and openness, such acknowledgment would exemplify mindfulness. However, if such statements reflected resistance, or were a precursor to over-identification, then they would not be considered mindful. The statements themselves did not provide enough information to discriminate between these possibilities. Thus, we did not include this frequently used category within the mindful awareness construct, and we expect that it is quite heterogeneous; it awaits future research.

A key limitation of the current study is that immediate thoughts were self-reported. Internal self-talk is difficult, if not impossible, to “observe” without reliance on self-report, which likely functions as a filter. Although asking for spontaneous thoughts to discrete scenarios arguably provided more immediate, genuine responses than the summative judgments collected on questionnaires, participants may not report their first three automatic thoughts for a variety of reasons (e.g., low insight, avoidance, social desirability). Requiring participants to type, rather than verbally report, their responses may have additionally obscured their immediate thoughts, although in the current era emerging adults arguably may find it easier to communicate openly via text than interpersonally (Hall, Feister, & Tikkanen, 2018). Nonetheless, future research should consider collecting spoken responses from participants, such as in the form of a structured interview, to explore whether the nature of spoken participant responses differs from the typed responses collected in the present study. Furthermore, although a strength of the current study, a focus on immediate thoughts also is a limitation since, at a deeper level, self-directed compassion extends beyond verbal language. It is unlikely that a heartfelt openness toward oneself can be fully captured through verbal communication. Considered one of the “Four Immeasurables” in Buddhist philosophy (Nhat Hanh, 1991), deeply authentic compassion may ultimately defy our attempts to measure it. Observation of mindful awareness was particularly hindered by a reliance on words: the act of simply being present is often not accompanied by words.

Despite unavoidable limitations and complexities, observing self-directed compassion as a relational process unfolding in real time provides a necessary foundation to the field. Future research is required to clarify and extend these findings in populations that differ by age, culture, and sub-culture. Since the meaning of participants’ responses is presumably highly culturally dependent, such research will require qualitative analysis to ascertain the breadth of responses and their meaning, with groups of coders who share the cultural perspective. Vignettes also may require modification to increase their relevance to the population under study. Identifying the observable forms that self-directed compassion might take across these different cultural perspectives may ultimately enrich our understanding of the construct and of how to foster its development.