Introduction

To date, the concept of posttraumatic shame has received little empirical investigation in the field of traumatology, with a few notable exceptions (Beck et al. 2011; Matos and Pinto-Gouveia 2010; Resick et al. 2008; Harman and Lee 2010; Leskela et al. 2002; Platt and Freyd 2012; Semb et al. 2011; Street and Arias 2001). In particular, definitions of posttraumatic stress disorder (PTSD) emphasize fear, but newer theories and increasing evidence in recent years suggest that non-fear emotions, such as shame, may be important features of PTSD (Holmes et al. 2005; Grunert et al. 2003; Grunert et al. 2007; Smucker et al. 2003; Beck et al. 2011; Resick et al. 2008; Harman and Lee 2010; Semb et al. 2011). Recent theoretical contributions have specified the role of shame in maintaining symptoms of PTSD (Harman and Lee 2010; Ehlers and Clark 2000). Both trauma-related shame and guilt cognitions elicit aversive emotions and can function as reminders of trauma memories (Ehlers and Clark 2000). The persistent self-criticism associated with shame represents a threat to the self and can contribute to the maintenance of PTSD symptoms by reinforcing the sense of ongoing threat (Ehlers and Clark 2000). Shame appraisals lead to shame-charged intrusions and great distress, accompanied by avoidance and thought suppression (Harman and Lee 2010; Lee et al. 2001; Lee 2005), and consequently strengthen the relationship between shame and intrusions. In addition, the inclusion of shame and guilt in the criteria for PTSD in the newly released DSM-5 has profound consequences for the conceptualization, assessment and treatment of PTSD.

Patients in clinical settings are sometimes reluctant to disclose feelings of shame for fear of being exposed and rejected (Macdonald and Morley 2001). Unfortunately, failure to identify shame in patients can be disruptive to treatment (e.g., lead to treatment stagnation, prolonged treatment, or reduced treatment outcome) (Arntz et al. 2007; Brewin et al. 1996; Resick and Schnicke 1993). Some studies indicate that shame and guilt are both associated with reduced outcome of prolonged exposure, in which imagery exposure (IE) is a central component. These studies also indicate that the treatment model imagery rescripting (IR) is effective and relevant for PTSD patients with complex emotion profiles (Arntz et al. 2007; Grunert et al. 2003; Grunert et al. 2007). Cognitive processing therapy (CPT; Resick and Schnicke 1992) is a type of cognitive behavioral therapy that has also been shown to be effective in the treatment of a variety of emotional reactions in PTSD. Thus, a shame inventory could be useful in tailoring existing treatments to effectively target shame for patients with more complex emotional profiles. Consequently, an instrument assessing shame in the context of trauma is needed for clinical evaluation as well as for research purposes. Accordingly, the goal of the present study is to investigate the psychometric properties of a newly created assessment of shame in the context of trauma.

Shame Versus Guilt

Frequently, shame and guilt are used as equivalent terms, but several writers have distinguished them with reference to different emotional experiences (Lindsay-Hartz 1984; Tangney and Dearing 2003; Tangney et al. 1996). Most shame theorists agree that shame involves negative evaluations of the entire self (Kubany and Watson 2003). Theoretically, shame involves painful self-scrutiny and self-condemnation, along with feelings of worthlessness and powerlessness, coupled with a behavioral tendency to hide, disappear, or withdraw for fear of condemnation and rejection by significant others (Lindsay-Hartz 1984; Tangney 1991; Greenberg and Paivio 1997; Lewis 1995; Nathanson 1987; Stone 1992; Tomkins 1987). It is further postulated that shame is associated with negative biases when judging others’ evaluation of the self, and involves a desire to conceal one’s own perceived deficiencies from the evaluation of others (Lewis 1971; Tangney et al. 1996; Wicker et al. 1983). Accordingly, the cognitive (self-condemnation), affective and behavioral (hiding and withdrawing behavior) components of shame constitute the definition of shame and were included in the measurement design of the Trauma-related Shame Inventory (TRSI) to be described below. Consequently, shame is characterized as a more painful emotion in which the entire self, and not just the behavior, is negatively evaluated (Lewis 1995; Tangney and Dearing 2003). In contrast, guilt involves negative evaluations of specific actions or behaviors and motivates individuals to attempt reparative actions (Kubany and Manke 1995; Lewis 1995; Nathanson 1987; Tangney 1991; Tangney et al. 1996). However, researchers have generally failed in prior attempts to distinguish guilt and shame and to provide precise operational definitions of shame (Kubany and Watson 2003). The gap between the clinical interest in shame and guilt and the empirical focus on these self-evaluated emotions in treatment studies has been due largely to difficulties in the operationalization and distinction of these complex constructs.

Gilbert (1997) distinguished between external and internal shame, in which two reference norms, personal and social, play different roles in the self-evaluation process. External shame is related to one’s preoccupation with how others will appraise and evaluate the self. In external shame the concern is how one is seen and evaluated by others (e.g., fear of being scorned, devalued or ridiculed by other persons). By contrast, internal shame relates to one’s preoccupation with self-devaluations in which one evaluates oneself as flawed, weak, inadequate and inherently disgusting, which is the very definition of shame provided by Lewis (1971), Lewis (1995) and Tangney and Dearing (2003), among others. According to Gilbert (1997; Gilbert and Andrews 1998), internal shame occurs when an individual personalizes the traumatic experience and views it as confirming evidence of personal failure. Even though external and internal shame are assumed to be highly correlated, Gilbert views them as two separate subcategories of shame. Whether external shame can occur in the absence of negative self-evaluation, or whether external shame is a result of one’s negative evaluations of the self, which constitute internalized shame, remains an empirical question. In accordance with Gilbert’s theory (Gilbert 1997; Gilbert and Andrews 1998), external shame is also assumed to be part of the shame-ridden individual and related to the affective and behavioral components of shame. Accordingly, the affective component of shame and the behavioral tendency to hide and withdraw are assumed to be important aspects of both internal and external shame and were included in the measurement design of the TRSI. The categories of internal and external shame, along with the subcategories of self-condemnation and the affective and behavioral components of shame, are therefore included as essential features of the measurement design of the TRSI.

Condemnation of the self involves a judgmental and critical stance towards the self (Gilbert and Miles 2000). Self-judgment has been linked in prior research both empirically (Neff 2003; Gilbert and Miles 2000) and theoretically (Gilbert and Miles 2000) to internal shame and depression. According to Orth et al. (2006), negative emotions like shame, in contrast to guilt, elicit rumination, which then leads to depression. Prior research indicates that self-criticism is part of the shame construct (Gilbert and Miles 2000), and a study by Gilbert and Procter (2006) found that shame was associated with low self-compassion, high self-criticism and low self-esteem.

Assessments Used in Research on Trauma-Related Shame

Although shame has captured the attention of clinical psychologists for decades, systematic empirical research is scarce (Beck et al. 2011; Blum 2008). The gap between the clinical interest in shame and the empirical focus on this emotion in relation to trauma has been due largely to difficulties in the measurement of the construct (Blum 2008). Independent conceptual analyses of the applied indicators for shame, as suggested in the methodological literature (Cronbach et al. 1972; Benson and Hagtvet 1996; Messick 1995; Nunnally and Bernstein 1994; Kane 2001), are scarce or absent in the assessment of shame. Part of the problem in the assessment of shame is defining the domain of indicators for shame as distinguished from guilt and self-concept. The general approach taken in measuring shame is to include a number of different but related indicators of the construct in one scale (Andrews et al. 2002). The item formulations make no direct reference to the concrete aspects of the construct under investigation (e.g., cognitions, emotional expressions or behavioral indicators of shame) or to specific contexts (like shame reactions in relation to trauma). Unfortunately, researchers with contradictory views on shame tend to design measurement instruments that reflect their own particular approaches (Andrews et al. 2002; Blum 2008). As such, the measures of shame vary substantially in structure and format, with different conceptual distinctions from other emotions and self-concepts. Cook (1989), for example, defined internalized shame as an “enduring chronic shame that has become internalized as part of its identity and which can be most succinctly characterized as a deep sense of inferiority, inadequacy or deficiency” (p. 9). He suggests that internalized shame and self-esteem are two sides of the same coin (Cook 1991). The inclusion of indicators of self-esteem in the measurement domain of “The Inventory of Internalized Shame” (ISS; Cook 1989, 1993) is unfortunate because it rules out the possibility of exploring the relationship between shame and self-esteem, as well as their independent relationship to other psychological concepts (Tangney and Dearing 2003). An assessment of shame that includes items tapping self-esteem or guilt will result in confounding different constructs with potentially different effects on outcome in applied research. It is not uncommon to find shame and guilt to be differently related to psychological functioning and symptoms, with effects even in opposite directions (Leskela et al. 2002; Street and Arias 2001). Using assessments that confound shame and guilt runs the risk of obtaining negligible correlations with the variable of interest because these effects have essentially cancelled each other out. Thus, one may erroneously conclude that these emotions are not relevant to the variable of interest (Tangney and Dearing 2003). The consistent reports of high correlations between shame and scales of guilt and self-esteem across a number of different questionnaires (Cook 1988, 1991) may be an indication that shame often occurs together with guilt and low self-esteem, or it may reflect methodological problems.

Several issues persist in considering scales used in present research on trauma-related shame. The most widely used scale in research on shame is the Test of Self-Conscious Affect (TOSCA; Tangney et al. 1989), which presents 15 hypothetical scenarios in real-life situations that may elicit shame and guilt reactions. There are several problems with using scales with hypothetical situations for shame, especially in clinical populations. The hypothetical situations represent situations encountered in everyday life and are not suited to capturing intense shame reactions in specific domains such as in the aftermath of traumatic incidents (Beck et al. 2011). Consequently, the scales lack ecological validity because they offer no opportunity to report actual shame reactions among PTSD patients. In addition, the hypothetical situations do not include clinical aspects of shame related to condemnation of one’s emotional and behavioral reactions or coping ability, which are hypothesized to be important for trauma-related shame (Lee et al. 2001).

Another approach to assessing shame is to ask general questions about shame reactions, as in The Trauma Appraisal Questionnaire (TAQ; DePrince et al. 2010) and The Experience of Shame Scale (EES; Andrews et al. 2002). Andrews et al. (2002) constructed the EES to assess four areas of characterological shame, and this scale is used in recent clinical research on trauma-related shame (Harman and Lee 2010; Resick et al. 2008). When general questions are asked about shame over personal characteristics, habits, and manner with other people, with no reference to concrete examples of cognitions, emotional expressions or behavioral indicators of shame in a situational context, the respondents are more susceptible to their own idiosyncratic understanding of this abstract and complex concept. This might result in conceptual ambiguity and diversity in responses (e.g., one of the items in the TAQ is formulated “It’s as if my insides are dirty”). In addition, no available instruments tap important aspects of shame such as fear of negative consequences of disclosure like rejection and condemnation from significant others, which are central components in the definition of shame (Lewis 1995).

Given the typical procedures for creating shame scales referred to above, there is reason to believe that not all features of shame in a clinical context are represented in the items of previous instruments. Despite the methodological problems regarding the available assessments of shame, studies of shame in relation to trauma rely on the measures The Inventory of Internalized Shame (Beck et al. 2011; Wong and Cook 1992), TOSCA (Leskela et al. 2002; Street and Arias 2001) and The Experience of Shame Scale (Resick et al. 2008; Matos and Pinto-Gouveia 2010; Harman and Lee 2010). Despite the need for developing an assessment of trauma-related shame focusing on global negative assessment of one’s self in a situational context relevant for traumatized individuals (Beck et al. 2011), there is no such measure available.

In the current managed health care environment, there is growing pressure for clinicians and researchers to use abbreviated instruments that measure complex constructs more quickly than the original full-scale versions. A number of authors have argued for the use of short-form measures in clinical practice for screening purposes in order to reduce behavioral-observation time and costs (Donders 1997; Santor and Coyne 1997). A similar argument could be made for the use of a short version of the TRSI consisting of 24 items in clinical research and practice.

Aims of the Study

In the present study, the framework of generalizability theory (G-theory) is applied to assess psychometric properties of scores reflecting trauma-related shame in patients with PTSD. G-theory provides a framework for estimating the reliability of scores based on the simultaneous analysis of multiple sources of error variance and combinations of these sources (Brennan 2001a, b). G-theory also provides the opportunity to investigate alternative measurement designs in order to identify the number of items sufficient to provide acceptable generalizability and dependability of scores. G-theory represents a new approach in the measurement of shame, and is considered to be a relevant statistical framework for investigating psychometric properties of scores when applying a measurement design with multiple sources of variance (Brennan 2010). The purpose of this study was to assess 1) the internal structure of a newly developed self-report measure of trauma-related shame by means of G-theory when taking into account all its identifiable sources of variation and covariation in the present measurement design, 2) alternative measurement designs in order to find the lowest number of items that provide acceptable generalizability and dependability of scores without narrowing the measurement domain of trauma-related shame, 3) the differential construct validity between shame and guilt, and 4) the relationship between the TRSI and a) depression, and b) internal shame measured by the subscale “self-judgment” in the Self-Compassion Scale (SCS; Neff 2003). Consistent with the theoretical considerations presented earlier, we expect the subcategories of self-condemnation and the behavioral and affective components to be highly correlated. In light of the theoretical considerations and empirical results presented earlier, we also anticipate that high scores on the TRSI will be related to high scores on both self-judgment and depression. By taking into account the sharp conceptual distinction between shame and guilt in the creation of the TRSI, we expect that scores of the TRSI represent shame in relation to trauma without confounding shame and guilt.

Method

Participants

The sample consisted of 68 patients diagnosed with PTSD from all parts of Norway seeking treatment for this disorder at a psychiatric hospital in Norway. Seventy-one patients were found eligible for treatment at the assessment stay and admitted to treatment from December 2008 to November 2010. At admission, all these 71 patients were found to meet research criteria, but 3 of them declined research participation. A standardized clinical interview was conducted by two psychologists using the Posttraumatic Symptom Scale Interview (PSS-I; Foa et al. 1993). The ratings of the psychologists conducting the standardized clinical interviews were checked for inter-rater agreement. Inter-rater agreement for the continuous items constituting the PSS-I was evaluated by means of ICC (3, 1) (Shrout and Fleiss 1979), with a value of .91 in the present study. The inclusion criteria were liberal and similar to clinical practice criteria. The inclusion criteria were: (a) satisfying DSM-IV criteria for chronic PTSD with symptom duration of more than 6 months, (b) PTSD symptoms identified as the primary problem in need of treatment, (c) age 20 to 65 years, and (d) accepting withdrawal of all psychotropic medication (which is standard procedure on the Anxiety Unit of the psychiatric hospital). The exclusion criteria were: (a) current suicidal risk, (b) current psychosis, (c) severe dissociative symptoms, or (d) current involvement in an abusive relationship, which are similar to the clinical practice criteria for prolonged exposure and Imagery Rescripting and Reprocessing Therapy (IRRT). Using listwise deletion for missing responses resulted in a sample of 50 patients completing all items of the TRSI on both measurement occasions. The patients excluded from the analysis did not complete any of the TRSI items at the first measurement occasion. The missing data involved in the present study are defined as missing completely at random. Little’s test of missing completely at random (MCAR) based on the 24 items on both measurement occasions indicated that data were missing completely at random, Little’s MCAR χ²(6) = 5.434, p = .490. No data were available for patients with missing data at the first measurement occasion to estimate imputed values for missing responses. The sample of 50 patients consisted of 44 % men and 56 % women with a mean age of 45.51 (SD = 9.629) and with mean values of 1.201 (SD = .867) and .790 (SD = .799) on internal and external referenced shame, respectively, on a four-point Likert scale. The 50 patients had experienced a variety of index traumas: 20 (40.8 %) patients had experienced physical assault by a familiar person, 24 experienced physical assault by a stranger, 23 accidents, 5 natural disasters, and 14 war-related traumas. Eight and ten of the patients were subjected to sexual assault from a stranger and from a familiar person, respectively. The mean length of time since the index trauma was 17.5 years (SD = 13.3 years).

The Development of the TRSI

This inventory was constructed for exploring the concept of trauma-related shame. This construct was operationally defined as a negative evaluation of the self in the context of trauma with a painful affective experience, and a behavioral tendency to hide and withdraw from others to conceal one’s own perceived deficiencies. Twenty-eight items that best approximated the variety of trauma-related shame components were written after several revisions of the item wording in English. Theories of emotion in general (Power and Dalgleish 2008) and shame in particular (Gilbert 1997; Gilbert and Andrews 1998) have led us to sample negative appraisals of the self, which include the perception of the self as defective (internalized shame), and the perception of others’ negative evaluation of the self (externalized shame). Items were written to represent negative appraisals of the self, affective experience of shame or action tendencies expressed in withdrawing behavior. Twelve positively worded items forming a subscale called personal growth, representing positive appraisals of the self, were written but excluded from the set due to very low correlations with the remaining items. Contrary to expectation, inter-item correlations indicated that the positively formulated items were tapping independent constructs rather than a unidimensional bipolar variable. The items were constructed in collaboration with two experts in the field to ensure the inclusion of all relevant aspects of trauma-related shame in the measurement domain. In the construction of the TRSI, an English version was made before the translated version. Two experts on shame, one expert on shame theory in general and one expert on shame in relation to trauma, participated in the process of creating the English items and evaluated the relevance and appropriateness of the items. Expert opinions were based on the English version of the scale. The English version was then translated into Norwegian by the fourth author of the present study. The Norwegian version was then translated back into English by a bilingual person with English as his native language and without prior knowledge of the original English items. The final English translation of the TRSI was found to be in accordance with the original English version of the TRSI. The patients completed the translated Norwegian version in their native language. An independent conceptual analysis of the translated version of the scale resulted in the exclusion of four items due to conceptual ambiguity in the translated item formulations.

The patients were asked to rate on a 4-point Likert scale the presence of specific symptoms of trauma-related shame experienced during the past 7 days (0 = not at all correct about me; 1 = sometimes correct about me; 2 = mostly correct about me; 3 = completely correct about me). The participants were assured that their responses to the inventory would be strictly confidential and used for research purposes only.

Procedure

The Trauma Related Shame Inventory (TRSI), the Trauma Related Guilt Inventory (TRGI) and the Self-compassion scale (SCS; Neff 2003) were administered to patients with PTSD during an initial 3-day pre-treatment assessment and during the first week of their 10-week inpatient program on the Anxiety Inpatient Unit. The patients were recruited during an initial 3-day pre-treatment assessment designed to evaluate eligibility for treatment and to determine their suitability for the study. Those who met inclusion criteria were fully informed about the study, gave written consent to participate in the study, and were introduced to the inventory as part of a measurement battery examining the role of emotions in treatment of PTSD. The patients completed the TRSI, the TRGI and the SCS, and were then put on a waiting list for the 10-week inpatient program at the Anxiety Unit. The patients completed the same inventories at the start of their treatment approximately 10 weeks later.

Assessments

Self-Compassion Scale (SCS; Neff 2003) contains 26 items that assess the degree of self-compassion that one is capable of during times of emotional distress. Participants respond to various items about “how I typically act toward myself in difficult times” on a 5-point scale. The inventory consists of 6 subscales tapping self-kindness, self-judgment, common humanity, isolation, mindfulness, and over-identification. Cronbach’s alpha for the Self-Judgment subscale used in the present study was .85. The psychometric properties of the SCS have previously been investigated in a community sample by Neff (2003), examining both the internal structure and the relationship between the SCS and the Beck Depression Inventory (BDI; Beck et al. 1961), State-Trait Anxiety Inventory (Spielberger et al. 1970), and Self-Criticism subscale of the Depressive Experiences Questionnaire (DEQ; Blatt et al. 1976).

Beck Depression Inventory (BDI-II; Beck et al. 1996a) is a 21-item self-report scale for assessing the degree of cognitive, affective, motivational, and physiological symptoms of depression during the past seven days. Items are scored on a four-point scale of symptom severity from 0 (no depression) to 3 (maximum depression). Various studies have investigated the psychometric properties of the BDI, e.g., a study by Beck et al. (1996b) reported a Cronbach’s alpha value of .91.

Trauma-related Guilt Inventory (TRGI; Kubany et al. 1996) is a 32-item inventory consisting of five subscales that assess different components of trauma-related guilt. The subscales are called the global guilt subscale, the distress subscale, the hindsight-bias/responsibility subscale, the wrongdoing subscale, and the lack of justification subscale. The latter three components are included in the guilt cognition subscale consisting of 22 items. Items are scored on a five-point scale ranging from 1 (never/not at all true) to 5 (always/extremely true). Prior studies have demonstrated internal consistency reliability of .86 and moderate correlations with PTSD and depression symptoms in a trauma sample (Kubany et al. 1996). For the present study the guilt cognition subscale was chosen and examined in order to separate trauma-related guilt from trauma-related shame. In this study, Cronbach’s alpha was .90 for the Guilt Cognition Scale.

The Measurement Design

In this study, trauma-related shame is represented by four facets, that is, four different sources of score variance, within the measurement design. One facet is called Referent (r) and includes two different evaluative situational conditions: (1) Self-referent shame (internal-referent shame), and (2) Other-referent shame (external-referent shame). The second facet is labeled Aspect (a). The aspect facet represents different subcategories of shame consisting of self-condemnation as the cognitive component of shame in addition to the affective-behavioral component of shame. Affective and behavioral indicators of shame were combined into a single affective-behavioral component because too few behavioral items were available to estimate a separate aspect category for the behavioral component. Because measurements were collected at two measurement points, the third facet, called occasions (o), was included in the design of the study as a random facet. The fourth facet is items (i), which is nested within (:) combinations of referents and aspects, and crossed with (x) occasions. The present multi-facet measurement design p x o (i:ra) implies that all patients completed the same instruments on both measurement occasions, in which different items exist in the aspect categories of self-condemnation and affective-behavioral component of shame, while the items have equivalent formulations in internal and external referenced shame. The descriptive conditions for each facet are presented in Table 1.

Table 1 Measurement design of trauma related shame inventory: four facet p x o (i:ra) design

Conceptual Analysis of Item Indicators

According to Cronbach et al. (1972) and Kane (1982, 2001), a domain definition that defines the construct is required to claim generalizability of the score. A conceptual analysis was therefore carried out by the first author before conducting the statistical analysis in order to evaluate the correspondence between each item’s content and its corresponding shame construct. Three main criteria were critical for accepting the items as representative indicators of the shame construct. The items should, in accordance with the theoretical definition of emotion by Power and Dalgleish (2008), represent either: a) a component of the negative affective experience of shame, b) appraisals in the form of negative evaluations of the self, or c) action tendencies expressed in withdrawing behavior. Four items representing self-condemnation in internal and external shame were excluded on the basis of the conceptual analysis, which identified conceptual ambiguity in the Norwegian terms for ‘destroyed’ and ‘marked for life’ in the item formulations. Thus, the conceptual analysis resulted in 24 items of trauma-related shame, consisting of six items representing each of the condemnation and affective-behavioral aspects of shame with equivalent formulations for internal and external shame (see Table 1). The four different categories were labeled: Internal Condemnation, Internal Affective-behavioral, External Condemnation, External Affective-behavioral. The first two categories constitute internal referenced shame, while the latter two constitute external referenced shame.

Generalizability Theory Applied to the Present TRSI Design

One of the major advantages of generalizability theory (G-theory) is that multiple sources of error variance are estimated simultaneously in a single analysis when investigating psychometric properties of scores (Shavelson and Webb 1991). When doing research in real-world clinical settings one can conceive of many sources of measurement error, such as inconsistencies in scores originating from measurement occasions, raters, and items, among others. These sources of variation are represented as facets of observation. Accordingly, G-theory allows the test constructor to appropriately estimate the generalizability (reliability) of the scores in a multi-facet measurement design. In contrast, Cronbach’s alpha coefficient will likely provide biased estimates of the psychometric properties of scores obtained in such measurement designs (Cronbach and Shavelson 2004).

A distinct characteristic of G-theory is the distinction made between reliability involving absolute decisions, which is relevant if clinical decisions are based on an individual’s score, and relative decisions involving stability in the relative standing or ranking of persons (Shavelson et al. 1989; Brennan 2003; Feldt and Brennan 1989). This distinction is important in clinical practice because most clinical decisions concern the standing of a given patient with regard to criteria used for determining clinical intervention (absolute decisions). Two types of coefficients can be estimated, representing different definitions of measurement error. The coefficient of generalizability, $E\rho^2$, is relevant when the researcher is concerned with decisions involving the relative standing or ranking of individuals. The multiple errors or threats to generalization of relative decisions are compressed in the variance term $\sigma^2_\delta$, labeled relative error, which includes all the variance components involving interactions between persons and facets of observation in Table 2. The relative G-coefficient ($E\rho^2$) is defined as the ratio between universe score (or true score) variance, $\sigma^2_p$, and observed variance, consisting of the sum of the relative error variance, $\sigma^2_\delta$, and the universe score variance, $\sigma^2_p$: $E\rho^2 = \sigma^2_p / (\sigma^2_\delta + \sigma^2_p)$. The G-coefficient is then interpreted as the proportion of observed variance that is accounted for by universe score variance.

Table 2 Estimated G-study Variance Components for Trauma Related Shame Inventory Based on Random Model: p x o (i:ra) Design (N = 50)

The other definition of measurement error is involved when estimating the other type of generalizability coefficient, called the index of dependability, $\Phi$ (Brennan 1983; Shavelson and Webb 1991). This coefficient is relevant when making absolute decisions based on the absolute level of an individual’s score (e.g., a patient’s score in relation to a given clinical cut-off criterion). The multiple errors involved in absolute decisions in the present study include all variance components in Table 2 except the person component, $\sigma^2_p$. The multiple error components for absolute error are compressed in the term $\sigma^2_\Delta$. The index of dependability is defined by the ratio $\Phi = \sigma^2_p / (\sigma^2_\Delta + \sigma^2_p)$.
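As an illustration of the two definitions of error, the following Python sketch computes the relative and absolute error variances and the resulting generalizability coefficient and index of dependability for a p x o (i:ra) design. The facet sample sizes (two occasions, two referents, two aspects, six items per referent-by-aspect cell) follow the measurement design described above. The variance components are hypothetical placeholders and not the estimates in Table 2, only a subset of the design's components is listed for brevity, and the function and variable names are illustrative assumptions.

```python
from math import prod

# Facet sample sizes taken from the measurement design described in the text.
N = {"o": 2, "r": 2, "a": 2, "i": 6}

# HYPOTHETICAL variance components (NOT the Table 2 estimates).
# Each key lists the facets involved; "pira" stands for the pi:ra component.
# Only a subset of the components of the full design is shown.
COMPONENTS = {
    "p": 0.40,      # universe score variance
    "o": 0.01,
    "ira": 0.06,    # i:ra
    "po": 0.05,
    "pira": 0.30,   # pi:ra
    "poira": 0.24,  # poi:ra, confounded with residual error
}


def error_variance(components, n, relative):
    """Sum the error components, each divided by its facet sample sizes.

    relative=True keeps only interactions with persons (relative error);
    relative=False keeps every component except persons (absolute error).
    """
    total = 0.0
    for name, variance in components.items():
        if name == "p":
            continue
        if relative and "p" not in name:
            continue
        divisor = prod(n[facet] for facet in name if facet != "p")
        total += variance / divisor
    return total


def g_coefficients(components, n):
    """Return (generalizability coefficient, index of dependability)."""
    sigma_p = components["p"]
    rel = error_variance(components, n, relative=True)
    absolute = error_variance(components, n, relative=False)
    return sigma_p / (sigma_p + rel), sigma_p / (sigma_p + absolute)


e_rho2, phi = g_coefficients(COMPONENTS, N)
print(f"E rho^2 = {e_rho2:.3f}, Phi = {phi:.3f}")
```

With these placeholder values the generalizability coefficient is about 0.90 and the index of dependability about 0.89. Because absolute error contains every term of relative error plus the main effects and interactions of the facets, the index of dependability can never exceed the generalizability coefficient.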

In addition, generalizability theory provides information for optimizing a multi-facet measurement design, minimizing the influence of measurement error by altering the sample sizes of the facets of observation (Brennan 2010). This is especially relevant for the second research question, which concerns a measurement design for short-form scales that optimizes the number of items without narrowing the construct domain.

In the third research question we investigate the differentiation between shame and guilt as measured by the TRSI and the guilt cognition scale. In the present data context, this research question can be approached by estimating the generalizability of the difference scores (Brennan et al. 1995) representing the distinction between guilt and shame. The reliability of difference scores between guilt and shame can be estimated by means of a univariate generalizability study (Eikeland 1973). A generalizability coefficient representing the reliability of the distinction between shame and guilt was estimated by a formula corresponding to the formula for the relative generalizability coefficient outlined above.

In the present p x (i:c) design, in which items (i), and hence the person by item interaction, are nested within the construct categories (c) of shame and guilt, the “person by construct” (pc) interaction term indicates to what extent persons’ rank order of universe scores differs across the two construct categories of shame and guilt. The estimated pc component represents the universe score variance for the difference score. The relative error variance, $\sigma^2_\delta$, is now estimated by $\sigma^2_{pi:c}/n_i$, where $n_i$ is the number of items applied in the estimation. Thus the generalizability coefficient for the difference score between the two constructs guilt and shame, $E\rho^2_{\mathrm{diff}}$, is $\sigma^2_{pc} / (\sigma^2_\delta + \sigma^2_{pc})$. It should be noted that in the applied p x (i:c) design, the number of items, $n_i$, within the TRSI and the guilt cognition scale is 24 and 22, respectively. Because of this unbalanced design, the variance components were estimated with the software program urGENOVA (Brennan 2001a, b), and $n_i = 22$ was applied when estimating the generalizability of the difference score. The generalizability coefficient for the difference score between TRSI and guilt cognition indicates to what extent the universe score variance of the pc interaction, that is, the universe difference score variance, accounts for variance in the observed difference scores. Put differently, high reliability of the difference scores between shame and guilt would indicate that the mean correlation among items within the constructs is higher than the mean correlation among items across the constructs. This pattern of correlations indicates support for differential construct validity (Eikeland 1973).

G-theory also distinguishes between a G-study and a D-study. G-study estimates of the variance components are presented in Tables 2 and 3. These variance components display the relative importance of the sources of variation in a typical observation or in an average observation in the design. A D-study, on the other hand, estimates generalizability coefficients for a composite of scores in which error components are reduced by increasing the sample sizes of the facets of observation. For further information about G-theory and its analysis, the reader is referred to Shavelson and Webb (1991) and Brennan (1992a, b, 2001a, b).

Table 3 Estimated G-study Variance and Covariance Components for Trauma Related Shame Inventory Based on p• x o• x iº Design (N = 50)

The generalizability studies were performed with GENOVA, a program developed for univariate generalizability analyses with balanced designs (Crick and Brennan 1983). Multivariate generalizability analysis, performed with the program mGENOVA (Brennan 1999), was also applied to explore the internal structure of the test design, as will be elaborated below. The application of multivariate generalizability analysis to examine the covariance composition of a measurement design involving four categories from the combinations of two fixed facets (internal and external shame) has been suggested by Brennan (1994, p. 189).

Results

Univariate Generalizability Analysis

The internal structure of trauma-related shame was first examined by assessing the variance component structure estimated in a univariate generalizability study. Based on the p x o (i:ra) design, 19 G-study variance components were estimated with GENOVA. The variance components from the univariate generalizability analysis are presented in Table 2.

In the present multi-facet study design, the person component (p), representing the universe scores (which are equivalent to true scores), accounted for a major part of the total variance, indicating that variance in scores is systematically related to individual differences in shame. With some exceptions, the remaining variance components, representing sources of variance related to items, occasions, aspects, and referents and the interactions among them, contributed negligibly to the score variance. Two major sources of variation in test scores were the “persons by items nested within referents by aspects” component (pi:ra) and the “persons by occasions by items nested within referents by aspects” component (poi:ra), which accounted for about 11 % and 24 % of total variance, respectively. The first variance component (pi:ra) indicates how much persons’ rank order differs across items nested within the aspect by referent categories. The latter variance component indicates instability in persons’ rank order across items within the aspect by referent categories on different measurement occasions. The variance component for the referents facet (r) and the interaction component “persons by occasions” (po) each accounted for about 8 % of the total variance. The “person by occasion” (po) interaction term indicates to what extent persons’ rank order differs across the measurement occasions. The variance components associated with occasions (o) and the interaction between occasions and persons (po) were small in magnitude, indicating stability of the measured scores. The referents facet (r) indicates little variation in mean scores between internal- and external-referenced shame. Likewise, the aspects facet (a) indicates little variation in mean scores between the condemnation and affective-behavioral components of shame. All the variance components appeared to be rather stable estimates based on the estimated standard errors (see Table 2).

Of specific interest was the small variance component “person by referents” (pr) compared to the large person component (p). The relative size of the person and the person by referent components indicates relatively strong stability in persons’ rank order across internal- and external-referenced shame. Likewise, the small variance component of “person by aspects” (pa) relative to the large person component (p) indicates relatively strong stability in persons’ rank order across the different aspects of trauma-related shame, consisting of condemnation of the self and affective-behavioral components of shame. The small variance components of “persons by referents” (pr), “persons by aspects” (pa) and “persons by referents by aspects” (pra) might indicate relatively strong correlations between the referents and between the aspects, respectively. These results motivate the use of multivariate generalizability analysis to further explore the internal structure in terms of the correspondence among the four fixed categories representing internal condemnation, internal affective-behavioral, external condemnation, and external affective-behavioral components of trauma-related shame. Thus, the univariate component structure suggested relatively strong correlations between the two referents categories (internal- and external-referenced shame), between the two aspects categories (condemnation and affective-behavioral components of shame), and also among all four categories formed by combining the referents and aspects categories.

Multivariate Generalizability Analysis

The univariate analysis does not explicitly provide information about the relationship between the referent and aspect categories. To further examine the internal structure of the TRSI in terms of the homogeneity of the scores, we utilized a multivariate generalizability study to estimate the covariance components structure of the four categories formed by the combinations of the two fixed facets in the present study. This design has been suggested by Brennan (1994, p. 189). The present multivariate design p• x o• x iº implies that all persons completed all items on both measurement occasions. The multivariate generalizability analysis is reported in Table 3.

The covariance components structure clearly suggested that the measured construct of trauma-related shame may be homogeneous in nature rather than consisting of distinct categories of internal and external-referenced shame. In the present multivariate study design, the largest components are the person components (p), representing the universe scores, which account for approximately 48 % and 51 % of the total variance for personal condemnation and the affective-behavioral component of internal shame, respectively, and about 27 % and 43 % of the total variance for perceived condemnation and the affective-behavioral component of external-referenced shame, respectively. The relative contributions to the composite universe score from the condemnation and affective-behavioral components within internal- and external-referenced shame were 25.36 %, 27.64 %, 23.61 % and 23.39 %, respectively (see Table 3). While the high correlations among the four universe scores support the existence of a general component, the relative contributions suggest that the four variables contributed about the same amount of variance to the composite shame score. In sum, both the relative size of the person component in the univariate analysis and the homogeneous composition of the covariance matrix for persons estimated by the multivariate analysis suggested that all four categories share a strong common underlying component.

Decision Studies

By systematically examining the various sources of measurement error, G-theory provides information for optimizing the design by reducing measurement error. Decision studies (D-studies) provide generalizability coefficients tailored to the intended use of the measurement. The variance and covariance components derived from the multivariate analysis in Table 3 were used as input for the estimation of generalizability coefficients in D-studies. At the D-study level, measurement conditions with different numbers of items (6, 3, 2 and 1) nested within each of the four aspect categories were specified. As presented in Table 4, the coefficients associated with both relative and absolute decisions are high even for four items. The results from the D-study analyses displayed little variation in the estimated G-coefficients and indexes of dependability across different numbers of indicators. This would imply that four items are sufficient for a generalizability coefficient and an index of dependability of .77 (see Table 4).
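The logic of a D-study, in which G-study variance components are divided by the intended sample sizes of the facets, can be sketched as follows. This is a simplified illustration for a fully crossed, random p x i x o design rather than the multivariate p• x o• x iº design analyzed here; the function name and all variance component values are hypothetical assumptions, not estimates from Table 3.

```python
def d_study(vc, n_i, n_o):
    """Return (relative G, Phi) for a crossed random p x i x o design.

    vc maps effect labels to G-study variance components for a single
    item and a single occasion.
    """
    # Relative error: interactions that involve persons.
    rel_error = vc["pi"] / n_i + vc["po"] / n_o + vc["pio"] / (n_i * n_o)
    # Absolute error: relative error plus facet main effects and their interaction.
    abs_error = rel_error + vc["i"] / n_i + vc["o"] / n_o + vc["io"] / (n_i * n_o)
    g = vc["p"] / (vc["p"] + rel_error)
    phi = vc["p"] / (vc["p"] + abs_error)
    return g, phi


# Hypothetical G-study variance components (illustrative assumptions only).
components = {"p": 0.50, "i": 0.05, "o": 0.01,
              "pi": 0.15, "po": 0.08, "io": 0.01, "pio": 0.20}

# Project the coefficients for decreasing numbers of items on one occasion.
results = {n_i: d_study(components, n_i=n_i, n_o=1) for n_i in (24, 12, 8, 4)}
for n_i, (g, phi) in results.items():
    print(f"n_i = {n_i:2d}: relative G = {g:.3f}, Phi = {phi:.3f}")
```

In a sketch like this, shortening the scale inflates only the error terms that are divided by the number of items, which is why the coefficients can decline slowly when the item-related components are small relative to the person component.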

Table 4 Estimated D-study Statistics for Total Scores Derived from the Multivariate G-study in Table 3 with Total Number of Items in Each D-Study Equal to 24, 12, 8 and 4, respectively

Reliability of Difference Scores Between Shame and Guilt

The third research question, involving differential construct validity between shame and guilt, was evaluated by estimating the reliability of the difference scores based on the variance components derived from the univariate generalizability study (Brennan 2001a, b) with the pi:c design. The pc component and the relative error component were estimated to be .17 and .82/22 = .039, respectively. Inserted in the relevant formula presented above, these estimates yielded a G-coefficient for the difference score of .813, based on the 22-item version of the guilt cognition scale and the TRSI. Thus the distinction between guilt cognition and shame is highly generalizable. The high generalizability coefficient of .813 in Equation 1 shows that the distinction between guilt and shame is highly reliable. The consistency of scores across items within the constructs is stronger than the consistency of scores across items from the two different constructs, indicating support for the two separate constructs of shame and guilt.
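As a worked check, the reported estimates can be inserted into the formula for the difference-score coefficient. The subscript "diff" and the layout of the symbols below are notational assumptions; the numerical values are the pc component (.17) and the relative error component (.039) reported above.

```latex
E\rho^2_{\mathrm{diff}}
  = \frac{\sigma^2_{pc}}{\sigma^2_{pc} + \sigma^2_{\delta}}
  = \frac{.17}{.17 + .039}
  \approx .813
```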

External Validation for Construct Validity Interpretation

A linear regression model was first utilized to examine the proposed relationship between TRSI and depression when controlling for the alternative interpretation represented by guilt. TRSI total scores were significantly related to the Self-Judgment scores (r = .52, p < .001). The correlations of TRSI with guilt cognitions and with depression were .58 (p < .001) and .49 (p < .001), respectively. Two separate multiple regression analyses were then executed to control for guilt cognitions. Significant unique relationships were obtained between TRSI and a) Self-Judgment and b) depression, indicated by partial standardized regression coefficients of .56 (p < .05) and .40 (p < .05), respectively. This finding indicates that only shame had a unique effect on Self-Judgment and depression, while guilt was not uniquely related to Self-Judgment and depression.
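The idea of a unique relationship after controlling for a second predictor can be illustrated with the standard two-predictor formula for standardized regression coefficients. This is a minimal sketch, not the analysis actually run in the study. The correlations of .52 and .58 are those reported above, whereas the correlation between guilt cognitions and Self-Judgment is not reported here, so the value used below is a purely hypothetical placeholder.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for regressing y on two predictors.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


r_sj_trsi = 0.52      # reported: TRSI with Self-Judgment
r_trsi_guilt = 0.58   # reported: TRSI with guilt cognitions
r_sj_guilt = 0.30     # HYPOTHETICAL placeholder, not reported in the text

beta_shame, beta_guilt = standardized_betas(r_sj_trsi, r_sj_guilt, r_trsi_guilt)
print(f"beta(shame) = {beta_shame:.3f}, beta(guilt) = {beta_guilt:.3f}")
```

When one predictor retains a sizeable coefficient and the other shrinks toward zero, the first is said to be uniquely related to the outcome, which is the pattern described for shame relative to guilt.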

Discussion

In this article, univariate and multivariate generalizability analyses were applied to examine the first research question regarding the psychometric properties and the internal structure of the Trauma-related Shame Inventory (TRSI) by estimating different sources of score variance. Both the univariate and the multivariate generalizability study displayed a large person component relative to the remaining sources of variance, including the interaction components involving persons, occasions and items. The results for the first research question indicated that the TRSI is sensitive enough to capture considerable variability between persons in trauma-related shame. The score variance to a large extent reflected individual differences in shame, with little influence from the remaining sources of variance representing items, occasions, aspects and referents categories. By including the facets of items, aspects, and occasions in the present measurement design, we estimated a generalizability coefficient that takes into consideration measurement errors related to variability in scores from one occasion to another and diversity in scores at the level of items, aspects and referent categories (internal- and external-referenced shame), along with different combinations of these sources of measurement error. Both the composite G-coefficient and the index of dependability for the measurement design p• x o• x iº consisting of 24 items were high, .87 and .87 respectively, suggesting that the TRSI yielded reliable scores and provided an acceptable index of shame in traumatized individuals with little influence of measurement error. The high index of dependability also suggests that the TRSI is precise enough to be used for screening purposes. In addition, the multivariate generalizability analysis revealed strong positive correlations among the components of shame, including the condemnation and affective-behavioral components of internal- and external-referenced shame.

In accordance with shame theory (Gilbert and Andrews 1998; Gilbert 2000), internal and external-referenced shame should be correlated because they are part of the same construct of shame. However, shame theory does not specify how strongly internal and external-referenced shame should be correlated. Correlations in the range of .82 to .90 between aspect categories across internal- and external-referenced shame suggest that a general component of trauma-related shame can be estimated. Despite the meaningfulness of the theoretical distinction between the two different evaluative perspectives in trauma-related shame, our data indicate a fusion of internal and external shame as measured by the TRSI. Lewis (1995) refers to the fusion of external and internal shame in his conceptualization of “the exposed self”. The perception of being judged and shamed by others has been understood by some researchers as a form of externalized projection of one’s own self-condemnation (Wilson et al. 2006); that is, experiencing the psychological pain of condemning oneself may result in fear linked to an expectation of devaluation by others. As such, when experiencing shame related to traumatic incidents, an individual may erroneously conclude on the basis of a critical self-evaluation that the outside world will turn against him or her (Wilson et al. 2006). As a consequence, the shame-ridden traumatized individual may hide perceived flaws and display avoidance behavior out of fear of condemnation and rejection by other people (Lindsay-Hartz 1984; Tangney 1991; Greenberg and Paivio 1997; Lewis 1995; Nathanson 1987; Stone 1992; Tomkins 1987).

The very high correlation between the two aspect conditions in both internal- and external-referenced shame suggests that responses to indicators of self-condemnation do not differ from responses to affective and behavioral indicators of trauma-related shame. This finding may illustrate that condemnation is a core feature of trauma-related shame, in which condemnation of the self is strongly related to the affective and behavioral components of shame. Broader theoretical views emphasize condemnation of the self in the very definition of shame (Lewis 1995; Tangney et al. 1996). Tangney defined shame as negative condemnation of the whole self (Tangney et al. 1996). Both Lewis’s (1995) and Gilbert’s (1997; Gilbert and Andrews 1998) theorizing of shame emphasize the negative self-evaluative aspect. A major contribution to the field was advanced by Lewis (1971), who contended that self-awareness is a prerequisite for the individual to experience shame. In this regard, the source of shame is one’s thoughts about oneself, which involves being self-absorbed in one’s own perceived personal defects. However, some researchers do not view shame as a self-conscious emotion, but as a primary physiological response to the threat of isolation, and thus ignore the role of self-evaluation in shame (Martens 2005). This is unfortunate because it limits the possibility of explaining why individuals have different shame cognitions in response to the same traumatic events. The results of our study support the notion advanced by most theorists in the field that condemnation of the self is a core component of shame.

The second research question relates to the design of a short-form measurement that provides optimal reliability. Applying G-theory allows the test constructor to estimate generalizability in a multi-facet measurement design. In addition, generalizability coefficients tailored to the specific use of the measurement are estimated for measurement designs with different numbers of items within each of the four components of shame. Within the framework of G-theory, the present study started with a larger number of items designed to accurately estimate the variance components. While 24 items were used to estimate the G-study variance components, the results from the D-studies suggest that eight items (that is, two items for each of the four fixed categories) are sufficient to provide composite generalizability and index of dependability coefficients above .80 for trauma-related shame scores, while four items are sufficient for a composite generalizability coefficient and an index of dependability of .77 (see Table 4). As noted by Smith et al. (2000), the precision of the instrument and the validity of the inferences drawn from its scores must be preserved, especially when a more lengthy assessment is deemed essential by the original test developers. However, reducing a full-scale measurement to a small set of items could narrow the measurement domain. As such, a possible loss of validity in exchange for saving time is not preferable. Because assessment scores are frequently used to make critical decisions with regard to clinical interventions, the use of short-form measures in clinical settings places heavy demands on measurement precision and validity. Decisions about whether patients need a clinical intervention targeting shame, or whether a patient has improved as a result of clinical intervention, call for research into whether a short-form version of an instrument is precise enough for this particular use.

The third research question relates to the divergent validity of shame and guilt. In the process of creating the TRSI, a sharp conceptual distinction between shame and guilt enabled us to create a measurement that separates these two self-evaluative emotions. While shame and guilt seem to be related to each other and often occur together, they are nevertheless treated as two distinct emotions in theory. The relationship between shame and guilt is meaningful, in that guilt may elicit shame through attribution processes if the individual sees his or her behavior as confirming evidence of personal failure. The high generalizability coefficient of the difference scores between guilt and shame of .821 in the present study indicates that shame and guilt are two separable and distinct emotions, in which the mean correlation among items within the TRSI is higher than the mean correlation among items across the TRSI and the TRGI. This particular generalizability estimation can be considered an aspect of construct validity labeled differential construct validity by Eikeland (1973).

The fourth research question relates to the relationship between the TRSI scale, the BDI, and the subscale of Self-Judgment. Recent developments in validity theory have emphasized some general principles inherent in the construct validity model (Kane 2001). A central point is the assessment of proposed interpretations of test scores, together with consideration of possible competing interpretations. The general question is whether the relationships found between TRSI, depression and self-judgment are due to the influence of a third set of variables (e.g., guilt). If scores on the TRSI are related to Self-Judgment and depression independent of guilt cognitions, then the results provide further supporting evidence for construct validity. The TRSI scale passed a stringent test, with the demonstration that the relationships between the TRSI and both depression and internal shame, as measured by the subscale Self-Judgment, were independent of the influence of guilt cognitions. Both the homogeneous composition of the TRSI scores and the high regression coefficients between the TRSI and Self-Judgment and depression while controlling for guilt cognitions provide supporting evidence for construct validity.

Clinical Implications

An assessment of trauma-related shame is essential given the revised DSM-V criteria for PTSD, which include shame and guilt. However, no available measure assesses shame in the context of trauma, and gold standard assessments of PTSD do not include shame and guilt. Furthermore, treatment studies of PTSD have focused on other emotions (e.g., fear, anger, guilt) or depression, and trauma-related shame is typically investigated only as a subordinate outcome in PTSD treatment trials. In particular, the recent focus on treatment of moral injury in war veterans suggests a role of trauma-related shame in PTSD, with implications for treatment (Smith et al. 2013; Steenkamp et al. 2013). The TRSI might then serve as a useful screening and evaluation measure as well as a measure of change in PTSD treatment in general, and in the treatment of veterans with PTSD in particular.

Our results indicate that shame in the context of trauma is a uniform phenomenon. Accordingly, shame can be treated by one treatment method, and there may be no need to develop specific methods for treating internal versus external shame.

Limitations

Results of the present study indicate a fusion of internal and external shame as measured by the TRSI. However, one cannot rule out the possibility that the operationalizations of internal and external shame in the TRSI items are too similar and thereby obscure distinct features of internal and external shame. Our strategy of using identically formulated indicators that differ only in their reference to internal or external shame could lead to more homogeneous responses in which relevant aspects of this distinction are not made explicit. On the other hand, the identical formulation of indicators with reference to internal and external shame enabled us to estimate the covariance components structure of the four aspect conditions in the present multivariate measurement design. The two referenced norms, internal- and external-referenced shame, nevertheless represent an important issue that deserves more attention in the future.

The results displayed overall high correlations between the four universe scores. However, a correlation value of 1.044 in the multivariate generalizability study in Table 3 is slightly out of range. Given the small sample of 50 persons this value may be caused by sampling error rather than misspecification of the model. The small sample size of 50 participants in the present study has the potential to result in unstable estimates. However, all the variance components appeared to be rather stable estimates based on the estimated standard errors in a univariate G-study (see Table 2).
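An out-of-range value of this kind can arise because a universe-score correlation is computed as a ratio of separately estimated covariance and variance components, and such an estimate is not constrained to lie between -1 and 1. The following minimal sketch illustrates this under stated assumptions: the function name and the component values are hypothetical and are not taken from Table 3.

```python
import math


def universe_score_correlation(cov_p, var_p1, var_p2):
    """Estimated correlation between universe scores of two fixed categories."""
    # Ratio of the person covariance component to the geometric mean of the
    # two person variance components; the estimate itself is unbounded.
    return cov_p / math.sqrt(var_p1 * var_p2)


# Hypothetical estimates (illustrative assumptions only).
r = universe_score_correlation(cov_p=0.50, var_p1=0.50, var_p2=0.46)
print(f"estimated universe-score correlation = {r:.3f}")
```

With a small sample, sampling error in any of the three estimates can push the ratio slightly above 1 even when the true correlation is high but admissible.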

Future Research

While the results from this study are encouraging, replication studies are needed to investigate whether similar results emerge in other independent samples, preferably with the use of confirmatory factor analysis in a larger sample. A possible extension of this inquiry would be to measure the TRSI components with shorter scales than the one used in this domain study. The selection and reduction of item indicators to eight or four items may be better assessed with confirmatory factor analysis or item response theory as supplements to G-theory.

Conclusion

While the availability of scales for identifying shame in traumatized patients appears to be crucial, there is to date no measure for assessing shame within the context of trauma. The present study provides support for using the 24-item version of the TRSI in both clinical practice and research. Assessing trauma-related shame in clinical research can enhance our understanding of negative emotions following trauma. This instrument is also suitable for investigating clinically relevant, theory-derived hypotheses regarding trauma-related shame as a maintaining factor in treatment studies of PTSD. A high index of dependability, indicating low influence of absolute measurement error, suggests the utility of the TRSI for screening purposes. The dependability of the scores allows both researchers and clinicians to use the TRSI as a screening instrument, as well as an assessment of shame, with a high degree of confidence. This increased confidence has the potential to support early identification of shame in traumatized individuals and the choice of an intervention tailored to the patient’s symptom profile.