According to the cognitive theory of personality disorders, each personality disorder is characterized by a specific set of dysfunctional beliefs (Beck, Freeman, Davis, & Associates, 2003). Cognitive therapy for personality disorders emphasizes the identification and modification of these dysfunctional beliefs (Beck et al., 2003). The Personality Belief Questionnaire (PBQ; Beck & Beck, 1991) was developed as a clinical and research instrument to assess dysfunctional beliefs associated with individual personality disorders on Axis II of the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 1994). The PBQ contains 126 items (9 scales, 14 items per scale). A shorter and more refined version of the PBQ is desirable for clinical and research purposes. In this article we report the development of and preliminary psychometric findings for the PBQ Short Form (PBQ-SF).

The items for the original PBQ were derived from clinical investigations and theoretical considerations and were first published as a list of schemas for the various personality disorders (Beck, Freeman, & Associates, 1990). Cognitive conceptualizations of each disorder were derived that linked the behavioral markers for a disorder with corresponding dysfunctional assumptions and beliefs. For instance, the behavioral manifestations of dependent personality disorder include submissiveness and excessive reliance on the approval and support of a strong ally. Underlying these behavioral patterns are beliefs such as “I’m helpless and can’t cope as other people can.” By way of contrast, behavioral correlates of narcissistic personality disorder include arrogant and haughty behaviors and demands for special treatment. These behaviors, in turn, correspond to underlying beliefs such as “Because I am special, others should put my wants above theirs.”

Although each personality disorder is associated with pervasive and persistent impairments in functioning, the specific form and level of impairment will vary across disorders. According to the DSM-IV (American Psychiatric Association, 1994), patients with personality disorders in Cluster A (paranoid, schizoid, or schizotypal) often appear odd or eccentric; those with personality disorders in Cluster B (antisocial, borderline, histrionic, or narcissistic) often appear dramatic, emotional, or erratic; and those with personality disorders in Cluster C (avoidant, dependent, or obsessive-compulsive) often appear anxious or fearful. Such descriptive differences between personality disorders may be reflected in different patterns of beliefs as well as different clinical symptoms. For instance, patients with internalizing disorders such as avoidant or dependent PD may be relatively more prone to depression and anxiety than patients with externalizing disorders, such as antisocial or narcissistic PD (Krueger, 1999). Although patients with externalizing disorders certainly experience depression and anxiety, their self-serving biases and behaviors also reflect a degree of fearlessness and blamelessness (Lilienfeld, 1994).

Trull, Goodwin, Schopp, Hillenbrand, and Schuster (1993) examined the PBQ in a college student sample. They found that the internal consistency of the PBQ scales was good (Cronbach’s alphas ranged from .77 to .93). They found modest correlations between the PBQ and both the Personality Disorder Questionnaire-Revised (Hyler, Skodol, Oldham, Kellman, & Doidge, 1992) and the MMPI-PD (Morey, Waugh, & Blashfield, 1985). Using a German version of the PBQ, Fydrich, Schmitz, Hennch, and Bodem (1996) found that the reliability of the scales in a sample of 282 psychiatric patients was good (alphas ranged from .78 to .91). They also found that the PBQ scales correlated moderately with SCID-II trait scores (median correlation = .32). In the largest clinical study to date on the PBQ, each of five PBQ scales (avoidant, dependent, obsessive-compulsive, narcissistic, and paranoid) was found to be specifically associated with the corresponding SCID-diagnosed personality disorder (Beck et al., 2001).

The development of the PBQ-SF proceeded in two stages. In the first stage we used archival data of psychiatric outpatients who completed the PBQ between 1995 and 2001. We identified the seven highest loading items from each of the 14-item PBQ scales and used these items to calculate experimental short form scales. We then tested the reliability and criterion validity of these experimental scales using the archival data set. In the second stage we used the items from the experimental scales to construct the PBQ-SF. We administered the PBQ-SF to a new sample of psychiatric outpatients and evaluated the reliability and construct validity of the PBQ-SF scales in this independent sample. The two stages of scale development and testing are reported herein as two separate studies.

Study 1: experimental short form scales

Method

Participants

Nine hundred and twenty adult psychiatric outpatients were administered the PBQ between 1995 and 2001 at either the Center for Cognitive Therapy at the University of Pennsylvania or the Beck Institute for Cognitive Therapy and Research, both in the Greater Philadelphia area, Pennsylvania, USA. The mean age of this sample was 36.4 years (SD = 11.1; range 18–76) and there were 515 (55%) women and 405 men. In this sample there were sufficient numbers of patients with personality disorders to examine the criterion validity of five PBQ scales: avoidant, dependent, obsessive-compulsive, narcissistic, and paranoid. There were 79 patients with a primary Axis II diagnosis of avoidant personality disorder, 26 with a primary Axis II diagnosis of dependent PD, 58 with a primary Axis II diagnosis of obsessive-compulsive PD, 26 with narcissistic PD, and 27 with paranoid PD.

Procedures

All patients went through a comprehensive intake evaluation, which included the Structured Clinical Interview for the DSM-III-R Personality Disorders (SCID-II; Spitzer, Williams, Gibbon, & First, 1992) or, for those whose diagnostic evaluation occurred after January 1996, the Structured Clinical Interview for the DSM-IV Personality Disorders (First, Spitzer, Gibbon, Williams, & Benjamin, 1995). All assessors were postdoctoral clinicians who received at least 2 weeks of training on the SCID-II prior to conducting diagnostic evaluations. Training was overseen by the research coordinator and consisted initially of reading the SCID manual, direct instruction in the protocol, and in-session observation of experienced raters on at least two occasions followed by comparison and discussion of ratings for these sessions. An experienced rater then observed the trainee conduct at least one interview (more if necessary) and provided instruction as needed. Following training, complex cases were routinely discussed and diagnostic questions resolved in weekly group meetings presided over by the research coordinator. Patients whose Axis II diagnosis remained ambiguous after this discussion were classified as “diagnosis deferred” and their data were excluded from this study. Patients completed the PBQ as part of a packet of questionnaires completed during the intake process. Further details on subjects and data collection procedures for this sample are reported in Beck et al. (2001).

Statistical analyses

Using the sample of all patients who completed the PBQ, corrected item–total correlations were computed for each of the 9 PBQ scales. The seven items from each scale with the highest item–total correlations were summed to create the experimental short form scales. A multivariate analysis of variance (MANOVA) was conducted followed by univariate ANOVAs. A preliminary MANOVA was conducted which included sex as a factor. The main effect for sex was not significant, F(5, 357) = 1.09, and neither was the interaction effect for sex by diagnostic group, F(25, 1327.70) = .93. Hence, the data from both sexes were combined for subsequent analyses. Predicted between-group differences were analyzed with independent t-tests. Predicted within-group differences were analyzed with paired t-tests. Given that predictions were strongly related to cognitive theory, all t-tests were one-tailed. Alpha was set at .01 to correct for inflation of Type I error rate.
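The item-selection step described above can be sketched in code. The following Python example is a minimal illustration and not the analysis script used in the study: the function names (`pearson`, `corrected_item_total`, `top_items`) and the toy ratings are hypothetical, and the sketch assumes complete data with nonzero variance on every item. A corrected item–total correlation correlates each item with the sum of the remaining items on its scale, so that an item does not inflate its own total.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def corrected_item_total(responses):
    """Correlate each item with the sum of the *other* items on its scale.

    `responses` holds one list of item ratings (0-4) per respondent.
    """
    n_items = len(responses[0])
    correlations = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]  # scale total minus the item
        correlations.append(pearson(item, rest))
    return correlations


def top_items(responses, k):
    """Indices of the k items with the highest corrected item-total correlations."""
    r = corrected_item_total(responses)
    ranked = sorted(range(len(r)), key=lambda i: r[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical ratings: 5 respondents x 4 items (not data from the study).
toy = [
    [0, 1, 0, 3],
    [1, 1, 2, 0],
    [2, 3, 2, 1],
    [3, 3, 4, 2],
    [4, 4, 3, 0],
]
selected = top_items(toy, k=3)  # keeps the three items that track the rest of the scale
```

In the study the same idea would apply with 14 items per scale and k = 7; the toy data use 4 items and k = 3 only to keep the example short.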

Results

The internal consistency reliabilities (Cronbach’s alpha) of the experimental short form scales were as follows: Avoidant (.84), Dependent (.89), Passive-Aggressive (.86), Obsessive-Compulsive (.90), Antisocial (.80), Narcissistic (.83), Histrionic (.89), Schizoid (.79), and Paranoid (.91).
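For readers who wish to compute comparable reliabilities, Cronbach's alpha can be obtained from item-level ratings with the standard formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The Python sketch below is illustrative only; the function name `cronbach_alpha` and the example ratings are hypothetical, and complete data are assumed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    `responses` holds one list of item ratings per respondent.
    """
    n_items = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings in which every respondent answers all items identically,
# the limiting case in which alpha reaches 1.
perfectly_consistent = [[0, 0, 0], [2, 2, 2], [4, 4, 4]]
alpha = cronbach_alpha(perfectly_consistent)
```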

The results of a MANOVA testing the six groups (avoidant, dependent, obsessive-compulsive, narcissistic, paranoid, and no personality disorder) on the five relevant experimental scales indicated a significant overall effect, Wilks’ lambda = .54, F(25, 1357) = 9.94, P < .0001. Univariate ANOVAs were significant for each of the five scales tested: Avoidant, F(5, 369) = 27.62, P < .0001; Dependent, F(5, 369) = 17.40, P < .0001; Obsessive-Compulsive, F(5, 369) = 6.95, P < .0001; Narcissistic, F(5, 369) = 4.49, P = .001; and Paranoid, F(5, 369) = 8.44, P < .0001.

Table 1 shows results of separate between-group comparisons on each of the experimental scales. The mean of the criterion group for each scale is displayed in bold font. The differences between the criterion-group mean and the means of other personality disorder groups on the same scale are listed in the third column (“Mean diff.”). Results of t-tests comparing these mean differences and corresponding effect sizes are listed on the right. Thus, on the experimental avoidant scale, the criterion group (patients with avoidant PD) had a mean of 15.05 and this mean was significantly higher than the mean of patients with obsessive-compulsive PD (8.65), narcissistic PD (7.18), and no PD (7.05). The corresponding effect sizes were 1.16, 1.44, and 1.49, respectively. These effect sizes are each large by Cohen’s (1988) standard (>.80). Patients with avoidant PD differed from paranoid PD patients at the P < .05 level; however, this difference is not statistically significant after Bonferroni correction. It is worth noting that the effect size for this difference (.60) suggests that a larger sample of paranoid patients would have yielded a statistically significant difference. Avoidant PD patients did not differ from dependent PD patients on the experimental avoidant scale. Within-group analyses (paired t-tests) showed that avoidant PD patients scored significantly higher on the experimental avoidant scale than on any of the other experimental scales (all P’s < .0001).
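The effect sizes discussed above are standardized mean differences. As an illustration, the Python sketch below computes Cohen's d using a pooled standard deviation. This is an assumption: the article does not state which variant of d was calculated, and the function name `cohens_d` and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b), pooled-SD variant."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scale scores for a criterion group and a comparison group.
criterion = [2, 4, 6]
comparison = [0, 2, 4]
d = cohens_d(criterion, comparison)  # mean difference of 2 over a pooled SD of 2
```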

Table 1 Results of between-group analyses of the experimental scales

The experimental dependent scale yielded consistently supportive findings. The criterion group scored significantly higher than each of the comparison groups and the effect sizes were all large. A series of paired contrasts showed that dependent PD patients scored significantly higher on the dependent scale than on any of the other experimental scales (all P’s < .01).

Between-group results for the experimental obsessive-compulsive scale were mixed. While patients with obsessive-compulsive PD scored higher than did patients with paranoid PD or no PD, they did not differ from patients with avoidant or dependent PD on this scale, and they differed only marginally from patients with narcissistic PD. Effect sizes were in the small-to-moderate range. Within-group analyses for this scale were supportive. Paired t-test results indicated that patients with obsessive-compulsive PD scored significantly higher on the experimental obsessive-compulsive scale than on any of the other experimental scales (all P’s < .05).

As expected, patients with narcissistic PD scored significantly higher on the experimental narcissistic scale when compared with each of the other patient groups. Moreover, the size of these effects was generally large (ranging from .77 to 1.44). The results of paired t-tests showed that patients with narcissistic PD scored significantly higher on this scale than they did on any of the other experimental scales (all P’s < .01).

Finally, the experimental paranoid scale performed according to expectations for the most part. Patients with paranoid PD scored much higher on this scale than did other groups and the mean differences were significant in most cases. The only difference that did not reach significance was with narcissistic PD patients. Here again it is worth noting that there was a moderate effect size (d = .60) but limited statistical power (df = 18) made it difficult to tell if this represents a reliable difference. Paired t-test results indicated that paranoid patients scored significantly higher on the experimental paranoid scale than they did on any other of the experimental scales (all P’s < .05).

Figure 1 displays profiles of experimental scale scores for each of the five personality disorder groups. The scales were transformed to z scores using the original sample (N = 920) prior to profile graphing.
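The z-score transformation mentioned here can be sketched as follows. The example is illustrative only: the function name `standardize` and the scores are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the article does not specify it. Each group's scores are standardized against the mean and standard deviation of a reference sample, so that a profile value of 0 corresponds to the reference-sample average on that scale.

```python
from statistics import mean, stdev


def standardize(scores, reference):
    """Express scores as z scores relative to a reference sample."""
    ref_mean = mean(reference)
    ref_sd = stdev(reference)  # sample SD; an assumption, see the text above
    return [(x - ref_mean) / ref_sd for x in scores]


# Hypothetical reference-sample scores on one scale and three scores to convert.
reference_sample = [0, 2, 4]
z_scores = standardize([4, 2, 0], reference_sample)
```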

Fig. 1 Profiles for five personality disorders using the 7-item experimental PBQ scales. AVO, Avoidant; DEP, Dependent; PAS, Passive-aggressive; OBS, Obsessive-compulsive; ANT, Antisocial; NAR, Narcissistic; HIS, Histrionic; SCH, Schizoid; and PAR, Paranoid

Discussion

In Study 1 all of the experimental scales showed good internal consistency. Criterion validity was examined by comparing experimental scale scores across relevant SCID-derived diagnoses. Findings were largely supportive. Twenty-one (84%) of the 25 between-group tests showed that patients with the criterion personality disorder scored higher on the respective experimental scale than did patients with an alternative personality disorder or no personality disorder. Also, results from within-group analyses showed that the five personality disorder groups we examined scored higher on their corresponding experimental scale than on the alternative experimental scales. This adds further support for the criterion validity of five of the experimental scales.

Although the experimental obsessive-compulsive scale yielded supportive within-group findings (patients with obsessive-compulsive PD scored higher on this scale than on the alternative scales), the between-group findings were mixed. The only groups that the scale reliably discriminated from obsessive-compulsive PD were paranoid PD and no PD. These findings are inconsistent with findings obtained using the full PBQ obsessive-compulsive scale (Beck et al., 2001), where patients with obsessive-compulsive personality disorder scored significantly higher than each of the other personality disorder groups except those with narcissistic personality disorder.

Study 2: construction and evaluation of the PBQ-SF

Our major goal in Study 2 was to investigate how the experimental scales from Study 1 performed psychometrically when administered as a new measure—the PBQ-SF. We were particularly interested in examining the internal consistency, test–retest reliability, and construct validity of the PBQ-SF in psychiatric patients.

A portion of the patients who participated in Study 2 completed a packet of questionnaires for a separate study on the role of daily stress and coping. Some of the questionnaires from this concurrent study assessed constructs of interest for validating the PBQ-SF. In particular, measures of depression and anxiety were considered to be pertinent to PBQ-SF scales associated with personality disorders in the anxious-fearful cluster. A measure of self-esteem was included since this construct should relate differently to PBQ-SF scales assessing self-promoting beliefs (e.g., narcissistic or antisocial) versus self-devaluing beliefs (e.g., avoidant or dependent). The Dysfunctional Attitudes Scale (Weissman & Beck, 1978) was included since it contains attitudes associated with depression and may show associations with the PBQ-SF similar to those of self-esteem. A measure of social support was included to examine the construct validity of PBQ-SF scales emphasizing problems in attachment (e.g., dependent, schizoid, and paranoid). Neuroticism was included as a proxy for general psychic distress and maladjustment. Lastly, a general measure of psychosocial functioning was included since each of the PBQ-SF scales measures beliefs that are presumed to impair functioning.

In constructing the PBQ-SF, the items from the experimental scales were randomly distributed and the original scaling and instruction set from the PBQ were retained. That is, patients were asked to indicate how much they believe each statement most of the time by circling numbers using the following scale: 0 = Not at all, 1 = Slightly, 2 = Moderately, 3 = Very much, and 4 = Totally.

Method

Participants

Subjects entered treatment at the Beck Institute for Cognitive Therapy and Research during 2003–2004. They included 160 patients, 93 (58%) females and 67 males, whose average age was 39.8 (SD = 14.2). Forty percent of these patients were married, 30% were single, 18% were divorced or separated, and 2% were widowed. The racial make-up of the sample was 86% Caucasian, 5% African–American, 3% Asian, 3% Hispanic, and 3% other. Employment status broke down as follows: 56% employed, 10% students, 4% homemakers, 18% unemployed, 8% disabled, and 4% retired. The educational level of the sample was higher than typical, with 40% having an advanced degree, 30% having a college degree, 20% having some college, and 10% having a high school diploma. The distribution of primary Axis I diagnoses in this sample was 53% affective disorders, 28% anxiety disorders, 10% adjustment disorders, and 9% other disorders. Thirty-one percent of the sample had an Axis II diagnosis. These included 9 patients with avoidant PD, 7 with obsessive-compulsive PD, 7 with borderline PD, and 26 with PD not otherwise specified.

Measures

Depression

Patients’ level of depression was measured with the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996). The BDI-II is a 21-item self-report questionnaire that measures the level of patients’ depressive symptoms. A number of studies document the reliability and validity of the BDI-II (Beck et al., 1996; Dozois, Dobson, & Ahnberg, 1998).

Anxiety

Patients’ level of anxiety was assessed using the Beck Anxiety Inventory (BAI; Beck & Steer, 1990). The BAI is a 21-item self-report measure that assesses symptoms of anxiety. The scale has been found to have strong psychometric properties (Steer & Beck, 1997).

Dysfunctional attitudes

Depressive attitudes were assessed using the Dysfunctional Attitudes Scale (DAS; Weissman & Beck, 1978). The DAS is a self-report questionnaire containing dysfunctional attitudes frequently found in psychiatric patients and associated with vulnerability to depression. The 40-item short form of the DAS (Form A) was used in the current study. This instrument has been found to correlate strongly with the full DAS (r = .84) and is predictive of a diagnosis of major depressive disorder (Nelson, Stern, & Cicchetti, 1992). Cronbach’s alpha for the present sample was .95.

Neuroticism

Neuroticism was assessed with the 12-item Neuroticism Scale of the NEO Five Factor Inventory (NEO-FFI; Costa & McCrae, 1992). This scale contains items that tap anxiety, hostility, depression, self-consciousness, impulsiveness, and vulnerability. The NEO-FFI is a commonly used personality measure with well-established reliability and validity (Costa & McCrae, 1992). For the current sample Cronbach’s alpha was .76.

Self-esteem

The Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965) was used to assess patients’ level of self-esteem. This widely used measure consists of 10 self-report items on a 4-point Likert scale, which are summed to produce a global self-esteem score. Adequate reliability and validity of the RSES have been demonstrated in previous studies (Corwyn, 2000; Rosenberg, 1965). Cronbach’s alpha in this sample was .88.

Social support

Perceived social support was assessed using the Social Provisions Scale (SPS; Russell & Cutrona, 1984). This 24-item self-report questionnaire asks respondents to rate the degree to which their social relationships are currently supplying various social provisions. Provisions in six areas are assessed: guidance, tangible assistance, caring, similarity of interests and concerns, reassurance of worth, and opportunity to provide nurturance. Previous studies have shown evidence of high internal consistency for the total scale score, with Cronbach’s alpha ranging from .85 to .92 across a variety of populations (Russell & Cutrona, 1984). The scale has shown predicted associations with measures of loneliness, depression, and life satisfaction (Russell & Cutrona, 1984). Cronbach’s alpha for the total score in the present sample was .91.

Psychosocial functioning

Psychosocial functioning was assessed using the Progress Evaluation Scales (PES; Ihilevich & Gleser, 1979, 1982). The PES consist of seven Guttman type scales, each measuring a particular area of functioning: family interaction, occupation, getting along with others, feelings and mood, use of free time, problems, and attitudes toward self. Each scale has five levels that represent a continuum of adjustment, from the most pathological to the healthiest level observed in the community. Two versions of the instrument were used in this study: a patient self-report version (PES-P) and a version completed by the clinician conducting the intake evaluation (PES-C). The PES scales have been found to correlate with standardized rating scales and show satisfactory convergent and divergent validity (Ihilevich & Gleser, 1982). Intraclass correlations between clinician ratings and patient ratings in the current sample ranged from .73 to .95. For the purposes of this study we averaged the patient and clinician ratings for each scale and summed these values to produce a PES total score for each patient. Intraclass correlations for the averaged scale ratings ranged from .83 to .96, and Cronbach’s alpha for the total score was .75.
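The scoring rule for the PES total score (average the patient and clinician ratings on each scale, then sum across scales) can be written out as a short Python sketch. The function name `pes_total` and the ratings are hypothetical, and the example assumes that both raters supplied a rating on every scale.

```python
def pes_total(patient_ratings, clinician_ratings):
    """Sum of per-scale averages of patient and clinician ratings."""
    if len(patient_ratings) != len(clinician_ratings):
        raise ValueError("both raters must rate the same number of scales")
    averaged = [(p + c) / 2 for p, c in zip(patient_ratings, clinician_ratings)]
    return sum(averaged)


# Hypothetical ratings (levels 1-5) on the seven PES scales.
patient = [3, 4, 2, 3, 5, 1, 4]
clinician = [3, 2, 2, 5, 5, 3, 4]
total = pes_total(patient, clinician)  # per-scale averages: 3, 3, 2, 4, 5, 2, 4
```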

Procedure

All patients were assessed and diagnosed at intake by experienced doctoral-level clinicians using the Structured Clinical Interview for DSM (SCID) for Axis I and Axis II disorders. Patients completed the BDI-II, BAI, and PES-P prior to their intake appointment. They completed the PBQ-SF, DAS, Neuroticism Scale, RSES, and SPS at home during the week before their first therapy session.

Results

Table 2 displays the means, standard deviations, internal consistency reliabilities, and intercorrelations of the nine PBQ-SF scales. Cronbach’s alpha coefficient for the total PBQ-SF score was .97. The mean total score was 81.28 (SD = 42.70). As can be seen in Table 2, the alpha coefficients for the individual PBQ-SF scales were good and ranged from .81 for the avoidant and narcissistic scales to .92 for the paranoid scale. The high intercorrelations among the PBQ-SF scales (median r = .57) indicate that the scales share a nontrivial amount of variance (squared median r = .32).

Table 2 Means, SD, internal consistency reliabilities, and intercorrelations of the PBQ-SF scales

Thirty-six patients completed the PBQ-SF at intake and again 4 weeks later. Test–retest correlations for the PBQ-SF scales during this interval were: Avoidant .67, Dependent .80, Passive-Aggressive .80, Obsessive-Compulsive .82, Antisocial .57, Narcissistic .74, Histrionic .78, Schizoid .74, and Paranoid .72.

Table 3 displays correlations of the nine PBQ-SF scales with other clinical variables. All nine scales showed significant positive correlations with depression as measured by the BDI-II. Seven out of nine scales correlated significantly and positively with anxiety as measured by the BAI, the antisocial and narcissistic scales being the exceptions. The BDI-II and BAI were most strongly correlated with the dependent scale and showed the lowest correlations with the narcissistic scale. The DAS was highly and positively correlated with the avoidant and obsessive-compulsive scales and also correlated moderately with the dependent, passive-aggressive, histrionic, and paranoid scales. Neuroticism correlated significantly and positively with virtually all of the PBQ-SF scales (the correlation with the histrionic scale approached significance, P = .07). Self-esteem (RSES) and social support (SPS) showed the lowest correlations, all in the negative direction. Self-esteem was significantly lower in patients who scored high on the avoidant, obsessive-compulsive, and paranoid scales. Social support was significantly lower among patients who scored high on the antisocial and paranoid scales. Psychosocial functioning as measured by the PES showed significant correlations with each of the PBQ-SF scales. However, the correlations with the antisocial and narcissistic scales were in the positive direction, suggesting relatively higher functioning among patients who endorse antisocial and narcissistic beliefs. Lastly, the PBQ-SF total score correlated significantly in the expected direction with all of the other clinical variables.

Table 3 Correlations of PBQ-SF scales with other clinical variables

The large number of significant correlations in Table 3 raises the possibility that the PBQ-SF scales are assessing a general distress factor. This possibility is consistent with two other findings: the correlations of all of the clinical variables with the PBQ-SF total score, and the intercorrelations among the PBQ-SF scales (see Table 2). This would suggest that an increase in general distress will elevate a patient’s overall PBQ-SF profile. In our next set of analyses we investigated how well the PBQ-SF scales assess individual differences in personality disorder beliefs when the effect of general distress is held constant.

To control for a general distress factor we calculated ipsatized PBQ-SF scale scores for each patient. Specifically, for each patient we subtracted the mean of their nine scale scores from each of their individual scale scores (see Greer & Dunlap, 1997 for a review of this method). For example, a patient whose mean score across all nine scales is 8.5 and whose raw score on the Avoidant scale is 12 would have an ipsatized Avoidant scale score of 3.5. Conceptually, ipsatized scores quantify how much a specific scale “stands out” in a profile regardless of the elevation of the profile as a whole. A profile of ipsatized scores retains the original proportions of the peaks and valleys shown in a profile of normative scores while controlling for the effect of general distress (as well as other general influences such as response sets).
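The ipsatizing procedure just described amounts to centering each patient's profile on its own mean. The Python sketch below illustrates this under stated assumptions: the function name `ipsatize` and the profile values are hypothetical and are not data from the study.

```python
def ipsatize(scale_scores):
    """Subtract a patient's mean scale score from each of their scale scores.

    `scale_scores` maps scale name to raw score for a single patient.
    """
    profile_mean = sum(scale_scores.values()) / len(scale_scores)
    return {name: score - profile_mean for name, score in scale_scores.items()}


# Hypothetical raw profile for one patient on the nine scales (mean = 8).
profile = {
    "Avoidant": 12,
    "Dependent": 10,
    "Passive-Aggressive": 8,
    "Obsessive-Compulsive": 11,
    "Antisocial": 4,
    "Narcissistic": 5,
    "Histrionic": 7,
    "Schizoid": 6,
    "Paranoid": 9,
}
ipsatized = ipsatize(profile)  # e.g. Avoidant becomes 12 - 8 = 4
```

By construction, a patient's ipsatized scores sum to zero, so they carry information only about the shape of the profile and not about its overall elevation.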

Table 4 displays the correlations between ipsatized scores and other clinical variables. These correlations represent associations between the predominance of a particular PBQ-SF scale in a PBQ-SF profile and other clinical variables. For example, starting with the leftmost column of correlations in Table 4, one can see that PBQ-SF profiles with a peak on the avoidant scale were associated with higher dysfunctional attitudes (DAS), lower self-esteem (RSES), poorer psychosocial functioning (PES), higher depression (BDI-II) and higher anxiety (BAI). Profiles with a peak on the dependent scale were associated with higher depression and anxiety symptoms and poorer psychosocial functioning. Profiles with a relative elevation on the obsessive-compulsive scale were associated with higher dysfunctional attitudes and lower self-esteem. The predominance of the narcissistic scale in a profile was associated with fewer dysfunctional attitudes, higher self-esteem and psychosocial functioning, and lower depression and anxiety. A similar pattern was apparent when the antisocial scale predominated in a profile. Profiles with a peak on the paranoid scale were associated with lower scores on self-esteem, social support, and psychosocial functioning. The ipsatized scale scores for the passive-aggressive, histrionic, and schizoid scales generally did not correlate with the other clinical variables. The one exception was a negative correlation between the ipsatized schizoid score and the DAS.

Table 4 Correlations of ipsatized PBQ-SF scale scores with other clinical variables

Comparing Tables 3 and 4, one can quickly see some striking differences. Ipsatizing scores markedly reduced the number of significant correlations obtained, and the direction of association was reversed in many cases, particularly with regard to the antisocial and narcissistic scales.

Discussion

The first set of findings in Study 2 provided support for the reliability of the PBQ-SF. Estimates of internal consistency were in the good range and test–retest coefficients across a four-week interval were in the adequate-to-good range. These findings are noteworthy given that each scale contains only seven items.

The main set of findings from Study 2 pertains to the construct validity of the PBQ-SF. Overall, we found that the PBQ-SF scales correlated significantly with an array of clinical variables. However, this finding may have been due in part to the influence of a general distress factor on PBQ-SF scores. The possible influence of a general distress factor may also account for the moderately high intercorrelations among the PBQ-SF scales. When the construct validity analyses were repeated with ipsatized scores (which control for general factors affecting all scales), a pattern of coefficients emerged that was more consistent with theoretical formulations of the disorders represented. For instance, PBQ-SF scales for avoidant, dependent, and obsessive-compulsive personality disorders (the “anxious-fearful” cluster) correlated positively with measures of anxiety and depression, or depression-proneness (i.e., the Dysfunctional Attitudes Scale, see Segal & Ingram, 1994; and the Rosenberg Self-Esteem Scale, see Roberts, Kassell, & Gotlib, 1995). PBQ-SF scales representing personality disorders that are characterized by externalization and self-aggrandizement (antisocial and narcissistic) showed significant correlations with the very same variables, but in the opposite direction.

Interestingly, while virtually all of the normative PBQ-SF scales showed significant positive correlations with neuroticism, none of the ipsatized scales did. This lends credence to the proposition that the overall elevation of a PBQ-SF profile is associated with a general distress factor, whereas variability of individual scales within a PBQ-SF profile is associated with disorder-specific factors.

General discussion

We set out to create an abbreviated version of the PBQ that would be practical for clinical and research purposes. In Study 1 we identified items that best represented each PBQ scale empirically and created experimental short form scales. Five of these scales generally discriminated patients with the corresponding personality disorders from each other and from patients with no personality disorder. The weakest set of findings was for the experimental obsessive-compulsive scale, which yielded mixed results. In Study 2, the PBQ-SF was tested in an independent sample and found to have good internal consistency and test–retest reliability. An investigation of the measure’s construct validity suggested that it captures disorder-specific symptoms as well as general psychological distress. The general distress factor appears to be responsible for overall PBQ-SF profile elevation, whereas disorder-specific symptoms are reflected in the spread of scale scores around the profile mean.

Some limitations of this research should be kept in mind when interpreting the findings. Although the SCID-II interviewers in Study 1 received thorough training and diagnostically complex cases were systematically reviewed, we did not obtain inter-rater reliability estimates for the Axis II diagnoses. Thus, the reliability of the independent variable in the criterion validity analyses is uncertain. This is not a major threat to the criterion validity analyses, however, since low reliability would only limit our ability to find true effects. A limitation of Study 2 was the relatively small sample sizes for the correlational analyses involving neuroticism, social support, dysfunctional attitudes, and self-esteem. Although the subsamples were large enough to detect moderate effects at a power of .80 (Cohen, 1988), these analyses would not detect small but true effects. Thus, neuroticism, social support, dysfunctional attitudes, and self-esteem may account for more of the variance in PBQ-SF profiles than we were able to ascertain in this study. It is also worth noting that the subsamples included proportionally fewer personality-disordered patients compared with the whole sample (12.5% vs. 31%, respectively). It is unclear how this difference might have influenced our findings.

The findings from this investigation are considered preliminary. Future research is needed to investigate the factorial structure of the PBQ-SF as well as its sensitivity to treatment effects. Since experimental scales were used in Study 1 to evaluate criterion validity, it is premature to say with confidence that the corresponding PBQ-SF scales would discriminate between personality disorders in the same way. The criterion validity of the passive-aggressive, antisocial, histrionic, and schizoid PBQ-SF scales remains to be tested. Nevertheless, based on the findings from this investigation, the PBQ-SF appears to hold promise as a practical measure of personality disorder beliefs.