The construct of psychopathy has a long and esteemed history in both criminal forensic psychology research and clinical practice. The broad essentials of psychopathy include personality, affective, and behavioral qualities that breach explicit and implicit societal principles (Cleckley 1941; Hare 2003). The personality characteristics of psychopathy include charisma, domineering egocentricity, as well as the indifferent and deliberate exploitation of others. The affective characteristics of psychopaths include anomalously shallow and unpredictable levels of emotion; insincere commitments to personal goals, interpersonal relationships, and societal principles; and deficiencies in guilt, empathy, and remorse. The behavioral characteristics of psychopathy include erratic, negligent, and sensation-seeking activities that violate social and legal norms.

Decades of investigations have examined and refined the psychopathy construct, resulting in a vast database on its external correlates and internal structure (e.g., Forth and Burke 1998; Hare 2003; Hart and Hare 1989; Salekin et al. 2004). Researchers have extensively used psychopathy to predict future violent and nonviolent antisocial behavior (see Gendreau et al. 2002; Hemphill et al. 1998; Salekin et al. 1996; Walters 2003b for reviews). Over the past 20 years, questions about the relation between psychopathy and risk for future antisocial conduct have sparked a great deal of empirical research, theoretical controversies, and animated debates. The breadth and diversity of these studies make it difficult to discern the patterns that exist across the entire literature. The present article therefore uses meta-analysis to summarize research that relates psychopathy (i.e., Hare Psychopathy Checklists; PCLs) to antisocial behavior (i.e., recidivism and institutional maladjustment). We also test the moderating influence of contextual variables related to sample characteristics and assessment methodology on this relation.

We chose to focus our meta-analysis on the Hare PCLs (Hare 1980, 2003; Forth et al. 2003; Hart et al. 1995) because these measures of psychopathy have the longest empirical history, have good validity and reliability (Hare 2003; Hart et al. 1995; Forth and Burke 1998; Forth et al. 2003), and have been used in a large number of studies on antisocial conduct. The Hare PCLs have traditionally been conceptualized as comprising two factors. The first factor (F1) represents interpersonal and affective components of the syndrome, whereas the second factor (F2) represents socially deviant and behavioral aspects. In a recent analysis, Cooke and Michie (2001) suggested that psychopathy, as measured by the PCL-R, is best understood as a three-factor construct with the traditional F1 separated into discrete interpersonal and affective components and the traditional F2 modified such that it has a reduced emphasis on criminal conduct. Even more recently, Hare (2003) presented a two-factor, four-facet model for psychopathy which retains the original two factors, but then divides each into more specific facets. The original F1 is separated into the Interpersonal and Affective facets, and F2 is separated into the Lifestyle and Antisocial facets. Criminal conduct, which is removed in the three-factor model, is reincorporated into the Antisocial facet of the two-factor, four-facet model (Hare 2003). Our meta-analysis focused on the traditional two-factor model because our literature search uncovered only one study (Douglas et al. 2005b) using the three-factor model and two studies (Cooke et al. 2001b; Spain et al. 2004) using the two-factor, four-facet model to predict antisocial behavior.

Contribution of the Current Study

Previous meta-analyses examining the relation between psychopathy and antisocial behavior have generated highly similar conclusions (see Table 1). PCL scores were moderately predictive of antisocial behavior across a range of different studies. Psychopathy was similarly predictive across different ages (adolescents vs. adults), study methodologies (prospective vs. retrospective), and different types of outcomes (institutional infractions vs. recidivism). F2 effect sizes were consistently larger than F1 effect sizes (Edens and Campbell 2007; Edens et al. 2007; Gendreau et al. 2002; Guy et al. 2005; Hemphill et al. 1998; Salekin et al. 1996; Walters 2003b), suggesting that the behavioral aspects of psychopathy are better predictors of antisocial behavior than are the interpersonal and affective components. Furthermore, some studies have found variables that significantly influence the relation between PCL scores and antisocial behavior. Studies using white samples have shown stronger effects than those with ethnically diverse samples (Edens et al. 2007), studies using males have shown stronger effects than those with females (Edens et al. 2007), studies with longer follow-up periods have shown stronger effects than those with shorter follow-up periods (Hemphill et al. 1998), and studies conducted outside of the United States have shown stronger effects than those conducted in the United States (Guy et al. 2005).

Table 1 Summary of effect sizes for PCL scores from previous meta-analyses

The current study seeks to add to this literature in three major ways. First, we wanted to summarize research relating psychopathy to antisocial conduct across a broad set of domains, allowing us to uncover the most robust influences on this relation. We therefore performed a wide-ranging and thorough search of the literature, identifying 95 studies with non-overlapping samples for inclusion in our analyses. Second, the current review uses rigorous methods to detect moderators of the relation between psychopathy and antisocial conduct (e.g., heterogeneity and weighted regression analyses), which have not been consistently used in prior meta-analyses. Finally, our large sample of studies allows for more powerful tests of moderating variables than prior meta-analyses.

Prior summaries have examined only restricted subdomains of this literature. By examining a wider selection of studies, we may detect variables that influence the relation between psychopathy and antisocial conduct that have been missed in more specific reviews. Researchers have frequently examined the influence of sample and methodological characteristics on the relation between psychopathy and antisocial conduct (Douglas et al. 2006). The issue of whether the predictive ability of psychopathy varies across diverse samples has been of theoretical interest (e.g., Cooke et al. 2001a), so we examined the moderating influences of race, gender, institutional setting, and country on study effect sizes. There has also been practical interest as to how differences in methodology influence study results, so we examined the moderating influences of the length of follow-up, the type of information used to score the PCL, and the independence of the PCL and transgression assessments.

Method

Compilation of Hare PCL Studies

The purpose of our literature search was to locate the population of studies that have investigated the relation between psychopathy and future antisocial conduct. We examined multiple computerized databases, including PsycINFO (1970-August 2004), MedLine (1965-August 2004), Educational Resources Information Center (1966-August 2004), and Digital Dissertations (1970-August 2004). Our search procedure paired terms related to psychopathy with terms related to either recidivism or institutional infractions. We used “wild card” search terms (i.e., those ending with an *) to obtain articles using all variations of the stem of each term. Our search terms for psychopathy were psychopath*, PCL, PCL-R, PCL-SV, PCL-YV, and Hare Checklist*, and our search terms for recidivism or institutional infractions were recidiv*, reoffen*, risk, infract*, agg*, violen*, nonviolen*, institut*, physical*, and verbal*. We also examined the reference sections of previous reviews of the literature on psychopathy. Finally, we wrote to the first authors of the articles identified by these methods to obtain additional work in the area of psychopathy and risk that may have been missed by our search. We chose not to include conference presentations and raw data sets for several reasons. Conference presentations and raw data sets are not included in computerized databases, making a thorough and representative search of them impossible. Furthermore, presentations and raw data sets are not subjected to the same level of peer review as are published studies, governmental reports, and dissertations, making them less methodologically rigorous. While excluding these data sets may inflate our estimate of the mean effect size, we consider the effect of publication bias (i.e., file drawer problem) through our computation of Rosenthal’s (1979) fail-safe N described below. Additionally, we do not expect that excluding these data sets influenced our moderator analyses, which are the primary contribution of this paper.

Inclusion Criteria

Several criteria determined whether a study was included in the current analysis. The primary criterion was that eligible studies must have related scores from a version of the Hare PCL to a quantitative measure of recidivism or institutional maladjustment. We considered recidivism to include any behavior that resulted in legal charges following release from an institution, and institutional maladjustment to include any violations of institutional rules while detained. Studies that only examined criminal histories were excluded because criminal history is directly used by the PCL as an indicator of psychopathy, so the inclusion of these studies would falsely inflate our estimate of the mean effect size. Eligible studies included published articles from refereed and non-refereed sources, dissertations, and governmental reports. Studies reported in languages other than English were excluded.

Coding of Moderators

Two of the authors (ARL and JD) independently coded all moderator variables to establish inter-rater reliability estimates. These estimates (intraclass correlations for continuous moderators and Cohen’s kappa for categorical moderators) ranged from 0.81 to 0.99, with a median of 0.86. Discrepancies between codes were resolved through discussions among the first three authors.

Calculation of Effect Sizes

We used Hedges’ d as our effect size, which is the standardized mean difference between two groups (i.e., recidivists and non-recidivists, or institutional violators and non-violators) corrected for sample size bias. Positive effect sizes designate that higher PCL scores were associated with higher rates of recidivism and institutional infractions. To enhance calculation accuracy, two authors (ARL and JD) independently calculated all effect sizes with the assistance of a computer program designed specifically for meta-analytic calculations (DSTAT Version 1.10; Johnson 1993). Discrepancies were resolved through discussions between the two coders.

For many studies, effect size calculations were derived from reported means and standard deviations. Some studies did not report such statistics, and d was computed through other statistical methods, such as from correlations, ANOVAs, t-tests, chi-square tests, or p-values (see DeCoster 2004). One study reported non-significant results but did not provide a specific test statistic, so we set its effect size equal to zero. When studies reported more than one measure of recidivism or institutional infractions, an average of the effect sizes was calculated to ensure sample independence. If there was insufficient information to compute an effect size for a study, the authors were contacted and asked to provide means and standard deviations. All but one author provided the requested information. The study by the author who did not respond was necessarily excluded from our analyses.
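The conversions described above can be illustrated with a minimal sketch. It uses standard textbook formulas and is not a reproduction of the DSTAT routines; the function names and the example numbers are hypothetical, and the correlation conversion assumes roughly equal group sizes.

```python
import math

def cohen_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)

def small_sample_correction(d, n_1, n_2):
    """Apply the usual approximate correction for small-sample bias."""
    return d * (1 - 3 / (4 * (n_1 + n_2) - 9))

def d_from_r(r):
    """Convert a point-biserial correlation to d (assumes equal group sizes)."""
    return 2 * r / math.sqrt(1 - r**2)

def d_from_t(t, n_1, n_2):
    """Convert an independent-samples t statistic to d."""
    return t * math.sqrt(1 / n_1 + 1 / n_2)

# Hypothetical study: transgressors (n = 40) vs. non-transgressors (n = 60).
raw_d = cohen_d(25.0, 7.0, 40, 21.0, 7.0, 60)
corrected_d = small_sample_correction(raw_d, 40, 60)
print(f"raw d = {raw_d:.3f}, corrected d = {corrected_d:.3f}")
```

A positive value here means the transgressor group had the higher mean PCL score, matching the sign convention stated above.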

Results

Descriptive Analyses

Study Characteristics

Ninety-five studies with non-overlapping samples were included in the current analysis, assessing a total of 15,826 participants (see Appendix A for a summary of overall effect sizes for each included study). An additional 24 studies were located that examined the relation between psychopathy and antisocial conduct but were excluded because their samples overlapped with other studies included in the analysis (a list of these studies is available from the authors). As noted in Table 2, studies included in this analysis were chiefly published articles with a median publication date of 2001. Studies used fairly large samples (Mean N > 150) that were predominantly adults. The mean age of participants was 29.7 years. The majority of the studies were from peer-reviewed journals. Studies were conducted in the United States, Canada, Sweden, the Netherlands, Spain, France, and the United Kingdom. We examined histograms of effect sizes prior to conducting our analyses and did not observe any outliers.

Table 2 Summary study characteristics

Mean Effect Sizes

A summary of the effect sizes compiled across all studies and outcome measures for PCL Total, F1, and F2 is presented in Table 3. Each effect size was weighted by the inverse of its variance. All weighted mean effect sizes were significantly greater than zero, indicating that higher PCL Total, F1, and F2 scores are associated with increased engagement in recidivism or institutional infractions. We computed the fail-safe N statistic to examine the possible effects of publication bias on the magnitude of these effect sizes (Rosenthal 1979). The resulting values indicated that there would need to be over 5,000 unreported studies with null findings to reduce the mean PCL Total, F1, or F2 effect sizes to a magnitude that they were no longer significantly different from zero. It is highly unlikely that this many unpublished and unreported studies remain “in the file drawer.” Furthermore, Rosenthal’s (1991) guidelines indicate that it is highly unlikely that the PCL Total, F1, and F2 mean effect sizes are nonsignificant.
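As an illustration of the inverse-variance weighting and the fail-safe N described above, the following sketch uses the common large-sample variance of d and Rosenthal's formula with a one-tailed criterion of z = 1.645. The function names, the criterion, and the example values are assumptions for illustration, not the exact computations behind Table 3.

```python
import math

def d_variance(d, n_1, n_2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n_1 + n_2) / (n_1 * n_2) + d**2 / (2 * (n_1 + n_2))

def weighted_mean_effect(effects, variances):
    """Fixed-effects mean: each effect is weighted by 1 / variance."""
    weights = [1 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return mean, se, mean / se  # mean, standard error, Z test against zero

def fail_safe_n(z_scores, z_crit=1.645):
    """Number of null studies needed to bring the combined Z below z_crit."""
    k = len(z_scores)
    return sum(z_scores) ** 2 / z_crit**2 - k

# Hypothetical effects from three studies with (n_1, n_2) group sizes.
effects = [0.40, 0.55, 0.70]
sizes = [(40, 60), (80, 120), (30, 30)]
variances = [d_variance(d, a, b) for d, (a, b) in zip(effects, sizes)]
mean, se, z = weighted_mean_effect(effects, variances)
print(f"weighted mean d = {mean:.3f}, SE = {se:.3f}, Z = {z:.2f}")
```

Larger studies have smaller variances and therefore pull the weighted mean toward their own estimates.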

Table 3 Summary effect size characteristics

The weighted mean d for PCL Total scores across 94 samples (Ben-Horin 2001 reported only factor scores) was 0.55. This can be descriptively classified as a medium effect size (Cohen 1992), indicating that the associations between psychopathy total scores and recidivism/institutional infractions are strong enough to be easily recognized. To make this effect size easier to interpret, we can imagine comparing the theoretical distributions of psychopathy total scores for transgressors and non-transgressors. The size of this effect indicates that the mean psychopathy total score for the transgressors would be at the 71st percentile of the non-transgressor group (U3, from Cohen 1977).

The weighted mean d for PCL F1 averaged across 54 samples was 0.38. This can be descriptively classified as a small-to-medium effect size (Cohen 1992), indicating that although relations between F1 scores and recidivism/institutional infractions are notably smaller than those for PCL Total scores, they are not trivial. If we were to compare the theoretical distributions of F1 scores for the transgressors and non-transgressors, the size of this effect indicates that the mean F1 score for the transgressors would be at the 65th percentile of the non-transgressor group (U3, from Cohen 1977).

The weighted mean d for F2 across 53 samples was 0.60. This can be descriptively classified as a medium effect size (Cohen 1992), indicating that the relations between F2 scores and recidivism/institutional infractions are easily detectable. If we were to compare the theoretical distributions of F2 scores for the transgressors and non-transgressors, the size of this effect indicates that the mean F2 score for the transgressors would be at the 72nd percentile of the non-transgressor group (U3, from Cohen 1977).
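The percentile interpretation used in the preceding paragraphs follows from Cohen's U3, which under normality and equal variances is the standard normal cumulative probability at d. A minimal sketch, assuming those distributional conditions and using the rounded d values reported above:

```python
from statistics import NormalDist

def u3(d):
    """Proportion of the comparison group falling below the other group's mean."""
    return NormalDist().cdf(d)

for label, d in [("PCL Total", 0.55), ("F1", 0.38), ("F2", 0.60)]:
    print(f"{label}: d = {d:.2f}, U3 = {u3(d):.3f}")
```

Because the inputs are rounded to two decimals, the resulting proportions (approximately 0.709, 0.648, and 0.726) should be read as approximate.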

We used the Z statistic derived from Steiger (1980) to compare the relative abilities of F1 and F2 to explain future antisocial conduct. This test requires an estimate of the correlation between the two factor scores, which we took to be 0.50 based on Hare (2003) and Hemphill et al. (1998). The result of this test indicated that F2 effect sizes were significantly stronger in magnitude than F1 effect sizes (Z = −8.92, p < 0.001).

Considerations Related to Moderator Analyses

Relating Moderators to Effect Sizes

As displayed in Table 3, the values of Q_w are all significant, indicating that the effect size distributions for PCL Total, F1, and F2 were statistically heterogeneous. Moderator analyses based on a fixed-effects model were examined to explain this variability in effect sizes. Effect sizes were weighted by the inverse of their variances in all moderator analyses. We examined plots of each moderator versus the effect size to detect outliers and nonlinear relations before conducting moderator analyses. One outlier was noted for the length of follow-up, which is discussed below in the presentation of that moderator. For each moderator, we present a test of its ability to explain effect sizes, as well as a test of whether there is a significant amount of variance remaining in effect sizes after removing variability associated with the moderator.
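The heterogeneity test referred to above can be sketched as follows. This is the standard fixed-effects Q statistic, which is compared against a chi-square distribution with k - 1 degrees of freedom; the function name and example data are hypothetical.

```python
def q_statistic(effects, variances):
    """Weighted sum of squared deviations from the fixed-effects mean."""
    weights = [1 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    degrees_of_freedom = len(effects) - 1
    return q, degrees_of_freedom

# Hypothetical effects; a Q well above its degrees of freedom suggests
# more variability than sampling error alone would produce.
q, df = q_statistic([0.20, 0.55, 0.90], [0.04, 0.02, 0.05])
print(f"Q = {q:.2f} on {df} df")
```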

Reducing Dependence Among Effect Sizes

Many studies provided information allowing us to calculate effect sizes under several different levels of our moderator variables. However, including multiple effect sizes from the same study violates the assumption of independence made by fixed-effects models. To resolve this dilemma, we used a method originally suggested by Cooper (1989) to minimize dependence while still making use of within-study variability. Each study only contributed multiple effect sizes to a moderator analysis if separate effects could be calculated for different levels of that particular moderator. Otherwise a single overall effect size was used in the analysis. We therefore included dependent observations in our analyses only when they are specifically related to the moderator being analyzed. For example, a study that presented results broken down by both race (Caucasian and non-Caucasian) and gender (male and female) would contribute two effects to the moderator analysis of race (one for its Caucasian participants and one for its non-Caucasian participants, both averaging over gender), two effects to the moderator analysis of gender (one for its male participants and one for its female participants, averaging over race), and one effect to every other moderator analysis (averaging over both race and gender). Although all of our analyses are based on the same set of studies, using this method causes the number of effects and the total variability among the effect sizes to be different in each moderator analysis.
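The selection rule described above can be expressed as a small data-handling sketch. The record layout (keys such as `overall_d`, `levels`, and `by_moderator`) is a hypothetical structure chosen for illustration, not the coding sheet actually used.

```python
def effects_for_moderator(studies, moderator):
    """Return (study id, moderator level, d) rows for one moderator analysis.

    A study contributes several rows only if it reported separate effects
    for the levels of this particular moderator; otherwise it contributes
    its single overall effect.
    """
    rows = []
    for study in studies:
        by_level = study.get("by_moderator", {}).get(moderator)
        if by_level:
            for level, d in by_level.items():
                rows.append((study["id"], level, d))
        else:
            level = study.get("levels", {}).get(moderator)
            rows.append((study["id"], level, study["overall_d"]))
    return rows

# Hypothetical study reporting results separately by race and by gender.
example = [{
    "id": "S1",
    "overall_d": 0.50,
    "levels": {"country": "United States"},
    "by_moderator": {
        "race": {"Caucasian": 0.60, "non-Caucasian": 0.40},
        "gender": {"male": 0.45, "female": 0.65},
    },
}]
for name in ("race", "gender", "country"):
    print(name, effects_for_moderator(example, name))
```

In this example the study yields two rows for race, two for gender, and one for country, which is why the number of effects differs across moderator analyses.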

Collapsing Across Offense Types and Ages

Previous meta-analyses have typically limited their consideration to either specific offense types or to specific age groups. We therefore examined whether these factors had significant influences on effect sizes before choosing to analyze them together. We compared effects measuring recidivism to those measuring institutional infractions, as well as effects measuring violent to those measuring non-violent offenses. Given that these breakdowns were categorical, moderating effects were measured by Q_b (see Table 4). There were no significant differences between the effect sizes found for institutional infractions and recidivism, nor were there significant differences between the effect sizes found for violent and non-violent offenses. A weighted regression analysis was conducted to predict effect sizes from the mean age of each sample (mean ages ranged from 14 to 46 years). Age was not a significant moderator, indicating that relations between psychopathy and future antisocial conduct were consistent across differences in the average age of the samples for PCL Total, F1, and F2 (see Table 5). We also found that studies using the youth version of the PCL (i.e., PCL:YV) were not significantly different from those using the adult versions of the PCL (i.e., PCL-R, PCL:SV, PCL) for PCL Total (Q_b[1] = 0.017, p = 0.90), F1 (Q_b[1] = 0.41, p = 0.52), or F2 (Q_b[1] = 1.75, p = 0.19). Since neither offense type nor age appeared to have significant influences on the relation between psychopathy and antisocial conduct, we collapsed across these factors for all subsequent analyses.
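For categorical moderators such as offense type, the between-groups statistic mentioned above can be sketched as the weighted variability of group means around the grand mean, compared against a chi-square distribution with (number of groups - 1) degrees of freedom. The function name and example data are hypothetical.

```python
def q_between(groups):
    """Between-groups heterogeneity for a categorical moderator.

    `groups` maps each moderator level to a list of (d, variance) pairs.
    """
    stats = []
    for studies in groups.values():
        weights = [1 / v for _, v in studies]
        total_weight = sum(weights)
        mean = sum(w * d for w, (d, _) in zip(weights, studies)) / total_weight
        stats.append((mean, total_weight))
    grand_weight = sum(w for _, w in stats)
    grand_mean = sum(m * w for m, w in stats) / grand_weight
    q_b = sum(w * (m - grand_mean) ** 2 for m, w in stats)
    return q_b, len(groups) - 1

# Hypothetical comparison of recidivism and institutional-infraction effects.
q_b, df = q_between({
    "recidivism": [(0.50, 0.03), (0.60, 0.04)],
    "institutional infractions": [(0.55, 0.05), (0.45, 0.02)],
})
print(f"Q_b = {q_b:.3f} on {df} df")
```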

Table 4 Results examining the influence of offense type
Table 5 Moderator analyses for continuous variables

Moderator Analyses Related to Sample Generalizability

Country

We examined whether the country in which the study was conducted had a significant influence on effect sizes. The first category included studies conducted solely in the United States. The second category comprised studies conducted in Canada, and the third category consisted of studies conducted in European countries. This variable was a significant moderator of PCL Total and F2 effect sizes (see Table 6). Since Q_b provides an omnibus test of differences across the three levels in this variable, we performed contrasts to determine the exact nature of these effects (Rosenthal and Rubin 1982). The results indicated that the mean effect sizes for studies conducted in Canada and Europe were significantly larger than those found in studies conducted in the United States for both PCL Total (Z = 5.73, p < 0.001) and F2 (Z = 3.24, p < 0.001). Country was not a significant moderator of F1 effect sizes.
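A contrast of the kind described above can be sketched as a weighted comparison of group means, where each group mean has variance equal to the reciprocal of its summed weights. The contrast weights and the example values below are assumptions for illustration; they are not the values behind Table 6.

```python
import math

def contrast_z(group_means, group_weights, lambdas):
    """Z test for a contrast among weighted mean effect sizes.

    group_weights are the summed inverse-variance weights of each group;
    lambdas are contrast coefficients that must sum to zero.
    """
    if abs(sum(lambdas)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(l * m for l, m in zip(lambdas, group_means))
    se = math.sqrt(sum(l**2 / w for l, w in zip(lambdas, group_weights)))
    return estimate / se

# Hypothetical means for United States, Canada, and Europe, comparing the
# United States (-2) against the average of Canada and Europe (+1, +1).
z = contrast_z([0.45, 0.65, 0.70], [400.0, 250.0, 150.0], [-2, 1, 1])
print(f"contrast Z = {z:.2f}")
```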

Table 6 Moderator analyses for categorical variables

Race

Even though race is a categorical variable at the subject level, race is naturally continuous at the study level. When a study reported frequencies of participants’ races but did not provide results separately by race, we coded the proportion of Caucasians in the study. When a study reported results separated by race, we included separate effect sizes for Caucasians and non-Caucasians, coding the effect size for Caucasians as 100% Caucasian and the effect size for non-Caucasians as 0% Caucasian. Table 5 displays the estimate and standard error of the slope coefficients predicting effect size from the percentage of Caucasian participants in the sample. Results from this analysis suggest that effect sizes varied depending on the racial composition of the sample. The results for PCL Total and F2 indicate that samples with larger numbers of Caucasian participants had larger effect sizes. Relations between antisocial conduct and F1 were consistent across samples.
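The weighted regression used for continuous moderators such as the proportion of Caucasian participants can be sketched for a single predictor as follows. In the fixed-effects approach the weights are treated as known inverse variances, so the slope's standard error is taken from the weighted sum of squares of the predictor rather than from the residuals. The function name and the data are hypothetical.

```python
import math

def weighted_slope(x, d, variances):
    """Inverse-variance weighted regression of effect size on one moderator.

    Returns the slope, its fixed-effects standard error, and the Z test.
    """
    w = [1 / v for v in variances]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    d_bar = sum(wi * di for wi, di in zip(w, d)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (di - d_bar) for wi, xi, di in zip(w, x, d))
    slope = sxy / sxx
    se = math.sqrt(1 / sxx)
    return slope, se, slope / se

# Hypothetical samples coded by proportion of Caucasian participants;
# 0.0 and 1.0 represent effects reported separately by race.
proportion = [0.0, 0.3, 0.5, 0.8, 1.0]
effects = [0.35, 0.45, 0.50, 0.60, 0.70]
variances = [0.03, 0.02, 0.04, 0.02, 0.03]
slope, se, z = weighted_slope(proportion, effects, variances)
print(f"slope = {slope:.3f}, SE = {se:.3f}, Z = {z:.2f}")
```

A positive slope in this setup corresponds to larger effect sizes in samples with higher proportions of Caucasian participants.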

Gender

We coded gender continuously at the study level in the same way that we coded race, except that this moderator represents the proportion of male participants. Table 5 displays the weighted regression results predicting effect sizes from gender. Gender composition of the sample was a significant moderator of PCL Total and F1 effect sizes. The negative regression coefficients indicate that the PCL Total and F1 scores explained future antisocial conduct better in samples that included more female participants. F2 effect sizes appear to be equivalent regardless of the gender composition of the samples.

Institutional Setting

Two categories were coded to differentiate the institutional setting of the samples. The first category included studies conducted in forensic or civil hospitals, whereas the second category included participants from jail, detention, and prison settings. Results indicate that institutional setting was a significant moderator of F2 effect sizes. The mean F2 effect size was larger in samples of patients than in samples of detainees (see Table 6).

Moderator Analyses Related to Study Methodology

Length of Follow-up

Length of follow-up was coded as a continuous variable that indicated the average number of months between the rating of the PCL and the collection of outcome data. Examination of a scatter plot graphing effect sizes versus length of follow-up indicated one outlier with a follow-up period of 20 years (Weiler and Widom 1996). This follow-up period was substantially longer than the second longest follow-up of 12 years, so we excluded this study from our analyses of this moderator. Table 5 displays the weighted regression analysis predicting effect size from length of follow-up with this outlier removed. Length of follow-up was not a significant moderator of PCL Total and F1 effect sizes, indicating that they remained consistent across varying lengths of follow-up. Length of follow-up was a significant moderator of effect sizes for F2 (see Table 5).

Information Used to Assess the PCL

For this moderator, we distinguished studies that generated PCL ratings using only file information from studies that used standard administrations, which draw on both file and interview information. PCL Total and F2 effect sizes were significantly larger when the ratings were based only on file information (see Table 6), although this moderator did not influence F1 effect sizes.

Independence of Assessments

Researchers in the area of psychopathy frequently use the terms “prospective” or “retrospective” to describe the relation between the assessment of psychopathy and the collection of outcome measures. However, different researchers use these terms in different ways. We therefore developed an objective system to code the independence of assessments based on the details of each study’s method rather than relying on labels that authors provided. Four groups were initially created to examine how the independence of the PCL and outcome assessments influenced effect sizes. The first group, labeled “predictive,” included studies in which PCL ratings were made before the outcome data were collected. The second group, labeled “mixed with independence,” included studies in which PCL ratings were made after the outcome data were collected, but the PCL ratings were based only on information collected prior to the period during which the outcomes were gathered. This second group required evidence that the data used to make PCL ratings and the data used to code outcomes were separate. The third group, labeled “mixed without independence,” consisted of studies similar in design to the “mixed with independence” group; however, these studies did not explicitly indicate that PCL raters had no knowledge of the outcome data while making PCL ratings. Studies in this group still based the ratings of the PCL and the outcomes on different sources of information. If it was not clear that the PCL and outcomes were based on different sources of information, or the authors explicitly stated that the PCL raters had knowledge of the outcomes, the study was placed in the fourth group, labeled “postdictive.” Studies in the first group have been previously labeled “prospective” in the literature, whereas those in the last three groups have previously been labeled “retrospective.” No significant differences were found between studies coded “mixed with independence” and “mixed without independence.” We therefore collapsed these groups into one category labeled “mixed” for our analyses.

We used Q_b to estimate the amount of variability between these groups because of the categorical nature of this variable. Independence of assessments was a significant moderator of PCL Total and F1 effect sizes (see Table 6). Contrasts (Rosenthal and Rubin 1982) indicated that the mean PCL Total effect size in postdictive studies was significantly lower than the average of the other two groups (Z = −2.78, p < 0.01), while the mean F1 effect size for predictive studies was significantly lower than the average of the other two groups (Z = −3.02, p < 0.001).

Combined Analysis of Moderator Variables

Relations Among Moderator Variables

We used correlations, chi-square tests, and ANOVAs to determine if there were any significant relations among the moderators (i.e., country, race, gender, institutional setting, length of follow-up, information used to assess the PCL, and independence of assessments). The significant results from these analyses (using α = 0.002 based on a Bonferroni correction) are presented in Table 7. We found that the country in which a study was conducted was significantly related to the proportion of minorities in the sample, such that those conducted in the United States had more minorities (51%) than those conducted in either Canada (23%) or European countries (25%). We also found that the information used to assess the PCL, the independence of assessments, and the institutional setting were all significantly related to each other. Predictive studies generally used both file and interview information and were primarily conducted in non-psychiatric settings.
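One way to arrive at a per-test threshold near the value reported above is to divide a familywise α of 0.05 by the number of pairwise tests among the seven moderators. Both the familywise level and the one-test-per-pair count are assumptions made for this sketch; the text states only the resulting α.

```python
from math import comb

moderators = [
    "country",
    "race",
    "gender",
    "institutional setting",
    "length of follow-up",
    "information used to assess the PCL",
    "independence of assessments",
]

familywise_alpha = 0.05                # assumed overall error rate
n_tests = comb(len(moderators), 2)     # one test per pair of moderators
alpha_per_test = familywise_alpha / n_tests

print(f"{n_tests} pairwise tests, per-test alpha = {alpha_per_test:.4f}")
```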

Table 7 Significant relations between moderators

Independent Effects of Country and Race

Given that race and country were significantly related to each other, we performed a multiple regression analysis to determine the unique ability of race and country to explain variability in the effect sizes (Lipsey and Wilson 2000). Results from these analyses are conceptually the same as results from a standard multiple regression analysis, allowing us to determine if each moderator variable has a unique influence on effect sizes above and beyond the influence of the other moderator variable. Since studies conducted in Canada and Europe had generally the same proportions of non-Caucasian participants and similar mean effect sizes, we combined these studies into one level for our regression analysis. The country moderator variable therefore had two levels: studies conducted in the United States and studies conducted outside of the United States.

The results of this multiple regression analysis are presented in Table 8. We found that country was a unique predictor of PCL Total effect sizes, such that effect sizes were larger in studies conducted outside of the United States, but was not a unique predictor of F2 effect sizes. We also found that the proportion of Caucasian participants was a unique predictor of F2 effect sizes, such that samples with more Caucasian participants showed stronger effects, but was not a unique predictor of PCL Total effect sizes. The original moderator analyses indicated that both of these variables were able to explain PCL Total and F2 effect sizes. The fact that the original bivariate relations were only partially replicated in the multiple regression analysis indicates the presence of multicollinearity between Country and Race.

Table 8 Multiple regression analyses

Independent Effects of Institutional Setting, Information Used to Assess the PCL, and Independence of Assessments

We found that predictive studies generally used both file and interview information and were primarily conducted in non-psychiatric settings. We therefore performed a multiple regression analysis to determine the unique ability of institutional setting, the information used to assess the PCL, and the independence of assessments to explain variability in the effect sizes (Lipsey and Wilson 2000).

The results of this multiple regression analysis are presented in Table 8. The type of institutional setting had a unique influence on effect sizes beyond the influence of the information used to assess the PCL and the independence of assessments. Mean PCL Total and F2 effect sizes were significantly larger in psychiatric samples than in detained samples. The information used to assess the PCL did not uniquely affect PCL Total, F1, or F2 effect sizes. The independence of assessments had a unique influence on both PCL Total and F1 effect sizes beyond the influence of institutional setting and the information used to assess the PCL.

The two significant effects of independence of assessments appear to contradict each other. The mean PCL Total effect size for predictive designs was significantly higher than the mean PCL Total effect size for postdictive designs, whereas the mean F1 effect size for predictive designs was significantly lower than the mean F1 effect size for postdictive designs. This inconsistency arose because the effect of independence of assessments in the studies that provided both PCL Total and factor scores was different from that found in studies that only provided PCL Total scores. For the studies that reported factor scores, the mean PCL Total effect size for predictive designs (d = 0.54) was lower than the mean PCL Total effect size for postdictive designs (d = 0.72). For studies that did not report factor scores, the mean PCL Total effect size for predictive designs (d = 0.64) was larger than the mean PCL Total effect size for postdictive designs (d = 0.24).

Discussion

The current meta-analysis examined a broad cross-section of studies investigating the relation between psychopathy and antisocial conduct. The overall weighted mean effect sizes were clearly within the range of those reported by prior meta-analyses. The impulsive and antisocial behavioral traits of psychopathy (i.e., F2) had a stronger relation with antisocial conduct than did the affective and interpersonal traits (i.e., F1), which is consistent with previous meta-analyses. We found that moderators often had different influences on F1 and F2 effect sizes, implying that there are important differences between the constructs represented by these two factors.

Psychopathy explained recidivism/infractions equally well across younger and older samples. However, our use of average ages might obscure important differences that occur close to the ends of the age distribution (e.g., young adolescents and older adults), since these individuals constitute small proportions of the samples. Even though the current meta-analytic results indicate that the PCLs are moderately and consistently predictive of negative outcomes across ages, more research is needed on the applicability and viability of the concept in youth (see Farrington 2005; Rutter 2005).

Several other sample characteristics influenced the explanatory power of psychopathy. PCL Total and F1 effect sizes were larger in samples containing higher proportions of females. F2 effect sizes were larger for patients than for detainees, even when controlling for aspects of study methodology. PCL Total and F2 effect sizes were larger in studies with more Caucasian participants and in studies conducted outside of the United States. The effects of race and country were highly collinear, making it difficult to determine whether either or both of these variables had unique influences on the relation between psychopathy and antisocial conduct.

File-only studies were primarily postdictive and conducted in psychiatric settings. After controlling for independence of assessments and institutional setting, the type of information used to assess the PCL did not substantially influence effect sizes. However, the independence of assessments uniquely affected the predictive ability of PCL Total and F1 scores, even after controlling for the influence of other moderators. We are hesitant to draw strong conclusions from these results since the effect of this moderator in studies reporting both total and factor scores differed from the effect in those reporting only total scores. While it is unclear how the independence of assessments affects the relation between psychopathy and antisocial conduct, there is evidence that this variable does influence effect sizes. Future research should continue to use predictive designs to study the true effects of psychopathy. Postdictive studies are vulnerable to criterion contamination, which potentially adds bias to study results. It does not matter methodologically whether postdictive studies lead to larger or smaller effect sizes than predictive studies, since predictive studies are free of this contamination and will therefore lead to more accurate conclusions.

The antisocial and impulsive behavioral aspect of psychopathy (i.e., F2) had better predictive ability at longer follow-up periods. To explain this result, it is useful to identify three categories of individuals: (A) non-transgressors, (B) unsuccessful transgressors (i.e., those apprehended), and (C) successful transgressors (i.e., those not apprehended). Psychopathy is theoretically related to whether an individual engages in antisocial conduct (categories B and C), but these outcomes are only measured by examining those who have been caught (category B alone). Over longer follow-up periods, the number of successful transgressors will dwindle because each additional transgression provides another opportunity for the transgressor to be caught. This means that what we want to measure (all transgressors—categories B and C) is more similar to what we actually measure (apprehended transgressors—category B) when follow-up periods are longer. We would therefore expect the predictive ability of psychopathy to increase with longer follow-up periods.
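
The follow-up argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that every transgression carries the same independent probability of apprehension (`p_caught`) and that transgressions occur at a constant rate (`offenses_per_year`). Both parameters and their values are hypothetical and do not come from the studies analyzed here.

```python
def share_apprehended(p_caught: float, offenses_per_year: float, years: float) -> float:
    """Probability that an active transgressor has been apprehended at least once.

    Under the illustrative assumptions, this is the expected share of all
    transgressors (categories B and C) who have moved into category B.
    """
    n_offenses = offenses_per_year * years
    # The chance of escaping detection on every offense shrinks geometrically.
    return 1.0 - (1.0 - p_caught) ** n_offenses


# Hypothetical values: a 10% chance of apprehension per offense, two offenses per year.
for years in (1, 2, 5, 10):
    share = share_apprehended(p_caught=0.10, offenses_per_year=2.0, years=years)
    print(f"{years:>2} years of follow-up: {share:.3f} of transgressors apprehended")
```

Under these assumptions the apprehended share rises steadily with follow-up length, so the measured outcome (category B) comes closer to the outcome of interest (categories B and C).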

Our results replicated some findings from prior meta-analyses while failing to replicate others. Consistent with Guy et al. (2005), we found that effect sizes were larger in studies conducted outside of the United States than those conducted in the United States. We replicated Hemphill et al.’s (1998) finding that effect sizes were larger at longer follow-up periods and Edens et al.’s (2007) finding that effect sizes were larger in Caucasian samples. Our finding that effect sizes were consistent across offense types and age groups replicated Edens and Campbell (2007) and Walters (2003b). However, our finding that effect sizes were stronger for samples with larger numbers of females specifically contradicts Edens et al.’s (2007) observation that psychopathy was more predictive for males. Our meta-analysis revealed that country, gender, race, institutional setting, information used to assess the PCL, independence of assessments, and length of follow-up significantly influenced effect sizes, conflicting with null findings by Edens and Campbell (2007), Edens et al. (2007), Guy et al. (2005), and Walters (2003b). Although it is useful to provide in-depth analyses of particular populations or particular offense categories, our results show that it can also be beneficial to take a broader look at the psychopathy literature. By examining the data in this way, we were able to provide a more encompassing picture of the explanatory power of the PCLs. We also extended previous meta-analyses by examining the unique predictive abilities of moderator variables that were significantly related to each other, providing a more detailed picture of the relation between psychopathy and antisocial conduct.

We would like to note a few limitations of the current meta-analysis that future research may address. Findings from the current study apply only to studies using PCL measures, and are therefore less generalizable to the broader psychopathy literature base that includes self- or parent-report measures (e.g., Antisocial Process Screening Device). Furthermore, our results only provide empirical evidence on institutionalized and imprisoned samples. How well these findings extend to community samples will need to be examined in future meta-analyses of studies investigating non-institutionalized samples. The findings in the current study are also limited to the two-factor model of psychopathy. Recent research indicates that newer factor models may be more appropriate. Relations between PCL scores and antisocial conduct may not be as strong as those found in the current meta-analysis if researchers use the three-factor model, which removes antisocial conduct as a factor. Meta-analyses of newer factor models of psychopathy may also uncover different moderator results than those found in the current meta-analysis.

Implications and Future Considerations

Using psychopathy as a clinical measure of the likelihood of institutional misconduct and post-release outcomes is moderately supported by the empirical evidence to date. However, researchers, clinicians, and decision-makers in this area need to take care that information about psychopathy is used appropriately. Predicting recidivism or institutional maladjustment differs from many clinical predictions in its obvious implications for both the individual (e.g., abridgement of personal freedoms) and society (e.g., community safety). PCL scores are sometimes used to justify harsher sentences, transfers of youth to adult court, longer parole ineligibility periods, and capital sentencing (Cunningham and Reidy 2002; Zinger and Forth 1998). Some state sex offender commitment statutes even require that the PCL be given during psychological evaluations (Edens 2001). Given the seriousness of these psycho-legal determinations, we recommend that clinicians and legal decision-makers consider risk and protective factors beyond psychopathy when attempting to predict future behaviors.

We found that several important individual characteristics influence how well the Hare PCLs predict antisocial behaviors. Our results suggest that predictions of antisocial conduct based on the Hare PCLs should be interpreted more cautiously for members of minority ethnic groups, males, and prisoners than for Caucasians, females, and psychiatric patients. Furthermore, our work suggests that predictions of antisocial conduct will be less reliable for shorter follow-up periods than for longer follow-up periods.

Researchers and clinicians should also be cautious when interpreting the limited predictive ability of F1 scores. High scores on F1 indicate interpersonal charm, exploitative manipulation, and self-advancing deceitfulness, which are likely associated with duping the system and escaping documentation of antisocial conduct. It is possible that some individuals scoring high on F1 engage in comparable amounts of antisocial behaviors, yet are interpersonally skilled, cunning, and manipulative enough to escape documentation. Future studies could examine this hypothesis by using outcome criteria such as dismissed charges, staff observations, and institutional notes.

As a final note, we offer suggestions regarding what information researchers should report so that readers can accurately evaluate their studies. We found surprising variability in the reporting of study methodology across the articles reviewed for the current analysis. Consistent with the recommendations of Hemphill et al. (1998), we propose that researchers provide details regarding institutional, sample, and demographic characteristics; release and follow-up variables; details about the antisocial outcomes being examined; and relations between PCL scores (including factor scores) and antisocial behavior. We also recommend that researchers provide details about rater characteristics (e.g., race, gender, professional background, training on PCL measures) and report both PCL Total and factor scores broken down by gender and ethnicity. Documentation of this information is necessary to judge the methodological quality of a study and provides invaluable data for future meta-analyses.

Psychopathy has received a considerable amount of empirical attention and is well into the later stages of construct and test validation. Future studies can enhance our understanding of the relation between psychopathy and antisocial outcomes by: (a) examining outcomes over long-term follow-up periods (greater than 2 years), (b) using predictive methods to gather outcome data, (c) scoring the Hare PCL using both interview and file information, (d) examining diverse samples including women, ethnic minorities, and individuals with mental illnesses, and (e) investigating outcomes directly measuring antisocial behaviors rather than relying on technical charges or documented offenses.

The clinical and empirical popularity of psychopathy is evident from the number as well as the diversity of studies included in this meta-analysis. Researchers have carefully investigated the basic relations between psychopathy and antisocial conduct, but the complexities of this relation are less understood. Future research exploring how situational factors and individual characteristics influence the relation between psychopathy and antisocial conduct will greatly enhance the psychological theories on and clinical uses of psychopathy.