Introduction

Depression and posttraumatic stress disorder (PTSD) are two of the most common disorders found in post-deployment veterans (i.e., those returning from the wars in Iraq and Afghanistan). For example, one study found a rate of probable depression of 13.7% [1], and estimated prevalence rates of PTSD have ranged from 13 to 29% [2, 3]. Nosologically, depression encompasses a number of disorders in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [4], including major depressive disorder, dysthymic disorder, depression due to a general medical condition or substance use, and other specified/unspecified depressive disorders. PTSD falls under the new category of trauma- and stress-related disorders, though it was previously classified as an anxiety disorder. The diagnosis includes 20 possible symptoms arising secondary to a traumatic event. The focus of diagnosis according to the DSM is reliability, and on the surface differentiating PTSD from depressive disorders would seem straightforward. However, both include nonspecific symptoms, such as impaired concentration and problems with sleep. Accurate differential diagnosis is important given that treatment for the disorders can be distinct from both psychopharmacology and psychotherapy perspectives (e.g., cognitive behavioral therapy for depression versus prolonged exposure for PTSD).

Diagnosis and assessment

The gold standard for rendering a psychiatric diagnosis remains a clinical interview and/or a validated structured interview. However, a variety of self-report measures are available that focus on individual disorders. For example, the Beck Depression Inventory-II (BDI-II) [5] is a 21-item self-report measure of symptom severity associated with depression occurring over the prior two-week period. Scores range from 0 to 63 with higher scores indicating more distress.

The manual provides cut scores for use in classification of depression severity from minimal to severe. The BDI-II is commonly used clinically to track symptom change over time and in research to classify participants into groups based on severity of depression or the presence/absence of depression using an a priori cut score. The PTSD Checklist-Military version (PCL-M) [6] is a 17-item self-report measure of distress associated with DSM-IV PTSD symptoms related to a military-based event. Respondents are instructed to rate how much they “have been bothered by” symptoms secondary to “a stressful military experience” over the prior month or week. Scores range from 17 to 85, with higher scores indicating more distress. A cut score of 50 has been suggested for clinical and research use as a threshold for possible PTSD [7, 8]. Use of this cut score to predict PTSD groups based on the Clinician-Administered PTSD Scale (CAPS) in a sample of veteran primary care patients resulted in an area under the curve (AUC) for the PCL (version unspecified) of 0.88 (SE = 0.02) [9].
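The scoring described above can be illustrated with a minimal sketch. The per-item response ranges (0 to 3 for the BDI-II, 1 to 5 for the PCL-M) are inferred here from the item counts and total score ranges given in the text. The function names are hypothetical, and treating a total at or above 50 as a positive screen is an assumption, since a cut score can also be applied as strictly greater than the threshold.

```python
from typing import Sequence

BDI_II_ITEMS = 21  # 21 items rated 0-3, so totals span 0-63
PCL_M_ITEMS = 17   # 17 items rated 1-5, so totals span 17-85
PCL_M_CUT = 50     # suggested threshold for possible PTSD [7, 8]


def total_score(responses: Sequence[int], n_items: int, low: int, high: int) -> int:
    """Sum item responses after checking the item count and response range."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r < low or r > high for r in responses):
        raise ValueError(f"each response must fall between {low} and {high}")
    return sum(responses)


def score_bdi_ii(responses: Sequence[int]) -> int:
    """Total BDI-II score (0-63); higher scores indicate more distress."""
    return total_score(responses, BDI_II_ITEMS, 0, 3)


def score_pcl_m(responses: Sequence[int]) -> int:
    """Total PCL-M score (17-85); higher scores indicate more distress."""
    return total_score(responses, PCL_M_ITEMS, 1, 5)


def screens_positive_pcl_m(total: int, cut: int = PCL_M_CUT) -> bool:
    """Flag possible PTSD. Assumption: totals at or above the cut count as positive."""
    return total >= cut
```

A positive screen from such a function indicates only that the total met the threshold; it does not establish a diagnosis.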

Self-report measures are widely used as they are brief, easy to administer and score, and typically provide interpretive guidelines that require minimal training to follow. They may be used to identify the severity of one’s symptoms. Self-report measures can also be used to track symptoms over time, for example, when assessing effectiveness of treatment (e.g., prolonged exposure for PTSD). Finally, self-report measures can be used as screeners for the disorder of interest, and in research may also be used to identify groups that are likely to have a certain disorder. For example, the PTSD Checklist (PCL) [7] cut score of >50 has been used in the literature to identify possible PTSD [7, 8].

Interpretation of self-report measures

A key to appropriately interpreting self-report measures is understanding the underlying construct being measured. In general, self-report measures assess symptom frequency, severity, or level of distress and could be conceptualized as capturing symptom burden. Self-report measures rely on an individual’s understanding of symptoms, an understanding that may not be consistent with the interpretation of a trained practitioner. As a result, error variance is inherent in the use of self-report measures. Similarly, most self-report measures are designed to maximize sensitivity with little regard for specificity. Thus, individuals with depression may produce high scores on a particular self-report measure of depression symptoms; however, participants with other psychiatric conditions may produce high scores as well.
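The distinction between sensitivity and specificity can be made concrete with a small sketch. The counts below are hypothetical and chosen only for illustration; they are not drawn from any study cited here. They depict a screener that detects most true cases while also flagging many individuals who do not have the disorder.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disorder who score above the cut."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of people without the disorder who score below the cut."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for a depression screener (illustration only).
true_pos, false_neg = 45, 5    # 50 people with a depressive disorder
true_neg, false_pos = 60, 40   # 100 people without one, many with other conditions

sens = sensitivity(true_pos, false_neg)  # 45 of 50 cases detected
spec = specificity(true_neg, false_pos)  # 60 of 100 non-cases correctly excluded
```

In this hypothetical case the measure misses few individuals with depression, yet 40 of the 100 individuals without depression would also screen positive.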

One important factor contributing to poor specificity is the construct of general distress, or the distress generated by a symptom or set of symptoms. General distress is not specific to a disorder but rather reflects the overall psychological discomfort (i.e., ego-dystonic distress) created by the presence of any psychopathology. Due to its lack of specificity, this general distress complicates the use of self-report inventories for differential diagnosis. A prominent example of this phenomenon occurred in the Minnesota Multiphasic Personality Inventory-2 (MMPI-2). The clinical scales were empirically derived: sets of items were used to create scales that predicted known groups, regardless of item content. This allowed an individual with a high level of general distress to elevate multiple scales due to the endorsement of more general symptoms rather than syndrome-specific symptoms. In response, the restructuring of the MMPI-2 (the MMPI-2-RF) [10] involved identifying the general distress factor (labeled demoralization), removing the items that loaded onto that construct, and restructuring the clinical scales to increase specificity.

Although self-report measures purport to focus on unique syndromes, they may be host to the same problem. For example, Arbisi and colleagues [11••] found that the BDI-II and PCL-M performed equally well in identifying PTSD diagnosis based on the CAPS in a sample of veterans. The authors suggested that this overlap might be due to the measurement of a more generalized, nonspecific distress construct. They also noted that the BDI-II and PCL-M were highly correlated (r = .77). This finding has been replicated in other studies as well. For example, a review of the military, civilian, and specific versions of the PCL reported correlation coefficients between these measures and the BDI-II ranging from .43 in the military version to .76 in the civilian version [12•]. These correlations were larger than the correlations between the PCL and the CAPS, another measure of PTSD. In the military sample, the PCL and CAPS were correlated at .30 at baseline and .62 at 9 months post-treatment; in the civilian sample, the PCL and CAPS were correlated at .63. The updated PCL-5 based on DSM-5 criteria has continued to demonstrate a strong association with the BDI-II (r = .64) [13•].

PCL-M and BDI-II: a research example

The data presented here were drawn from the study of post-deployment mental health (PDMH) [14] and a subsequent neurocognitive study, each conducted by the Mid-Atlantic Mental Illness Research, Education, and Clinical Center (MA-MIRECC). Both studies were reviewed and approved by the local Institutional Review Board. Written and verbal informed consent was obtained from all participants. Welfare and privacy of human subjects were protected and maintained. The PDMH is open to any veteran who has served in the military since September 11, 2001 and consists of a research rather than a clinical sample. Participants complete self-report questionnaires assessing health, medication and substance use, sleep, head injuries, and psychological functioning and provide blood and serum samples. Psychological diagnoses are assessed using the Structured Clinical Interview for DSM-IV Disorders (SCID-I) [15]. Participants meeting additional criteria, including no combat prior to 1985, no moderate or severe head injuries, no pre-military PTSD, and no present psychosis or substance use disorders, were invited to complete the neurocognitive study. Participants completed a fixed neuropsychological battery and additional measures of psychopathology. We examined the association between both the PCL-M and BDI-II and current diagnoses of depressive disorders (Dep) and PTSD. The depressive disorders group included participants who met criteria for major depressive disorder, dysthymic disorder, or depression not otherwise specified.

The overall sample of 250 Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF) veterans was primarily male (n = 222, 88.8%) and White (n = 180, 72.3%) or Black (n = 70, 28.1%). Detailed diagnosis information is presented in Table 1. Within this sample, 94 participants met criteria for PTSD and 60 met criteria for a depressive disorder; 43 participants met criteria for both PTSD and a depressive disorder. Two subsamples were identified, one that excluded participants with a PTSD diagnosis (no PTSD, n = 156) and one that excluded participants with a depression diagnosis (no Dep, n = 190). Characteristics of the overall sample as well as each subsample are presented in Table 2. Within each sample (overall, no PTSD, and no Dep), point-biserial correlations were calculated between the dichotomous diagnosis variables (Dep and PTSD) and the continuous self-report measure scores (PCL-M and BDI-II). All correlations were significant and are provided in Table 3.
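The subsample sizes follow directly from the diagnosis counts reported above, as the short sketch below shows. The four input counts come from the text. The remaining quantities are derived by simple arithmetic, and the variable names are illustrative only.

```python
# Counts reported in the text.
n_total = 250  # overall sample
n_ptsd = 94    # met criteria for PTSD
n_dep = 60     # met criteria for a depressive disorder
n_both = 43    # met criteria for both

# Subsamples formed by excluding one diagnosis.
n_no_ptsd = n_total - n_ptsd  # 156, the "no PTSD" subsample
n_no_dep = n_total - n_dep    # 190, the "no Dep" subsample

# Derived breakdown of diagnostic overlap.
n_ptsd_only = n_ptsd - n_both       # 51 with PTSD but no depressive disorder
n_dep_only = n_dep - n_both         # 17 with a depressive disorder but no PTSD
n_either = n_ptsd + n_dep - n_both  # 111 with at least one of the two diagnoses
n_neither = n_total - n_either      # 139 with neither diagnosis

# Share of the overall sample with both diagnoses.
pct_both = round(100 * n_both / n_total, 1)  # 17.2
```

By these counts, the no-Dep subsample retains 51 participants with PTSD, and the no-PTSD subsample retains 17 participants with a depressive disorder.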

Table 1 Diagnoses of overall sample
Table 2 Characteristics of the overall sample and subsamples by diagnosis
Table 3 Point-biserial correlations between diagnosis and outcome measure by group
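For readers unfamiliar with the statistic, the point-biserial correlation reported in Table 3 can be sketched as follows. It is the Pearson correlation between a dichotomous variable (diagnosis coded 0 or 1) and a continuous score. This is a minimal illustration using the standard formula, not the analysis code used in the study. The function name and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import Sequence


def point_biserial(groups: Sequence[int], scores: Sequence[float]) -> float:
    """Point-biserial correlation between a 0/1 variable and continuous scores.

    Computed as (M1 - M0) / s * sqrt(n1 * n0) / n, where M1 and M0 are the
    group means and s is the population standard deviation of all scores.
    """
    if len(groups) != len(scores):
        raise ValueError("groups and scores must have the same length")
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    n1, n0, n = len(ones), len(zeros), len(scores)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one participant")
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores must vary to compute a correlation")
    return (mean(ones) - mean(zeros)) / sd * sqrt(n1 * n0) / n


# Hypothetical data (illustration only): diagnosis coded 1 = present, 0 = absent.
diagnosis = [0, 0, 0, 1, 1, 1]
pcl_scores = [20, 25, 30, 50, 55, 60]
r_pb = point_biserial(diagnosis, pcl_scores)
```

Because the statistic equals the Pearson correlation with a 0/1 variable, it reflects how strongly scores separate the two diagnostic groups, not whether any particular cut score classifies participants correctly.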

In the overall sample, the BDI-II and PCL-M were highly correlated with one another (r = .86, p < .001), suggesting significant redundancy despite each measure’s claim to assess a nosologically unique disorder. In the overall sample (which included comorbid PTSD and Dep diagnoses), the presence of PTSD was highly correlated with both PCL-M and BDI-II scores. The presence of Dep was similarly highly correlated with scores on both measures. Because PTSD and depression are highly comorbid in veterans (17.2% of our sample had both), we examined these correlations in subsamples. In the no-Dep sample, PTSD remained highly correlated with both the PCL-M and BDI-II. In the no-PTSD sample, depression diagnosis was moderately correlated with both the PCL-M and BDI-II. In the subsample analyses, the correlation between the BDI-II and PTSD diagnosis was larger than the correlation between the BDI-II and depression diagnosis. It would seem, then, that an individual who scores high on the BDI-II may be just as likely to have PTSD as depression.

Treatment

Treatment and research recommendations for assessment of psychopathology

Self-report inventories of psychological symptomatology are appealing to practitioners and researchers due to their brevity, simple administration and scoring, and utility in tracking change in symptoms over time. Prior research has utilized self-report inventories to create diagnostic categories and subsequently apply findings clinically to diagnostic groups. Clinically, practitioners often use cut scores on a measure to screen and/or support a specific diagnosis of PTSD or depression. This becomes problematic when the inventory has poor specificity to the related diagnosis. Recent research has raised questions regarding the specificity of the BDI-II, an instrument commonly used to measure symptoms related to depression (e.g., Arbisi et al. [11••]). Recent studies, including the example provided in this commentary, suggest that the BDI-II may instead be a measure sensitive to general distress, similar to the MMPI-2-RF demoralization scale. Alternatively, the high correlations among the BDI-II, PCL, PTSD diagnosis, and depression diagnosis may reflect symptom overlap between the conditions. However, such high correlations suggest the relationship goes beyond the limited symptom overlap. For example, re-experiencing, arousal, and avoidance symptoms are not expected from depression. Future studies using DSM-5 criteria might further explore these associations. The following recommendations are offered:

Clinical contexts

  • Clinically, the BDI-II continues to be a useful measure of overall distress; however, clinicians are cautioned against assuming high scores result from a depressive disorder specifically and are encouraged to consider other psychiatric conditions as potential contributors.

  • The BDI-II or PCL can be used to follow patient improvement secondary to medication or therapy trials.

  • The BDI-II should not be used to confirm a diagnosis of depression. It can be supplemented with a structured interview tool specifically querying DSM criteria, or a clinical interview to confirm all criteria are met. Similarly, best practice should include the use of an interview tool to confirm a diagnosis of PTSD rather than relying on a PCL-M score.

  • The high correlation between a diagnosis of PTSD and scores on the BDI-II suggests that the measure can be clinically useful in measuring distress in PTSD as well as depressive disorders.

Research contexts

  • Researchers are encouraged to refrain from using the BDI-II or PCL to create probable diagnostic groups based on established or new cut scores. Instead, future studies should use a diagnostic tool such as the SCID, CAPS, or another interview tool to categorize participants into diagnostic categories.

  • Research findings based on studies grouping participants using only brief self-report inventories may include a high number of false positive group assignments. To avoid confusion, researchers using these paradigms are encouraged to use clear wording indicating that results reflect degree of psychological distress; researchers should avoid giving the impression that results are based on diagnoses unless another method for diagnosis was used.

  • These inventories can be useful in research focusing on degree of distress and symptom improvement following therapeutic interventions.

Conclusions

In conclusion, the strengths of the BDI-II and PCL-M lie in their utility as brief measures of symptom distress and their ability to track change in symptoms over time. However, due to their lack of specificity, they should not be used without additional interview tools (either standardized tools or clinical interviews) for diagnostic purposes in research or clinical work. This is not to dissuade the use of either the BDI-II or PCL-M; rather, the measures might better be interpreted as indicators of symptom burden.