Introduction

Anxiety and mood disorders (particularly unipolar major depression) are among the most prevalent psychiatric disorders worldwide and impose a significant burden on the general population. According to the World Health Organization (2017), an estimated 322 million people, or 4.4% of the global population, were suffering from depression. Similar trends were observed for anxiety disorders, which were affecting an estimated 264 million people, or 3.6% of the global population, in 2017 (World Health Organization 2017). In South Korea (hereafter Korea), anxiety and depression have become a prominent public-health and social issue due to the rapid increase in suicides (Oh et al. 2013). The 12-month prevalence of depression among adults was highest among individuals aged 20 to 29 (3%), with slightly greater prevalence among men (3.1%) than among women (2.9%). Moreover, the highest 12-month prevalence of generalized anxiety disorder also occurred in the 20–29 age group (2.4%), but it was more common in women (2.8%) than in men (1.9%) (Hong et al. 2016). In fact, Korea ranked first among the 34 Organization for Economic Cooperation and Development (OECD) countries in suicide mortality rate as of 2017, at 24.3 per 100,000 people (Korean Statistical Information Service (KOSIS) 2017a). According to the OECD report, 45.5% of deaths among individuals aged 20 to 29 years were attributable to suicide (Korean Statistical Information Service (KOSIS) 2017b).

University can be a stressful life transition with increased exposure to stressors. Many Korean university students experience frustration due to the pressure of military service, competition for good grades, and the failure to find work. As a result, anxiety and depression are common among Korean university students and may, in extreme cases, lead to suicide (Jo et al. 2011). Accurate measurement of psychological symptoms is thus vital to enhancing suicide prevention among Korean university students (Park and Seo 2010). Early screening for anxiety and depression in primary care and university settings requires a measurement method that is rapid and easy to apply and has confirmed psychometric properties.

Various measures and models have been developed to assist in the diagnosis of anxious and depressive symptoms. One challenge in the study of the development of these conditions is that anxiety and depression are defined as putatively separate or distinct phenomena at the conceptual level. Nevertheless, previous attempts to quantify these constructs using questionnaires and clinical ratings have often yielded high levels of correlation between self-report measures of anxiety and depression (e.g., Stark and Laurent 2001) and high rates of comorbidity between anxiety and depressive disorders (e.g., Brady and Kendall 1992). Such findings have led psychologists to consider whether anxiety and depression represent different manifestations of the same underlying pathogenesis and to what extent they should be viewed as distinct disorders (Barlow et al. 1996).

At the syndrome level, the high rates of comorbidity between anxiety and depression have spawned theoretical disputes regarding the distinguishability of these constructs (Brown 1996). In this regard, Clark and Watson (1991) concluded, on the basis of a literature review, that although anxiety and depression share a substantial component of general affective distress and other common symptoms, certain features still distinguish the two constructs. Specifically, Clark and Watson (1991) proposed a tripartite model of anxiety and depression to account for the observed overlap between anxiety and depressive symptoms and the diagnostic comorbidity between these affective states. This tripartite model postulates that anxiety and depression can be conceptualized in terms of three dimensions: (a) general distress or negative affect, which occurs in both anxiety and depression; (b) physiological hyperarousal, which is specific to anxiety; and (c) an absence or a low level of positive affect, which is specific to depression.

In another research endeavor, Lovibond and Lovibond (1995) developed a single measure to assess the full range of core symptoms of anxiety and depression while achieving maximum discrimination between the two constructs. During the original factor-analysis testing of the scale, a third factor corresponding to tension, irritability, and agitation emerged and was labeled as stress. Accordingly, Lovibond and Lovibond called their resulting questionnaire the Depression, Anxiety, and Stress Scale (DASS-42), a self-report instrument comprising three subscales: (a) depression, measuring a lack of motivation, low self-esteem, dysphoria, and hopelessness; (b) anxiety, assessing autonomic arousal, physiological hyperarousal, and the subjective feeling of fear; and (c) stress, evaluating irritability, impatience, tension, and persistent arousal. The DASS is commonly used to assess the unique and unrelated aspects of anxiety and depression, along with the third construct of stress, among clinical and non-clinical populations. It has full (42-item) and short (21-item) versions. The shorter version has the practical advantages of faster administration, resulting in improved clinical utility and ease of scoring. To create it, Lovibond and Lovibond selected seven representative items from each scale of the original DASS-42 that had the highest factor loadings.

The psychometric properties of the DASS-21 have been evaluated with various groups, including the general population (Henry and Crawford 2005; Sinclair et al. 2012), elderly individuals (Gloster et al. 2008), multicultural populations (Bottesi et al. 2015; Vasconcelos-Raposo et al. 2013), psychiatric patients (Clara et al. 2001), and adolescents (Moore et al. 2017; Tully et al. 2009). In general, these studies demonstrated high internal consistency of the subscales of the DASS-21, although the Anxiety subscale (range = .74 to .82) has yielded somewhat less favorable results than the other two (Depression range, .82 to .92; Stress range, .81 to .90). Correlations between the three DASS-21 dimensions (with r values ranging from .44 to .75) have been medium to large (Clara et al. 2001; Gloster et al. 2008; Henry and Crawford 2005; Sinclair et al. 2012; Vasconcelos-Raposo et al. 2013). The correlation between the DASS-21 and two other instruments, the Beck Depression Inventory (BDI) and Beck Anxiety Inventory (BAI), has also been examined. The DASS-21 depression subscale has been highly correlated with the BDI (r = .76) but not with the BAI (r = .51); conversely, the anxiety subscale has shown high correlations with the BAI (r = .74) but not the BDI (r = .47) (Gloster et al. 2008). Moreover, Bottesi et al. (2015) found a similar pattern of correlations in both community and clinical samples. The stress subscale has been shown to correlate both with similar measures and with anxiety measures, indicating a broader symptom pattern and the possibility of overlapping features between anxiety and stress.

As for the factor structure of the DASS-21, several studies have conducted exploratory factor analysis (EFA) and/or confirmatory factor analysis (CFA) to determine which factor structure best fits the data. Findings regarding the underlying factor structure of the DASS-21 have been somewhat inconsistent. For example, in the study by Tran et al. (2013), a one-factor solution (with all depression, anxiety, and stress items loading on a single latent factor) emerged from an EFA of data from a non-clinical Vietnamese sample. On the other hand, CFA, which assesses data fit to a priori theoretical assumptions, has indicated alternative factor solutions. For instance, two empirical studies have suggested that a two-factor structure that distinguishes depression from the anxiety and stress dimensions provided the best fit to the data in an adolescent sample and a patient sample, respectively (Apóstolo et al. 2006; Duffy et al. 2005). However, Lovibond and Lovibond (1995) found that a three-factor solution represented the optimal fit of all the structures they tested in their original study of the DASS. Evaluations of the original English-language DASS-21 factor structure in different populations have generally confirmed Lovibond and Lovibond’s hypothesized structure (Brown et al. 1997; Clara et al. 2001; Crawford and Henry 2003; Gloster et al. 2008; Sinclair et al. 2012). Moreover, a consistent three-factor structure for the DASS-21 has also emerged from studies validating foreign versions of the questionnaire. For example, testing of a Chinese-language version both supported the instrument’s internal consistency and confirmed the original three-factor model, as demonstrated by CFA (Lu et al. 2018). Some investigations into the three-factor structure of the DASS-21, however, modified their models by allowing correlated errors between items of the same subscales (Crawford and Henry 2003) and/or cross-loading items (Clara et al. 2001), which in turn provided a better fit to the data.

Although several studies have supported the original three-factor structure, a study by Henry and Crawford (2005) suggested that a bifactor structure consisting of the three depression, anxiety, and stress dimensions plus a general distress factor achieved a better fit to the data than all competing models analyzed. Thus, their findings sustained the hypothesis that the three factors represent individual constructs while also acknowledging the existence of a more general factor that shares a large amount of common variance with all three of them. Other subsequent CFAs have found similar support for a bifactor model of the DASS-21 (Bottesi et al. 2015; Vasconcelos-Raposo et al. 2013).

Another important question surrounding the DASS-21 is whether the instrument has universal applicability across both genders. Gomez et al. (2014) and Lu et al. (2018) found the three-factor model to be invariant across gender in Australian and Chinese adult samples.

Despite the wide utility of the DASS-21, its measurement invariance across gender has remained largely unexplored. This is an important issue for a widely used measure such as the DASS-21, because, in the absence of evidence in this regard, it is not justifiable to compare the DASS-21 scores across gender. In particular, no information is available on the DASS-21’s measurement invariance across gender among Korean university students.

The DASS-21 has been translated into 42 languages (Crawford et al. 2011) and has been empirically validated in diverse cultures. To date, only one study has examined the psychometric properties of the DASS-21 in a Korean adult sample (Lee et al. 2019). Although Lee et al. (2019) confirmed the psychometric properties of the Korean version of the DASS-21, at least three limitations currently prevent its use in Korean university and research contexts. First, the participants in the study by Lee et al. included both a community sample recruited from community health centers and patients diagnosed with a depression disorder by psychiatric physicians. The university student sample might differ qualitatively from clinical samples of anxious or depressive patients. Second, the age of the sample (19 to 79), though typical of questionnaire validation studies, was much broader than that of university students. Third, Lee et al. did not test for measurement invariance between gender groups in their study. Due to these limitations, the previous study does not provide sufficient validation of the applicability of the DASS-21 with Korean university students. Given the high prevalence of depression and anxiety in university students and its associated negative outcomes as well as the need for accurate instruments that efficiently predict and identify risk or distress in students, the DASS-21 may be a particularly useful instrument for this population. Indeed, research on the DASS-21 has suggested that it is a viable tool for initial screening of internalizing symptoms associated with depression, anxiety, and stress in adolescents and university students (Lu et al. 2018; Moore et al. 2017). Therefore, examining the factor structure of the DASS-21 is important to guide its use as a screening instrument with Korean university students. Its usefulness in prevention and early intervention efforts makes it an attractive option for the university context. In addition, using valid measures of internalizing symptomatology is essential for university counseling or health services, so as to gain an accurate understanding of the levels of internalizing distress affecting students who seek help (Moore et al. 2017).

In this study, we evaluated the psychometric properties of the DASS-21 with a sample of Korean university students. Specifically, CFA was performed to investigate and compare the fit of previously suggested models. Measures of invariance were used to determine whether the DASS-21 operated in an equivalent manner across gender. Finally, correlation analyses were conducted between the DASS-21 and three other psychiatric instruments—the Perceived Stress Scale (PSS-10), the Generalized Anxiety Disorder Scale (GAD-7), and the Patient Health Questionnaire (PHQ-9)—to assess the convergent validity of the DASS-21.

Methods

Participants

Data were collected from 582 undergraduate students (234 male and 348 female) enrolled at a private university in Dae Jeon, Korea. The participants’ courses of study included architecture, arts, education, design, and social welfare. Their ages ranged from 18 to 30 years with a mean age of 20.3 years (SD = 2.00). The mean age was 19.8 for female students (SD = 1.51) and 21.0 for male students (SD = 2.39).

Procedure

Ethical approval for this study was obtained from the Institutional Review Board (1041549–190709-SB-76). After that, we made arrangements with academic instructors to have their students complete the questionnaires during scheduled class time. Prior to participating in the survey, students were informed of the study’s purpose and that participation was voluntary, and they were assured of the confidentiality of all information provided. They then completed the paper questionnaire. One of the study authors was present at each administration to provide instructions and collect consent forms from the students. The students needed 10 to 15 min to complete the questionnaire. They received no compensation for their participation.

Measures

Depression, Anxiety, and Stress Scales-21

The Korean version of the DASS-21, as translated and validated by Lee et al. (2019), was used to assess the negative emotional states of depression, anxiety, and stress. This 21-item measure comprises three subscales of seven items each. Item examples include “I felt down-hearted and blue” (depression), “I felt I was close to panic” (anxiety), and “I found it difficult to relax” (stress). Responses are given on a 4-point Likert scale ranging from 0 (“does not apply to me at all”) to 3 (“applies to me very much or most of the time”). Higher scores indicate greater incidence of negative experience over the past week. Total scores of the DASS-21 are doubled to correspond to scores on the 42-item DASS for the purpose of interpreting the severity of each emotional state. Lovibond and Lovibond (1995) posited severity ratings from “normal” to “extremely severe,” based on percentile scores, with the following cutoff points: normal, 0–77; mild, 78–86; moderate, 87–94; severe, 95–97; extremely severe, 98–100.
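The scoring rule described above (seven items per subscale, ratings of 0 to 3, and doubling to align with the 42-item DASS) can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure. The item-to-subscale assignment is an assumption taken from the commonly published DASS-21 scoring key and should be checked against the form actually administered; the function name `score_dass21` is hypothetical.

```python
# Item numbers (1-21) per subscale.
# ASSUMPTION: this is the commonly published DASS-21 scoring key; it is not
# stated in the text, so confirm it against the form actually administered.
SUBSCALE_ITEMS = {
    "depression": (3, 5, 10, 13, 16, 17, 21),
    "anxiety": (2, 4, 7, 9, 15, 19, 20),
    "stress": (1, 6, 8, 11, 12, 14, 18),
}


def score_dass21(responses):
    """Return (raw, doubled) subscale and total scores.

    `responses` maps each item number (1-21) to a rating from 0 to 3.
    """
    if set(responses) != set(range(1, 22)):
        raise ValueError("expected ratings for items 1 through 21")
    if any(r not in (0, 1, 2, 3) for r in responses.values()):
        raise ValueError("each rating must be 0, 1, 2, or 3")

    # Sum the seven ratings that belong to each subscale.
    raw = {name: sum(responses[i] for i in items)
           for name, items in SUBSCALE_ITEMS.items()}
    raw["total"] = sum(raw.values())

    # Doubling puts DASS-21 scores on the DASS-42 metric.
    doubled = {name: 2 * value for name, value in raw.items()}
    return raw, doubled


# Hypothetical respondent who answers 1 on every item.
raw, doubled = score_dass21({item: 1 for item in range(1, 22)})
print(raw)      # {'depression': 7, 'anxiety': 7, 'stress': 7, 'total': 21}
print(doubled)  # {'depression': 14, 'anxiety': 14, 'stress': 14, 'total': 42}
```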

Patient Health Questionnaire-9

The Patient Health Questionnaire-9 (PHQ-9; Kroenke et al. 2001) is a 9-item, self-report instrument that assesses the severity of depression. Respondents rate the frequency with which they had experienced a variety of depressive symptoms within the past 2 weeks. Responses are on a 4-point scale ranging from 0 (“not at all”) to 3 (“nearly every day”). Total scores can thus range from 0 to 27, with higher scores reflecting more severe depressive symptoms. The present study used the Korean version of the PHQ-9 (available at the Patient Health Questionnaire website), which has demonstrated acceptable psychometric properties (internal consistency, test-retest reliability, known-groups validity, and convergent validity) among Koreans. The PHQ-9 had an internal reliability of α = .83 in the current study.

Generalized Anxiety Disorder-7

The Generalized Anxiety Disorder-7 (GAD-7; Spitzer et al. 2006) is a 7-item, self-report inventory that measures the symptoms of worry and anxiety. Each item is scored on a 4-point scale ranging from 0 (“not at all”) to 3 (“nearly every day”), with a resulting total score between 0 and 21. The GAD-7 has exhibited adequate psychometric properties (Kroenke et al. 2010), and its Korean version is available at the Patient Health Questionnaire website. The GAD-7 had an internal reliability of α = .91 in the current study.

Perceived Stress Scale-10

The Perceived Stress Scale-10 (PSS-10; Cohen et al. 1983) is a 10-item, self-report questionnaire that measures subjective perceptions of and emotional response to stress. Each item is scored on a 5-point Likert scale, with responses ranging from 0 (“never”) to 4 (“very often”); the total score can thus range from 0 to 40, and higher scores indicate greater perceived stress. The PSS-10 has shown adequate psychometric properties in the Korean context (Lee and Jeong 2019). The PSS-10 had an internal reliability of α = .82 in the current study.

Statistical Analysis

All statistical analyses in this study were conducted using IBM SPSS and AMOS v20 (Arbuckle 2011). Before conducting the data analyses, we reviewed all responses to search for missing values. The amount of missing data was minimal, representing less than 1% of the total number of cases in the dataset. Missing values were replaced using the SPSS expectation-maximization algorithm.

Four competing models of the latent factor structure of the DASS-21, based on the relevant theories and prior empirical research, were assessed using CFA, applying a maximum-likelihood procedure. Model 1 was a one-factor solution, with the 21 items of the DASS-21 loading onto a single latent factor. Model 2 was a correlated two-factor model, with depression items loading on one factor and anxiety and stress items loading on another. Model 3 was a correlated three-factor model in which the three latent variables were represented by depression, anxiety, and stress. The final model (Model 4) tested was a bifactor model including a general distress factor, onto which all items were allowed to load, and orthogonal specific factors of depression, anxiety, and stress.
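The degrees of freedom of the three correlated-factor models follow from a simple parameter count, sketched below. The identification choice (factor variances fixed to 1, no mean structure) is an assumption; the text does not say how the models were identified. The bifactor model is left out because its count depends on constraints that are not described here.

```python
N_ITEMS = 21

# Unique variances and covariances among p observed items: p(p + 1) / 2.
n_moments = N_ITEMS * (N_ITEMS + 1) // 2


def cfa_df(n_factors):
    """df for a correlated-factors CFA in which each item loads on one factor.

    ASSUMPTION: factor variances are fixed to 1 for identification and means
    are not modeled, so the free parameters are the 21 loadings, the 21
    residual variances, and the covariances among the factors.
    """
    n_factor_covariances = n_factors * (n_factors - 1) // 2
    n_free = N_ITEMS + N_ITEMS + n_factor_covariances
    return n_moments - n_free


for label, k in [("Model 1", 1), ("Model 2", 2), ("Model 3", 3)]:
    print(label, "df =", cfa_df(k))
# Model 1 df = 189
# Model 2 df = 188
# Model 3 df = 186
```

Fixing one loading per factor and freeing the factor variance instead would give the same counts.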

To evaluate the fit of the tested models, the following fit indices were examined: the chi-square statistic (χ2) and its ratio to the number of degrees of freedom (χ2/df); the comparative fit index (CFI); the goodness-of-fit index (GFI); the root mean square error of approximation (RMSEA) and its 90% confidence interval (90% CI); and the standardized root mean square residual (SRMR). The ratio of chi-square to degrees of freedom should be less than 5 (Marsh et al. 2004). Usually, indices greater than .90 for CFI and GFI and lower than .08 for RMSEA and SRMR are interpreted as indicating acceptable fit (Fan and Sivo 2007; Hu and Bentler 1996; Marsh et al. 2004). Chi-square difference tests were used to determine whether the models differed significantly from one another. In addition, the best-fitting factor solution was assessed for measurement invariance across gender using multi-group CFA. Invariance was tested at the configural (equal model structures), metric (equal factor loadings), and scalar (equal intercepts) levels (Van de Schoot et al. 2012). The more constrained model was selected if the following four criteria suggested by Cheung and Rensvold (2002) and Chen (2007) were met: (a) the χ2 difference value (∆χ2) was not statistically significant (p > .05); (b) the difference in CFI (∆CFI) was lower than .01; (c) the difference in RMSEA (∆RMSEA) was lower than .015; and (d) the difference in SRMR (∆SRMR) was lower than .03.
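The absolute cutoffs listed above can be written as a small checklist. The sketch below is illustrative only: the function name `meets_cutoffs` is hypothetical, and the example values are invented for demonstration rather than taken from the study.

```python
def meets_cutoffs(chi2, df, cfi, gfi, rmsea, srmr):
    """Check each fit index against the cutoffs stated in the text."""
    return {
        "chi2/df < 5": chi2 / df < 5,
        "CFI > .90": cfi > 0.90,
        "GFI > .90": gfi > 0.90,
        "RMSEA < .08": rmsea < 0.08,
        "SRMR < .08": srmr < 0.08,
    }


# Hypothetical fit values, for illustration only.
checks = meets_cutoffs(chi2=450.0, df=150, cfi=0.95, gfi=0.91,
                       rmsea=0.06, srmr=0.04)
for rule, passed in checks.items():
    print(f"{rule}: {'met' if passed else 'not met'}")
```

Reporting each criterion separately, rather than a single pass/fail flag, makes it visible when a model satisfies some cutoffs but not others.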

There is no agreed-upon standard as to the absolute minimum sample size required for CFA, but various recommendations have been presented. For example, Comrey and Lee (1992) claimed that a sample size of 500 is very good for a factor analysis. Thus, the sample size of 582 in the current study should be considered adequate, particularly in view of the relative simplicity of the investigated models.

To examine convergent validity of the DASS-21, patterns of correlation between the DASS-21 subscales and the PHQ-9, GAD-7, and PSS-10 were assessed using Pearson’s r. After we examined the goodness of fit of the alternative measurement models for the DASS-21, the most appropriate model was selected and descriptive statistics were computed. Internal consistency was then assessed using Cronbach’s alpha.
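For readers less familiar with the two statistics named above, a minimal sketch of Cronbach's alpha and Pearson's r using only the Python standard library is given below. The analyses in this study were run in SPSS; the function names and the small data set here are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; `item_scores` holds one row of item ratings per respondent."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))
    # Sum of the item variances versus the variance of the summed score.
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings: five respondents, three items.
ratings = [[0, 1, 0], [1, 1, 2], [2, 2, 2], [3, 2, 3], [1, 0, 1]]
print(round(cronbach_alpha(ratings), 3))

# Hypothetical subscale totals and scores on an external measure.
subscale = [sum(row) for row in ratings]
external = [2, 5, 7, 9, 4]
print(round(pearson_r(subscale, external), 3))
```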

Results

Factor Structure

Table 1 summarizes the goodness-of-fit indices for the alternative CFA models. First, a one-factor model was tested to examine whether the DASS-21 should be best understood as a general measure of negative affectivity, rather than as measuring differentiated states associated with depression, anxiety, and stress. As shown in Table 1, the one-factor model for the DASS-21 demonstrated adequate to poor fit to the data (χ2 = 1017.3, df = 189; χ2/df = 5.38; CFI = .872; GFI = .88; RMSEA = .087 (90% CI = .082–.092); SRMR = .049). The two-factor model, which distinguishes depression from the anxiety and stress dimensions, fit the data marginally better than the one-factor model, as evidenced by a decrease in the χ2 value and improved results for CFI, GFI, RMSEA, and SRMR; however, the fit indices did not meet the accepted criteria (χ2 = 951.3, df = 188; χ2/df = 5.06; CFI = .889; GFI = .89; RMSEA = .084 (90% CI = .078–.089); SRMR = .047). In contrast to the first two models, the original three-factor model of the DASS-21 showed adequate fit to the data (χ2 = 921.8, df = 186; χ2/df = 4.96; CFI = .904; GFI = .89; RMSEA = .083 (90% CI = .077–.088); SRMR = .046). Correlations between factors in the three-factor oblique model were strong: anxiety-depression r = .69, anxiety-stress r = .74, and depression-stress r = .69. However, the bifactor model resulted in the best solution (χ2 = 713.2, df = 171; χ2/df = 4.17; CFI = .934; GFI = .92; RMSEA = .074 (90% CI = .068–.080); SRMR = .042), with significant improvements over the three-factor model. The χ2/df value for the bifactor model was smaller than those for the one-, two-, and three-factor models, which is another indication of good fit. Furthermore, the χ2 difference tests revealed that the bifactor model provided a significantly better fit to the data than the one-factor model (∆χ2 (18) = 304.1, p < .001), the two-factor model (∆χ2 (17) = 238.1, p < .001), and the three-factor model (∆χ2 (15) = 208.6, p < .001). The conceptual model evaluated in the CFA can be seen in Fig. 1.
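The χ2/df ratios and RMSEA point estimates above can be recomputed from the reported χ2, df, and sample size. The sketch below assumes the common point-estimate formula RMSEA = sqrt(max(χ2 − df, 0) / (df(N − 1))). Some programs use N rather than N − 1 in the denominator, and the text does not state which convention was applied.

```python
from math import sqrt

N = 582  # sample size reported in the Participants section

# (chi-square, df) as reported in the text for each model.
models = {
    "one-factor": (1017.3, 189),
    "two-factor": (951.3, 188),
    "three-factor": (921.8, 186),
    "bifactor": (713.2, 171),
}


def rmsea(chi2, df, n):
    """RMSEA point estimate; ASSUMES the (n - 1) convention in the denominator."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


for name, (chi2, df) in models.items():
    print(f"{name}: chi2/df = {chi2 / df:.2f}, RMSEA = {rmsea(chi2, df, N):.3f}")
```

Under this assumption the recomputed values agree with the reported ones to the printed precision.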

Table 1 Goodness-of-fit indices of models for the DASS-21 (N = 582)
Fig. 1 The conceptual bifactor model of the DASS-21

Table 2 presents the standardized factor loadings of the bifactor model. Each item loaded strongly on the general distress factor. Loadings on the general distress factor were stronger than those on the corresponding specific factors, ranging from .49 to .79 for the general distress factor and from .11 to .67 for the specific factors. Most notably, the item loadings onto the specific factors of depression, anxiety, and stress were weak, with three to six items on each specific factor having acceptable loadings (i.e., > .30). All loadings associated with the general factor were significant at p < .05 and had a satisfactory size, whereas three loadings associated with the specific factors were not significant.

Table 2 Standardized factor loadings for bifactor model of the DASS-21

Multi-Group Analysis

Factorial invariance tests were performed by fitting the bifactor solution to the data for male and female students (Table 3). The bifactor model was used as a baseline model to test a series of increasingly restrictive models, beginning with configural invariance and proceeding to metric and scalar invariance. The results supported configural invariance for the bifactor model, indicating adequate model fit (χ2 = 803.4, df = 211; χ2/df = 3.81; CFI = .932; RMSEA = .073 (90% CI = .068–.081); SRMR = .041). Furthermore, a test for metric invariance also found good fit (χ2 = 824.1, df = 235; χ2/df = 3.51; CFI = .931; RMSEA = .075 (90% CI = .070–.083); SRMR = .044). The χ2 difference and the changes in CFI, RMSEA, and SRMR (∆χ2 (24) = 20.7, p > .05; ∆CFI = −.001, ∆RMSEA = .002, ∆SRMR = .003) indicated that constraining the factor loadings to equality resulted in a nonsignificant change in model fit as compared with the configural model, thereby supporting full metric invariance across genders. Finally, the scalar invariance model showed acceptable fit to the data (χ2 = 849.5, df = 257; χ2/df = 3.31; CFI = .927; RMSEA = .080 (90% CI = .084–.094); SRMR = .053). The chi-square difference test indicated no significant change in model fit between the metric and scalar models across gender (∆χ2 (22) = 25.4, p > .05). There was no meaningful decrement in fit from the metric to the scalar model with equal thresholds for both genders (∆CFI = −.004, ∆RMSEA = .005, ∆SRMR = .009). Overall, these findings suggest that the DASS-21 is factorially invariant between male and female students in relation to the bifactor solution.
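The nested-model comparisons in this section can be retraced from the reported fit statistics. The sketch below applies the four retention criteria described in the Statistical Analysis section to each step; it is not the AMOS procedure used in the study. The helper `chi2_sf` is a standard-library stand-in for a chi-square tail probability, and both function names are hypothetical.

```python
from math import erfc, exp, lgamma, log, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1) with y = x / 2,
    starting from Q(1, y) = exp(-y) (even df) or Q(1/2, y) = erfc(sqrt(y)) (odd df).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-y)
    else:
        a, q = 0.5, erfc(sqrt(y))
    while a < df / 2.0:
        q += exp(a * log(y) - y - lgamma(a + 1.0))
        a += 1.0
    return q


def invariance_step(less, more, alpha=0.05):
    """Compare a more constrained model (`more`) with a less constrained one."""
    d_chi2 = more["chi2"] - less["chi2"]
    d_df = more["df"] - less["df"]
    p = chi2_sf(d_chi2, d_df)
    criteria = {
        "chi-square difference not significant": p > alpha,
        "CFI drop < .01": less["cfi"] - more["cfi"] < 0.01,
        "RMSEA rise < .015": more["rmsea"] - less["rmsea"] < 0.015,
        "SRMR rise < .03": more["srmr"] - less["srmr"] < 0.03,
    }
    return d_chi2, d_df, p, criteria


# Fit statistics as reported in the text.
configural = {"chi2": 803.4, "df": 211, "cfi": 0.932, "rmsea": 0.073, "srmr": 0.041}
metric = {"chi2": 824.1, "df": 235, "cfi": 0.931, "rmsea": 0.075, "srmr": 0.044}
scalar = {"chi2": 849.5, "df": 257, "cfi": 0.927, "rmsea": 0.080, "srmr": 0.053}

for label, less, more in [("metric vs. configural", configural, metric),
                          ("scalar vs. metric", metric, scalar)]:
    d_chi2, d_df, p, criteria = invariance_step(less, more)
    print(f"{label}: delta chi2({d_df}) = {d_chi2:.1f}, p = {p:.2f}, "
          f"all criteria met = {all(criteria.values())}")
```

With the reported values, both chi-square differences have tail probabilities well above .05, in line with the nonsignificant changes described in the text.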

Table 3 Goodness-of-fit indices for tests of measurement invariance across genders for the DASS-21

Descriptive Statistics

Table 4 reports the means and standard deviations for the three scales and the total scale of the DASS-21 for males and females and the total sample, as well as Cronbach’s alpha coefficients and the correlations among the three subscales for the sample. The mean DASS-21 total score for the sample was 12.43 (SD = 11.97). No gender difference in DASS-21 total scores was found, t(580) = 1.40, p = .16, indicating that both males (M = 12.22, SD = 10.97) and females (M = 12.38, SD = 11.35) reported similar levels of general distress. In addition, the mean scores for depression, anxiety, and stress were computed by gender. Male and female students did not differ significantly on the mean scores of depression (M = 3.86, SD = 4.90 for males vs. M = 4.42, SD = 4.34 for females, t(580) = 1.44, p = .15), anxiety (M = 2.96, SD = 3.87 vs. M = 3.19, SD = 3.50, t(580) = .72, p = .47), or stress (M = 5.40, SD = 4.38 vs. M = 4.77, SD = 5.15, t(580) = 1.59, p = .11).
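An independent-samples t statistic can be approximated from summary statistics alone. The sketch below assumes a pooled-variance (Student's) t test, which is consistent with the reported df of 580 but is not stated explicitly in the text; the function name `pooled_t` is hypothetical. Because the inputs are rounded means and standard deviations, recomputed values will only approximate the reported statistics.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups from means, SDs, and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Depression subscale, females (n = 348) vs. males (n = 234), as reported.
t, df = pooled_t(4.42, 4.34, 348, 3.86, 4.90, 234)
print(f"t({df}) = {t:.2f}")
```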

Table 4 Means, Standard Deviations, Cronbach’s αs, and inter-correlations between the DASS-21 subscales and total score, and doubled DASS-21 scores

To examine the reliability of the scales and the total score, Cronbach’s alpha estimates were computed. The DASS-21 had an internal reliability of .90 for depression, .84 for anxiety, .88 for stress, and .95 for the total DASS-21 score across the whole sample. Intercorrelations between the subscales and total DASS-21 scores ranged from a low of .77 to a high of .85.

Convergent Validity

The convergent validity of the DASS-21 was determined by Pearson correlations with other measures of similar constructs used in the study (see Table 5). The depression score of the DASS-21 was correlated more strongly with the PHQ-9 (r = .69) than with the GAD-7 (r = .49) and PSS-10 (r = .32). Likewise, the anxiety score on the DASS-21 had a stronger correlation with the GAD-7 (r = .73) than with the PHQ-9 (r = .49) and PSS-10 (r = .45). Lastly, the stress score of the DASS-21 was correlated more strongly with the GAD-7 (r = .62) and PSS-10 (r = .53) than with the PHQ-9 (r = .34). These correlations were medium to large in size, suggesting an adequate specificity of the three DASS-21 subscales. Overall, these results indicated adequate convergent validity of the DASS-21.

Table 5 Correlations between the DASS-21 subscales and the PHQ-9, GAD-7 and PSS-10

Discussion

The purpose of this study was to evaluate the Korean version of the DASS-21 with regard to its factor structure, internal consistency, and convergent validity. Four competing models were specified and tested. Consistent with several other studies (Clara et al. 2001; Crawford and Henry 2003; Henry and Crawford 2005), the one- and two-factor models provided inadequate fits to the data. This result should not be surprising, since these models have rarely been supported and since the DASS was originally developed to assess the multiple dimensions of depression, anxiety, and stress symptoms.

The fit indices of the original three-factor model were adequate, but the bifactor model with three specific factors yielded slightly better fit indices overall with the present sample of university students. The results are in line with the findings of recent studies that support the bifactor structure of the DASS-21 (Bottesi et al. 2015; Vasconcelos-Raposo et al. 2013). Notably, our results provide strong evidence of a common general distress factor (Clark and Watson 1991; Lovibond and Lovibond 1995), although at a fundamental level, the scales of depression, anxiety, and stress represent legitimate and consistent dimensions of each specific emotional syndrome (Vasconcelos-Raposo et al. 2013). In other words, although the three DASS-21 scales index a substantial common general distress factor, they also measure phenomena specific to each scale. Therefore, both the shared and the unique features of depression, anxiety, and stress are important in constructing a comprehensive explanation of the high co-occurrence of these disorders in clinical practice (Bottesi et al. 2015). In line with findings by Henry and Crawford (2005), our support for the bifactor model indicates that the total score could appropriately be used as a measure of general psychological distress. The loadings of items on the general distress factor were higher than those for the specific factors of depression, anxiety, and stress; this finding may have contributed to the reduction of the correlation between the depression and anxiety factors. Tully et al. (2009) found that the correlation between these two psychological dimensions was significantly reduced when a general distress factor (i.e., negative affectivity) was included in the bifactor structure.

From factorial invariance testing, we found evidence for configural, metric, and scalar invariance of the DASS-21 across gender when modeled as a bifactor structure. These findings suggest that the items were interpreted in a similar manner by both male and female students. Although there are differences in the prevalence of depression and anxiety symptoms between men and women, as noted in the introduction, these differences may not affect the use of the DASS-21 in Korea. Moreover, Gomez et al. (2014) and Lu et al. (2018) found that the DASS-21 is invariant across gender in Australian adult and Chinese university student samples. Our findings offer further evidence regarding the validity of the DASS-21 as a reliable and equivalent measure for male and female university students in non-clinical settings.

With respect to psychometric properties, the Korean version of the DASS-21 demonstrated excellent reliability. The internal consistency of all three scales and the total score was higher than .80; this result is similar to those of previous studies using the Korean version and versions in other languages (Lee et al. 2019; Vasconcelos-Raposo et al. 2013). These findings suggest that the DASS-21 exhibits good internal consistency across different languages. However, these reported values tend to be lower than the internal consistency estimates for the full version of the DASS-42, as one would expect since Cronbach’s alpha is strongly affected by the number of items. Abell et al. (2009) suggested that α should be at least .85 if an instrument is to be used to draw inferences concerning an individual. Therefore, our results suggest that the three scales measured by the DASS-21 can be used in either separate or combined forms to contribute to the broader clinical assessment of such syndromes.

Inter-correlations between scales were high, consistent with the values observed in previous studies. The total score of the DASS-21 displayed strong correlations with all three subscales of depression, anxiety, and stress. Consistent with the results obtained from the CFA, this result may imply that the total score can be used as a measure of general distress.

The correlations of the three subscales with other psychiatric instruments that measure similar constructs were satisfactory overall and supported the validity of the DASS-21 subscales. The DASS-21 depression and anxiety subscales showed specific associations with the corresponding measures of these disorders, thereby supporting the usefulness of these constructs. In contrast, the DASS-21 stress subscale showed a broader pattern, correlating with both the GAD-7 and the PSS-10. This finding suggests that the stress construct may capture symptoms of agitation, tension, and irritability that are observed in both anxiety and stress (Clark and Watson 1991). Only a few studies of the DASS-21 have included a specific measure of stress, and they reported similar findings (e.g., Bottesi et al. 2015). Hence, this result is not surprising, and it seems to confirm the overlapping features of anxiety and stress.

It is important to consider the practical implications of the CFA results for the use of the DASS-21 in both research and university practice. Our results suggest that the general distress factor is associated with most of the variation in DASS-21 scores. Therefore, our findings support the use of the DASS-21 as a screening tool to identify general distress based on the total score. This approach should have important advantages (e.g., controlling for general psychopathology and making the screening process simple and convenient) in both research and practice. As Henry and Crawford (2005) suggested, the three subscales could also be administered separately, with the caveat that each component shares a large portion of variance with general distress.

Several limitations must be considered when interpreting these results. First, the sample was recruited from an undergraduate population and was relatively homogeneous, limiting the generalizability of the results to samples of varying ages and backgrounds. Replication studies with more heterogeneous samples (i.e., clinical and non-clinical community populations) are needed. However, obtaining research samples that are both sufficiently large and diverse would be extremely difficult. Second, only self-reported data were included, and such data can incorporate social desirability bias or shared method variance. Future studies would benefit from including additional measures when assessing psychiatric diagnoses, such as clinical interviews and physiological assessment. Third, examination of the loadings of items onto the specific factors of depression, anxiety, and stress indicated that one item (number 5) from the depression scale, two items (2 and 9) from the anxiety scale, and two items (1 and 11) from the stress scale did not show strong specificity to their relevant factor. Items 2, 5, 9, and 11 also showed low factor loadings in previous studies (Bottesi et al. 2015; Clara et al. 2001; Shea et al. 2009), but no evidence regarding the low specificity of item 1 to its relevant factor has been previously reported. Eliminating items with weak or non-significant loadings is a complex issue, as it entails reducing the number of items on an established questionnaire. Doing so can make the instrument shorter and more precise, but it may also prevent comparisons between results obtained with the newly altered scale and those obtained by administering the original version (Bottesi et al. 2015). In spite of the potential weaknesses that may arise from retaining all the items, the original DASS-21 is widely used in many countries, so maintaining the same version is considered best for the purpose of making comparisons. Moreover, removing weak or non-significant items would not be the best solution in the present situation because (1) the bifactor model achieved a significantly better fit than the competing models; (2) the bifactor model appeared to be the most appropriate one in representing the observed data, since all loadings on the general factor were significant at p < .05 and had an adequate size; and (3) estimates of internal consistency were very good for all the DASS-21 scale scores in our student sample, and no indication calling for the removal of any item emerged.

In conclusion, despite the above-mentioned limitations, the present study provides support for the factor structure and convergent validity of the Korean version of the DASS-21 in a university student sample. Internal consistency was also satisfactory, indicating that the DASS-21 can be applied in research and practice. Although the optimal model for the DASS-21 is still a subject for further research, the results from our CFA provide further support for a bifactor model that includes a common general distress factor and three orthogonal factors of depression, anxiety, and stress. Additionally, the current study has provided the first indications of factorial invariance of a bifactor solution across gender in our non-clinical sample of university students in Korea. The current findings strengthen the evidence that the DASS-21 may be used both to differentiate between psychological problems (i.e., depression, anxiety, and stress) and as a measure of overall psychological distress.