Introduction

South Africa has recently seen a spike in suicide rates among university students (van Zyl et al. 2017). Higher stress levels have been implicated, among other factors, in the rising figures of suicidality among South African university students (Bantjes et al. 2016). Global stress researchers and practitioners have three versions of the Perceived Stress Scale (PSS-14, PSS-10 and PSS-4; Cohen and Williamson 1988) at their disposal. However, only the PSS-10 has seen widespread use, largely because of its conciseness and its superior psychometric properties compared to the other two versions. Cohen developed the PSS based on Lazarus’s transactional model, in which perceived stress arises from the interaction between the appraised stressor and perceived coping (Lazarus and Folkman 1984). The instrument is popular enough to be available in more than 25 languages other than English (Lee 2012). Its currency notwithstanding, the factorial validity of this version remains contested among researchers. Specifically, the debate centres on whether the PSS-10 is a bifactor or a two-factor instrument, and on why the two dimensions explain a low percentage of the total variance in exploratory factor analysis (EFA) (Lee 2012; Michaelides et al. 2016). A number of factors, including (a) the use of different analytic methods, (b) sample sizes, (c) sample heterogeneity, and (d) participant characteristics (clinical vs. nonclinical), have been implicated in the contradictory PSS factor structure findings (see Lee and Jeong 2019).

While the measure was originally developed as a unidimensional measure of perceived stress (Cohen and Williamson 1988) comprising negatively and positively worded items, subsequent research has suggested that the direction of item wording reflects distinct factors, with negatively worded items representing perceived distress and positively worded items representing perceived self-efficacy (Barbosa-Leiker et al. 2013). The two-factor solution has further received overwhelming support from studies that compared it with the one-dimensional model (Baik et al. 2019; Dao-Tran et al. 2017; Hong et al. 2016; Kaya et al. 2017; Khalili et al. 2017; Manzar et al. 2019; Mondo et al. 2019; Nielsen et al. 2016), although some results indicate that it too may not be very stable, because the solution did not consistently account for more than 50% of the total variance in the 10 items across studies.

Moreover, research that investigated the relations of the two factors with external constructs (e.g., anxiety and depression), as a way of seeking discriminant evidence between them, found only a minor distinction between the two factors (i.e., a difference in the magnitude of the correlations), partly supporting Cohen and Williamson’s (1988) recommendation that practitioners should use the composite score when scoring the PSS. Alternatively, Dougherty et al. (2017), using Rasch analysis, found evidence for the unidimensional conceptualisation, while Bradbury (2013) reported superior fit for a correlated three-factor solution (measuring Distress, Perceived self-efficacy and Emotional reactivity). However, the correlated three-factor model has attracted only limited attention from the research community (e.g., Denovan et al. 2017).

More recent work on the dimensionality of the PSS-10 has resulted in an even more differentiated measurement model. A bifactor model has been suggested as a superior alternative to the one- and two-factor models (Denovan et al. 2017; Michaelides et al. 2016; Perera et al. 2017). This takes advantage of advances in latent variable modelling (Reis et al. 2010), wherein bifactor models are proposed as an alternative in instances (such as with the PSS) where both the one-factor and two-factor solutions are considered inadequate. The bifactor solution considers the PSS-10 to be sufficiently unidimensional to permit the use of global scores, while retaining the importance of the measure’s domain-specific factors (Reise et al. 2007). The PSS-10 bifactor solution comprises a general perceived stress factor and two orthogonal specific factors (i.e., Distress and Perceived self-efficacy). This conception is not unlike Lazarus’s model, which has always concerned both the general level and the specific manifestations of stress, making the bifactor model a plausible theoretical model of perceived stress. Clarity concerning the factor structure of the PSS-10 is important, as it also has implications for the identification of morbidity rates for psychological stress.

In addition, although Cohen et al. (1983) suggested that the PSS measures general stress and is therefore free of content particular to any population, studies have found mixed results with regard to group-related differences in PSS scores (Gitchel et al. 2011). This has led to calls in the literature for the PSS to be validated with diverse populations and across different cultures (e.g., Lee 2012).

The PSS-10 has similar traction among South African practitioners and researchers (Magalhaes Das Neves et al. 2014), yet its psychometric properties have rarely been investigated in this country. The limited validation of health outcome measures across diverse groups is implicated in a number of measurement problems (i.e., inaccuracy and test bias) when these measures are used within diverse populations (Ramírez et al. 2005). As such, to qualify for use with diverse populations, psychological measures should at least demonstrate that they do not measure different constructs for different groups. Accordingly, the present study tests the validity of the PSS-10 in a diverse group of South African university students. University students are a common cohort for psychometric investigations of the PSS-10 because they experience significant levels of stress (e.g., financial, academic, and personal stress) (Lee 2012). Stress is a widespread mental health problem, not only in the general population but also among university students. In fact, university students have consistently recorded a higher prevalence of stress than the general population (van Zyl et al. 2017).

A prevalence of moderate to extremely severe stress has been recorded for South African university students (van Zyl et al. 2017). This high level of psychological stress has recently been implicated, among other factors, in the rising figures of suicidality and depression in university students (Bantjes et al. 2016; Pereira-Morales et al. 2019; Zajenkowska et al. 2019; Zhang et al. 2015). For instance, South African university students are reported to have a higher prevalence of suicidal ideation than both the general population and university students from other countries (e.g., United States and Turkey) (Bantjes et al. 2016). As such, psychometrically valid measures are important for university counselling services, because they assist in identifying and screening those at risk for mental illness and in mitigating health consequences. The PSS-10 is preferable to student-tailored stress measures (such as the College Student Recent Life Experiences Scale, College Student’s Stressful Event Checklist, The Perception of Academic Stress Scale, and Stress and Adversity Inventory [STRAIN]) because of its brevity, ease of use, and broad scope, which make it suitable for application in a variety of settings (Denovan et al. 2017).

Pursuant to the literature on the unresolved factor structure of the PSS-10, cited above, and the limited psychometric data on the measure in South Africa, the objective of the current study was to evaluate the psychometric properties of the PSS-10 for use with South African university students: its dimensionality, its measurement invariance (MI) across gender (male and female) and race (Black African and White), and its reliability. The examination of the measurement invariance of the PSS-10 is important because the PSS (although not designed to examine psychological symptomatology) is used in determining prevalence rates and differences in stress levels across different populations (e.g., gender and race groups). Such studies assume that the PSS is measurement invariant (i.e., that it assesses equivalent constructs in exactly the same manner across the groups concerned). Conclusions drawn about between-group differences using a measure that has not been demonstrated to be invariant across groups may be misleading. Mean scores must have minimal bias when groups are compared.

Method

Participants and Procedure

A sample comprising 862 students (mean age = 21.70 years, SD = 13.51; 72% female; 67% Black African) was recruited from the University of Limpopo (46%) and the University of Pretoria (54%). Participants were recruited from undergraduate classes across the faculties of humanities, natural sciences, and management sciences. The Ethics Committees of both the University of Pretoria and the University of Limpopo approved the study, and the participants consented to voluntary participation. The research complies with the Declaration of Helsinki. The survey was self-administered (paper and pencil) outside of official lecture hours, and no incentive was provided for participation.

Measures

Perceived Stress Scale (PSS-10)

The PSS-10 (Cohen and Williamson 1988) measures stressful life situations and circumstances (see Table 1 for item descriptions). Respondents answer on a 5-point Likert scale (never = 0; almost never = 1; sometimes = 2; fairly often = 3; very often = 4). Items 4, 5, 7 and 8 were reverse scored to obtain the total scale score. The measure showed good internal consistency in the present study (α = .79).
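As an illustration of the scoring rule described above, the following minimal Python sketch reverses items 4, 5, 7 and 8 and sums the ten responses. The function name `score_pss10` and the example responses are hypothetical and are not part of the study materials.

```python
# Items that are reverse scored (1-based item numbers), as stated in the text.
REVERSED_ITEMS = (4, 5, 7, 8)
MAX_RESPONSE = 4  # responses run from 0 (never) to 4 (very often)


def score_pss10(responses):
    """Return the PSS-10 total score for one respondent.

    `responses` is a sequence of ten integers in item order, each from 0 to 4.
    """
    if len(responses) != 10:
        raise ValueError("PSS-10 requires exactly 10 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if not 0 <= value <= MAX_RESPONSE:
            raise ValueError(f"item {item_number}: response {value} is outside 0 to 4")
        if item_number in REVERSED_ITEMS:
            value = MAX_RESPONSE - value  # reverse score
        total += value
    return total


# Hypothetical respondent, for illustration only.
example = [2, 3, 2, 1, 1, 3, 2, 1, 2, 3]
print(score_pss10(example))
```

Total scores computed this way range from 0 to 40, with higher scores indicating higher perceived stress.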

Table 1 Item description abbreviations

Data Analysis

The sample characteristics were analysed with SPSS; Omega and the Greatest Lower Bound were computed with R; and Rasch models were estimated with WINSTEPS (Linacre 2009). A series of five competing CFA models of the PSS-10 was estimated with maximum likelihood (ML) estimation using AMOS 25.0. The following latent factor structures found in the literature were estimated: (1) a unidimensional PSS model, with all 10 items loading on a single factor; (2) a bi-dimensional PSS model, with two correlated latent factors (i.e., Distress: items 1, 2, 3, 6, 9, 10; and Perceived self-efficacy: items 4, 5, 7, 8); (3) a bi-factor model, in which all items loaded on a general factor and on two additional specific factors (the specific factors were orthogonal to each other and independent of the general factor); (4) a bi-factor model with the perceived self-efficacy factor; and (5) a bi-factor model with the distress factor. The following goodness-of-fit indices were used to evaluate the specified models: the chi-square statistic to df ratio (χ2/df), the comparative fit index (CFI), the Tucker-Lewis index (TLI), Akaike’s information criterion (AIC), the Bayesian information criterion (BIC) and the root mean square error of approximation (RMSEA, with associated 90% confidence interval).
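For readers who wish to see how several of these indices relate to the chi-square statistics, the following sketch computes χ2/df, CFI, TLI and RMSEA from their conventional formulas. It is not the AMOS implementation: the function name, the use of N − 1 in the RMSEA denominator (some programs use N), and the example chi-square values are assumptions made for illustration. The degrees of freedom in the example (34 and 45) are those of a two-factor model and of the independence model for 10 items.

```python
import math


def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
    """Compute chi2/df, CFI, TLI and RMSEA from chi-square statistics.

    The baseline is the independence (null) model; n is the sample size.
    """
    # Estimated noncentrality of each model, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, 0.0)

    denominator = max(d_baseline, d_model)
    cfi = 1.0 if denominator == 0 else 1.0 - d_model / denominator

    ratio_model = chi2_model / df_model
    ratio_baseline = chi2_baseline / df_baseline
    tli = (ratio_baseline - ratio_model) / (ratio_baseline - 1.0)

    # Assumption: N - 1 in the denominator.
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))

    return {"chi2_df": ratio_model, "cfi": cfi, "tli": tli, "rmsea": rmsea}


# Hypothetical chi-square values, for illustration only (not the study's results).
result = fit_indices(chi2_model=85.0, df_model=34,
                     chi2_baseline=1500.0, df_baseline=45, n=862)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```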

Models were considered to fit the data well when χ2/df < 1.5, TLI and CFI ≥ .95 (adequate at .92 to .94), and RMSEA < .06. Competing models were compared for parsimony using the AIC and BIC (smaller values on both indices indicate superior fit of the model to the data, with values more than 10 points larger suggesting poor fit; Burnham and Anderson 2004). Based on the best-fitting model, measurement invariance was then examined using multigroup confirmatory factor analysis (MCFA), specifically means and covariance structures (MACS) analysis, to test whether the model was invariant across male and female, and black African and white participants (ΔCFI and ΔRMSEA ≤ 0.02). Three models were examined in the invariance analyses: configural invariance (similar factor structure across gender and race groups, with no constraints imposed); metric invariance (equality constraints on factor loadings between groups); and scalar invariance (equality constraints on factor loadings and item intercepts between groups).
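The invariance decision rule stated above can be sketched as follows. Each more constrained model is compared with the preceding one, and invariance at that step is treated as supported when both ΔCFI and ΔRMSEA are at most 0.02. The function name, the use of absolute differences, and the fit values are assumptions for illustration; the values are not taken from Table 4.

```python
def invariance_steps(fits, threshold=0.02):
    """Compare successive nested models on CFI and RMSEA.

    `fits` is a list of (name, cfi, rmsea) tuples ordered from the least
    to the most constrained model. Returns one tuple per comparison:
    (name of the more constrained model, delta CFI, delta RMSEA, supported).
    """
    results = []
    for previous, current in zip(fits, fits[1:]):
        _, prev_cfi, prev_rmsea = previous
        name, cfi, rmsea = current
        # Assumption: absolute change is compared with the threshold.
        delta_cfi = abs(cfi - prev_cfi)
        delta_rmsea = abs(rmsea - prev_rmsea)
        supported = delta_cfi <= threshold and delta_rmsea <= threshold
        results.append((name, delta_cfi, delta_rmsea, supported))
    return results


# Hypothetical fit values, for illustration only.
hypothetical_fits = [
    ("configural", 0.950, 0.045),
    ("metric", 0.946, 0.044),
    ("scalar", 0.921, 0.050),
]
for name, d_cfi, d_rmsea, ok in invariance_steps(hypothetical_fits):
    print(f"{name}: dCFI={d_cfi:.3f}, dRMSEA={d_rmsea:.3f}, supported={ok}")
```

In this hypothetical case the metric step would be supported and the scalar step would not, because the change in CFI exceeds 0.02.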

The Partial Credit Model within the Winsteps software was employed to analyse the polytomous data. Within the Rasch analysis, two rating scale models (RSM) were adopted, one for coping and one for distress. The information-weighted fit (infit) and outlier-sensitive fit (outfit) mean square (MnSq) statistics were applied to examine item fit (i.e., an item infit or outfit MnSq outside 0.5–1.5 indicated misfit). Average and step measures of the response categories were used to check whether disordered thresholds exist in the PSS. Additionally, infit and outfit MnSq values between 0.5 and 1.5, together with monotonically increasing difficulties, were taken to indicate no disordering (Tennant and Conaghan 2007). Lastly, differential item functioning (DIF) across the gender and race groups was also estimated. Both the statistical test and the DIF contrast were used to determine whether male students interpreted any of the PSS items differently from their female counterparts (i.e., a DIF contrast > 0.5 represents substantial DIF) (Scott et al. 2009). The same was done for the race groups. Reliability was considered acceptable at > .70 for all indices (i.e., Cronbach’s Alpha, separation reliability, Omega, greatest lower bound) (Dunn et al. 2014).
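To make the item-fit criterion concrete, the following sketch computes infit and outfit mean squares for a single polytomous item under a Rasch model with item-specific thresholds, and applies the 0.5 to 1.5 range described above. It is a simplified illustration, not the Winsteps estimation procedure: person measures and thresholds are taken as given rather than estimated, and all names and values are hypothetical.

```python
import math


def category_probabilities(theta, thresholds):
    """Probabilities of categories 0..m for a person measure `theta`."""
    # Cumulative sums of (theta - threshold); category 0 has an empty sum.
    logits = [0.0]
    for delta in thresholds:
        logits.append(logits[-1] + (theta - delta))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def item_fit(responses, thetas, thresholds):
    """Return (infit MnSq, outfit MnSq) for one item."""
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        probs = category_probabilities(theta, thresholds)
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals.append((x - expected) ** 2)
        variances.append(variance)
    # Infit: squared residuals weighted by model variance (information).
    infit = sum(squared_residuals) / sum(variances)
    # Outfit: unweighted mean of squared standardised residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    return infit, outfit


def is_misfit(mnsq, low=0.5, high=1.5):
    """Flag a mean square outside the 0.5 to 1.5 range."""
    return not (low <= mnsq <= high)


# Hypothetical thresholds, person measures and responses (0 to 4).
thresholds = [-1.5, -0.5, 0.5, 1.5]
thetas = [-1.0, -0.3, 0.0, 0.4, 1.2]
responses = [1, 2, 2, 3, 3]
infit, outfit = item_fit(responses, thetas, thresholds)
print(round(infit, 2), round(outfit, 2), is_misfit(infit), is_misfit(outfit))
```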

Results

Descriptive Statistics

Although females (M = 21.85) in the sample reported more experiences of stress than males (M = 20.36) (t = −3.89, p < .001, d = .29), there were no statistically significant racial differences with regard to stress experiences (black Africans: M = 21.54; whites: M = 21.23; t = 0.823, p = .41).

Confirmatory Factor Analysis

CFA was applied to the alternative PSS-10 factor structures found in the literature, which were then compared against each other. Table 2 shows the fit indices of all of the tested models. Model 1 (one-factor) demonstrated poor fit to the data. Model 2 (two-factor) displayed a better fit than Model 1, while the bi-factor model (Model 3) demonstrated a better fit to the data than both Models 1 and 2 (using fit indices only). Both the bi-factor model with the perceived self-efficacy factor (Model 4) and the bi-factor model with the distress factor (Model 5) also displayed adequate fit, although they were inferior to Model 3. However, the AIC, BIC and the factor loadings indicate that Model 2 had a better fit to the data and was more parsimonious than the competing models.

Table 2 Model fit estimates of the confirmatory factor analyses

Also, while all the estimated factor loadings for Model 3 (the bi-factor model) had the expected signs and were statistically significant (see Table 3), not all items loaded substantially on their relevant factors (i.e., items 4, 6 and 7). The same was the case for Model 1 (i.e., items 4 and 7), Model 4 (i.e., items 4 and 7) and Model 5 (i.e., items 1, 2, 3, 6, and 9). Taken together, and despite the better model fit of the bi-factor model (using fit indices only), these results favour Model 2 (the bi-dimensional model). The two factors were statistically significantly associated (r = 0.42, p < 0.001).

Table 3 Standardized loadings on the latent factors for all the models

Measurement Invariance (Gender and Racial Groups)

Baseline Models

The PSS-10 two-factor model had a good fit to the data for both male (χ2[34] = 64.97; CFI = 0.94; SRMR = 0.00; RMSEA = 0.06, with 90% CI = 0.04 to 0.08) and female (χ2[34] = 128.36; CFI = 0.95; SRMR = 0.06; RMSEA = 0.06, with 90% CI = 0.05 to 0.08) participants. It also had an acceptable fit to the data for both black African (χ2[34] = 97.60; CFI = 0.95; SRMR = 0.07; RMSEA = 0.06, with 90% CI = 0.04 to 0.07) and white (χ2[34] = 134.21; CFI = 0.91; SRMR = 0.00; RMSEA = 0.10, with 90% CI = 0.08 to 0.11) participants.

Factorial Invariance Test

Table 4 presents the related tests for multigroup models for both gender and race groups (each with more restricted parameterisation).

Table 4 Goodness-of-fit statistics for tests of invariance of the PSS-10 bi-dimensional model for males and females, and black Africans and Whites

The results show that all the nested models represented a good fit to the data, with resulting ΔCFI and ΔRMSEA values of ≤ 0.02. These results suggest that the PSS-10 provides an assessment of stress that is equivalent across male and female, and black African and white university students.

Rasch Analysis

Item Threshold Ordering

All measure indices showed that the difficulties of the response categories (i.e., the Likert scale points) increased monotonically, while the infit and outfit mean squares suggested no disordering (see Table 5).

Table 5 Response disordering test

Item Fit

Perceived Self-Efficacy Factor

Items fit the latent construct well (infit and outfit: between 0.5 and 1.5) (see Table 6).

Table 6 Rasch difficulties and fit statistics for each item

Distress Factor

Items fit the latent construct well (infit and outfit: between 0.5 and 1.5) (see Table 6).

Dimensionality

Perceived Self-Efficacy Factor

For the 4-item subscale, the PCA of residuals accounted for 49.4% of the variance. The eigenvalue of the unexplained variance in the first contrast was 1.5, suggesting that the positively worded subscale was unidimensional.

Distress Factor

For the 6-item subscale, the PCA of residuals accounted for 58.2% of the variance. The eigenvalue of the unexplained variance in the first contrast was 1.4, suggesting that the negatively worded subscale was unidimensional.

Differential Item Functioning Across Gender and Race

The DIF analysis suggested that there were no items reflecting DIF across the two gender and race groups (i.e., DIF contrasts <0.5) (see Tables 7 and 8).
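The DIF criterion can be illustrated with a short sketch in which the DIF contrast for an item is taken as the difference between its difficulty estimates in two groups, and an absolute contrast above 0.5 logits is flagged as substantial. The function name and the difficulty values are hypothetical and do not correspond to Tables 7 and 8.

```python
def dif_contrasts(difficulty_group_a, difficulty_group_b, threshold=0.5):
    """Return {item: (contrast, flagged)} for items present in both groups.

    Each argument maps an item label to its difficulty estimate (in logits)
    obtained separately for one group.
    """
    results = {}
    for item, difficulty_a in difficulty_group_a.items():
        contrast = difficulty_a - difficulty_group_b[item]
        results[item] = (contrast, abs(contrast) > threshold)
    return results


# Hypothetical item difficulties for two groups, for illustration only.
group_a = {"item1": 0.10, "item2": -0.40, "item3": 0.30}
group_b = {"item1": -0.05, "item2": 0.25, "item3": 0.20}
for item, (contrast, flagged) in dif_contrasts(group_a, group_b).items():
    print(f"{item}: contrast={contrast:+.2f}, substantial DIF={flagged}")
```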

Table 7 Differential item functioning across gender
Table 8 Differential item functioning across race groups

Reliability Analyses

The reliability of the PSS total scale and the two domain specific factors (i.e., Distress and Perceived self-efficacy) was acceptable (α = .79, .82, .69; ω = 0.84, 0.87, 0.75 and Greatest Lower Bound = 0.86, 0.84, 0.74, respectively).
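As a reminder of how the first of these coefficients is defined, the following sketch computes Cronbach’s alpha from a respondents-by-items score matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). Omega and the Greatest Lower Bound, which were computed in R, are not reproduced here. The function name and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from five respondents to four items (0 to 4).
scores = [
    [1, 2, 1, 2],
    [3, 3, 2, 3],
    [0, 1, 1, 0],
    [4, 3, 4, 4],
    [2, 2, 3, 2],
]
print(round(cronbach_alpha(scores), 2))
```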

Discussion

The present study evaluated the factorial validity of the PSS-10 by comparing five competing measurement models of the scale, and offered reliability evidence for the measure in a sample of university students. Because the use of different statistical techniques is one of the issues implicated in the contradictory factor structure findings for the PSS, the present study used two different analytic methods (CFA and Rasch analysis) to circumvent the limitations of the extant literature. Students reported higher levels of perceived stress (M = 21.3, SD = 5.2) than the norm (i.e., the general population of the same age group) (M = 14.2, SD = 6.2; Cohen 1994). Overall, the results suggest that the PSS-10 is best represented by a bi-dimensional structure in South African university students. Specifically, a two-factor solution, consisting of two domain-specific factors (Distress and Perceived self-efficacy), was found to be superior to the other four alternative models. The two-factor model is consistent with previous research with older adults, clinical populations and university students (Baik et al. 2019; Dao-Tran et al. 2017; Hong et al. 2016; Kaya et al. 2017; Khalili et al. 2017; Manzar et al. 2019; Mondo et al. 2019; Nielsen et al. 2016), and with Lazarus and Folkman’s (1984) two-sided theory of appraised stress. In previous research, these two factors have been named “Perceived Distress” and “Perceived Coping”, “Global negative feelings of general distress” and “Perceived ability to cope effectively”, “Stress” and “Counter-stress”, “Perceived Helplessness” and “Perceived Self-efficacy”, and “Negative Stress” and “Positive Stress” (Reis et al. 2010).

The reported two-factor structure is also implied by the low intercorrelation between the two subscales of the PSS-10. This pattern of relations suggests that the subscales have only a 16% overlap, indicating that they tap into separate aspects of perceived stress. In keeping with the theory, the subscales measure related but separate aspects of perceived psychological stress. The results seem to contradict Cohen and Williamson’s (1988) suggestion that, when measuring stress, the distinction between the two factors is irrelevant and that a composite score should be used instead. Rather, the results support calls in the literature (Taylor 2015) for practitioners to report subscores when using the PSS.

Findings of factorial invariance suggest that the PSS-10 measures stress equivalently across gender and race groups, and that findings of gender and race mean differences in perceived stress can be interpreted meaningfully. The present study confirms the findings of other measurement invariance studies of the two-factor solution (Lavoie and Douglas 2012; Smith et al. 2014). The internal consistency reliability of both factors (i.e., Distress and Perceived self-efficacy) was satisfactory.

Conclusion

In summary, this study provides psychometric evidence (i.e., factorial validity and reliability) for the use of the PSS-10 with South African university students. The results support the use of the subscale scores in the reporting of perceived stress. The instrument’s brevity and the findings of its psychometric soundness offer support for the PSS as a potential resource for university student counselling services in South Africa, specifically as an early detection tool for those at high risk for mental illness.

Limitations

Some aspects of the present study that limit the generalisability of its findings (e.g., the overwhelmingly female sample) are worth highlighting. Comprehensive validation of a measure such as the PSS extends beyond structural validation and internal consistency reliability. Future studies should explore additional psychometric evidence, such as the convergent, divergent and predictive validity of the instrument. Extending this research to other populations (clinical and community samples) would increase the generalisability of the findings.