Background

Globally, mental illness is prevalent. The 2017 Global Burden of Disease (GBD) study estimated that 970 million people, approximately 13% of the global population, were living with a mental disorder [1, 2]. Approximately 264 million people, or 4% of the global population, were estimated to be living with a depressive disorder [2]. Although cost-efficient and effective treatments for depression exist [3, 4], large gaps remain in care-seeking and receipt of effective treatment. One major barrier to care-seeking among patients with depressive symptoms is stigma: deeply entrenched sociocultural norms and attitudes that reduce people to being tainted or otherwise discounted on the basis of some attribute [5, 6]. Because what counts as a discreditable attribute, and the degree to which it discredits, may vary by cultural context, tools measuring stigma may need to be tailored to and validated in specific populations.

In European and North American countries, where the bulk of mental health stigma research has been based, those who report greater mental health stigmatization are less likely to seek care and are more likely to experience discrimination and demoralization [7,8,9,10]. Beyond seeking treatment, stigma also negatively affects adherence and response to treatment [11, 12]. Previous studies have also demonstrated that the experiences and consequences of stigma are modified by other social attributes of patients, including age, gender, education, and socioeconomic status, demonstrating that the social context of mental health stigma is very important [13,14,15,16].

In many African countries, there has yet to be extensive research on the role of mental health stigma in mental health treatment. One study conducted in South Africa found that stigma and misinformation regarding mental illness are prevalent and negatively associated with help-seeking behavior [17]. A cross-sectional study of attitudes around mental illness in Blantyre, Malawi observed that the degree and type of stigmatizing beliefs that participants endorsed varied by age but not by education, socioeconomic status, or gender [18]. While that study produced valuable information, it did not use an instrument validated in the study population, as there have yet to be such tools validated in Malawi to measure mental health stigma.

Accordingly, this study aims to evaluate the structural validity and reliability of a short quantitative instrument to measure depression-related stigma in patients engaged in noncommunicable disease (NCD) care and exhibiting depressive symptoms in Malawi. We further aim to characterize the prevalence of depression-related stigma among this group of patients and to assess construct validity, that is, the degree to which this measure of stigma is associated with measures of other constructs in the ways we would expect [19].

Methods

Study design

The Sub-Saharan Africa Regional Partnership for Mental Health Capacity Building (SHARP) began depression screening in 10 NCD clinics in Malawi in May 2019, and recruitment to the cohort is ongoing. The overall objective of the SHARP scale-up study is to integrate depression treatment with diabetes and hypertension treatment at NCD clinics and compare outcomes between basic and enhanced clinic-level implementation strategies. Patients were recruited into the SHARP cohort by consecutive depression screening as they presented to the NCD clinics for their standard care. Eligible participants were 18–65 years of age, had elevated depressive symptoms denoted by a score ≥ 5 on the Patient Health Questionnaire (PHQ-9) [20,21,22,23], and had a new or current diagnosis of diabetes or hypertension. SHARP research assistants asked participants a series of questions, in Chichewa or Chitumbuka, to complete the baseline questionnaire. Patients included in this cross-sectional analysis completed their baseline interviews between May 2019 and March 2021.

Outcome assessment: depression-related stigma

The outcome of interest, patients’ levels of depression-related stigma, was measured using a brief 10-item instrument that was adapted from the Stigma in Global Context—Mental Health Study (SGC-MHS) [24, 25]. The SHARP study team used an iterative approach to tailor the instrument to the social and linguistic context of the study: team members involved in the study’s protocol development worked together to identify key prompts from the original SGC-MHS instrument to adopt and translate to Chichewa and Chitumbuka in a way that maintained the meaning of the prompt in English while being understandable in the target languages and cultures. Although the original SGC-MHS stigma instrument was 75 items, the SHARP team generated a modified 10-item instrument to meet the SHARP study’s data collection needs while focusing on facets of stigma that the team anticipated would be salient in the Malawian context based on prior research experience around depression in Malawi [26,27,28]. Ultimately, questions in the final instrument were related to negative feelings and attitudes toward individuals with depression (negative affect), the role of disclosure particularly on the family (disclosure carryover), and social isolation as a result of engaging in treatment (treatment carryover) [29, 30]. Disclosure carryover prompts were centered around the family due to the importance of family in this population and the indication from previous research that stigma spills over onto the family [18, 31]. Study team members who translated the stigma instrument were fluent in the respective target language.

The stigma instrument first introduced a vignette of a woman named Thandi who was experiencing depressive symptoms, without naming the condition as depression. Participants were asked to rate their level of agreement with four prompts (e.g., statements about whether Thandi’s situation was shameful or embarrassing) on a 5-point Likert scale from strongly disagree to strongly agree. A second vignette then explained that Thandi had been diagnosed with depression and presented five more prompts, closely paralleling the first four, on the same 5-point Likert scale. Each prompt was written such that agreement endorsed depression-related stigma. Strong agreement with any given prompt was scored as 4 points and strong disagreement as 0 points.

Along with the nine Likert scale items in the stigma tool, a tenth item asked participants to identify, from a list of options, what they believed to be the most likely cause of Thandi’s behavior (prior to the second vignette). Participants had seven prespecified responses to choose from, but they were also able to provide another response not listed or respond that they did not know. Responses to this item were not quantified but were instead examined graphically to better understand the way that participants interpreted Thandi’s depressive symptoms. If not belonging to the seven prespecified etiologies, patient responses were included in the “other” category and not further described.

Based on the exploratory factor analysis (EFA) results, stigma subscales were produced as the average of the items belonging to each factor identified. The overall stigma score was then equivalent to a weighted average of the subscale measures, based on the number of questions each subscale provided to the overall scale. To describe study sample characteristics succinctly, participants with an overall stigma score greater than 2 were identified as having high stigma. Using the Likert scale described previously (ranging 0–4, with 2 corresponding to “Neutral”), participants with an overall stigma score greater than 2 on average expressed agreement with the statements articulating stigmatizing beliefs, whereas those with a score of 2 or less were on average neutral or expressed disagreement with the statements.
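The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the study's Stata code: the item names (`q1` to `q9`, with item 4 dropped) and the assignment of items to subscales are hypothetical, because the text does not list which item loads on which factor.

```python
from statistics import mean

# Hypothetical item-to-subscale mapping (assumption for illustration):
# the source does not state which items belong to which factor.
SUBSCALES = {
    "negative_affect": ["q1", "q5"],
    "disclosure_carryover": ["q2", "q6", "q7"],
    "treatment_carryover": ["q3", "q8", "q9"],
}


def score_stigma(responses):
    """Return (subscale means, overall score, high-stigma flag).

    `responses` maps item names to Likert values coded from
    0 (strongly disagree) to 4 (strongly agree).
    """
    subscale_scores = {
        name: mean(responses[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    n_items = sum(len(items) for items in SUBSCALES.values())
    # Weight each subscale mean by the number of items it contributes.
    overall = sum(
        subscale_scores[name] * len(items) for name, items in SUBSCALES.items()
    ) / n_items
    # Scores above the neutral midpoint (2) are labeled high stigma.
    return subscale_scores, overall, overall > 2


example = {"q1": 4, "q5": 2, "q2": 3, "q6": 3, "q7": 0, "q3": 1, "q8": 2, "q9": 3}
scores, overall, high = score_stigma(example)
print(overall, high)  # 2.25 True
```

Because each subscale is weighted by its item count, the overall score equals the plain mean of all retained items; a participant who answers "Neutral" throughout scores exactly 2 and falls in the low stigma group.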

Covariate assessment

Patient baseline interviews included instruments that measured depression (PHQ-9) [20], stressful life events (Life Events Scale) [32, 33], social support (Multidimensional Scale of Perceived Social Support) [34, 35], and adaptive coping behaviors (Brief COPE) [36]. Adaptive coping behaviors included positive reframing, acceptance, humor, religion, and using emotional support [36]. Baseline demographic information was also included, such as sex (as a proxy for gender), age, education level, and indicators of socioeconomic status. A weighted factor score was used as an indicator of socioeconomic status, generated from responses about ownership of a radio, refrigerator, television, mobile phone, and car, along with responses about whether the household had electricity, how often participants worried about money, and how often they went to bed hungry. All scales were coded such that a larger number equated to a greater level or amount of that construct (e.g., greater wealth than the average participant or greater social support). Because the wealth score was generated as a weighted factor score, it is expected to have a mean of 0 and a standard deviation of 1 in this sample.

Statistical methods

To begin understanding the reliability and validity of the stigma measure, the distribution of characteristics of the SHARP baseline cohort was described in a table stratified by high versus low stigma. A chi-square test was used to compare participants’ ability to correctly identify Thandi’s depressive symptoms across the high versus low stigma categories. Inter-item reliability was evaluated using Cronbach’s alpha. We then assessed three types of validity: structural, convergent, and divergent. To assess structural validity, an exploratory factor analysis (EFA) was conducted using the principal-factors method with a Promax rotation. Scree plots were used to identify the proper number of factors to retain in the analysis prior to rotation [37]. Using an EFA allowed us to investigate the dimensionality of the stigma tool and identify problematic items [38]. To assess convergent validity in this patient population, ordinary least squares (OLS) regression models were run that included expected correlates of depression-related stigma (age, sex, education, depressive symptoms, social support, stressful life events, coping behaviors, and wealth). To assess divergent validity, a separate OLS regression model was run, regressing stigma scores onto height (measured in centimeters), while adjusting for sex. The hypothesis was that height and stigma would be unassociated, whereas stigma levels might be related to patients’ age, depression severity, and other indicators of social status. Stigma scores, depressive symptoms, age, social support, stressful life events, coping behaviors, and wealth scores were all included in models as continuous variables, whereas sex was coded as a binary categorical variable with female as the reference level, and education level was a 5-level ordinal categorical variable with no formal schooling as the reference level. Given that few datapoints were missing (3% in convergent validity analysis; 9% in divergent validity analysis), complete case analyses were conducted.
All continuous variables were assessed for best functional form using marginal graphical analyses and likelihood ratio tests, which demonstrated that all continuous variables in these regression models fit best as simple linear terms. The data were analyzed using Stata [39].
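For readers unfamiliar with Cronbach’s alpha, the following Python sketch shows the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score), where k is the number of items. It is not the study's Stata code, and the response matrix is made-up toy data used only to exercise the function.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy responses (hypothetical): 5 participants x 3 items, coded 0-4.
toy = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 4],
    [4, 4, 3],
]
print(round(cronbach_alpha(toy), 2))
```

When every item gives identical scores the function returns 1, and alpha falls as items become less correlated, which is why it is used as a summary of inter-item reliability.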

Results

Of the 1586 patients approached by research staff for eligibility assessment at the time of data extraction for this analysis, 1105 were eligible and 695 consented. Ultimately, 689 participants completed their baseline interviews, and 688 participants completed the stigma questionnaire during their baseline interviews. Among these participants, 338 (49%) were identified as having high depression-related stigma (Table 1). The sample included 556 (81%) female patients and 132 (19%) male patients, and 380 (55%) of the patients reported often worrying about not having enough money for basic necessities. Overall stigma scores in the total sample were approximately normally distributed (mean: 2.1; SD: 0.8).

Table 1 Characteristics of SHARP patients by baseline depression stigma (N = 688)

The stigma instrument demonstrated acceptable inter-item reliability with a Cronbach’s alpha of 0.77 (Table 2). Scree plots produced from the EFA suggested that there were three factors within the stigma instrument, and the subsequent rotated factor loadings clustered around three domains of stigma (negative affect, disclosure carryover, and treatment carryover) [29]. The EFA results demonstrated high levels of agreement among the questions that belong to each domain, with the exception of item #4 (Table 2). Thus, item #4 was removed in a second factor analysis and was not included in the scoring of stigma scales and subscales in this analysis. The removal of item #4 did not dramatically change subscale alpha values or the factor loadings of other items in the stigma instrument (see Online Resource 1). Within each factor, the Cronbach’s alpha remained acceptable, demonstrating sufficient internal consistency within subscales of the stigma instrument (Table 2). Lastly, the correlation matrix in Table 3 demonstrates the distinct nature of each domain, with low correlation coefficients between subscales.

Table 2 Reliability and structural validity of the overall stigma scale and subscales
Table 3 Correlations between stigma subscales

In identifying Thandi’s condition in the initial vignette, prior to being told that Thandi had been diagnosed with depression, 36% of participants said she had depression, while 34% thought it was stress, and 7% believed it was the normal ups and downs of life. Other responses were poverty (6%), HIV (5%), asthma (1%), and schizophrenia (2%). Another 6% of participants said they did not know (see Online Resource 2). Among participants with high stigma (overall score > 2), 114 (34%) identified Thandi’s condition as depression as compared to 137 (39%) patients in the low (overall score ≤ 2) stigma group. In contrast, 128 (38%) patients in the high stigma group identified her condition as stress, as compared to 103 (29%) patients in the low stigma group (χ2 = 12.7, df = 7, p = 0.08).
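The group-specific and pooled percentages for the two most common responses can be recomputed from the counts reported above. The short Python check below uses only those counts; the low stigma group size (350) is derived as 688 minus 338 rather than quoted from the text.

```python
# Counts reported in the Results text.
n_total = 688
n_high = 338
n_low = n_total - n_high  # 350, derived rather than reported

depression = {"high": 114, "low": 137}
stress = {"high": 128, "low": 103}


def pct(count, denom):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / denom)


group_size = {"high": n_high, "low": n_low}
depression_pct = {g: pct(depression[g], group_size[g]) for g in group_size}
stress_pct = {g: pct(stress[g], group_size[g]) for g in group_size}

# Pooled percentages across both stigma groups.
pooled_depression = pct(sum(depression.values()), n_total)
pooled_stress = pct(sum(stress.values()), n_total)

print(depression_pct, stress_pct)        # {'high': 34, 'low': 39} {'high': 38, 'low': 29}
print(pooled_depression, pooled_stress)  # 36 34
```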

Generally, patients in the high stigma and low stigma groups had similar characteristics (Table 1). Patients with high and low stigma scores shared comparable distributions of education levels. Both groups also had similar distributions across age, sex, and socioeconomic status. On average, the high stigma group had greater depressive and anxiety symptoms, while the low stigma group had greater social support and more adaptive coping behaviors (Table 1).

The results of the convergent validity analysis demonstrated varying patterns of association between the four stigma scales (the overall scale and the three subscales) and the set of covariates. For example, regression coefficients from Model 1, with the overall stigma score as the dependent variable, indicated that depressive symptoms were positively associated with the overall stigma score, while adaptive coping behaviors were negatively associated with it (Table 4). Model 2, describing disclosure carryover, found that depressive symptoms and social support were positively associated with the belief that Thandi’s family would be better off if her situation were kept secret. Model 2 also found that age, stressful life events, and adaptive coping behaviors were all negatively associated with the disclosure subscale (Table 4). Model 3 found that depressive symptoms were positively associated with negative affect toward Thandi, whereas adaptive coping behaviors were negatively associated with negative affect (Table 4). Model 4 found that stressful life events were positively associated with concerns that Thandi would lose friends if she sought treatment and others found out (treatment carryover). Model 4 also found a negative association between adaptive coping behaviors and treatment carryover concerns (Table 4). In Models 5–8, which assessed divergent validity by regressing each scale onto height, adjusting for sex, a 10-cm change in height was not associated with any significant change in stigma scores: mean estimates were centered around zero, with confidence intervals that spanned the null value of zero (Table 5; Models 5–8).
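The divergent validity models have a simple structure: a stigma score regressed on height with sex as an adjustment variable. The Python sketch below illustrates that structure with a small hand-written OLS routine. It is not the study's Stata code; the data are simulated (hypothetical means and spreads chosen only for illustration), height is centered at an arbitrary 160 cm and rescaled so its coefficient is per 10 cm, and the confidence interval uses a normal approximation rather than the t distribution.

```python
import random
from statistics import NormalDist


def ols(X, y):
    """OLS via the normal equations. X rows must include a leading 1.

    Returns (coefficients, standard errors).
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [row[:] + [float(i == j) for j in range(p)] for i, row in enumerate(xtx)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    inv = [row[p:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)  # residual variance
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(p)]
    return beta, se


# Simulated data (assumption): stigma is generated independently of height.
random.seed(0)
rows, stigma = [], []
for _ in range(200):
    male = random.random() < 0.2  # female is the reference level
    height_cm = random.gauss(170 if male else 158, 7)
    rows.append([1.0, (height_cm - 160) / 10, float(male)])
    stigma.append(min(4.0, max(0.0, random.gauss(2.1, 0.8))))

beta, se = ols(rows, stigma)
z = NormalDist().inv_cdf(0.975)
low, high = beta[1] - z * se[1], beta[1] + z * se[1]
print(f"Change in stigma per 10 cm of height: {beta[1]:.2f} (95% CI {low:.2f}, {high:.2f})")
```

Under this setup, support for divergent validity corresponds to a height coefficient near zero whose interval includes zero.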

Table 4 Convergent validity: coefficients and 95% CIs from ordinary least squares regression models relating the overall stigma scale and each subscale to demographic and psychosocial factors (n = 671)
Table 5 Divergent validity: coefficients and 95% CIs from ordinary least squares regression models relating the overall stigma scale and each subscale to height, adjusting for sex (n = 624)

Discussion

This study describes depression-related stigma among patients with diabetes or hypertension and depressive symptoms in Malawi. The results of this study demonstrate that a large proportion of these patients (49%), on average, agreed with stigmatizing statements about individuals with depression and their anticipated experiences. Results from this study also support the reliability and validity of a short stigma questionnaire in this population. When this questionnaire was drafted, the study team used an iterative process to identify key prompts to include in the abbreviated tool. These prompts were grouped around three themes: negative feelings and attitudes toward individuals with depression (negative affect), the role of disclosure particularly on the family (disclosure carryover), and social isolation as a result of engaging in treatment (treatment carryover). The stigma tool was also designed to have parallel questions related to these themes before and after the second vignette revealed Thandi’s depression diagnosis with the goal of understanding the role of an official diagnosis on stigmatizing beliefs. Exploratory factor analysis demonstrated that the items grouped around the three themes without regard to whether the prompt was delivered before or after the second vignette, meaning that participants generally responded with similar levels of stigma in these three domains, regardless of whether they knew that Thandi was diagnosed with depression.

The exploratory factor analysis results also demonstrated that the fourth item of the instrument was weakly associated with two domains, likely due to poor wording of the prompt. The fourth item was thus excluded from the stigma scale and subscale scoring. The exploratory factor analysis also found that each item of the resulting 8-item quantitative instrument mapped onto only one of the three domains, with sufficiently high factor loadings, and the groupings of items on each domain aligned with what would be expected based on the content and themes of the questions [29]. These findings support the structural validity of the stigma instrument. Moreover, the overall stigma scale and each subscale had acceptable alpha values, suggesting appropriate levels of internal consistency in these measures.

The current study results also supported the convergent validity of the stigma measure in relation to expected variables. When collapsed across all three domains, overall stigma scores were positively associated with depression severity and negatively associated with education and adaptive coping behaviors. Based on the research literature, these associations were expected [17, 40]. In contrast, age was not clearly associated with overall stigma score, nor were sex, wealth, recent stressful life events, or social support. While the lack of strong or statistically significant associations with these constructs was not anticipated, the absence of such associations does make sense in context. First, given that the majority of study participants were female, there were few males in the study sample with whom to compare stigma scores, leading to imprecise estimates of the association between sex and the outcome variables. Second, given that stigma is a multidimensional construct, relationships with these covariates may vary in direction across dimensions of the measure, ultimately averaging out to a null relationship with the overall stigma score. One example of associations in opposite directions is the relationship between social support and disclosure carryover versus treatment carryover. In the disclosure carryover domain, individuals with stronger levels of social support, particularly from family, may be concerned about the impact of Thandi’s disclosure on her family. Conversely, in the treatment carryover domain, which focused on the likelihood of Thandi losing friends due to receiving treatment, individuals with higher levels of social support, particularly from friends, may anticipate that their friends would not abandon them for seeking help for their condition.
The supposed mechanism behind these associations relies on the probable scenario that patients internalized Thandi’s story and spoke based on their own current or anticipated experiences of depression-related stigma. Previous research supports that, when vignettes closely align with the experience of participants, their responses are related to their own lived experiences or their familiarity with others’ experiences in these scenarios [41, 42]. Overall, these contradicting associations across domains, paired with the correlation matrix in Table 3, provide support for utilizing the domain-specific stigma scores in future multivariable analyses rather than using an overall stigma score. Our study results also exemplify the divergent validity of the measure, in that there was no association between stigma scores and height (adjusted for sex), just as would be expected.

Another important finding from this study was that most participants did not identify Thandi’s condition as depression; rather, they used a variety of words to describe her state. While 36% of participants directly identified Thandi’s condition as depression, another 34% described her condition as stress, and 7% called it the “normal ups and downs of life.” These three concepts are not mutually exclusive; stress and “ups and downs of life” may also be used to describe Thandi’s situation and represent other terms commonly used for her condition within our patient population. A similar trend was observed in South Africa: in a survey where respondents were given vignettes with obvious presentations of depression and substance use disorder, they were more likely to attribute those symptoms to stress [17]. Further, in Zimbabwe, depression is commonly characterized as “thinking too much” [43]. These descriptions highlight the variation in language used around depressive symptoms among patients in Malawi, which is similar to that in other southern African countries. This variation underscores the necessity of incorporating depression screening into standard care visits, using tools validated in the patient population, rather than assuming that patients who say they are just “stressed” are not struggling with depression. Moreover, this is an opportunity for patient education: patients who understand depression to be a medical condition that is amenable to treatment may be more inclined to seek or accept treatment than those who believe such symptoms to be simply stress or an artifact of everyday life [28].

Limitations

There are several limitations to this study. First, our data were cross-sectional, and we are therefore unable to identify the temporal direction of the relationships between several of the constructs that were measured. Second, the SHARP cohort is a convenience sample, meaning that the sample of patients enrolled in the SHARP study may differ from patients in other settings. As a result, this study may be susceptible to sampling bias, particularly because patients were being enrolled in a program related to depression, which may be a sensitive topic for some individuals. The SHARP study was also limited in the number of questions in the stigma tool due to an already extensive patient questionnaire within the broader study. Like any tool, this shortened stigma survey reflects the understanding and perspective of those who developed it and likely does not capture all aspects of stigma in Malawi. While this stigma tool does capture three dimensions of stigma in this population, a longer tool in future studies could provide deeper insight into the role of stigma in depression care.

As in all psychometric instruments, the validity of this stigma tool is also highly dependent on participants’ interpretation of the prompts presented to them. In cross-cultural studies of validity, a further barrier is the translation of language and values from one context to another [44,45,46]. The stigma instrument was designed and discussed in English, then translated and edited by study team members fluent in the two target languages: Chichewa and Chitumbuka. The team members involved in the design of SHARP study materials chose to use a vignette format for this stigma instrument based on prior qualitative research that demonstrated that many people in Malawi may not have a clear understanding of what is meant when referring to “depression” [26]. The study team further elected to gender the vignette character as a woman for two primary reasons: first, traditional gender roles are very common in Malawi, and depression tends to impair function [47,48,49,50,51,52]; second, the patient population was expected to include more women than men due to variation in healthcare-seeking behaviors by gender [53]. Thandi’s gender inevitably influenced participants’ interpretation of Thandi’s depressive symptoms, as her struggle to complete the functions of her assigned gender role was deliberately described in the vignette. This likely influenced participants’ responses to Thandi and the associated stigma prompts. However, we anticipate that such a bias in participant responses may be in the direction of less stigma, given that depressive symptoms tend to be more socially accepted in women as compared to men [54, 55]. Furthermore, the sex of the participant, which we used as a proxy for gender, did not have a significant relationship to participants’ stigmatizing responses in this study (Table 4; Models 1–4), and the regression analysis results in Table 4 demonstrate that there was much imprecision in the coefficient associated with sex.
If the gender of the participant strongly determined the way in which a participant responded to this vignette character, we would have expected a more precise estimate of the coefficient associated with sex in the regression analysis. Therefore, we do not anticipate that the gender of the vignette character differentially biased these analyses of the reliability and validity of the stigma tool in our study sample. Nevertheless, in future iterations of this instrument, the study team will consider using a male vignette character to test the degree to which the vignette character’s gender and traditional roles may influence participants’ responses to stigma prompts.

While quantitative indicators are useful for understanding some aspects of the validity of psychometric instruments, some information is lost without qualitative data. Particularly in studies that adopt an instrument from one sociolinguistic context and then tailor that instrument to another, cognitive interviewing or other “thinking aloud” protocols are useful to understanding how participants are interpreting and responding to prompts [56, 57]. Therefore, the SHARP study team’s next step in assessing this instrument’s validity in this patient population is to analyze qualitative interviews among a subsample of SHARP patients and compare patients’ qualitative responses to quantitative responses.

Strengths

Key strengths of this study include its sample size and breadth of data describing NCD patients with depressive symptoms in Malawi. Moreover, this study is the first to apply quantitative measurement of stigma in this patient population, offering further insight into depression-related stigma among individuals experiencing depressive symptoms, rather than describing stigma only in the general public.

Conclusion

The current study evaluates the psychometric properties of an instrument to measure depression-related stigma among individuals living with depression in Malawi. Validated tools are necessary to describe stigmatizing beliefs among stigmatized individuals and thereby better understand the unique barriers that these individuals face when accessing care or seeking support. Because stigma is multidimensional, we may not expect a patient with mostly treatment carryover concerns to have equal treatment adherence to a patient whose main concern is negative affect. It is therefore useful to have separate subscales of stigma for various dimensions, as exemplified in this analysis, to understand the ways in which these dimensions may differentially influence patient outcomes. Ultimately, longitudinal analyses of patient stigma and patient experiences throughout the treatment continuum are needed to provide critical information that may enhance the expansion of lifesaving psychiatric treatment in Malawi—validating a tool for measuring such stigma is a key first step.