Mindfulness-based programs (MBPs) have been advanced as treatments for a variety of health conditions and have reached a wide audience (Li et al., 2017; Maglione et al., 2017; Rogers et al., 2017; Ruffault et al., 2017). However, the generalizability of the effectiveness and safety of a treatment can only be inferred based on findings across multiple major demographic groups. The nature of stressors, cultural values, and available resources may differ across communities (Watson-Singleton et al., 2019), with potential implications for treatment utilization and outcome disparities (Guerrero et al., 2017; Narain et al., 2019). The World Health Organization emphasizes that in order to create healthy environments for all, it is necessary to address structural racism, cis-heterosexism, classism, and ageism, which contribute to the inequalities in the social and material determinants of health that disproportionately disadvantage minority groups (Healthy People, 2020). Addressing these problems requires inclusive research that can enable data-driven initiatives (Bailey et al., 2017). However, exclusion of Black, Indigenous, and People of Color (BIPOC) and sexual and gender minority (SGM) communities makes it difficult to accurately assess and include their needs and resources (Proulx et al., 2018), or to estimate the treatment effects of existing MBPs among these populations (Beery & Zucker, 2011; Oakes, 1972; Wells, 1999).

An extensive literature exists delineating the importance of cultural and demographic differences in determining intervention response and utilization (Kleinman et al., 1978; Singer et al., 1992), as well as the implications of these differences for mindfulness (Kirmayer, 2015). In a notable example, Watson-Singleton et al. (2019) found that African American women preferred to work with instructors from similar sociodemographic backgrounds because of a shared identity, history, and values that contributed to greater confidence and safety in the treatment. Recent reviews have identified the uniqueness of stressors associated with historical marginalization, the importance of understanding needs specific to minoritized cultural groups, and the potential implementation obstacles posed by treatments that do not reflect participants’ cultural values (Castellanos et al., 2020; DeLuca et al., 2018).

Leading institutions, including the National Institutes of Health and Centers for Disease Control (Geller et al., 2011), have advanced guidelines to address persistent inequalities in race, ethnicity, and cultural backgrounds. These guidelines include appropriately describing race and ethnicity of participant samples. Reporting on these variables can allow researchers to ascertain the degree of equity in research participation, in benefits, and in outcomes for interventions that are administered to the general population. Relevant demographic variables include, but are not limited to, race, sex, gender, socioeconomic status (SES), and age. These identities are not mutually exclusive, and specific combinations of these attributes may contribute to unique effects. Therefore, intersectionality (simultaneous status in more than one category) is also an important analytic target. Transparent reporting of these characteristics is necessary for an empirical evaluation of the benefits of MBPs for diverse populations.

The current study utilizes an existing dataset compiled to evaluate the impact of mindfulness interventions on self-regulation as a mechanism of health behavior change (Desbordes, 2019; Hoge et al., 2021). An analysis of the reporting and omission of demographic variables in these trials, including race, ethnicity, gender, sexual orientation, education (an index of SES; Shavers, 2007), age, language, and composite variables in which these categories intersect (e.g., race and sexual orientation), is presented. This analysis expands on a recent review by Waldron et al. (2018), who examined the demographic characteristics reported within 69 RCTs of US-based MBPs and found that among the subset of studies that reported these demographics, non-Latinx White, female, and economically advantaged individuals were overrepresented in the study samples compared with the US census data. The analysis also includes international RCTs that involve two of the most widely studied and disseminated MBPs: mindfulness-based stress reduction (MBSR; Kabat-Zinn & Hanh, 2013) and mindfulness-based cognitive therapy (MBCT; Teper et al., 2013), which involve outcomes related to self-regulation, the ability to monitor oneself to intentionally manage cognitive and emotional resources in order to accomplish goals (Burman et al., 2015).
The study builds on the work by Waldron and colleagues through five related aims: (1) include international studies, which represent a substantial corpus of research; (2) describe the completeness and reporting of key demographic variables, as well as the omission of these variables across studies, and trends in the reporting of these variables over time; (3) provide descriptive statistics for the demographic composition of reported MBP study samples, and compare the composition of the US samples with the US census data to ascertain representativeness; (4) identify studies that conducted subgroup analyses that may reveal differential outcomes based on demographic factors, as well as intersectional description of demographic data; and (5) offer a set of recommendations for including diverse populations in MBP research based on our findings. Finally, this review advances the MBP research field by describing the variables omitted, the diversity of reported samples, and the presence and nature of sub-group effects in both the US and non-US-based samples during a time when MBP research grew exponentially in popularity and influenced policy-making (Van Dam et al., 2018).

Methods

This review is part of a broader set of systematic reviews that investigate the application of MBPs for self-regulatory mechanisms (Desbordes, 2019; Hoge et al., 2021). All reviews followed guidelines provided by the Cochrane Handbook of Systematic Reviews (Higgins & Green, 2011), the Agency for Healthcare Research and Quality’s Methods Guide for Comparative Effectiveness Reviews (United States Agency for Healthcare Research & Quality, 2008), and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Liberati et al., 2009).

The following electronic databases were searched for this study: PubMed, PsycINFO, CINAHL (Cumulative Index to Nursing and Allied Health Literature), and ERIC (Education Resources Information Center). Additionally, a grey literature search was conducted using Open Grey (http://www.opengrey.eu/) and New York Academy of Medicine Grey Literature Report (http://www.greylit.org/). A panel of researchers (see COI) devised 118 search terms of relevance (see Appendix). The search was completed on July 31, 2016.

Researcher Identity

Lack of diversity among researchers is a pressing problem (Ginther et al., 2011; Wu, 2020), which is exacerbated by low reporting of author demographics. This review was performed by a group of researchers (see Conflict of Interest (COI) statement) who are stakeholder members of multiple minoritized groups, including race/ethnicity, SES, sexual orientation, gender identity, English as a Second Language (ESL), age, education, and immigrant status.

Inclusion/Exclusion Criteria for Study Selection

RCT studies published in English, involving MBCT, MBSR, or variations thereof (parameters described below) that address any self-regulation-related outcomes, were included as part of a large systematic review examining behavior change and MBPs. Self-regulation outcomes included three domains: cognitive processing, emotion regulation, and self-related processing. The following inclusion criteria were selected in order to obtain a pool of MBP studies which, because of their research design and quality, would most likely impact public health and medical dissemination: (a) program arm size ≥ 10, (b) adult participants (age 18+), (c) both clinical and general populations, (d) randomized controlled trials with at least one control condition. The MBSR or MBCT used in the studies must either have followed the standardized format described in the programs’ respective manuals (Segal et al., 2012), that is, (a) delivered in person, (b) within a group setting, (c) for an 8-week period, (d) with weekly 2.5-h sessions, and (e) including a minimum half-day silent retreat (total 24 h of in-person time), or have been delivered in a condition-specific context (e.g., smoking cessation), in which case the content must have adhered to the parent MBP manual for at least 50% of the total therapy time and included at least 15 h of in-person time. The final decision on inclusion or exclusion status was made by the principal investigator (WB). Articles reporting on the same study were combined to avoid duplication.

Data Collection Process

Population descriptions and reporting features were identified and extracted from all eligible studies, using an online data repository system (Systematic Review Data Repository, SRDR; http://srdr.ahrq.gov). To ensure reliability, extraction instructions for each data field were created and reviewed by the review team, then added into the online repository to guide the coding team (initials blinded). The extraction was assessed for completion and quality by two additional team members (initials blinded). The overall inter-rater agreement ranged from 85.1 to 99.0% (kappa = 0.933). All data, including category ratings and the derived population statistics, were double coded. Inconsistencies were discussed in group meetings until 100% consensus was reached.
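As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of a two-rater kappa computation. It assumes Cohen's kappa for two coders; the text does not state which kappa variant was used, and the codes shown are hypothetical, not the review's extraction data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical codes on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes (1 = variable reported, 0 = not reported) for ten studies
coder_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
coder_2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raters agree on nine of ten studies, so raw agreement is 90% while kappa is lower because it discounts agreement expected by chance.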

Demographic Category Extraction

Included studies were reviewed for their measurement and reporting of the following demographic variables: (a) race, (b) gender, (c) sexual orientation, (d) age, (e) education, and (f) reporting of intersections of demographic variables. Additional variables used for organizing the analyses and results, but which do not represent target demographic variables in this review, included (g) country of origin, (h) language (including language-based exclusion criteria), and (i) inclusion of any subgroup analysis by demographic variables. In order to determine reporting of demographic characteristics, each category was dichotomously coded for the presence or absence of reporting for each demographic variable (0 = absent, 1 = present). For example, if a study reported participants’ sexual orientation, the study would receive a “1” for that category. Sums of scores from each category were used to obtain the percentage of studies that reported on the given category.
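The dichotomous coding described above can be sketched as follows. This is an illustration only: the record layout, field names, and study labels are hypothetical and do not reproduce the review's extraction database.

```python
def percent_reported(coded_studies, category):
    """Percentage of studies coded 1 (present) for one demographic category."""
    codes = [study[category] for study in coded_studies]
    if any(code not in (0, 1) for code in codes):
        raise ValueError("Codes must be 0 (absent) or 1 (present)")
    # The sum of 0/1 codes is the number of studies that reported the category
    return 100.0 * sum(codes) / len(codes)


# Hypothetical extraction records; labels and codes are illustrative only
coded_studies = [
    {"study": "A", "race": 1, "gender": 1, "sexual_orientation": 0},
    {"study": "B", "race": 0, "gender": 1, "sexual_orientation": 0},
    {"study": "C", "race": 1, "gender": 1, "sexual_orientation": 1},
    {"study": "D", "race": 0, "gender": 0, "sexual_orientation": 0},
]

for category in ("race", "gender", "sexual_orientation"):
    print(category, percent_reported(coded_studies, category))
```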

Race and Ethnicity

Studies were grouped by country of origin. Race and ethnicity data were examined in the USA, UK, and Canadian studies, the three countries that provided the most studies reporting on race and ethnicity. Because most of the studies that reported on race were US based, for US studies alone, we examined differences between the racial composition of combined study samples and the US census data. Eighteen of the twenty US studies that reported on race and/or ethnicity did not include Hispanic ethnicity as separate from Hispanic race (i.e., “Hispanic” was considered to be another racial category along with “White” and “Black/African American” within the race variable). The conflation of Hispanic race/identity is complex, and even the inclusion of a separate “Hispanic” category does not address the nuanced identities that are masked by this superordinate category. We retained “Hispanic” as a distinct additional category in our analyses, in order to capture reporting practices that did and did not follow official US census categorization. For comparisons between these US-based studies and the US census data, we used the overall percentage of the White (Hispanic White and non-Hispanic White) population in the US census (76.6%) as the reference percentage for White race/ethnicity. Although this approach of counting Hispanic as a distinct category may under-report the number of participants in the included studies who would be considered “White” by the US census categories (e.g., a Hispanic and White person would be considered White in the US census, but not in numerous study protocols), we selected this as a conservative approach to treating incomplete data while also remaining as close to the data as possible.

Gender

Any gender demographic data were extracted to determine the proportion of studies that reported the following gender identities: male, female, transgender, gender non-conforming/non-binary, or others. The reported demographics were used to tabulate proportions of participants by gender. Constructs labeled in primary studies as “sex” were coded as gender. We assume that the authors understood sex as a biological construct which, in the absence of explicit gender reporting, was likely mislabeled. Likely, authors intended to report on gender identity or sex assigned at birth (see the “Gender Versus Sex” section).

Sexual Orientation

Any data reporting on sexual orientation were extracted, including reporting of “heterosexual,” “straight,” “homosexual,” “gay,” “bisexual,” or other orientations (e.g., men who have sex with men, MSM).

Education

The reported education status of participants was collected, as well as the type of variable reported (i.e., categorical/ordinal or interval). A weighted mean (years, weighted by study N) was calculated for studies that reported mean years in education. Due to the variability in the education systems outside of the USA, the International Standard Classification of Education (Classification IS, 1975) was used to determine equivalence in grade levels across the different non-US countries documented in the systematic review.
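A sample-size-weighted mean of this kind can be sketched as below. The study means and sample sizes are hypothetical values chosen for illustration, not data from the included trials.

```python
def weighted_mean(values, weights):
    """Mean of study-level values, each weighted by its study sample size N."""
    if len(values) != len(weights) or not values:
        raise ValueError("Need one weight per value and at least one study")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("Total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical per-study mean years of education and sample sizes
mean_years = [14.0, 16.0, 15.0]
study_n = [50, 100, 50]
pooled_years = weighted_mean(mean_years, study_n)
print(pooled_years)
```

Weighting by N means that larger studies pull the pooled estimate toward their own mean, which is why the result here sits above the unweighted average of 15.0.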

Age

Age was collected and coded as follows: (1) age-related inclusion criteria (i.e., upper and lower age limit in each study), (2) reported mean and variance of age of the population, and (3) range of age reported in each study. Average minimum and maximum inclusion criteria were established across the studies that reported age-related inclusion criteria. The total weighted mean age (weighted by study N) was calculated across all studies. We calculated the median and the range of all reported standard deviations in age.

Intersectionality

Reporting of intersectional identities was dichotomously coded (0 = absence, 1 = presence), such that any study that reported on or analyzed the combinations of multiple, intersecting demographic or diagnostic criteria received a score of 1.

Language Criteria

Language criteria were dichotomously coded (0 = absent, 1 = present) based on whether there were inclusion or exclusion criteria that specified participants must be fluent in the language in which the study was conducted.

Composite Demographic Reporting for Use in Temporal Analyses

Demographic composition and reporting of demographic variables across time were also assessed by creating scatter plot graphs of select variables over publication year. Results were assessed through visual inspection of trend lines and trend line slope. To illustrate trends in demographic composition, percent female, mean age, and percent White were plotted using all studies; percent Black was plotted for the US studies only. To evaluate overall reporting of demographic variables over time, a composite variable for each study was created using the following seven coded variables: race, gender, sexual orientation, education, age, variable stratification (i.e., covariate analysis), and intersectionality. “1” indicated a perfect reporting score for each variable, with a maximum total of seven points per study for all seven demographic variables. Some variables were coded with partial points, based on partial adherence to reporting standards. Partial scores were allocated as follows: for race, the lowest possible score of “0” would be given if no race information was reported, “0.33” if white vs. other was reported, “0.66” if more than one race was reported, and “1” if both race and ethnicity were reported (because of the bias toward the US studies in this reporting category, only the US studies were included in analyses with this composite metric as an outcome). For gender, “0” was given if gender was not reported, “0.50” if a binary was reported (i.e., male/female), and “1” if non-binary gender was reported. Sexual orientation was coded as either reported “1” or not reported “0,” as was education. For age, a “1” was assigned if range (0.33), variability (0.33), and mean (0.33) were all reported. Presence of analyses that examined differences based on demographic characteristics received a score of “1” for variable stratification. Composite scores were plotted across time by year of study publication for the US studies.
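The composite scoring rules above can be expressed as a short sketch. The dictionary keys and category labels are hypothetical names introduced for illustration; the point values follow the partial-credit scheme described in the text.

```python
# Partial-credit tables taken from the scoring rules described in the text
RACE_SCORES = {
    "none": 0.0,
    "white_vs_other": 0.33,
    "multiple_races": 0.66,
    "race_and_ethnicity": 1.0,
}
GENDER_SCORES = {"none": 0.0, "binary": 0.5, "non_binary": 1.0}


def age_score(mean, variability, age_range):
    """0.33 per reported element (mean, variability, range); 1 if all three."""
    reported = sum([mean, variability, age_range])
    return 1.0 if reported == 3 else round(0.33 * reported, 2)


def composite_score(study):
    """Sum of the seven reporting scores for one study (maximum 7)."""
    return (
        RACE_SCORES[study["race"]]
        + GENDER_SCORES[study["gender"]]
        + float(study["sexual_orientation"])
        + float(study["education"])
        + age_score(study["age_mean"], study["age_variability"], study["age_range"])
        + float(study["stratification"])
        + float(study["intersectionality"])
    )


# Hypothetical study: several races, binary gender, education, age mean and SD
example_study = {
    "race": "multiple_races", "gender": "binary",
    "sexual_orientation": False, "education": True,
    "age_mean": True, "age_variability": True, "age_range": False,
    "stratification": False, "intersectionality": False,
}
example_score = composite_score(example_study)
print(round(example_score, 2))
```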

Data Analyses

For all studies, descriptive statistics of frequencies and measures of central tendency and variance were reported. As most of the studies were US-based, inferential statistics were reported for the US studies only. The US studies reporting data that were included in the 2017 US census (gender, race/ethnicity) were compared to the census data by chi-square analysis in order to see if the proportion of persons with specific demographic characteristics in the study data differed from the proportion in the US population. Analyses were performed in Microsoft Excel 2016. As recommended by Rea and Parker (1992), Cramér’s V statistics were calculated as an index of effect size, interpreted as follows: [0.1–0.2] is a weak effect, [0.2–0.4] is a moderate effect, [0.4–0.6] is a relatively strong effect, and [0.6–0.8] is a strong effect.
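The comparison against census proportions can be sketched as a chi-square goodness-of-fit test with Cramér's V as the effect size. The analyses in this review were run in Microsoft Excel, so the Python below is a stand-in, not the authors' procedure. It assumes the goodness-of-fit form V = sqrt(χ2 / (n × df)), which the text does not state explicitly, and the counts are hypothetical (1,000 participants split 68%/32%, compared with a 50.8%/49.2% reference). The p value is omitted because it would require a chi-square distribution function from a statistics library.

```python
import math


def chi_square_gof(observed_counts, expected_props):
    """Chi-square goodness-of-fit statistic against reference proportions."""
    if len(observed_counts) != len(expected_props):
        raise ValueError("Need one reference proportion per category")
    if abs(sum(expected_props) - 1.0) > 1e-6:
        raise ValueError("Reference proportions must sum to 1")
    n = sum(observed_counts)
    chi2 = sum(
        (obs - n * p) ** 2 / (n * p)
        for obs, p in zip(observed_counts, expected_props)
    )
    return chi2, len(observed_counts) - 1, n


def cramers_v(chi2, n, df):
    """Cramér's V for a goodness-of-fit test (assumed form, see lead-in)."""
    return math.sqrt(chi2 / (n * df))


def effect_label(v):
    """Interpretation bands quoted in the text (Rea and Parker, 1992)."""
    if 0.1 <= v < 0.2:
        return "weak"
    if 0.2 <= v < 0.4:
        return "moderate"
    if 0.4 <= v < 0.6:
        return "relatively strong"
    if 0.6 <= v <= 0.8:
        return "strong"
    return "outside the ranges quoted in the text"


# Hypothetical counts: 680 female and 320 male participants
chi2, df, n = chi_square_gof([680, 320], [0.508, 0.492])
v = cramers_v(chi2, n, df)
print(f"chi2({df}) = {chi2:.2f}, V = {v:.3f}, {effect_label(v)}")
```

With only two categories, V under this formula depends on the proportions and not on the sample size, so the illustrative 68%/32% split yields a V of about 0.344 regardless of the hypothetical n.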

For temporal analyses of demographic variable reporting, regression coefficients were generated for each demographic variable type across time (publication year) with scatter plots generated for visual inspection of trend lines. To illustrate trends in demographic composition, percent female, mean age, and percent White were plotted using all studies; percent Black and composite demographic scores over time were plotted for the US studies only.
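A trend-line slope of this kind can be illustrated with a simple least-squares fit of a study-level value on publication year. This is a sketch under the assumption of ordinary least squares with one predictor; the years and proportions are hypothetical.

```python
def ols_slope(x, y):
    """Slope and intercept of a least-squares line of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("All x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical proportion of White participants by publication year
years = [2004, 2008, 2012, 2016]
prop_white = [0.95, 0.90, 0.85, 0.80]
slope, intercept = ols_slope(years, prop_white)
print(f"change per year: {slope:.4f}")
```

A slope of -0.0125 on a 0 to 1 proportion scale corresponds to a decrease of 1.25 percentage points per publication year in this toy example.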

Results

The final systematic review included 94 studies (N = 8512), published between the years 2000 and 2016 (PRISMA Flow Chart, Fig. 1). MBP distribution is as follows: 60.6% of publications used MBSR, 28.7% used MBCT, and 10.6% used a modified MBSR or MBCT. See online supplement for a complete study list.

Fig. 1 PRISMA flow chart. MBSR = mindfulness-based stress reduction. MBCT = mindfulness-based cognitive therapy. MBI = mindfulness-based intervention

Country of Origin

More than half (k = 53, 56.4%) of the 94 studies were from North America (US: k = 45, 47.9%; Canada: k = 8, 8.5%), a third were from Europe (k = 31, 33%), including seven studies (7.4%) from the UK, and the remaining ten studies were conducted in Asia (Iran: k = 6, 6.4%; China: k = 2, 2.1%; Israel: k = 1, 1%) and Australia (k = 1, 1%). The mean total sample size per study was 91 (range = [18, 341]), with a mean sample of 44 among MBP arms (range = [10, 168]).

Race and Ethnicity

Reporting Race/Ethnicity

Race/ethnicity data were reported in 42 of all 94 studies (44.7%). This includes 5 US studies that contained errors in race/ethnicity reporting such that total numbers reported among participant demographics were not consistent with total reported study sample sizes. Race was reported in 32 of 45 US-based studies (71.1%, Table 1). Specific racial groups (e.g., multiple racial categories) were reported in 25 (55.6%) of the studies conducted in the USA, and 20 (44.4%) of the US-based studies reported both race and ethnicity (17 of these reported Hispanic/Latino as a race or an ethnicity). Although 5 (71.4%) of the UK studies reported participant race, only one of these (14.3%) reported specific racial groups. Four (50%) of the studies from Canada reported on race, although only one of these (12%) reported specific ethnic groups. Among all studies from countries outside the USA, UK, and Canada, only one (3%) reported on race (Fig. 2). Only two studies conducted outside of the USA (one in Canada, Garland et al., 2014; and one in the UK, Armstrong & Rimes, 2016; combined N = 145) reported on multiple racial categories (White/European: n = 130, 90%; Black: n = 1, 1%; Asian: n = 11, 8%; and Native/Aboriginal: n = 3, 2%). There was no reporting of Native Hawaiian or Pacific Islander, Hispanic or Latinx, Multiracial, or Other Race in studies conducted outside of the USA.

Table 1 Reported racial and ethnic category breakdown

Fig. 2 Reporting of race across mindfulness-based studies. This figure shows how studies reported and categorized race. When studies reported multiple racial categories, it was reported as “Specific Racial Groups.” When studies only reported a binary between White and other, it was reported as “Whites vs Others.” Many of the studies did not present racial data and were categorized as “Not Reported”

Participant Race/Ethnicity

Participants identifying as White comprised 89% of the overall participant population (n = 4,030) for studies in which race was reported (Table 1; Fig. 3). Studies conducted in the USA that reported race contained 79% White participants, 6% Black/African American, 3% Asian, 2% Multiracial, and 11% Others (see Table 1). The racial composition of the US study samples differed significantly from that of the general US population (U.S. Census Bureau, 2016) (χ2 (6) = 221.70, p < 0.001, v = 0.161). Whites were over-represented by 2%, and Blacks and Asians were underrepresented by 7% and 3%, respectively. Only 12 participants among all MBP studies identified as Native American/Alaska Native, and 7 identified as Native Hawaiian/Pacific Islander. Hispanic/Latinx participants were often reported as a racial category rather than as a separate ethnicity. Ten studies counted Hispanic/Latinx either as race or separately. Three studies (Eisendrath et al., 2016; Kearney et al., 2016; Pbert et al., 2012) had miscounts of their race data, where not every participant had a race data point, or there were more race data points than the number of participants. Due to these reporting issues, it was not possible to compare Hispanic/Latinx demographics to the US census. As noted earlier, the number of White participants may be underestimated in the present demographic calculations, as participants who selected Hispanic were not classified as White in our calculations even though individuals with comparable characteristics might have been considered White in US Census reporting. No studies performed covariate or subgroup analyses according to race/ethnicity.

Fig. 3 Racial distribution of participants in mindfulness-based studies. This figure shows the racial distribution of participants whose race was reported in Canada, the United Kingdom, and the USA. The US census data (2016) are included for comparison. The ethnicity Hispanic/Latinx is not included here

Gender

Studies that reported on gender (k = 88; 94%) reported it only in binary categories (predominantly male/female, less commonly male vs. other or female vs. other). No MBP study reported transgender, gender-nonconforming, or other gender categories. Across all studies, there were more female than male participants (70% versus 30%, see Fig. 4). Among the US samples, the ratio of female to male participants (68% versus 32%) deviated from the general population (50.8% versus 49.2%) (U.S. Census Bureau, 2016) with a moderate effect size (χ2 (1) = 399.06, p < 0.001, v = 0.344). Only the studies in Iran (k = 6) had a higher percentage of male than female participants (55% versus 45%).

Fig. 4 Gender distribution of female and male participants in mindfulness-based studies. Vertical line indicates population distribution of about 50/50%. No study included a third gender option

Eight studies either adjusted for gender (Cherkin et al., 2016; Polusny et al., 2015) or used gender as a covariate in their analyses (Benn et al., 2012; Bondolfi et al., 2010; Carson et al., 2004; Eisendrath et al., 2016; Skovbjerg et al., 2012; van Ravesteijn et al., 2013). One study (Benn et al., 2012) showed a differential effect of gender on outcomes after the MBP without reporting direction.

Sexual Orientation

Only two studies (2%) reported on the sexual orientation of participants; the remaining studies (k = 92; 98%) did not. One US study (n = 88; Carson et al., 2004) recruited only heterosexual couples, and a Canadian study (n = 117; Gayner et al., 2012) recruited gay male participants. No studies reported on bisexual participants, gay women, or other populations identifying as sexual orientation minorities.

Educational Background

Fifty-six (60%) of the 94 studies reported education. Forty-five of these reported the highest degree obtained, including 23 US studies and 22 non-US studies. Forty-six of these studies reported education in years. Participants in the six US studies that reported on years of education had a weighted mean of 15.5 years of education (SD = 0.56–3.00; n = 611). Non-US studies had a weighted mean of 15.0 years of education (SD = 1.9–3.5; n = 855). Comparing these averages to national and international education statistics, MBP studies included participants with higher educational levels than the general population averages in the USA (13.8 years), the UK (13.2 years), and non-US or UK countries (combined UNESCO educational year average = 12.3 years; UNESCO, 2018). No studies explicitly excluded or included participants based on education, and we did not search for implicit exclusion (e.g., exclusion of illiterate participants). Three studies included education level as a covariate in statistical analyses. Level of education was controlled for in the Cherkin et al. (2016) study due to differences in randomization assignment, with no significant effects on outcomes reported. Skovbjerg et al. (2012) tested for education differences across groups to verify randomization, finding no differences in education level across groups. Benn et al. (2012) used education as a covariate, finding that reductions in perceived stress varied by education level, although the direction was not specified.

Age

The weighted average age was 45.8 years across the 92 (98%) studies that reported mean age (Table 2). Eighty-nine percent of all studies reported SDs for age (k = 84); the median of standard deviations in age was 9.9 (range of SDs reported within studies: 2.6–17). Among studies that reported age ranges (k = 25, 27%), average minimum age was 27.2 and average maximum age was 67.9. Although 43 studies (45.7%) used minimum age as an inclusion criterion (average minimum age criterion = 19.9 years, SD = 8.93), fewer studies (k = 21, 22.3%) used maximum age as an inclusion criterion (mean maximum age = 67.3 years, SD = 4.81). One study recruited older adults, with a minimum age inclusion criterion of 75. Some age inclusion criteria were not explicitly stated but rather implied, as in studies that recruited Marines, physicians, educators, college students, and parents. We note that our findings concerning minimum age inclusion criteria are likely influenced by the search terms of the review, which featured only studies with adult participants. Five (5.3%) studies included age as a covariate (Benn et al., 2012; Bondolfi et al., 2010; Cherkin et al., 2016; Skovbjerg et al., 2012; van Ravesteijn et al., 2013). One of these found that when age was included as a covariate, it was significantly associated with positive affect post treatment (Benn et al., 2012), but the direction of the association was not reported.

Table 2 Participant age across mindfulness-based interventions

Intersectionality

Five studies (5%) reported populations with multiple, intersecting identities: two reported on the intersection of race/gender (King et al., 2016: White men, n = 33, Black men, n = 2; Hebert et al., 2012: White men, n = 21, Black men, n = 14); one reported an intersectional characteristic likely indicative of age/sex (postmenopausal women, n = 110; Carmody et al., 2011); one reported on ethnicity/race/gender (non-Hispanic White women, n = 78; Whitebird et al., 2013); and one reported on sexual orientation/gender (gay men, n = 117; Gayner et al., 2012). No other studies reported on intersecting demographic data or analyzed the effects of their program on intersectional sub-populations.

Language Criteria

Thirty (32%) of the 94 studies specified fluency in the language in which the study was conducted as an inclusion criterion.

Temporal Analysis

We inspected trends in the extent to which studies reported demographic characteristics in the USA, and then in the reported demographic composition of study samples overall, by means of regression. This retrospective analysis is intended to describe trends rather than to make generalizable inferences. Given the field’s aspiration toward generalizability, improved representation, and equity in research, an “ideal” finding would reveal trends toward greater inclusion; findings in the opposite direction would be cause for concern. Thus, regression coefficients and significance statistics are reported for descriptive purposes only.

Reporting of Demographic Characteristics over Time

Reporting of race and ethnicity improved only slightly over time (B = 0.01, SE = 0.02, t(43) = 0.60, p = 0.56), with the trend line consistent with an increase of 0.01 points in the race/ethnicity reporting score per year (Fig. 5A). Initial studies in the early 2000s reported only White/other or did not report race at all. The first study to attain a maximum reporting score for including specific races and ethnicity was published in 2010. Trends did not improve for reporting on education (B = −0.04, SE = 0.03, t(43) = −1.40, p = 0.17). This slope was quite small: since education was dichotomized (0 = no, 1 = yes), this corresponds with an overall shift from reporting education to not reporting it, all else being equal, over roughly 25 years (Fig. 5B). Age also saw a minute decrease in reporting, B = −0.02, SE = 0.01, t(43) = −1.12, p = 0.27 (Fig. 5C). These two variables contributed to the overall worsening over time in total reporting, computed as the sum of seven variables (age, race, education, gender, sexual orientation, variable stratification, and intersectionality), B = −0.06, SE = 0.05, t(43) = −1.14, p = 0.26 (Fig. 5D). Trends in gender reporting corresponded with a slight decrease attributable to 2 studies that did not report gender, B = −0.01, SE = 0.01, t(43) = −1.72, p = 0.09. Too few studies reported variable stratification, sexual orientation, or intersectionality to evaluate trends over time separately for those variables.

Fig. 5 Reporting of demographics in mindfulness-based studies over time in US studies. (A) Reporting of race and ethnicity (0 = no race/ethnicity reported; 1/3 = white/other reported; 2/3 = specific races reported; 1 = specific race and ethnicity reported); (B) reporting of education (0 = not reported; 1 = reported); (C) reporting of age (1/3 = mean; 2/3 = mean plus variance or range; 1 = mean, variance, and range); (D) overall reporting of all demographic variables over time, including age, race, education, gender, sexual orientation, variable stratification, and intersectionality

Demographic Composition of Studies over Time

Next, we inspected trends in participant demographics over time, in all included studies. Over time, MBIs have become slightly more inclusive and diversified with regard to gender, age, and race. Among studies that reported these characteristics, samples have shifted from approximately 80% female to approximately 60% female from year 2005 to 2015, at a rate of approximately 1% per year (B = −0.01, SE = 0.01, t(87) = −1.2, p = 0.234; Fig. 6A). A visual inspection of the distribution of age means over time suggests more variation in mean participant ages across studies (Fig. 6B). This was further examined by comparing studies published before the median year (2013) with studies published in 2013 or later via Levene’s test, which revealed different variances in the two groups (Levene’s statistic (1,89) = 5.39, p = 0.02). There is also a trend toward more diversified race inclusion, with samples going from 99% White in the three initial studies reporting race, which were conducted between 2000 and 2004, to 77% White in 2016, consistent with a 1.6% decrease in the proportion of White participants per year overall, B = −0.016, SE = 0.01, t(40) = −2.70, p = 0.010 (Fig. 6C). Only two US studies prior to 2011 reported the number of Black participants (1% and 4% Black, respectively; Fig. 6D). The percentage of Black participants increased from approximately 1% to 11% in the US studies from 2004 to 2016, B = 0.01, SE = 0.01, t(19) = 1.16, p = 0.26, corresponding with an increase of approximately 0.6% per year, although only 21 studies reported the percentage of Black participants in a way that could be used in these analyses.
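The variance comparison described above can be illustrated with a small Levene's test computation. This sketch assumes the classic mean-centered form of the statistic; the text does not state which centering was used, and the study mean ages below are hypothetical. Only the statistic and its degrees of freedom are computed, since the p value would require an F distribution function from a statistics library.

```python
def levene_w(*groups):
    """Mean-centered Levene's W statistic and its (df1, df2)."""
    k = len(groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("Need at least two groups with two or more values each")
    # Absolute deviation of each value from its own group's mean
    deviations = []
    for g in groups:
        group_mean = sum(g) / len(g)
        deviations.append([abs(value - group_mean) for value in g])
    n_total = sum(len(z) for z in deviations)
    z_means = [sum(z) / len(z) for z in deviations]
    z_grand = sum(sum(z) for z in deviations) / n_total
    between = sum(len(z) * (zm - z_grand) ** 2 for z, zm in zip(deviations, z_means))
    within = sum((v - zm) ** 2 for z, zm in zip(deviations, z_means) for v in z)
    if within == 0:
        raise ValueError("No within-group spread in deviations; W is undefined")
    w = (n_total - k) / (k - 1) * between / within
    return w, (k - 1, n_total - k)


# Hypothetical study mean ages, split at a median publication year
earlier_studies = [44, 45, 46, 45]
later_studies = [30, 45, 60, 45]
w, dfs = levene_w(earlier_studies, later_studies)
print(f"W{dfs} = {w:.2f}")
```

Both hypothetical groups share the same average age, so any difference the statistic picks up reflects spread rather than location.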

Fig. 6

Participant demographics in mindfulness-based studies over time. (A) Percent of population reported as female in included studies, by publication year. (B) Mean age of study population, by publication year. (C) Percent of population reported as White, by publication year. (D) Percent of population reported as Black or African American in studies conducted in the USA, by publication year

Discussion

Under-Reporting and Omission of Demographic Categories and Variables

This systematic review examined the reporting of demographic characteristics in the two most widely studied MBPs (MBSR and MBCT, including close adaptations). We found the extent of omission and inadequacy in reporting to warrant substantial concern. Across 94 RCTs, reporting methods varied substantially across demographic categories, especially race/ethnicity and education. No study reported any non-binary gender categories. Most studies omitted reporting of multiple racial/ethnic and education categories, sexual orientation, or intersectional sub-populations.

One-third of examined studies from the USA reported English language fluency as an inclusion criterion, even though 65 million Americans (21.6%, U.S. Census Bureau, 2016) speak a language other than English at home.

Among studies that reported participants’ education, the majority of participants had approximately one and a half to two more years of education than the general public within the US studies (U.S. Census Bureau, 2016), and two to three more years of education than the general public within the non-US studies (UNESCO, 2018). The majority of studies predominantly enrolled those in middle age (mean participant age = 46), with relatively low age variation among studies. Only three of the reviewed studies focused on older adults. Meanwhile, there are strong indications that the proportions of the US and European populations over 65 will continue to increase in the coming decades (Economic & Financial Affairs, 2018; U.S. Census Bureau, 2018). Among MBPs that reported on race and ethnicity, White participants were over-represented compared with the general US population. Most of the non-US-based studies were conducted in parts of the world where there are high percentages of White individuals [Canada (k = 8; 73% White) and the UK (k = 7; 87% White)], which may reflect disparities in access to research funding and opportunities between the Global North and South (Bulhan, 2015). Trends over time in the US studies show improvements in reporting of multiple racial categories and ethnicity, although reporting of other demographic variables has not improved. Surprisingly, reporting on age (including range and variance) and education worsened between the years 2004 and 2016.

These discrepancies between the study samples of MBP trials and non-participant populations suggest that the field has been out of step with the demographic landscape and shifting needs of world populations. They also speak to potential gaps in the findings that such research can produce. Requiring English fluency to participate in research, or continuing to rely on samples that have more education than the general population, may exclude many individuals with sociocultural characteristics that can moderate the efficacy or implementation of MBPs (Chen & Cheung, 2021). Research also suggests that age may moderate MBP treatment outcomes (Gallegos et al., 2013; Raes et al., 2015), and more representative research can clarify the nature of age-related interaction effects.

Importantly, by excluding BIPOC from the benefits of participating in research, many studies’ failure to report on race and ethnicity contributes to and perpetuates structural racism in the field of integrative and complementary health, following Bailey et al.’s (2017) definition of the term as “the totality of ways in which societies foster racial discrimination through mutually reinforcing systems of housing, education, employment, earnings, benefits, credit, media, health care, and criminal justice.” Mindfulness practices are underutilized by racial/ethnic minorities in the USA (Macinko & Upchurch, 2019), a testament to the failure of the MBP field to actively engage these communities. Enhancing diversity and improving the reporting of race and ethnicity may help to counteract the structural racism of existing institutions, which have thus far enabled systematic exclusion of racial/ethnic minorities (Olano et al., 2015; Proulx et al., 2018).

Reporting on gender and sexual orientation reveals further challenges facing the MBP research field. Consistent with prior meta-analyses of MBPs (Dryden & Still, 2006; Waldron et al., 2018), 70% of all study participants were female. This approximates the demographics of current meditators in the USA, who comprise more women than other genders (Cramer et al., 2016). Fewer than one-tenth of the reviewed studies adjusted for gender in their statistical analyses, and only one study (Benn et al., 2012) showed a differential effect of gender on MBP outcomes (although the direction of the effect was not reported). Some recent studies suggest differential effects of MBPs for men (vs. women, in binary comparisons), including lower improvement in characteristics like clarity and emotion regulation (Kang et al., 2018; Rojiani et al., 2017), higher anxiety (Johnson et al., 2016), and no improvement in subjective well-being and stress after an MBP (de Vibe et al., 2013). Unfortunately, the reliability of such differences is not yet known; inclusion of more male participants in future research is important to better understand the potential moderating effects of sex or gender among MBPs.

Notably, no study reported any non-binary sexual or gender demographics, suggesting that intersex, non-binary, genderqueer, and gender non-conforming (GNC) participants were not included in the study populations, could not be identified because these categories were omitted from demographic measures, or were excluded from the analyses reported in the published research. The only study among those we examined that reported sexual orientation specifically investigated the impact of MBSR for gay men (Gayner et al., 2012).

In the USA, approximately one in twenty-five adults and one in seven adolescents (Raifman et al., 2020) identify as lesbian, gay, bisexual, or transgender (LGBT), and one in 250 identifies as transgender or gender non-conforming (Herman et al., 2017). Proportionally, there may have been up to 1200 LGBT and 16 transgender or GNC individuals who were either not included, not identified, not reported, or withdrawn from analyses. Given the limited access to health care (Cruz, 2014), health disparities (Bradford et al., 2013; Budge et al., 2013), and discrimination in healthcare systems (Lambda Legal, 2014) that sexual orientation and gender minorities face, together with their high willingness to engage in behavioral treatments (Martell et al., 2004), these populations would especially benefit from equitable inclusion in MBP research.

Implications of Reporting Omissions and Lack of Diversity in MBP Research

As MBPs are disseminated more widely, with expanding applications in health care (Demarzo et al., 2015; Wielgosz et al., 2019), underrepresentation of minority populations in MBP research can perpetuate existing health inequities that these populations face, as well as the systemic racism, cis-heterosexism, ageism, and classism that contribute to them. Furthermore, underrepresentation of minoritized groups in research is itself a manifestation of systemic racism. Those at the intersection of multiple underrepresented and marginalized identities (Blosnich, 2018; Heard et al., 2020) are among the least likely to be identified by current reporting standards; only five of the reviewed studies mentioned any type of intersected population, and no study analyzed the effects of their program on individuals identifying with multiple minority identities. The combination of an extensive research literature on one hand, and of demographic homogeneity within this literature on the other, can perpetuate an insidious myth of presumed universality: MBPs may be assumed to have strong universal support, even though the research that lends this support predominantly reflects a narrow slice of the population. Meanwhile, the difficulty many researchers experience in recruiting typically underrepresented samples into their studies may itself be the product of poor disseminability of MBPs for the populations that MBP research has ignored, ironically leaving researchers to continue recruiting the same homogeneous samples.

In light of these problems, the following section outlines five recommendations to improve representation and address disparities in MBP research.

Recommendations for the MBP Research Field

We believe that a system change is necessary, and we see merit in adaptations to the delivery and content of MBPs themselves. Hence, we offer recommendations on multiple aspects, starting with the reporting of diversity and continuing with outreach to diverse populations, including the removal of systemic barriers to participation. We also recommend making space for underrepresented communities to contribute their voice during the research process, by collecting feedback and by collaborating with communities during every step of research.

New Framework for More Robust and Transparent Reporting of Demographic Data, Including Attrition

This review reveals the extent to which MBP studies omit demographic variables in their reporting. Missing and low-quality reporting for race, ethnicity, education, gender, and/or sexual orientation is common in psychological research, where collection of demographic data has suffered from lack of consensus or standardized procedures (Krieger et al., 2003). A field-wide consensus on demographic data collection for MBPs is needed, potentially in the form of a generalized rubric. Such a consensus can help reveal the degree of representativeness in current research, which in turn can guide efforts to improve accessibility and relevance of MBPs for diverse populations.

NIH-funded studies are currently required to report race, gender (although binary only), and ethnicity (NIH, 2017). It may therefore be a relatively small step for studies to start reporting on a number of additional, potentially critical participant characteristics, including the intersections of those variables that are already being tracked. Additional inclusion of variables that impact a person’s access to and “uptake” of mindfulness-based interventions, such as disability status, immigration/refugee status, differing abilities in cognition or physical function, body type, socioeconomic status, religion, and intersecting identities, is also important. For example, religion may play an important role in implementation and dissemination of MBPs to diverse populations because of perceived conflict with participants’ religious (or a-religious) commitments (Palitsky & Kaplan, 2019). Innovative frameworks like Sexual Configurations Theory (van Anders, 2015) can enable the study of diverse sexualities and gender identities via multiple, dimensional, and fluid facets (vs. discrete LGBTQIA+ categories). However, these data must be collected within an environment of collaboration and safety. It could be counter-productive, for example, to collect sensitive data if participants fear how their personal information will be handled, especially information pertaining to immigration status or religion.

These recommendations also extend to attrition reporting. Although attrition data and adherence rates, along with reasons for withdrawal according to CONSORT guidelines, are commonly reported, studies do not typically report the demographic data of participants who drop out. One of the included studies in this review (Arch et al., 2013) specifically addressed differential attrition between racial/ethnic minorities and non-Hispanic White participants. Participants who dropped out before the first class were significantly more likely to be racial/ethnic minorities than participants who began treatment. Study researchers, funders, and journals can encourage reporting of detailed demographic data of individuals who have dropped out or missed study visits, and encourage researchers to include an explanation for these dropouts, including addressing the role of systemic racism, study design-specific barriers, and other factors that may relate to attrition (Amico, 2009). Solicitation of personal experiences and beliefs with regard to racial identity, gender identity, sexual orientation, generational experiences, and/or social standing may further inform whether attrition is related to such factors. As we discuss in Recommendation 5, disclosing researcher identities may also be an important component of rigorous demographic reporting in MBP research.
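As a minimal sketch of what demographic attrition reporting could look like in practice, the snippet below tabulates enrolment, dropout counts, and dropout rates per demographic group. The `attrition_by_group` function, the record fields, the group labels, and the data are all hypothetical and introduced only for illustration; they are not drawn from any reviewed study, and real reporting would need group definitions chosen in collaboration with participants.

```python
from collections import Counter


def attrition_by_group(participants):
    """Return {group: (n_enrolled, n_dropped, dropout_rate)} for each group."""
    enrolled = Counter(p["group"] for p in participants)
    dropped = Counter(p["group"] for p in participants if p["dropped_out"])
    return {g: (n, dropped[g], dropped[g] / n) for g, n in enrolled.items()}


# Hypothetical participant records, for illustration only.
participants = [
    {"group": "group_a", "dropped_out": False},
    {"group": "group_a", "dropped_out": False},
    {"group": "group_a", "dropped_out": True},
    {"group": "group_a", "dropped_out": False},
    {"group": "group_b", "dropped_out": True},
    {"group": "group_b", "dropped_out": False},
]
table = attrition_by_group(participants)
```

Reporting the counts alongside the rates lets readers judge how much weight a given rate can bear when a group is small.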

Dedicate Resources to Improve Outreach

An important possible explanation for the lack of representativeness among MBP study participants is a failure to conduct appropriate outreach. Several groups have identified how outreach problems can hamper recruitment (e.g., Proulx et al., 2018; Spears et al., 2017; Watson-Singleton et al., 2019; Woods-Giscombé & Gaylord, 2014). Their work has also attributed underrepresentation to geographic and temporal barriers (location of study, insufficient time due to professional or personal caregiving responsibilities); perceived lack of interest from research teams; experiences of racism, discrimination, or lack of trust in the research process or teams; lack of identification with, or representation among, research teams or intervention delivery staff; and/or perceived incompatibility with the content of the intervention (Blum, 2014; Woods-Giscombé & Gaylord, 2014). All of these are potential targets for specialized outreach efforts and accommodations.

Outreach expertise is no less essential for conducting inclusive research than any other critical area of expertise, such as biostatistics. For this reason, we recommend that teams attempt to dedicate resources to outreach specialists who can occupy influential decision-making roles on research teams. We propose prioritizing the inclusion of outreach specialists in research budgets for an effective and comprehensive MBP research field. Researchers whose resources do not allow for hiring recruitment specialists are encouraged to nevertheless consider how, given their resources, they can benefit from such expertise through collaborative research or community partnerships.

It may be useful, for example, to identify a liaison with underrepresented communities, particularly at times of recruitment, informed consent, study visit follow-up, and if a study participant drops out or withdraws. Additionally, it may be useful to identify how data collection procedures can be established that allow participants to explain reasons for dropout, including reasons related to exclusion or discrimination or safety.

Implement a Feedback Structure Throughout the Study Process and Report Received Feedback to Funders and in Publications

Another way to begin to address the representation gap is to solicit feedback from study participants, especially from stakeholders in historically marginalized and under-represented populations. This would allow the field of MBP to catch up with, and enter into conversation with, already standard procedures in implementation science through “Hybrid 3” designed studies focusing on implementation (Landes et al., 2019), with multiple extant frameworks that may serve as guidelines. Although a full discussion of this literature exceeds the scope of this review, several implementation frameworks may be especially germane for the collection, disclosure, and incorporation of participant feedback to design better suited interventions for diverse populations: RE-AIM (Glasgow et al., 2020), M-PACE (Chen et al., 2013), and Adaptome (Chambers & Norton, 2016), as well as the methods of community-based participatory research (Wallerstein & Duran, 2006).

Collaborate with Communities During Every Step of Intervention Development

Beyond inclusion is collaboration. Increased collaboration with stakeholders from diverse populations can afford a better understanding of the relevant cultural norms, existing strengths, and pressing needs of these communities. It also introduces the opportunity to incorporate cultural values (e.g., familismo; Davila et al., 2011), and to anticipate community-specific risks (e.g., negative religious coping; Pargament et al., 1998). For some communities, collaborative research may indicate departures from current emphases on intellectual, individualistic, and cognitive approaches (Magee, 2016).

Importantly, stakeholder collaborators are vital for diversifying the “paths to mindfulness.” Community-based and participatory research can signal how to potentiate the strengths of existing communities and populations to adapt and strengthen the interventions for specific populations (e.g., Bazzano et al., 2015). It is also crucial not to assume that currently standard conceptions of MBPs are universally good for everyone (Britton, 2019). Collaborating with stakeholders from marginalized communities may involve prioritizing the tools for well-being that they already possess. Indigenous, native, religious, cultural, ancestral, or family-based traditions and values may contribute to resiliency in a way that the current MBP field leaves out (La Roche et al., 2011; Santoyo & Santoyo, 2019). Co-creation of new programs with underrepresented populations has the potential to harness strengths from multiple sources (Hartwell et al., 2018; Spears et al., 2017; Watson-Singleton et al., 2019). Such collaborative efforts may reveal that adapting MBP language and praxis to include idioms commonly used by the target population (e.g., mindful walking can be amended to mindful dancing; didactics based on a “banking” model of education may be amended to story-sharing models) (DasGupta et al., 2006) can improve engagement and reduce stigma (Kaiser et al., 2015).

Diversify MBP Research Investigators, Intervention Facilitators, and Settings

This review demonstrated that MBP research studies are largely located in Western, educated, industrialized, rich, and democratic (WEIRD) societies in the Global North, similar to much other behavioral and psychological research (Henrich et al., 2010; Rad et al., 2018). Diversification of the facilitators, investigators, and settings of MBP research may support diversification of MBP research participants, and can contribute to better-adapted methods and curricula (Proulx et al., 2018). Proulx et al. (2018) observed, “We know that when one demographic is overrepresented in anything, the work in that field will typically be biased in ways that reflect the values and practices of that demographic” (p. 368). While intuitive approaches to diversification may be tempting, it is important to approach this process iteratively with feedback from participants. For example, while improving race/ethnicity concordance between participants and research staff may seem like an intuitive solution, a recent study that pooled data from longitudinal studies of respiratory illness found that concordance was, surprisingly, associated with higher attrition rates (Mindlis et al., 2020). Funding institutions such as the NIH have devoted significant resources to minority mentorship programs and funding opportunities. Such resources must also be invested in diversifying the next generation of researchers, study coordinators, and data scientists in MBP research. One step forward could be to encourage and normalize the reporting of the demographic identities of the researchers conducting the research, including their roles within the study (i.e., decision making, data collection, MBP delivery) and the quality and amount of contact with participants, as is common in reflexive qualitative approaches (Engward & Davis, 2015).

Limitations of This Systematic Review

Identity of Researchers Was not Included

This study did not collect or report on the identities of the researchers who conducted the studies included in this review. Aside from considering the demographics of the participants in MBP research, it is crucial to consider demographics of the MBP instructors and researchers, who also typically emerge from WEIRD (Henrich et al., 2010) and dominant culture populations, a trend that extends into disparities in research funding (Ginther et al., 2011). On one hand, researcher identities might interact with the participants’ demographics (Does et al., 2018) to produce unique and unmeasured effects. On the other hand, the particularities of researcher backgrounds may introduce biases and priorities that derive from their own worldviews and assumptive frameworks.

Underpowered Samples

The sample sizes of the reviewed studies were too small to conduct optimal hierarchical analyses incorporating the intersectionality of subgroups (Frost et al., 2008). This power issue can only be addressed by conducting meta-analyses of multiple studies. In order to facilitate analyses over multiple studies, the raw data of studies could be uploaded to databases, such as the Open Science Framework (osf.io). To avoid possible re-identification of the participants, sufficiently large sample sizes would still be necessary.

Studies Were Limited to Adult Populations

The current review only included studies with adult samples (age > 18 years) and therefore provides no information about the representation and reporting of key variables in studies in youths. This is especially relevant as school-based mindfulness programs are becoming widespread, with efforts to make them required and state-sponsored (Loughton & Morden, 2015; Zenner et al., 2014).

Hispanic/Latinx Ethnicity Data Overlapped with Racial Data

It was not possible to determine the degree of representation of those who identify as Hispanic or Latinx. In analyzing the data on race, we attempted to separate out double-counted Hispanic/Latinx persons. However, it was occasionally not possible to determine exactly where the double counting occurred. Thus, there may be a small amount of error in the number of individuals tabulated in the Hispanic/Latinx category.

Non-English Language Studies Were not Included

This review was limited to studies that were reported in English. While we documented the number of studies that used language as a criterion for participation, we did not document the languages used to administer the mindfulness intervention in the studies. This may have impacted results by skewing included studies toward those conducted in countries with majority-White populations. Without direct verification, we assumed that all interventions were conducted in the study country’s national language, and that all the US studies were conducted in English, because the articles did not report otherwise and we did not follow up with study authors.

Other Demographic Information Was not Reported on

The demographic variables reported in this review reflect what is commonly reported in the underlying studies, but the list of variables is not exhaustive. Other important variables include recent immigration/refugee status, differing abilities in cognition or physical function, body type, religion, and non-education metrics of SES, such as income or employment. Including diverse populations in the research process to improve the selection, definition, and recording of demographic variables might help to optimally reflect the population’s diversity.

Only a Subset of MBPs Are Represented in This Review

Including only studies of standardized 8-week MBCT and MBSR is both a strength and a limitation of this study. As MBSR and MBCT are among the most structured, standardized, and widely disseminated programs, this study was able to compare a large number of participants across a relatively controlled set of conditions. This is a limitation, however, because these MBPs do not include the broad range of mindfulness-based interventions (e.g., apps, mindfulness groups, and mindfulness-based psychotherapy like acceptance and commitment therapy). They are also time-intensive, were not specifically created to serve underrepresented populations, and may include structure and content that are not reflective of the majority of contexts where MBPs are actually delivered. Thus, this review prioritized internal over external validity. Future reviews should analyze the demographics in a broader range of MBPs, including modified and specifically adjusted MBPs (Amaro et al., 2014; Dutton et al., 2013; Sobczak & West, 2013) to further define who currently uses MBPs, and for whom MBPs have been designed and evaluated (such as DeLuca et al., 2018).

Gender Versus Sex

Sex-related biological factors and gender identity may influence health and response to interventions in different ways, and hence, the terms sex and gender should be used appropriately rather than interchangeably (Clayton & Tannenbaum, 2016). In this review, sex assigned at birth and gender identity may be conflated: some included studies reported only sex, and for these studies sex was used as an index of gender, introducing a degree of error in the estimation of gender characteristics.

Challenges of Diversifying MBP Research to Address Systemic Racism, Cis-Heterosexism, Ageism, and Classism

Results presented here and in others’ work underscore the need to diversify MBP research to develop interventions that best meet the needs of populations at the highest risk of health disparities. Several guidelines are presented here to facilitate this endeavor, but challenges remain. Solutions may require multi-pronged approaches with iterative processes, with collaboration and feedback between multiple levels of inquiry and experience (e.g., researcher, participant, funder, community, providers, instructors).

The five recommendations that we offer represent a first step toward addressing this serious challenge in research. However, more than recommendations are required: since 2016, the field has continued to be slow to mobilize to address issues of equity and inclusion. Only cross-disciplinary engagement among MBP researchers on a large scale can begin to change the structures and incentives of research, to ensure that when it comes to MBPs, we are mindful of who has a seat at the table.