Introduction

Household air pollution (HAP) is one of the world’s largest environmental health risk factors [1]. Nearly 3 billion people rely on biomass fuels like firewood, charcoal, animal dung, and crop residues for their daily cooking and heating needs [2]. Inefficient combustion from burning biomass fuels in traditional open fires leads to high levels of air pollution and environmental degradation. In turn, HAP exposure is responsible for an estimated 1.6 million premature deaths and 60 million disability-adjusted life years annually [1]. There is substantial epidemiological evidence for the adverse effects of HAP on health [3,4,5,6,7,8,9], but to date there have only been a few randomized controlled trials of cookstove interventions to improve health [10,11,12,13,14,15,16] and evaluating changes in personal air pollution exposure remains rare [17,18,19].

While estimates of health burdens from air pollution require data on average personal exposure (to fine particulate matter—PM2.5—principally) [20], exposure assessment remains a significant challenge in clean cooking intervention studies [21]. Clean cooking interventions must reduce long-term average personal air pollution exposure if they are to improve health. Therefore, contextualizing the results from clean cooking interventions is only possible through extensive personal air pollution exposure monitoring to characterize the effect of interventions on exposure—and thus the potential for improvements in health. Furthermore, personal air pollution exposure monitoring enables exposure–response analyses that are instrumental in establishing health risks [20].

We carried out the Ghana Randomized Air Pollution and Health Study (GRAPHS) (Trial Registration NCT01335490), a cluster-randomized intervention trial to test the effectiveness of a cleaner biomass stove or a clean cooking fuel to increase birth weight and reduce pneumonia incidence during the first year of life through reduced maternal and child air pollution exposure [22]. As elsewhere [10, 11, 23, 24], only households with an eligible pregnant mother were provided the intervention stoves, meaning most participants were surrounded by other family units still using traditional biomass fires for cooking.

GRAPHS makes several important contributions to the understanding of the potential for clean cooking interventions to improve health. GRAPHS was among the first randomized controlled trials to include a liquefied petroleum gas (LPG) intervention [16], though there are others ongoing [24,25,26,27]. In addition, GRAPHS researchers undertook extensive personal air pollution exposure measurements to enable assessments of the effectiveness of the interventions to reduce exposure and subsequent exposure–response analyses with health outcomes.

The present study describes the effects of clean cooking interventions on long-term average personal (maternal and child) air pollution exposure from a large cluster-randomized intervention in Ghana. We present exposure results from intention-to-treat analyses, as well as an exploration of the variety of factors that affect personal exposure. In doing so, we provide guidance for future interventions and programs that seek to reduce air pollution exposure through clean cooking fuels and cleaner biomass-burning stoves.

Methods

The GRAPHS protocol has been described elsewhere [22]. Briefly, 35 clusters of 38 communities were randomized into three study arms: control, cleaner biomass stove, and clean cooking fuel. Eligible women were (1) carrying a live intrauterine singleton fetus, (2) in their first or second trimester of pregnancy (gestational age ≤ 24 weeks as determined by ultrasound), (3) the primary cook in their household, and (4) nonsmokers. The protocol was approved by the Columbia University Medical Center and the Kintampo Health Research Centre Institutional Ethics Committee. All pregnant women provided written informed consent for their and their child’s participation. Participants were enrolled from August 2013 to January 2014 and data collection ended in March 2016.

The study included two intervention arms. In the cleaner biomass stove arm, households received two BioLite HomeStoves (BioLite Inc., Brooklyn NY). The BioLite stove improves heat transfer efficiency (i.e., more energy to the pot per unit fuel combusted) through improved geometry and also increases combustion efficiency through a thermoelectric-powered fan that circulates air through the combustion chamber [28, 29]. In the LPG intervention arm, households received one two-burner LPG cookstove and two 14.5 kg LPG cylinders. After the baseline exposure assessment, households received deliveries of one LPG cylinder refill and stove maintenance and repair as needed until they exited the study. Additional gas was available if households ran out prior to the next scheduled delivery. Stoves and associated hardware were repaired or replaced when needed in both intervention arms. Representative photographs of the stoves across the study arms are available in Fig. S1. Research staff visited each home weekly and checked on stove status. Households in the control arm also received weekly visits. These were framed as bed net check-up visits.

Study context

The study sample consisted of women and children from 38 communities in the Bono East Region of Ghana (formerly known as the Brong-Ahafo Region), including Kintampo North Municipality and South District of Ghana, West Africa. In a formative pilot study in the GRAPHS study population, biomass fuel use was recorded among 99% of the households [30]. A nationally representative survey shows that 91% of rural households and 73% of all households relied on biomass fuels (firewood and charcoal) for cooking in 2017 [31]. The region is primarily a tropical savanna climate. Uniquely, West Africa experiences a season called Harmattan characterized by episodes of dry and dusty northeasterly winds blowing from the Sahara Desert over West Africa (December–March). There is also pervasive crop and field burning during Harmattan in this region [32].

Exposure measurements

Rationale

Air pollution exposure assessment in GRAPHS was designed to optimize available technology and funds based on pilot experiences in Ghana [22]. Published pilot data indicated that area sampling (e.g., in the kitchen) was not predictive of personal exposures [30]. In line with our objective of identifying the effects of the interventions on personal exposure and to enable individual-level exposure–response analyses, we opted to monitor personal exposure. Furthermore, at the time of designing the study, the scientific literature indicated that personal CO exposure was a good predictor of personal PM2.5 exposure [33,34,35] and that 48 h of sampling was necessary to effectively estimate long-term exposure [36].

Mean PM2.5 exposure of the primary cook in the pilot was 129 μg/m3 (95% confidence interval (CI): 100–157 μg/m3; median: 122 μg/m3), an exposure somewhat lower than in other similar studies [18, 37, 38]. At the time of developing the study, these relatively low baseline exposures were believed to give the cooking interventions a greater chance of yielding observable health benefits than higher baseline exposures would, because lower exposures are closer to the steepest part of the PM2.5 dose–response curves for relevant health outcomes (i.e., ~15–100 μg/m3) [39].

Given budgetary constraints, we opted for CO, which was cheaper to monitor than PM2.5, as the primary marker of air pollution exposure. Still, given the importance of PM2.5 as an indicator of health risk, we obtained supplemental funding to monitor personal PM2.5 for the majority of participants at two time points after intervention, rather than at more time points for fewer participants. This approach was intended to enable the development of a CO to PM2.5 prediction model, thus retaining a large study sample in future PM2.5 exposure–response analyses. However, we note two limitations of this approach. First, while at the time of study development and during data collection the literature suggested that CO to PM2.5 prediction was a feasible and lower-cost alternative to direct PM2.5 measurements, since then the predictive power of CO to estimate personal PM2.5 exposure has come under question [40]. Second, as reported in section “Monitoring plan,” PM2.5 exposure monitoring did not occur in the baseline period, which limits statistical analysis of the effect of cooking interventions.

Monitoring plan

The primary objective of air pollution exposure monitoring during GRAPHS was to attribute exposures to individuals to enable (forthcoming) exposure–response analyses. Figure 1 summarizes the exposure monitoring plan. Baseline exposure assessments occurred after enrollment and prior to stove intervention. Field teams then carried out three additional post-intervention exposure assessments over the remaining duration of the pregnancy (~9, 6, and 3 weeks prior to delivery). Mothers and newborns received exposure assessment 1, 4, and 12 months after delivery. A subset of women received co-located PM2.5 and CO monitoring. Personal exposure measurements were collected for 72-h periods and trained fieldworkers visited each participant every 24 h during each 3-day period to record information about activities during the previous day and to ensure monitor wearing compliance.

Fig. 1: Personal air pollution exposure monitoring plan for GRAPHS.

Participants (pregnant women) received baseline carbon monoxide (CO) exposure monitoring at the time of enrollment in the study or shortly after (77% the same day; 86% within a week). Three weeks after intervention stove delivery (itself 1-2 weeks after enrollment), all participants received personal CO exposure monitoring and a subset of participants (65%) each received simultaneous personal CO and personal PM2.5 exposure monitoring. Sessions 3 and 4 were personal CO exposure monitoring only, spaced at three-week intervals prior to birth. One month after birth, both the mother and newborn received personal CO exposure monitoring. Three months later, all mothers received personal CO exposure monitoring and a subset (65%, partially overlapping with the first subset) received simultaneous personal CO and personal PM2.5 exposure monitoring. At this time, all newborns received personal CO exposure monitoring. Newborns did not receive personal PM2.5 exposure monitoring due to the size of the monitor. Eight months later, at child age 1 year, the mother and child received personal CO exposure monitoring. Session numbers (1-7) are associated with the relative timing of the planned monitoring sessions (i.e., baseline = 1, three weeks before birth = 4, four months after birth = 6).

Carbon monoxide monitoring

We used the Lascar EL-CO-USB Carbon Monoxide (CO) data logger (Erie, PA) as the primary personal exposure monitoring method. The devices were programmed to record CO concentrations every 10 s throughout the entirety of the target 72-h monitoring period. The device reports concentrations between 0 and 1000 parts per million (ppm) and has a manufacturer-reported precision of ±6%. In addition to factory calibration, calibrations were checked every 6 weeks using NIST traceable certified calibration gas in the Kintampo Health Research Centre (KHRC) laboratory. Based on these calibration checks, device- and time-specific correction factors (CF) were generated to adjust CO observations during data processing [41, 42].

The CO monitor was placed in a rainproof plastic housing and clipped to clothing near the breathing zone of the mother. For infants, monitoring equipment was clipped to swaddling clothes or the cloth that holds the baby on its mother’s back. Participants were instructed to keep the CO monitor on their person/near the baby throughout the day and to place it close to their head while sleeping (see Fig. S2 for representative photographs).

In a subset of samples (N = 132), we carried out co-deployments of the CO monitors where a participant would wear two monitors concurrently throughout a deployment period. Valid 48-h estimates between co-deployed devices were positively correlated (r = 0.62; p value < 0.001). We averaged values in analyses when a participant had two valid 48-h estimates.

Fine particulate matter monitoring

In one prenatal and one postnatal maternal monitoring session, the RTI MicroPEM V3.2 monitor (Research Triangle Park, NC) was deployed alongside the CO monitor. The MicroPEM includes a nephelometer for real-time monitoring, a Teflon filter for analysis of integrated concentrations, and an accelerometer for assessing wearing compliance of subjects. Teflon filters were pre- and post-weighed on a microbalance after equilibration in an environmentally controlled glovebox, with static charge dissipated with a Po-210 source and correcting data for buoyancy, following established protocols at Columbia University described further in Supporting Information. Filters were installed in and removed from the MicroPEM in a clean air hood at the KHRC laboratory. During the first and last 5-min periods of each deployment, a low back pressure HEPA filter was attached to the MicroPEM to aid in correction of the nephelometer baseline drift.

Identifying valid air pollution exposure estimates

The purpose of this study is to assess the effect of two clean cooking interventions on the personal exposure of women and children in Ghana. To best address this research question, we utilized a stringent data validation procedure and retained only the data in which we have the highest confidence. The study protocol dictated 72-h monitoring periods for both CO and PM2.5 deployments. However, only 47% of CO exposure sessions achieved 72 h of run time. Still, more than 90% of all CO exposure deployments achieved more than 48 h of run time. Therefore, we used mean 48-h CO exposure as the primary study outcome. Data after the 48-h mark were discarded to maintain comparability across samples due to the diurnal patterns observed in personal exposure to air pollution (e.g., low exposures during the night, very high exposures during cooking events) and to not arbitrarily capture a different number of short-term cooking events which largely drive the average CO exposure. We utilized the same procedure for PM2.5 exposure; 92% of PM2.5 exposure deployments achieved 48 h of run time. Full details on deployments meeting validation criteria are reported in the “Results” section.
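
The truncation rule described above can be illustrated with a minimal Python sketch. It is not the study's processing code: the function name `mean_first_48h`, the `(timestamp, ppm)` input format, and the choice to define run time as the span between the first and last reading are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import fmean

WINDOW = timedelta(hours=48)  # analysis window taken from the text

def mean_first_48h(readings):
    """Mean concentration over the first 48 h of a deployment.

    readings: list of (timestamp, value) tuples sorted by time (assumed format).
    Returns None when the deployment ran for less than 48 h.
    """
    if not readings:
        return None
    start = readings[0][0]
    run_time = readings[-1][0] - start  # assumed definition of run time
    if run_time < WINDOW:
        return None
    # Discard everything after the 48-h mark so all samples are comparable.
    kept = [value for ts, value in readings if ts - start < WINDOW]
    return fmean(kept)

# Hypothetical hourly data: 72 h of readings, elevated only after hour 48.
t0 = datetime(2014, 1, 1)
long_run = [(t0 + timedelta(hours=h), 1.0 if h < 48 else 5.0) for h in range(72)]
short_run = [(t0 + timedelta(hours=h), 1.0) for h in range(41)]
```

In this toy example the late, elevated readings do not affect the 48-h mean, and the 40-h deployment is rejected rather than averaged over a shorter window.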

Carbon monoxide exposure validation

CO exposure data were validated according to three independent criteria described here, in the Supporting Information, and at length elsewhere [41]: (1) deployment duration; (2) visual validity of the exposure time series; and (3) CF confidence.

  • Deployment duration: deployments lasting fewer than 48 h were removed from final data analysis.

  • Visual validity (valid, low, or invalid): with oversight from study leadership, two members of the research team, blinded to study arm, plotted the time series exposure data and visually assessed the validity of the measurements according to three criteria, which were codified in a standard operating procedure [41]. First, patterns of “spikes” of increased exposure were assessed as valid, as opposed to plateaus of high exposure or CO values that increased or decreased over the entire time series. Second, an elevated baseline in which the majority of CO readings hover above 0 ppm was assessed as invalid. Third, long periods of baseline 0 ppm were evaluated on a case-by-case basis (e.g., periods of flatline at 0 ppm while CO spikes still occur may not be problematic, but a sudden change from more responsive data to sudden flatline was deemed invalid). Only visually valid files were retained for this study.

  • CF confidence (high, low, or none): monitors were tested against a standardized 50 ppm CO tank every 6 weeks, from which we calculated CF (CF = measured value divided by the expected value). Confidence levels, developed after visual inspection of the data and to avoid large corrections, were assigned as follows: “high” if the CF is in the range 0.6 ≤ CF ≤ 1.2, “low” if CF is >1.2 or if 0.2 ≤ CF < 0.6, and “no” confidence if CF < 0.2. Only samples with a high CF confidence were retained for this study.
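
The three retention criteria listed above can be sketched in Python as follows. This is only an illustration of the stated thresholds: the function names and the string labels are hypothetical, and the visual validity label is assumed to have been assigned by a human reviewer beforehand.

```python
def correction_factor(measured_ppm, expected_ppm=50.0):
    """CF = measured value divided by the expected value (50 ppm tank in the text)."""
    return measured_ppm / expected_ppm

def cf_confidence(cf):
    """Classify a correction factor using the thresholds given in the text."""
    if cf < 0.2:
        return "none"
    if 0.6 <= cf <= 1.2:
        return "high"
    return "low"  # CF > 1.2, or 0.2 <= CF < 0.6

def is_retained(duration_h, visual_validity, cf):
    """A deployment is kept only if it passes all three criteria."""
    return (
        duration_h >= 48                  # deployment duration
        and visual_validity == "valid"    # label from the manual review
        and cf_confidence(cf) == "high"   # CF confidence
    )

# Hypothetical calibration check: the monitor reads 45 ppm against the 50 ppm tank.
cf = correction_factor(45.0)
keep = is_retained(duration_h=71.8, visual_validity="valid", cf=cf)
```

Keeping the three checks as separate functions mirrors the description of the criteria as independent, so the share of deployments failing each one can be tabulated on its own.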

Fine particulate matter exposure validation

Fine particulate matter exposure assessed using the RTI MicroPEM underwent a multi-stage validation process to utilize the real-time and time-integrated data and estimate 48-h personal exposure. A full description of the exposure validation procedure is available in Supplementary Information Section 1.2. Briefly, the time series data were visually validated, checking if the data contained negative readings, improbable plateaus of high values, “stair-step” increases and decreases in baseline, or if the pre- and post-deployment HEPA period readings were outside of the expected range (±20 μg/m3). Only visually valid data were retained for this study.

Three corrections were done to each deployment to get final average 48-h PM2.5 concentrations. First, an initial baseline correction was applied where valid interpolated HEPA readings for each minute were subtracted from nephelometer readings. If the endline monitoring-period HEPA filter reading was missing, then the pre-HEPA reading was assumed to be valid for the entire deployment. Second, for deployments with valid gravimetric filter weights (no holes, tears, or lost filters), a gravimetric correction was carried out by multiplying each nephelometer reading by the ratio of the gravimetric PM2.5 concentration divided by the average nephelometer PM2.5 concentration for the total deployment time. For deployments without valid gravimetric samples, an average CF for the individual MicroPEM device was used. Nephelometer measurements were assigned an average CF using the most recent or bracketed (before and after deployment) paired valid gravimetric samples. Third, all nephelometer data points were corrected as described above prior to averaging the first 48 h of active data collection.
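
A minimal Python sketch of these corrections is given below. It rests on assumptions not confirmed by the text: the HEPA baseline is interpolated linearly between the pre- and post-deployment readings, the gravimetric ratio uses the mean of the baseline-corrected nephelometer series, and all function and parameter names are hypothetical.

```python
from statistics import fmean

def hepa_baseline(n, pre_hepa, post_hepa=None):
    """Per-minute baseline for n readings.

    Linear interpolation between pre and post HEPA readings is an assumption;
    when the post reading is missing, the pre reading is used throughout.
    """
    if post_hepa is None or n == 1:
        return [pre_hepa] * n
    step = (post_hepa - pre_hepa) / (n - 1)
    return [pre_hepa + i * step for i in range(n)]

def corrected_series(neph, pre_hepa, post_hepa=None, grav_conc=None, device_cf=None):
    """Apply the baseline correction, then the gravimetric (or device-average) scaling."""
    baseline = hepa_baseline(len(neph), pre_hepa, post_hepa)
    zeroed = [x - b for x, b in zip(neph, baseline)]
    if grav_conc is not None:
        # Ratio of the filter concentration to the mean nephelometer concentration.
        factor = grav_conc / fmean(zeroed)
    elif device_cf is not None:
        # Fallback: average correction factor for this MicroPEM device.
        factor = device_cf
    else:
        raise ValueError("need a gravimetric concentration or a device-level CF")
    return [x * factor for x in zeroed]

# Hypothetical three-minute deployment with a valid filter sample of 38 ug/m3.
with_filter = corrected_series([10.0, 20.0, 30.0], pre_hepa=0.0, post_hepa=2.0, grav_conc=38.0)
# Hypothetical deployment without a valid filter and without a post-HEPA reading.
without_filter = corrected_series([3.0, 5.0], pre_hepa=1.0, device_cf=1.5)
```

Under these assumptions the mean of a gravimetrically corrected series equals the filter-based concentration by construction; the nephelometer then contributes only the time pattern needed to isolate the first 48 h.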

Statistical analysis

We carried out a difference-in-differences analysis to assess the effect of the cooking interventions on maternal air pollution exposure. We also present two additional analyses using data subsets to (1) demonstrate the importance of leveraging the full randomized design to assess the effectiveness of the cooking interventions and (2) to provide a comparison to other studies using cross-sectional or before and after designs.

We carried out three types of regression analyses with log maternal 48-h CO exposure as the primary outcome to assess the effect of interventions on exposure (see Table 1). Secondary outcomes included log child 48-h CO exposure and log maternal 48-h PM2.5 exposure. For all regression analyses, we utilized generalized estimating equations (GEEs) with robust standard errors using the “sandwich” variance estimator and an exchangeable correlation matrix to account for both multiple observations per participant and the village-level nature of the GRAPHS intervention, as implemented in other studies with repeated measurements among individuals nested within clusters [43, 44]. In GEEs, parameter estimates of interest are interpreted as “population-averaged,” because they are averaged across the clusters (i.e., villages and participants in those villages).

Table 1 Approaches to estimating the effect of intervention on exposure.

The first equation assesses differences in exposure “across study arms” utilizing only post-intervention data. The parameter of interest in this model is the effect of study arm indicator variables, with the improved biomass and LPG arms being compared to the control arm. The second equation carries out a “before and after” comparison for all study arms. Here, the parameter of interest is the effect of the post-intervention study period indicator variable. This model effectively controls for subject characteristics but has limited ability to control for confounding by time-varying determinants of exposure. The third and final equation is a “difference-in-differences” approach that utilizes all study data and includes indicator variables for study arm and post-intervention study period. The main parameters of interest are the interaction variables between intervention groups and post-intervention study period. This is similar to carrying out the “across study arms” comparison but with the added adjustment for any potential differences between study arms.

When exponentiated, the parameters of interest represent the fraction of exposure experienced by the control group that the group of interest experienced. We transformed these results into the final outcome of interest: percent reduction in personal exposure due to treatment status. The “difference-in-differences” model is our primary specification because it fully leverages the study design and data collection and best accounts for potential confounding. Nonetheless, we present the “across study arms” and “before and after” models because they are comparable to other common study designs [18] and demonstrate the importance of the randomized nature of our intervention.
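
To make the transformation from a log-scale parameter to a percent reduction concrete, the following Python sketch computes a simple two-group, two-period difference-in-differences estimate from group means of log exposure. It is a simplified stand-in, not the GEE models used in the study: it ignores repeated measures and clustering, and the function names and data are hypothetical.

```python
import math
from statistics import fmean

def mean_log(values):
    """Mean of log-transformed exposures (values must be > 0)."""
    return fmean([math.log(v) for v in values])

def did_log(ctrl_pre, ctrl_post, trt_pre, trt_post):
    """Difference-in-differences on the log scale.

    Equals the arm-by-period interaction term of a saturated 2x2 model.
    """
    change_trt = mean_log(trt_post) - mean_log(trt_pre)
    change_ctrl = mean_log(ctrl_post) - mean_log(ctrl_pre)
    return change_trt - change_ctrl

def percent_reduction(beta):
    """exp(beta) is the ratio to the comparison group; convert it to a percent reduction."""
    return (1.0 - math.exp(beta)) * 100.0

# Hypothetical exposures (ppm): the control arm halves, the treated arm falls to a quarter.
beta = did_log(
    ctrl_pre=[2.0, 2.0], ctrl_post=[1.0, 1.0],
    trt_pre=[2.0, 2.0], trt_post=[0.5, 0.5],
)
reduction = percent_reduction(beta)  # about 50% beyond the change seen in the control arm
```

Because the outcome is log-transformed, the exponentiated estimate is a ratio of geometric means, which is why the result reads as a relative rather than an absolute change.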

Then, the fourth equation assessed the effectiveness of interventions disaggregated to each monitoring session. This analysis mirrored the “difference-in-differences” approach, but rather than treating the post-intervention period as a unit, we analyzed each session to assess the effectiveness of the intervention over time.

In an additional analysis, we examined the association between population density surrounding participants and personal exposure. We calculated the number of individuals living within a 50 m radius of each study household using local census data to estimate population density [45], and therefore potentially capture neighboring air pollution emissions. We also considered radii of 100 and 200 m and found similar associations in analysis and limited changes in population density ranking. Therefore, we opted for the closest distance to ensure the plausibility of the association as a measure of contributions from neighboring cooking events.
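
One way such a radius-based count could be computed is sketched below in Python. The census record format `(latitude, longitude, number of residents)`, the use of great-circle (haversine) distance, and the inclusion of the study household's own residents are assumptions for illustration; the coordinates are invented.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def persons_within(household, census, radius_m=50.0):
    """Sum residents of all census households located within radius_m of `household`."""
    lat0, lon0 = household
    return sum(
        n for lat, lon, n in census
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    )

# Hypothetical records: the study household itself, one neighbor about 22 m away,
# and one compound about 111 m away.
census = [
    (8.0500, -1.7300, 6),
    (8.0502, -1.7300, 4),
    (8.0510, -1.7300, 9),
]
home = (8.0500, -1.7300)
density_50 = persons_within(home, census, radius_m=50.0)
density_200 = persons_within(home, census, radius_m=200.0)
```

Passing the radius as a parameter makes the 100 m and 200 m sensitivity checks a one-argument change.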

As a check of robustness, we jointly applied the CO and PM2.5 validation procedures to sessions with co-deployed CO and PM2.5 monitors to obtain a smaller, “paired high-validity” maternal PM2.5 and CO exposure data set (N = 1048). We observe consistency between our main results and those obtained in this paired exposure data set and only report these results in Supplementary Information Section 2.1.

All analyses were performed in R software version 3.6.0 [46]. GEEs were implemented using “geepack” [47]. Code that supports the analyses presented in this study will be made available upon publication.

Results

Table 2 reports descriptive statistics for the GRAPHS study participants with a valid CO exposure estimate. Participants were nonsmoking pregnant women, on average in their late 20s, with ~2 years of completed formal education on average. Households had on average between six and seven members. Most households had their primary cooking location fully outside, though many had multiple cooking locations, one of which was at least semi-enclosed (95%). Approximately half of households shared their primary cooking location with another household—though study households were the sole users of their intervention stoves. In addition, half of study households had a dedicated room in the house for cooking. Firewood was the dominant primary cooking fuel for households prior to randomization, though half used charcoal as a secondary fuel. Households in the LPG cluster had a slightly higher average number of persons living within 50 m of the household. The use of tobacco products was relatively rare; only one-fifth of households had a smoker (almost exclusively men). Observed household- and individual-level differences across study arms resulted from randomization taking place at the community level [22].

Table 2 Baseline descriptive statistics of GRAPHS population with a valid CO exposure estimate.

Exposure measurements and validation

The GRAPHS study team carried out 11,898 CO exposure deployments (8540 maternal and 3358 child) on 1405 mothers and 1083 children. More than 75% of mothers received six or seven sessions and more than 75% of children received all three of their intended sessions (Table S2). Nearly all mothers (97%) received baseline exposure monitoring and at least one post-intervention monitoring session. The percentage of mothers and children receiving exposure monitoring during each session is detailed in Table S3.

Figure S3 summarizes the air pollution exposure validation process. Overall, two-thirds of maternal CO exposure sessions resulted in a high-validity 48-h exposure estimate; the percentage was slightly smaller for child exposure monitoring sessions (57%) (Table S4). Estimates were removed according to validation criteria: 10% lasted <48 h (maternal median = 71.85 h, child median = 71.93 h); less than one-quarter were not visually valid (maternal: 16.3%, child: 23.9%); and some had an invalid calibration factor (maternal: 18.9%, child: 24.5%) (Table S5). Figure S4 shows representative images in each visual validity category. Approximately 70% of samples were valid in the prenatal period, but in the postnatal period the fraction of valid samples declined to around 60% (Table S6).

A final sample of 5655 valid 48-h maternal CO exposure estimates and 1903 valid 48-h child CO estimates was obtained after the validation criteria were applied and after removing a small number of sessions for having an improbable 48-h CO concentration of 0 ppm (maternal N = 4; child N = 4) and averaging valid co-deployments (maternal N = 92). Mothers (N = 16) and children (N = 1) with no valid exposure estimates were dropped from analyses.

The GRAPHS study team also carried out 1750 PM2.5 monitoring sessions for 980 women, conducted in conjunction with a subset of the CO monitoring sessions. A procedure similar to the CO validation procedure was conducted for the PM2.5 measurements (see section “Fine particulate matter exposure validation”). A small number were removed due to insufficient run time (N = 134), low visual validity (N = 184), and missing gravimetric sample validity (N = 29). In total, more than 80% of PM2.5 monitoring sessions resulted in high-validity 48-h exposure estimates (N = 1389). Ten of these high-validity estimates were removed because they took place during the baseline period.

Summarizing maternal and child air pollution exposure

In the baseline period, 0.6% of 24-h maternal CO exposure estimates exceeded the World Health Organization (WHO) CO 24-h guideline of 6.11 ppm (equivalent to 7 mg/m3) [37]. Table 3 provides descriptive statistics for maternal CO, maternal PM2.5, and child CO exposure estimates. Baseline CO exposures did not differ significantly across study arms. CO exposure decreased in the post-intervention period for all study arms. The distributions of post-intervention 48-h maternal CO and PM2.5 and child CO exposures for each study arm are visualized in Fig. S5. In the post-intervention period, the percent of 24-h maternal CO exposure estimates in excess of the WHO 24-h guideline was 1.4% in the control study arm, 0.7% in the improved biomass study arm, and 0.6% in the LPG study arm.
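
The equivalence between 7 mg/m3 and 6.11 ppm quoted above follows from the standard gas conversion, as the Python sketch below shows. The reference conditions (25 °C and 1 atm, molar volume 24.45 L/mol) are an assumption that reproduces the stated value; the function names and the example exposure values are hypothetical.

```python
MOLAR_VOLUME_L = 24.45   # L/mol at 25 degrees C and 1 atm (assumed reference conditions)
CO_MOLAR_MASS = 28.01    # g/mol

def mg_m3_to_ppm(mg_m3, molar_mass=CO_MOLAR_MASS):
    """Convert a gas concentration from mg/m3 to ppm by volume."""
    return mg_m3 * MOLAR_VOLUME_L / molar_mass

def percent_exceeding(values_ppm, guideline_ppm):
    """Percent of exposure estimates strictly above a guideline value."""
    above = sum(1 for v in values_ppm if v > guideline_ppm)
    return 100.0 * above / len(values_ppm)

who_24h_ppm = mg_m3_to_ppm(7.0)  # rounds to 6.11 ppm

# Hypothetical 24-h CO estimates (ppm), only to show the exceedance calculation.
share = percent_exceeding([0.4, 1.2, 0.9, 7.5], who_24h_ppm)
```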

Table 3 Descriptive statistics of valid maternal and child personal exposure monitoring deployments during GRAPHS.

Mean child CO exposure was lower than maternal CO exposure in all study arms. Among paired samples where child and maternal CO exposure was monitored during the same session, the two exposures were weakly correlated (Pearson’s r = 0.36) (Fig. S6). The median ratio between child and maternal CO exposure was 0.78 (interquartile range: 0.30–1.79), though observed ratios varied greatly across paired samples (Fig. S7).

Figure 2 shows a time series of 48-h maternal CO and PM2.5 exposure estimates from all study arms throughout the post-intervention period (November 2013 to February 2016). Here, two patterns emerge. First, CO exposure appears to decline throughout the study period. Second, PM2.5 exposure shows a marked seasonal pattern with periods of higher exposures during the Harmattan season. As a result of these observed patterns, we carried out several different analyses to assess the effect of cooking interventions on exposure.

Fig. 2: Post-intervention 48-h CO and PM2.5 exposure measurements during the GRAPHS study period show seasonality during Harmattan.

Time series of post-intervention 48-h measurements of maternal CO (upper panel) and PM2.5 (lower panel) exposure from November 2013 to February 2016. Points display 48-h individual measurements from all study arms. Solid lines show a local weighted smoothing (LOESS) function with light gray areas showing the 95% confidence interval of the local mean. Time periods shaded gray depict Harmattan season (December–March) when episodes of dry dusty winds are typically more prevalent.

Estimating the effect of clean cooking interventions on personal air pollution exposure

Before conducting our “difference-in-differences” primary specification, we assessed the effect of cooking interventions on exposure using two approaches: (1) “across study arms” and (2) “before and after.”

Table 4 reports results from the “across study arms” approach (first equation). As compared to the control arm, both the LPG and improved biomass arms had reduced mean maternal CO exposure (LPG: 42% lower, 95% CI: 35–48% lower; improved biomass: 10% lower, 95% CI: 1–18% lower). An exploration of seasonal patterns found that exposure reductions in the intervention arms were greatest among the subsample of sessions obtained during non-Harmattan months (April–November, representing 59% of maternal samples). The difference in CO exposure between the LPG arm and the control arm was somewhat attenuated during Harmattan months (35% lower, 95% CI: 22–45% lower) and we observed no difference between the improved biomass study arm and the control arm during these months (3% lower, 95% CI: 17% lower to 12% higher). We found that child CO exposure was only reduced in the LPG arm as compared to the control arm (22% lower, 95% CI: 6–35% lower; improved biomass: 6% lower, 95% CI: 21% lower to 11% higher).

Table 4 Summary of personal exposure after intervention.

Two-thirds of post-intervention mean maternal 48-h PM2.5 exposure estimates exceeded the WHO Annual Interim-I guideline (35 μg/m3) [37] in the LPG study arm, with the fractions for the improved biomass and control arms being higher (86% and 88%, respectively) (Table S7). In addition, more than 85% of mean maternal 24-h PM2.5 exposure estimates exceeded the WHO 24-h guideline of 25 μg/m3 and nearly all exposure estimates were above the 10 μg/m3 annual guideline. Mean maternal PM2.5 exposure was only reduced in the LPG arm as compared to the control arm (32% lower, 95% CI: 26–38% lower; improved biomass: 4% lower, 95% CI: 11% lower to 4% higher). PM2.5 exposure estimates were higher among Harmattan subsamples and during this season we observed no significant differences in exposure across the study arms. The reduction in the LPG intervention arm was larger during the non-Harmattan season than the reduction observed when including all monitoring sessions.

In comparison to the “across study arms” models, the “before and after” models described in the second equation incorporate data from the baseline period in addition to the post-intervention study period for each study arm. Exposure fell significantly in the post-intervention study period among all study arms as compared to the baseline (Table 4 and Fig. S8). Indeed, even the control group had an estimated 32% lower (95% CI: 24–39% lower) mean maternal CO exposure in the post-intervention period. This trend makes the “difference-in-differences” approach where we use all exposure estimates obtained during GRAPHS particularly important.

In the “difference-in-differences” analysis, then, we see that as compared to the change observed in the control arm in the post-intervention period, only the LPG arm experienced a significantly greater CO exposure reduction (47% lower, 95% CI: 36–56% lower) (Table 5). Using the same approach, but with a non-logarithmized outcome, we estimate that the absolute reduction in personal CO exposure attributable to the LPG intervention is 0.52 ppm (95% CI: 0.28–0.75 ppm lower). In contrast, the change in exposure after the intervention in the improved biomass study arm was not different from the control arm (8% lower, 95% CI: 21% lower to 8% higher).
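To make the percent-change interpretation concrete, the sketch below shows a simplified two-group, two-period difference-in-differences calculation on log-transformed exposure and the conversion of a log-scale estimate to a percent change. It is an assumption-laden illustration, not the regression models used in this study: it works from group means rather than a fitted model, the function names are hypothetical, and the input values are invented so that the arithmetic reproduces a 47% reduction.

```python
import math


def did_on_log_scale(mean_log):
    """Difference-in-differences from mean log exposures.

    mean_log maps (arm, period) to the mean of log exposure, with
    arm in {"treated", "control"} and period in {"pre", "post"}.
    """
    treated_change = mean_log[("treated", "post")] - mean_log[("treated", "pre")]
    control_change = mean_log[("control", "post")] - mean_log[("control", "pre")]
    return treated_change - control_change


def percent_change(beta, se=None, z=1.96):
    """Convert a log-scale estimate to a percent change.

    If a standard error is given, also return a Wald-type interval
    computed on the log scale and then back-transformed.
    """
    point = (math.exp(beta) - 1.0) * 100.0
    if se is None:
        return point
    lower = (math.exp(beta - z * se) - 1.0) * 100.0
    upper = (math.exp(beta + z * se) - 1.0) * 100.0
    return point, lower, upper


# Hypothetical geometric means (ppm), chosen for illustration only.
example = {
    ("control", "pre"): math.log(1.00),
    ("control", "post"): math.log(0.68),
    ("treated", "pre"): math.log(1.00),
    ("treated", "post"): math.log(0.68 * 0.53),
}
beta_did = did_on_log_scale(example)
pct = percent_change(beta_did)
```

Because the outcome is on the log scale, the control arm's own decline cancels out of the estimate, and the remaining ratio of 0.53 corresponds to a 47% lower exposure.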

Table 5 Estimates of the effect of the cooking interventions on maternal 48-h CO exposure using different model specifications, expressed as a percent change in mean exposure.

Effect of interventions on exposure over time

Maternal CO exposure fell throughout the study period for all study arms, including the control arm (Fig. S9). We conducted a session-specific difference-in-differences analysis to evaluate whether the effect of the intervention diminished over time. In the LPG arm, no attenuation of the intervention effect was seen during the prenatal period. In the postnatal period, the intervention effect was somewhat attenuated, but still significant as compared to control. Similar trends over time were observed in the improved biomass arm, although reductions in exposure were not significant as compared to the control arm (Fig. 3).

Fig. 3: CO exposure differences between the LPG study arm and improved biomass study arm compared to the control study arm throughout the study.

Results are from the models described in the fourth equation (the “session-specific difference-in-differences” regression approach), which explore the potential interaction between the effects of the intervention over time by cluster. Point estimates are the percent change in CO exposure as compared to the control study arm baseline period with 95% confidence intervals. Models account for within-subject clustering over time and the cluster-randomized nature of the intervention using generalized estimating equations.

Assessment of population density and exposure

Given the focus on intervening during pregnancy, participants in the intervention study arms were in close proximity to households not enrolled in the study. Close proximity to nonintervention households using three-stone fires may have affected personal air pollution exposure in intervention study arms. Population density across the study groups varied somewhat (control mean (SD) persons within 50 m: 48.4 (29.6); improved biomass mean (SD): 48.6 (31.9); LPG mean (SD): 53.9 (35.0)). Households in the LPG study arm living with more than 50 persons within 50 m (approximately the median) had average 48-h CO exposure of 1.00 ppm (SD: 2.58), whereas those living with fewer than 50 persons within 50 m had average 48-h CO exposure of 0.72 ppm (SD: 0.88).
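The stratified comparison above can be sketched as a simple grouping of exposure estimates by neighborhood density. This is a minimal illustration only: the function name, the record layout, and the sample values are hypothetical, and assigning households with exactly 50 persons to the lower-density group is an assumption, since the text does not say how that boundary was handled.

```python
from statistics import mean, stdev


def summarize_by_density(records, cutoff=50):
    """Summarize CO exposure by population density stratum.

    records is an iterable of (persons_within_50m, co_ppm) pairs.
    Returns {"high": (n, mean, sd), "low": (n, mean, sd)}, where "high"
    means more than `cutoff` persons within 50 m. Strata with fewer
    than two estimates are omitted because the SD is undefined.
    """
    groups = {"high": [], "low": []}
    for persons, co_ppm in records:
        key = "high" if persons > cutoff else "low"
        groups[key].append(co_ppm)
    return {
        key: (len(values), mean(values), stdev(values))
        for key, values in groups.items()
        if len(values) >= 2
    }


# Hypothetical records, for illustration only.
sample_records = [(30, 0.5), (40, 0.7), (60, 0.9), (80, 1.3)]
summary = summarize_by_density(sample_records)
```

A descriptive comparison like this does not adjust for other differences between households, so it indicates an association rather than an effect of density itself.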

Discussion

In this study, we presented the results from the largest randomized clean cooking intervention trial to report air pollution exposure results to date. First, we described the validation procedures we employed to ensure high confidence in exposure estimates. Then, we described the overall results of CO and PM2.5 maternal and child exposure deployments, characterizing both the distribution of deployments over the study period and across study arms. We showed that the LPG stove significantly reduced personal CO exposure as compared to the control three-stone fire and that PM2.5 exposure was lower in the LPG arm as compared to the control in the post-intervention period. We also showed that there was no attenuation of the intervention effect during the prenatal period among LPG stove users, but that there was some evidence of effect attenuation after birth. We also demonstrated that a fan-assisted biomass stove did not lead to statistically significant reductions in CO or PM2.5 exposure as compared to the control.

This study makes several important contributions to the field. Although the validation procedures were stringent, GRAPHS nonetheless yielded more than 5600 48-h maternal CO exposure estimates, 1903 48-h child CO exposure estimates, and 1379 48-h maternal PM2.5 exposure estimates, one of the largest personal air pollution exposure monitoring efforts in the context of clean cooking interventions to date. Low within-subject correlation across all exposure measurements justified our repeated measurements approach (see Supplementary Information Section 2.1). GRAPHS marks one of the largest deployments to date of a clean cooking fuel intervention in a randomized controlled trial, and the first time the impact of LPG stoves on personal exposure to air pollution has been rigorously tested. While there are some clean cooking fuel intervention efforts ongoing [48], few prior studies have presented exposure results [18, 49]. This study also offers insights into air pollution exposure among pregnant women, a particularly sensitive group where exposure reductions can yield substantial public health benefits.

The main results from the present study show that the mothers in the LPG study arm experienced 47% lower mean 48-h CO exposure compared to the control arm using pre- and post-intervention data and 32% lower mean 48-h PM2.5 exposure using post-intervention data. We also show that a fan-assisted biomass stove did not reduce CO or PM2.5 exposure in statistically significant ways. These results further support the findings from a recent meta-analysis that concluded that improved biomass-burning stoves have not reduced personal PM2.5 exposure below WHO air quality guidelines [18]. Stoves using clean fuels like gas, electricity, or alcohol have the potential to reduce air pollution exposure much more than “cleaner” biomass stoves in real-world use. Our study demonstrates that statistically significant exposure reductions are possible through an LPG stove intervention. Still, two-thirds of post-intervention mean maternal 48-h PM2.5 exposure estimates in the LPG study arm exceeded the WHO Annual Interim-I guideline of 35 μg/m3.

Multiple reasons have been proposed in the literature to explain the failures of improved and/or clean fuel stoves to achieve expected exposure reductions, notably: (1) insufficient emissions reductions over the long term due, potentially, to stove breakage and/or maintenance issues over time; (2) continued traditional biomass stove use in parallel to the intervention stove (termed “stove stacking,” or “fuel stacking” when referring to the use of multiple fuels) for a variety of reasons [50,51,52,53]; and (3) high levels of ambient air pollution due to interventions in single households in communities where the majority of households continue to use traditional stoves.

The LPG arm experienced significantly lower exposure compared to the control arm throughout the entire study period (median time between first and final sessions: 357 days). Furthermore, we observed consistent LPG stove use during the entire study period and before and after birth (Figs. S10 and S11). Improved biomass stove use, however, declined over time and exposure in the improved biomass study arm was not different from the control arm throughout the majority of the study period.

Given the growing body of literature discussing the potential for clean cooking interventions to improve health, it is valuable to contextualize our results. First, we note that there are relatively few directly comparable studies, that is, randomized controlled trials with clean fuel interventions reporting personal CO exposure measurements. The most comparable study to our own to present results to date is the Randomized Exposure Study of Pollution Indoors and Respiratory Effects (RESPIRE), a randomized controlled trial of an improved solid fuel stove with a chimney in Guatemala. Geometric mean maternal CO exposure declined by 61% (95% CI: 57–65% lower; baseline concentration 3.4 ppm) in RESPIRE, though throughout the study only 529 personal CO exposure estimates were collected [17]. A review of eight studies that examined personal CO exposure before and after improved solid fuel stove with chimney interventions estimated a weighted mean reduction of 52% (3.4 to 1.6 ppm) (totaling 778 estimates, most coming from RESPIRE) [18]. This same review found only three studies that included a clean fuel intervention, one for LPG in Sudan (N = 57 estimates) and two for ethanol in Ethiopia and Madagascar (N = 85 estimates combined), though none used personal air pollution exposure monitoring; all relied on kitchen monitoring only. These studies reported declines in kitchen CO concentrations of between 76 and 82%, though pre-intervention concentrations were between 11 and 33 ppm. The ongoing Household Air Pollution Intervention Network trial, a large multisite randomized controlled efficacy trial providing unlimited LPG refills to 3200 households for 18 months [54], will increase the available evidence on the potential for clean fuels to reduce personal air pollution exposure.

The observed estimates of personal air pollution exposure in this study are somewhat low in comparison to other similar studies. As noted above, a review of eight studies [18] with personal CO exposure estimated a weighted pre-intervention mean of 3.4 ppm and a post-intervention mean of 1.6 ppm. However, these studies came from a range of geographic contexts, largely Central and South America, that may not be as relevant to Sub-Saharan Africa. In a cross-sectional study in Accra, Ghana, a large city, households only using LPG had mean PM2.5 exposure of 24 µg/m3, though households also reporting wood use or charcoal use had somewhat higher exposures (between 31 and 79 µg/m3) [55]. A study in rural Kenya estimated 48-h personal CO exposure to be between 0.8 and 1.3 ppm, concentrations comparable to those presented in this study [56]. A study in Rwanda reported mean 48-h personal PM2.5 exposure to be around 220 µg/m3 across intervention and control arms (no difference in exposure), though the interquartile range extended from about 95 to 300 µg/m3 [57].

In summary, personal air pollution exposure concentrations are highly variable within and across contexts, and while exposure estimates in this study may be somewhat lower than in other studies, there is significant overlap in the distributions. In addition, the range of exposures observed in this study falls within the ranges of the integrated exposure–response functions for PM2.5 and lower respiratory infections [39], for example, where even modest declines in exposure might yield meaningful reductions in relative risk.

Limitations

The results of this study should be considered in light of its limitations. The methods and protocols for this study were developed between 2010 and 2013, with data collection occurring between 2013 and 2016. Since then, there have been shifts in the air pollution exposure technology and the state-of-the-science knowledge on best practices, so we report extensively on the limitations of this study as advice for future similar studies.

First, as discussed in the “Methods,” due to resource constraints, CO was used as the primary exposure metric in GRAPHS. Chronic CO exposure is an important health risk factor associated with asthma, cardiovascular disease, and impaired neurological development, while short-term CO exposure is associated with acute symptoms and mortality [58]. Furthermore, CO is a marker of incomplete combustion and is included in the WHO’s Air Quality Guidelines for Household Fuel Combustion alongside PM2.5 [37]. Still, PM2.5 is thought to be the best indicator of health risk from air pollution [21, 59, 60]. When designing the study, we planned to use CO as a proxy for PM2.5 exposure. Evidence is now accumulating that CO may perform poorly as a proxy for PM2.5 exposure in HAP studies [40], but these findings were not available during the design phase of GRAPHS. Still, our findings show that across-arm exposure reductions were of a similar magnitude for CO and PM2.5 samples. An additional limitation is that PM2.5 exposure measurements did not take place at baseline, limiting our PM2.5-related analysis to cross-sectional post-intervention assessments.

Second, ambient air pollution was not measured during the GRAPHS study period due to limited resources. Our results showing the positive association between population density around a participant and air pollution exposure indicate the potential for neighbors’ air pollution to have affected participants’ air pollution exposure. The lack of ambient monitoring limits our ability to determine the relative contributions of a household’s own cooking practices and community-level ambient air pollution to personal exposure. However, it is rare for entire communities to transition from biomass-based cooking to the exclusive use of clean fuels, so the intention-to-treat analysis in this study offers useful real-world results of a clean fuel stove and fuel refill intervention. LPG stove uptake in rural communities in Ghana was uncommon during the study period [61], suggesting that it is unlikely that neighboring transitions from solid fuel use to clean fuel use would have changed ambient air pollution over the course of the study. Still, were there to be such a transition, we would not expect any shifts to occur differentially across study arms.

Third, we carried out only limited pretrial field measurements with the Lascar CO exposure monitor—though we did consult with other research teams experienced in its use. While we did not observe any evidence of issues with deployment, more extensive pretrial testing can be a valuable and important step for establishing good practices for data collection, cleaning, and analysis as well as establishing internal and external credibility of exposure estimates.

Fourth, although the post hoc truncation of exposure estimates to 48 h was intended to ensure a similar number of cooking events per deployment, it also introduced a limitation in our PM2.5 estimates. We only used the first 48 h of the gravimetric-corrected light-scattering nephelometer data, even though the CF was based on the full deployment length (median = 72 h). As such, any significant variations in particle sources with different optical properties during the first 48 h as compared to the entire deployment may bias our estimates. However, because 48 h comprises a large proportion of the full deployment length, it is unlikely that the period after 48 h captured different particle sources that would significantly affect our estimates. Furthermore, a strength of our use of the microPEM is that the device provides a gravimetric correction for every deployment, in contrast to the common approach of co-locating nephelometer-only sensors with gravimetric-only monitors in a small subset of deployments, which is subject to bias [62].

Fifth, stove stacking may also have played a role in the levels of air pollution exposure observed in this study. This study focused on intention-to-treat analyses, categorizing households according to treatment irrespective of cooking fuel use patterns in the household. A limitation of GRAPHS is that there was no comprehensive stove use monitoring during all personal air pollution monitoring sessions or throughout the longer study period. Use of nonintervention stoves was reported during GRAPHS based on weekly household surveys, as reported in the Supplementary Information and published elsewhere [63]. However, given the lack of complete stove use monitoring, we are unable to undertake a full analysis of the contributions of stove use to observed personal exposure. Future studies may benefit from comprehensive stove use monitoring paired with personal air pollution exposure monitoring to assess the degree to which the benefits of stove interventions are attenuated by fuel stacking with polluting fuels like biomass and kerosene. Additionally, stove use monitoring can enable analysis of the contribution of cooking events to time-resolved personal air pollution exposure, potentially separating the exposure directly affected by cooking interventions from overall exposure and thus showing whether reductions in exposure during cooking are the primary drivers of overall differences in exposure.

Sixth, our visual validation of the data lacked a formal evaluation of inter-rater reliability. Because air pollution exposure trends are highly localized from day to day, visual validation remains one of the best ways to detect deviations from the norm and, potentially, indications of sensor failure, alongside objective monitoring criteria and survey-based questions to the participant about monitor-wearing behaviors during the monitoring period.

Conclusions

There is increasing demand for interventions to reduce HAP exposure and improve health in Sub-Saharan Africa and the rest of the world as researchers and policymakers learn more about the health and climate effects of biomass combustion. The particular interventions that will be best suited to achieve these goals remain a subject of debate. Ghana, along with other countries in the region, is establishing national clean cooking programs to scale-up clean cooking fuels—especially LPG—to reduce forest degradation while also improving livelihoods and population health [61]. In this study, we provide evidence from a controlled trial in a low-income setting demonstrating that an LPG stove intervention outperformed a fan-assisted biomass stove intervention in reducing air pollution exposure among a population of pregnant women vulnerable to the adverse health impacts of air pollution. Recent studies from around the world emphasize the importance of cost and access in determining the sustained use of clean fuels in the long term [64]. Future work should investigate how clean fuel stoves can be adopted sustainably in real life and over the long term to reduce air pollution exposure among vulnerable populations.