Introduction

The rising prevalence of obesity and the associated health, economic and social burdens are a global issue of concern [1, 2]. While several determinants and many complex interactions shape the causal pathways leading to overweight and obesity, it is widely agreed that the increased availability and promotion of relatively cheap, palatable and energy-dense foods is a significant driver of high obesity rates [3]. Changing food environments have been accompanied by highly sedentary work and leisure time, decreased active transport and the increased availability of labour-saving devices [4], further exacerbating energy imbalances and weight gain at the population level. There is global consensus that comprehensive population-wide action is required to address the systemic and environmental drivers of obesity [3], and numerous population-level strategies are routinely recommended by health organisations and public health experts [5, 6]. However, decision-makers (e.g. policy-makers and public health practitioners) are faced with the challenging task of prioritising interventions to implement within political and resource constraints.

There are increasing calls for evidence-based approaches to selecting health policies and interventions [7], including among decision-makers themselves, who have expressed the importance of evidence to inform obesity prevention decisions and the challenges in gaining political commitment in the absence of strong evidence [8, 9]. Despite agreement on the importance of evidence in public health decision-making, there is a lack of consensus on what constitutes viable and sufficient evidence, particularly in relation to complex issues such as obesity [10, 11]. Divergent perspectives on how best to interpret and translate obesity prevention evidence into action reflect the complex nature of obesity and the challenges in generating reliable evidence, particularly regarding the impact of population-level interventions and policies. Population-wide strategies are generally more difficult to evaluate and often pose a greater risk of political criticism, and they are therefore less often implemented and evaluated to provide evidence of real-world effect [12•]. Even when population-level obesity prevention measures are implemented, it is very difficult to isolate and demonstrate the effect of a single intervention on weight-related outcomes, due to the complex aetiology of obesity involving multiple interrelated determinants at the individual, community and societal levels [13, 14] and the relatively long time delay between changes in behaviour and changes in weight-related outcomes [15]. Moreover, there are several deficiencies in the current methods used to measure food and physical activity environments and the impact of obesity prevention interventions, including the wide diversity of tools employed and a lack of standardisation in the risk factors and intermediate outcomes measured [16, 17].

An evidence framework is one tool that can support explicit and transparent evidence utilisation by guiding a systematic assessment of a body of evidence to indicate the level of certainty that an estimated effect is true (also referred to as ‘strength of evidence’) [18••]. While evidence frameworks are frequently employed by public health organisations and academics to classify the strength of evidence for specific interventions and to inform health-related guidelines [19, 20], there is debate regarding the suitability of many existing frameworks for evaluating and prioritising intervention implementation for complex population issues [10, 21••]. This is because many frameworks adopt a narrow interpretation of evidence that privileges internal validity and randomised controlled trials (RCTs) above other evidence [10, 11]. This approach is problematic for assessing evidence related to obesity, as RCTs are often not practical for many recommended obesity prevention actions (e.g. taxes on unhealthy foods and beverages in real-world contexts), and it is often challenging to ensure adherence to obesity prevention interventions, particularly over the sustained periods of time required to demonstrate an effect on weight-related outcomes [12, 21••, 22].

This paper aimed to critically review existing frameworks for assessing the evidence of effectiveness of interventions related to obesity prevention, and to discuss the application of a custom-developed evidence framework as part of an obesity prevention policy priority-setting study in Australia.

Existing Frameworks for Classifying Evidence of Intervention Effectiveness

Two recent systematic reviews mapped the available evidence frameworks for non-clinical use to understand the characteristics and limitations of their adoption for categorising interventions for complex population issues [18••, 21••]. Movsisyan et al. [18••] identified and mapped 17 evidence frameworks relevant to health and social policy analysis across 13 constructs of evidence quality (‘evidence domains’) common across frameworks, including study design, consistency, measures of precision and magnitude of effect. The review found that while similar evidence domains were used to assess strength of evidence across frameworks, the criteria within domains varied between frameworks. For example, 12 of the 17 frameworks included study design as a measure of evidence quality; however, the criteria varied from specifying two to five levels of evidence related to the study design, to not specifying any criteria and using open questions, such as ‘appropriateness of the study design to answer the research question’ [23]. Of those frameworks that did specify criteria related to study design, all identified RCTs as the ‘highest’ level of evidence. Similarly, while 15 of the 17 frameworks included a domain related to quality of study execution, some only considered criteria related to internal validity, while others also considered external validity. Despite several shared evidence domains between frameworks, some domains were notably less common. Less than half of the frameworks considered magnitude of effect, and only three frameworks considered coherence of evidence across the causal pathway. Additionally, frameworks generally did not include processes to synthesise evidence from multiple sources, with only the Grading of Evidence for Public Health Interventions (GEPHI) framework considering consistency of evidence from different study designs [24]. However, this framework was limited in other aspects, such as the lack of guidance on interpreting contrasting evidence across study designs.

The review by Katz et al. [21••] focused on evidence frameworks applied to lifestyle medicine interventions (interventions targeting modification of lifestyle-related risk factors such as nutrition, physical activity and substance use) and related health outcomes, and identified 15 relevant frameworks. Despite the more limited scope, the review identified many of the same frameworks as Movsisyan and colleagues [18••]. Although Katz et al. did not analyse framework characteristics across evidence domains, the authors did identify several similar domains (including study design, consistency and precision) used by the different frameworks to assess the strength of evidence. The authors also highlighted several deficiencies of existing frameworks related to their application to lifestyle medicine, including their inadequacy for evaluating long-term exposure-outcome relationships (e.g. diet and body weight) and interventions that are unsuitable for randomisation or blinding (e.g. smoking and long-term dietary behaviours). In addition, the reviewed frameworks were limited in their ability to synthesise evidence from multiple study designs. In response to the review findings, the authors proposed a new evidence framework, the Hierarchies of Evidence Applied to Lifestyle Medicine (HEALM), which can be used to evaluate evidence from topic areas where RCTs are not practical. HEALM considers internal and external validity, plausibility of effect and plurality of evidence from different study designs, thus providing a promising framework for systematically assessing potential obesity prevention interventions. However, HEALM does not specify criteria for evidence consistency, leaving users of the system to subjectively determine whether evidence is ‘sufficiently consistent’. Further, while HEALM partly recognises the contribution of intermediary outcomes to the assessment of evidence strength by assigning value to intervention studies that provide evidence of causality, it does not provide a mechanism for assessing the strength of evidence for interim outcomes, which may be particularly important when there is a lack of evidence measuring long-term health outcomes. Lastly, its application to an evidence base is yet to be demonstrated, and it is unknown how such ambiguities might be addressed in practice or whether other practical limitations might be identified.

The HEALM framework and the evidence framework reviews described above provide a good starting point for considering evidence frameworks for complex health issues. However, any such framework will likely need to be tailored to the context in which it is used. With respect to obesity prevention, most frameworks provide inadequate guidance on ways to synthesise evidence across study designs and offer limited consideration of intermediate outcomes. Accordingly, a custom-designed evidence framework that recognises the nature and limitations of obesity prevention research and the value of synthesising evidence from multiple study designs is likely to provide clear benefits to decision-makers evaluating and prioritising obesity prevention interventions.

Development and Application of an Evidence Framework in Obesity Prevention Decision-Making

Development of an Evidence Framework for Prioritising Interventions for Obesity Prevention

The Assessing Cost-Effectiveness (ACE) of Obesity Prevention Policies (ACE-Obesity Policy) study was a priority-setting research study that aimed to determine and inform government decision-makers of the most effective, cost-effective, affordable and implementable policy options to prevent obesity in Australia [25•]. As part of the ACE-Obesity Policy initiative, the project team developed the Obesity Prevention Evidence Assessment (OPEA) Framework to assess the evidence of effectiveness for a broad range of interventions related to obesity prevention. The aim was to develop a framework that allowed all the evidence for an intervention to be considered and synthesised, while providing a simple summary measure that would enable consistent interpretation of the strength of evidence across varied obesity prevention interventions. The OPEA Framework was based on existing frameworks [10, 26] and was developed by the research team in an iterative process over the lifetime of the project, in consultation with a project steering committee comprising international researchers, senior representatives from non-government organisations related to obesity prevention and senior representatives from government health departments.

The OPEA Framework consists of a two-stage process to assess evidence of effectiveness. The first stage involves conducting a literature review to identify studies relevant to each intervention area (e.g. taxes on sugary drinks, healthy school food policies) of interest. Table 1 shows the template for summarising the body of evidence from studies related to a specific intervention area. The second stage involves assessing the certainty of the effect of the intervention against the criteria outlined in Table 2, based on the evidence summarised in Table 1.

Table 1 Template for summarising evidence of effectiveness for interventions related to obesity prevention as part of the Obesity Prevention Evidence Assessment (OPEA) Framework
Table 2 Criteria for assessing the degree of certainty of effect in relation to obesity prevention interventions as part of the Obesity Prevention Evidence Assessment (OPEA) Framework

A key feature of the OPEA Framework is that it explicitly distinguishes between evidence of effectiveness related to various outcome categories: (1) weight-related outcomes, such as body mass index (BMI) and body weight; (2) diet-related outcomes, such as fruit and vegetable purchases or consumption, sugar-sweetened beverage (SSB) purchases and consumption, energy intake, and diet quality; and (3) physical activity- and sedentary behaviour-related outcomes (e.g. minutes of physical activity, MET-minutes, step count). By separating out the evidence of effectiveness within each outcome category, the OPEA Framework enables a more nuanced assessment of the evidence of effectiveness than if all outcomes were assessed together. This feature is similar to the ‘chain of evidence’ approach employed by a small number of existing evidence frameworks [24, 30], which involves assessing evidence at each link in the theorised causal pathway to provide confidence in the overall intervention effect on the desired outcomes [18••]. The OPEA Framework requires the analyst to assess the reliability and validity of each outcome measure within each outcome category. The explicit recognition of outcome measures acknowledges that a wide variety of measurement tools are used in the obesity prevention literature, with a lack of standardisation or agreement around ‘gold standard’ measures in some areas [16].
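
To make this separation concrete, the following minimal Python sketch shows one way a body of evidence could be grouped by outcome category before assessment. It is illustrative only: the class and field names (e.g. StudyRecord, objectively_measured) are our own assumptions and do not reproduce the published Table 1 template.

```python
from dataclasses import dataclass
from enum import Enum


class OutcomeCategory(Enum):
    WEIGHT = "weight-related"               # e.g. BMI, body weight
    DIET = "diet-related"                   # e.g. SSB purchases, energy intake
    PHYSICAL_ACTIVITY = "activity-related"  # e.g. MET-minutes, step count


@dataclass
class StudyRecord:
    """One study contributing evidence for a given intervention area."""
    study_design: str           # e.g. "RCT", "non-randomised trial with control"
    outcome_category: OutcomeCategory
    outcome_measure: str        # e.g. "measured BMI", "self-reported SSB intake"
    objectively_measured: bool  # True if measured rather than self-reported
    direction_of_effect: str    # e.g. "favourable", "no effect", "inconclusive"


def group_by_category(records):
    """Group study records so that weight-, diet- and physical activity-related
    evidence can be assessed separately, mirroring the framework's separation
    of outcome categories."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.outcome_category, []).append(record)
    return grouped
```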

A second key feature of the OPEA Framework is that it differentiates between situations in which (1) the evidence indicates no effect of the intervention on the measured outcome (‘no effect’); (2) the balance of evidence is ‘inconclusive’ or mixed regarding the direction of effect; and (3) there is no suitable evidence, or the outcome has not been assessed. These distinctions are important, although largely overlooked by most existing frameworks, with very few including ‘no evidence’ as a possible evidence rating [18••]. The inclusion of these evidence classifications is especially relevant to the obesity context, considering the limitations of the existing study designs [12] and the lack of real-world evidence of effectiveness relating to many globally recommended obesity prevention policies [10, 31].

A third key feature of the OPEA Framework is that it is designed to consider the balance of evidence for each intervention, including study design, consistency of results and the quality of methods used. These considerations are used to classify the certainty of intervention effect into one of three levels (‘high’, ‘medium’ or ‘low’ certainty of effect), based on predefined criteria (refer to Table 2).

Under the OPEA Framework, an intervention is classified as having a ‘high’ certainty of effect where evidence is derived from the most rigorous study designs relevant to obesity prevention at the population level (one or more RCTs or systematic reviews of RCTs, or multiple non-randomised trials with controls), which use high quality outcome measures (e.g. measured BMI rather than self-reported BMI) and show consistent results. Interventions are considered to have ‘medium’ certainty of effect where multiple studies of any study design show consistent results using outcome measures of varied quality (i.e. there could be a mix of measured and self-reported outcomes). This category recognises that there is value in a large body of consistent evidence, even if it is of varied quality. Interventions are also considered to have ‘medium’ certainty of effect where evidence is provided by a single high quality study conducted in a setting highly relevant to the proposed implementation context, even if its results are inconsistent with lower quality studies in other settings. ‘Low’ certainty of effect applies where there is strong program logic for an intervention effect but the evidence is based on a single low quality study, or on multiple studies that show inconsistent results and/or used low quality outcome measures.
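
For illustration, these decision rules can be approximated in the simplified sketch below. It is not a mechanical restatement of Table 2: each input represents an analyst judgement (e.g. whether results are consistent, whether program logic is strong), and the parameter names are our own assumptions. Borderline cases are resolved through the deliberative judgements described in the following paragraph.

```python
def classify_certainty(study_designs, outcomes_objectively_measured,
                       results_consistent, single_high_quality_relevant_setting,
                       strong_program_logic):
    """Simplified sketch of the 'high'/'medium'/'low' decision rules described
    in the text. All inputs are analyst judgements; the published criteria
    (Table 2) are not reducible to an algorithm."""
    rigorous = {"RCT", "systematic review of RCTs"}
    has_rigorous_evidence = (
        any(design in rigorous for design in study_designs)
        or study_designs.count("non-randomised trial with control") > 1
    )

    # 'High': most rigorous designs, high quality outcome measures, consistent results
    if has_rigorous_evidence and outcomes_objectively_measured and results_consistent:
        return "high"

    # 'Medium': multiple studies of any design with consistent results (measures may
    # vary in quality), or a single high quality study conducted in a setting highly
    # relevant to the proposed implementation context
    if (len(study_designs) > 1 and results_consistent) or single_high_quality_relevant_setting:
        return "medium"

    # 'Low': strong program logic but a single low quality study, inconsistent
    # results and/or low quality outcome measures
    if strong_program_logic:
        return "low"

    return "no suitable evidence"
```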

The OPEA Framework does require a degree of user discretion and judgement when multiple conflicting outcome measures are reported in the literature. For example, when there is good evidence of effect on one dietary outcome measure (e.g. SSB consumption) but inconclusive evidence from another dietary outcome (e.g. overall energy intake), the analyst must decide how to combine this evidence to assess the certainty of effect on dietary outcomes. This decision should be based on the quality of the respective study designs, the reliability and validity of outcome measurements and the program logic of how each intervention would impact these outcomes.

Application of the Evidence Framework in Obesity Prevention Priority-Setting in Australia

As part of the ACE-Obesity Policy study, the OPEA Framework was used in two ways. Firstly, the OPEA Framework was used in the initial scoping phase to narrow down potential interventions selected for further analysis, based on their likely impact on population weight in Australia. As in previous priority-setting studies based on more traditional hierarchy-of-evidence frameworks [32], interventions rated as ‘high’ or ‘medium’ certainty of effect were progressed for economic evaluation. Given the issues described above related to measuring the impact of population-level obesity prevention strategies, and the need for decision-makers to make judgements based on the ‘best available’ evidence rather than the ‘best possible’ evidence [10], the study also progressed interventions assessed as having a ‘low’ certainty of effect if (1) there were other factors, such as the importance of the intervention in contributing to a comprehensive obesity prevention strategy, that increased the potential relevance of the intervention to decision-makers and/or (2) there was unlikely to be further evidence of intervention effectiveness available in the short term. For example, a national mass media campaign to reduce the consumption of discretionary foods was progressed to cost-effectiveness modelling despite having an assessment of ‘low’ certainty of effect on both weight- and diet-related outcomes. This decision was based on the fact that a national mass media campaign has been advocated as a key component of a comprehensive obesity strategy [33] and on the difficulty of measuring the independent effect of such an intervention on population weight or diet outcomes using a rigorous study design; it was therefore assessed that better quality evidence was unlikely to be available in the near future. When interventions were flagged as having ‘low’ certainty of effect, the researchers ensured that a high degree of uncertainty was incorporated into the cost-effectiveness modelling of the intervention.
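
A simplified sketch of this scoping-phase progression logic, under our own assumptions, is shown below; the inputs other than the certainty rating are contextual judgements made by the research team rather than outputs of the framework itself.

```python
def progress_to_economic_evaluation(certainty, strategically_important,
                                    better_evidence_likely_soon):
    """Sketch of the scoping-phase progression rule described in the text.

    certainty: 'high', 'medium' or 'low' rating from the OPEA Framework.
    strategically_important: judged importance to a comprehensive obesity
        prevention strategy (a contextual judgement, not a framework output).
    better_evidence_likely_soon: whether further evidence of effectiveness
        is expected in the short term.
    """
    if certainty in ("high", "medium"):
        return True
    # 'Low' certainty interventions may still progress if they are judged
    # strategically important and/or better evidence is unlikely in the short
    # term; a high degree of uncertainty is then built into the
    # cost-effectiveness modelling.
    if certainty == "low" and (strategically_important or not better_evidence_likely_soon):
        return True
    return False


# Example: the national mass media campaign described above would progress
# despite a 'low' certainty rating, because better quality evidence was
# judged unlikely to be available in the near future.
progress_to_economic_evaluation("low", strategically_important=True,
                                better_evidence_likely_soon=False)  # True
```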

Secondly, the OPEA Framework was used to provide an overall assessment of the strength of evidence for each intervention based on the balance of evidence. For each intervention, the strength of evidence was reported alongside the economic evaluation results and an assessment of other considerations (including equity, acceptability, feasibility and sustainability) important for intervention implementation [34]. The simple representation of the key findings of the analyses was designed to allow decision-makers to make a comparative assessment of the various interventions, which could be ranked by the criteria most relevant to each decision-making context. Moreover, the incorporation of an assessment of strength of evidence allows for informed decisions by policy-makers who are prepared to diversify their risk in an obesity strategy by including ‘high risk, high payoff’ interventions (interventions with ‘low’ certainty of effect that are likely to be highly cost-effective) alongside less risky investments (interventions with ‘high’ certainty of effect).

In total, the evidence base for 28 interventions was assessed as part of the ACE-Obesity Policy study [25•]. For 10 of those interventions, there was insufficient evidence of an intervention effect to warrant further assessment (an additional two interventions did not progress to economic evaluation for other reasons) [25•]. For the 12 interventions that did not undergo full assessment, scoping papers summarising the available evidence were made available [35]. Economic evaluations were conducted for 16 interventions. The certainty of effect (or strength of evidence) for these interventions, as determined by application of the OPEA Framework, is listed in Table 3. With regard to weight-related outcomes, only two interventions were classified as having a ‘high’ certainty of effect (‘community-based interventions’ and ‘financial incentives for weight loss by private health insurers’), and a further two interventions were classified as having a ‘medium’ certainty of effect (‘school-based intervention to reduce sedentary behaviour’ and ‘school-based intervention to increase physical activity’). The remaining 12 interventions were classified as having ‘low’ certainty of effect on weight-related outcomes.

Table 3 Certainty of effect for interventions modelled as part of the ACE-Obesity Policy study in Australia, 2018 [25•]

Of the ten interventions that measured diet-related outcomes, seven were classified as having a ‘medium’ certainty of effect, and three were assessed as having a ‘low’ certainty of effect. Of the four interventions relevant to physical activity outcomes, three were rated as having a ‘medium’ certainty of effect and one had a ‘low’ certainty of effect. No interventions were determined to have a high certainty of effect on diet- or physical activity-related outcomes.

Discussion

There are a wide range of frameworks for grading evidence; however, most are not well suited to assessing the effectiveness of obesity prevention interventions for several reasons. Most notably, existing frameworks undervalue the contribution of non-randomised studies, which are commonly used to evaluate population-level obesity prevention strategies. In addition, they do not recognise the multiplicity of outcome measures employed across obesity research, and they do not include processes to synthesise evidence from multiple study designs. In this paper we described and demonstrated the application of the novel OPEA Framework to assess the strength of evidence for a range of population-level interventions as part of an obesity prevention priority-setting study in Australia. Like many other systems for grading evidence, the OPEA Framework categorises the certainty that an intervention will have an effect based on assessment of the available body of evidence, using widely agreed indicators of evidence quality (e.g. study design, consistency) [18••]. The OPEA Framework is differentiated from most existing frameworks by its classification criteria, which were specifically adapted to the complex nature of obesity, a problem not readily investigated through traditional cause-and-effect approaches and frameworks. The OPEA Framework recognises that many population-level interventions cannot readily be evaluated by RCTs, and it considers both high quality RCTs and high quality non-randomised studies as providing a high certainty of effect, provided findings are consistent and outcomes are accurately measured. The OPEA Framework also explicitly identifies the evidence of effectiveness in relation to direct (weight-related) and indirect (diet- and physical activity-related) outcome measures. The separation of evidence by type of outcome is aligned with the ‘chain of evidence’ approach utilised in the US Preventive Services Task Force’s framework [30] but contrasts with some other frameworks, such as the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework, which penalises indirectness of evidence [36]. The inclusion of indirect outcome measures is necessary in the area of obesity and similarly complex health issues, since the impact of many population-level interventions is difficult to measure directly or may only be seen over longer periods of time through incremental changes in diet- or physical activity-related behaviours [10]. Moreover, decision-makers are likely to value having nuanced information about intervention effects on obesity-related behaviours.

The application of the OPEA Framework in an obesity prevention priority-setting study confirms the importance of capturing indirect outcomes. We found that assessing weight-related outcomes alone resulted in almost all of the evaluated population-level interventions, many of which are recommended as key obesity prevention strategies [37], being classified as having a ‘low’ certainty of effect. There is a risk that a strength of evidence classification of ‘low’ may deter decision-makers from implementing population-wide strategies due to a perceived weakness in evidence [38]. However, the limited number of interventions with a ‘high’ certainty of effect on weight-related outcomes is largely due to the fact that many recommended interventions have not yet been implemented, and where they have been implemented, the period of evaluation was generally too short to observe a significant change in body weight. When diet- and physical activity-related outcomes were assessed, 12 of the 16 interventions were considered to have a ‘high’ or ‘medium’ certainty of effect in at least one of the outcome categories. Interestingly, no intervention had a ‘high’ certainty of effect on diet- or physical activity-related outcomes. This is predominantly due to the difficulty in accurately measuring diet and physical activity [39], which leads most studies to rely on self-reported data, thereby reducing the certainty of effect. For example, an SSB tax is widely recommended by health organisations globally [40, 41] and has been implemented in over 19 countries [42]; however, the OPEA Framework classified this intervention as having a ‘low’ certainty of effect on weight-related outcomes and a ‘medium’ certainty of effect on diet-related outcomes (e.g. SSB consumption). In this instance, the strength of evidence was limited by the lack of studies measuring impact on weight outcomes and the use of self-reported outcome measures. Additionally, much of the evidence related to weight outcomes is derived from modelling and parallel studies (e.g. of tobacco taxes), and there is a lack of understanding of the long-term changes to overall diet in response to changes in SSB consumption, reducing confidence in the evidence related to weight [43]. Importantly, the increasing introduction of obesity prevention measures (such as taxes on SSBs) globally suggests that strength of evidence is only one of many factors that may influence health priority-setting. Indeed, several factors that affect obesity policy-making have been identified, including influential groups and networks, political ideologies and system characteristics, issue framing and timing [44].

Application of the OPEA Framework as part of the ACE-Obesity Policy study illustrated how the framework operates in practice to support obesity prevention prioritisation and confirmed that the framework is able to differentiate the certainty of effectiveness for a range of population-level strategies to prevent obesity. The OPEA Framework was also utilised in a recent rapid review of population strategies to support healthy weight [45•]. The rapid review was used as the central evidence assessment to inform the development of the Australian National Obesity Strategy [46], thus confirming the suitability of the framework to support real-world evidence-informed policy for obesity prevention.

The key strength of the OPEA Framework is its ‘fit-for-purpose’ design and demonstrated application in a real-world priority-setting activity. A further advantage is its ability to incorporate evidence from multiple study designs, which, although argued to strengthen confidence in the evidence [47], is a feature lacking in almost all existing evidence frameworks [18••]. Limitations of the OPEA Framework include the inability to assess the strength of evidence for a combination of interventions unless they are implemented and evaluated as a single study (e.g. community-based interventions). However, this limitation is reflective of broader evidence challenges common to complex public health problems [48]. The use of predefined criteria for assessment as part of the OPEA Framework (see Table 2) may also be a limitation in some instances, as there will undoubtedly be situations where the evidence base cannot be clearly categorised as per the framework. However, the documentation of the available evidence enables a transparent and deliberative process for assigning the overall ‘certainty of effect’ for each intervention, which is particularly important in situations where evidence is ambiguous.

Conclusion

There is growing support for, and use of, evidence-based decision-making in obesity prevention and broader public health priority-setting. However, current systems to assess the strength of evidence for different health-promoting interventions are generally not well suited to complex health issues such as obesity and often lack validated use in practice. This paper described and demonstrated the use of a novel evidence framework, purpose-designed to assess population-wide obesity prevention interventions. Application of the framework to an obesity prevention priority-setting activity highlighted the importance of considering both direct and indirect outcome measures and provided insight into how evidence may be considered within broader contextual factors to inform decision-making.