Introduction

Need for right metrics

The need for robust and reliable metrics to quantify need and progress in global surgery was pointed out long ago [1]. As Waters aptly wrote, “what is not measured risks being perceived as non-existent, cannot be planned for or proactively managed” [2]. The right metrics would help set targets to monitor system performance in addressing the burden of surgical diseases, and such metrics should exhibit validity, feasibility, and relevance in order to influence strategy and provide strong performance data [2].

Overview of extant metrics with their limitations

Mortality to DALYs

The search for the “right metric” in global surgery has been avidly pursued for the past two decades. Studies have classically relied on mortality data [3], but the Global Burden of Disease (GBD) project suggested complementing mortality with morbidity data for individual conditions and disease states [4], thus giving rise to the disability-adjusted life year (DALY) as a key metric of burden of disease (BoD) globally. The DALY (Fig. 1) has become the standard metric in the GBD study [5,6,7] and, despite limitations and criticism [1], the most-used metric in global surgery.

Fig. 1

DALY infographic. https://commons.wikimedia.org/w/index.php?curid=20278903

Surgical backlog

Access to surgical care in LMICs is frequently delayed, resulting in increased mortality and morbidity, and traditional BoD estimates do not measure treatment backlog [8]. The surgical backlog refers to the number of individuals waiting for a specific intervention for primarily non-fatal conditions, and is a measure of unmet surgical need [9]. It provides practical information for scaling up surgical programs and must be addressed in any health system plan if policy on surgical workforce and infrastructure is to be guided accurately [9, 10]. Although actual backlogs are often difficult to estimate in LMICs [9, 10], the mean delay between the onset of a surgical condition and corrective surgery can be used as a surrogate measure (Fig. 2) [9, 11]. Surgical backlog calculations may provide a practical method to calculate the effective coverage (see below) for non-fatal surgical conditions [8].

Fig. 2

Estimated global backlog in cleft lip and palate cases [9]. SEA = South-East Asia

Effective coverage

The effective coverage (EC) of an intervention is the fraction of the potential health gain that the health system could deliver through that intervention which is actually delivered [12, 13]. EC integrates the concepts of need, use, and quality, and is expressed as EC = Q*(U/N), where N is the population in need of an intervention, U is the utilization, and Q is the expected quality of the intervention and its effect on disease burden [13, 14]. Gaps in EC may result from barriers to accessing care, resource availability, provider quality, and factors that determine household demand [12]. The notion of “coverage gaps” seeks to disaggregate the impact of these different factors on overall EC; such gaps include those in accessibility, affordability, unperceived needs, responsiveness, and technical quality [14]. Information on EC is critical to policy-making at the local, national and global levels [14] and is fundamentally an accountability measure [12]. The Mexican Ministry of Health has used EC since 2001 to benchmark health-care delivery in Mexican states [12].
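The EC expression above can be illustrated with a minimal Python sketch. The function name, the input checks, and the district figures are hypothetical and chosen only for illustration; Q is assumed here to be expressed as a fraction between 0 and 1, and U as the number of people in need who actually use the intervention.

```python
def effective_coverage(need: float, utilization: float, quality: float) -> float:
    """Return EC = Q * (U / N) as a fraction between 0 and 1."""
    if need <= 0:
        raise ValueError("need (N) must be positive")
    if not 0 <= utilization <= need:
        raise ValueError("utilization (U) must lie between 0 and N")
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality (Q) must lie between 0 and 1")
    return quality * (utilization / need)


# Hypothetical district: 1,000 children need an operation, 400 receive it,
# and the care delivered achieves 75% of the potential health gain.
ec = effective_coverage(need=1000, utilization=400, quality=0.75)
print(f"Effective coverage: {ec:.0%}")
```

Under these assumptions, crude coverage (U/N) is 40%, but the quality term pulls the effective figure lower, which is the distinction the EC concept is meant to capture.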

DALY components

The DALY metric combines fatal burden (measured in years of life lost, YLLs) and non-fatal burden (measured in years lived with disability, YLDs) into a single metric. DALYs have been applied to surgical disease by disaggregating them into three peri-surgical categories: averted DALYs, representing disease burden met through surgery; avertable DALYs, representing disease burden potentially meetable through surgical care; and unavertable DALYs, representing the unavoidable disability and premature death that exist regardless of the resources and capacity of any setting (Fig. 3) [15, 16]. In this framework, a DALY value is assigned not only to each surgical condition but also to the postsurgical state, reflecting the negative impact of surgical adverse effects and residual impairments [15]. For example, DALYs averted for patients can be calculated as [17]:

Fig. 3

Surgical DALY components [82]

$$\text{DALYs averted} = \text{YLL} \times \text{severity of disease weight} \times \text{effectiveness of treatment weight} \times \text{disability weight}$$
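As a worked illustration of the formula above, the following Python sketch multiplies the four terms for a single patient. The function name and the example values are hypothetical, and all three weights are assumed to lie between 0 and 1; they are not taken from the cited studies.

```python
def dalys_averted(yll: float,
                  severity_weight: float,
                  effectiveness_weight: float,
                  disability_weight: float) -> float:
    """DALYs averted = YLL x severity x treatment effectiveness x disability weight."""
    if yll < 0:
        raise ValueError("yll must be non-negative")
    weights = {
        "severity_weight": severity_weight,
        "effectiveness_weight": effectiveness_weight,
        "disability_weight": disability_weight,
    }
    for name, value in weights.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return yll * severity_weight * effectiveness_weight * disability_weight


# Hypothetical child: 60 years of life at stake, a condition that always
# causes the disability if untreated (1.0), a treatment that succeeds in
# 90% of cases (0.9), and a disability weight of 0.5.
averted = dalys_averted(yll=60.0, severity_weight=1.0,
                        effectiveness_weight=0.9, disability_weight=0.5)
print(f"DALYs averted: {averted:.1f}")
```

Summing this quantity over all patients treated gives a program-level estimate, provided the weights are chosen consistently across conditions.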

The importance of using this metric to estimate the impact of surgical care is that the GBD framework is increasingly being used as a key source informing resource allocation [18].

Challenges related to the accurate measurement of DALYs in surgery include the broad spectrum of surgical care, distinguishing between surgical and non-surgical conditions, and estimating both avertable and actual averted DALYs for many surgical procedures [18]. It has been posited that it is conceptually better to focus on the burden averted through surgical services than on the burden of specific “surgical” conditions [16].

Cost-effectiveness

Surgery has been perceived as an expensive intervention, especially in comparison with other public health measures [16, 19]. Cost-effectiveness analysis (CEA) considers cost and health impact simultaneously in a validated and transparent framework [19]. Proposed metrics for the health-benefit component of the cost-effectiveness of an intervention include life years (LYs) gained, quality-adjusted life years (QALYs), handicap-adjusted life years (HALYs), and DALYs, with DALYs being the most commonly used [19]. CEA studies have recently shown that although surgery requires more specialized human resources and infrastructure than traditional public health interventions, global surgical interventions are generally cost-effective and can play a pivotal part in population-based health-care delivery [19]. Many surgical interventions have been found to have comparable or even higher cost-effectiveness than standard medical interventions such as the treatment of hypertension and of HIV/AIDS [20]. Several pitfalls must be avoided in cost-effectiveness analyses, with regard to cost measurement, effectiveness measurement, probability estimation, valuation of the counterfactual, and heterogeneity and uncertainty [21].
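To make the combination of cost and health impact concrete, the sketch below computes a cost per DALY averted against a counterfactual. It uses the generic incremental cost-effectiveness ratio rather than the method of any particular cited study; the function name and the program figures are hypothetical.

```python
def cost_per_daly_averted(cost_with: float, cost_without: float,
                          dalys_averted_with: float,
                          dalys_averted_without: float = 0.0) -> float:
    """Incremental cost divided by incremental DALYs averted.

    The 'without' arguments describe the counterfactual, for example no
    surgical program at all (zero cost, zero DALYs averted).
    """
    extra_dalys = dalys_averted_with - dalys_averted_without
    if extra_dalys <= 0:
        raise ValueError("the intervention must avert more DALYs than the counterfactual")
    return (cost_with - cost_without) / extra_dalys


# Hypothetical program: 50,000 (any currency) spent, 400 DALYs averted,
# compared with a counterfactual of no program.
ratio = cost_per_daly_averted(cost_with=50_000.0, cost_without=0.0,
                              dalys_averted_with=400.0)
print(f"Cost per DALY averted: {ratio:.2f}")
```

The choice of counterfactual drives the result, which is one reason its valuation appears among the pitfalls listed above.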

The LCoGS dashboard metrics

A major step forward in the development of metrics in global surgery is represented by the core indicators proposed by the Lancet Commission on Global Surgery (LCoGS) [20, 22]. These indicators include specialist surgical workforce density, surgical volume, access to timely essential surgery, perioperative mortality, and protection against impoverishing and against catastrophic expenditure (Fig. 4) [20].

Fig. 4

The 6 LCoGS proposed indicators of surgical capacity [20]

Specialist surgical workforce density is defined as the number of specialist surgical, anesthetic and obstetric (SAO) providers working per 100,000 population [20]. A density of 20 SAO providers per 100,000 population has been identified as an interim threshold for acceptable access to essential surgical and anesthesia care in adults, with this threshold to be reassessed in 2030 based on available data [20, 23].
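The density indicator is a simple rate, as the following Python sketch shows. The function names and the country figures are hypothetical; the interim threshold of 20 per 100,000 is the one quoted above.

```python
import math

INTERIM_THRESHOLD_PER_100K = 20  # interim SAO density quoted in the text


def sao_density_per_100k(providers: int, population: int) -> float:
    """Specialist SAO providers per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return providers * 100_000 / population


def providers_still_needed(providers: int, population: int) -> int:
    """Additional providers required to reach the interim threshold."""
    required = math.ceil(INTERIM_THRESHOLD_PER_100K * population / 100_000)
    return max(0, required - providers)


# Hypothetical country: 150 SAO providers for 5 million people.
density = sao_density_per_100k(providers=150, population=5_000_000)
gap = providers_still_needed(providers=150, population=5_000_000)
print(f"Density: {density:.1f} per 100,000; additional providers needed: {gap}")
```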

Access to timely essential surgery refers to the temporal access to essential surgical and anesthesia care. It is the proportion of the population that can reach within 2 h a facility capable of performing so-called bellwether procedures [20, 23]. Bellwether procedures were identified to sample the infrastructure, resources and skill sets needed to treat a broad range of essential surgical conditions, and include laparotomy, management of open fracture, and Cesarean section [20, 23]. This indicator in combination with workforce density provides an encompassing measure of the availability of basic levels of surgical care [24].

The surgical volume indicator is the number of procedures performed per 100,000 population [20]. Minimum rates of surgery are associated with several desirable health indicators [25], and this minimum appears to be around 4000–5000 per 100,000 for adult surgical care [20]. Using this rate, an estimated 143 million additional procedures are needed in LMICs each year to save lives and prevent disability. Each country, however, needs to define its ideal surgical volume and case mix [20, 24]. Moreover, because the surgical volume indicator is based on the incident burden each year, it does not necessarily capture the large unmet need of the surgical backlog [8].
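The same arithmetic that underlies the global estimate can be applied to a single country, as in this Python sketch. The function names and figures are hypothetical, and the default target of 5000 per 100,000 is simply the upper end of the range quoted above; a country would substitute its own target.

```python
def surgical_volume_per_100k(procedures: int, population: int) -> float:
    """Procedures performed per 100,000 population in one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return procedures * 100_000 / population


def additional_procedures_needed(procedures: int, population: int,
                                 target_per_100k: float = 5000) -> float:
    """Yearly shortfall in procedures relative to a chosen target rate."""
    required = target_per_100k * population / 100_000
    return max(0.0, required - procedures)


# Hypothetical country: 40,000 procedures a year for 2 million people.
volume = surgical_volume_per_100k(procedures=40_000, population=2_000_000)
shortfall = additional_procedures_needed(procedures=40_000, population=2_000_000)
print(f"Volume: {volume:.0f} per 100,000; shortfall: {shortfall:.0f} procedures per year")
```

This shortfall covers only incident need for the year; any accumulated backlog would have to be added separately.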

The perioperative mortality rate (POMR) is the percentage of in-hospital deaths from any cause in patients who have undergone a surgical procedure [20, 24]. The POMR is a feasible, credible, and critical measure of surgical quality, although its usefulness as an indicator is restricted by the absence of a standardized approach to measurement, poor understanding of the best time of measurement, and uncertainty over whether risk assessment is needed [20, 24]. Recently, however, Marks et al. [26] have shown that only small data sampling is required for a reliable POMR.
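The POMR definition above translates directly into code. The Python sketch below also attaches a Wilson score interval to show how sample size affects precision; this is a generic statistical technique added for illustration, not the sampling method of Marks et al., and the function names and hospital figures are hypothetical.

```python
import math


def pomr_percent(deaths: int, procedures: int) -> float:
    """In-hospital deaths from any cause as a percentage of procedures."""
    if procedures <= 0:
        raise ValueError("procedures must be positive")
    if not 0 <= deaths <= procedures:
        raise ValueError("deaths must lie between 0 and procedures")
    return 100.0 * deaths / procedures


def wilson_interval_percent(deaths: int, procedures: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for the POMR, in percent."""
    pomr_percent(deaths, procedures)  # reuse the input checks
    n = procedures
    p = deaths / n
    denom = 1.0 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return 100.0 * max(0.0, centre - half_width), 100.0 * min(1.0, centre + half_width)


# Hypothetical hospital: 12 in-hospital deaths after 800 procedures.
rate = pomr_percent(deaths=12, procedures=800)
low, high = wilson_interval_percent(deaths=12, procedures=800)
print(f"POMR: {rate:.2f}% (approx. 95% interval {low:.2f}% to {high:.2f}%)")
```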

Financial risk protection is access to needed health services without financial hardship. The goal of universal health coverage is to protect people against the adverse financial consequences of paying for care while allowing them use of high-quality health services. Protection against impoverishing and catastrophic expenditures is an indicator of financial access to essential surgical and anesthesia care [20, 24]. About 33 million people globally experience catastrophic health expenditures each year related to surgical care [24]. These metrics define the proportion of households that are protected from catastrophic and impoverishing expenditures due to direct out-of-pocket costs of surgical care [27]. Monitoring of financial risk protection provides information on improvements and entails periodic household expenditure surveys, ideally every 2–5 years [24, 28].
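A household expenditure survey of the kind described above could be summarized as in the following Python sketch. The text does not specify thresholds, so everything here is an assumption: the 10% catastrophic threshold, the poverty line, the `Household` fields, and the survey values are hypothetical, and impoverishment is simplified to mean being pushed below the poverty line by out-of-pocket surgical costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Household:
    annual_expenditure: float  # total household spending in a year
    surgical_oop: float        # direct out-of-pocket cost of surgical care


def is_catastrophic(h: Household, threshold: float = 0.10) -> bool:
    """Assumed rule: out-of-pocket cost exceeds a share of annual spending."""
    return h.surgical_oop > threshold * h.annual_expenditure


def is_impoverishing(h: Household, poverty_line: float) -> bool:
    """Assumed rule: paying for surgery pushes the household below the line."""
    return (h.annual_expenditure >= poverty_line
            and h.annual_expenditure - h.surgical_oop < poverty_line)


def protected_share(households, poverty_line: float, threshold: float = 0.10) -> float:
    """Proportion of households facing neither kind of expenditure."""
    if not households:
        raise ValueError("at least one household is required")
    protected = sum(
        1 for h in households
        if not is_catastrophic(h, threshold) and not is_impoverishing(h, poverty_line)
    )
    return protected / len(households)


# Hypothetical survey of four households, poverty line of 700 per year.
survey = [
    Household(2000, 50),   # neither catastrophic nor impoverishing
    Household(1500, 300),  # catastrophic (20% of spending)
    Household(800, 60),    # neither
    Household(750, 70),    # impoverishing (falls to 680)
]
share = protected_share(survey, poverty_line=700)
print(f"Protected households: {share:.0%}")
```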

All of these indicators are most informative when used together and when disaggregated by equity stratifiers. Despite the recent inclusion of the LCoGS indicators in the World Development Indicators dataset, a sustained, coordinated model for recurrent, accurate data collection remains essential for achieving independent accountability in surgical systems strengthening [22].

Developing metrics in children’s surgery

Although the development of metrics in global surgery has made significant progress to date, within children’s surgery such metrics are in many respects only in their infancy. In this section, we will highlight the few early efforts in determining and using such metrics.

Incidence/prevalence

Unlike the situation in high-income countries (HICs), population-based studies on the incidence and prevalence of pediatric surgical disease are scarce in LMICs, particularly for conditions related to birth defects [1, 29, 30]. Most BoD data require community surveys or large institutions draining geographic areas [31, 32], but data quality is generally low [33] and most surveys miss the significant hidden mortality caused by limited access to care [34]. Such data are important not only for estimating the BoD of surgical conditions in children, but also for an appropriate allocation of health resources, design and implementation of preventive measures, and surgical health services prioritization [29, 30].

Burden of disease

The severely constrained access to prompt surgical care for children in LMICs results in a large burden of disease, as reflected in community surveys of the prevalence of surgical conditions [10, 35, 36]. Population-based documentation of BoD of specific surgical congenital anomalies is very important [32]. Although biased by referral and access, hospital-based data can provide valuable information on the burden of pediatric surgical diseases, especially in sub-Saharan Africa where there is a marked paucity of data [37,38,39]. In light of the inherent challenges in both community-based and institutional data, several investigators have combined them, hence both reinforcing data validity and gaining unique supply/demand perspectives in LMICs [40, 41].

Perioperative mortality rate (POMR)

The pediatric perioperative mortality rate (POMR), especially for congenital surgical conditions, is high in LMICs in Africa [30, 42,43,44]. The paucity of quantitative research in this area hampers the development of recommendations to address the issue. Condition-specific pooled POMRs vary widely, with congenital conditions treated in the neonatal period having the highest mortality (Fig. 5) [43]. Additional hidden mortality from deaths before presentation at a facility implies that true mortality is even higher than hospital-based data suggest [43].

Fig. 5

Pediatric surgical mortality in resource-constrained settings [43]

Surgical backlog and effective coverage

There is a significant gap between children’s surgical need and its equitable and safe provision [43, 44], yet the magnitude of this gap is largely unknown. Surgical need is unmet largely due to variations in coverage, financial capacity, lack of trained staff, and limited political commitment [45, 46]. Though estimating unmet surgical need can guide policy makers, proxy indicators for estimating equitable provision of surgical care are lacking in the pediatric population [45]. It is difficult to estimate surgical burden using the DALY approach, as the disability incurred during delays in care (the so-called delayed averted burden) is rarely captured. Because the surgical backlog is a measure of unmet surgical need, providing care in excess of the yearly incidence of new cases will gradually decrease both the mean age at treatment and the backlog. Once the backlog is cleared, unmet need becomes equal to the annual incidence in the population and all patients are treated at their ideal age. Backlog calculations provide a practical way of calculating effective coverage for priority non-fatal conditions [8, 9, 11].
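The backlog dynamics described above can be sketched as a simple yearly simulation in Python. The model is deliberately crude and rests on stated assumptions: incidence and surgical capacity are constant, and no waiting patient dies, recovers without surgery, or ages out of eligibility. The function name and all figures are hypothetical.

```python
def simulate_backlog(initial_backlog: int, annual_incidence: int,
                     annual_capacity: int, years: int) -> list:
    """Return the backlog remaining at the end of each simulated year."""
    backlog = initial_backlog
    history = []
    for _ in range(years):
        waiting = backlog + annual_incidence    # old cases plus new cases
        treated = min(waiting, annual_capacity)  # cannot treat more than are waiting
        backlog = waiting - treated
        history.append(backlog)
    return history


# Hypothetical region: 1,000 children waiting, 200 new cases a year,
# and capacity for 450 operations a year.
history = simulate_backlog(initial_backlog=1000, annual_incidence=200,
                           annual_capacity=450, years=5)
print(history)
```

In this example the backlog falls by 250 cases a year, the excess of capacity over incidence, and once it reaches zero only the 200 incident cases need treatment each year. If capacity merely equalled incidence, the backlog would never shrink.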

DALYs and disability weights: met, unmet, and delayed

Measurement of surgical BoD using DALYs requires specific disability weight (DW) values for children’s surgical conditions, which were conspicuously absent from many GBD studies [47]. Several of these DWs have been generated in recent years, allowing DALY estimates to be used more easily in pediatric surgery [48, 49].

The breakdown of BoD into avertable (unmet) and averted (met) clarifies the difficult balance between access to care, surgical need, burden of disease, and the magnitude of work remaining [45, 50]. Estimates of averted DALYs and delayed (prevalent) averted DALYs in such settings as matched North–South pediatric surgical centers, refugee camps, and global surgical charities reveal the significant unmet/residual burden in all resource-constrained settings [9, 50, 51].

Health-care valuation, cost-effectiveness, and financial burden

Cost-effectiveness analysis can be extremely useful in informing changes in health-care policy in resource-limited settings [52]. Its main advocacy role is to inform policy makers that surgical care is a good financial investment, i.e., that surgical care is at least as cost-effective as its non-surgical counterparts, and that it is a value-added proposition to any society. While these twin goals have been described for adult surgical care by the LCoGS, much work remains to be done within children’s surgery.

In terms of cost-effectiveness, several authors have documented the comparability of various children’s surgical procedures to medical interventions (Fig. 6) [9, 50, 53,54,55]. Cost-effectiveness has also been shown in a dedicated pediatric surgical operating theater [52] and for several charitable platform types [56], and a recent systematic review has summarized the evidence [57]. The cost per DALY averted through cleft lip and palate repair, treatment of hydrocephalus, and repair of pediatric inguinal hernia in developing countries is comparable to that for other disease processes commonly addressed by the global health community; these procedures show a favorable benefit–cost ratio and can have a substantial impact on the economic health of countries in the developing world [53,54,55, 58]. Although major gaps in the quality and methodology of research limit the precise economic valuation of surgical care, pediatric surgical procedures such as inguinal hernia repair, trichiasis surgery, cleft lip and palate repair, circumcision, congenital heart surgery, and orthopedic procedures should be considered essential surgical procedures because of the considerable economic value they offer [57].

Fig. 6

Cost-effectiveness of pediatric surgical interventions [57]

Within health-care valuation some evidence has also been gathered, particularly for cleft lip and palate work [59, 60]. The financial burden of seeking surgical care is largely unmeasured in the pediatric population in LMICs, but some recent studies have documented the extent of the burden. Despite public funding of hospitals in Uganda, many patients experience catastrophic or impoverishing surgical expenditures, further worsened by out-of-stock supplies and non-functioning equipment [52, 61].

Surgical capacity

Surgical capacity for children in Africa is grossly sub-optimal, with a lack of human and material resources that is reflected in poor outcomes [44]. Mission and other charitable non-governmental hospitals and externally funded surgeons make valuable contributions to elective and specialist surgery in many countries [62,63,64,65], but they rarely contribute to capacity-building [66]. The PediPIPES tool, an adaptation of the PIPES tool to children’s surgery [67], has facilitated surgical capacity assessments in several LMIC settings beyond its original application in West Africa [41, 68]. Several other formalized and/or validated survey tools have been used in LMICs. They include the Global Assessment in Pediatric Surgery (GAPS) [69] in Somaliland [36], the WHO Surgical Assessment Tool in Somaliland [36], and others [65].

Workforce metrics

A pediatric equivalent to the SAO metric developed by the LCoGS [20] is essential for identifying realistic workforce development goals in LMICs. Efforts to date include the use of regional survey data to estimate the number of pediatric surgeons required per population [44, 70, 71], as well as a recent attempt to estimate the pediatric surgical workforce needed based on national surgical outcomes (Fig. 7) [72].

Fig. 7

Global pediatric surgical workforce by surgical outcome [72]

Bellwether procedures

Given that surgical bellwethers have become key metrics in surgery in general, the search for pediatric-specific bellwethers has been identified as a research priority. While some conditions such as gastroschisis have been proposed [73], an empiric process of deriving and validating pediatric bellwethers is currently underway for both pediatric general and neurosurgical procedures [74, 75]. An e-Delphi process within the Global Initiative for Children’s Surgery has proposed bellwether procedures by surgical subspecialty and hospital level, but respondent numbers were insufficient for consensus [75]. The current approach, still in progress, has therefore been to gather hospital data across LMICs to validate proposed bellwethers at each hospital level. Geospatial tools have been successfully used to identify geographic variations on both the supply and demand sides of pediatric surgical provision, which may assist with bellwether procedure definition [76].

The road ahead

The brief overview above gives us a sense of how far we have come in identifying and applying metrics in global children’s surgery. The picture is not rosy: compared to their “adult” counterparts, children’s surgical providers have a significantly smaller toolset and fewer actual data on which to base their strategies and interventions. For most surgical metrics, the data for children are tentative at best. Hence “the road ahead” for those committed to developing the empirical basis for global children’s surgery is clear: our priority is to contextualize the existing metrics to the surgical needs of children. Beyond that, there are further areas where providers need, or will need, new metrics, some generic and others specific to children. Several such areas follow.

There is an increasing awareness of wide intra-national variations in the provision of surgical care, often eclipsing the international variations typically highlighted [75,76,77]. These variations stem from a multitude of factors in the communities concerned, including geographic access, rurality, socio-economic status, and even gender. An increased granularity of summative care data is therefore needed, beyond the “national surgical dashboards” proposed by the LCoGS.

Newborns with surgical anomalies are a unique population within children’s care globally, accounting for a third of infant deaths worldwide, or approximately half a million deaths each year [78]. Their resource needs are typically higher than those of older children, and their outcomes are generally poorer. Combining health data across the 0–15 (or even 18) age range reduces the accuracy with which newborn surgical outcomes can be assessed. Data disaggregation by broad pediatric age ranges is therefore necessary.

Global health-care metrics have so far focused on demographics, met and unmet BoD, surgical capacity and access to care. Yet quality of care has remained secondary, overwhelmed by the urgency in providing any care. Recent efforts have focused on health-care quality, as shown by the Lancet Global Health Commission on High-Quality Health Systems [79] and global benchmarking for surgical mortality adjustment [80]. A parallel trend is the task of simplifying complex quality of care indices used in resource-rich settings into simple, easy-to-gather indices applicable to resource-constrained settings, without apparent loss in fidelity [81].

Ultimately, the “philosopher’s stone” in global surgical metrics is the development of holistic, integrative measures of system need, capacity, and quality of care provided. This was suggested 15 years ago by Murray’s group [14], and achieved to some extent in the Healthcare Access and Quality (HAQ) index developed by the same larger group of GBD collaborators [82]. Ideally, such indices would incorporate not only current access and quality measures, but also account for existing surgical backlogs, cost-effectiveness and sustainability of interventions.

Management thinker Peter Drucker famously stated, “If you can’t measure it, you can’t improve it.” The development of accurate, practical, and integrative metrics remains an essential priority in the coming of age of global children’s surgery. Such metrics will enable proper identification of gaps in care, guide capacity building efforts and result in effective strategies for improving the lives of millions of children with surgical conditions globally.