Abstract
Background and Objective
Bladder cancer is common among current and former smokers. High bladder cancer mortality may be decreased through early diagnosis and screening. The aim of this study was to appraise decision models used for the economic evaluation of bladder cancer screening and diagnosis, and to summarise the main outcomes of these models.
Methods
MEDLINE via PubMed, Embase, EconLit and Web of Science databases were systematically searched from January 2006 to May 2022 for modelling studies that assessed the cost effectiveness of bladder cancer screening and diagnostic interventions. Articles were appraised according to Patient, Intervention, Comparator and Outcome (PICO) characteristics, modelling methods, model structures and data sources. The quality of the studies was also appraised using the Philips checklist by two independent reviewers.
Results
Searches identified 3082 potentially relevant studies, which resulted in 18 articles that met our inclusion criteria. Four of these articles were on bladder cancer screening, and the remaining 14 were diagnostic or surveillance interventions. Two of the four screening models were individual-level simulations. All screening models (n = 4, with three on a high-risk population and one on a general population) concluded that screening is either cost saving or cost effective with cost-effectiveness ratios lower than $53,000/life-years saved. Disease prevalence was a strong determinant of cost effectiveness. Diagnostic models (n = 14) assessed multiple interventions; white light cystoscopy was the most common intervention and was considered cost effective in all studies (n = 4). Screening models relied largely on published evidence generalised from other countries and did not report the validation of their predictions to external data. Almost all diagnostic models (n = 13 out of 14) had a time horizon of 5 years or less and most of the models (n = 11) did not incorporate health-related utilities. In both screening and diagnostic models, epidemiological inputs were based on expert elicitation, assumptions or international evidence of uncertain generalisability. In modelling disease, seven models did not use a standard classification system to define cancer states; the others used risk-based, numerical or Tumour, Node, Metastasis classifications. Despite including certain components of disease onset or progression, no models included a complete and coherent model of the natural history of bladder cancer (i.e. simulating the progression of asymptomatic primary bladder cancer from cancer onset in the absence of treatment).
Conclusions
The variation in natural history model structures and the lack of data for model parameterisation suggest that research in bladder cancer early detection and screening is at an early stage of development. Appropriate characterisation and analysis of uncertainty in bladder cancer models should be considered a priority.
Evidence on the cost effectiveness of bladder cancer screening is consistent but very limited.
Bladder cancer models rely on data with high uncertainty such as international data and assumptions.
In the absence of sufficient data for complex models, more trials are needed to inform the parameters of natural history disease models, which in turn can inform the protocols of the trials to test the bladder cancer screening interventions.
1 Introduction
Bladder cancer (BC) is a common malignancy with its highest burden falling on economically developed countries [1,2,3]. Worldwide, BC ranks sixth in men and 17th in women with the lifetime incidence risk of 1.1% and 0.27%, respectively [1]. The risk of BC increases with age and the higher risk for men than women reflects a higher exposure to carcinogens [1,2,3]. Tobacco smoking is the strongest risk factor, accounting for an estimated 50–65% of all BC cases [4, 5]. Other common risk factors include occupational exposure [6, 7], contamination of drinking water with arsenic and a family history of BC [1, 2].
Bladder cancer is usually first suspected because of visible haematuria or urinary symptoms [8, 9]. At the time of diagnosis, around 75% of patients have non-muscle-invasive BC (NMIBC) [10], which generally has a favourable prognosis. However, around 15% of patients with NMIBC will progress to invasive disease with a much lower expected survival [11]. The diagnostic procedures for symptomatic patients may include: cystoscopy, telescopic endoscopy, ultrasound and/or computed tomography [4]. Screening (i.e. detection of asymptomatic cancers) has been demonstrated to provide survival benefits in prospective studies [8]. However, there remains no conclusive evidence on the effectiveness of the implementation of either national or regional BC screening programmes [1, 8].
In clinical trial settings, several BC screening approaches have been explored [12]: urine dipstick is often considered as a screening intervention in primary care settings, with the potential for urinary biomarkers as well as cystoscopy with ultrasound or computed tomography [8, 13]. Guidelines from professional organisations across different countries, including the USA, Canada, the UK, Japan and the Netherlands, are consistent in recommending an evaluation for asymptomatic microscopic haematuria [14]. However, the recommendations vary regarding screening interventions, particularly the role of urine dipstick and how to define the target screening population [14].
From an economic perspective, BC is one of the most expensive malignancies to manage, with the follow-up costs being twice as high for medium-risk disease and five times as high for high-risk disease compared with low-risk (NMIBC) disease [15]. As multiple BC screening options emerge, modelling studies are often used to assess optimal screening regimes and outcomes prior to large-scale recommendations. The aim of this study was to classify the approaches that have been used in cost-effectiveness models in BC screening and early diagnosis with a specific focus on understanding the modelling methods that have been applied, the structure of the economic models, and modelling inputs and parameterisation. This review also summarises the main outcomes of the identified cost-effectiveness models.
2 Materials and Methods
An initial scoping search was conducted in September 2021 to identify existing reviews. No reviews of BC natural history or cost-effectiveness models were identified; however, search strategies from previous reviews of diagnostic and treatment interventions, and a review of the economics of BC were used to define the most appropriate search terms [16,17,18,19]. As the scoping search identified few studies, the literature scope was then expanded to include diagnostic and surveillance models to provide a comprehensive understanding of BC modelling. The International Society for Pharmacoeconomics and Outcomes Research Good Practices Task Force Report on Critical Appraisal of Systematic Reviews With Costs and Cost-Effectiveness Outcomes was followed in the development of the protocol and reporting of these studies [20]. The protocol registration number in the Prospective Register of Ongoing Systematic Reviews (PROSPERO) is CRD42021281256.
Based on the initial scoping review, a systematic search was conducted in MEDLINE via PubMed, Embase, EconLit and the Web of Science databases. This search was supplemented by searching the Health Technology Assessment database of the Centre of Reviews and Dissemination of the University of York, the National Institute for Health and Care Excellence appraisal system, the Open Access Theses and Dissertations (https://oatd.org), Google Scholar (the first 300 hits in the search for “bladder cancer”, “cost-effectiveness”, “model”) and the references of the included studies. The search period in the review was restricted from 01/01/2006 to 08/09/2021 to reflect current practice both with cost-effectiveness modelling methods and early detection pathways. The development of the search strategy was based on the recommendations of the UK InterTASC Information Specialists’ Sub-Group [21]. The search strategy was validated on the modelling studies identified through a targeted search. An example of the search strategy developed for one of the databases is reported in the Electronic Supplementary Material (ESM). A targeted update of the literature search was conducted in May 2022.
Studies in any language were included if they met the following criteria:
- Population: human adult population;
- Intervention: bladder cancer screening or diagnostic interventions;
- Design: model-based research (either cost-effectiveness models or natural history models of bladder cancer);
- Perspective/time horizon: any;
- Publication type: original studies; full-text publications or reports.
Exclusion list:
- Risk models, animal models, lab models, in vitro models, regression statistical models assessing relationships between the parameters or only cost assessments;
- Reviews of the literature, protocols, commentaries and conference abstracts.
Titles and abstracts were screened by the first author (OM) using the Rayyan tool to synthesise the studies that fit the inclusion criteria [22]. The full texts of the articles were independently evaluated by a second researcher (AIH), who also validated the data extraction and duplicated the quality assessment for each of the included studies.
The extraction tables included categories on several dimensions: (1) general information (authors, publication year, country, setting, funding) and PICO (Population, Intervention, Comparator and Outcome); (2) modelling methods (model type according to the taxonomy of model structures for economic evaluations of health technologies [23], software, cycle, time horizon, disease states, discounting, inflation, methods used for costs and outcomes, parametrisation approach and sensitivity analysis); (3) data sources; (4) choices in modelling BC; and (5) quality of the studies using the Philips checklist [24] and the Bilcke et al. guide on uncertainty evaluation [25].
The standardised evaluation of the included models was based on two instruments: the Philips checklist [24] and the guide on uncertainty evaluation by Bilcke et al. [25]. The Philips checklist included the questions on the structure (S1–S9), data (D1–D3), and consistency (C1, C2) [24]. The questions on uncertainty (D4) were excluded from the Philips checklist and were instead guided by the Bilcke et al. methodology [25] to avoid incompatibility between the instruments (this approach was selected as more detailed and explicit, see the ESM for the details). The ranking options of the Philips checklist included “yes”; “partially”, “can’t tell” and “no” (all treated as “no”); or “NA”.
The approach for data synthesis was consistent with the International Society for Pharmacoeconomics and Outcomes Research Good Practices for systematic reviews with cost and cost-effectiveness outcomes [20]. A narrative synthesis was used to address qualitative aspects of model design, including model scope, methods and choices in modelling BC. For screening studies, graphical synthesis reported standardised (inflated to 2022 and converted to international dollars) incremental cost-effectiveness ratios to visualise the cost-effectiveness outcomes by underlying disease prevalence, using the consumer price index and purchasing power parities to standardise the values [26,27,28]. Graphical synthesis of the outcomes for the diagnostic studies was not undertaken because of heterogeneity in PICO, methods and health settings [20].
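The standardisation step described above can be illustrated with a minimal sketch. It assumes one common ordering (inflate the local-currency value to the target year with the consumer price index, then divide by the purchasing power parity rate expressed as local currency units per international dollar); the review does not state the exact ordering used, and the function name and all numbers below are hypothetical.

```python
def standardise_icer(icer_local, cpi_source_year, cpi_target_year, ppp_rate):
    """Inflate a local-currency ICER to the target year, then convert it
    to international dollars.

    Assumption: ppp_rate is local currency units per international dollar
    in the target year. All inputs are illustrative, not from the review.
    """
    if cpi_source_year <= 0 or cpi_target_year <= 0 or ppp_rate <= 0:
        raise ValueError("CPI values and the PPP rate must be positive")
    inflated = icer_local * (cpi_target_year / cpi_source_year)
    return inflated / ppp_rate


# Hypothetical example: an ICER of 20,000 local units, CPI rising from
# 80 to 100, and a PPP rate of 0.7 local units per international dollar.
example_icer = standardise_icer(20_000.0, 80.0, 100.0, 0.7)
print(round(example_icer, 2))
```

Applying the same function to every study puts the ratios on one scale, which is what allows them to be plotted against disease prevalence.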
3 Results
3.1 General Description and PICO
Our search identified 3082 records, of which 18 models—four on BC screening and the remaining on BC diagnostic or surveillance interventions (Fig. S1 of the ESM)—met our inclusion criteria. The excluded full text articles are reported in the ESM.
All included models were developed in high-income countries, with nine of them within the US context (Tables 1, 2). Payer perspective was mentioned in the majority of the studies (n = 12) with two studies stating the societal perspective but reporting the inputs for the direct medical costs only [29, 30].
Three and two out of four screening models simulated high-risk and general-risk populations, respectively [29, 31,32,33]. High-risk groups were defined in the models as heavy smokers and those with occupational exposure, and as any male individual above the specified age. Two related cost-effectiveness studies assessed biochemical bladder markers [32, 34] as an intervention for BC screening, and two assessed dipstick haematuria testing [29, 33] (all compared to no screening, Table 1).
The diagnostic models included patients with haematuria (n = 5), NMIBC (n = 8) and muscle-invasive BC (n = 1). A range of different diagnostic and surveillance interventions were assessed in the models. Hexaminolevulinate blue light cystoscopy and white light cystoscopy (WLC) were the most frequently compared interventions, followed by cystoscopy as a stand-alone or a combination of the interventions (Table 2).
In screening models, two out of four studies reported quality-adjusted life-years (QALYs) [29, 33] and one further study reported life-years saved (LYS) [31] (Table 1). In diagnostic models, QALYs were reported in only four out of 14 studies [35,36,37,38] and two more studies reported LYS only [39, 40], with cases detected and resource utilisation used as the primary modelling outcomes (Table 2).
3.2 Screening Models: Outcomes
The models that evaluated haematuria tests included the impact on bladder and kidney cancers, as well as other urological diseases. All studies concluded that BC screening is cost effective in either all populations (n = 1) or only high-risk population groups (n = 3, as defined using BC demographic features) (Table 1).
All studies concluded that screening is more cost effective with a higher incidence or prevalence of the disease (Fig. 1). The studies did not agree on a single threshold value of BC prevalence or incidence at which screening becomes a cost-effective intervention. Cost per cancer detected was the lowest in the older age group (71–80 years) with the highest disease prevalence [24]; however, no studies compared cost per QALY across age groups to examine how the cost effectiveness of screening varies by age.
3.3 Diagnostic Models: Outcomes
White light cystoscopy dominated the computed tomography scan [41], the protocol including a microsatellite analysis with control cystoscopy at 3, 12 and 24 months [30], and the protocol using virtual cystoscopy followed up by cystoscopy if the first test is positive [42]. Interventions that supplemented cystoscopy had higher costs and effects, while tumour markers had higher costs and varied values for clinical effects [30, 39, 41, 42]. The strategy of using the cystoscopy only for positive cases with other primary diagnostic tools (such as urine cytology or cystosonography) had lower costs and effects [38, 42]. Compared with hexaminolevulinate blue light cystoscopy, WLC had higher costs in two out of four studies [36, 43,44,45]. These studies concluded that hexaminolevulinate blue light cystoscopy had higher therapeutic effects than WLC, and is therefore likely to be cost effective [36, 43,44,45]. Only one of the included studies [28] assessed incremental cost-effectiveness ratio as costs per QALY (the intervention was considered as dominating). Three other studies [32, 35, 36] assessed cost per progression, recurrence or resource use, leaving a high uncertainty around interpretation of their results. The heterogeneity in the choice of other evaluated diagnostic interventions and their comparators was too large to support a systematic comparison (Table 2).
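The terms used above (one strategy "dominating" another, or having higher costs and effects) follow the usual incremental comparison of two strategies. A minimal sketch of that classification is given below; the function name and the example numbers are hypothetical and are not taken from any of the included studies.

```python
def compare_strategies(cost_new, effect_new, cost_old, effect_old):
    """Classify a new strategy against a comparator.

    Returns (label, icer). The ICER is only defined for a trade-off,
    i.e. when one strategy is both more costly and more effective.
    """
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_cost == 0 and d_effect == 0:
        return ("equivalent", None)
    if d_cost <= 0 and d_effect >= 0:
        return ("dominant", None)      # no more costly, no less effective
    if d_cost >= 0 and d_effect <= 0:
        return ("dominated", None)     # no cheaper, no more effective
    return ("trade-off", d_cost / d_effect)


# Hypothetical inputs: costs in currency units, effects in QALYs.
print(compare_strategies(1000.0, 5.0, 1200.0, 4.8))   # cheaper and better
print(compare_strategies(1500.0, 5.2, 1000.0, 5.0))   # costlier and better
```

When the effect measure is cases detected or recurrences avoided rather than QALYs, the same arithmetic applies, but the resulting ratio cannot be compared against a QALY-based threshold.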
3.4 Screening Models: Methods
Two screening models used a decision tree and two others used Markov model structures [29, 34] (Table S1 of the ESM). All screening models were cohort models rather than individual patient-level models.
The models with decision tree structures predicted the potential health and cost impact of screening interventions by combining the characteristics of the screening tests (such as sensitivity and specificity) with underlying BC prevalence data [32, 33]. Average life expectancy by stage among the modelled population group (75-year-old men) was used in the decision tree model predicting LYS and QALYs over the lifetime [33]. The models with Markov structures (one with a lifetime and another one with a 5-year horizon) used a decision tree to model the screening and diagnostic pathways leading to the detection of BC; patients with the diagnosed BC entered one of the BC states (Markov model) and could undergo recurrence, surveillance, progression or death [26, 27].
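The decision tree logic described above, combining test sensitivity and specificity with underlying prevalence, can be sketched as follows. This is an illustrative simplification under stated assumptions: a single screening round, every positive result followed up at a fixed cost, and hypothetical values for all inputs. None of the numbers come from the included models.

```python
def screening_decision_tree(n_screened, prevalence, sensitivity,
                            specificity, cost_test, cost_followup):
    """Expected outcomes of one screening round for a cohort."""
    diseased = n_screened * prevalence
    healthy = n_screened - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    true_neg = healthy * specificity
    false_pos = healthy - true_neg
    # Assumption: all positives (true and false) receive the follow-up test.
    total_cost = n_screened * cost_test + (true_pos + false_pos) * cost_followup
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "total_cost": total_cost,
        "cost_per_cancer_detected": total_cost / true_pos,
    }


# Hypothetical cohorts differing only in prevalence.
high_risk = screening_decision_tree(10_000, 0.02, 0.80, 0.90, 5.0, 300.0)
general = screening_decision_tree(10_000, 0.005, 0.80, 0.90, 5.0, 300.0)
print(high_risk["cost_per_cancer_detected"], general["cost_per_cancer_detected"])
```

With these illustrative inputs the cost per cancer detected is lower in the higher-prevalence cohort, because false-positive follow-up costs are spread over more true detections.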
3.5 Diagnostic Models: Methods
All but one diagnostic model [39] had a time horizon of 5 years or less. Five out of 14 diagnostic and surveillance models had a decision tree cohort structure [36, 40,41,42, 45], and one model was a simulated patient-level decision tree model [46] (Table S2 of the ESM). The decision tree structure was applied mainly in the diagnostic and surveillance models with the focus on clinical or healthcare outcomes (e.g. cancers detected, or healthcare resources used, and not LYS or QALYs); similar to screening models, the decision tree structure was used to model the diagnostic and treatment pathways based on sensitivities of the tests. In the simulated patient-level decision tree model of Georgieva et al., patients were assigned individual characteristics (including sex, age, smoking status and history of gross haematuria), and the probabilities of different types of urinary tract cancers were based on these characteristics at diagnosis [46]. This model predicted the number of detected and missed cancers, which allowed for the assessment of costs and the cost effectiveness of each intervention based on the sensitivity and specificity of each diagnostic test.
Seven diagnostic models were cohort-level Markov models [30, 35, 37, 39, 43, 44, 47] (one of them was a semi-Markov model [30]), and one model was an individual-level Markov model [38]. Markov states were used in the cohort models to simulate the transitions during the surveillance period (i.e. after the diagnosis), such as progression, recurrence of the disease or death. The simulated patient-level Markov model of Yuan et al. simulated the natural history of secondary BC to assess the impact of different diagnostic guidelines, with the Markov states including the natural history of BC, treatment and death [38].
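A cohort-level Markov model of the kind described above can be sketched in a few lines. The state names, the transition probabilities and the cycle count below are hypothetical and chosen only to show the mechanics; they do not reproduce any included model, and the probabilities are static, which is one of the structural simplifications the review notes.

```python
STATES = ["surveillance", "recurrence", "progression", "death"]

# Hypothetical per-cycle transition probabilities (each row sums to 1).
TRANSITIONS = [
    [0.85, 0.10, 0.03, 0.02],  # from surveillance
    [0.60, 0.25, 0.10, 0.05],  # from recurrence
    [0.00, 0.00, 0.80, 0.20],  # from progression
    [0.00, 0.00, 0.00, 1.00],  # death is absorbing
]


def run_markov_cohort(start, transitions, n_cycles):
    """Return the cohort trace: state occupancy at cycle 0..n_cycles."""
    for row in transitions:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each transition row must sum to 1")
    n = len(start)
    trace = [list(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        nxt = [sum(current[i] * transitions[i][j] for i in range(n))
               for j in range(n)]
        trace.append(nxt)
    return trace


# The whole cohort starts in surveillance after diagnosis and treatment.
trace = run_markov_cohort([1.0, 0.0, 0.0, 0.0], TRANSITIONS, n_cycles=5)
for cycle, occupancy in enumerate(trace):
    print(cycle, [round(x, 3) for x in occupancy])
```

Costs and utilities would then be attached to each state per cycle and summed, with discounting, over the time horizon.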
3.6 Screening Models: Sources of Data
Screening models were directly parameterised from published sources and/or registers and were based on assumptions on the disease incidence, prevalence and screening effect (e.g. downstaging) [32,33,34] [Table S3 of the ESM]. Base-case epidemiological inputs, such as incidence, were based on experts’ or researchers’ assumptions. The definition of high-risk populations varied by study, from 2% for prevalence to 10% for incidence [29, 31,32,33]. Data on costs were retrieved from the databases (Medicare, National Health Service reference costs and the National Health Insurance) and supplemented with data from local hospitals and expert opinions [24,25,26,27]. Three studies [31,32,33] used other inputs from published sources; the screening accuracy and downstaging data were retrieved from meta-analyses of international studies, individual publications, clinical experts’ and authors’ opinions. Models had differing assumptions on screening test sensitivities, which ranged from 60 to 100% for different tests (dipstick tests or biomarkers) and population groups (average risk or high risk) [32,33,34]. Two studies (with the UK and Japan context) reported QALYs as the outcome measures and both retrieved utility values from previous cost-effectiveness analyses, including those conducted in other countries (from Canada and the USA, respectively). A recent study by Okubo et al. evaluating the cost effectiveness of combining haematuria screening with a Specific Health Checkup (where a haematuria test is already performed for around 38% of participants) informed the transition probabilities by the Specific Health Checkup report and the National Cancer Registry data [29].
Specificity of the primary tests in the screening models (with values ranging from 60 to 99.9%) impacted the follow-up interventions and costs of diagnosis [24,25,26,27]. None of the models reported screen-induced overdiagnosis, overtreatment or other potential screening-related harms.
3.7 Diagnostic Models: Sources of Data
Most of the models were directly parameterised from published sources (i.e. used published data as direct model inputs) with one study also using a within clinical trial assessment [30] and two others manually calibrating some of the disease parameters by using the data from the European Organisation for Research and Treatment of Cancer as calibration targets [38, 47] (Table S4 of the ESM). Expert elicitation, assumptions and published sources were used for epidemiological data, with all but three studies referencing international data for some of the parameters including sensitivities, disease severity and progression [35, 37,38,39, 41,42,43,44,45,46,47]. National datasets (such as Medicare for all the US studies, National Health Service reference costs or the National formularies) were used in all but two studies with in-hospital cost calculations [30, 44] to estimate the direct medical costs. Variable uptake for the diagnostic and surveillance interventions was not considered in the included models, as it was not measured empirically for the evaluated interventions. Diagnostic studies included harms (n = 7) related to unnecessary tests for those with false-positive diagnoses, complications from invasive diagnostic and treatment procedures, including mortality from radiation-induced tumours and anaesthesia based on published data [30, 37,38,39, 41, 42, 46].
Three out of four studies reporting QALYs retrieved health-related utility values from previous cost-effectiveness studies [35, 37, 38]; all three studies (two from the USA and one from the UK) referenced a cost-effectiveness analysis of radical cystectomy in Canada that evaluated related utilities based on a standard gamble approach involving 25 urologists [48]. Mowatt et al. used utility values from the other urological cancers [39] stating that the modellers selected the best available source of the evidence to inform health-related utility values. While the study of Mowatt et al. [39] is not recent, the reliance of the later studies on qualitative data from the previous model suggests that scarcity in utility values may still be an issue.
3.8 Modelling BC
The identified models defined BC states in the following ways (Table 3):
1. Without a standard classification system defining the cancer as detected, progressed and/or recurrent [30, 35, 40, 42, 45,46,47].
2. Using Tumour, Node, Metastasis (TNM) system or its elements [34, 36] or numerical staging [29].
3. Using risk-based classification states, such as NMIBC of low, intermediate and high risk, and non-metastatic and metastatic muscle invasive BC [33, 37, 39, 43, 44].
Some of the diagnostic and screening models simulated population groups including patients with asymptomatic microscopic haematuria [41], microscopic haematuria [39, 42, 46] or suspected haematuria [42], while predicting outcomes from the time of the diagnosis. However, none of the models simulated a complete natural history (i.e. progression of asymptomatic disease from primary cancer onset). Screening or diagnostic models can be divided into several types according to the inclusion of the natural history components.
1. Models Without Progression of Undiagnosed Cancer
Models of this type simulate effects and costs based on stage at diagnosis for screen-detected and symptomatic disease and either do not consider cancer progression [33,34,35,36, 40,41,42, 45, 46] or consider progression only for diagnosed disease [29, 30, 47]. These models were informed by the assumed or evidenced incidence rates and test sensitivities. Rather than modelling disease progression as the consequence of a false-negative test, these models assessed incremental costs. For example, the diagnostic study of Rodgers et al. [42] considered costs of repeat testing for microscopic haematuria with false-negative diagnosis. Teoh et al. [33] applied higher lifetime treatment costs to false-negative screened patients, similar to those detected symptomatically.
2. Models with Progression of Undiagnosed Cancer as a Result of a False-Negative Test
These models simulate progression to more advanced BC states for patients with a false-negative test result by combining prevalence data and characteristics of screening tests [37, 39, 43, 44]. For example, the diagnostic model of Sutton et al. included an undiagnosed state for patients with false-negative results and assumed that these patients will be diagnosed within the next 2 years; patients in an undiagnosed state could progress to low-risk, high-risk or metastatic states and could then be diagnosed [37].
3. Models with Progression of Asymptomatic Cancer
The only model that included undiagnosed states for BC that were not related to a false-negative test result (i.e. asymptomatic cancer) was a clinical surveillance model of Yuan et al. [38]. This model simulated the natural history of secondary BC for patients who were defined as low risk at the time of diagnosis and were disease free following the treatment. This model assumed a progression of patients from treated low-risk BC to asymptomatic intermediate-risk and then finally high-risk disease. At each of these states, patients could transit to the detected state following the surveillance intervention. Diagnosed patients could not progress to more advanced disease but could progress to the death state as a result of BC death or age-specific death from other causes. The progression of asymptomatic disease was estimated by comparing the predicted disease rates to those observed in the European Organisation for Research and Treatment of Cancer trials [49]. The process of calibration is not described in the article.
3.9 Quality Ranking Using the Philips Checklist
In general, studies addressed most of the evaluated quality criteria of the Philips checklist (Table S5 of the ESM), with 14 studies scored “no” on 30% or fewer of the questions. Meanwhile, assessment of internal and external consistency was not reported in 17 and 14 studies, respectively (possibly being reported in separate publications or reports). A short time horizon was also a frequent concern (n = 8 out of 18 studies) in the models (Fig. 2). The quality of two older screening studies was lower than the quality of later screening models and the diagnostic studies; however, because only a few studies were identified, no meaningful comparison can be provided. Agreement between the two reviewers for each category of the Philips checklist [24] was very high at 92%.
3.10 Structural and Parameter Uncertainty in BC Models
Structural uncertainty in screening models was related to different structural assumptions, such as using a decision tree structure to ascertain long-term outcomes, choice of static probabilities, using the BC cases detected as the modelling outcome (instead of the LYS or QALYs), methods and assumptions on BC mortality/survival used in modelling and mismatch between the selected perspective and costs [29, 31,32,33]. Structural uncertainty was not fully addressed in screening models (Table S6 of the ESM), with the study of Teoh et al. partially exploring structural uncertainty by specifying the availability of sources of evidence, their appropriateness and the limitations. Parameter uncertainty (related to the assumed epidemiologic values, unrepresentative populations [using international data or small sized samples] or unspecified sources) was present in all screening studies (Tables S3 and S6 of the ESM) [29, 31,32,33]. None of the published articles mentioned model validation. Only the most recent study by Okubo et al. fully addressed parameter uncertainty through an explicit probabilistic sensitivity analysis [29].
In diagnostic models, short-term time horizon, static transition rates, choice of the outcomes (only BC cases detected or resource use and not LYS or QALYs), the approaches to test sensitivity, incidence, disease progression, recurrence and BC mortality evaluation were recorded among the other sources of structural uncertainty (Table S6 of the ESM) [7, 30, 35,36,37,38,39,40,41,42,43,44,45,46,47]. Most models did not report on structural uncertainty. Klaassen et al. [44] and Mowatt et al. [39] addressed structural uncertainty by conducting scenario analyses, and three other studies [37, 38, 42] partially addressed it by explicitly specifying the accepted and the alternative assumptions. Similar to the screening models, parameter uncertainty was identified in all included studies and was addressed through probabilistic sensitivity analysis in five diagnostic studies [37, 39, 41, 42, 46]. Two publications described a validation conducted for the diagnostic models (one of them with the calibrated parameters) using survival data or the risk distribution [38, 46].
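For readers less familiar with probabilistic sensitivity analysis, the idea can be sketched as repeated sampling of uncertain inputs and counting how often an intervention has a positive incremental net monetary benefit. The sketch below is a deliberately simplified illustration: the model structure, the distributions, their parameters and the willingness-to-pay values are all hypothetical assumptions and are not drawn from the included studies.

```python
import random


def psa_probability_cost_effective(n_draws, willingness_to_pay, seed=0):
    """Share of parameter draws in which screening has a positive
    incremental net monetary benefit versus no screening."""
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_draws):
        # Hypothetical input distributions (beta for probabilities,
        # gamma for positive, skewed quantities).
        prevalence = rng.betavariate(20, 980)            # mean about 0.02
        sensitivity = rng.betavariate(80, 20)            # mean about 0.80
        qaly_gain_per_case = rng.gammavariate(4.0, 0.25)  # mean about 1.0
        cost_per_person = rng.gammavariate(25.0, 2.0)     # mean about 50
        incremental_qalys = prevalence * sensitivity * qaly_gain_per_case
        net_benefit = willingness_to_pay * incremental_qalys - cost_per_person
        if net_benefit > 0:
            favourable += 1
    return favourable / n_draws


# Same seed, two hypothetical willingness-to-pay thresholds.
p_low = psa_probability_cost_effective(5000, 3_000.0, seed=1)
p_high = psa_probability_cost_effective(5000, 30_000.0, seed=1)
print(p_low, p_high)
```

Repeating the calculation over a range of thresholds gives a cost-effectiveness acceptability curve, which is how a statement such as "cost effective with a given probability" is usually derived.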
3.11 Summary from Studies with Low Uncertainty
Studies that at least partially addressed both structural and parameter uncertainty [25] were also ranked high on Philips checklist criteria [24]. All three studies that explicitly addressed structural and parameter uncertainty (two of which were reports) were diagnostic studies [37,38,39]. Mowatt et al. [39] analysed multiple diagnostic interventions, concluding that cytology followed by WLC in initial diagnosis and follow-up, while being the least effective strategy, is the most cost-effective approach in the UK setting. Sutton et al. concluded that a diagnostic classifier for risk stratification of haematuria patients is cost effective in the UK with a probability of 68% [37]. Yuan et al. compared the long-term clinical effect of different guidelines in the US setting and concluded that none of the comparators dominate each other [38].
4 Discussion
This review explored methods used in modelling the cost effectiveness of diagnostic, surveillance and screening interventions in BC. The screening models evaluated the cost effectiveness of biomarkers and urine dipstick tests in general-risk and high-risk populations; all screening studies concluded that screening is cost effective, with the underlying disease prevalence being an important determinant. However, the earlier models evaluating the cost effectiveness of biomarkers [31, 32] had low quality and high structural and parameter uncertainties.
Diagnostic models assessed a wide range of interventions. In studies of variable quality, hexaminolevulinate blue light cystoscopy was consistently considered as a cost-effective intervention compared with WLC. The studies with low structural and parameter uncertainty concluded on the cost effectiveness of cytology followed by WLC in the initial diagnosis (compared with multiple alternatives) [39] and a risk stratification approach for patients with haematuria in the UK [37]. Diagnostic models had variable predictions on the cost effectiveness of urine biomarkers in BC diagnosis (reporting higher costs and variable effects compared with their alternatives), with a high-quality model with low uncertainty reporting that tumour markers are not cost effective in the UK setting [39].
The conclusions of the cost-effectiveness analyses are subject to provisos regarding limitations of the methods used and available data constraints, with the following discussion points identified:
(1) Correspondence of the PICO to the decision problem
The description of the population (asymptomatic, symptomatic, or diagnosed with NMIBC or muscle invasive BC) defined the initial and subsequent states of the models. The choice of the intervention also affects the model design because some screening and diagnostic tests, such as the urine dipstick test, may also lead to the diagnosis of other diseases (e.g. kidney cancer or other urological conditions). As such, BC models should assess the need to include the simulation of other relevant health conditions to avoid underestimating the potential benefits of screening and diagnostic interventions.
While patients, interventions and comparators were well defined in BC models, the economic outcomes investigated were less consistent. Bladder cancer models frequently reported cost per detection, recurrence, progression or resources used as the main outcome. While these outcomes may be interesting in their own right, they are inadequate in two regards: first, they do not capture the long-term mortality or health-related quality-of-life impacts of early or delayed detection; second, they do not allow comparative economic analyses across different health conditions and thus cannot inform policy decisions [50].
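As an illustration of a generic outcome that can be compared across health conditions, the following minimal Python sketch computes an incremental cost-effectiveness ratio (ICER) expressed as cost per quality-adjusted life-year (QALY) gained. The function name `icer` and all numerical values are hypothetical and are not taken from any of the reviewed models.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    if d_effect == 0:
        raise ValueError("No difference in effect; the ICER is undefined")
    return d_cost / d_effect


# Hypothetical per-person lifetime values (illustrative assumptions only)
screening = {"cost": 1200.0, "qalys": 10.05}
no_screening = {"cost": 900.0, "qalys": 10.00}

ratio = icer(
    screening["cost"], no_screening["cost"],
    screening["qalys"], no_screening["qalys"],
)
print(f"ICER: {ratio:.0f} per QALY gained")
```

Because the denominator is a generic health measure, a ratio of this kind can be set against a decision-maker's willingness-to-pay threshold, which is not possible for a disease-specific outcome such as cost per tumour detected.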
(2) Choice of the model structure
Selection of the model should be based on the simplest structure that addresses the objectives of the study, the structure of the disease and the clinical guidelines or treatment pathways [23]. Healthcare decisions, particularly large investments such as national screening programmes, should consider uncertainty that cannot be reflected in deterministic models. In cancer modelling, timing is important for costs and health outcomes, as costs are commonly higher in the first year after diagnosis than in the following years [51] and cancer-related decrements in health-related utilities vary over time [52]. While stochastic timed models without interaction would be the expected choice for most BC screening and diagnostic models, most of the models included in our review were deterministic, and the decision tree structure was used in more than one-third of all the analysed models.
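To make the distinction concrete, the sketch below shows a very small stochastic, timed, individual-level simulation without interaction between patients, in which costs depend on the time since diagnosis. It is a simplified illustration rather than a reconstruction of any reviewed model: the function names, the annual death probability, the cost values, the 10-year horizon and the 3.5% discount rate are all assumptions chosen for the example.

```python
import random


def simulate_patient(rng, annual_death_prob=0.10, first_year_cost=20000.0,
                     later_year_cost=5000.0, horizon_years=10,
                     discount_rate=0.035):
    """Simulate one diagnosed patient in yearly cycles.

    Returns (discounted cost, discounted life-years). For simplicity, a patient
    who dies during a year is credited with the full cost and life-year of it.
    """
    cost = 0.0
    life_years = 0.0
    for year in range(horizon_years):
        discount = 1.0 / (1.0 + discount_rate) ** year
        # Timing matters: the first year after diagnosis is the most expensive
        cost += discount * (first_year_cost if year == 0 else later_year_cost)
        life_years += discount
        # Random event for this individual (first-order uncertainty)
        if rng.random() < annual_death_prob:
            break
    return cost, life_years


def simulate_cohort(n_patients=10000, seed=1):
    """Average the individual results over many independent simulated patients."""
    rng = random.Random(seed)
    results = [simulate_patient(rng) for _ in range(n_patients)]
    mean_cost = sum(c for c, _ in results) / n_patients
    mean_ly = sum(ly for _, ly in results) / n_patients
    return mean_cost, mean_ly


mean_cost, mean_ly = simulate_cohort()
print(f"Mean discounted cost: {mean_cost:.0f}; mean discounted life-years: {mean_ly:.2f}")
```

A decision tree would collapse these yearly cycles into a single set of branch probabilities and would therefore lose the time dependence of costs and utilities described above.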
(3) Modelling natural history of bladder cancer in screening models
In comparison to breast, cervical and colorectal cancers [53,54,55], the evidence pertaining to the cost effectiveness of BC screening is currently limited. As such, BC models are less sophisticated and have a much greater reliance on expert judgement than models for cancers with well-established screening programmes. Only one natural history model, without a cost-effectiveness component, was identified. However, as this model simulated only secondary BC, it is not directly applicable to a screening population [38]. None of the cost-effectiveness models simulated a complete natural history (i.e. a progression of asymptomatic primary BC from cancer onset), which hinders cross-comparisons between modelling predictions. While there is some understanding of the BC risk factors, onset, progression and recurrence [11, 56], modelling the natural history of BC is constrained by a lack of direct or indirect data that are able to: (a) inform the progression of asymptomatic disease (e.g. dwell time) and (b) inform long-term clinical outcomes (e.g. survival) in complex individual-level models or when the model states are consistent with the detailed histology of the disease. The absence of natural history modelling is a general limitation of published BC screening models: such models are not nimble enough to compare different designs of screening programmes or to accurately predict the long-term effect of repeated screenings or the impact of screening on screening-related harms, such as overdiagnosis.
Modelling a complete natural history in screening models requires a complex structure and a lifetime horizon to capture the long-term effects and harms. Indirect parametrisation of such models (i.e. calibration of the parameters to inform the transitions in unobserved health states) is data intensive, requiring the prevalence of undiagnosed cancer, the speed of cancer growth or sojourn time, and the probability of spontaneous cancer regression or recurrence [47]. This in turn implies that more modelling inputs need to be evaluated for their quality in screening models than in diagnostic models.
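The idea of indirect parametrisation can be sketched with a deliberately simple example. The Python code below calibrates two unobserved transition probabilities (cancer onset and progression from undiagnosed to clinically detected disease) so that a three-state cohort model reproduces two observable targets. The model structure, the function names `run_model` and `calibrate`, the grid values and the targets are all hypothetical; the sketch ignores death, regression and recurrence, and uses a plain grid search rather than the more elaborate calibration methods used in practice.

```python
import itertools


def run_model(onset_prob, progression_prob, n_cycles=30):
    """Three-state cohort model: healthy -> preclinical (undiagnosed) -> clinical.

    Returns (preclinical prevalence after n_cycles, cumulative clinical incidence),
    both as proportions of the starting cohort.
    """
    healthy, preclinical, clinical = 1.0, 0.0, 0.0
    for _ in range(n_cycles):
        new_onset = healthy * onset_prob
        new_clinical = preclinical * progression_prob
        healthy -= new_onset
        preclinical += new_onset - new_clinical
        clinical += new_clinical
    return preclinical, clinical


def calibrate(target_prevalence, target_incidence, onset_grid, progression_grid):
    """Grid search for the parameter pair with the smallest squared relative error."""
    def error(params):
        prevalence, incidence = run_model(*params)
        return (((prevalence - target_prevalence) / target_prevalence) ** 2
                + ((incidence - target_incidence) / target_incidence) ** 2)

    return min(itertools.product(onset_grid, progression_grid), key=error)


# Hypothetical targets, generated here from known parameter values so that the
# search can be checked; real targets would come from registry or screening data.
targets = run_model(0.002, 0.25)
best = calibrate(*targets,
                 onset_grid=[0.001, 0.002, 0.003, 0.004],
                 progression_grid=[0.10, 0.25, 0.50])
print("Calibrated (onset, progression) probabilities per cycle:", best)
```

In this simplified structure, the mean sojourn time in the undiagnosed state is approximately the reciprocal of the progression probability per cycle. Each additional unobserved transition adds a parameter that must be identified from observable targets, which illustrates why screening models place heavier demands on data than diagnostic models.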
(4) Uncertainty in bladder cancer modelling
Structural and parameter uncertainty is common in screening and diagnostic BC models. This uncertainty relates to epistemic uncertainty in the applicability of data (e.g. using international data or assumptions), aleatory uncertainty associated with a frequent reliance on deterministic analysis, and a lack of validation or scenario analyses to explore uncertainty in model structures.
The parameter uncertainty in the identified models suggests a possible scarcity of sources to inform country-specific parameters and a need to assess the transferability of sources available for modelling. In particular, the data need to be improved to inform health-related utility values in BC models.
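One common way to propagate parameter uncertainty is a probabilistic sensitivity analysis, in which uncertain inputs are drawn repeatedly from distributions and the share of draws favouring the intervention is reported as a probability of cost effectiveness. The Python sketch below illustrates the principle only: the normal distributions, their means and standard deviations, the willingness-to-pay threshold and the function name `probability_cost_effective` are assumptions for this example and do not reproduce any reviewed study.

```python
import random


def probability_cost_effective(threshold=20000.0, n_draws=5000, seed=42):
    """Share of parameter draws in which the intervention has positive net benefit."""
    rng = random.Random(seed)
    favourable = 0
    for _ in range(n_draws):
        # Draw uncertain inputs from assumed (hypothetical) distributions
        d_cost = rng.gauss(300.0, 150.0)   # incremental cost per person
        d_qaly = rng.gauss(0.02, 0.015)    # incremental QALYs per person
        # Net monetary benefit avoids unstable ratios when d_qaly is near zero
        net_benefit = threshold * d_qaly - d_cost
        if net_benefit > 0:
            favourable += 1
    return favourable / n_draws


p = probability_cost_effective()
print(f"Probability cost effective at the assumed threshold: {p:.2f}")
```

Repeating the calculation over a range of thresholds gives a cost-effectiveness acceptability curve, which conveys decision uncertainty that a single deterministic estimate cannot show.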
While the clinical effect of medical interventions is generally considered to be generalisable, there may be specific considerations that make this less so for diagnostic tests, and especially for screening interventions [57, 58]. It is common for cancer screening models to assume that disease onset is a setting-specific transition relying on a set of risk factors, while cancer progression consists of generalisable parameters [59]. This assumption, mainly based on a lack of data to state otherwise, suggests that care should be taken when generalising the baseline disease risk from other settings [57]. Considering that all models were developed within the context of high-income countries, neither their outcomes nor their inputs are generalisable to middle-income or lower-income settings.
4.1 Implications for Research
While empirical evidence is necessary to inform the modelling parameters and to improve predictions, mathematical disease models are also used to inform trial design [60, 61]. As such, the development of trials that inform models, and of models that inform trials, should be an iterative process. This also suggests that BC models informed by limited trial data should be flexible enough to incorporate new data as they appear, especially where these have the potential to inform developments to the model structure in addition to simple parameter updates. The utility values for BC health states, as well as population preferences for different diagnostic and screening interventions, are currently not considered in the mathematical disease models and should be explored in future studies.
4.2 Limitations of the Review
Although this review sought to search the literature comprehensively, several limitations should be noted. Only one reviewer screened the initial abstracts, which may have resulted in missed studies or an unintentional bias in the initial search. To limit further bias, two independent reviewers assessed the full texts of the included publications and the quality of studies. Moreover, two of the included publications were grey-literature reports (i.e. publications that did not go through the formal peer-review process), which may not appear in a systematic search if a reproduction of the search strategy is attempted. To standardise the quality assessment, the Philips checklist [24] was used, with a very high average agreement rate among the raters (92%). However, some of its components, such as a short time horizon, are better suited to screening studies than to diagnostic studies. Moreover, some limitations of the appraised health economic studies may be explained by compliance with local guidelines. Finally, all the models that were included in this review were from high-income countries, and therefore may not be generalisable to other populations across the globe.
5 Conclusions
Although the evidence pertaining to the cost effectiveness of BC screening is consistent, it is still in its nascent stages. More data are needed to systematically address uncertainties in models, as well as the natural history of BC. This suggests that BC models are not nimble enough to compare different designs of screening programmes, or to predict screening-related harms such as overdiagnosis. Future clinical trials may help to decrease uncertainty in the structures and parameters of BC models, as all models rely on data. Once natural history models of BC are established, these models can then inform optimal population screening and surveillance strategies that may not be possible to evaluate in the scope of clinical trials.
References
Richters A, Aben KKH, Kiemeney LALM. The global burden of urinary bladder cancer: an update. World J Urol. 2020;38(8):1895–904.
Wong MCS, Fung FDH, Leung C, Cheung WWL, Goggins WB, Ng CF. The global epidemiology of bladder cancer: a joinpoint regression analysis of its incidence and mortality trends and projection. Sci Rep. 2018;8(1):1129.
Cumberbatch MGK, Jubber I, Black PC, Esperto F, Figueroa JD, Kamat AM, et al. Epidemiology of bladder cancer: a systematic review and contemporary update of risk factors in 2018. Eur Urol. 2018;74(6):784–95.
Saginala K, Barsouk A, Aluru JS, Rawla P, Padala SA, Barsouk A. Epidemiology of bladder cancer. Med Sci. 2020;8(1):15.
Cumberbatch MG, Rota M, Catto JW, La Vecchia C. The role of tobacco smoke in bladder and kidney carcinogenesis: a comparison of exposures and meta-analysis of incidence and mortality risks. Eur Urol. 2016;70(3):458–66.
Reed O, Jubber I, Griffin J, Noon AP, Goodwin L, Hussain S, et al. Occupational bladder cancer: a cross section survey of previous employments, tasks and exposures matched to cancer phenotypes. PLoS ONE. 2020;15(10):e0239338.
Cumberbatch MG, Cox A, Teare D, Catto JW. Contemporary occupational carcinogen exposure and bladder cancer: a systematic review and meta-analysis. JAMA Oncol. 2015;1(9):1282–90.
Cumberbatch MGK, Noon AP. Epidemiology, aetiology and screening of bladder cancer. Transl Androl Urol. 2019;8(1):5–11.
Jubber I, Shariat SF, Conroy S, Tan WS, Gordon PC, Lotan Y, et al. Non-visible haematuria for the detection of bladder, upper tract, and kidney cancer: an updated systematic review and meta-analysis. Eur Urol. 2020;77(5):583–98.
Kamat AM, Hahn NM, Efstathiou JA, Lerner SP, Malmström P-U, Choi W, et al. Bladder cancer. Lancet. 2016;388(10061):2796–810.
Fujii Y. Prediction models for progression of non-muscle-invasive bladder cancer: a review. Int J Urol. 2018;25(3):212–8.
Larré S, Catto JW, Cookson MS, Messing EM, Shariat SF, Soloway MS, et al. Screening for bladder cancer: rationale, limitations, whom to target, and perspectives. Eur Urol. 2013;63(6):1049–58.
Soria F, Krabbe L-M, Todenhöfer T, Dobruch J, Mitra AP, Inman BA, et al. Molecular markers in bladder cancer. World J Urol. 2019;37(1):31–40.
Linder BJ, Bass EJ, Mostafid H, Boorjian SA. Guideline of guidelines: asymptomatic microscopic haematuria. BJU Int. 2018;121(2):176–83.
Michaeli JC, Boch T, Albers S, Michaeli T, Michaeli DT. Socio-economic burden of disease: survivorship costs for bladder cancer. J Cancer Policy. 2022;32:100326.
Yeung C, Dinh T, Lee J. The health economics of bladder cancer: an updated review of the published literature. Pharmacoeconomics. 2014;32(11):1093–104.
van Rhijn BWG, van der Poel HG, van der Kwast TH. Urine markers for bladder cancer surveillance: a systematic review. Eur Urol. 2005;47(6):736–48.
Wang Z, Que H, Suo C, Han Z, Tao J, Huang Z, et al. Evaluation of the NMP22 BladderChek test for detecting bladder cancer: a systematic review and meta-analysis. Oncotarget. 2017;8(59):100648–56.
Gandhi N, Krishna S, Booth CM, Breau RH, Flood TA, Morgan SC, et al. Diagnostic accuracy of magnetic resonance imaging for tumour staging of bladder cancer: systematic review and meta-analysis. BJU Int. 2018;122(5):744–53.
Mandrik O, Severens JL, Bardach A, Ghabri S, Hamel C, Mathes T, et al. Critical appraisal of systematic reviews with costs and cost-effectiveness outcomes: an ISPOR Good Practices Task Force Report. Value Health. 2021;24(4):463–72.
Glanville J, Bayliss S, Booth A, Dundar Y, Fernandes H, Fleeman ND, et al. So many filters, so little time: the development of a search filter appraisal checklist. J Med Libr Assoc. 2008;96(4):356–61.
Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan: a web and mobile app for systematic reviews. Syst Rev. 2016;5(1):210.
Brennan A, Chick SE, Davies R. A taxonomy of model structures for economic evaluation of health technologies. Health Econ. 2006;15(12):1295–310.
Philips Z, Ginnelly L, Sculpher M, Claxton K, Golder S, Riemsma R, et al. Review of guidelines for good practice in decision-analytic modelling in health technology assessment. Health Technol Assess. 2004;8(36):iii–iv, ix–xi, 1–158.
Bilcke J, Beutels P, Brisson M, Jit M. Accounting for methodological, structural, and parameter uncertainty in decision-analytic models: a practical guide. Med Decis Mak. 2011;31(4):675–92.
Office for National Statistics. Consumer price inflation tables. 2022. https://www.ons.gov.uk/economy/inflationandpriceindices/datasets/consumerpriceinflation. Accessed 21 Sept 2022.
US Bureau of Labor Statistics. Consumer price index. 2022. https://www.bls.gov/cpi. Accessed 21 Sept 2022.
Organisation for Economic Co-operation and Development. Purchasing power parities (PPP). 2021. https://data.oecd.org/conversion/purchasing-power-parities-ppp.htm. Accessed 21 Sept 2022.
Okubo R, Hoshi SL, Kimura T, Kondo M, Asahi K, Iseki C, et al. Cost-effectiveness of mass screening for dipstick hematuria in Japan. Clin Exp Nephrol. 2022;26(5):398–412.
de Bekker-Grob EW, van der Aa MN, Zwarthoff EC, Eijkemans MJ, van Rhijn BW, van der Kwast TH, et al. Non-muscle-invasive bladder cancer surveillance for which cystoscopy is partly replaced by microsatellite analysis of urine: a cost-effective alternative? BJU Int. 2009;104(1):41–7.
Lotan Y, Svatek RS. Cost-effectiveness of bladder cancer screening. Expert Rev Pharmacoecon Outcomes Res. 2007;7(6):627–32.
Svatek RS, Sagalowsky AI, Lotan Y. Economic impact of screening for bladder cancer using bladder tumor markers: a decision analysis. Urol Oncol. 2006;24(4):338–43.
Sheh KT. An early health economic model of targeted screening for bladder and kidney cancer. In: Sheffield TUO, editor. HAR673 dissertation. The School of Health and Related Research, The University of Sheffield, Sheffield; 2018: p. 79.
Lotan Y, Svatek RS, Sagalowsky AI. Should we screen for bladder cancer in a high-risk population? A cost per life-year saved analysis. Cancer. 2006;107(5):982–90.
Al Hussein Al Awamlh B, Lee R, Chughtai B, Donat SM, Sandhu JS, Herr HW. A cost-effectiveness analysis of management of low-risk non-muscle-invasive bladder cancer using office-based fulguration. Urology. 2015;85(2):381–6.
Garfield SS, Gavaghan MB, Armstrong SO, Jones JS. The cost-effectiveness of blue light cystoscopy in bladder cancer detection: United States projections based on clinical data showing 4.5 years of follow up after a single hexaminolevulinate hydrochloride instillation. Can J Urol. 2013;20(2):6682–9.
Sutton AJ, Lamont JV, Evans RM, Williamson K, O’Rourke D, Duggan B, et al. An early analysis of the cost-effectiveness of a diagnostic classifier for risk stratification of haematuria patients (DCRSHP) compared to flexible cystoscopy in the diagnosis of bladder cancer. PLoS ONE. 2018;13(8):e0202796.
Yuan Z. A partially observable Markov decision process for optimal design of surveillance policies for bladder cancer. In: North Carolina State University, Raleigh, North Carolina. editor. NC; 2012: p. 96.
Mowatt G, Zhu S, Kilonzo M, Boachie C, Fraser C, Griffiths TR, et al. Systematic review of the clinical effectiveness and cost-effectiveness of photodynamic diagnosis and urine biomarkers (FISH, ImmunoCyt, NMP22) and cytology for the detection and follow-up of bladder cancer. Health Technol Assess. 2010;14(4):1–331, iii–iv.
Lotan Y, Woldu SL, Sanli O, Black P, Milowsky MI. Modelling cost-effectiveness of a biomarker-based approach to neoadjuvant chemotherapy for muscle-invasive bladder cancer. BJU Int. 2018;122(3):434–40.
Halpern JA, Chughtai B, Ghomrawi H. Cost-effectiveness of common diagnostic approaches for evaluation of asymptomatic microscopic hematuria. JAMA Intern Med. 2017;177(6):800–7.
Rodgers M, Nixon J, Hempel S, Aho T, Kelly J, Neal D, et al. Diagnostic tests and algorithms used in the investigation of haematuria: systematic reviews and economic evaluation. Health Technol Assess. 2006;10(18):iii–iv, xi–259.
Dansk V, Malmström P-U, Bläckberg M, Malmenäs M. Hexaminolevulinate hydrochloride blue-light flexible cystoscopy in the detection and follow-up of nonmuscle-invasive bladder cancer: cost consequences during outpatient surveillance in Sweden. Future Oncol. 2016;12(8):1025–38.
Klaassen Z, Li K, Kassouf W, Black PC, Dragomir A, Kulkarni GS. Contemporary cost-consequence analysis of blue light cystoscopy with hexaminolevulinate in non-muscle-invasive bladder cancer. Can Urol Assoc J. 2017;11(6):173–81.
Rose JB, Armstrong S, Hermann GG, Kjellberg J, Malmström PU. Budget impact of incorporating one instillation of hexaminolevulinate hydrochloride blue-light cytoscopy in transurethral bladder tumour resection for patients with non-muscle-invasive bladder cancer in Sweden. BJU Int. 2016;117(6b):E102–13.
Georgieva MV, Wheeler SB, Erim D, Smith-Bindman R, Loo R, Ng C, et al. Comparison of the harms, advantages, and costs associated with alternative guidelines for the evaluation of hematuria. JAMA Intern Med. 2019;179(10):1352–62.
Mossanen M, Wang Y, Szymaniak J, Tan WS, Huynh MJ, Preston MA, et al. Evaluating the cost of surveillance for non-muscle-invasive bladder cancer: an analysis based on risk categories. World J Urol. 2019;37(10):2059–65.
Kulkarni GS, Alibhai SM, Finelli A, Fleshner NE, Jewett MA, Lopushinsky SR, et al. Cost-effectiveness analysis of immediate radical cystectomy versus intravesical Bacillus Calmette–Guerin therapy for high-risk, high-grade (T1G3) bladder cancer. Cancer. 2009;115(23):5450–9.
Sylvester RJ, van der Meijden AP, Oosterlinck W, Witjes JA, Bouffioux C, Denis L, et al. Predicting recurrence and progression in individual patients with stage Ta T1 bladder cancer using EORTC risk tables: a combined analysis of 2596 patients from seven EORTC trials. Eur Urol. 2006;49(3):466–75 (discussion 75–7).
Kim DD, Basu A. How does cost-effectiveness analysis inform health care decisions? AMA J Ethics. 2021;23(8):E639–47.
Mariotto AB, Yabroff KR, Shao Y, Feuer EJ, Brown ML. Projections of the cost of cancer care in the United States: 2010–2020. J Natl Cancer Inst. 2011;103(2):117–28.
Wang SY, Hsu SH, Gross CP, Sanft T, Davidoff AJ, Ma X, et al. Association between time since cancer diagnosis and health-related quality of life: a population-level analysis. Value Health. 2016;19(5):631–8.
Mandrik O, Ekwunife OI, Meheus F, Severens JL, Lhachimi S, Uyl-de Groot CA, et al. Systematic reviews as a “lens of evidence”: determinants of cost-effectiveness of breast cancer screening. Cancer Med. 2019;8(18):7846–58.
Goldie SJ, Kim JJ, Myers E. Chapter 19: cost-effectiveness of cervical cancer screening. Vaccine. 2006;24:S164–70.
Silva-Illanes N, Espinoza M. Critical analysis of Markov models used for the economic evaluation of colorectal cancer screening: a systematic review. Value Health. 2018;21(7):858–73.
Martini A, Sfakianos JP, Renström-Koskela L, Mortezavi A, Falagario UG, Egevad L, et al. The natural history of untreated muscle-invasive bladder cancer. BJU Int. 2020;125(2):270–5.
Drummond M, Barbieri M, Cook J, Glick HA, Lis J, Malik F, et al. Transferability of economic evaluations across jurisdictions: ISPOR good research practices task force report. Value Health. 2009;12(4):409–18.
Daubner-Bendes R, Kovács S, Niewada M, Huic M, Drummond M, Ciani O, et al. Quo vadis HTA for medical devices in central and eastern Europe? Recommendations to address methodological challenges. Front Public Health. 2020;8:612410.
Gopalappa C, Guo J, Meckoni P, Munkhbat B, Pretorius C, Lauer J, et al. A two-step Markov processes approach for parameterization of cancer state-transition models for low- and middle-income countries. Med Decis Mak. 2018;38(4):520–30.
Schiffer JT, Schiffer CA. To what extent can mathematical modeling inform the design of clinical trials? The example of safe dose reduction of tyrosine kinase inhibitors in responding patients with chronic myeloid leukemia. Haematologica. 2018;103(11):1756–7.
Herzog SA, Blaizot S, Hens N. Mathematical models used to inform study design or surveillance systems in infectious diseases: a systematic review. BMC Infect Dis. 2017;17(1):775.
Ethics declarations
Funding
This study was supported by the Yorkshire Cancer Research UK (Grant number RA/2019/R1/004). James W.F. Catto is funded by an NIHR Research Professorship.
Conflicts of interest/competing interests
James W.F. Catto reported receiving reimbursement for consultancy from AstraZeneca, Ferring, Roche and Janssen; speaker fees from Bristol Myers Squibb, Merck Sharp & Dohme, Janssen, Astellas, Nucleix and Roche; honoraria for membership in advisory boards from Ferring, Roche, Gilead, Photocure, Bristol Myers Squibb, QED Therapeutics and Janssen; and research funding from Roche. Olena Mandrik, Anne I. Hahn, Ann G. Zauber, Marcus Cumberbatch and James Chilcott have no conflicts of interest that are directly relevant to the content of this article.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Availability of data and material
Not applicable.
Code availability
Not applicable.
Authors’ contributions
The work reported in the paper has been performed by the authors, unless clearly specified in the text. Conceptualization: OM, AIH, AGZ, JC; methodology: OM, JC; analysis and interpretation of data: OM, AIH; drafting of the manuscript: OM, AIH, JWFC, AGZ, MC, JC; funding acquisition: JWFC, JC.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mandrik, O., Hahn, A.I., Catto, J.W.F. et al. Critical Appraisal of Decision Models Used for the Economic Evaluation of Bladder Cancer Screening and Diagnosis: A Systematic Review. PharmacoEconomics 41, 633–650 (2023). https://doi.org/10.1007/s40273-023-01256-9
DOI: https://doi.org/10.1007/s40273-023-01256-9