Introduction

Osteoporosis is a chronic disease characterized by low bone mass and abnormal bone microarchitecture, which predisposes affected individuals to fragility fractures, the majority of which are sustained by women [1]. In Japan, an estimated 15 million people have osteoporosis [2], with 153,000 hip fractures occurring in 2010, rising to a predicted 238,000 hip fractures in 2030 [3]. Osteoporotic fractures produce a substantial financial burden in Japan; hip and vertebral fractures incurred an estimated cost of JPY ¥27.5 billion in 2009 [4].

Antiresorptive drugs, including bisphosphonates and denosumab, improve bone strength by inhibiting bone remodeling. They have been a mainstay of pharmacological osteoporosis treatment and are the agents considered for long-term therapy [1]. In contrast, bone-forming agents, a category which includes romosozumab, teriparatide, and abaloparatide, promote bone formation, increasing bone mass faster and to a greater extent than antiresorptives [1, 5, 6]. Both teriparatide and romosozumab have been shown to be superior to oral bisphosphonates in reducing fracture risk in patients at high risk of fracture [5, 6]. Bone-forming agents are given for 1 to 2 years prior to sequencing to an antiresorptive [1, 7, 8].

Romosozumab is a humanized monoclonal antibody that, by inhibiting the protein sclerostin, has both a bone-forming and resorption-inhibiting effect [5]. Teriparatide is a synthetic peptide identical to the first 34 amino acids (N-terminus) of human parathyroid hormone, which increases both bone formation and resorption, resulting in a net increase in bone mass [9]. The relative efficacy of these two bone-forming agents has been assessed by the STRUCTURE trial in women with postmenopausal osteoporosis previously treated with bisphosphonates [10]. In this study, the mean percentage change from baseline in total hip areal bone mineral density (BMD) at 12 months was 3.4 percentage points higher with romosozumab than with teriparatide.

Economic modelling is commonly used to assess the relative cost effectiveness of pharmacological osteoporosis treatments [11, 12, 13]. Analyses would ideally use direct comparisons of fracture risk reduction in head-to-head randomized controlled trials (RCTs) to inform treatment efficacy. However, no such fracture evidence exists for the comparison of romosozumab versus teriparatide; only direct BMD evidence is available [10]. Indirect comparisons of fracture outcomes for these therapies would necessitate synthesis of trials with heterogeneous patient populations, time horizons, and follow-on therapies, thus producing substantial uncertainty in resulting estimates.

A meta-regression of RCTs of pharmacological osteoporosis therapies, conducted by the Foundation for the National Institutes of Health (FNIH), demonstrated significant associations between treatment-related changes in BMD and fracture risk reduction [14]. A statistically significant linear relationship between percentage total hip BMD change from baseline and relative risks of hip and vertebral fracture on a log scale was observed. This relationship from the meta-regression provides a method to estimate relative risks of fracture from BMD outcomes to inform economic analyses.

The aim of this study is to evaluate the cost effectiveness of romosozumab compared with teriparatide, both sequenced to alendronate (romosozumab/alendronate versus teriparatide/alendronate) in women with severe postmenopausal osteoporosis previously treated with bisphosphonates in a Japanese healthcare system setting, using BMD efficacy outcomes from the STRUCTURE trial to inform relative fracture incidence. Alendronate was selected as the sequential treatment since it is the most commonly prescribed antiresorptive in Japan [15].

Methods

Model overview

A Markov cohort model was used to assess lifetime costs and quality-adjusted life years (QALYs) associated with romosozumab/alendronate and teriparatide/alendronate. The model used a discount rate of 2% per annum for both costs and health outcomes in line with Japanese guidelines [16], and was conducted from the perspective of the Japanese healthcare system. Costs were assessed in 2020 US dollars.
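The discounting rule described above can be sketched as follows. This is a minimal illustration, assuming that outcomes are discounted at the start of each 6-month cycle and that no half-cycle correction is applied (the text states neither detail); the function names and the example stream are hypothetical.

```python
CYCLE_YEARS = 0.5        # 6-month cycle length, as in the model description
ANNUAL_DISCOUNT = 0.02   # 2% per annum for both costs and health outcomes


def discount_factor(cycle_index, annual_rate=ANNUAL_DISCOUNT, cycle_years=CYCLE_YEARS):
    """Present-value weight for a 0-based cycle (assumed: start-of-cycle timing)."""
    elapsed_years = cycle_index * cycle_years
    return 1.0 / (1.0 + annual_rate) ** elapsed_years


def discounted_total(per_cycle_values, annual_rate=ANNUAL_DISCOUNT):
    """Sum a stream of per-cycle costs or QALYs after discounting."""
    return sum(
        value * discount_factor(i, annual_rate)
        for i, value in enumerate(per_cycle_values)
    )


# Hypothetical stream: 0.4 QALYs accrued in each of four cycles (2 years).
example_qalys = discounted_total([0.4, 0.4, 0.4, 0.4])
```

The same function serves costs and QALYs because the model applies one rate to both.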

The model structure has been described previously [17], and, like a number of previous analyses, is based on the model developed by the International Osteoporosis Foundation [11, 12, 18]. The model consisted of 7 Markov health states (shown in Fig. 1) and used a 6-month cycle length. All patients start in the “baseline” health state, and in each cycle are at risk of sustaining a fracture (hip, vertebral, or “other” fracture) or death. One year after the event, patients with a hip or vertebral fracture transition to the “post-hip fracture” or “post-vertebral fracture” state respectively, whereas patients with an “other” fracture return to the “baseline” state.

Fig. 1 Markov structure of the model. Adapted from O’Hanlon et al. [17]. Death can occur from any other state; for simplicity, these transitions are not shown.

The model has a hierarchical structure, where patients in hip fracture states can only sustain another hip fracture, and patients in vertebral fracture states can only sustain a hip fracture or another vertebral fracture. To correct for the underestimation of fracture incidence imposed by the hierarchical structure, incidence of omitted “lower hierarchy” fractures and their impact on costs and QALYs was estimated separately from the Markov process, as described previously [18]. In each cycle, the proportion of the cohort in each of the higher hierarchy states (hip and vertebral fracture) was multiplied by rates of lower hierarchy fractures. Total costs and QALYs in each cycle were correspondingly adjusted to reflect the healthcare cost and health-related quality of life (HRQoL) reduction associated with the additional fractures.
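The correction for omitted lower hierarchy fractures can be sketched as below. This is an illustration under stated assumptions rather than the published implementation: the "hip" and "vertebral" groups are taken to pool the fracture and post-fracture states, rates are assumed to be expressed per cycle, and all names and inputs are hypothetical.

```python
# Fracture types the Markov process cannot record for patients in each
# higher-hierarchy group, following the hierarchy described in the text.
OMITTED_BY_STATE = {
    "hip": ("vertebral", "other"),
    "vertebral": ("other",),
}


def omitted_fractures(state_share, cycle_rate):
    """Expected omitted fractures per cohort member in one cycle, by type."""
    extra = {"vertebral": 0.0, "other": 0.0}
    for state, omitted_types in OMITTED_BY_STATE.items():
        for fracture_type in omitted_types:
            extra[fracture_type] += state_share[state] * cycle_rate[fracture_type]
    return extra


def adjust_totals(cost, qalys, extra, unit_cost, qaly_loss):
    """Add the cost and subtract the QALY loss of the omitted fractures."""
    for fracture_type, count in extra.items():
        cost += count * unit_cost[fracture_type]
        qalys -= count * qaly_loss[fracture_type]
    return cost, qalys


# Hypothetical cycle: 10% of the cohort in hip states, 20% in vertebral states.
extra = omitted_fractures(
    {"hip": 0.10, "vertebral": 0.20},
    {"vertebral": 0.02, "other": 0.03},
)
```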

Patient population

The patient population consisted of Japanese postmenopausal women with severe osteoporosis, characterized by a BMD T-score ≤−2.5 and a history of fragility fracture (consistent with the World Health Organization definition of severe osteoporosis [19]), with a mean age of 78 years at baseline, based on the characteristics of females with an osteoporotic fracture from a medical claims database in Japan [20]. The assumption was made that half of the population had experienced one previous fracture, and half had experienced multiple prior fractures, as clinical evidence indicates that, among women with a history of fracture, the proportion of those with single versus multiple prior events is approximately even [21]. Consistent with the population of the STRUCTURE trial [10], the modelled population consisted of patients who were previously treated with bisphosphonates.

Treatment duration

In the model, patients in both arms received a total of 5 years of treatment, in line with previously conducted economic analyses [11, 12, 18]. Patients in the romosozumab/alendronate arm received 1 year of romosozumab, followed by 4 years of alendronate. Patients in the teriparatide/alendronate arm received 2 years of teriparatide, followed by 3 years of alendronate, since this is consistent with the maximum duration of teriparatide treatment [22], and is a common intended treatment duration for teriparatide in Japan [23].

In practice, persistence with pharmacological osteoporosis treatments is imperfect [24]. However, since no real-world persistence data currently exist for romosozumab, and because the controlled conditions of a randomized trial are unlikely to represent treatment discontinuation in practice, full persistence with all treatments was assumed in the model base case, with the impact of modelling imperfect persistence explored in scenario analysis.

After treatment discontinuation, the fracture reduction benefit does not stop immediately, but continues for a period of time (the “offset time”). Clinical evidence has shown that after 5 years’ treatment with alendronate followed by 5 years’ treatment with placebo, patients’ BMD remained at or above pre-treatment levels [25], indicating that the benefit of pharmacological treatment persists for some time. Therefore, in the model, the assumption was made that, after the 5-year treatment course, fracture reduction benefit declines linearly to 0 over the course of a further 5 years for both arms. To address uncertainty in the offset time, scenario analyses were conducted where fracture reduction benefit declines to 0 over the course of 1 year and 10 years.
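A minimal sketch of the offset-time assumption follows. It assumes that the linear decline is applied to the risk reduction (1 minus the relative risk), which is one plausible reading of the text; the function names and the example relative risk are hypothetical.

```python
def residual_benefit(years_since_start, treatment_years=5.0, offset_years=5.0):
    """Fraction of the on-treatment fracture reduction still in effect."""
    if years_since_start <= treatment_years:
        return 1.0
    if offset_years <= 0:
        return 0.0
    elapsed = years_since_start - treatment_years
    return max(0.0, 1.0 - elapsed / offset_years)


def relative_risk_at(years_since_start, rr_on_treatment,
                     treatment_years=5.0, offset_years=5.0):
    """Relative risk of fracture versus no treatment at a given model time."""
    fraction = residual_benefit(years_since_start, treatment_years, offset_years)
    return 1.0 - fraction * (1.0 - rr_on_treatment)


# Hypothetical on-treatment relative risk of 0.6, evaluated mid-offset.
rr_mid_offset = relative_risk_at(7.5, 0.6)
```

The scenario analyses correspond to calling the same functions with `offset_years=1.0` or `offset_years=10.0`.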

Fracture incidence and treatment efficacy

Patients’ fracture risk in the model was composed of three elements: (1) age-specific general population fracture incidence, (2) the increased risk of fracture in the severe osteoporotic population, and (3) the risk reduction associated with treatment.

General population hip, vertebral, and “other” fracture rates for each year of age were informed by risk equations developed by a previous economic evaluation using Japanese epidemiological survey data [26]. Resulting fracture rates are summarized for 5-year age bands in Table 1.

Table 1 Epidemiological inputs used in the model (fracture incidence, prevalence, and mortality)

To calculate fracture risks in women with severe postmenopausal osteoporosis, general population rates were adjusted using relative risks of fracture per standard deviation decline in BMD T-score, to account for the BMD distribution of patients with a T-score ≤ −2.5, and relative risks of fracture in patients with a single fracture or multiple prior fractures versus no prior fracture [21, 27, 29], in conjunction with data on the general population prevalence of morphometric vertebral fracture, as described previously [17]. Data used in these calculations are shown in Table 1.

To calculate fracture incidence in treated patients, treatment efficacy estimates (relative risks of fracture) were applied to fracture risks in severe postmenopausal osteoporotic patients.
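The three-element composition of fracture risk can be illustrated as below. The multiplicative combination follows the text; the conversion from an annual rate to a 6-month probability via an exponential formula is an assumption, and the input values are hypothetical.

```python
import math


def treated_cycle_probability(general_rate, rr_severe, rr_treatment, cycle_years=0.5):
    """Per-cycle fracture probability for a treated, severely osteoporotic patient.

    general_rate : annual general population fracture rate at the current age
    rr_severe    : relative risk for the severe osteoporotic population
    rr_treatment : relative risk of fracture on treatment
    """
    annual_rate = general_rate * rr_severe * rr_treatment
    # Assumed rate-to-probability conversion for one cycle.
    return 1.0 - math.exp(-annual_rate * cycle_years)


# Hypothetical inputs: 1% annual rate, doubled risk, halved by treatment.
p_treated = treated_cycle_probability(0.01, 2.0, 0.5)
p_untreated = treated_cycle_probability(0.01, 2.0, 1.0)
```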

For patients receiving romosozumab/alendronate, relative fracture incidence compared with placebo was calculated indirectly from two sources: a comparison of romosozumab/alendronate versus alendronate alone from the ARCH trial (described in Saag et al. [5]) and a comparison of alendronate versus placebo from a network meta-analysis of osteoporosis therapies [32]. Due to their different modes of action, and the presence or absence of a sequential therapy, it is likely that the pattern of fracture reduction benefits over time differs for regimens containing a bone-forming agent versus antiresorptives alone. Therefore, the relative efficacy of romosozumab/alendronate versus alendronate was estimated time dependently, with relative risks calculated for each 6-month model cycle over the course of the 5-year treatment period.

To achieve this, parametric survival functions were fitted to time-to-event data for hip and nonvertebral fractures from the ARCH trial. Model fitting was applied separately by arm, to allow for changing relative efficacy over time. For each arm and fracture type, the parametric functional form with the lowest Akaike information criterion (AIC) was selected, from exponential, Weibull, log-logistic, and lognormal distributions. Selected regression parameters are shown in Supplementary Table 1. This method was not possible for vertebral fractures, since they are not always identified immediately, meaning that accurate time-to-event data were not available. Therefore, the relative risk of new vertebral fracture for romosozumab/alendronate versus alendronate for months 1–12 was applied in the first year of the model, and the relative risk for months 13–24 was applied for subsequent cycles (months 13–60 in the model). Treatment efficacy data for alendronate versus placebo and romosozumab/alendronate versus alendronate are shown in Table 2.
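The selection of survival functions by AIC and the derivation of cycle-specific relative risks might be sketched as follows. This is not the fitting code used in the study: the log-likelihoods and Weibull parameters are hypothetical placeholders, only one of the four candidate distributions is written out, and the Weibull parameterization S(t) = exp(-(t/scale)^shape) is an assumption.

```python
import math


def aic(n_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


def select_by_aic(fits):
    """fits maps a distribution name to (n_parameters, log_likelihood)."""
    return min(fits, key=lambda name: aic(*fits[name]))


def weibull_survival(t_years, shape, scale):
    """Probability of remaining fracture free up to time t."""
    return math.exp(-((t_years / scale) ** shape))


def cycle_event_probability(survival, t_start, t_end):
    """Probability of an event in (t_start, t_end], given survival to t_start."""
    return 1.0 - survival(t_end) / survival(t_start)


def cycle_relative_risks(survival_a, survival_b, n_cycles=10, cycle_years=0.5):
    """Relative risk of arm A versus arm B in each cycle of the treatment period."""
    risks = []
    for cycle in range(n_cycles):
        t0, t1 = cycle * cycle_years, (cycle + 1) * cycle_years
        risks.append(
            cycle_event_probability(survival_a, t0, t1)
            / cycle_event_probability(survival_b, t0, t1)
        )
    return risks


# Hypothetical fits for one arm and fracture type.
best = select_by_aic({"exponential": (1, -120.0), "weibull": (2, -115.0)})

# Hypothetical arm-specific curves; 10 cycles cover the 5-year treatment period.
rr_by_cycle = cycle_relative_risks(
    lambda t: weibull_survival(t, shape=0.9, scale=60.0),
    lambda t: weibull_survival(t, shape=1.1, scale=40.0),
)
```

Fitting each arm separately is what allows the ratio to change from cycle to cycle; a single proportional-hazards model would force it to stay constant.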

Table 2 Efficacy data used in the model (relative risks of fracture)

For patients in the teriparatide/alendronate arm, fracture rates were calculated by applying relative risks to fracture incidence in patients receiving romosozumab/alendronate. Relative risks for romosozumab versus teriparatide were estimated using BMD efficacy outcomes from the STRUCTURE trial [10], converted to fracture outcomes using the relationship between percentage total hip BMD change from baseline and relative risks of hip, vertebral, and nonvertebral fracture on the log scale, provided by the meta-regression conducted by the FNIH [14]. While the regression equations were not directly reported in the publication, fracture risk reductions associated with a 2%, 4%, and 6% improvement in BMD were available, which were used to reproduce the parameters. The slopes of these equations were used to translate the difference of 3.4% total hip BMD change from baseline for romosozumab versus teriparatide at 12 months in the STRUCTURE trial [10] into relative risks of fracture. BMD at the total hip was used in these calculations due to its high predictive value for fractures [14]. Resulting relative risks of fracture for romosozumab versus teriparatide are shown in Table 2. Full details of their derivation are shown in Supplementary Table 2. The assumption was made that the fracture reduction benefit of romosozumab versus teriparatide persists until patients in both arms have transitioned to alendronate (i.e., for 2 years), after which romosozumab/alendronate and teriparatide/alendronate were assumed to be equally efficacious. This assumption was explored in scenario analysis.
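The conversion from a BMD difference to a relative risk of fracture can be sketched as below. The log-linear form, ln(RR) = slope × ΔBMD, follows the description above, but the risk reductions used here are illustrative placeholders and not the values reported by the FNIH meta-regression; fitting the slope by ordinary least squares is also an assumption about how the parameters were reproduced.

```python
import math


def log_rr_slope(bmd_gains_pct, risk_reductions):
    """Least-squares slope of ln(RR) against percentage BMD improvement."""
    xs = list(bmd_gains_pct)
    ys = [math.log(1.0 - reduction) for reduction in risk_reductions]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def rr_from_bmd_difference(slope, bmd_difference_pct):
    """Relative risk implied by a between-arm difference in BMD change."""
    return math.exp(slope * bmd_difference_pct)


# Placeholder risk reductions at 2%, 4%, and 6% BMD improvement (NOT FNIH data).
placeholder_reductions = [0.20, 0.36, 0.488]
slope = log_rr_slope([2.0, 4.0, 6.0], placeholder_reductions)

# 3.4 is the total hip BMD difference at 12 months reported for STRUCTURE.
rr_romo_vs_teri = rr_from_bmd_difference(slope, 3.4)
```

Only the slope is needed because the quantity being converted is a difference between two arms, so any intercept cancels.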

The STRUCTURE trial was conducted in subjects previously treated with bisphosphonates. Given that newer osteoporosis therapies are often used as second-line treatments [34], this is likely to be consistent with the characteristics of patients receiving bone-forming agents in clinical practice. However, to assess the cost effectiveness of romosozumab/alendronate versus teriparatide/alendronate in a treatment-naïve population, a scenario analysis was conducted where BMD efficacy was taken from a romosozumab phase II trial (2.8% difference in total hip change from baseline at 12 months), conducted in patients who were not bisphosphonate pre-treated [33]. In addition, a conservative scenario analysis was conducted where romosozumab/alendronate and teriparatide/alendronate were assumed to be equally efficacious for the entire duration of treatment.

To validate the approach of converting BMD to fracture outcomes, BMD-predicted and observed relative risks of fracture at 12 months were compared for romosozumab trials reporting both outcomes: FRAME (romosozumab versus placebo) [35] and ARCH (romosozumab versus alendronate) [5]. Outcomes, displayed in Supplementary Table 3, show that observed and BMD-calculated relative risks are similar, with p values showing no evidence of difference between observed and predicted values.

Mortality

Age-specific general population mortality was informed by life tables for Japan, to which age-specific relative risks were applied to account for excess mortality in patients who sustained a fracture [11, 30]. Inputs relating to mortality are shown in Table 1. In line with previous analyses [11, 12, 17], the assumption was made that excess mortality persists for 8 years for hip and vertebral fracture and 1 year for “other fracture,” and that only 30% of excess mortality following fracture is directly caused by the event.
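One way to express the excess mortality assumptions is sketched below. The text does not state how the 30% attribution is applied; scaling the excess part of the relative risk is a common convention and is assumed here. The function names and example inputs are hypothetical.

```python
# Years for which excess mortality persists after each fracture type.
EXCESS_YEARS = {"hip": 8, "vertebral": 8, "other": 1}


def attributable_mortality_rr(observed_rr, causal_share=0.30):
    """Relative risk after keeping only the share of excess caused by the fracture."""
    return 1.0 + causal_share * (observed_rr - 1.0)


def post_fracture_mortality(baseline_rate, observed_rr, fracture_type,
                            years_since_fracture, causal_share=0.30):
    """Mortality rate for a patient at a given time since fracture."""
    if years_since_fracture >= EXCESS_YEARS[fracture_type]:
        return baseline_rate
    return baseline_rate * attributable_mortality_rr(observed_rr, causal_share)


# Hypothetical: observed relative risk of 3.0 two years after a hip fracture.
rate = post_fracture_mortality(0.05, 3.0, "hip", years_since_fracture=2)
```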

Costs

Two categories of cost were included in the model: treatment costs and fracture-related morbidity costs. Where required, costs were adjusted to 2020 values using Consumer Price Index data and converted from Japanese yen to US dollars. All cost inputs are shown in Table 3.

Table 3 Cost and health-related quality of life inputs (2020 US dollars)

Treatment costs included drug acquisition costs and treatment management costs. For drug acquisition costs, annual prices of romosozumab (EVENITY®; monthly 210 mg subcutaneous injection; $5479/year), teriparatide (Forteo®; daily 20 μg subcutaneous injection; $5217/year) and alendronate (Fosamax®; weekly 35 mg oral dose; $269/year) were informed by Japanese list prices [36]. In a scenario analysis, the list price of Teribone®, rather than Forteo, was used for teriparatide.
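As a worked check of the drug acquisition inputs, the cost of each 5-year course can be computed from the annual prices and the treatment durations used in the model. The sketch below is undiscounted, assumes full persistence, and excludes treatment management costs; the variable names are hypothetical.

```python
ANNUAL_PRICE_USD = {"romosozumab": 5479, "teriparatide": 5217, "alendronate": 269}

# (drug, years on drug) for each regimen, as specified for the base case.
REGIMENS = {
    "romosozumab/alendronate": [("romosozumab", 1), ("alendronate", 4)],
    "teriparatide/alendronate": [("teriparatide", 2), ("alendronate", 3)],
}


def course_drug_cost(regimen):
    """Undiscounted drug acquisition cost of a full 5-year course."""
    return sum(ANNUAL_PRICE_USD[drug] * years for drug, years in REGIMENS[regimen])


romo_course = course_drug_cost("romosozumab/alendronate")
teri_course = course_drug_cost("teriparatide/alendronate")
drug_cost_difference = teri_course - romo_course
```

The difference arises mainly because teriparatide is given for 2 years at a similar annual price to 1 year of romosozumab.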

Treatment management costs were informed by the Japanese medical fee schedule [37] and included the cost of a DXA and bone marker test for all patients receiving treatment. While receiving romosozumab, patients incurred the cost of 12 physician visits per year for treatment administration. Patients receiving teriparatide incurred the cost of 13 physician visits per year for repeat prescriptions (based on a Forteo pre-filled pen providing 28 doses). While receiving alendronate, patients incurred the cost of a physician visit every 2 months.

Fracture morbidity costs included the direct medical cost of fracture [20, 38], applied for the year in which fracture occurs, and long-term care costs, where a proportion of patients (26.6% [38]) enter long-term care following hip fracture. These patients are assumed to incur the cost of institutional nursing home care ($45/day [38]) until death.

Health-related quality of life

Health-related quality of life (HRQoL) of patients in the “baseline” state was informed by age-specific EQ-5D scores for the elderly Japanese population [39]. To account for the HRQoL loss in patients who sustained a fracture, utility multipliers were applied for the first year after hip, vertebral, and “other” fracture, and in the second and subsequent years following hip and vertebral fracture. These values were sourced from a prospective study of HRQoL following fracture in Japanese patients [41]. HRQoL inputs used in the model are shown in Table 3.

Analysis

The main outcomes of the evaluation were incremental discounted total lifetime costs and QALYs for romosozumab/alendronate versus teriparatide/alendronate.

In addition to base case results produced using parameter point estimates, uncertainty in model results was assessed through a number of scenario analyses: (1) Romosozumab and teriparatide were assumed to be equally efficacious. (2) BMD efficacy estimates were taken from a treatment-naïve population (romosozumab phase II trial [33]). (3) The fracture reduction benefit of romosozumab versus teriparatide was assumed to persist for only 1 year. (4) The list price of Teribone was used to inform the annual cost of teriparatide (US$101.57/56.5 μg vial providing 1 week of treatment; US$5300/year). (5 and 6) Treatment offset time (the duration of fracture reduction benefit after discontinuation) was set to 1 year and 10 years for both arms. (7) Imperfect treatment persistence was modelled. To inform this last scenario, treatment persistence data were taken from the STRUCTURE trial at 12 months for romosozumab [10], the VERO trial at 24 months for teriparatide [6], and from a real-world study of bisphosphonate persistence in Japan at 12, 36, and 60 months for alendronate [24]. Discontinuation in each 6-month model cycle was estimated assuming a constant rate of discontinuation between observed time points. Resulting estimates of the proportion of patients on treatment at each time point are shown in Supplementary Table 4.
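The interpolation used in the persistence scenario might look like the sketch below. A constant rate of discontinuation between observed time points is interpreted here as a constant hazard, which gives exponential interpolation; a linear interpretation is also possible. The observed points in the example are hypothetical and are not the trial or real-world values.

```python
def persistence_by_cycle(observed, cycle_months=6):
    """Proportion on treatment at each cycle boundary.

    observed: (month, proportion) pairs that include (0, 1.0) and have
    strictly positive proportions.
    """
    points = sorted(observed)
    last_month = points[-1][0]
    curve = {}
    for month in range(0, last_month + 1, cycle_months):
        for (m0, p0), (m1, p1) in zip(points, points[1:]):
            if m0 <= month <= m1:
                fraction = (month - m0) / (m1 - m0)
                # Constant hazard between observations.
                curve[month] = p0 * (p1 / p0) ** fraction
                break
    return curve


# Hypothetical observation: 64% of patients still on treatment at 12 months.
curve = persistence_by_cycle([(0, 1.0), (12, 0.64)])
```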

Probabilistic sensitivity analysis was also conducted, where model inputs were simultaneously stochastically sampled from probability distributions over 1000 model iterations. These distributions were informed by standard errors of parameters (taken from published sources where available or assumed to be 10% of point estimate values otherwise), and by the nature of the parameter. Fracture rates, fracture costs, treatment management costs and the cost of long-term care were assigned a gamma distribution, since these values cannot fall below 0, but theoretically have no upper limit. Relative risks were assigned a lognormal distribution, since ratios are asymmetrically distributed. Proportions and utilities were assigned a beta distribution; BMD efficacy data and model parameters fit to ARCH time-to-event data were assigned a normal distribution. Drug costs were not varied probabilistically, since their values are not subject to uncertainty.
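The sampling scheme can be sketched with the Python standard library as below. The method-of-moments parameterizations are a conventional choice and are assumed here rather than taken from the study; the means, standard errors, and seed are hypothetical.

```python
import math
import random


def default_se(point_estimate):
    """Fallback standard error: 10% of the point estimate."""
    return 0.10 * abs(point_estimate)


def gamma_draw(rng, mean, se):
    """Non-negative, unbounded inputs such as fracture rates and costs."""
    shape = (mean / se) ** 2
    scale = se ** 2 / mean
    return rng.gammavariate(shape, scale)


def beta_draw(rng, mean, se):
    """Inputs bounded by 0 and 1, such as proportions and utilities."""
    common = mean * (1.0 - mean) / se ** 2 - 1.0
    return rng.betavariate(mean * common, (1.0 - mean) * common)


def lognormal_rr_draw(rng, rr, se_log_rr):
    """Relative risks, sampled on the log scale."""
    return rng.lognormvariate(math.log(rr), se_log_rr)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
cost_draws = [gamma_draw(rng, 100.0, default_se(100.0)) for _ in range(1000)]
utility_draws = [beta_draw(rng, 0.7, 0.05) for _ in range(1000)]
rr_draws = [lognormal_rr_draw(rng, 0.6, 0.2) for _ in range(1000)]
```

In a full analysis, one draw of every input would be taken per iteration and the model rerun, giving 1000 pairs of incremental costs and QALYs.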

Results

Base case

Model base case results (Table 4) show that romosozumab/alendronate produces a cost saving of $5134 per patient compared with teriparatide/alendronate, due to a lower drug cost and lower fracture morbidity costs. Romosozumab/alendronate avoids an average of 0.082 fractures per patient versus teriparatide/alendronate, which results in a gain of 0.027 life years and 0.045 QALYs. Romosozumab/alendronate therefore dominates teriparatide/alendronate; it produces a greater number of QALYs at a lower cost.
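The dominance conclusion follows from the signs of the incremental results, as the short sketch below illustrates. The classification rule is the standard cost-effectiveness plane logic; the function name is hypothetical, and the inputs are the base case increments reported above.

```python
def compare(delta_cost, delta_qalys):
    """Classify an intervention against its comparator.

    Returns (label, icer); icer is None unless there is a cost-benefit trade-off.
    """
    if delta_cost == 0 and delta_qalys == 0:
        return ("equivalent", None)
    if delta_cost <= 0 and delta_qalys >= 0:
        return ("dominant", None)
    if delta_cost >= 0 and delta_qalys <= 0:
        return ("dominated", None)
    return ("trade-off", delta_cost / delta_qalys)


# Base case: a saving of $5134 and a gain of 0.045 QALYs per patient.
label, icer = compare(-5134, 0.045)
```

An incremental cost-effectiveness ratio is only informative in the trade-off case, which is why none is reported when one strategy dominates.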

Table 4 Base case cost-effectiveness results

Sensitivity analyses

Scenario analysis results (Table 5) show that using BMD efficacy estimates from a treatment-naïve population, assuming equal efficacy between romosozumab and teriparatide, reducing the duration of romosozumab benefit over teriparatide, reducing the treatment offset time in both arms, and modelling imperfect treatment persistence each reduce the QALY gain and cost savings produced by romosozumab/alendronate. However, in all cases, romosozumab/alendronate remains cost effective, producing a cost saving and at least the same number of QALYs as teriparatide/alendronate. Using the acquisition cost of Teribone to inform the price of teriparatide marginally increases the cost saving associated with romosozumab/alendronate versus the base case, due to Teribone’s higher annual cost compared with Forteo. Setting the treatment offset time to 10 years marginally increases the QALY gain and cost saving produced by romosozumab/alendronate, because the extended fracture reduction effect prolongs the morbidity and mortality advantage achieved by romosozumab/alendronate in the first 2 years of the model.

Table 5 Scenario analysis results – romosozumab/alendronate versus teriparatide/alendronate

Probabilistic sensitivity analysis results (displayed as a cost-effectiveness acceptability curve in Supplementary Fig. 1) show that romosozumab/alendronate is cost effective in 100% of stochastic iterations at a willingness-to-pay threshold of US$46,070 per QALY (JPY ¥5 million), and is cost effective in the large majority of iterations (>98%) over the entire range of assessed thresholds from $0 to $300,000.
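A cost-effectiveness acceptability curve of this kind can be derived from the probabilistic iterations as sketched below. Using net monetary benefit (threshold × incremental QALYs minus incremental cost) is the standard approach and is assumed here; the three iterations in the example are hypothetical and far fewer than the 1000 used in the analysis.

```python
def probability_cost_effective(iterations, threshold):
    """Share of iterations with positive incremental net monetary benefit.

    iterations: list of (delta_cost, delta_qalys) pairs, one per model run.
    """
    favourable = sum(
        1 for delta_cost, delta_qalys in iterations
        if threshold * delta_qalys - delta_cost > 0
    )
    return favourable / len(iterations)


def acceptability_curve(iterations, thresholds):
    """Probability of cost effectiveness at each willingness-to-pay threshold."""
    return [(t, probability_cost_effective(iterations, t)) for t in thresholds]


# Hypothetical iterations: (incremental cost in US$, incremental QALYs).
sample = [(-100.0, 0.01), (50.0, 0.001), (20.0, 0.0)]
curve_points = acceptability_curve(sample, [0, 100_000])
```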

Discussion

Outcomes of this cost-effectiveness analysis show that romosozumab/alendronate produces greater health benefits at a lower total cost than teriparatide/alendronate and can therefore be considered cost effective at any cost per QALY threshold. This is due to the lower overall drug cost per course of romosozumab (US$5479 for one year of romosozumab, versus $10,434 for 2 years of teriparatide) and the BMD benefit of romosozumab over teriparatide which, translated to fracture outcomes, generates additional cost savings and QALYs from avoided fragility fractures. Deterministic and probabilistic sensitivity analyses show that conclusions are robust to uncertainty in model parameters and settings, including assumptions regarding the magnitude and duration of romosozumab’s efficacy advantage over teriparatide, the duration of fracture reduction effects after treatment discontinuation, treatment persistence, and the drug price of teriparatide.

This study is unique in two ways. Firstly, it is the first evaluation to assess the relative cost effectiveness of two bone-forming agent-containing regimens sequenced to an antiresorptive. While the cost effectiveness of bone-forming agents has been assessed previously, these regimens have always been compared with either antiresorptive treatment alone or with no treatment. Secondly, this study is the first to incorporate BMD treatment efficacy estimates into the International Osteoporosis Foundation model framework, by translating BMD benefits to relative risks of fracture.

This study has a number of limitations. Most prominently, the model uses a surrogate outcome – BMD – to estimate relative fracture incidence for romosozumab versus teriparatide, rather than directly observed fracture outcomes. However, as has been demonstrated by the FNIH meta-regression, treatment-induced changes in BMD are a strong predictor of fracture outcomes [14]. Moreover, this method is preferable to estimating relative fracture outcomes via indirect methods which, as previously discussed, would necessitate synthesis of heterogeneous trials, resulting in a high degree of uncertainty.

A further limitation is that direct BMD efficacy data for romosozumab versus teriparatide are only available for up to 12 months in the STRUCTURE and romosozumab phase II trials. This is pertinent because romosozumab and teriparatide are provided for different durations (maximum of 1 year and 2 years, respectively) prior to sequencing to an antiresorptive, meaning there is uncertainty in relative efficacy over time. Trial-based evidence also shows that teriparatide continues to produce an improvement in total hip BMD after 12 months of treatment [42]. However, even under the conservative assumption that both regimens are equally efficacious after 12 months (i.e., treatment benefit of romosozumab over teriparatide disappears after the observed time point), romosozumab/alendronate produces equivalent health outcomes at a lower cost than teriparatide/alendronate.

A third limitation is that the FNIH meta-regression used to translate BMD to fracture outcomes only included placebo controlled RCTs, the majority of which were conducted in treatment-naïve patients. Therefore, in converting BMD outcomes for romosozumab versus teriparatide into fracture relative risks, the assumption is made that the relationships established by the FNIH study are generalizable to comparisons of two active treatments in a bisphosphonate pre-treated population. However, the similarity of observed and BMD-predicted relative risks of fracture for romosozumab versus alendronate from the ARCH trial at 12 months indicates that this may be a reasonable assumption.

Additionally, the modelled population does not align exactly with the label for romosozumab in Japan [43], which specifies that patients should have (1) a T-score ≤ −2.5 with at least one fragility fracture; (2) a lumbar spine BMD of ≤ −3.3; (3) 2 or more prior vertebral fractures; or (4) the presence of a grade 3 vertebral fracture (assessed using the semiquantitative visual grading scheme). The modelled population corresponds to the first of these groups (WHO definition of severe osteoporosis [19]). Specifying a model population using multiple, non-mutually exclusive criteria is challenging, as it is difficult to quantify the extent to which these categories intersect, and how each factor affects fracture risk. Therefore, precisely modelling the romosozumab label population was not feasible. However, the severity of some of the categories (individuals with a lumbar spine BMD ≤ −3.3, for instance), and the possibility that patients may fulfil multiple criteria, indicates that fracture incidence in the label population may be higher than that of the modelled population. Therefore, given the superior fracture reduction efficacy of romosozumab, it is possible that modelling a population of severe osteoporotic women provides a conservative estimate of the cost effectiveness of romosozumab/alendronate.

A fifth limitation is that, while the base case evaluation is conducted in a population of bisphosphonate pre-treated patients, the ARCH trial, efficacy data from which are used to establish fracture incidence in treated patients, was conducted in a treatment naïve population [5]. Additionally, in Japan, weekly oral alendronate is provided as a 35-mg dose, whereas participants in the ARCH trial received a weekly dose of 70 mg. However, because the ARCH trial is not used to inform relative fracture incidence for romosozumab/alendronate versus teriparatide/alendronate arms, these inconsistencies are unlikely to affect model conclusions.

A final limitation is that, in the base case, patients are assumed to be fully persistent over the 5-year treatment course. This assumption was made due to the current lack of real-world discontinuation data for romosozumab sequenced to an antiresorptive. However, the scenario analysis in which imperfect treatment persistence is modelled (for which discontinuation data for romosozumab and teriparatide were taken from clinical trials) shows that romosozumab/alendronate continues to produce a higher number of QALYs at a lower total cost than teriparatide/alendronate. It is also likely that, in practice, persistence with romosozumab is higher than that of teriparatide, considering the monthly versus daily injection schedule, and evidence that more frequently administered treatments are associated with poorer persistence [44].

Despite limitations, this study provides a clear indication of the cost effectiveness of romosozumab versus teriparatide, when both treatments are sequenced to alendronate. Such conclusive results would not have been possible without the approach of converting BMD efficacy to relative risks of fracture, considering the inherent uncertainty associated with indirect comparisons of fracture outcomes. Results of this evaluation are specific to the Japanese healthcare system perspective. However, it is reasonable to expect romosozumab to be cost effective in any setting where the cost per course of romosozumab/alendronate is lower than that of teriparatide/alendronate, since the BMD advantage of romosozumab over teriparatide guarantees a QALY gain and lower total cost. More research is needed to facilitate future cost-effectiveness analyses of bone-forming agents sequenced to antiresorptives. First, given the differing treatment durations of bone-forming agents and anticipated changes in relative efficacy over time, trial-based comparisons at multiple time points are required. Second, real-world evidence of persistence with bone-forming agents is needed, to appropriately capture regimen-specific differences in treatment discontinuation.

Conclusion

Results of this cost-effectiveness analysis indicate that the strategy of 1 year of romosozumab sequenced to 4 years of alendronate produces a greater number of QALYs and lower total cost than 2 years of teriparatide sequenced to 3 years of alendronate for the treatment of severe postmenopausal osteoporosis in bisphosphonate pre-treated Japanese women.