Introduction

Osteoporosis is a skeletal disorder characterized by reduced bone density and increased bone fragility, resulting in an increased risk of fracture. Osteoporosis affects approximately 10 million Americans and is associated with significant mortality and morbidity [1]. Bisphosphonates are the most commonly used pharmacologic agents for the prevention and treatment of osteoporosis. However, recent reports have raised concerns regarding the safety of long-term bisphosphonate use [2–6]. Safety end points highlighted in these reports include atypical femur fractures, osteonecrosis of the jaw (ONJ), and esophageal cancer.

Randomized clinical trials (RCTs) that establish treatment efficacy and lead to US Food and Drug Administration approvals are neither able nor intended to provide information on all the potential side effects of bisphosphonates or other osteoporosis medications. Enrolling from several hundred to a few thousand patients, preapproval RCTs are underpowered to detect or evaluate rare adverse events [7, 8]. For example, when data from three large RCTs totaling 14,195 participants were evaluated, only 12 atypical femur fractures were observed, and no definitive conclusion could be drawn regarding the association between bisphosphonate use and these fractures [9]. Other limitations of RCTs include insufficient follow-up time to assess long-term safety profiles and stringent eligibility criteria that exclude vulnerable but important patient populations, such as older adults with multiple comorbidities [7, 8]. The significant financial cost associated with primary data collection further raises feasibility concerns for large-scale safety studies. Recognizing that serious side effects may surface after drug approval even with the most rigorous preapproval process, the US Food and Drug Administration Amendments Act of 2007 mandated the use of electronic health data, including administrative data, covering 25 million patients by July 1, 2010, and 100 million by July 1, 2012, to identify adverse events and safety signals [10].

In contrast to RCTs, administrative claims data available for observational research contain information on millions of patients that can be readily and efficiently used to investigate rare safety signals. These data reflect routine clinical practice in which medications are prescribed at various doses and used in diverse patient populations, and are not affected by selective participation or early dropouts. In addition, the use of all prescription medications is recorded longitudinally, is not subject to recall bias, and can be measured accurately with pharmacy claims [11, 12].

Table 1 lists health care utilization databases in the United States and Europe that are commonly used in pharmacoepidemiologic studies. The use of Medicare data in pharmacoepidemiologic research had been rather limited due to lack of information on prescription medication use, except in select subpopulations of those eligible for Medicare and Medicaid and with state-based supplemental drug coverage [13]. Since the introduction of the Medicare Part D program, Medicare data contain medical and pharmacy claims for a large, representative sample of the population 65 years of age or older in the United States (38 million as of 2006) and constitute a promising data source for assessing and monitoring the safety of osteoporosis medications [14].

Table 1 Sample of administrative databases used in pharmacoepidemiology studies

Despite the appeal of using administrative claims data to evaluate drug safety, researchers need to overcome several barriers and challenges. The most important barrier is the lack of clinical information (eg, test results) and of information on lifestyle factors (eg, smoking and diet) [12]. Such information is useful to identify and characterize patient populations, to adjust for confounding, and to ascertain study outcomes. In addition, the high patient turnover rate in some commercial medical insurance plans impedes the ability to use the data to study long-term outcomes.

To facilitate the use of administrative claims data in future research evaluating the safety of osteoporosis medications, we discuss in this report the strengths and limitations of claims data and, where applicable, methods that may be used to address these limitations. We supplement the discussion with examples from recent studies using claims data to examine the safety of osteoporosis medications (Table 2). Specifically, we focus on attributes of claims data as they pertain to identification of study populations, measurement of medication exposure, ascertainment of study outcomes, and control for confounding, and discuss potential future application of claims data to examine osteoporosis medication safety outcomes.

Table 2 Recent studies using administrative data systems to examine the role of bisphosphonates in select safety outcomes

Identification of Study Populations

Claims data often lack crucial information required to identify a study population of interest in an optimal manner. In pharmacoepidemiologic research, two important elements in defining a study population are identification of the individuals diagnosed with the medical condition for which the medication(s) of interest are indicated (or contraindicated), and identification of users of the medications.

Identification of patients with osteoporosis using claims data presents a challenge because the results of dual energy x-ray absorptiometry (DXA) tests are typically not available. However, medical and pharmacy claims containing diagnosis codes, prescriptions filled, and procedures performed can be used to identify individuals with osteoporosis. We have used an algorithm containing diagnosis codes for osteoporosis and osteoporotic fractures, and fracture surgical repair or imaging codes to identify patients with osteoporosis among Medicare enrollees [15]. Claims for prescriptions filled for osteoporosis medications may be incorporated to enhance the algorithm but should not be used alone to identify patients with this condition because of the high prevalence of untreated osteoporosis. Thorough evaluation and validation of such algorithms for identification of osteoporotic patients remains to be done.

Identification of new users of a medication of interest is often of central importance to the validity and interpretability of pharmacoepidemiologic research because prevalent users are likely to be those who respond favorably to the medication or experience fewer side effects [13]. The identification of new users of osteoporosis medications is challenging because administrative claims data often do not contain dates of first use, which leads to left censoring. That is, an individual’s lifetime medical experience is often not available in any single claims data source, except in studies examining a recently marketed agent restricted to individuals with complete claims data since the launch of the agent.

The most commonly used approach to identify new users of a medication is to require a “clean” period of (typically) 12 months during which there is no prescription filled for the medication of interest. Several studies have used such an approach to identify patients initiating osteoporosis treatment [16, 17]. The length of the clean period should depend on the prescribing pattern of the medication of interest (eg, a longer clean period is needed to identify new users for zoledronic acid, which is administered annually, compared with intravenous [IV] ibandronate, which is administered quarterly).
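The clean-period rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated algorithm: the function name `find_new_users`, the input layout (a list of fill records plus a per-patient enrollment start date), and the assumption of continuous enrollment with complete pharmacy claims from that start date are all hypothetical choices made for the example.

```python
from datetime import date

def find_new_users(fills, enrollment_start, washout_days=365):
    """Return {patient_id: index_date} for presumed new users.

    fills: iterable of (patient_id, fill_date) for the medication of interest.
    enrollment_start: {patient_id: date} marking the start of continuous
        enrollment (assumed: all fills after this date are observed).
    A patient qualifies when the first observed fill is preceded by at least
    `washout_days` of enrollment with no fill for the medication.
    """
    # Earliest observed fill per patient.
    first_fill = {}
    for pid, fill_date in fills:
        if pid not in first_fill or fill_date < first_fill[pid]:
            first_fill[pid] = fill_date

    new_users = {}
    for pid, fill_date in first_fill.items():
        start = enrollment_start.get(pid)
        # Patients without enrollment information cannot be classified.
        if start is not None and (fill_date - start).days >= washout_days:
            new_users[pid] = fill_date
    return new_users

# Hypothetical example data.
fills = [
    ("A", date(2008, 3, 1)), ("A", date(2008, 4, 1)),
    ("B", date(2007, 2, 15)),
    ("C", date(2009, 1, 10)),
]
enrollment_start = {
    "A": date(2007, 1, 1),
    "B": date(2007, 1, 1),
    "C": date(2008, 6, 1),
}
new_users = find_new_users(fills, enrollment_start)
print(new_users)
```

With the default 12-month clean period only patient A qualifies; shortening `washout_days` to 180 would also admit patient C, which shows how the choice of clean period changes the cohort.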

Assessment of Exposure to Osteoporosis Medications

Currently, pharmacologic agents available for the prevention and/or treatment of osteoporosis include bisphosphonates (alendronate, risedronate, ibandronate, and zoledronic acid), calcitonin, teriparatide, raloxifene, and denosumab [18]. Depending on the data source and the medication of interest, the approach to ascertain medication exposure varies. In the United States, national drug codes, which identify the manufacturer, strength, dose, formulation, and packaging of each approved medication, are commonly used to record prescription medication use. Physician-administered drugs, such as IV ibandronate and IV zoledronic acid, are recorded using Healthcare Common Procedure Coding System J or C codes.

Although the use of such coding systems simplifies data entry and analyses, it introduces potential problems. National drug codes can be cumbersome to use due to lack of information on therapeutic class and inconsistent codes for packaging across manufacturers. Newly approved physician-administered drugs typically receive a temporary, nonspecific J/C code. During the first year after IV ibandronate and zoledronic acid were approved, they were assigned several nonspecific J/C codes (J3490, J3590, J9999, and C9399). The code J3490 was also assigned to more than 40 other physician-administered drugs, such as betamethasone acetate. To remedy misclassification of medication exposure, one approach is to drop from the study all patients whose claims contained the nonspecific codes. Such an approach would be suboptimal, as it would impede timely conduct of safety research as soon as a new agent was launched. Another approach is to develop and validate an algorithm, such as requiring claims containing the nonspecific J/C code and a diagnosis for osteoporosis submitted on the same day. Cost, interval between claims with the nonspecific code, and other information may be used to improve the specificity of such an algorithm.
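A same-day rule of the kind described above might look like the following Python sketch. It is illustrative only and has not been validated: the claim record layout and the function names are hypothetical, diagnosis codes are assumed to be stored with the decimal point, and ICD-9 733.0x is assumed to be the osteoporosis diagnosis family. The nonspecific codes are those listed in the text.

```python
from datetime import date

# Nonspecific J/C codes named in the text.
NONSPECIFIC_CODES = {"J3490", "J3590", "J9999", "C9399"}

def is_osteoporosis_dx(code):
    # Assumption: ICD-9 733.0x denotes osteoporosis, stored with the decimal.
    return code.startswith("733.0")

def probable_iv_bisphosphonate_claims(claims):
    """Keep nonspecific-code claims with a same-day osteoporosis diagnosis.

    claims: list of dicts with keys 'patient_id', 'service_date',
            'hcpcs' (str or None), and 'dx_codes' (list of str).
    """
    # Patient-days on which any claim carries an osteoporosis diagnosis.
    dx_days = {
        (c["patient_id"], c["service_date"])
        for c in claims
        if any(is_osteoporosis_dx(d) for d in c["dx_codes"])
    }
    return [
        c for c in claims
        if c["hcpcs"] in NONSPECIFIC_CODES
        and (c["patient_id"], c["service_date"]) in dx_days
    ]

# Hypothetical example data.
claims = [
    {"patient_id": "A", "service_date": date(2008, 5, 1),
     "hcpcs": "J3490", "dx_codes": ["733.00"]},
    {"patient_id": "B", "service_date": date(2008, 5, 2),
     "hcpcs": "J3490", "dx_codes": ["715.90"]},
    {"patient_id": "C", "service_date": date(2008, 5, 3),
     "hcpcs": "J3490", "dx_codes": []},
    {"patient_id": "C", "service_date": date(2008, 5, 3),
     "hcpcs": None, "dx_codes": ["733.01"]},
]
flagged = probable_iv_bisphosphonate_claims(claims)
print([c["patient_id"] for c in flagged])
```

Patient C is flagged even though the diagnosis appears on a separate claim from the same day. Cost or the interval between claims could be added as further filters to improve specificity.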

There are several problems with assessment of medication exposure using claims data for which solutions are not readily apparent. Exposure to over-the-counter medications, such as calcium and vitamin D, cannot be ascertained. Misclassification of medication exposure can occur if an individual fills a prescription but does not take any or all of the medication obtained, or if claims are not submitted for inexpensive generic medications [19]. Since generic alendronate was made available at retailers such as Wal-Mart, a 90-day supply can be purchased for $24. This pricing is likely to result in unsubmitted claims and therefore could lead to incomplete identification of alendronate users [19]. One way to address these issues is to supplement claims with medical records or patient surveys. Additionally, in some data sources, such as Medicare, medication use is not observable when patients are hospitalized. When the frequency and duration of hospitalization are associated with safety outcomes of interest, results may be biased. Suissa [20•] demonstrated that misclassification due to unmeasured drug exposure can lead to an incorrect observation that a medication is associated with reduced mortality, and referred to this problem as immeasurable time bias. Several approaches have been proposed to minimize immeasurable time bias, such as restricting analyses to nonhospitalized patients, but none has performed satisfactorily [20•].

Adherence to treatment has two components: compliance and persistence [21]. Compliance refers to how correctly a patient takes the medication at the frequency and dose prescribed. Persistence refers to how long a patient continues to use the prescribed medication. Adherence to oral bisphosphonate therapy is poor and should be considered when assessing safety outcomes, as adherence is highly correlated with effectiveness and safety outcomes [2, 16]. Methods have been developed and validated to measure adherence using prescription claims. The most commonly used are the medication possession ratio and the proportion of days covered [22–24]. These methods have been applied to examine adherence to bisphosphonates [16, 25].
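To make the two measures concrete, the sketch below computes both from pharmacy fills, using one common set of definitions as an assumption: the medication possession ratio (MPR) is the total days supplied divided by the days in the observation period, and the proportion of days covered (PDC) is the number of distinct days with medication on hand divided by the days in the period. Published studies vary in how they handle overlapping fills and period boundaries, so the function names and these simplifications are illustrative only.

```python
from datetime import date, timedelta

def mpr(fills, start, end):
    """Medication possession ratio: days supplied / days in period.

    fills: list of (fill_date, days_supply). Overlapping supply is
    counted twice, so the ratio can exceed 1.
    """
    period_days = (end - start).days + 1
    supplied = sum(days for fill_date, days in fills if start <= fill_date <= end)
    return supplied / period_days

def pdc(fills, start, end):
    """Proportion of days covered: distinct covered days / days in period.

    Simplification: overlapping supply is not shifted forward, so early
    refills do not extend coverage.
    """
    period_days = (end - start).days + 1
    covered = set()
    for fill_date, days in fills:
        for offset in range(days):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    return len(covered) / period_days

# Hypothetical patient: three 30-day fills in a 100-day window,
# with the second fill picked up 10 days early.
fills = [(date(2009, 1, 1), 30), (date(2009, 1, 21), 30), (date(2009, 3, 12), 30)]
start, end = date(2009, 1, 1), date(2009, 4, 10)
patient_mpr = mpr(fills, start, end)
patient_pdc = pdc(fills, start, end)
print(patient_mpr, patient_pdc)
```

In this example the MPR is 0.9 and the PDC is 0.8. The difference comes from the 10 overlapping days of the early refill, which the MPR counts twice and the PDC counts once.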

Identification of Safety Outcomes

Previous discussion of the accuracy of clinical information that affects the identification of a study population also applies to the identification of a safety end point of interest. Claims data provide reasonably accurate and unbiased information on hospitalization, mortality, and the occurrence of acute clinical events that require major medical intervention (eg, hip fracture) [26•, 27•]. However, such data are less satisfactory for identifying medical conditions for which a specific ICD-9 code does not exist and for which specific procedures and/or medications are not required.

For example, atypical femur fractures are defined by radiographic findings of transverse or short oblique configuration [28]. However, claims data typically contain only the ICD-9 codes for closed subtrochanteric fracture (820.22) and femoral shaft fracture (821.0X), regardless of the radiographic features. A recent validation study indicated that positive predictive values of claims-based algorithms for atypical subtrochanteric fractures varied from 69% (any position on hospital discharge diagnosis list) to 89% (both primary discharge diagnosis and surgeon’s diagnosis), and from 89% to 97% for diaphyseal fractures [29].

Similarly, ONJ, another rare but serious adverse event suspected of being associated with bisphosphonate use, did not have a specific ICD-9 code (733.45) until 2006. Prior to the introduction of the specific code, several claims-based studies were conducted using nonspecific codes, including codes for inflammatory conditions of the jaw (526.4), alveolitis of the jaw (526.5), and periapical abscess with sinus (522.7), to examine the association between bisphosphonate use and ONJ (Table 2) [4, 30]. Tennis et al. [31] reported that while 526.4 had the highest positive predictive value compared with other codes, using this ICD-9 code would overestimate the total number of cases by about 70%.

Some outcomes may be difficult to identify using claims data because they are associated with services that are not covered by medical insurance plans. For example, ONJ cases may be evaluated, diagnosed, and/or treated at dental clinics. Dental services are not covered in the traditional Medicare fee for service population or by many other insurance plans. Therefore, ONJ patients who were treated only at dental clinics and not referred to an oral or maxillofacial surgeon will go unrecognized.

Some issues with safety end point ascertainment may be addressed by using established claims-based algorithms or, if such algorithms do not exist, developing and validating new algorithms. We have used fracture algorithms developed by Baron and Ray, which utilize primary or secondary inpatient ICD-9 diagnosis codes or outpatient claims with Healthcare Common Procedure Coding System surgical procedure codes [32]. In the absence of a valid algorithm, the use of a surrogate marker of the outcome may be considered, as in the use of jaw surgery for ONJ [33]. Another option is a two-stage approach: claims data are used to identify patients who potentially have the outcome of interest, and medical records are then obtained for formal review and adjudication. This approach is costly and time consuming, and it may require informed consent and formal patient permission if electronic medical records are not available.

Control of Confounding

The prescription of a particular osteoporosis medication is commonly associated with several clinical and nonclinical factors, such as disease severity and prognosis, medical insurance, comorbidities, and functional status. In studies that evaluate treatment efficacy, confounding presents a major challenge because prognosis is the most important determinant in the nonrandom allocation of treatment (eg, perceived fracture risk exerts the most influence on a physician’s decision to prescribe osteoporosis medication). As a result, treated patients are at increased fracture risk compared with untreated patients. If fracture risk at the time of treatment initiation is not sufficiently addressed in the study design or adjusted for in the analyses, results can be significantly biased. This problem is known as confounding by indication. When the end point of interest is an adverse event, particularly a recently suspected adverse event, rather than a beneficial outcome, the likelihood of confounding by “indication” may be significantly lessened [34]. This is because choice of therapy for a particular patient is less likely to be guided by perceptions of the baseline risk of the adverse event of interest.

Similar to observational studies that involve primary data collection, confounding can be addressed in study design through restriction, using a comparison group with the same indication, and matching; and in data analysis through multivariable regression. In addition, there is a growing body of literature on approaches to controlling confounding, developed or adapted to applications in claims data. Examples include various adaptations of the Charlson comorbidity score and the use of high-dimensional propensity scores [35, 36•, 37].

Control of Confounding in Study Design

The purpose of restriction is to obtain a relatively homogeneous population with reduced variability with respect to the risk of developing the outcome of interest. In considering how restriction can enhance the validity of effectiveness research in observational studies, Schneeweiss and colleagues [13] evaluated several strategies, including restricting the study population to new users and to adherent users, and found that restrictions resulted in effect estimates similar to results produced in RCTs. Restrictions are also important to consider in observational studies of the adverse effects of osteoporosis medications. For example, to study the association between alendronate use and atypical femur fractures, Abrahamsen and colleagues [2] restricted their analysis to patients who sustained a fracture at any location other than the hip between 1997 and 2005 in order to limit the variability in fracture risk.

Carefully choosing a comparison group is another strategy to control for confounding that is frequently used in observational studies. Kim et al. [38] compared the occurrence of atypical subtrochanteric and diaphyseal fractures in bisphosphonate and raloxifene/calcitonin users—two groups presumed to be similar with regard to the severity of osteoporosis—to control confounding by disease severity. Finally, claims data are commonly used to conduct nested case-control studies. Both incidence density sampling and matching may be applied in these studies to maximize comparability between cases and controls with regard to major risk factors [39].
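Incidence density (risk-set) sampling can be illustrated with a short Python sketch. For each case, controls are drawn from cohort members who are still under follow-up and event-free at the time of the case's event. The cohort layout, the function name, and the use of follow-up days as the time scale are hypothetical; a real study would usually also match on factors such as age, sex, and calendar time.

```python
import random

def incidence_density_sample(cohort, n_controls=2, seed=0):
    """Sample controls for each case from its risk set.

    cohort: {patient_id: (entry_day, exit_day, is_case)}, where exit_day is
        the event day for cases and the censoring day for non-cases.
    Returns {case_id: [control_ids]}.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    matched = {}
    for case_id, (_, event_day, is_case) in cohort.items():
        if not is_case:
            continue
        # Risk set: everyone else under follow-up and event-free at event_day.
        # Future cases are eligible as controls until their own event.
        risk_set = [
            pid for pid, (entry, exit_day, _) in cohort.items()
            if pid != case_id and entry <= event_day < exit_day
        ]
        matched[case_id] = rng.sample(risk_set, min(n_controls, len(risk_set)))
    return matched

# Hypothetical cohort with follow-up measured in days.
cohort = {
    "P1": (0, 400, True),
    "P2": (0, 900, False),
    "P3": (100, 700, True),
    "P4": (500, 1000, False),
    "P5": (0, 300, False),
}
matched = incidence_density_sample(cohort)
print(matched)
```

Here P3 serves as a control for P1 at day 400 and later becomes a case at day 700, which is the defining feature of this sampling scheme.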

Methods to Characterize Patients Using Claims Data

In observational studies using primary data collection, confounding is addressed by obtaining information on all known and suspected risk factors of the outcome. Although such information is not directly available in claims data, it can be derived. The most commonly used approach is to require a period of time, or the “baseline period,” prior to starting follow-up, during which all claims are examined to obtain information on potential confounders. However, several caveats warrant attention. First, the length of the baseline period is positively associated with the number of diagnosis and procedure codes, and prescriptions filled. As a result, the longer the baseline period, the more medical conditions will be identified. Second, in some medical specialties, it is not unusual to see a physician once a year unless more frequent visits are deemed medically necessary. Thus, a baseline period of less than 1 year is likely to miss certain diagnoses. Sometimes an even longer period may be warranted (eg, DXA examinations are reimbursable every 24 months in Medicare). However, requiring a long baseline period can significantly reduce the number of eligible participants, especially when data with high enrollee turnover are used. The optimal length of a baseline period should be determined by the specific hypothesis of interest, and sensitivity analyses using various baseline periods can be performed to evaluate how confounder assessment is affected when shorter periods are used to retain sample size.
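The dependence of confounder ascertainment on baseline length can be shown with a small Python sketch. The function name, the claim layout, and the two-condition code map are hypothetical; the map assumes ICD-9 250.xx for diabetes mellitus and 714.0 for rheumatoid arthritis, stored with the decimal point.

```python
from datetime import date, timedelta

# Illustrative code map (assumption): condition -> ICD-9 code prefixes.
CODE_MAP = {
    "diabetes": ["250"],
    "rheumatoid_arthritis": ["714.0"],
}

def baseline_conditions(claims, index_date, baseline_days, code_map=CODE_MAP):
    """Return the set of conditions seen in the baseline window.

    claims: list of (service_date, icd9_code) for one patient.
    The window covers the `baseline_days` days before index_date
    and excludes the index date itself.
    """
    window_start = index_date - timedelta(days=baseline_days)
    found = set()
    for service_date, dx in claims:
        if window_start <= service_date < index_date:
            for condition, prefixes in code_map.items():
                if dx.startswith(tuple(prefixes)):
                    found.add(condition)
    return found

# Hypothetical patient who sees a rheumatologist about once a year.
claims = [
    (date(2008, 2, 10), "714.0"),
    (date(2008, 11, 20), "250.00"),
]
index_date = date(2009, 1, 1)
short_window = baseline_conditions(claims, index_date, 180)
long_window = baseline_conditions(claims, index_date, 365)
print(short_window, long_window)
```

A 180-day baseline captures only the diabetes diagnosis, whereas a 365-day baseline also captures rheumatoid arthritis. Repeating the analysis over several window lengths is one way to carry out the sensitivity analyses mentioned above.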

To better characterize patients’ health status, various methods have been developed or adapted specifically to be used with administrative claims data. The Charlson comorbidity index, a validated index originally devised to predict 1-year mortality, was adapted by Deyo and colleagues [37] for use with ICD-9 codes from claims filed for inpatient care. In recent years, developments in medical technology have led to a shift from offering medical care in inpatient settings to offering the same care in outpatient or clinic settings. Klabunde et al. [35] developed another comorbidity index that uses outpatient claims. This index has been shown to capture comorbid medical conditions that would have been missed if only inpatient claims were used, and to predict mortality and treatment [35]. Schneeweiss et al. [40] also recommended using an index based on inpatient and outpatient claims and further recommended using weights for the conditions comprising the index that are derived from the specific insured population under study.

Another widely used method to control for confounding is the propensity score approach proposed by Rosenbaum and Rubin [41] to improve statistical efficiency. Propensity scores are derived to measure each individual’s likelihood of receiving a medication, and confounding is addressed via matching, stratifying, or adjusting for the score. Claims data contain large numbers of variables that may be directly associated with medication use or may serve as proxies of factors that are associated with medication use. To facilitate the process of identifying covariates to construct a propensity score using claims data, a multistep algorithm has been developed to derive a “high-dimensional propensity score” [36•].
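Once propensity scores have been estimated, matching can be carried out in several ways. The Python sketch below shows one simple option, greedy 1:1 nearest-neighbor matching within a caliper. It assumes the scores were already estimated elsewhere (for example, by logistic regression on baseline covariates). The function name, the caliper of 0.05, and the example scores are hypothetical, and this is not the high-dimensional propensity score algorithm cited above.

```python
def greedy_match(treated, untreated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    treated, untreated: {patient_id: propensity_score}.
    Each untreated patient is used at most once; treated patients with
    no untreated patient within `caliper` remain unmatched.
    Returns a list of (treated_id, untreated_id) pairs.
    """
    available = dict(untreated)
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining untreated patient.
        u_id = min(available, key=lambda u: abs(available[u] - t_score))
        if abs(available[u_id] - t_score) <= caliper:
            pairs.append((t_id, u_id))
            del available[u_id]
    return pairs

# Hypothetical, pre-estimated scores.
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.90}
untreated = {"U1": 0.28, "U2": 0.33, "U3": 0.60, "U4": 0.10}
pairs = greedy_match(treated, untreated)
print(pairs)
```

In this example T3 has no untreated patient with a comparable score and is dropped. Such exclusions limit the analysis to patients for whom either treatment choice was plausible.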

In addition to capturing the overall burden of illness, as in the use of a comorbidity score, or controlling for the likelihood of using a medication, as in the use of propensity scores, disease risk scores offer another approach to controlling for confounding and have been applied in studies using claims data to assess adverse effects of various pharmacologic agents [42, 43]. This approach is particularly useful when there are more than two comparison groups. Other data mining methods, such as Bayesian approaches, also have been shown to provide valid and comparable results [44].

Future Directions

Future pharmacoepidemiologic studies of the adverse effects of bisphosphonate use are likely to benefit from an increasing amount of clinical data that will become available as use of electronic medical records becomes more common and more comprehensive. We discussed the recent availability of Medicare Part D data, which already have been used to compare the short-term effects of rosiglitazone and pioglitazone on adverse cardiovascular events and mortality [45]. Medicare data present an unprecedented opportunity to researchers interested in monitoring the safety of osteoporosis medications and other therapies that are commonly used in the older adult population. As time passes, Medicare data will be particularly useful for assessing the safety of long-term osteoporosis medication use because the amount of turnover likely will be less than that of many commercial insurance plans in the United States.

Another future direction lies in the combination of claims data with data from other sources (eg, prospective cohort studies, large clinical trials, and disease registries) [26•, 27]. Combining claims data with other types of data for research is not a novel idea. The National Death Index, a compilation of national death records that provides information on causes of death, has been made available to researchers and used extensively to supplement primary data collection for many years. The Surveillance, Epidemiology and End Results (SEER) program collects clinical, demographic, and survival data for all cancer cases from geographic regions covering 28% of the US population. SEER-Medicare linked data constitute a unique source of data to study cancer-related safety outcomes (eg, esophageal cancer). Recently, Medicare data have been linked to the Iowa Women’s Health Study, a prospective cohort study of 41,836 postmenopausal women [26•]. Such a linkage extends the study period of the original study to evaluate long-term outcomes, reduces the impact of losses to follow-up due to institutionalization and changes in address, and lessens the effects of declining cognition and memory on the quality of self-reported information [26•]. These strengths are especially valuable in the older adult population.

Denosumab, a fully human monoclonal antibody to receptor activator of nuclear factor-κB ligand (RANKL), was recently approved for treatment of postmenopausal osteoporosis and skeletal-related events (eg, fracture, bone pain) in patients with bone metastases [46]. We expect that postmarketing surveillance of the safety of this new agent will use a variety of methodologic approaches and will include observational studies using administrative claims data.

Conclusions

Administrative claims data are increasingly used to study the safety of osteoporosis medication use. We have discussed issues commonly encountered and methodologies frequently employed to address them. Some of the issues may be remedied using these and other methods. Others, such as accurate identification of ONJ and atypical femur fractures, are extremely difficult using claims data alone. Administrative claims data from various sources have different strengths and limitations, and some of the problems that we have discussed may not apply to all claims data. Although a detailed comparison is beyond the scope of this review, an in-depth understanding of the strengths and limitations of the specific data to be used is essential to ensure appropriate use of the data and interpretation of the results. This includes, but is not limited to, the representativeness of the study population, the average duration of insurance coverage, the level of clinical information available (claims only, claims + electronic medical records), completeness of the claims, and prescription medication coverage.