Introduction

Acute liver injury is an important cause of adverse drug reactions and a common reason for regulatory actions [1, 2]. Idiopathic acute liver injury (ALI), unrelated to a specific cause, is rare [3]. Few studies have provided frequency estimates for ALI in the general population [4]. Reported incidence rates of ALI vary greatly, depending on the type of data source, the characteristics of the study population and, especially, the case definition [5–9]. Some research has addressed the capability of observational databases to accurately capture ALI. The FDA mini-Sentinel project [10, 11] and FNIH’s Observational Medical Outcomes Partnership [12] have specifically analyzed the predictive value of the outcome across multiple US databases, primarily focussed on insurance claims data [13, 14].

Primary health care databases (PCDB) are available in several countries and have proven to be an excellent source for epidemiological and post-marketing drug safety research [15, 16]. Diseases with a non-specific diagnostic test, or those that can easily be confused with other diseases, are difficult to ascertain in PCDBs, and researchers may need to construct more complex definitions to identify all potential cases. ALI is one of these diseases, with multiple aetiologies to rule out, and one that can only be identified after detailed review of the medical history [17].

The aim of this investigation was to study and compare the incidence of ALI using different routine health care databases. To this end, we developed a specific computer algorithm to ascertain ALI and considered different scenarios, both to test the validity of this algorithm for future use in drug safety studies [18] and to assess the impact of different definitions on the incidence rates. This study was performed within the framework of the IMI-Pharmacoepidemiological Research on Outcomes of Therapeutics by a European ConsorTium (PROTECT) project (http://www.imiprotect.eu/).

Methods

Data source

This study was performed using two European databases: the Clinical Practice Research Datalink (CPRD; http://www.cprd.com) in the UK and the Spanish “Base de datos para la Investigación Farmacoepidemiológica en Atención Primaria” (BIFAP; http://www.bifap.org). Both are nationwide primary care databases. The CPRD contains data from more than 5 million active patients (8.3 % of the population) provided by primary care general practitioners (GPs) based throughout the UK. A very comprehensive dictionary of clinical terms, READ codes, enables GPs to effectively record medical conditions. The list of READ codes used to identify ALI is included in Online Resource 1.

The BIFAP database was developed in recent years and includes anonymized clinical and prescription data collected by primary care physicians during their consultations, covering around 3.9 million patients and representing close to 8.6 % of the Spanish population. Validation studies have been performed on other diagnoses [19, 20]. In the BIFAP database, medical events are recorded using the International Classification of Primary Care (ICPC) codes, which are less granular than the READ codes [21]. In addition to ICPC codes, the GP software (OMI-AP) includes a list of semantic terms to better describe the disease of interest, and GPs can enter additional information as free text notes. BIFAP does not contain personal data; thus, patients cannot be identified, nor can their records be linked with other databases. See Online Resource 1 for the codes and the additional text mining search performed to identify ALI.

Study population

The study period started in January 2004 and ended in December 2009. The study population encompassed patients of all ages registered during the study period. Within this period, we defined the start date for an individual as the date on which the patient had at least 1 year of enrolment with the GP and 1 year of computerized prescription history. We excluded individuals with a prior history of cancer, alcoholism, alcohol-related problems, gallbladder disease, pancreatic disease and other chronic liver diseases with clear aetiology such as viral, alcoholic or autoimmune (see computer codes in Online Resource 2). Individuals were followed from the start date until the earliest of the following: the date the patient had a liver injury-related code (Tables 1 and 2), the patient died, the patient had a record of an exclusion criterion, the patient was transferred out of the practice, the end of the study period or the end of the practice data collection. A flow chart of the study is available in Online Resource 3.
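The follow-up rule above (earliest of several censoring events) can be illustrated with a minimal sketch. The function names, the event labels and the choice of 31 December 2009 as the exact study end date are assumptions made for illustration; they are not taken from the study's own programs.

```python
from datetime import date
from typing import Optional


def follow_up_end(
    candidates: dict[str, Optional[date]],
    study_end: date = date(2009, 12, 31),  # assumed exact end of the study period
) -> tuple[str, date]:
    """Return the reason and the date of the earliest censoring event.

    `candidates` maps a hypothetical event label (e.g. "liver_code",
    "death", "exclusion", "transfer_out", "practice_end") to its date,
    or to None when the event never occurred for this patient.
    """
    events = {name: d for name, d in candidates.items() if d is not None}
    events["study_end"] = study_end
    reason = min(events, key=events.get)  # label of the earliest date
    return reason, events[reason]


def person_years(start: date, end: date) -> float:
    """Person-time contributed between start date and end of follow-up."""
    return max((end - start).days, 0) / 365.25
```

A patient with a liver-related code on 15 March 2007 and a transfer out of the practice on 1 January 2008 would stop contributing person-time on the earlier of the two dates.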

Table 1 Computer search algorithms to ascertain acute liver injury. Operational case definition
Table 2 Computer case ascertainment and manual review process in BIFAP database

Outcome definition

As our ALI definition, we adapted the classification issued by a consensus meeting on drug-induced liver disorders [22], which is widely used in database drug safety studies [4, 6, 8, 23]. We developed computer algorithms that could be applied to both databases, relying on three data elements: (1) recorded medical information: a predefined code in CPRD, and a predefined code from the BIFAP thesaurus list or a keyword in free text (Online Resource 1). The codes selected included both specific codes identifying ALI and non-specific codes that by themselves do not represent ALI but only suggest a sign requiring further study; (2) biochemical parameters recorded in laboratory tests: an increase of more than two times the upper limit of the normal range (ULN) in alanine aminotransferase (ALT), or a combined increase in aspartate aminotransferase (AST), alkaline phosphatase (AP) and total bilirubin provided one of them is two times ULN within 2 months; and (3) a referral or hospitalization entry within 2 weeks of the recorded diagnostic code. Based on these three criteria, we classified computer-detected patients as potential cases (definite, probable or possible ALI) or non-cases (Table 1).

Two distinct case definitions of ALI for all ages were then used:

  • The restrictive definition of ALI, including only definite cases of ALI, defined as having the three components mentioned above: specific diagnostic code (or specific keyword in BIFAP) as listed in Online Resource 2, laboratory test criteria and hospitalization or referral to a specialist.

  • The broad definition of ALI, including definite and probable cases of ALI, required the occurrence of a diagnostic code (specific or non-specific) together with laboratory test criteria with or without a referral or hospitalization.
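The way the three data elements combine into the two case definitions can be sketched as follows. This is an illustrative reading of the text, not the algorithm in Table 1: the class and function names are hypothetical, laboratory values are assumed to be already expressed as multiples of the ULN, the 2-month and 2-week time windows are assumed to have been applied beforehand, and the rule used here for "possible" cases (a code without qualifying laboratory results) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LiverTests:
    """Laboratory values as multiples of the upper limit of normal (ULN)."""
    alt: Optional[float] = None
    ast: Optional[float] = None
    ap: Optional[float] = None
    bilirubin: Optional[float] = None


def meets_lab_criteria(t: LiverTests) -> bool:
    """ALT > 2 x ULN, or AST, AP and bilirubin all raised with one at 2 x ULN."""
    if t.alt is not None and t.alt > 2.0:
        return True
    combined = (t.ast, t.ap, t.bilirubin)
    if any(v is None for v in combined):
        return False
    return all(v > 1.0 for v in combined) and any(v >= 2.0 for v in combined)


def classify(code_type: Optional[str], tests: LiverTests, referred: bool) -> str:
    """code_type is 'specific', 'nonspecific' or None (no liver-related code)."""
    if code_type is None:
        return "non-case"
    if not meets_lab_criteria(tests):
        return "possible"  # assumption: code present, lab criteria not met
    if code_type == "specific" and referred:
        return "definite"  # restrictive definition: all three components
    return "probable"      # code plus lab criteria, with or without referral


def in_restrictive_definition(label: str) -> bool:
    return label == "definite"


def in_broad_definition(label: str) -> bool:
    return label in ("definite", "probable")
```

Under this sketch, the broad definition is a superset of the restrictive one, which matches the description of the two definitions above.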

Patients with the following conditions were not considered as valid cases: (a) liver function tests (LFTs) in the normal range or elevated but not to the magnitude proposed in the above definition, (b) increased LFT values as a result of routine investigations but without specific symptomatology recorded (e.g., fever, malaise, jaundice and abdominal pain), (c) any exclusion criterion mentioned above within 6 months of the diagnosis date and (d) clinical or laboratory alterations more than 6 months after the initial onset date; the latter were considered chronic liver injury cases and were excluded automatically in CPRD and during the manual review process in BIFAP.

Case ascertainment and validation of ALI diagnosis

In the BIFAP database, the diagnosis of liver injury was validated by reviewing the clinical profile of all potential cases, including free text comments annotated by the GP. The computerized patient profiles and free text fields of all potential cases, initially identified by computer, were reviewed individually by two researchers (AR, GR), blinded to computer assignment. In case of discrepancy, consensus was reached by joint review of the case files (Table 2).

In CPRD, a random sample of definite and probable cases was selected (20 %), and extra information recorded in free text was requested to validate the ALI diagnosis. For practical reasons, only part of the available text could be provided, comprising 20 words on either side of the keywords. An external specialist, blinded to the classification based on our search algorithm, reviewed and classified the cases (Table 3).

Table 3 Computer case ascertainment and manual review process in CPRD database

Statistical analyses

Incidence rates (IRs) of first-time ALI by age (in 10-year categories) and sex were calculated in both databases per calendar year (2004–2009) and according to both case definitions (narrow and broad definitions). Incidence rates were estimated using as numerator the cases of ALI (the confirmed cases after manual review in BIFAP and the computer-detected cases in CPRD) and as denominator the number of person-years in each year, overall and in each age- and sex-specific category. For the comparison of the IRs in the general population across databases and over time, we carried out a direct sex and age standardization using the European Union population in 2008 (EUROSTAT) as standard (http://epp.eurostat.ec.europa.eu/portal/page/portal/population/data/database).
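As an illustration of the calculation described above, the following minimal sketch computes a crude incidence rate and a directly standardised rate per 100,000 person-years. The function names and the data layout (dictionaries keyed by sex and age band) are assumptions for illustration; the actual EUROSTAT 2008 weights are not reproduced here.

```python
def crude_rate(cases: int, person_years: float, per: float = 100_000) -> float:
    """Crude incidence rate per `per` person-years."""
    return cases / person_years * per


def directly_standardised_rate(
    strata: dict[tuple[str, str], tuple[int, float]],
    standard_pop: dict[tuple[str, str], float],
    per: float = 100_000,
) -> float:
    """Direct standardisation: weight each stratum-specific rate by the
    share of the standard population that falls in that stratum.

    `strata` maps (sex, age_band) to (cases, person_years);
    `standard_pop` maps the same keys to standard population counts.
    """
    total = sum(standard_pop.values())
    rate = 0.0
    for key, (cases, py) in strata.items():
        rate += (cases / py) * (standard_pop[key] / total)
    return rate * per
```

Because the same standard weights are applied to both databases, differences between the standardised rates cannot be attributed to differences in the age and sex structure of the two populations.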

Results

Case ascertainment and validation in BIFAP

In BIFAP, an automated computer search initially identified 19,074 patients with a first-ever record of a liver-related code or related keyword during the study period (2004–2009). Of these, only 2,873 (15.1 %) were retained as potential cases after applying the computer algorithms shown in Table 1, and they were categorized in mutually exclusive categories as definite (n = 179), probable (n = 1,157) or possible (n = 1,537) (Table 2). After manual review, 2,437 (85 %) were not considered as valid cases; more than half of them were patients with just incidental LFT findings. Among the 179 computer-detected definite cases, 65 % were not considered definite cases (Table 2). Among the computer-detected probable cases that were identified with specific codes related to liver diseases (labelled as probable group A in Table 1) (N = 119), the proportion of non-confirmed cases was slightly higher (69.7 %) than for definite cases. Finally, among computer-detected probable cases identified with unspecific codes only including abnormal LFTs (labelled as probable B in Table 1) (N = 1,038), the majority were considered non-cases (83.3 %). We also reviewed a sample of computer-detected non-cases, and none was considered ALI. All cases considered definite after manual review had been detected with the computer algorithm as potential cases, and none came from the random sample of 120 test-negative cases. Most of the definite-confirmed cases, 87 % (108/124), were initially classified by the algorithm as either computer-detected definite or probable.

When we tested the computer algorithm for the restricted definition (computer-detected definite cases, N = 179), only 43 (24 %) were definite-confirmed ALI. This restricted definition had a low probability of detecting all the real cases (sensitivity of 34.6 % (43/124)) but was very specific, with fewer than 5 % of non-cases flagged as false positives (specificity = 95.2 % (2,613/2,746)).

When the inclusion criteria were broadened to patients categorized by the computer algorithm as definite or probable ALI (N = 1,336), the manual review confirmed only 271 (20.3 %) of them as valid ALI cases. This broad definition increased the sensitivity up to 62.1 %, but the specificity decreased to 56.3 % (1,372/2,437), meaning that close to half of the non-cases were flagged as false positives.

Finally, 124 patients fulfilled all the criteria to be considered definite-confirmed cases, and 312 patients fulfilled the criteria for probable cases.
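The validation measures quoted above follow from simple proportions. The sketch below shows the arithmetic using counts reported in the text; the function names are illustrative, and the split into true/false positives and negatives is derived here from the reported totals rather than taken from a published two-by-two table.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of confirmed cases that the algorithm detected."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-cases that the algorithm correctly left undetected."""
    return tn / (tn + fp)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Share of algorithm-detected patients confirmed on manual review."""
    return tp / (tp + fp)


# Restricted definition: 43 of 124 definite-confirmed cases were detected,
# and 43 of the 179 computer-detected definite cases were confirmed.
restricted_sensitivity = sensitivity(tp=43, fn=124 - 43)
restricted_ppv = positive_predictive_value(tp=43, fp=179 - 43)

# Broad definition: 271 of 1,336 detected patients were confirmed, and
# 1,372 of the 2,437 non-cases were not detected by the algorithm.
broad_ppv = positive_predictive_value(tp=271, fp=1336 - 271)
broad_specificity = specificity(tn=1372, fp=2437 - 1372)
```

These values agree with the reported percentages to within rounding. The contrast between a high specificity and a low positive predictive value for the restricted definition reflects how few of the potential cases were true ALI.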

Case ascertainment and validation in CPRD

By computer search, applying the criteria proposed in the study, we identified 112,157 patients with a READ code suggestive of liver injury. Of these, 11,363 (10 %) had more than one code spread over more than 180 days from the onset of the first diagnosis and were excluded as chronic cases. Furthermore, 99,639 (89 %) patients were excluded because they presented with exclusion criteria that could be the cause of their liver disease. Finally, 269 patients were considered as definite ALI cases and 729 as probable.

Case validation was undertaken on a sample of 208 cases, 21 % of the total number of definite and probable cases identified by computer search in the CPRD. The review of the 208 ALI cases, either definite or probable, gave an overall agreement rate of 58.6 % (Table 3). When restricted to the 101 definite ALI cases, the reviewer confirmed 64 cases as definite-confirmed idiopathic ALI (agreement rate of 63.4 %). Additional free text was available for only a limited number of patients (59; 28 %), and in the definite category, only 47 cases had this additional information for clinical review by the expert (Table 3).

ALI incidence in BIFAP and CPRD

We observed that for all age groups, there was a higher incidence of definite ALI among females than males in both datasets and that there was an increased rate with increasing age (Fig. 1a). The overall incidence rate of definite ALI in 2008 was 3.01 (95 % confidence interval (CI) 2.13–4.25) per 100,000 person-years in BIFAP and 1.35 (95 % CI 1.03–1.78) per 100,000 person-years in CPRD.

Fig. 1 Incidence rate of acute liver injury (ALI) in BIFAP and in CPRD (dashed line) by age and sex, using the ALI narrow definition (a) and broad definition (b)

Figure 1b shows the comparison between BIFAP and CPRD on the incidence of ALI using the broad definition. In CPRD, there was a higher incidence in females, while in BIFAP, there was a higher rate of ALI in males for most age groups. For both countries, we observed an increase in the incidence of ALI with age, but there was a decrease in the very old age groups. The overall incidence rate in 2008 when using the broad definition of ALI was 27.5 (95 % CI 24.6–20.9) per 100,000 person-years in BIFAP and 4.9 (95 % CI 4.2–6.6) per 100,000 person-years in CPRD.

After standardization to the European population, definite ALI incidence rates were higher in BIFAP (ranging from 2.41 in 2004 to 4.44 in 2009) than in CPRD (range 0.55–1.31). There was a suggestion of a time trend over the study period with a slight increase in later years in BIFAP, while this pattern was not seen in CPRD. Using the broad definition of ALI, CPRD rates were markedly lower (ranging from 2.47 in 2004 to 5.03 in 2009) while in BIFAP, these estimates were much higher, ranging from 15.0 to 29.9 per 100,000 person-years.

Discussion

This study demonstrates that, when the complexity of ascertaining this disease and the heterogeneity of the recorded information are taken into consideration, it is feasible to identify ALI cases in two European databases by applying a common methodology. In order to accurately capture idiopathic ALI in these databases, operational criteria need to be very specific, and a broad definition has proved unsuitable for detecting valid cases of ALI.

Most epidemiologic studies have adapted the classification issued by a consensus meeting on drug-induced liver disorders [22], with no special distinction by age group or sex. Recently, there have been some attempts to create definitions based on coded information from computerised databases, as well as on laboratory and clinical data. The OMOP project and the FDA mini-Sentinel project evaluated different definitions of ALI across a combination of insurance claim databases and estimated the predictive value of this outcome in different sources [10–14], but no general agreement was reached on the algorithm or codes to be used for the definition of idiopathic ALI. In our study, we tested two distinct case definitions, a restricted ALI definition and a broader one, which resulted in substantial variation in the incidence rate. As expected, we observed that the more restrictive the case definition, the lower the number of cases in both databases, with greater validity and fewer false positives. Aithal et al. have proposed changing the criteria for drug-induced liver injury by raising the cut-off level of ALT elevation to five times ULN [24]. While this criterion would exclude clinically unimportant and self-limited drug-related events, we believe that its application would bias the results towards more serious cases of ALI, and therefore, it would be less appropriate for the objectives of our study.

Our results confirm that idiopathic ALI is a very rare disease in the general population, with an incidence rate between 1 and 4 cases per 100,000 person-years when ALI is defined with restricted criteria. These results are in line with previously reported estimates. Several drug safety studies on liver injury have been done using primary care UK databases [8, 23]. The incidence of idiopathic ALI ranged from 2.4 (95 % CI 2.0–2.8) per 100,000 person-years up to 14 per 100,000 person-years, although differences in methodological approaches between studies need to be kept in mind [4, 5, 25, 26].

When using the broad definition, the age- and sex-standardized incidence of ALI in a year ranged from 2.5 to 5 per 100,000 person-years in CPRD and from 14.9 to 29.9 in the BIFAP database. These results, which include less specific liver injury, were more in line with other population-based studies with different approaches that reported higher incidence rates [27, 28].

This study observed a higher incidence rate of ALI in BIFAP than in the CPRD database during 2004–2009 (Fig. 2). These discrepancies could be real or could reflect remaining differences in the methods of ascertainment, including differences in the electronic health records between the two databases, the coding dictionaries used (READ versus ICPC) and the user interface, with a different structure for data entry [29], as well as the recording patterns of the GPs. Moreover, in the UK CPRD, the READ dictionary allows GPs to record a specific code for most health conditions, using additional comments only to better describe the patient’s medical condition, which allows GP recordings to be confirmed in validation studies [17, 30]. In the BIFAP database, the free text section is often used by GPs to record specific diagnoses. Hence, the information in this section needs to be retrieved by a semiautomatic keyword search (data mining process) in addition to ICPC codes. Furthermore, routine laboratory testing under health promotion programs is recorded in BIFAP, in some instances without additional information from hospital or consultant visits, leading to a higher number of potential computer-based probable cases (broad definition).

Fig. 2 Incidence rate of acute liver injury in BIFAP and in CPRD by year (standardised by age and sex, EURO weights 2008), using the ALI narrow definition (a) and broad definition (b)

An important challenge of this study was the development of accurate definitions that enable comparisons of the incidence of ALI and of risk estimates in different datasets. In contrast to specific outcomes such as cancer, where simple code-based algorithms are valid [31], the present study reinforces the importance of validation when pursuing more challenging outcomes, as relying only on automatic computer search may overestimate the true incidence of the disease. The consequence for drug safety studies of using a restricted definition would be an increase in the specificity of the outcome at the expense of potential underestimation of true cases and reduced statistical power to detect associations.

The main strength of this study is that it is population-based, using databases with large numbers of patients registered over long periods of time [16]. Furthermore, these patients are a good representation of the total population of their countries [15, 32]. Both Spain and the UK have a national health care system with universal coverage in which general practitioners (GPs) are the gatekeepers of the system for most health problems and prescribe most of the medications issued, around 80 % in BIFAP [32].

Another strength of this study was the parsimonious procedure and strict criteria applied for case ascertainment, based on computer algorithms that can be replicated in future studies using multiple databases, keeping in mind the inherent heterogeneity between databases.

Among the limitations, it is possible that some ALI cases are not recorded by the GP, for instance, those occurring in hospital, and therefore, the incidence rates of ALI found in this study should be interpreted with caution. In addition, the validation process differed between the two databases, as in CPRD, only a sample of cases was available for this purpose. Finally, a limitation inherent in retrospective studies is that there may be missing information.

In conclusion, the construction of a standard definition with predefined criteria enables accurate identification of ALI cases and facilitates the timely comparison of incidence rates between different primary care databases. When the outcome to be studied is a rare and complex clinical condition such as ALI, a more restrictive definition and the possibility of reviewing additional information to rule out differential diagnoses make it possible to capture more valid cases. The use of multiple databases for epidemiological research, as tested in the PROTECT project, will be important to ensure power for valid comparisons and risk estimations when addressing rare outcomes and exposures. Hepatotoxicity of medical products has led to many marketing withdrawals and post-authorization safety studies. Often, such studies take many years to perform, and the development of the outcome definition is an important and time-consuming part of the study design. The ALI algorithms developed and tested in this study for two distinct primary care databases hold promise for the conduct of more rapid multiple database studies of ALI in the future.