1 Introduction

As whole economic sectors adopt new digital solutions under terms such as “industry 4.0” and new technological paradigms like IoT (internet of things), the healthcare sector is changing as well. Medical imaging was one of the first key areas to adopt digital solutions for storing, processing, and distributing patient data such as MRI (magnetic resonance imaging) or PET (positron emission tomography) scans. The “picture archiving and communication system” (PACS) replaced the need for physically stored images and gave healthcare professionals remote access to all current and archived scans of their patients (Duerinckx and Pisa, 1982). This greatly reduced the cost of long-term storage and the time needed to transfer images from one station to another or, in the worst case, from an external storage facility to the hospital. Other medical areas followed, from laboratory information systems (LIS) that automatically process and analyze medical samples, to electronic health records (EHR), to decision support systems ensuring drug therapy safety.

For the German healthcare sector, all of these individual efforts now culminate in the construction of the telematics infrastructure (“Telematikinfrastruktur”). This infrastructure is intended to provide a complete, fast, and safe exchange of information between patients and healthcare professionals. This digitization may result in faster adoption of new results and tools from research projects, further improving drug therapy safety and reducing adverse drug reactions (ADRs).

All of these systems require data of some sort, be it patient information for electronic health records or medical knowledge databases for decision support systems. Therefore, this chapter describes the need for drug therapy safety tools and the data integration efforts of several medical information systems. The chapter concludes with an outlook on combining all of these efforts to further improve drug therapy safety.

2 Drug Therapy Safety

The safety and appropriateness of pharmacotherapy is an important topic in the field of medicine and under extensive research. Whereas younger patients rarely require more than one medication, the number of drugs taken increases in the elderly. In aging populations, multimorbidity is increasing with a corresponding increase in polypharmacy, which in turn is the prime risk factor for inappropriate prescribing. Several studies have shown that the use of certain groups of medications in elderly and vulnerable patients is associated with falls (Fiss et al., 2010) and an increase in mortality (Chrischilles et al., 2009). With an increasingly older population in Germany, where 22% of the population was projected to be aged 65 and older in 2022 (German Federal Statistical Office, 2021), the prevalence of multimorbidity is growing. Furthermore, inappropriate medications can impair cognitive function (Boustani et al., 2010), reduce the quality of life, and cause additional costs for the healthcare system (Fick, 2001). The major challenges in gerontopharmacology are both overtreatment and undertreatment associated with polypharmacy.

Approximately 2.7 million people insured with BARMER in Germany suffer from five or more chronic diseases (Grandt et al., 2018). In addition, every fourth BARMER-insured person aged 65 and older received at least one potentially inappropriate medication (PIM) based on the PRISCUS list (Grandt et al., 2018; Holt et al., 2010). As more PIM lists have been published and new ones are emerging, such as FORTA (Pazan et al., 2019) or EU(7)-PIM (Renom-Guiteras et al., 2015), this share would likely be even higher if these lists were taken into account.

The growing number of available medications further increases the complexity of the prescription process. The German Federal Institute for Drugs and Medical Devices (BfArM) reported 103,975 medications on the German market for April 2021. Of these medications, 34,911 are freely available and 52,478 are available only with a prescription (BfArM, 2021). Furthermore, polypharmacy increases the risk of drug-related problems such as medication errors and adverse drug reactions. Without the help of medical decision support systems, healthcare professionals are likely unable to review all potential issues for every patient case.

The increased interest in molecular analyses, not only by researchers but also by healthcare professionals, may finally lead the way toward personalized medicine. The adoption of sequencing technologies and others in hospitals is positive, but staff needs to be properly trained and new safety measures implemented to prevent errors in data interpretation. Subsequent changes in a patient’s drug therapy on an individual molecular basis need to be thoroughly tested and regulated to increase and not reduce drug therapy safety.

3 Medical Information System Examples

In the following, different medical information systems are analyzed with respect to their specific use cases and data integration needs.

3.1 KALIS

KALIS is a web-based information system for checking drug-related problems in the medication process (Alban et al., 2017). It comprises multiple components, each tailored to a specific use case. The main component is the pharmacological risk check. Here, medications and compounds can be checked against indications, side effects, and intolerances for interactions and other risks. Other modules help in finding potentially inappropriate medication for elderly patients, pharmacogenetic interactions in light of CYP enzyme defects, and guideline-compliant analyses for hypertension.

Figure 11.1 visualizes the data integration architecture of KALIS. It is divided into integration, conception, and merging. The resulting KALIS-DWH has a uniform data structure and provides comprehensive information for the aforementioned risk-check components. Eight different data sources are integrated into the data warehouse:

  • Pharmacological databases (gray): ABDAmed (ABDATA Pharma-Daten-Service, 2021), ROTE LISTE® (Rote Liste® Service GmbH, 2021), and GELBE LISTE® (Vidal MMI Germany GmbH, 2021)

  • International databases with patient-related case reports of adverse drug events (red): FAERS (FDA, 2021), ARD (Health Canada, 2021), and DPD (Health Canada, 2021)

  • Newly developed databases (green): CYP-P450 and PRISCUS-Liste (Holt et al., 2010)

Fig. 11.1 Overview of the KALIS data warehouse integration pipeline (Alban et al., 2017)

The newly developed databases are based on information sources from scientific literature. Aggregating this knowledge into databases and merging it with pharmacological data enriches the risk analysis with new components.

The family of cytochrome P450 enzymes (CYP) plays a crucial role in the metabolism of many substances. Variability between patients in the metabolism of medications, caused by enzyme induction or inhibition and by genetic factors, poses a significant problem for pharmacotherapy. A new database, CYP-P450, was designed, which contains information on substance-CYP enzyme interactions in the liver and kidney. This data is primarily based on the results of the literature research of Dippl (2011).

The PRISCUS list was created as a part of the joint project “PRISCUS” (Holt et al., 2010), which was funded by the German Federal Ministry of Education and Research (BMBF). The PRISCUS list includes 83 drugs available on the German drug market. For these drugs, the risk of side effects or age-related complications outweighs the medical benefits. A new database was derived from the published list. For these 83 potentially inappropriate drugs, information such as the reason for inclusion, therapy alternatives, and more was integrated into a suitable tabular data format.

Due to different exchange formats (XML, ASCII, CSV, MDB) and license models, specific parsers were implemented in Java for each data source. These parsers were used to extract the datasets, transform the data into the respective MySQL database, and load it efficiently into KALIS-DWH. Additional metadata for data analysis is stored in a separate database.
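The extract, transform, and load steps described above can be illustrated with a minimal sketch. It is written in Python rather than the Java used for KALIS, uses an in-memory SQLite database as a stand-in for MySQL, and all payloads, table names, and column names are hypothetical assumptions made only for this illustration.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical payloads; real exports are far larger and subject to licenses.
CSV_SOURCE = "atc;name\nB01AC06;Acetylsalicylic acid\n"
XML_SOURCE = "<drugs><drug atc='c03da01'> Spironolactone </drug></drugs>"


def parse_csv(text):
    """Extract (atc, name) pairs from a semicolon-separated export."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [(row["atc"], row["name"]) for row in reader]


def parse_xml(text):
    """Extract (atc, name) pairs from an XML export."""
    root = ET.fromstring(text)
    return [(drug.get("atc"), drug.text) for drug in root.iter("drug")]


def transform(records, source):
    """Normalize codes and names and tag each record with its source."""
    return [(atc.strip().upper(), name.strip().lower(), source)
            for atc, name in records]


def load(conn, rows):
    """Load the normalized rows into the uniform target table."""
    conn.executemany(
        "INSERT INTO drug (atc, name, source) VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run one source-specific parser per exchange format."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS drug (atc TEXT, name TEXT, source TEXT)")
    sources = [
        ("csv_source", parse_csv, CSV_SOURCE),
        ("xml_source", parse_xml, XML_SOURCE),
    ]
    for source, parser, payload in sources:
        load(conn, transform(parser(payload), source))


conn = sqlite3.connect(":memory:")
run_pipeline(conn)
rows = conn.execute(
    "SELECT atc, name, source FROM drug ORDER BY atc").fetchall()
```

Keeping one parser per format behind a shared transform and load step means a new source only requires a new parser function, which mirrors the per-source parser design described in the text.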

DAWIS-M.D. (Hippe et al., 2010) is a data warehouse for molecular information including data sources such as DrugBank (Knox et al., 2010), SIDER (Kuhn et al., 2010), and KEGG (Kanehisa, 2000). The pharmacological databases of KALIS-DWH were fused with the biomolecular databases of DAWIS-M.D. This data can be used for knowledge discovery of the underlying mechanism of drug action or the potential impact on the disease. The data integration of biomolecular data sources was performed by implementing XML parsers in Java and using the software kit BioDWH (Töpel et al., 2008). National and international identification standards were used for coding, mapping, and assignment of medical information such as drugs, therapeutic indications, diseases, and side effects. These include the ATC index (Anatomical Therapeutic Chemical classification), ICD-10 (International Statistical Classification of Diseases and Related Health Problems), and MedDRA (Medical Dictionary for Regulatory Activities) (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use ICH, 2021). In this way, the homogeneous data warehouses KALIS-DWH and DAWIS-M.D. provide pharmacological and biomolecular information for efficient and goal-oriented risk analysis of drugs. The standardized codes support the accuracy of data input and processing as well as simple data exchange and uniform communication between KALIS and the end user.

3.2 GraphSAW

GraphSAW is a web-based medical information system on drug interactions and side effects from pharmaceutical and molecular databases (Shoshi et al., 2015). Where KALIS focused mainly on vetted and official pharmaceutical databases such as ABDAmed (ABDATA Pharma-Daten-Service, 2021), GraphSAW provides a visual analysis and comparison with molecular databases such as DrugBank (Knox et al., 2010). The analyses are split into different components including drug–drug, drug–side effects, drug–molecule, drug–disease, drug–pathway, and pathway–disease interactions. A screenshot of the GraphSAW website is shown in Fig. 11.2.

Fig. 11.2 Screenshot of the GraphSAW website. The analysis modules are shown on the left. Results are listed on the right and the main visualization in the middle. Here the molecular medication analysis is shown

The data integration utilized the two commercial databases ABDAmed (ABDATA Pharma-Daten-Service, 2021) and KEGG (Kanehisa, 2000) and the two freely available databases SIDER (Kuhn et al., 2010) and DrugBank (Knox et al., 2010) as visualized in Fig. 11.3.

Fig. 11.3 Overview of the GraphSAW data warehouse integration pipeline (Shoshi et al., 2015)

DrugBank is a large resource that collects binding data on small molecules, in particular those of drugs and proteins. At the time of creation, 6711 approved and experimental drugs were integrated from DrugBank. As of April 2021, DrugBank contains more than double that number of drugs (14,460). DrugBank provides information on drug–drug as well as drug–target interactions, including CYP enzymes as mentioned in the KALIS section.

Further drug interactions were obtained from the commercial database ABDAmed which, in contrast to DrugBank, is based on approved and validated drug-related data. ABDAmed contains comprehensive facts on more than 47,000 drugs, such as information about application and composition, risks, and drug interactions. The ABDAmed database also includes the side effects of drugs. More than 4500 side effect terms (3135 different side effects; 1381 synonyms) were extracted automatically from full-text information in German and translated manually into English.

An additional 4192 different drug side effects were obtained from SIDER. Information on metabolic pathways was obtained from KEGG, which already integrates compounds from DrugBank (Knox et al., 2010), PubChem (Kim et al., 2020), CAS (American Chemical Society, 2021), and more. Therefore, DrugBank identifiers were used for mapping drugs between the data sources.

To map drugs between DrugBank and ABDAmed, the ATC classification system was used. MedDRA terms (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use ICH, 2021) were used for coding drug side effects of both SIDER and ABDAmed. The mapping between drugs of DrugBank and SIDER was realized by drug names because these databases did not have corresponding identifiers for compounds. By introducing mappings between the heterogeneous databases, interaction and side effect information were assigned to all drugs.
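Name-based mapping, as described above for sources without shared identifiers, can be sketched as follows. This is only an illustration under stated assumptions: the identifiers (`DB_A`, `S_1`, and so on) are placeholders rather than real DrugBank or SIDER identifiers, and the normalization rules are hypothetical, not the ones used for GraphSAW.

```python
import re


def normalize_name(name):
    """Lowercase, drop parenthesized qualifiers, and unify separators."""
    name = re.sub(r"\(.*?\)", " ", name.lower())
    name = re.sub(r"[^a-z0-9]+", " ", name)
    return name.strip()


def map_by_name(left, right):
    """Map ids of `left` to ids of `right` via normalized drug names.

    Both arguments are dicts of id -> name. Names without exactly one
    counterpart are returned as unmatched for manual review.
    """
    index = {}
    for right_id, right_name in right.items():
        index.setdefault(normalize_name(right_name), []).append(right_id)

    mapping, unmatched = {}, []
    for left_id, left_name in left.items():
        candidates = index.get(normalize_name(left_name), [])
        if len(candidates) == 1:
            mapping[left_id] = candidates[0]
        else:
            unmatched.append(left_id)
    return mapping, unmatched


# Placeholder identifiers, not real database identifiers.
source_a = {
    "DB_A": "Acetylsalicylic acid",
    "DB_B": "Docusate sodium (oral)",
    "DB_C": "Theophylline",
}
source_b = {
    "S_1": "acetylsalicylic-acid",
    "S_2": "docusate sodium",
    "S_3": "ciprofloxacin",
}

mapping, unmatched = map_by_name(source_a, source_b)
```

Here `mapping` links `DB_A` to `S_1` and `DB_B` to `S_2`, while `DB_C` ends up in `unmatched`. Refusing ambiguous or missing matches instead of guessing keeps name-based mapping conservative, which matters because drug names, unlike ATC codes, are not standardized across sources.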

The data integration was implemented as parsers written in Java for the bio data warehouse BioDWH (Töpel et al., 2008). Using the data warehouse architecture ensures both the availability and the relevance of the data sources. Additional metadata for data analysis, such as the extracted and translated side effects from ABDAmed, is stored in a separate database.

3.3 PIMBase

In recent years, lists, criteria, and classification systems for assessing potentially inappropriate medication for geriatric patients were developed and published. Besides these PIM lists of medication with a negative risk–benefit balance (e.g., PRISCUS (Holt et al., 2010), AUSTRIAN PIM (Mann et al., 2011)), lists with a positive balance (e.g., FORTA (Pazan et al., 2019), EU(7)-PIM (Renom-Guiteras et al., 2015)) are also becoming the focus of interest.

However, those PIM lists are spread across scientific journals and difficult to access for patients or health professionals in the context of treatment. The integration of the various lists into a uniform database and subsequent merging as well as an implementation of a unique rating scale are essential for the qualitative improvement of the drug therapy in the elderly and offer opportunities for practical application to identify and reduce inappropriate prescribing.

The data integration for the PIMBase database is divided into multiple steps. First, the original PIM lists were collected and the format analyzed. Most lists are only accessible as tables in PDF files either as supplementary information or directly embedded in their respective publications. Freely available tools for automatic extraction of tables from PDF files are often unsuitable and error-prone. As a correct data transformation could not be guaranteed, most PIM lists were transferred by hand into a machine-readable tabular format. This step was thoroughly checked to prevent any copy errors or loss of context.

With all lists in a machine-readable state, common information such as drug names, ratings, reasons, and alternatives was compared. The entries differ considerably in form, as the following examples show.

  • Magnesium hydroxide.

  • Docusate sodium (oral).

  • Spironolactone >25 mg/d.

  • Concomitant use of theophylline together with ciprofloxacin may increase the risk of theophylline toxicity in the elderly.

  • In the elderly, avoid doses of acetylsalicylic acid greater than 325 mg per day due to increased risk of gastrointestinal bleeding and peptic ulcer disease.

The EU(7)-PIM list already provides annotations with the ATC index for the different drugs. The entries of the other PIM lists were manually annotated with ATC codes. For the first iteration of the PIMBase database, a simple relational database schema was developed, mainly consisting of textual, numerical, and listing information. A simple Python integration pipeline reads the machine-readable lists and generates an SQL script that is readily usable in MySQL database installations.
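A pipeline of this kind can be sketched in a few lines. The following is not the PIMBase implementation; the table name `pim_entry`, the column layout, and the example rows are assumptions chosen for illustration, with the acetylsalicylic acid note taken from the example entries above.

```python
import csv
import io

# Hypothetical machine-readable list; the column layout is an assumption.
PIM_CSV = """list_name,atc_code,drug_name,note
EXAMPLE,B01AC06,Acetylsalicylic acid,Avoid doses greater than 325 mg per day
EXAMPLE,,Magnesium hydroxide,Entry without an ATC annotation yet
"""

COLUMNS = ("list_name", "atc_code", "drug_name", "note")

# Hypothetical schema, not the published PIMBase schema.
SCHEMA = """CREATE TABLE IF NOT EXISTS pim_entry (
  id INT AUTO_INCREMENT PRIMARY KEY,
  list_name VARCHAR(64) NOT NULL,
  atc_code VARCHAR(7) NULL,
  drug_name VARCHAR(255) NOT NULL,
  note TEXT NULL
);"""


def sql_literal(value):
    """Render a value as an SQL string literal, or NULL if it is empty."""
    if value is None or value == "":
        return "NULL"
    # Escape backslashes and single quotes for a default MySQL setup.
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def build_script(csv_text):
    """Turn the tabular list into a schema plus one INSERT per entry."""
    statements = [SCHEMA]
    for row in csv.DictReader(io.StringIO(csv_text)):
        values = ", ".join(sql_literal(row[column]) for column in COLUMNS)
        statements.append(
            f"INSERT INTO pim_entry ({', '.join(COLUMNS)}) VALUES ({values});")
    return "\n".join(statements) + "\n"


script = build_script(PIM_CSV)
```

Generating a script keeps the pipeline independent of a running database server. When a pipeline connects to the database directly, parameterized queries are generally preferable to manual escaping.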

Using the generated database, the first iteration of the PIMBase website allows users to search for names and ATC codes and to see detailed information for each PIM entry. A screenshot of the website is shown in Fig. 11.4. With the addition of more PIM lists, multiple issues become apparent. For example, when searching for acetylsalicylic acid (ATC B01AC06 and N02BA01), four entries exist in the FORTA list, one in the AUSTRIAN PIM list, and one in the EU(7)-PIM list. All six entries are annotated with matching ATC codes but are still disconnected from each other. The problem becomes more complex when PIM entries are relevant not only for a single drug but for multiple drugs or even whole therapeutic categories. An example is the FORTA entry “Class I-III antiarrhythmic agents: All except Amiodarone and Dronedarone.” Different lists may not only use slightly different synonyms for drugs but also use names, synonyms, or abbreviations of therapeutic categories that are not standardized. An example of such synonyms is {“Antimuscarinics,” “Antimuscarinic drugs,” “Muscarinic antagonists,” “Muscarinic-blocking agents,” “Muscarinic-blocking drug”}. When a user searches for a certain drug or drug class, all relevant entries should be found. If a specific drug is searched for but an entry only names a drug class that includes this drug, the entry should still be found.

Fig. 11.4 Screenshot of the PIMBase website with the rating scale on the left and the entries of potentially inappropriate medication in the center

This challenge necessitates the integration of therapeutic class hierarchies. Multiple databases provide their own categories and hierarchies such as NDF-RT, KEGG, and DrugBank. Independent hierarchies such as the ATC index and USP drug classification exist as well. However, each of these hierarchies has different intentions, number of hierarchy levels and drugs listed. An excerpt comparison of databases and hierarchies is visualized in Fig. 11.5. These hierarchies are used to implement a better search strategy in finding entries by drugs and drug classes. Drug entries in the leaf nodes need to be mapped to the ATC codes used for the PIMBase entries. Additionally, mappings between drug class hierarchies improve the number of entries found under category terms and reduce the number of duplicates in search suggestions.

Fig. 11.5 Excerpt of different drug classes and therapeutic groups for antiparkinson agents from KEGG, ABDAmed, and DrugBank. NDF-RT therapeutic categories had no categories matching either antiparkinson or anticholinergic agents. * DrugBank has no category for antiparkinson agents and was substituted with the sub-category of anticholinergic agents
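The hierarchy-based search strategy can be illustrated with the ATC index, whose five levels correspond to code prefixes of length 1, 3, 4, 5, and 7. The sketch below finds entries annotated at drug level as well as entries annotated only at a class level that contains the searched drug, and it collapses class-name synonyms onto one label. The entry records, list names, and the choice of canonical label are hypothetical; this is not the PIMBase search implementation.

```python
# Prefix lengths of the five ATC levels, e.g. B, B01, B01A, B01AC, B01AC06.
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)

# Hypothetical entries; "atc" may name a single drug or a whole class.
ENTRIES = [
    {"id": 1, "list": "LIST_A", "atc": "B01AC06"},  # drug-level entry
    {"id": 2, "list": "LIST_B", "atc": "B01AC"},    # class-level entry
    {"id": 3, "list": "LIST_B", "atc": "N02BA01"},  # drug-level entry
    {"id": 4, "list": "LIST_C", "atc": "C01B"},     # unrelated class
]

# Synonyms from the text; the canonical label is an arbitrary choice.
CLASS_SYNONYMS = {
    "antimuscarinics": "muscarinic antagonists",
    "antimuscarinic drugs": "muscarinic antagonists",
    "muscarinic antagonists": "muscarinic antagonists",
    "muscarinic-blocking agents": "muscarinic antagonists",
    "muscarinic-blocking drug": "muscarinic antagonists",
}


def atc_ancestors(code):
    """Return the code and all of its ATC ancestors, most specific first."""
    return [code[:n] for n in reversed(ATC_LEVEL_LENGTHS) if len(code) >= n]


def find_entries(drug_atc_codes, entries):
    """Find entries for the drugs themselves or any class containing them."""
    wanted = set()
    for code in drug_atc_codes:
        wanted.update(atc_ancestors(code))
    return [entry for entry in entries if entry["atc"] in wanted]


def canonical_class(term):
    """Collapse known synonyms of a therapeutic class onto one label."""
    key = term.strip().lower()
    return CLASS_SYNONYMS.get(key, key)


# Acetylsalicylic acid carries two ATC codes, so both are searched.
hits = find_entries(["B01AC06", "N02BA01"], ENTRIES)
hit_ids = [entry["id"] for entry in hits]
```

With these records, `hit_ids` contains the entries 1, 2, and 3: both drug-level entries and the class-level entry `B01AC`, but not the unrelated class `C01B`. Hierarchies other than the ATC index would need an explicit parent table instead of prefix matching.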

In addition to drugs and drug classes, PIM entries are in most cases specific to certain patient conditions such as indications, age, gender, and laboratory measurements. Providing only relevant entries for a specific patient is therefore even more complex. Diseases, side effects, and other keywords need to be annotated for each PIM entry. Furthermore, the logical relationship between them needs to be encoded in a suitable data structure such as decision trees. If a user only searches for a specific drug, but the entry is only relevant in combination with a condition, the entry should still be shown. Conversely, if only a condition is entered, the matching drugs should be shown as well. This requires even more databases for disease information and suitable ontologies for measurements such as creatinine clearance. Solutions to these challenges are currently under development.
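One possible way to encode such logical relationships is a small tree of conditions with three-valued evaluation, so that an entry stays visible when a condition is unknown and is hidden only when a condition is known to fail. The sketch below is an assumption-laden illustration, not the PIMBase design: the rule names, the `Patient` structure, and the age threshold of 65 years are hypothetical, while the dose and combination examples follow the list entries quoted earlier. It covers only the drug-first search direction.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Patient:
    atc_codes: set
    age: Optional[int] = None
    daily_dose_mg: dict = field(default_factory=dict)  # ATC code -> mg/d


# Each condition returns True, False, or None (unknown for this patient).
def takes(atc):
    return lambda patient: atc in patient.atc_codes


def dose_above(atc, mg):
    def check(patient):
        dose = patient.daily_dose_mg.get(atc)
        return None if dose is None else dose > mg
    return check


def min_age(years):
    return lambda patient: None if patient.age is None else patient.age >= years


def all_of(*conditions):
    """Logical AND over three-valued results."""
    def check(patient):
        results = [condition(patient) for condition in conditions]
        if any(result is False for result in results):
            return False
        if any(result is None for result in results):
            return None
        return True
    return check


# Hypothetical rule set; the age threshold of 65 is an assumption.
RULES = {
    "spironolactone_high_dose": all_of(
        takes("C03DA01"), dose_above("C03DA01", 25)),
    "theophylline_with_ciprofloxacin": all_of(
        takes("R03DA04"), takes("J01MA02"), min_age(65)),
}


def relevant_rules(patient):
    """Show every rule that is not known to be irrelevant."""
    return sorted(name for name, rule in RULES.items()
                  if rule(patient) is not False)


drug_only = relevant_rules(Patient({"C03DA01"}))
low_dose = relevant_rules(
    Patient({"C03DA01"}, daily_dose_mg={"C03DA01": 25}))
combination = relevant_rules(Patient({"R03DA04", "J01MA02"}, age=70))
```

A search with only spironolactone still returns the dose-dependent rule because the dose is unknown, whereas a known dose of 25 mg/d hides it. Treating "unknown" separately from "false" is what lets a drug-only search surface condition-dependent entries.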

Encoding each entry of all PIM lists with appropriate logical rulesets will result in a powerful toolset for healthcare professionals and patients. Quick access to the information relevant for a specific patient situation will increase drug therapy safety and hopefully reduce inappropriate prescribing, without the need to manually scan all PIM lists. It is also a step further toward personalized medicine.

4 Outlook

Medical decision support and drug therapy safety are important but complex challenges. This chapter introduced several medical information systems and presented their data integration needs. Each of these systems represents a specific area of medical decision support and provides a piece to drug therapy safety as a whole.

The primary issue is the adoption by healthcare professionals. While being easily accessible and intuitive, none of the presented systems can communicate with other software such as hospital information systems. Communication standards like Health Level Seven (HL7) and FHIR and exchange standards for electronic health records need to be implemented. This will allow populating information system inputs directly from patient records and reduce the time and effort it takes to use the systems. Implementing these standards requires an extension of the data integration pipelines. This includes mapping relevant entities to the identification systems used in these standards.
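Populating system inputs from patient records could, for example, mean reading ATC codes from medication resources in a FHIR bundle. The following sketch assumes that medications are coded inline via `medicationCodeableConcept` and that ATC codings use the system URI `http://www.whocc.no/atc`; both are assumptions about the sending system, and the bundle shown is a trimmed, hypothetical example rather than a complete, valid FHIR document.

```python
import json

# Assumed system URI for ATC codings; verify against the sending system.
ATC_SYSTEM = "http://www.whocc.no/atc"

# Trimmed, hypothetical bundle; real resources carry many more fields.
BUNDLE_JSON = """
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {"resource": {
      "resourceType": "MedicationStatement",
      "status": "active",
      "medicationCodeableConcept": {
        "coding": [{"system": "http://www.whocc.no/atc", "code": "B01AC06"}]
      }
    }},
    {"resource": {
      "resourceType": "MedicationStatement",
      "status": "active",
      "medicationCodeableConcept": {
        "coding": [{"system": "http://example.org/local-codes", "code": "X1"}]
      }
    }},
    {"resource": {"resourceType": "Patient", "id": "example"}}
  ]
}
"""


def atc_codes_from_bundle(bundle):
    """Collect ATC codes from inline-coded medication statements."""
    codes = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "MedicationStatement":
            continue
        concept = resource.get("medicationCodeableConcept", {})
        for coding in concept.get("coding", []):
            if coding.get("system") == ATC_SYSTEM and coding.get("code"):
                codes.append(coding["code"])
    return codes


atc_codes = atc_codes_from_bundle(json.loads(BUNDLE_JSON))
```

Only the statement coded with ATC contributes a code here. Medications coded in other systems would need an additional mapping step, which is the kind of extension to the data integration pipelines that the text describes.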

In a future project, the concepts of all presented systems are planned to be merged into a single decision support system. Aside from the data integration needs, the entities and information from all systems need to be mapped. In most cases, this should be trivial where suitable identification systems are already present such as ICD-10 codes for diseases and ATC codes for medications. The development of an interaction check between KALIS and PIMBase with KATIS requires new information on the molecular composition and mechanics of remedies. This molecular data could then be used in the context of GraphSAW finding interactions between drugs and remedies.

Personalized medicine needs to analyze a patient as a whole, not only which medications they use or which side effects are present. Allergies, diet, physical activity, and potential use of remedies all need to be considered to provide the best and safest treatment possible and to reduce adverse drug reactions. Therefore, the combination of all presented tools could provide a basis for personalized medicine in the future.