1 Background

When a new patient presents to a clinical facility, the physician has to make several decisions, e.g., regarding the correct diagnosis and the appropriate treatment. Clinical decision-making comprises three integrated phases: diagnosis, severity assessment, and management. The physician makes the decision considering various factors, such as medical history, physical examinations, and their own experience [1]. However, decision-making for an individual patient has become more challenging due to the increasing amount of data from genetics, proteomics, and diagnostic imaging in Electronic Health Records (EHRs) [2]. In addition, many diseases have overlapping conditions. A single disorder can result in a wide range of signs and symptoms, and many disorders can result in similar signs and symptoms [1]. In the case of rare diseases, there is the additional challenge of a lack of expertise and resources, which leads to delayed or even incorrect diagnoses and treatment [3]. Human cognitive abilities are limited, and this increasing complexity of decision-making requires validated decision support systems and computerised support at different levels and stages of the treatment process [2].

While technical processes and information systems are used in a variety of forms and in a wide range of healthcare settings, a particular focus lies on decision support for medical professionals [4]. An application that assists in healthcare decision-making is often referred to as a Clinical Decision Support System (CDSS). The term CDSS is used repeatedly in this review and includes other clinical systems used for decision-making that are not explicitly referred to as CDSS. Importantly, the main purpose of a CDSS is not to take the decision away from the medical professional entirely. The system is only intended to assist during the decision-making process. Using both their own knowledge and the system’s information, the professional can analyse the patient’s data better than either the human or the CDSS could alone. The CDSS should therefore have a positive impact on the quality of clinical decisions in hospitals, general practice, and other healthcare settings [5].

Computer-assisted Case-based Reasoning (CBR) is a form of technical assistance that solves the problem of a current target case with a methodology based on solutions to similar problems from the past [6]. CBR is applied in medical as well as various non-medical settings [5]. CBR can assist in the process of detecting a disease and help the medical professional make an appropriate decision. It is a problem-solving methodology inspired by the decision-making procedure of the human brain and is defined as a four-step process: Retrieve, Reuse, Revise, Retain [7].

In a CDSS based on CBR, an individual patient’s medical file is first matched against a computerised clinical knowledge database (Retrieve). The user of the software then receives patient-specific assessments and recommendations to support decisions regarding, for example, diagnosis or treatment (Reuse). Based on the output of the system, the medical professional decides which information and recommendations are relevant and which are irrelevant to the target case (Revise). After revision, the target case is included into the original case base (Retain). This is the learning phase of CBR, as new knowledge about the target case and its possible diagnosis and treatment is acquired after each cycle. In practice, CBR systems are in many cases limited exclusively to the retrieve step [8].
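The four steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of any system covered by this review: the names `Case`, `CaseBase` and `manhattan`, the feature names, and the use of a simple majority vote in the Reuse step are all hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional

Features = Dict[str, float]


@dataclass
class Case:
    """A past case: numeric features plus the recorded solution (hypothetical layout)."""
    features: Features
    solution: str


def manhattan(a: Features, b: Features) -> float:
    """Sum of absolute differences over the features of `a` (assumed present in `b`)."""
    return sum(abs(a[name] - b[name]) for name in a)


class CaseBase:
    """In-memory case base illustrating Retrieve, Reuse, Revise and Retain."""

    def __init__(self) -> None:
        self.cases: List[Case] = []

    def retrieve(self, target: Features, k: int = 3) -> List[Case]:
        # Retrieve: the k stored cases closest to the target case.
        return sorted(self.cases, key=lambda c: manhattan(target, c.features))[:k]

    def reuse(self, neighbours: List[Case]) -> Optional[str]:
        # Reuse: propose the most frequent solution among the retrieved cases.
        if not neighbours:
            return None
        return Counter(c.solution for c in neighbours).most_common(1)[0][0]

    def revise(self, proposed: Optional[str],
               expert_decision: Optional[str] = None) -> Optional[str]:
        # Revise: the professional's decision, if given, overrides the proposal.
        return expert_decision if expert_decision is not None else proposed

    def retain(self, target: Features, confirmed: str) -> None:
        # Retain: store the confirmed case so that later retrievals can use it.
        self.cases.append(Case(dict(target), confirmed))
```

A typical cycle would call `retrieve`, `reuse`, `revise` and `retain` in that order for each new target case. A system limited to the retrieve step, as mentioned above, would stop after the first call and show the retrieved cases to the user.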

Although it could enhance the overall quality of patient care [9, 10], the impact of CBR in the domain of clinical decision support has not been reviewed in depth in the literature. So far, there are only reviews that roughly outline this topic or deal with it in a very general manner. For instance, Narindrarangkura et al. focused on the larger topic of Artificial Intelligence (AI) in CDSS [11]. Kong et al. examined knowledge representations and inferences in CDSS without addressing the specifics of CBR [12]. El-Sappagh et al. examined CBR frameworks with a focus on applications for diabetes mellitus and compared medical with non-medical uses [13]. Therefore, a more comprehensive overview of CBR for clinical decision support is necessary in order to map the research performed in this area, to reveal knowledge gaps, and to give clinicians and researchers an overview of developments and current systems regarding this topic. The scoping review presented here focuses on disease groups, application areas, data scope, similarity metrics, data interoperability (interaction of heterogeneous systems for efficient data exchange), development status, and expert validation of medical CBR systems. Possible future research approaches and limitations are also considered.

The objective of this scoping review is to answer the following five research questions: (1) How has the research on “CBR in CDSS” evolved in recent years, compared to the general research on “AI in CDSS”?, (2) Which disease groups are dominantly addressed by CBR systems and for what purpose?, (3) What type and volume of data is collected and in what form is data exchanged within the CDSS?, (4) Which similarity metrics are used to retrieve past cases?, (5) Are the systems validated by human expertise?

2 Methods

The reporting of this scoping review follows the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) extension for Scoping Reviews (PRISMA-ScR), a guideline commonly used in medical research to report scoping reviews. Accordingly, this review was designed using the PRISMA-ScR 2018 Checklist (AF1 PRISMA Checklist: https://osf.io/2xwcv) [14]. We considered 20 out of 22 checklist items.

After the identification of the objectives, the following five steps were applied in this scoping review: (2.1) identification of relevant keywords, (2.2) conducting the search query, (2.3) selection of eligible studies, (2.4) collection of data, and (2.5) compilation and summary of results.

In the preparation of this paper, a review protocol was created and uploaded to the Open Science Framework (OSF) (https://osf.io/xauvt) [15]. Author RN prepared the protocol in March 2022, and all other authors approved it on April 25, 2022. The protocol was published on April 27, 2022. The five-step process of the applied methodology is presented below.

2.1 Identifying Relevant Keywords

To identify keywords relevant to the search query in Sect. 2.2, the procedure described below was proposed by author RN and approved by all authors. An initial search was performed via PubMed, a database for medical publications of the National Library of Medicine (NLM). In PubMed, the general term “case-based reasoning [Text Word]” was searched for. The suffix Text Word defines the search scope for the preceding string, including titles, abstracts, Medical Subject Headings (MeSH), and other terms. MeSH terms belong to the NLM’s controlled vocabulary and are used for indexing articles in PubMed [16]. This initial search, conducted in February 2022, yielded 179 results for the years between 2011 and 2021. These articles were exported from PubMed and imported into an online review tool named Rayyan [17], where a filter option was used to automatically identify the most frequently occurring topics and keywords with which the publications were indexed. Terms related to the broader topic of decision or diagnostic support were selected. This also included MeSH terms indexed in PubMed.

2.2 Conducting the Search Query

The search strings were determined from the pre-identified frequent terms, and the search query for PubMed was defined with the Boolean operators “AND” and “OR”, as shown in Fig. 1.

Fig. 1. Search query for the PubMed database

The search query was also performed on Web of Science (WoS). However, since MeSH terms are not used in WoS, the suffixes in the square brackets of the search terms were removed for the search. The field tag “TS” was used for the search terms, which searches the terms in title, abstract, author keywords, and KeyWords Plus (interdisciplinary search for all articles that have common citations) [18]. The category of “medical informatics” was selected to filter out non-medical and non-technical content. The rest of the search was carried out in the same way as in PubMed.

Articles published in the period from January 01, 2011 to February 28, 2022 were considered in the search. The results of the PubMed and WoS queries were subsequently merged in Rayyan and checked for duplicates, which were removed.

Another search query for “AI in CDSS” was conducted on March 11, 2022 to compare research development in this general area with the research on “CBR in CDSS”. This search was done via PubMed and WoS for the years 2011–2021. For this purpose, the two terms “Artificial Intelligence” and “Clinical Decision Support” were combined with the Boolean operator “AND”.

2.3 Screening of Identified Publications

To check the eligibility of the articles resulting from the search query, two rounds of screening were conducted: a screening based on bibliographic data (title and abstract) and a full text screening. The eligibility criteria for the title and abstract screening, consisting of five questions, were listed in a screening form, as shown in Table 1. All authors approved the form.

Table 1. Five step screening form for title and abstract screening with ‘Inclusion’ and ‘Exclusion’ criteria

The title and abstract screening was performed by RN and JS via Rayyan. In Rayyan, the authors involved in the screening can decide whether to “include” or “exclude” a publication. There is the additional option of selecting “maybe” and taking the decision later and/or in cooperation with the other author. To avoid any mutual influence, each author’s decisions are not visible to the other during the joint review process. At the end, the decisions of the authors are evaluated together, and any conflicts that arise are discussed and resolved by all authors.

The publications qualified for full text screening were exported from the Rayyan tool and inserted into the literature management system Citavi (citavi.com).

Full text screening was carried out via Citavi by the author RN in a similar way to the title and abstract screening. This process was reviewed by JS. The eligibility criteria for the full text screening, consisting of three questions, are listed in Table 2. The result of the full text screening was discussed with all authors, and after general agreement, the data items were extracted.

Table 2. Three step screening form for full text screening with ‘Inclusion’ and ‘Exclusion’ criteria

2.4 Data Extraction

The data extraction was first carried out for 20 publications by the author RN. The result was then discussed and agreed with the author JS. After this consultation, the data extraction was continued by RN, and the list of data items shown in Table 3 was continuously updated and adjusted (AF2 Data extraction sheet: https://osf.io/2xwcv).

Table 3. Data extraction sheet with specified variables and their definition. The suffix ‘(Y/N)’ denotes results separated into a ‘Yes’ (is described) and a ‘No’ (is not described) category. The categorisation into disease groups is based on the 21 health categories of the UKCRC Health Research Classification System [19].

2.5 Visualisation and Summarisation of Results

At the completion of the data extraction, the gathered data items (see Table 3) were summarised and visualised to present the results in this publication. A flow chart according to PRISMA-ScR was chosen to illustrate the selection of studies. To display the extracted data items, timelines (Publication year), bar charts (Disease Group, Type of clinical data, Type of similarity measure) and pie charts (Medical application, Patient number, Data Interoperability, Expert validation) were created.

3 Results

The process of literature selection is shown in Fig. 2 in the form of a flowchart. For the period from 2011 to February 2022, 101 publications in PubMed and 62 publications in WoS were identified. After removing the duplicates, 125 records remained. These records were then checked for eligibility. A further 24 articles were excluded because either no full text was available or no medical context or CBR process was described. At the end of the screening, 66 studies remained that were eligible for data charting. A list of all included articles and extracted data items can be accessed on the Open Science Framework (AF3 Included articles: https://osf.io/2xwcv).

Fig. 2. Literature flow diagram describing records identification and screening

3.1 Countries and Years of Publications

Most studies were conducted wholly or partially in China (n = 10), France (n = 7), Germany (n = 9), Spain (n = 10), the United Kingdom (n = 13), and the USA (n = 6). The publication timeline in Fig. 3 (i) shows the years in which the articles were published. Research on “CBR in CDSS” has stagnated somewhat in recent years. The number of research papers declined from 2011 to 2013 and increased again in 2014 and 2015. In the following years until 2021, the number of publications decreased slightly overall, with minor fluctuations from one year to the next. For the beginning of 2022 (until February 28, 2022), no relevant paper had been published yet; therefore, the diagram has no data point for that year. For comparison, Fig. 3 (ii) shows the results of a search query for “AI in CDSS” conducted on March 11, 2022.

Fig. 3. Publication timeline representing the number of publications for (i) research on “CBR in CDSS” and (ii) research on “AI in CDSS” over the period 2011–2021

3.2 Disease Categories

In terms of disease groups (see Fig. 4), cancer and neoplasm (n = 25) is the category most frequently targeted by CBR applications. Metabolic and endocrine diseases (e.g., diabetes mellitus) (n = 13) are the second most frequently addressed. Generic health relevance (e.g., elder health) (n = 8), oral and gastrointestinal (e.g., hepatitis) (n = 7), and cardiovascular conditions (e.g., high blood pressure) (n = 7) rank below them with similar frequencies. The categories musculoskeletal, reproductive health and childbirth, neurological, infection, and blood are rarely examined (n = 1).

There are some articles in which the same authors have conducted several consecutive studies. For example, in the field of insulin bolus advice (metabolic and endocrine), five papers, and in the area of breast cancer (cancer and neoplasm), three papers have been published consecutively. The rest of the data set consists of either individual articles or papers with only one subsequent publication.

Fig. 4. Different disease categories identified in the studies and ranked by frequency of occurrence in these studies

3.3 Clinical Data, Application, and Patient Number

CBR systems often use demographic (n = 29) and historical data from the patient or family record (n = 30) as input. However, this information is often part of an extended data stream, e.g., demographic data combined with image and test data. Figure 5 shows the remaining clinical data input used in the identified 66 publications.

Fig. 5. Different types and frequencies of clinical data used as input to the CBR systems described in the studies. When interpreting the frequencies of the disease groups considered, successive publications by the same group of researchers must be taken into account.

Regarding medical applications, 90% (n = 61) of the CBR systems were designed for therapeutic/treatment or diagnostic purposes. Only a small percentage (n = 7) was devoted to basic research and other decision support, as shown in Fig. 6 (i).

More than 50% (n = 40) of all studies use data sets with more than 100 patient records. The largest single group of studies (n = 24) uses data sets of between 100 and 1000 records. Fewer than 20 patient records are used in 12% of the studies, and between 20 and 99 patient records in 18% of the studies, as shown in Fig. 6 (ii).

Fig. 6. Pie chart of (i) the percentage of the medical application and (ii) the different database sizes in the studies

3.4 Similarity Metric

Among the similarity metrics used for analogy search in the retrieve step of CBR, the Euclidean distance appears frequently in the articles with 25 occurrences. In many cases, publications design a similarity function that is not specified or labelled, but often resembles the similarity and distance metrics listed in Fig. 7. In addition, weights are often assigned to the input variables, as certain parameters have a higher influence on the target variable than others [20]. The setting of weights must be considered when calculating the distances, as they can significantly impact the result of the similarity measurement. Weights are typically selected by experts [21] or determined by other methods such as equal weight techniques [22] or Machine Learning algorithms including decision trees and genetic algorithms [20, 23].
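To make the role of feature weights concrete, the following minimal sketch computes a weighted Euclidean distance, d(x, y) = sqrt(Σ w_i (x_i − y_i)²), and converts it into a similarity score. It is an illustration under stated assumptions rather than the function of any reviewed system: the function names, the mapping 1 / (1 + d), and the assumption that all features are already scaled to a comparable range (e.g., [0, 1]) are choices made for this example.

```python
import math
from typing import Sequence


def weighted_euclidean(x: Sequence[float], y: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted Euclidean distance between two equally long feature vectors."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def similarity(x: Sequence[float], y: Sequence[float],
               weights: Sequence[float]) -> float:
    """Map a distance to (0, 1]; identical cases get 1.0 (one common convention)."""
    return 1.0 / (1.0 + weighted_euclidean(x, y, weights))


# Hypothetical target case and two stored cases with two normalised features.
target = [0.5, 0.5]
case_a = [0.5, 0.9]   # differs only in the second feature
case_b = [0.8, 0.5]   # differs only in the first feature

equal_weights = [1.0, 1.0]
first_feature_dominant = [0.9, 0.1]

# With equal weights case_b is the nearer case; once the first feature is
# weighted more heavily, case_a becomes the nearer one.
nearest_equal = min([case_a, case_b],
                    key=lambda c: weighted_euclidean(target, c, equal_weights))
nearest_weighted = min([case_a, case_b],
                       key=lambda c: weighted_euclidean(target, c, first_feature_dominant))
```

The example shows why the choice of weights matters: the same case base returns a different most similar case depending solely on the weights.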

3.5 System Properties and Validation

Figure 8 visualises expert validation (i), and data interoperability (ii) of the 66 studies identified.

About 24% (n = 16) of the systems designed in the studies were validated by experts. A share of 14% (n = 9) deals with data interoperability. Five of the studies dealing with interoperability focus on international standards for data exchange such as Health Level 7 (HL7) or Fast Healthcare Interoperability Resources (FHIR).

Fig. 7. Frequencies of similarity metrics used in the studies to measure analogies in the retrieval step of CBR processes

Fig. 8. (i) Expert validation and (ii) data interoperability plotted as pie charts to illustrate the relative shares. The exact values are stated in the corresponding text section.

4 Discussion

The scoping review presented here aims to provide an overview of the current and past development of CBR systems in clinical decision-making. The different systems in the eligible 66 publications of the years 2011–2021 were reviewed and analysed.

The described CDSS are predominantly used for diagnostic, therapeutic and treatment applications in the cancer and neoplasm domain and are most researched in China, Germany, Spain, and the United Kingdom. The studies mainly use large data sets (>100 patient records), which mostly contain demographic and patient data. A large number of cases allows the CBR algorithms to be trained sufficiently and to capture enough feature characteristics, just as physicians benefit from different attributes in differential diagnoses [24].

To measure the similarity between past case data and the current case, systems often use Euclidean distance. However, many studies introduce similarity and distance metrics that are not further labelled. Similarity metrics can also be used for different purposes, for example, to measure the similarity of numerical (e.g., Euclidean distance) or categorical data (e.g., Hamming distance) [25]. Feature weighting is not to be neglected and is of decisive importance in distance measurement [20]. Furthermore, a validation by experts and medical professionals has been conducted for only 16 of the systems. Data interoperability is discussed in 14% of the studies.
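The distinction between numerical and categorical data can be illustrated with a short sketch. The Hamming distance counts the positions at which two categorical vectors differ. The second function, which simply adds a Euclidean part and a Hamming part, is a deliberately crude assumption for illustration and is not taken from any of the reviewed studies; the function names and example values are likewise hypothetical.

```python
import math
from typing import Sequence


def hamming_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of positions at which two categorical vectors differ."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    return sum(1 for u, v in zip(a, b) if u != v)


def mixed_distance(num_a: Sequence[float], num_b: Sequence[float],
                   cat_a: Sequence[str], cat_b: Sequence[str]) -> float:
    """Euclidean distance on numerical features plus Hamming distance on
    categorical features (an illustrative, unweighted combination that
    assumes the numerical features are scaled to a comparable range)."""
    if len(num_a) != len(num_b):
        raise ValueError("numerical vectors must have the same length")
    numeric = math.sqrt(sum((p - q) ** 2 for p, q in zip(num_a, num_b)))
    return numeric + hamming_distance(cat_a, cat_b)


# Hypothetical categorical attributes: sex, smoking status, blood group.
patient_1 = ["male", "smoker", "A"]
patient_2 = ["male", "non-smoker", "B"]
differences = hamming_distance(patient_1, patient_2)  # two attributes differ
```

In practice, the two parts would usually be normalised and weighted before being combined, so that neither data type dominates the overall distance.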

Not all 66 selected publications and the systems they describe can be discussed in detail within the scope of this review. However, this review shows clinicians and software developers what is known about CBR for CDSS and points to knowledge gaps in previous research that can be considered in future investigations. One gap is the lack of expert validation, or the difficulty of carrying out such validation due to the enormous effort it requires for larger data sets. However, expert validation can determine the usefulness and comprehensibility of decision support systems for users and health professionals [26]. Another point that has been poorly addressed in the studies is the interoperability of data. By integrating data exchange standards such as HL7, a software module can be made independent of the overall system and can therefore be deployed and expanded as required without the need for special adaptations [27]. Merging CDSS and clinical information systems that interact with the EHR can build a virtual health record (VHR) and a homogeneous framework for modelling clinical concepts [28].

Most studies also mention issues and limitations for research and development in the clinical CBR domain. One limitation highlighted in some studies is that data sets become increasingly incomplete as databases grow larger. This affects the stability of the similarity ranking. Löw et al. deal with the handling of missing values in data, which is of critical importance when integrating training data into the CBR algorithm [8]. A further topic to be considered is the efficient retrieval of similar cases in the database to avoid long run times, where the use of cloud computing is a possible workaround [29]. The method of assigning weights for the similarity metrics should also be carefully evaluated. When weights are selected by experts, it should be noted that this selection can be very subjective [30]. The genetic algorithm approach has proven useful in many studies for the development of a weighted similarity function, for example in El-Sappagh et al. [23] and Yin et al. [31]. While most of these limitations are of interest to software developers, they are less relevant on the physician and user side. Because this review is also intended to address medical practitioners, these limitations were not investigated across all papers, but they should not be neglected when developing such systems.
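One generic way to keep a distance measure usable when values are missing is to compare only the features that are present in both cases and to rescale by the weight that was actually used. The sketch below illustrates this idea; it is an assumption made for this example and is not the method of Löw et al. [8] or of any other reviewed study. The function name and the use of `None` to mark a missing value are hypothetical.

```python
import math
from typing import Optional, Sequence


def distance_with_missing(x: Sequence[Optional[float]],
                          y: Sequence[Optional[float]],
                          weights: Sequence[float]) -> Optional[float]:
    """Weighted Euclidean distance over features present in both cases.

    Features missing (None) in either case are skipped, and the squared sum
    is divided by the total weight of the features actually compared.
    Returns None if the two cases share no usable feature.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    squared_sum = 0.0
    used_weight = 0.0
    for a, b, w in zip(x, y, weights):
        if a is None or b is None:
            continue  # skip features that cannot be compared
        squared_sum += w * (a - b) ** 2
        used_weight += w
    if used_weight == 0.0:
        return None
    return math.sqrt(squared_sum / used_weight)
```

Skipping features in this way keeps sparse cases comparable, but a distance based on very few shared features is less reliable, which is one reason why missing values can destabilise the similarity ranking.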

While general research on the use of AI in the field of medical decision support has increased significantly in recent years (especially in the five years from 2016 to 2021), research in the field of CBR for CDSS has stagnated, cf. Fig. 3. However, CBR has a decisive advantage over other AI algorithms due to its explainable and customisable character [13, 24]. Many AI systems developed today, especially neural networks, are “black boxes” where the user of the system cannot reconstruct how the decision was obtained [32]. This makes it difficult or impossible for the user to follow the system’s decisions and thus diminishes the system’s credibility.

CBR systems, on the other hand, not only output simple diagnostic suggestions but also allow the user to view local and global similarity searches and to set weights, metrics, and parameters to optimise retrieval [30]. A key point is the output of therapy and treatment recommendations from past cases, which is useful as additional input for the further management of a patient [33]. Furthermore, the system can learn from the user’s continuous evaluation of the proposed solutions [13]. In future healthcare applications, all the above factors could turn CBR into a reliable tool for clinical decision-making.

5 Conclusion

This scoping review discusses in detail the research and findings of the literature on CBR in clinical decision-making since 2011. Computer-assisted CBR is used to support medical practitioners in the diagnosis and treatment of different diseases. The choice of input data, similarity metric, and patient cohort is critical to the reliability of the application. In future research, the consideration of data interoperability and expert validation could be a key factor in making such case-based support systems operational for day-to-day clinical practice. Through a user-oriented approach, CBR could become an effective tool in the ever-increasing digitalisation of healthcare.