Introduction

Advances in knowledge have complicated the decision-making process, which puts more emphasis on implementing information systems to support everyday decisions. Computer systems are therefore commonly used in many fields to support decision-making. One type of computer-based information system is the artificial intelligence (AI) system [1]. According to experts, the field of AI has two purposes: studying human thinking processes, and representing those processes in machines such as computers and robots. In other words, AI aims to develop computers and machines (software and hardware) capable of showing human-like behaviors, which are known as intelligent behaviors [1, 2]. Human intelligence, in general, is related to purposeful behaviors that include the ability to understand complicated situations and respond properly, to learn, to gain knowledge, and to apply reasoning to solve problems. The aim of AI is therefore to simulate human thinking and reasoning processes in machines. These machines are expected to comprehend complicated situations, respond properly, learn, acquire knowledge, and apply reasoning to solve problems.

Clinical decision-making entails unique complexities. The expansion of medical knowledge and the intrinsic complexity of clinical decisions, which concern human life, have led experts to consider using AI systems in medicine. Consequently, much research is under way on the use of different AI systems for diagnosing and treating diseases. These works have studied the performance of expert systems, case-based reasoning, artificial neural networks, genetic algorithms, fuzzy logic, and combinations of these systems and techniques, such as fuzzy neural networks and fuzzy expert systems [3–6].

There are two main reasons for the spread of expert systems: transferring the expertise of experts to other staff, and preserving the knowledge of experts who may leave the organization. Expert systems were designed to simulate the reasoning processes of experts [2]. The concept of an expert system rests on the assumption that the knowledge of domain experts can be entered into a computer system and retrieved when required. The system operates on domain knowledge bases (in the form of rules, for example), the facts about the problem to be solved, and reasoning mechanisms [7, 8].
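
The three components named above (a rule base, the facts of the case, and a reasoning mechanism) can be illustrated with a minimal sketch. It assumes forward chaining as the reasoning mechanism, which is only one of several possibilities, and the names `Rule`, `forward_chain`, and the toy headache rules are hypothetical illustrations, not taken from any of the systems cited here and not clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """IF every condition is a known fact THEN assert the conclusion."""
    conditions: frozenset
    conclusion: str


def forward_chain(facts, rules):
    """Reasoning mechanism: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Knowledge base: hypothetical toy rules, for illustration only.
RULES = [
    Rule(frozenset({"unilateral_pain", "nausea"}), "suspect_migraine"),
    Rule(frozenset({"bilateral_pain", "pressing_quality"}),
         "suspect_tension_headache"),
    Rule(frozenset({"suspect_migraine", "visual_aura"}),
         "suspect_migraine_with_aura"),
]

# Facts about the problem to be solved (one hypothetical patient).
case_facts = {"unilateral_pain", "nausea", "visual_aura"}

derived = forward_chain(case_facts, RULES)
print(sorted(derived - case_facts))
```

Keeping the rules as data, separate from the inference loop, is what allows the knowledge base to be updated without touching the reasoning mechanism.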

The main idea of artificial neural networks (ANNs) is to simulate the functions of neurons by developing artificial neurons. Neural networks are models of the human brain that simulate the way neurons interact to process data and learn from experience [2, 9]. Real cases are used to train the system so that it discovers the relationships between the variables; this is where learning takes place. Throughout training, the system learns which features of the inputs are most related to the output and weights the relationships according to their importance. The system can then be used to deal with new situations [8, 10].
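
How training assigns weights to inputs can be shown with a minimal sketch of a single artificial neuron. It assumes a sigmoid activation and plain stochastic gradient descent, and it uses synthetic toy cases in which only the first input determines the output; the function names and parameter values are assumptions chosen for illustration, not the method of any study cited here.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one artificial neuron for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(cases, labels, epochs=2000, learning_rate=0.5, seed=0):
    """Adjust weights case by case so predictions move toward the labels."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in cases[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(cases, labels):
            # Gradient of the log-loss with respect to the pre-activation.
            error = predict(weights, bias, x) - y
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Synthetic training cases: the label equals the first input,
# the second input carries no information.
CASES = [(0, 0), (0, 1), (1, 0), (1, 1)]
LABELS = [0, 0, 1, 1]

weights, bias = train_neuron(CASES, LABELS)
print("learned weights:", weights)
print("new case (1, 0):", predict(weights, bias, (1, 0)))
```

After training, the weight on the informative input should dominate the weight on the uninformative one, which is the sense in which the system "weights the relationships according to their importance".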

These two types of systems have achieved promising results in medicine. However, their use is not free of challenges, and many questions remain to be answered. The present study focuses on these challenges.

Sample expert systems and artificial neural networks in medicine

Although there are many applications of expert systems and neural networks in medicine, we provide a few examples of such applications in the following section and Table 1.

Table 1 Some examples of expert systems and artificial neural networks successfully developed for clinical purposes

First aid

In one study, a rule-based expert system was developed to improve the quality of first aid services. The results showed that the system improved performance. For instance, 98 % of users checked the patient’s airway correctly, whereas only 36.5 % of non-users did. Overall, the scores obtained by non-users and users were 14.8 and 22, respectively (p < 0.01). In almost all operations, users performed 3.8 to 70.1 % better [11].

Diagnosing types of headache

In another study, an expert system was used to diagnose and differentiate various types of headache, including migraine and tension headache. Using an interface, patients answered questions about the severity, frequency, duration, and laterality of the headache, among other things. The assessments showed that the system correctly diagnosed 94.4 % of migraine cases (including tension headache) and 93 % of daily syndromes. On average, the accuracy of the system was 98 % [13].

Differentiating heart beats

In one study, a rule-based expert system was used to diagnose and classify arrhythmic and ischemic heart beats. QT and ST-T information from the electrocardiogram (ECG) was used for arrhythmic beats, and the R-R interval from the tachogram was used for ischemic heart beats. The results showed 90.4 % accuracy in diagnosing ischemic heart beats and 94.4 % accuracy for arrhythmic heart beats [21]. In another example, an automated method was developed for classifying ECG heart beats. Five types of heart beat were considered: normal, PVC (premature ventricular contraction), LBBB (left bundle branch block), RBBB (right bundle branch block), and APC (atrial premature contraction). The researchers extracted features from the ECG and developed a wavelet neural network (WNN). The sensitivity or specificity for classifying these heart beats ranged from 98.51 to 99.23 %, and the overall accuracy was 98.78 % [20].

Diagnosing strabismus

In one study, an artificial neural network was used to diagnose strabismus. The system was web-based (www.strabnet.com), so that a physician could send the information to the system after examining a patient. The system made the diagnosis based on the input data and the weights of the variables. The results showed an accuracy of 100 % on real data [15].

Diagnosing hepatitis outcomes and fatality

In one study, a multilayer neural network (with 19 inputs) trained with the Levenberg-Marquardt algorithm was developed to predict the outcome of hepatitis (survival or death). The classification accuracy of the system was 91.87 % [10]. Other researchers, using the same data, developed a probabilistic neural network, whose accuracy was 91.2 % [16].

Patients’ anatomy for radiotherapy

To optimize treatment variables in radiotherapy, it is essential to take the anatomy of the patient into account. In one study, an expert system was used to determine the clinical variables of new patients. In addition, an artificial neural network with six neurons was used to optimize the medical variables of each patient. The network was trained using the data of patients who had received treatment from five dosimetrists. The results showed that 96 % of the radiotherapy programs recommended by the system were acceptable in comparison with the treatment implemented by the dosimetrists [8].

Chinese acupuncture expert system

In 2007, researchers developed an expert system to prescribe the most appropriate Chinese acupuncture treatment for a particular disease syndrome. The system provides text and animated instructions on the location and insertion techniques of the acupuncture needle, together with a list of suggested diagnoses. It also provides a teaching animation for each acupoint position. Its application showed very satisfactory performance and increased practitioners’ confidence in applying needle insertion treatment [18].

Oncology

Expert systems and neural networks have been widely used to diagnose or differentiate cancers such as breast and lung cancers [19, 23–25]. For instance, a neural network was developed to differentiate malignant from benign breast tumors based on mammography results, without biopsies. The accuracy of the system was 70 % [26]. In another study, a decision support system was developed for diagnosing breast cancer from needle biopsy images. Color wavelet features were extracted from these images and modeled using a Support Vector Machine (SVM), a Naïve Bayes classifier (NBC), and an ANN. The overall classification accuracy was 98.3 % for the SVM classifier, and the accuracies of the NBC and ANN were 93.5 and 92.5 %, respectively [19]. Neural networks have also been used for diagnosing and staging urinary system cancers; the results showed high accuracy for such systems [27].

Discussion

Clinical decision-making, diagnosis, and treatment of diseases are complicated tasks, and in some cases even experts cannot reach agreement. AI systems can be of great help to physicians in this regard. In addition to the sample studies mentioned above, other notable successes include the use of fuzzy expert systems to develop an electronic stethoscope [28]; the use of expert systems to diagnose thyroid diseases [29, 30], pulmonary embolism [31], arrhythmia [32], and encephalopathy [33]; the use of AI systems in the ICU [9]; the use of ANNs to diagnose cirrhosis [34]; and the application of an adaptive neuro-fuzzy inference system to diagnose heart valve diseases [35]. Using these systems for medical purposes has both advantages and challenges.

Advantages of expert systems and neural networks

Medical AI solutions help physicians take more variables into account when diagnosing. A physician may fail to consider all the variables (e.g. the results of some tests) in decision-making because the capacity of the human brain to remember all the variables is limited; as a result, the physician may not even look for such information, or may underestimate or overestimate the weights of the variables. In contrast, since the relationships between the variables are built into the design of the system, all the variables are weighted properly. Therefore, depending on the accuracy of the weights assigned to the variables, more accurate clinical decisions can be expected from the systems. Furthermore, many unknown variables add to the complexity of clinical decision-making. Given the capacity of AI, such systems can reach highly accurate decisions in such situations. For instance, before referring a myocardial infarction patient to another hospital, many variables must be taken into account (e.g. the general condition of the patient, the distance the patient can safely travel, and so on), and each variable is itself affected by many factors. An expert system, by taking all these factors into account, can reach more reliable decisions [5].

Access to the knowledge and experience of experts (clinical guidelines or expert knowledge) is another advantage of AI systems. Some experts possess a great deal of unique information and knowledge that cannot be found with other physicians. Expert systems are an easy way to transfer such knowledge to inexperienced staff [7]. In this way, new staff gain more confidence in doing their job [11, 18]. This knowledge can also be shared with physicians in developing countries through the internet and web-based systems [15, 17]. The same applies to the experience of physicians who have had the chance to deal with extraordinary, local, or uncommon cases about which other physicians have no knowledge or experience. Neural networks can deal with situations in which there are unknown variables or unknown relationships between variables. Therefore, using such systems, physicians can make decisions that take unknown variables into account.

Other advantages of these systems are that physicians can spend more time evaluating decisions rather than making them, that decisions are more consistent, and that processes are shorter. Because clinical decision-making involves several overlapping variables, physicians can reach the final decision faster using AI systems. In addition, the systems help ensure that different physicians come to the same decisions. In short, the systems provide recommendations at any time and place, and analyses are conducted faster. This is vital in clinical situations. For instance, neural networks correctly diagnosed malignant or benign breast tumors with 70 % accuracy without biopsy [26]. Some studies have shown that neural networks were very effective in predicting cancer metastasis [27]. This capability for prediction and faster decision-making helps physicians take timely measures and achieve better results. Therefore, such systems can reduce cost, time, the need for human expertise, and medical errors [7].

Challenges of expert systems and neural networks

In spite of the many advantages of AI systems, using them is not free of challenges. To achieve better results, these challenges must be taken into account. Some of them are noted below.

Data entry: before such systems can be used, the patient’s data must be entered. That is, the physician has to enter the data once in the patient’s record (manually or electronically) and once in the AI system. The need to enter the data into several systems may hamper the use of AI systems, unless the data in the patient’s record is in an electronic format (electronic medical records) and can be used by the AI systems as well. Therefore, the AI system should be well integrated with electronic medical record (EMR) systems, so that the AI system can retrieve the needed inputs from the EMR and store its outputs in the EMR [7, 18].
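
The integration described above, in which the AI system reads its inputs from the EMR and writes its outputs back so that nothing is entered twice, can be sketched as follows. Everything here is a hypothetical stand-in: `InMemoryEMR`, its two methods, the field names, and `toy_model` are assumptions made for illustration, and a real EMR would expose its own interface and data standards.

```python
class InMemoryEMR:
    """Hypothetical stand-in for an EMR system (not a real EMR interface)."""

    def __init__(self, records):
        self._records = records

    def get_fields(self, patient_id, fields):
        """Return only the requested fields of one patient record."""
        record = self._records[patient_id]
        return {name: record[name] for name in fields}

    def store_result(self, patient_id, key, value):
        """Write a result back into the same patient record."""
        self._records[patient_id][key] = value


def run_decision_support(emr, patient_id, required_fields, model):
    """Pull inputs from the EMR, run the model, store the output in the EMR."""
    inputs = emr.get_fields(patient_id, required_fields)
    recommendation = model(inputs)
    emr.store_result(patient_id, "ai_recommendation", recommendation)
    return recommendation


def toy_model(inputs):
    # Placeholder rule for illustration only; not clinical guidance.
    return "review_needed" if inputs["systolic_bp"] >= 180 else "no_flag"


emr = InMemoryEMR({"p1": {"age": 64, "systolic_bp": 185}})
result = run_decision_support(emr, "p1", ["age", "systolic_bp"], toy_model)
print(result)
```

The physician enters data once, in the record; the decision support step never asks for it again.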

Knowledge acquisition: before designing an expert system, the experts must be identified. That is, who is an expert? Are the experts eager (or reluctant) to participate in the knowledge engineering process? Is there disagreement (or agreement) among the experts in the field on how to solve the problem? How should the knowledge of the experts be acquired? How can one make sure that the knowledge provided by the experts is reliable, valid, and complete? In this regard, Kunhimangalam et al. argue that “the experts often feel difficulty in stating their knowledge in an orderly and logical manner or sometimes even in understanding their own decision making processes” [7]. Similar challenges arise in designing neural networks: which cases should be used to train the system? Which variables of the cases should be used in training? Should the experts who will use the system determine the variables, and if so, how? Obviously, the chosen input variables influence the accuracy of the system [7]. For example, a study showed that the accuracy of an ANN for diagnosing tuberculosis with 21 input variables was 92 %, whereas another ANN with 38 inputs had 94 % accuracy [14].

Along with these questions, there is another to ask: is the knowledge up to date? In highly specialized medical fields, users do not trust the system when they perceive that it is not up to date. It is therefore necessary to be as careful as possible in finding the experts, and to keep the system in step with the fast development of medical science by adding new rules or by continuing to train the system with new cases [36].

Modeling the medical knowledge: another challenge that adds to the complexity of acquiring knowledge is the lack of standard methods for representing clinical concepts for a computer [36]. To develop a knowledge base, clinical conditions must be translated into computer language, which is not easy, because medical knowledge and its modeling are complicated. Some of the issues are: which patient data are pertinent to the decision? Which concepts must be considered in making the decision? How are these concepts related to each other? What strategies must be adopted to solve the problem? How must the knowledge be used in applying the strategies [36]?

Validity and evaluation of the system: another key point regarding AI systems is ensuring their reliability and accuracy. Performance is usually evaluated by comparison with a gold standard, and in many cases the experts are the gold standard. Such a standard is easier to establish for diagnosis support systems than for treatment recommendation systems, because it is difficult to reach consensus among experts on the way of treatment [36]. A further key question is which experts should participate in evaluating system performance: those who designed the knowledge base, or others. If the performance of the system is evaluated by the same experts who participated in knowledge acquisition, the system may appear to perform better. Conversely, when the evaluation is done by a third expert, disagreement with the system (low accuracy) might be due to the procedure prescribed by the designers of the knowledge base (that is, disagreement with the experts who participated in knowledge acquisition) rather than to the actual performance of the system. Another question is what criteria should be used to accept or reject the performance. Is 90 % accuracy acceptable for an expert system that recommends a diagnosis? The fact is that 90 % accuracy means a 10 % error rate in diagnosis and, consequently, in treatment.
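
What a single accuracy figure does and does not say can be made concrete with a short sketch that compares system outputs with a gold standard. The ten labels below are made up for illustration, and the function name `evaluate` and the label strings are assumptions; the definitions of accuracy, sensitivity, and specificity are the standard ones.

```python
def evaluate(system_outputs, gold_standard, positive="disease"):
    """Compare system outputs with gold-standard labels, case by case."""
    pairs = list(zip(system_outputs, gold_standard))
    tp = sum(1 for s, g in pairs if s == positive and g == positive)
    tn = sum(1 for s, g in pairs if s != positive and g != positive)
    fp = sum(1 for s, g in pairs if s == positive and g != positive)
    fn = sum(1 for s, g in pairs if s != positive and g == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "errors": fp + fn,
    }


# Made-up example: 10 cases, the system misses one of four diseased patients.
gold = ["disease"] * 4 + ["healthy"] * 6
system = ["disease"] * 3 + ["healthy"] * 7

metrics = evaluate(system, gold)
print(metrics)
```

In this made-up example the accuracy is 90 %, yet one in four diseased patients is missed (sensitivity 75 %), which is why an acceptance criterion based on accuracy alone may be insufficient.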

Selecting appropriate models, algorithms, and structure of the system: another challenge in developing AI systems is selecting the appropriate algorithms and structure, especially for classifier systems such as ANNs. Studies of ANNs developed for diagnosing tuberculosis have shown that the choice of variables, algorithms, and structure can raise the accuracy of the system from 77 to 94 %. For example, an ANN trained with the back-propagation algorithm showed 93.93 % accuracy, whereas another ANN developed with the same data but optimized with a genetic algorithm showed 94.88 % accuracy [14]. The need to consider the appropriate algorithms and structure of an ANN (such as the number of hidden layers and the number of nodes in each hidden layer) has been shown in other studies [10, 16, 19, 20].

Wrong recommendations and responsibility: the majority of expert systems and neural networks have no mechanism for checking accuracy. In addition to causing loss of trust in the system, this lack makes it difficult to determine who is responsible for the performance of the system. Indeed, the question is “who is responsible for the recommendations made by the AI systems?” Should it be the developer of the system, the experts engaged in knowledge acquisition, the knowledge engineer, or the physician who uses the system? It seems that the physician should be considered responsible as long as such systems are not meant to replace the physician. Physicians should decide whether or not to use such systems. They should also increase their knowledge of how such systems function, of their inputs and outputs, and of how to interpret the outputs; however, there is much controversy about responsibility for AI systems [37].

Limited clinical domains of AI systems: these systems are developed for a specific domain, for instance diagnosing a specific type of disease. This raises some questions: are these systems required for all types of diseases (all types of decisions)? Which specific types of diseases need such systems? Furthermore, the limited domain means that the knowledge included in the system is limited. The complexity of knowledge acquisition adds to this limitation even within a specific domain. These limitations decrease the accuracy of the system and the ability of physicians to make a decision. For example, in a real situation, a physician may decide to consult another specialist; however, the knowledge needed for such a decision may not be included in an AI system. Therefore, a physician who relies on an AI system may be restricted to the options considered in the system [37].

System integration: given what was discussed above, is it necessary to integrate different AI systems? How should such integration be carried out? Is there a need for a comprehensive and common knowledge base for different medical domains? Considering the problems of knowledge acquisition, how should such a knowledge base be developed? How should such AI systems be integrated with other information systems, such as EMR systems? It appears that such concerns have kept AI systems from advancing from research to practical use. As a result, many AI systems are not used in routine workflows, although the ideal is for such systems to become part of routine clinical workflows [7]; otherwise, their effect on the quality of clinical decision-making cannot be seen.

Conclusion

We found that AI systems have immense potential for improving clinical decision-making. However, along with the concerns common to any information system, successful implementation of AI systems in clinical domains depends on additional factors. These include a clear definition of the problem to be dealt with, an understanding of the scope, the engagement of appropriate experts and knowledge engineers, the development of knowledge bases on correct and complete modeling of medical knowledge, continuous updating, the participation of physicians in the development process, rigorous evaluation of the system, and integration of the system with patients’ electronic medical records.