Introduction

No single human brain can process and store all medical knowledge, including the continuous addition of new and important scientific information. Physicians learn about thousands of different diseases in medical school and are expected to remember and apply a substantial subset of these in daily practice. But it is impossible for an individual physician to keep current on the broad spectrum of new data and discoveries and to reliably recall and utilize that information at all relevant time points. This is part of a major challenge in medical imaging, where real-time errors are estimated to average between 3 % and 5 % and constitute nearly 75 % of medical malpractice claims [1•]. Graber et al. estimated that approximately 75 % of diagnostic errors were related to “cognitive factors” [2]. Diagnostic errors outnumber other medical errors by 2- to 4-fold and represent nearly 40 % of total ambulatory malpractice claims [3].

These cognitive errors include anchoring bias (being stuck on an initial impression), framing bias or faulty context generation (over-reliance on the specific way in which a question is posed), availability bias (tendency to jump to a conclusion based on a recent incident), satisfaction of search (not considering other possibilities once a probable answer is found), and premature closure (acceptance of an answer before it is verified) [1•]. Cognitive errors, such as premature closure and faulty context generation, have been implicated in 75 % of patient deaths in which physician/medical error was thought to play a role [4]. Machine learning and other “artificial intelligence” (AI) systems have the potential to be less susceptible to these biases and, despite their limitations, can serve in a complementary role to human decision makers.

According to a study from researchers at Johns Hopkins University, ~40,500 patients die in intensive care units each year in the United States as a result of diagnostic errors [5]. System-related factors, such as poor processes, teamwork, and communication, were involved in 65 % of these cases. These types of diagnostic problems contribute significantly to rising health care costs, with an estimated $300,000 per malpractice claim for misdiagnosis resulting from cognitive error or system-related factors [4]. It is anticipated that routine application of AI will decrease the risk of such errors. Such advanced technology may ultimately replace a substantial percentage, although certainly not all, of the work physicians do on a daily basis.

Effective, evidence-based medical practice requires that physicians be familiar with the most recent guidelines and appropriate use criteria. Because of the exponentially growing amount of information in peer-reviewed journals, textbooks, periodicals, consensus panels, and other sources, it is impossible for health care practitioners to keep up with more than a small fraction of relevant literature. Adherence to guidelines and evidence-based medicine may be made even more complex by the variability in “standards of practice” across different communities and states, a variability that complicates the concept of a “gold standard” for diagnosis and treatment of certain illnesses. Advanced computer systems, such as IBM’s “Watson” technology, could assist by providing the most up-to-date evidence-based information to inform proper patient care decisions. This information could combine data from a specific patient’s history with data from large numbers of other patients with similar disease manifestations.

What Is Artificial Intelligence?

AI has been defined as an area of study in computer science concerned with “the development of computers to engage in human-like thought processes such as learning, reasoning and self correction” [6]. The phrase “artificial intelligence” is believed to have been used first at a Dartmouth College Conference in 1956 [7••]. AI allows programmers and users to overcome the many constraints of traditional decision support approaches, such as rule-based systems, which include difficulty in rule formulation and challenges in updating new rules. These traditional systems, although created with expert input, do not exhibit human behaviors, such as reasoning, self-improvement, and constant learning. Despite extensive efforts and initial excitement, the application of AI has fallen short of its potential in medical applications. However, AI has experienced something of a renaissance within the past few years in nonmedical applications [7••].

For example, Siri, which was originally introduced as an Apple (Apple, Inc., Cupertino, CA) iOS application by Siri, Inc., has become embedded as a major feature of iPhones, starting with the 4S (as well as the third-generation iPad), and has become one of the most popular AI applications and, arguably, the best feature of the iPhone today. Siri serves as an intelligent personal assistant that can provide scheduling and information, such as time, weather, local restaurant facts, or directions, by connecting to the vast array of data available on the Internet.

By using a combination of speech recognition, natural language processing, and AI, Siri performs relatively mundane tasks that humans can do, such as look at a map or ask another person for directions, and for the most part, understands commands and performs with a minimum of errors. How does Siri limit mistakes? Directional navigation provides one example. When a person asks another person for directions, there is always a chance that the results could be misleading or incorrect. Because Siri is connected to the Internet, the application can access correct information and accurately direct the individual to the requested destination. Siri can also provide step-by-step support on the optimal route to reach the destination as well as time and distance required.

Although Siri is remarkable, its ability to respond to diagnostic or therapeutic medical questions is limited to its ability to initiate a search on the Internet. However, its bigger brother, Watson, has begun to be used in health care applications. IBM’s Watson, best known for its remarkable performance on Jeopardy!, provides many unique and transformative possibilities to resolve challenges associated with medical diagnosis and treatment. The Watson hardware, which costs approximately $3 million, can process 500 gigabytes/second, the equivalent of 1 million books [8]. Much as Siri helps guide users to the best route to desired destinations, AI applications in medicine, such as Watson, may help physicians navigate through a complex set of patient symptoms, laboratory data, and imaging results to come up with a set of “most likely” clinical diagnoses and treatment options that may ultimately improve patient outcomes and reduce health care costs. IBM initially tested its developing AI medical acumen using the American College of Physicians Medical Knowledge Self Assessment study guide and subsequently improved on its performance by adding textbooks, such as the Merck Manual of Diagnosis and Therapy and additional medical journals and books that were not included in the original Jeopardy! database. The IBM developers then improved on the performance for medical applications by fine-tuning the weighting associated with the various algorithms utilized by the application for these medical domain questions. The team has also created a demonstration in which a patient’s presenting symptoms are input into the software and a series of progressive questions are posed by a health care worker to personalize the diagnostic and therapeutic recommendations made by the software. Two important features of the prototype software were the ability to provide multiple possible diagnoses and treatment options with relative confidence levels and the ability to trace the information utilized to make a recommendation.

Potential Applications of Artificial Intelligence in Medicine

One promising initial application of such technologies is to take advantage of the relative computational speed available in today’s computers made possible by parallel processing. This speed allows performance of on-the-fly syntheses of a patient’s electronic medical record from one or more sources and creation of a summary and current problem list. Patient problem lists are typically poorly organized and managed, with no single provider given overall responsibility. The potential to create a graphical synthesis of patient data using a combination of natural language processing and AI technology is exciting. Not only can AI systems perform a rapid and thorough search of single or multiple patient electronic medical records, such systems can also search the Internet, textbooks, and journals for data. IBM’s Watson, when integrated into practice, could potentially search through its exhaustive database, which would be kept up to date with current literature, to support accurate diagnoses and provide other possible diagnostic and treatment options. This technology could be utilized to cross correlate data from a patient’s family history, find patients similar to that patient, and evaluate ultimate diagnoses and treatment responses. As genomic, proteomic, and metabolomic databases become commonplace and searchable, the software will be able to utilize these data in making recommendations for patient screening and in formulating diagnostic and treatment recommendations. In addition to providing answers, the software could be utilized to ask additional pertinent questions to more effectively and safely direct a diagnostic work-up plan and the performance of tests that maximize efficacy and safety while minimizing health care costs.

Physician time with patients is currently limited. The 1995 Commonwealth Fund survey of physicians found that 29 % of physicians were dissatisfied with the amount of time they spent with patients and only 31 % were very satisfied. In addition, 41 % reported a decline in time with patients between 1992 and 1995 [9]. Recent cuts in reimbursement have put additional pressure on physicians with regard to time spent with patients. Limited time to spend with patients may contribute to rising errors and incorrect diagnoses. In 1993, the average visit length was 20 minutes for family practitioners and 26 minutes for general internists [9]. Today the average visit length for family practitioners has dwindled to 10 minutes, a timespan that severely restricts the ability to obtain an adequate understanding of patient symptoms, make an accurate diagnosis, and provide thoughtful care. Given today’s increasing patient loads and requirements for documentation, including interfacing with electronic health records, it seems inevitable that AI applications will be widely deployed in the next few years. As is true of other areas of health care information technology, advances and products designed outside of the medical space can be as many as 10–15 years ahead of practical application in health care. By utilizing these advanced computing technologies, physicians of the future may spend less time behind a computer and more time with patients, reversing the current trend.

The cost of health care will be another major driver in the application of AI in medicine. AI applications will be utilized to reduce unnecessary testing, decrease the disparity and discrepancies in care throughout the United States and the rest of the world, and reduce hospital admissions and length of stay.

Current Application of Artificial Intelligence in Cardiac Imaging

AI can be used in the field of cardiology in a number of ways as shown in Fig. 1, including determining the most appropriate type of imaging study for a specific set of symptoms [10]. If applied properly, AI could reduce inappropriate imaging studies and help physicians adhere to practice guidelines and ever-changing appropriate use criteria. For example, the Imaging in FOCUS (Formation of Optimal Cardiovascular Utilization Strategies) quality improvement initiative of the American College of Cardiology was recently introduced to reduce inappropriate use of diagnostic imaging through the use of AI that tracks appropriate use criteria [11]. Among 55 participating sites that voluntarily completed the radionuclide imaging performance improvement module, the proportion of inappropriate cases decreased from 10 % to 5 %. These preliminary data from initial participating sites suggest that through the use of self-directed, quality improvement software, and an interactive community, physicians may be able to significantly decrease the proportion of tests not meeting appropriate use criteria [11].

Fig. 1 Flowchart of “artificial intelligence” applications in a patient with suspected myocardial ischemia

After images are acquired, additional AI tools may help physicians provide accurate interpretation of cardiac imaging studies [12]. Artificial neural networks are an example of ways in which AI systems that approximate the operation of the brain can be successfully applied in cardiac imaging [13•]. These networks have been utilized in diagnosis and treatment of coronary artery disease and myocardial infarction, interpretation of electrocardiographic studies, detection of arrhythmias such as ventricular fibrillation [14], and in image analysis for echocardiography and other cardiac imaging, as well as screening for heart murmurs in children [15].

Artificial neural network technology is complex and involves multiple processors working in parallel. Neural networks can be supervised (trained with human feedback on the data and analysis) or unsupervised. A neural network is initially “trained” with large amounts of data and rules about relationships among those data elements. In what are known as feedforward systems, learned relationships can then be utilized to inform higher layers of knowledge. These networks utilize multiple approaches, including gradient-based training (iterative adjustment of the weights of various elements in the direction, given by the gradient of the error, that reduces the error), fuzzy logic, Bayesian methods, and genetic algorithms. In general, neural networks use weighting factors (depending on the importance of the category) and attempt many different combinations of weightings until the most accurate answer is identified.
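The idea of gradient-based weight adjustment can be illustrated with a minimal sketch. It assumes the simplest possible case, a single logistic unit rather than a multilayer network, and the toy data, learning rate, number of passes, and function names are illustrative assumptions that do not come from any system cited here.

```python
import math


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one unit: weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(samples, targets, learning_rate=0.5, epochs=2000):
    """Supervised training by batch gradient descent on the cross-entropy error.

    Returns the learned weights, the bias, and the mean error after each pass.
    """
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    losses = []
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, targets):
            p = predict(weights, bias, x)
            loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
            error = p - y  # derivative of the error w.r.t. the weighted sum
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Step each weight against its gradient so that the error decreases.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        losses.append(loss / n)
    return weights, bias, losses


# Hypothetical toy data: the target is 1 when either input feature is present.
SAMPLES = [[0, 0], [0, 1], [1, 0], [1, 1]]
TARGETS = [0, 1, 1, 1]

weights, bias, losses = train_neuron(SAMPLES, TARGETS)
for x in SAMPLES:
    print(x, round(predict(weights, bias, x), 3))
```

In a multilayer feedforward network the same update is applied to every layer, with the error signal propagated backward from the output; the sketch omits that step for brevity.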

Nuclear imaging is evolving from subjective (more “art than a science”) approaches to more objective, digital-based quantitative techniques, providing insight into the physiologic processes of cardiovascular disorders and predicting patient outcomes [16]. The digital-based nature of nuclear images permits the application of automated quantitative software to assist in interpretation of cardiac single-photon emission computed tomography (SPECT) and positron emission tomography (PET) studies. Using AI approaches, algorithms have been developed that take raw digital data output by SPECT and PET cameras, identify the location of the heart, and reconstruct tomographic slices of the heart into 3-dimensional sets of transaxial sections, all automated without operator interaction. These automated image analysis systems then analyze signals from several hundred regions of the heart in transaxial sections and compare the intensity of these signals with those expected in a normal sex- and radiotracer-matched heart to generate a quantitative map of the location, extent, and severity of regional signal differences. Depending on the radiotracer used, this approach provides objective information on myocardial perfusion, metabolism, and/or innervation and aids physicians in both the interpretation of images and the selection of optimal treatment strategies. Beyond static dataset evaluation, dynamic measurement of the heart cavity and myocardium can be evaluated using electrocardiographically gated 3-dimensional images by automatically identifying the endocardial and epicardial surfaces of the left ventricular cavity and following their motion (contraction and thickening) throughout the cardiac cycle [17].
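The comparison of regional signals with a normal database can be sketched as follows. This is a simplified illustration, not the method of any particular commercial package: it assumes the normal database has already been matched for sex and radiotracer, uses only four regions instead of several hundred, and applies a hypothetical cutoff of 2.5 standard deviations below the normal mean. All names and numbers are invented for the example.

```python
from statistics import mean, stdev


def normal_limits(normal_db):
    """Per-region mean and standard deviation across normal studies.

    normal_db is a list of studies; each study is a list of regional
    uptake values (for example, percent of maximal myocardial counts).
    """
    n_regions = len(normal_db[0])
    return [
        (mean(study[r] for study in normal_db),
         stdev(study[r] for study in normal_db))
        for r in range(n_regions)
    ]


def defect_map(patient, limits, z_threshold=-2.5):
    """Flag regions whose uptake falls well below the normal range.

    z_threshold is an assumed cutoff chosen only for illustration.
    """
    results = []
    for region, (value, (mu, sd)) in enumerate(zip(patient, limits)):
        z = (value - mu) / sd  # severity: distance from normal in SD units
        results.append({"region": region, "z": z, "abnormal": z < z_threshold})
    return results


def defect_extent(results):
    """Extent: fraction of regions flagged as abnormal."""
    return sum(r["abnormal"] for r in results) / len(results)


# Hypothetical matched normal database (5 studies x 4 regions) and one patient.
NORMAL_DB = [
    [80, 75, 90, 70],
    [82, 77, 88, 72],
    [78, 73, 92, 68],
    [81, 76, 89, 71],
    [79, 74, 91, 69],
]
PATIENT = [79, 76, 60, 71]

results = defect_map(PATIENT, normal_limits(NORMAL_DB))
for r in results:
    print(r["region"], round(r["z"], 2), r["abnormal"])
print("extent:", defect_extent(results))
```

The flagged regions give the location of a defect, the z-scores its severity, and the flagged fraction its extent, which are the three quantities described above.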

In SPECT and PET imaging, AI approaches can also be employed to highlight an abnormality and thus serve in an adjunctive capacity rather than suggesting a primary diagnosis. In this use case, AI serves as an image enhancement technique rather than as a diagnostic tool. This approach can result in an increase in both accuracy and speed of interpretation. In another application, this decision support technology can help to determine the limits of normal in a specific patient population. This is particularly useful given the variation in normal distribution of a specific radiotracer (which is applied in order to mirror the distribution of blood flow in the heart) in normal subjects. For example, what would be considered to represent normal physiologic distribution of 13N-ammonia as a blood flow agent in the lateral region of the left ventricular myocardium may, in fact, represent an abnormal lateral perfusion defect with other myocardial perfusion radiotracers, such as 201Tl [18]. Although the interpreter may not be able to discern such subtle differences between cameras and radiotracers, AI applications can be utilized to access a large normal database to aid the physician in rendering the correct diagnosis. Thus, much like Siri’s ability to accurately direct an individual to the requested destination (by identifying the current location of the individual and comparing it to a large Google Maps database), AI applications in cardiac imaging are and will be used to help the physician to correctly interpret cardiac images by comparing the patient’s tomographic images with a large age- and sex-matched normal database that is specific for the radiotracer and the camera used in acquiring the patient’s images.

Future Application of Artificial Intelligence in Cardiac Imaging: Neural Networks and Watson

The use of AI systems, such as Watson, represents a novel architecture for evaluation of unstructured and structured content when compared with traditional “expert” systems that used forward reasoning (data to conclusions) or backward reasoning approaches (such as the if-then statements used by Stanford University’s MYCIN system and others that followed). These previous expert systems were costly, difficult to develop and maintain, and were “brittle,” requiring a perfect match between input data and existing rule forms. Rule forms are inherently limited in assumptions about which questions will be posed. Software such as that utilized by Watson, on the other hand, uses natural language processing and a variety of search techniques to create hypotheses, making it more flexible, scalable, easy to maintain, and cost effective. This new approach makes it much easier to keep up with ever-changing information in imaging, medicine, and surgery.

In a future clinical image interpretation scenario utilizing AI technology, a requested study would first be evaluated for appropriateness based on the patient history, previous examinations and their interpretations, and the indication for the examination. The examination would, if deemed appropriate, then be “protocoled” with regard to the way in which it should be performed, amount and type of radiopharmaceutical and/or contrast material, MR imaging sequences, etc. to optimize efficacy, safety, and efficiency.

As a next-generation AI system assists in interpretation of a nuclear scan, for example, it would consider many variables and assign different weighting factors in order of their importance, such as whether the patient has had a previous myocardial infarction in the targeted area, risk factors for coronary artery disease, prior coronary angiography, percutaneous or surgical revascularization, and medical therapy based on empirical “experience” using big data mining techniques. All of these would be weighted and, because the computer would take them all into account, it might come up with multiple possible candidate diagnoses within a short period of time, with associated probabilities that might improve image review accuracy and efficiency. The software would also act as an aid in diagnosis by pointing out certain areas in a scan that should be reviewed carefully because they fall outside expected normal parameters personalized for a particular patient. As is true in human learning, the AI systems of the future will become better over time because they will have access to large numbers of imaging studies and their results. Prior recommendations and subsequent outcomes will be iteratively fed back into the technology’s algorithms, resulting in improved accuracy and efficacy.
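The weighting of clinical variables into ranked candidate diagnoses with associated probabilities can be sketched in a few lines. This is purely illustrative: the variable names, candidate labels, and weights below are invented for the example and carry no clinical validity, and a softmax over weighted sums is only one assumed way to turn scores into probabilities. In a real system the weights would be learned from outcome data rather than set by hand.

```python
import math

# Hypothetical weights: how strongly each finding supports each candidate.
WEIGHTS = {
    "ischemia": {
        "perfusion_defect": 2.0,
        "cad_risk_factors": 1.0,
        "prior_mi_in_region": 0.5,
    },
    "infarct_scar": {
        "perfusion_defect": 2.0,
        "prior_mi_in_region": 2.5,
    },
    "no_significant_disease": {
        "perfusion_defect": -3.0,
    },
}


def rank_candidates(findings, weights=WEIGHTS):
    """Return (candidate, probability) pairs, most probable first.

    findings maps variable names to values (1.0 = present, 0.0 = absent).
    """
    scores = {
        dx: sum(w * findings.get(name, 0.0) for name, w in factor_weights.items())
        for dx, factor_weights in weights.items()
    }
    # Softmax: exponentiate (shifted by the maximum for numerical stability)
    # and normalize so that the probabilities sum to 1.
    top = max(scores.values())
    exps = {dx: math.exp(s - top) for dx, s in scores.items()}
    total = sum(exps.values())
    return sorted(
        ((dx, e / total) for dx, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical patient: regional defect, prior infarction there, risk factors.
FINDINGS = {"perfusion_defect": 1.0, "prior_mi_in_region": 1.0, "cad_risk_factors": 1.0}

for candidate, probability in rank_candidates(FINDINGS):
    print(f"{candidate}: {probability:.3f}")
```

Feeding confirmed outcomes back to adjust the weights is what would allow such a system to improve over time, as described above.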

Barriers to Widespread AI Use in the Near Future

Despite its tremendous potential and recent advances, AI technology faces many obstacles before it can be fully implemented in medicine, as outlined in Table 1. It must be sufficiently fast to be accepted by users and must be integrated into physician workflow, which will mean that it must be tightly interfaced with electronic medical record systems and, in the case of medical imaging, into image interpretation workflow at image review workstations.

Table 1 Opportunities and challenges for fully implementing artificial intelligence in medicine

Another challenge is in determining the accuracy of the system for diagnoses and treatment recommendations and other expert system applications. Multiple opinions from physicians in different specialties and subspecialties often conflict in a single case (as is seen in many malpractice cases). The “gold standard” in medical practice is not always obvious; nor is there consensus on whether such a standard should reflect “expert” opinion, the majority opinion among physicians, or best reported outcomes in similar settings.

Hospital and outpatient practices, encouraged by the Federal government’s “meaningful use” initiative, have been making major investments in electronic medical record systems. It is not clear what level of accuracy would be required for hospitals to make additional investments in AI systems, even with evidence indicating added value in a clinical setting. It is unlikely that the Centers for Medicare & Medicaid Services or other payers will provide additional reimbursement for AI technologies in the near future.

Another major barrier comes from the regulatory and medicolegal perspectives. Although society is increasingly accustomed to information technology–related glitches and problems, such technical errors and failures are likely to be deemed unacceptable in medical and health applications. Consumers are already aware that many online Web sites provide unreliable diagnostic information, incorrect pharmaceutical dosages, inaccurate medication expiration data, and more. Most vendors that have developed software in the medical space have avoided systems that render medical diagnoses or treatment recommendations because of perceived liability and the real difficulties associated with obtaining U.S. Food and Drug Administration clearance for this type of software.

Another major challenge involves concerns about patient privacy and security. Federal regulations place growing restrictions on private health data, but no matter how impenetrable computer programs and software are required to be, the risk of privacy breaches is ever present. An additional challenge lies in the competitive nature of the information technology market and the reluctance of electronic medical record vendors to provide a highly integrated solution to a third-party provider of AI software.

The use of AI also raises specific medicolegal concerns. For example, if a physician uses an AI system to help in clinical diagnosis and provide best treatment options for a patient, what happens when the diagnosis turns out to be incorrect? If misdiagnosis resulted in delayed or incorrect treatment, then who should be medically liable for the adverse outcome? Should it be the authors of the software, the technology provider, the hospital that provided the technology, the doctor, or all of the above? Conversely, what are the medicolegal implications of not following an AI system’s recommendations with a subsequent adverse result? These and other liability questions might result in software developers and medical providers perceiving AI systems as a medical liability. It is unclear what the drivers will be in reaching a point at which the perceived benefits of such a system outweigh potential downsides.

Conclusions

The practice of medicine is at a crossroads, with simultaneous increases in patient volume, an explosion in the amount and complexity of medical and scientific knowledge, and the transition to electronic medical records. This is occurring at the same time that the ratio of physicians and other health care providers to patients continues to decrease and while cognitive demands on physicians are at an all-time high. As the September 2013 Institute of Medicine report on the nation’s cancer care system stated, the United States has an “increasingly chaotic and costly” medical system that is in crisis and fails to deliver consistent care that is evidence based, coordinated, and patient centered [19]. The combination of AI, big data, and massively parallel computing offers the potential to create a revolutionary way of practicing evidence-based, cost-effective, and personalized medicine. However, barriers to adoption of AI technologies must be overcome from regulatory, legal, cultural, and political perspectives––even when technology solutions have matured. Cardiac imaging has been a relatively early adopter of AI techniques in image processing, structured reporting, and clinical decision support systems and can continue to lead the way for the rest of medical imaging and the practice of medicine.