Introduction

Given the remarkable and explosive growth of diagnostic technology in the latter half of the twentieth century, it is surprising that only a handful of studies have been performed to determine whether modern diagnostic methods and procedures have improved clinical diagnostic accuracy, lessened medical errors, or possibly even eclipsed the need for the medical history and clinical examination. In 1996 we published a study in which 100 randomly selected autopsies from each of the years 1959, 1969, 1979, and 1989 at a German University Hospital were analysed to determine whether advances in diagnostic procedures had reduced the rate of misdiagnosis (Kirch and Schafii 1996). Our results indicated that the introduction of new diagnostic procedures such as ultrasound, computed tomography, and radionuclide scans had not diminished the occurrence of misdiagnosis over those four medical eras.

These findings accorded with a methodologically similar survey by Goldman et al. (1983), who also studied 100 randomly chosen autopsy cases from each of the years 1960, 1970, and 1980 at a teaching hospital affiliated with Harvard Medical School. In all three medical eras investigated, they too documented a constant misdiagnosis rate of approximately 10%. In another 12% of cases, autopsies revealed diagnoses that were clinically unrecognised but had no adverse therapeutic consequences (false-negative diagnoses). When all erroneous diagnoses during the selected years were pooled, the diseases most frequently overlooked were pulmonary emboli, myocardial infarctions, neoplasms, and infections. A similar pattern of misdiagnoses was again observed at another German University Hospital, based on 477 autopsies (Thomas and Jungmann 1985). Furthermore, a large survey of diagnostic accuracy involving 141 autopsies carried out after 335 deaths at a United States Veterans Administration hospital also showed little change (13%) in the rate of misdiagnosis (Pelletier et al. 1989).

Sonderegger-Iseli et al. (2000), however, published an investigation of diagnostic errors at the Zurich University Hospital in which they found a reduction in major diagnostic discrepancies from 30% in 1972 to 18% in 1982 and 14% in 1992. The authors attributed this reduction to the more sensitive modern diagnostic techniques in cardiology and possibly to improved clinical skills of physicians. Interestingly, these investigators also documented an increase in minor diagnostic discrepancies between 1972 and 1992. The definition of major diagnostic discrepancies in this study does not correspond to the term misdiagnosis used in our present analysis and in other recent investigations. Thus, the results of the Sonderegger-Iseli study are not directly comparable to those of most other investigations.

In the 10 years since our last study ended, fast-paced and dynamic developments in diagnostic technology have continued relentlessly. The moot question has again arisen as to the impact of such technical progress on our diagnostic precision. We therefore extended our previous 1959–1989 investigation at Kiel University Hospital by another 10 years, thus exploring the issue over a 40-year period.

Patients and methods

Definitions

Our definition of misdiagnosis was based on autopsy confirmation (see Table 1). Misdiagnosis was said to occur when a disease that does not exist is assumed to be present and when the failure to recognise the true existing disease leads to a worsened patient prognosis. An iatrogenic consequence of this incorrect diagnosis is either the omission of treatment or the initiation of incorrect therapy which may delay or even prevent the patient’s recovery (Goldman et al. 1983; Kirch and Schafii 1994, 1996). In the present survey, we also documented the rates of false-positive diagnosis and false-negative diagnosis. The former occurs when, following the diagnostic procedure, a disease that does not exist is thought to be present. In contrast to misdiagnosis, this erroneous assumption does not influence the patient’s prognosis. Likewise, a false-negative diagnosis, defined as a disease discovered at autopsy that was clinically unrecognised, has no prognostic relevance.

Table 1 Principal diagnostic definitions
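
The three error categories defined above can be expressed as a simple decision rule. The following is a minimal sketch only: the `Finding` record and its field names are hypothetical, and reducing prognostic relevance to a single yes/no flag is a simplifying assumption rather than the review procedure used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One disease in one patient; all field names are hypothetical."""
    assumed_clinically: bool   # clinicians believed the disease was present
    found_at_autopsy: bool     # the autopsy confirmed the disease
    prognosis_worsened: bool   # the error led to omitted or incorrect therapy


def classify(finding: Finding) -> str:
    """Map a finding to one of the categories defined in the text."""
    if finding.assumed_clinically == finding.found_at_autopsy:
        return "concordant"        # clinical and autopsy diagnoses agree
    if finding.prognosis_worsened:
        return "misdiagnosis"      # discrepancy with prognostic relevance
    if finding.assumed_clinically:
        return "false-positive"    # assumed present, absent at autopsy
    return "false-negative"        # found at autopsy, clinically unrecognised
```

In this sketch the prognostic flag is checked before the direction of the error, because it is prognostic relevance that separates a misdiagnosis from a false-positive or false-negative diagnosis.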

Patients

The medical records of 100 randomly selected patients from the years 1999/2000 who died and then underwent autopsy at the First Medical Hospital of the Christian-Albrechts University in Kiel, Germany, were analysed retrospectively. The results were compared with those of the years 1959, 1969, 1979, and 1989 from the same hospital and according to the same procedure as described below (Kirch and Schafii 1996). As the number of autopsies in 1999 fell short of the required 100 cases, we pooled the years 1999 and 2000, during which a total of 143 patients were autopsied.

Only those records from patients who died after a hospitalisation (termed final hospitalisation) of at least 2 days at the First Medical University Hospital Kiel and who subsequently had a complete anatomo-pathologic examination at the Department of Pathology were eligible for inclusion. The distribution within the group of randomised patients with respect to age, gender, and length of hospitalisation did not differ from that of the autopsied patients in each of the years studied (1959, 1969, 1979, 1989, 1999/2000). Data of the deceased were summarised on a three-page evaluation form, including medical history and physical examination, underlying cause of death and contributory diseases, diagnostic tests performed, and other findings. The age and gender of all patients who died during the years 1959, 1969, 1979, 1989, or 1999/2000 at the First Medical University Hospital in Kiel were recorded, and the autopsy rates for these patient groups were calculated. The average length of the final hospitalisation was listed.

Diagnoses

The hospital charts and autopsy reports were first reviewed for clinical findings, diagnoses, and possible misdiagnoses. Cases were reviewed by a second internist when the first reviewer judged that a diagnosis had been made incorrectly, or when results of a diagnostic test appeared to be misleading. Diagnostic errors were classified as misdiagnoses, false-positive diagnoses, or false-negative diagnoses as previously defined (see Table 1).

Special diagnoses

Five common clinical diagnoses (pulmonary embolus, myocardial infarction, malignancy, infection in general, and pneumonia listed separately) were particularly scrutinised in view of possible misdiagnoses along with possible erroneous clinical assumptions.

Diagnostic techniques

The types of diagnostic procedures, frequency of use, and relative value in establishing the clinical diagnosis were also analysed. These included the anamnesis, physical examination, standard laboratory tests, imaging techniques, electrocardiogram, microbiological tests, as well as histological and cytological examinations. A diagnostic test result was considered conclusive if it established a diagnosis that was corroborated by the autopsy, or held to be misleading when it failed to indicate a diagnosis—later confirmed by autopsy—that clinicians expected to be present, or if it led to a misleading diagnostic conclusion. Test results that were neither conclusive nor misleading were considered inconclusive.
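
The grading of test results described above can be sketched as a small function. This is an illustration under stated assumptions, not the evaluation form used in the study: the function name and parameters are hypothetical, and "led to a misleading diagnostic conclusion" is interpreted here as "pointed to a diagnosis that the autopsy did not confirm".

```python
from typing import Optional, Set


def grade_test_result(indicated: Optional[str],
                      expected: Optional[str],
                      autopsy_diagnoses: Set[str]) -> str:
    """Grade one test result as conclusive, misleading, or inconclusive.

    indicated         -- diagnosis the test pointed to (None if it pointed to none)
    expected          -- diagnosis clinicians expected the test to show (or None)
    autopsy_diagnoses -- diagnoses confirmed at autopsy
    """
    if indicated is not None:
        if indicated in autopsy_diagnoses:
            return "conclusive"    # established a diagnosis corroborated by autopsy
        return "misleading"        # pointed to a diagnosis the autopsy did not confirm
    if expected is not None and expected in autopsy_diagnoses:
        return "misleading"        # failed to indicate an expected, real diagnosis
    return "inconclusive"          # neither conclusive nor misleading
```

For example, a test that shows nothing in a patient whose expected pulmonary embolus is later confirmed at autopsy is graded as misleading, whereas the same negative result without a confirmed embolus is graded as inconclusive.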

Results

Patients

In the five medical eras studied, the mean age of the autopsied patients was 58.9 years in 1959, 64.1 years in 1969, 65.0 years in 1979, 74.2 years in 1989, and 64.4 years in 1999/2000. A noteworthy decrease in the autopsy rate occurred during this period. In 1959, 88% (180/204) of the patients who died at the First Medical University Hospital were autopsied. This percentage dropped to 82% (203/248) in 1969, then precipitously to 58% (238/410) in 1979, 36% (121/335) in 1989, and 20% (143/715) in 1999/2000, an overall decrease of 68 percentage points (Fig. 1). The average length of final hospitalisation for each group of 100 patients was 12.4, 9.6, 10.8, 13.6, and 12.2 days in the five respective medical eras (Table 2).
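
The autopsy rates can be recomputed directly from the counts given in the text. The short sketch below (variable names are illustrative) also separates the absolute decline in percentage points from the relative decline, which are easily confused: with these counts the absolute decline is about 68 percentage points, while the relative decline is about 77% of the 1959 rate.

```python
# (autopsies, in-hospital deaths) per era, as reported in the text
COUNTS = {
    "1959": (180, 204),
    "1969": (203, 248),
    "1979": (238, 410),
    "1989": (121, 335),
    "1999/2000": (143, 715),
}


def autopsy_rate(autopsies: int, deaths: int) -> float:
    """Autopsy rate in percent."""
    return 100 * autopsies / deaths


rates = {era: autopsy_rate(a, d) for era, (a, d) in COUNTS.items()}
rounded = {era: round(rate) for era, rate in rates.items()}

first, last = rates["1959"], rates["1999/2000"]
absolute_drop = first - last                    # in percentage points
relative_drop = 100 * (first - last) / first    # in percent of the 1959 rate

for era, rate in rounded.items():
    print(f"{era}: {rate}%")
print(f"absolute drop: {absolute_drop:.0f} points, relative drop: {relative_drop:.0f}%")
```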

Fig. 1 Autopsy rates in five medical eras at the First Medical University Hospital Kiel, Germany

Table 2 Average duration of final hospitalisation for each group of autopsied patients

Diagnoses

As shown in Fig. 2, the misdiagnosis rate remained nearly unchanged between 1959 and 1999/2000. Of the 100 randomly selected patients from each of the five periods studied, 7, 12, 12, 11, and 11 were misdiagnosed, respectively. False-negative diagnoses occurred more frequently and tended to increase: 24 in 1959, 30 in 1969, 22 in 1979, 34 in 1989, and 41 in 1999/2000. By contrast, false-positive diagnoses were detected about as frequently as misdiagnoses: we recorded 7, 11, 9, 7, and 15 such occurrences in 1959, 1969, 1979, 1989, and 1999/2000, respectively.
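
One way to gauge how much sampling uncertainty surrounds each of these counts is to attach a confidence interval to every proportion. The sketch below uses the Wilson score interval with z = 1.96; this is an illustrative addition and was not part of the analysis reported here. With the misdiagnosis counts above, all five intervals share a common range, which is compatible with a stable rate but does not by itself exclude a trend.

```python
import math

# Misdiagnoses per 100 sampled autopsies, as reported in the text
MISDIAGNOSES = {"1959": 7, "1969": 12, "1979": 12, "1989": 11, "1999/2000": 11}
SAMPLE_SIZE = 100


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion k/n (approx. 95% for z = 1.96)."""
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


intervals = {era: wilson_interval(k, SAMPLE_SIZE) for era, k in MISDIAGNOSES.items()}

# The intervals share a common range if the largest lower bound
# lies below the smallest upper bound.
all_overlap = (max(low for low, _ in intervals.values())
               < min(high for _, high in intervals.values()))

for era, (low, high) in intervals.items():
    print(f"{era}: {100 * low:.1f}% to {100 * high:.1f}%")
print("common range shared by all eras:", all_overlap)
```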

Fig. 2 Frequency of misdiagnoses, false-positive, and false-negative diagnoses in five groups of 100 randomly selected patients who died and were autopsied at the First Medical University Hospital Kiel, Germany, in 1959, 1969, 1979, 1989, and 1999/2000

As in the years 1959 to 1989, the most common diagnoses in the clinical charts and autopsy reports of the deceased population in 1999/2000 again included pulmonary diseases (32%), cardiovascular disorders (27%), and neoplasms (19%), followed by gastrointestinal (10%), cerebrovascular (7%), and urogenital diseases (4%). Pneumonia (55%) and pulmonary emboli (26%) were the most common lung conditions. Among the cardiovascular diseases, myocardial infarction (48%) was the most frequent diagnosis. The most common diagnoses among the malignant tumours were lung carcinoma (28%) and various haematologic malignancies (15%).

Special diagnoses

The frequencies of diagnostic errors for the five common clinical diagnoses given particular attention (pulmonary embolus, myocardial infarction, malignancy, infection in general, and pneumonia) are shown in Table 3.

Table 3 Frequency of diagnostic errors for five common clinical diagnoses in pooled 500-patient sample, 1959–1999/2000

Pulmonary embolus

In the entire 500-patient sample, no less than 60% of the pulmonary emboli found at autopsy had gone unrecognised clinically. This rate remained stable through 1989 (63% of cases in 1959 and 64% in 1989) but increased to 76% of cases in 1999/2000 (Fig. 3A). Conversely, in 41% of cases in which the clinician ascribed the patient’s death to a pulmonary embolus, the diagnosis could not be confirmed by autopsy (false-positive diagnosis). The frequency of this type of error rose from 33% in 1959 to 44% in 1999/2000. Misdiagnosed pulmonary emboli were most frequently confounded with either myocardial infarction or cardiac arrest.
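
The two percentages reported for each diagnosis rest on different denominators: the share of missed cases is taken over all cases found at autopsy, whereas the share of unconfirmed cases is taken over all clinical diagnoses. The sketch below makes this explicit. The function names are illustrative and the counts in the example are hypothetical, chosen only to show the arithmetic; they are not data from this study.

```python
def missed_fraction(found_at_autopsy: int, recognised_and_confirmed: int) -> float:
    """Share of autopsy-confirmed cases that were not recognised clinically."""
    return (found_at_autopsy - recognised_and_confirmed) / found_at_autopsy


def unconfirmed_fraction(diagnosed_clinically: int, recognised_and_confirmed: int) -> float:
    """Share of clinical diagnoses that the autopsy did not confirm."""
    return (diagnosed_clinically - recognised_and_confirmed) / diagnosed_clinically


# Hypothetical counts for one diagnosis (NOT study data):
found_at_autopsy = 25           # cases present at autopsy
diagnosed_clinically = 17       # cases assumed by clinicians
recognised_and_confirmed = 10   # cases in both groups

print(f"missed: {missed_fraction(found_at_autopsy, recognised_and_confirmed):.0%}")
print(f"unconfirmed: {unconfirmed_fraction(diagnosed_clinically, recognised_and_confirmed):.0%}")
```

Because the denominators differ, the two percentages need not add up to 100% and can move independently of each other over time.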

Fig. 3 (A) Comparison of clinical and autopsy diagnoses in five groups of 100 randomly selected patients each who died from pulmonary embolus and were autopsied at the First Medical University Hospital Kiel, Germany, in 1959, 1969, 1979, 1989, and 1999/2000. (B) Comparison of clinical and autopsy diagnoses in five groups of 100 randomly selected patients each who died from myocardial infarction and were autopsied at the First Medical University Hospital Kiel, Germany, in 1959, 1969, 1979, 1989, and 1999/2000

Myocardial infarction

Of all myocardial infarctions found at autopsy, 22% were clinically undetected. The clinician’s diagnosis of myocardial infarction could not be proven by autopsy in only 9% of all cases. The rate of clinically undiagnosed myocardial infarctions increased in the last 20 years from 15% in 1979 to 31% in 1999/2000 despite ostensible progress in diagnostic techniques (Fig. 3B). The proportion of clinical diagnoses of myocardial infarction that could not be confirmed by the postmortem examination decreased from 20% in 1989 to 10% in 1999/2000 (≤8% in 1959, 1969, and 1979; Fig. 3B). Misdiagnosed myocardial infarctions were most frequently confounded with either pulmonary emboli or early septicaemia.

Malignant neoplasms

Of the malignancies found at the postmortem examination, 28% went undetected clinically (Table 3). The rate of clinically undiagnosed but autopsy-confirmed malignant neoplasms decreased during the last 20 years from 30% in 1979 to 21% in 1999/2000 (28% in 1959, 37% in 1969, and 31% in 1989; data not shown). Malignant tumours of the respiratory and gastrointestinal tracts were missed most frequently. All undiagnosed malignant neoplasms in 1999/2000 were solid tumours, whereas all leukaemias were recognised. Malignancies diagnosed clinically but unproven by autopsy were rare (8%). The rate of malignancies diagnosed but not found in the postmortem examination decreased with time: from 9% in 1959 to 4% in 1999/2000.

Infections

Of all infections, 48% were clinically undetected or attributable by autopsy to a site of infection other than that assumed to be present by clinicians (misdiagnoses plus false-negative diagnoses). In 31% of cases there was no correlate at autopsy to the clinically diagnosed infection (Table 3). The rate of false-negative diagnoses decreased from 74% in 1969 to 48% in 1999/2000. The rate of false-positive diagnoses fell from 62% in 1969 to 26% in 1989 and stood at 36% in 1999/2000.

Diagnostic techniques

Within the last 20 years surveyed (1979 to 1999/2000), the use of new diagnostic procedures such as ultrasound, endoscopy, computed tomography, nuclear medicine scans, and magnetic resonance imaging increased strikingly (Fig. 4). For instance, the use of endoscopy increased markedly from only 3% of the patients analysed in 1979 to 23% in 1989 and to 45% in 1999/2000. In this regard, it should be noted that endoscopic activity was expanded from 1990 onwards under the new head of the First Medical Hospital Kiel. In parallel, the use of standard noncontrast radiologic procedures and laboratory tests also rose in this period (data not shown). In this survey, each diagnostic procedure was registered just once per patient: for a patient who underwent sonography several times during hospitalisation, the results of each session were noted but the technique itself was listed only once. Thus, the number of diagnostic techniques shown in Fig. 4 does not equal the number of patients because some patients underwent more than one procedure.
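
The once-per-patient counting rule can be sketched as follows. The record layout (pairs of patient identifier and technique) and the example data are hypothetical; the sketch only illustrates that repeated sessions of the same technique in one patient are counted once, while different techniques in the same patient are each counted.

```python
from collections import Counter
from typing import Iterable, Tuple


def patients_per_technique(sessions: Iterable[Tuple[int, str]]) -> Counter:
    """Count, for each technique, the number of patients examined at least once.

    sessions -- one (patient_id, technique) pair per examination session
    """
    unique_pairs = set(sessions)   # collapses repeated sessions per patient
    return Counter(technique for _, technique in unique_pairs)


# Hypothetical sessions (NOT study data): patient 1 had sonography twice and one CT
example_sessions = [
    (1, "sonography"),
    (1, "sonography"),
    (1, "computed tomography"),
    (2, "sonography"),
]
counts = patients_per_technique(example_sessions)
print(counts["sonography"], counts["computed tomography"])
```

In this example the total across techniques is three although only two patients were examined, which is why the totals per technique cannot be read as numbers of patients.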

Fig. 4 Frequency of several new diagnostic procedures applied in groups of 100 randomly selected patients each in 1979, 1989, and 1999/2000 at the First Medical University Hospital Kiel, Germany

Value of diagnostic techniques

For the pooled 500-patient sample, most of the additional diagnostic procedures performed (aside from medical history, physical exam, and standard lab tests) were inconclusive (about 60%). By contrast, the patient’s medical history and physical examination continued to play a key role in the diagnostic process, leading to a correct final diagnosis in 62–84% of cases (Table 4). The overall accuracy of the newer diagnostic tools was comparatively lower.

Table 4 Comparison of frequency of diagnostic procedures and their value in establishing the main diagnosis in pooled 400-patient (1959–1989) (Kirch and Schafii 1996) and 100-patient sample 1999/2000

Although the new diagnostic techniques provided conclusive information in about 30% of cases, they sometimes contributed directly to misdiagnoses as well as to false-positive or false-negative diagnoses. When comparing the years 1959–1989 with 1999/2000, we found that imaging techniques confounded the diagnostic process in 7% vs 25% of cases, respectively.

Discussion

During the last 20 years, powerful and sophisticated methods of investigation such as sonography, scintigraphy, endoscopy, computed tomography (CT), and magnetic resonance imaging have been introduced into the diagnostic arsenal and used with ever-growing frequency: over this period, their application rate at the Kiel University Hospital increased tenfold. For example, CT scans were ordered in two cases in 1979 and in 24 cases in 1999/2000. Endoscopy, used in only three cases in 1979, was used in 45 cases in 1999/2000. Nevertheless, these new techniques have not reduced the use of conventional methods such as standard laboratory tests or regular X-ray imaging. Rather, they are used in addition to them (Goldman et al. 1983; Showstack et al. 1982; Griner 1979).

In spite of the progress in diagnostic technology, the rate of misdiagnoses remained essentially unchanged over time (7% in 1959, 12% in 1969 and 1979, and 11% in 1989 and 1999/2000). The rate of false-positive diagnoses increased between 1959 and 1999/2000 from 7% to 15%, as did the rate of false-negative diagnoses, which rose during the observation period from 24% in 1959 to 41% in 1999/2000. Our new data accord with our previous results as well as with those of other authors (Goldman et al. 1983; Pelletier et al. 1989; Bauer and Robbins 1972), leading us to conclude that the introduction of new diagnostic technology in the past decade has not appreciably reduced the rate of misdiagnosis. It is also interesting to note that in the only study purporting to show a marked decrease in diagnostic errors over time (Sonderegger-Iseli et al. 2000), the asymptotic decline in major diagnostic errors (class I and II) from 30% in 1972 to 18% in 1982 and 14% in 1992 appears to be approaching a plateau similar in magnitude to those found in other surveys of misdiagnosis.

Although the new diagnostic procedures are undoubtedly useful and lead to improved diagnostic accuracy for certain diseases (they may especially help to detect important diagnoses earlier, thus improving the patient’s prognosis), the modern diagnostic tools sometimes contribute directly to missed diagnoses (Fiorelli et al. 2000; Ferrucci 1979). Even the most accurate diagnostic tests, biopsies and histological examinations can be misleading (Schwartz et al. 1981). Therefore, the results of every diagnostic procedure must be interpreted carefully. In our study, imaging techniques were of high value in 30% of cases, but also confounded the diagnostic process in 25% of cases. This is in agreement with Hollis (2000) and with Kroegel et al. (1999), who pointed out that sonography used as a screening method not infrequently led to erroneous diagnostic conclusions.

Just as the limitations of particular diagnostic tools may sometimes not be well understood (sensitivity, specificity, false-positive and false-negative rate, positive and negative predictive value, etc.), excessive reliance on test results may cause confusion and complicate the recognition of a disease (Bolann and Stelsnes 1999). Therefore, clinicians should be aware of the limitations of the diagnostic technique applied. Furthermore, when diagnostic procedures are ordered out of habit, the results often go unconsidered by their initiators (Kelley and Mamlin 1974; Middleton et al. 1989). Thus, clinicians should always review the findings of routine tests, especially routine laboratory screening tests. It is also essential to relate pathologic diagnostic findings to the history, physical examination, laboratory results, and other technical data. When considered alone, diagnostic methods are misleading in a high percentage of cases: up to 30% of laboratory tests and chest X-rays (Middleton et al. 1989; Dörner 1992; Hermann et al. 1975). Also, clinicians might not be entirely familiar with new diagnostic procedures’ methodologies and interpretations, especially when new techniques first become routinely available (Goldman et al. 1983). Such types of errors during the early years of our study may have led to a disproportionately higher rate of misleading diagnostic results. Furthermore, we have not determined whether the rate of conclusive diagnostic findings is higher in a population of hospitalised patients who were eventually discharged than in a group of patients who died in hospital. Another factor possibly contributing to diagnostic error is the rising age of patients who die while in hospital. The greying of the population results in older hospitalised patients with multimorbid conditions, and perhaps a tendency by clinicians to avoid performing certain diagnostic procedures in the elderly. Middleton et al. (1989), however, stated that there were no differences between geriatric and adult groups in terms of frequency or cause of diagnostic errors. They conclude that diagnostic accuracy is the same in geriatric and nongeriatric patients.

The most common diagnoses in the clinical charts and autopsy reports of the sampled deceased population included pulmonary diseases (mainly pulmonary emboli and pneumonias) and cardiovascular disorders (most frequently myocardial infarction). An important observation emerging from our study was that, contrary to our expectations, the rate of clinically undiagnosed pulmonary emboli and myocardial infarctions increased from 1979 to 1999/2000 (65% vs 76% and 15% vs 31%, respectively). This does not mean that the possibility of pulmonary embolus was not especially considered by clinicians: in 43% of cases it was diagnosed as the cause of death, but could not be confirmed by autopsy. What is the reason? Pulmonary embolism can be particularly difficult to diagnose because the underlying thrombosis is diagnosed clinically only in a low percentage of cases and the symptoms are not specific. Sheifer et al. (2001) postulated that the atypical painless course of a myocardial infarction was the main reason for its misjudgement.

Another factor possibly affecting our results is the decreasing autopsy rate. The autopsy rates in Germany are now frighteningly low compared to other European countries (Brinkmann 2002). In our 40-year survey, a decrease of almost 70 percentage points was noted (88% in 1959, 20% in 1999/2000). For this reason, we had to pool the years 1999 and 2000. Such lower autopsy rates may be associated with a selection of patients subjected to postmortem examinations (Alderson and Meade 1967). Patients with severe, acute-onset diseases and “interesting” cases may be over-represented. Also, an increase in the hospital’s mortality during the course of the study speaks in favour of a higher number of patients with severe and complicated diseases who had to be treated in this clinic. Due to such possible biases, a higher percentage of misdiagnoses may have occurred in 1989 and 1999/2000 compared with those years when many more patients were autopsied. Indeed, it has been suggested by Sonderegger-Iseli et al. (2000) that a selection bias towards unclear cases might have masked improvements in diagnostic performance and led to the unchanged rate in misdiagnosis which we and others previously reported (Kirch and Schafii 1996; Goldman et al. 1983; Thomas and Jungmann 1985; Pelletier et al. 1989). However, we found no evidence for an inverse correlation between the decline in autopsies and an increase in so-called unclear or complicated cases. Furthermore, it could also be argued that if such a trend towards more difficult-to-diagnose cases had occurred, it would have been offset, at least in part, by the introduction of newer and more powerful diagnostic techniques such as computer imaging and histological/cytological methodologies. Nevertheless, it was specifically these procedures that yielded the most misleading information in 1999/2000 (Table 4).

The increasing availability of modern diagnostic techniques offers a seemingly objective standard, which may reassure clinicians that their diagnosis is correct and diminish the need for autopsy confirmation (Baron 2000). However, confidence in a diagnosis is not an adequate assurance of its accuracy. Indeed, Cameron et al. found similar rates of misdiagnosis (15%) in cases where clinicians were uncertain of their diagnosis and requested an autopsy and in cases where the autopsy was waived because of diagnostic confidence (Cameron et al. 1980). By contrast, both Baron (2000) and Hartveit (1977) believed the clinician’s confidence in his or her diagnosis to be an assurance of its correctness, although the relatively high percentage of incorrect diagnoses (19%) in Hartveit’s group of clinically certain diagnoses argues in favour of the continued importance of the autopsy as an objective means of control. Furthermore, biased results from inadequately designed and/or inaccurately reported evaluations of diagnostic tests can further contribute to their premature dissemination and arouse false notions as to their accuracy, thereby leading physicians to draw incorrect diagnostic conclusions (Bossuyt et al. 2003a). This can be remedied by more complete and better reporting of diagnostic studies according to standardised guidelines such as those developed in the STARD statement and checklist (Bossuyt et al. 2003a).

Misdiagnosis can be considered as one of the key factors contributing to medical error, a subject that has recently received broad attention in the United States following the publication of the report by the Institute of Medicine (IOM) entitled To Err is Human (Kohn et al. 2000). Despite the (mis)perception in the U.S. by practicing physicians and the general public that the number of in-hospital deaths is considerably lower (10- to 20-fold) than the rate reported by the IOM survey (Blendon et al. 2002), the undeclining rates of misdiagnosis emerging from our five-era survey and others on both sides of the Atlantic indicate that the iatrogenic impact of misdiagnosis on morbidity and mortality may indeed be quite high and possibly intractable. Sadly, public health officials and policy makers in Europe have done little until now to examine and confront the issue of medical error and to bring it to the attention of the medical community and general public.

In conclusion, despite the introduction of new diagnostic procedures such as ultrasound, computed tomography, and magnetic resonance imaging, the rate of misdiagnosis has not been reduced and has remained nearly unchanged over four decades. Misinterpretation, technical errors, and over-reliance on these new procedures occasionally contributed directly to diagnostic errors. By contrast, the patient’s medical history and physical examination played an important role in the diagnostic process, leading to a correct final diagnosis in about 75% of cases. The most common diagnostic errors involved pulmonary emboli, myocardial infarctions, neoplasms, and infections. The reduction in the autopsy rate from 88% in 1959 to 20% in 1999/2000 was remarkable and worrisome.