Introduction

Post-mortem computed tomography (PMCT) is an established supplement to forensic autopsy in both accidental and natural death [1,2,3,4]. PMCT debuted in the forensic realm in 1977 [5] and is the subject of much research [6], and several forensic institutions routinely use non-contrast PMCT prior to autopsy [7,8,9,10].

PMCT is non-destructive and rapid, and data are indefinitely storable. The acquired image data allow for easy-to-understand visual demonstration, 3D printing, crime scene reconstruction, animation of events prior to death [11,12,13], and simulation of different scenarios with advanced techniques such as finite element analysis (FEA) [14, 15].

We aim to perform FEA of blunt force skull trauma in future studies and need to determine if PMCT data are precise enough. FEA is an engineering tool in which a virtual force is applied to a computer model that simulates a fracture pattern, which provides a repeatable, objective, and observer-independent analysis of skull fracture. It is important to identify all fractures to match an FEA of a proposed scenario to the fractures seen at autopsy. In forensic pathology, all fractures are important, as they can be evidence of a specific traumatic force. The location, shape, and extent of fracture systems provide information on the direction of impact, number of impacts, force of impact, and shape of the impacting object. Several papers have compared PMCT to autopsy for skull fracture detection [16,17,18,19,20,21,22] and found sensitivities ranging from 0.67 to 1.00 and specificities ranging from 0.66 to 1.00, with the majority of studies reporting values above 0.90 for both. Methodologically, these studies consider the base and the vault separately, and the presence of any fracture system, no matter the number of fractures or the extent of fracture systems, constitutes a single positive finding, i.e. a fracture is either present or absent. In studies with many cases without and few cases with skull fracture, specificity will appear high as the number of true negatives is included in both the numerator and denominator, with only a small numerical contribution to the denominator from false positives.

The objective of this study was to establish sensitivity and specificity of PMCT for detection of individual fracture lines for each of the bones of the neuro-cranium in blunt force head trauma in adults, in order to facilitate future research with FEA.

Materials and methods

This was a retrospective study of autopsy reports from 2013 to 2019 and de novo interpretation of PMCT images. The Section of Forensic Pathology, Department of Forensic Medicine, University of Copenhagen (UCPH), handled all forensic autopsies requested by the police, the Danish Patient Safety Authority, the Transportation Safety Boards, and the Labour Market Insurance (private institution under governmental oversight) in eastern Denmark. Cases both scanned and autopsied, and with either a certain or a possible skull fracture mentioned in either the “PMCT” section or “internal investigation” section of autopsy reports were included. We excluded cases with non-blunt force skull fracture, neuro-surgery with bone missing at autopsy, age below 18 years, or severely comminuted fractures.

CT scans

The Section of Forensic Pathology scanned all cases on a 64-slice Siemens Somatom Definition scanner (Siemens Medical Solutions, Forchheim, Germany) prior to autopsy. Most deceased were wrapped in hospital sheets, and a minority were in body bags. All were in the supine position with arms along the thorax. The tube current varied due to automatic dose modulation. Table 1 summarizes acquisition and reconstruction parameters. Two different scan protocols were used due to a prospective national research project, with parameters differing from routine practice [23].

Table 1 Scan parameters

Image reconstruction and interpretation

PMCT images were analysed with Myrian Expert 2.2 (Intrasense, Montpellier, France) [24] blinded to autopsy results and circumstances of death. Discontinuation of either the inner or the outer table of the cortical bone defined a fracture. Fractures were identified in the axial, coronal, and sagittal planes, registered in a database and drawn on sketches for documentation, see Fig. 1.

Fig. 1 Sketch illustrating the six individual fracture lines of a single fracture system

The skull was divided into 39 locations based on anatomy and sidedness, as per Table 2. Sutures were considered anatomical locations, and diastasis was registered as such. It was possible for several fractures to appear in one location. A location with no fracture was registered as a true negative. The facial bones were not considered in this study.

Table 2 Anatomical locations

Each individual fracture line was registered as an independent data point. When a fracture line progressed into a new anatomical location, it was considered a new fracture line. When a fracture line abruptly changed direction more than ca. 50°, it was also considered a new fracture line. This angle was chosen for ease of determination. In Fig. 1, the fracture system consists of six individual fracture lines. All fracture lines were equally weighted statistically no matter their size or forensic importance. Fracture lines described as old and/or healed at autopsy were excluded from analysis.
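The splitting rules above can be sketched in code. The following is a minimal illustration only, assuming a fracture has been traced as a simple, non-branching path of 2D points on a sketch, with one anatomical location label per segment. The data representation, the function names, and the example coordinates are hypothetical and are not part of the study's workflow, in which fracture lines were registered manually.

```python
import math

ANGLE_THRESHOLD_DEG = 50.0  # approximate threshold stated in the text ("ca. 50 degrees")


def turning_angle(p0, p1, p2):
    """Return the change of direction in degrees at p1 along the path p0 -> p1 -> p2."""
    a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
    diff = abs(math.degrees(a2 - a1)) % 360.0
    return min(diff, 360.0 - diff)


def count_fracture_lines(points, locations):
    """Count individual fracture lines along one traced, non-branching path.

    points    -- list of (x, y) vertices (hypothetical sketch coordinates)
    locations -- anatomical location of each segment; len(points) - 1 entries
    """
    if len(points) < 2:
        return 0
    lines = 1
    for i in range(1, len(points) - 1):
        # Segment i - 1 ends at vertex i, segment i starts there.
        new_location = locations[i] != locations[i - 1]
        sharp_turn = turning_angle(points[i - 1], points[i], points[i + 1]) > ANGLE_THRESHOLD_DEG
        if new_location or sharp_turn:
            lines += 1  # a vertex starts at most one new line, even if both rules apply
    return lines


# Hypothetical trace: straight through the frontal bone, then a sharp turn into
# the left parietal bone, then a second sharp turn within the same bone.
example_points = [(0, 0), (10, 0), (20, 0), (20, 10), (30, 12)]
example_locations = ["frontal", "frontal", "left parietal", "left parietal"]
print(count_fracture_lines(example_points, example_locations))
```

A branching fracture system would need each branch traced as its own path under this representation, which is one reason the sketch should be read as an illustration of the counting rule rather than a registration tool.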

Observers

All scans were interpreted by the first author, a junior doctor with approximately 1.5 years of experience in forensic pathology and more than 100 autopsies including full-body PMCT interpretation. Prior to this study, the junior doctor had also participated in regular departmental post mortem forensic radiology morning conferences covering approximately 300 additional cases. The forensic pathologist holds a PhD degree in PMCT and has more than 15 years of experience with PMCT interpretation. The forensic anthropologist is an associate professor in forensic anthropology and forensic imaging, has attended the Virtopsy course in Switzerland, has more than 10 years of experience with PMCT, and is responsible for all advanced PMCT 3D visualisation analysis at our department. The clinical radiologist has 10 years of experience in radiology including four years of sub-specialisation in musculoskeletal and trauma radiology.

Autopsy

A junior doctor supervised by a board-certified forensic pathologist performed the autopsies. Autopsy practice followed guidelines as set forth by Recommendation No. R (99) 3 of the Committee of Ministers to Member States on the harmonization of medico-legal autopsy rules and authoritative textbooks such as The Coroner’s Autopsy by Knight and Autopsy Diagnosis and Technique by Saphir [25,26,27,28]. For examination of the skull, the scalp was incised coronally and reflected to the supraorbital ridge anteriorly, the nuchal line superior-posteriorly and to about the external acoustic meatus laterally. With an electric bone saw, the superior part of the vault was removed. The brain and then the dura were removed before inspection of the skull base. It was not routine to inspect the outside of the skull base.

Autopsy reports are standardized and written as full text. Positive findings, especially traumatic changes, are routinely photo-documented at our institution. Sketches, such as Fig. 1, are drawn on an at-need basis and were only available in a handful of cases in this study. We excluded cases with insufficient autopsy data for reliable analysis; thus, all fractures in this study were analysed based on standardized text and photographs.

Statistical analysis

Cases for intra- and inter-observer analysis were randomly selected. Four observers participated in the inter-observer analysis: a junior doctor, a forensic anthropologist, and a forensic pathologist, all experienced in post mortem cross-sectional imaging, and a radiologist with no forensic experience. Cohen’s kappa was calculated with 0.41–0.60 considered moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement [29].

Autopsy was the reference test when calculating sensitivity and specificity of the index test, PMCT. Sensitivity was the proportion of fracture lines present at autopsy detected by PMCT. It was calculated as the true positives divided by the sum of true positives and false negatives. Specificity was the proportion of anatomical locations without a fracture line at autopsy and without a fracture line detected by PMCT. Each anatomical location could contribute more than one true positive, false positive, and false negative, but only one true negative. Specificity was calculated as the true negatives divided by the sum of true negatives and false positives [30]. Calculations were performed in Excel 16.0 (Microsoft, Redmond, USA).
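The counting scheme described above is asymmetric: a location can yield several positive findings but at most one true negative. A minimal sketch of that tally is given below. The tuple layout (fracture lines at autopsy, fracture lines on PMCT, lines matched between the two), the function names, and the example data are assumptions made for illustration; the study itself performed these calculations in Excel.

```python
def tally(location_findings):
    """Tally TP, FP, FN, TN over anatomical locations.

    location_findings -- list of (autopsy_lines, pmct_lines, matched_lines) tuples,
                         one tuple per anatomical location (hypothetical layout)
    """
    tp = fp = fn = tn = 0
    for autopsy_lines, pmct_lines, matched_lines in location_findings:
        tp += matched_lines                  # lines seen at both autopsy and PMCT
        fn += autopsy_lines - matched_lines  # autopsy lines missed on PMCT
        fp += pmct_lines - matched_lines     # PMCT lines not confirmed at autopsy
        if autopsy_lines == 0 and pmct_lines == 0:
            tn += 1                          # a location counts once as true negative
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """True positives divided by the sum of true positives and false negatives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by the sum of true negatives and false positives."""
    return tn / (tn + fp)


# Hypothetical skull with five locations (not data from the study).
example = [(3, 2, 2), (0, 1, 0), (0, 0, 0), (0, 0, 0), (1, 0, 0)]
tp, fp, fn, tn = tally(example)
print(tp, fp, fn, tn)
print(f"sensitivity = {sensitivity(tp, fn):.2f}, specificity = {specificity(tn, fp):.2f}")
```

In this made-up example, one location with three autopsy fracture lines contributes three findings to the sensitivity calculation, whereas each fracture-free location contributes a single finding to the specificity calculation.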

Results

Ninety-nine deceased were included in the study. There were 77 accidents, one unknown cause of death, seven homicides, seven suicides, and seven natural deaths. Age ranged from 18 to 100 years, with an average of 50.6 years. The male to female ratio was 3.7:1, and Tables 3 and 4 provide information about cause and mechanism of death. Skull fracture would not be expected in some causes of death and mechanisms of death presented in Tables 3 and 4. This study sampled all cases where the autopsy report and/or original PMCT interpretation mentioned a possible skull fracture. In some cases, the underlying cause of death was non-traumatic, but the deceased had suffered a skull fracture when falling to the ground due to e.g. acute myocardial infarction or poisoning.

Table 3 Underlying cause of death
Table 4 Mechanism of death

The 99 cases were drawn from 4128 autopsies performed during the study period. Of those, 461 cases were not scanned (technical issues (n = 119), no reason stated in autopsy report (n = 254), did not fit inside scanner (n = 88)). Two hundred fifty cases were initially included, and 151 cases were then excluded because of severely comminuted fractures (n = 38), gunshot (n = 35), heat burst fractures (n = 20), neurosurgery with bone parts missing (n = 4), sharp trauma (n = 2), chop lesion (n = 1), insufficient reporting (n = 40), wrongful inclusion (n = 5), or age below 18 (n = 6). Neurosurgery with later cranioplasty with the original bone would not result in exclusion. In cases of craniotomy, our department usually receives the surgically removed bone prior to autopsy. This applied to nine cases.
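The case flow above can be checked with simple arithmetic. The short sketch below only re-adds the counts stated in the text; the variable names are chosen for illustration.

```python
# Counts as stated in the text.
not_scanned = {
    "technical issues": 119,
    "no reason stated in autopsy report": 254,
    "did not fit inside scanner": 88,
}
initially_included = 250
excluded = {
    "severely comminuted fractures": 38,
    "gunshot": 35,
    "heat burst fractures": 20,
    "neurosurgery with bone parts missing": 4,
    "sharp trauma": 2,
    "chop lesion": 1,
    "insufficient reporting": 40,
    "wrongful inclusion": 5,
    "age below 18": 6,
}

not_scanned_total = sum(not_scanned.values())
excluded_total = sum(excluded.values())
final_cases = initially_included - excluded_total

print(not_scanned_total, excluded_total, final_cases)
```

The three totals agree with the 461 unscanned cases, 151 exclusions, and 99 included cases reported above.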

Intra- and inter-observer analysis

For calculation of intra-observer agreement, we considered the first viewing the reference test. Based on 13 cases, Cohen’s kappa of agreement was 0.76, which is substantial. There were 127 true positive fracture lines, 34 false positive fracture lines, 20 false-negative fracture lines, and 376 locations without fracture, i.e. true negative.
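As a worked check, Cohen’s kappa can be recomputed from the four counts given above. The sketch below uses the standard two-rater, two-category formula, with the first viewing treated as the reference as described in the text; the function and argument names are illustrative.

```python
def cohen_kappa(both_positive, second_only, first_only, both_negative):
    """Cohen's kappa for two readings with a binary outcome (2 x 2 table)."""
    n = both_positive + second_only + first_only + both_negative
    observed = (both_positive + both_negative) / n
    # Marginal totals of positive findings for each reading.
    first_positive = both_positive + first_only
    second_positive = both_positive + second_only
    expected = (
        first_positive * second_positive
        + (n - first_positive) * (n - second_positive)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Intra-observer counts from the text, first viewing as reference:
# 127 true positives, 34 false positives, 20 false negatives, 376 true negatives.
kappa = cohen_kappa(both_positive=127, second_only=34, first_only=20, both_negative=376)
print(round(kappa, 2))  # 0.76
```

The observed agreement in this table is 503/557, or about 0.90, and the agreement expected by chance from the marginal totals is about 0.60, which gives the reported kappa of 0.76.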

The ten cases for inter-observer analysis yielded a Cohen’s kappa ranging from 0.43 to 0.58. The cases averaged 50 fracture lines and locations without fracture, and the observed inter-observer agreement varied from 0.76 to 0.83. The matrix between all four observers is shown in Table 5. Considering only the presence or absence of fractures in either the vault or base, agreement ranged from 19/20 to 20/20.

Table 5 Intra- and inter-observer agreement

Sensitivity and specificity

We found an overall sensitivity of 0.58 and overall specificity of 0.91. For the vault, we found a sensitivity of 0.68 and a specificity of 0.85. The skull base yielded a sensitivity of 0.49 and specificity of 0.84. When grouping the base, we found a sensitivity of 0.40 and specificity of 0.82 for the frontal, ethmoid, and sphenoid bone; sensitivity of 0.55 and specificity of 0.91 for the temporal bone and sella turcica; and sensitivity of 0.60 and specificity of 0.79 for the occipital bone. For the sutures, i.e. diastasis, we found a sensitivity of 0.60 and specificity of 0.98. Tables 6, 7, and 8 show detailed sensitivity and specificity for each anatomical location.

Table 6 Sensitivity and specificity for the vault
Table 7 Sensitivity and specificity for the base
Table 8 Sensitivity and specificity for suture diastasis

Sensitivity was higher for fractures described as dislocated at autopsy, as demonstrated in Tables 9, 10, 11, and 12.

Table 9 Sensitivity and specificity for the vault — non-dislocated fractures only
Table 10 Sensitivity and specificity for the vault — dislocated fractures only
Table 11 Sensitivity and specificity for the base — non-dislocated fractures only
Table 12 Sensitivity and specificity for the base — dislocated fractures only

The sensitivity and specificity of all four individual observers were comparable, as seen in Table 13.

Table 13 Sensitivity and specificity of all 4 observers on the 10 cases for inter-observer analysis

With the methods of prior publications [16,17,18,19,20,21,22], considering the presence or absence of a fracture system, we found a sensitivity of 0.97 for the vault and 0.93 for the base. Specificity would have been 0.58 for the vault and 0.45 for the base. The reasons for the low specificity are explained in the discussion.

Qualitative post hoc analysis

After the quantitative analysis that generated the data, we re-examined the eight poorest performing cases qualitatively. This had no influence on the numbers reported in this paper. Rather, it is an explanatory supplement. In two cases, only some of the several fracture lines constituting a single fracture system in the roof of the orbit were detected, and in another two cases, entire fracture systems composed of hairline fractures located in the orbital roofs were overlooked. The fractures contributed a high number of false negatives because they spanned more than one anatomical location or changed direction multiple times. In two cases, only some of the fracture lines radiating from correctly identified hinge fracture systems were identified, possibly due to satisfaction of search [31]. In one case, neurosurgical intervention masked the fracture line, as the surgeon had sawed through it in the lengthwise direction. In one case, artefacts from dental work partly obscured the missed fracture line in the base of the skull, although it was evident upon re-examination. Figure 2 shows examples of “easy” and “difficult” to detect fracture lines as they appear on the macerated skull (not standard practice), the 3D volume rendering (for demonstrative purposes only), and on the axial slice and reformatted coronal slice used for fracture identification in this study.

Fig. 2 Examples of a correctly identified fracture line (red arrow) on the macerated skull (A), 3D volume rendering of PMCT data (B), and axial slice (C) and an overlooked fracture (blue arrow) on macerated skull (A), 3D volume rendering (B), and coronal slice (D). The deceased was not scanned in the Frankfurt plane due to rigor

Discussion

In this study, we determined sensitivity and specificity of PMCT for detection of individual fracture lines in each individual bone of the neuro-cranium in adults subjected to blunt force trauma. These anatomical locations are routinely dissected at autopsy and offer a reliable, validated, and generally accepted reference test. We focused this study on blunt force trauma to explore the basis for future studies with FEA of blunt force skull fracture, which is inherently different from sharp and gunshot trauma. This was also the reason for registering individual fracture lines rather than the absence or presence of a fracture system, as we needed to determine whether 3D models generated from PMCT data would provide sufficient data for FEA. The purpose of FEA of skull fractures is to provide a repeatable, objective, and evidence-based analysis of the skull fracture and provide a likelihood of the observed fracture system given the proposed explanation, e.g. an infant falling from a table vs. abuse [15] or an adult slipping in the bath vs. assault [14].

Sensitivity, specificity, and intra- and inter-observer agreement are considerably lower in our study than in the majority of previously published studies on the subject, and we argue that two methodological choices, i.e. registering individual fracture lines instead of presence/absence of fracture systems and including only blunt force skull trauma, are responsible for this. Our sensitivity is virtually identical to that reported by others when we employ the commonly used methods for registering fractures.

Scan parameters vary slightly between clinical and post mortem CT, but image evaluation is similar; thus, post mortem interpretation is subjective and subject to the same errors as all radiology [32]. In contrast to autopsy, CT images are easy to re-evaluate as the images are permanently storable. In the 1940s, the realization that radiologists did not always agree with either themselves or others when interpreting “routine” chest X-rays caused unease and spawned investigation into the subject of intra- and inter-observer agreement [33]. Indeed, “mistakes” and therefore intra-observer and inter-observer variation seem to be an inherent part of radiology [33,34,35]. A problem in considering PMCT a diagnostic test is that the “cut-off value” for fracture is discrete and potentially varying. It is an internal, perceptual, and cognitive process within each interpreter to determine if what one sees is sufficient to cross the “threshold” needed to decide on the presence of a fracture. The three forensically trained observers exemplify the inter-dependency between sensitivity and specificity and different thresholds: observer 1 had the lowest sensitivity but highest specificity, observer 3 had the highest sensitivity but the lowest specificity, and observer 2 was in between.

Observer 4, a clinical radiologist, had no forensic experience and was not accustomed to the level of detail required, and scored a lower sensitivity. Clinically insignificant fractures or the exact shape of a severely comminuted fracture system have no consequences for treatment and require neither detailed description nor the patient-burdening and time-consuming scanner settings seen in forensic pathology [36, 37]. However, there have been cases of missed, minute fractures that led to death [38, 39]. For skull fracture detection, the major differences between post mortem radiology and clinical radiology are the lack of secondary signs such as intra-cranial gas, bleeding, or swelling of the surrounding soft tissue when death is instant [40], and differences in imaging parameters. In this study, only discontinuation of the bone constituted a fracture, and only the axial, coronal, and sagittal slices were available for interpretation. In clinical radiology, tools such as maximum intensity projection, secondary signs of fracture, and knowledge of circumstances will presumably result in both higher sensitivity and specificity than presented here.

Post mortem radiology lies between the specialties of pathology and radiology. In studies on forensic PMCT, a radiologist is most commonly the interpreter [2], and it is generally advised that radiologists are trained in forensic pathology [16, 32, 41]. Radiologists trained in forensic pathology and forensic pathologists trained in radiology may be equally proficient [42], though Leth et al. found a kappa of 0.33 for injuries in the skeletal system and of 0.74 for injuries in the head region when comparing Abbreviated Injury Score (AIS) determined at PMCT by a radiologist and a pathologist [41]. A study on 14 cases of simple skull fractures found that a second reading by a radiologist after initial reading by a forensic pathologist provided additional details in nine cases [43].

Cohen’s kappa of agreement is misleading in this study for two reasons. First, the kappa coefficient assumes that the expected agreement depends on the marginal totals of the 2 × 2 table but makes no assumptions regarding the observed values. This means that, even with a high observed agreement, as in our study, the kappa value may be low. The rationale for the chance correction employed in Cohen’s kappa is that observers may agree by chance. This is not a valid assumption when it comes to “trained” observers, as more experience means that chance will play a lesser role in decision-making [44]. Second, the distribution of marginal totals affects the maximum kappa that may be achieved [44], and our population was unbalanced between positive and negative findings.
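The dependence of kappa on the marginal totals can be illustrated numerically. In the sketch below, two hypothetical 2 × 2 tables have the same observed agreement of 0.92 but very different distributions of positive and negative findings. The counts are invented for illustration and are not data from this study.

```python
def kappa_and_agreement(both_positive, second_only, first_only, both_negative):
    """Return (observed agreement, Cohen's kappa) for a 2 x 2 table of two observers."""
    n = both_positive + second_only + first_only + both_negative
    observed = (both_positive + both_negative) / n
    first_positive = both_positive + first_only
    second_positive = both_positive + second_only
    # Chance-expected agreement is derived from the marginal totals only.
    expected = (
        first_positive * second_positive
        + (n - first_positive) * (n - second_positive)
    ) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical balanced table: half of all findings are positive.
balanced = kappa_and_agreement(both_positive=46, second_only=4, first_only=4, both_negative=46)

# Hypothetical unbalanced table: positives are rare, agreement is unchanged.
unbalanced = kappa_and_agreement(both_positive=2, second_only=4, first_only=4, both_negative=90)

print(f"balanced:   agreement = {balanced[0]:.2f}, kappa = {balanced[1]:.2f}")
print(f"unbalanced: agreement = {unbalanced[0]:.2f}, kappa = {unbalanced[1]:.2f}")
```

With identical observed agreement, kappa falls from about 0.84 in the balanced table to about 0.29 in the unbalanced one, because the agreement expected by chance rises from 0.50 to roughly 0.89.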

An overview of some of the larger, more recent studies on PMCT for skull fracture detection is presented in Table 14. In all studies but Jacobsen and Lynnerup [17] and Leth et al. [41], who used pathologists with years of experience in reading PMCT images, scans were interpreted by radiologists.

Table 14 Sensitivity and specificity reported in the literature

A reason for the difference between the literature and our study is the fundamental methodological difference of a fracture system versus the individual fracture line. In a study on rib fractures with a more comparable methodology, Schulze et al. found a comparably low sensitivity of 0.63 and specificity of 0.97 [42].

In several cases, fractures were detected in both the vault and base of the skull on PMCT, but not to the extent demonstrated at autopsy. Similarly, Wozniak et al. visualized the fracture systems in part in seven of 10 cases and the full extent of fracture in three cases [45]. Registering only the presence or absence of fracture systems, we would have found a sensitivity of 0.97 for the vault and 0.93 for the base.

Due to our inclusion criteria, only 14 skulls had no fractures in the vault, and five skulls had no fractures in the base. Given that presence of fracture was a criterion for inclusion, we consider the stated specificity when using the common method misleading for statistical reasons due to the low number of true negatives in our sample, as specificity = true negatives/(true negatives + false positives), and the false positives contribute disproportionately to the denominator.
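The effect of the denominator can be made concrete with a small calculation. The sketch below applies the same number of false positives to a sample with 14 fracture-free skulls, as for the vault in this study, and to a sample with 1000 fracture-free skulls. The false positive count of five and the sample of 1000 are hypothetical values chosen only to show the mechanism.

```python
def specificity(true_negatives, false_positives):
    """Specificity = true negatives / (true negatives + false positives)."""
    return true_negatives / (true_negatives + false_positives)


false_positives = 5  # hypothetical count, identical in both samples

# 14 skulls without vault fracture at autopsy (figure from the text).
few_negatives = specificity(14 - false_positives, false_positives)

# 1000 skulls without fracture (hypothetical sample with many trauma-free cases).
many_negatives = specificity(1000 - false_positives, false_positives)

print(f"{few_negatives:.3f} vs {many_negatives:.3f}")
```

The same five false positives reduce specificity to about 0.64 in the small sample but leave it at about 0.995 in the large one, which is the masking effect described for studies dominated by cases without head trauma.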

A sensitivity of 1.0 is often reported, e.g. by Hoey et al. [46], Sochor et al. [47], Jacobsen et al. [43], and Di Paolo et al. [48]. In these studies, the sample sizes ranged from four to twenty, and the samples consisted primarily of motor vehicle accidents, aircraft mishaps, and high-velocity blunt force trauma or were selected on the basis of fractures detected at autopsy. We assume fracture systems in such cases are clearly visible because of the extent of injury. Le Blanc-Louvry et al. found a sensitivity of 0.97 for the vault and 0.85 for the base at first reading in 236 “routine” cases [20], and Legrand et al. found a sensitivity of 0.97 for the vault and 1.0 for the base in a sample of 73 consecutive deaths [22]. Both studies had a high proportion of gunshot deaths, and when projectile trauma has caused skull fracture, it is our experience that the fractures are clearly visible on PMCT. As seen from Table 4, some of our cases have suffered severe trauma, but as we registered all individual fracture lines, severe cases were not “easier”. We excluded cases with severely comminuted fractures, as it was too difficult to establish the location and direction of individual fracture lines in them. Even with these cases excluded, we would have found a sensitivity comparable to Legrand et al. and Le Blanc-Louvry et al. had we used the common methodology.

For the parietal, occipital, and temporal bones and base part of the frontal and sphenoid bones, the number of false negatives appears high, but these locations contained fracture systems with extensive and intricate patterns. This resulted in a high number of individual fracture lines, and while the fracture system and its major fracture lines were correctly identified, a minute fracture line in the periphery of the fracture system still constituted an individual finding and contributed statistically equally to the larger, forensically and clinically relevant parts of the fracture system.

The ethmoid bone appears to have a very low sensitivity. It was difficult to detect fractures in the thin, perforated cribriform plate on PMCT, but this finding is probably exaggerated statistically because of the few true positives.

In this study, many of the overlooked fracture lines were non-dislocated, without bleeding, and described as “hairline”, i.e. a minor fracture in which the bones remain aligned. The hairline fracture lines were elusive regardless of anatomical situation. Until the resolution in both acquisition and reconstruction of PMCT images, as well as the ability of the human eye and brain to distinguish such hairline fractures, is sufficient, isolated hairline fractures will remain overlooked for technical and physical reasons.

Sensitivity was higher for the dislocated fractures, but fracture lines of bone fragments that appear dislocated at autopsy may not have been so at PMCT, as the scalp, skin, and other tissues keep the bones in place and to some extent adjacent. Commingled and comminuted bone parts made it more difficult to establish the location and direction of individual fracture lines on PMCT compared to autopsy.

Leconte et al. found 29 fractures in the skull base at PMCT and only 20 at autopsy with the differences being in the lesser (n = 3) and greater (n = 2) wings of the sphenoid, the pyramid of the temporal bone (n = 3), and the occipital bone (n = 1), for a kappa value of 0.68 for the base. In the vault, ten of ten fractures were the same for a kappa of 1.0 [49]. This phenomenon of more fractures at PMCT than at autopsy is the basis for the decision by several authors [18, 41, 50] to calculate agreement rather than sensitivity.

Cattaneo et al. performed a study on five piglets, where four were severely beaten post mortem in order to induce fractures. The piglets were CT-scanned, then autopsied, and finally macerated for osteological analysis. With a slice thickness of 3 mm and interpretation by two radiologists, 26% more fractures were found in the cranium with PMCT than at osteological analysis. The authors speculate that a reason for the excess of fractures at PMCT may be that a single, long fracture line is seen as two shorter, independent fractures on PMCT [40]. Considering these findings, we presume osteological analysis of macerated bone to be the absolute gold standard for fracture identification, and thus the additional fractures “seen” at PMCT to be false positives of PMCT rather than false negatives of osteological analysis.

In our study, the neuro-cranium was visualized at autopsy, yet fractures just above the nasal-frontal suture, in the frontal sinuses, and of the occipital condyles were missed. All fractures registered at PMCT not found at autopsy were considered false positives, even when presumably “true”, which increased the number of false positives. Fracture lines that were beyond doubt the same at PMCT and autopsy but had been registered differently, e.g. diastatic vs. close to the suture line, resulted in both a false positive and a false negative.

In the existing literature, specificity is not always reported [16, 51] or becomes 1.0 as inclusion criteria are presence of fracture [17]. In studies with a few false positives, they are masked statistically by a large number of true negatives from cases with no head trauma at all [19]. For these reasons, the specificity was lower in our study than reported elsewhere.

Conclusion

We investigated the sensitivity and specificity of PMCT for individual fracture lines in individual bones of the neuro-cranium because reliable PMCT data may be used for 3D models and FEA. Considering the aim of using 3D models based on PMCT data for FEA to assist the forensic pathologist in interpreting blunt force skull fracture, we suggest supplementing PMCT data with information gathered at autopsy in order to compare the results of finite element analysis and the actual fractures of the deceased’s skull.

PMCT with the parameters employed in this study is suited to detect the presence of a fracture system, but not suited to detect each individual fracture line. If a case warrants full elucidation of fractures, then autopsy supplemented by PMCT is better than PMCT alone.