Introduction

Post-mortem computed tomography (PMCT) is routinely performed prior to forensic autopsy at Danish forensic institutes and is known to be especially appropriate for examining bone and detecting fractures [1, 2]. In Denmark, PMCT is mainly used as a screening and planning tool for the subsequent forensic autopsy, but internationally it has also been considered as a possible replacement for the traditional autopsy or as a triage tool when deciding whether to perform a forensic autopsy [2,3,4].

From a forensic pathology perspective, the diagnosis of fractures of the hyoid-larynx-complex (HLC) is important, as their presence is associated with traumatic neck injury, suggesting an unnatural manner of death [5]. As PMCT is especially suitable for detecting fractures, the diagnostic accuracy of PMCT when investigating cases where traumatic neck injury has occurred has been the subject of several studies [6,7,8,9,10,11,12,13,14,15]. Currently, autopsy with dissection of the HLC is considered the gold standard for forensic examination of traumatic neck injuries [14, 16,17,18]. Most fractures are discoverable during traditional autopsy, possibly supplemented by microscopic examination [16]. Some studies have found an almost perfect agreement between autopsy and PMCT [13,14,15], whereas other larger studies have found more varying results [6, 7, 10]. Several of the studies have presented Cohen’s kappa values to evaluate the agreement between autopsy and PMCT [6, 7, 10, 11, 13]. Published values for overall agreement on the hyoid bone range from moderate to almost perfect, i.e., from 0.452 [11] to 0.91 [13], while values for the thyroid cartilage vary from 0.452 [11] to 0.538 [6].

Most of the kappa values indicated moderate agreement, which according to Cohen ranges from 0.41 to 0.60 [19]. Generally, both in clinical healthcare and in forensic pathology, a moderate level of agreement is not considered sufficient, as it indicates almost the same level of disagreement as agreement, and the diagnostic accuracy of the test therefore cannot be ensured. Only two studies found kappa values greater than 0.81, signifying almost perfect agreement, one using full body PMCT [13] and one PMCT of the HLC [7]. Both studies utilized a thin slice thickness (0.625 mm), which underlines the importance of a minimal slice thickness for diagnosing all fractures. The heterogeneity of the results emphasizes the need for more studies examining the agreement between the two methods.
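To make the kappa values above concrete, the following minimal Python sketch computes Cohen's kappa from a 2 × 2 agreement table and maps it to an agreement band. The function names, the example counts and the exact band boundaries outside the 0.41–0.60 and > 0.81 ranges quoted above are assumptions made for illustration; they are not taken from the included studies.

```python
def cohens_kappa(both_pos, pmct_only, autopsy_only, both_neg):
    """Cohen's kappa for PMCT versus autopsy on a 2 x 2 agreement table."""
    n = both_pos + pmct_only + autopsy_only + both_neg
    if n == 0:
        raise ValueError("the table is empty")
    observed = (both_pos + both_neg) / n           # observed agreement
    pmct_pos = (both_pos + pmct_only) / n          # share called positive by PMCT
    autopsy_pos = (both_pos + autopsy_only) / n    # share called positive by autopsy
    # Agreement expected by chance from the marginal proportions
    expected = pmct_pos * autopsy_pos + (1 - pmct_pos) * (1 - autopsy_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to a commonly used agreement band (boundaries assumed)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical counts, for illustration only
kappa = cohens_kappa(both_pos=20, pmct_only=5, autopsy_only=10, both_neg=15)
print(round(kappa, 3))

# Published values quoted in the text
for value in (0.452, 0.538, 0.91):
    print(value, agreement_label(value))
```

A negative kappa, as discussed later for one of the included studies, falls into the lowest band because observed agreement is then below the agreement expected by chance.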

A common challenge in diagnosing fractures of the HLC is the presence of anatomical variants of the complex, which can imitate fractures [20] and introduce the risk of false positive diagnoses. A common developmental anomaly is the triticeous cartilage, which can be found in the thyrohyoid membrane and can be mistaken for a fracture [21, 22].

Furthermore, age and sex influence the appearance of the HLC. Fusion of the HLC increases with age [23, 24], which increases the risk of fracture, and older victims of neck trauma therefore have a higher risk of fractures of the HLC [23, 25]. There is also great variation in the timing of ossification of the laryngeal structures [26], which usually begins in the second or third decade and increases with age. Males tend to have a greater level of ossification than females [27, 28].

To our knowledge, no summarized data have been published on the diagnostic accuracy of the method regarding fractures of the HLC, and we aimed to examine and calculate this by extracting data from relevant studies.

Therefore, the aim of this paper was to investigate the diagnostic accuracy in terms of sensitivity and specificity of a PMCT scan compared to traditional autopsy as the reference test in cases involving traumatic neck injuries and suspected fractures of the HLC.

Methods

A systematic literature search was performed in PubMed, Web of Science and SCOPUS in April 2023 to find articles comparing the use of PMCT and autopsy in diagnosing HLC injuries. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were considered [29]. The following search string was used in PubMed to capture all studies utilizing PMCT in cases involving traumatic neck injuries and HLC fractures:

(postmortem computed tomography) AND ((hyoid bone) OR (thyroid cartilage) OR (hyoid-larynx complex) OR (laryngeal cartilage) OR (arytenoid cartilage) OR (cricoid cartilage)).

The search string was modified for the search in SCOPUS and Web of Science respectively:

(postmortem AND computed AND tomograph ) AND (hyoid AND bone) OR (thyroid AND cartilage) OR (hyoid-larynx AND complex) OR (laryngeal AND cartilage) OR (arytenoid AND cartilage) OR (cricoid AND cartilage).

ALL= ((postmortem computed tomography) AND ((hyoid bone) OR (thyroid cartilage) OR (hyoid-larynx complex) OR (laryngeal cartilage) OR (arytenoid cartilage) OR (cricoid cartilage))).

Two of the authors independently screened titles and then abstracts, and assessed full-text papers for inclusion. A consensus decision was made in case of disagreement. The reference lists of all included studies were screened for relevant publications to include.

The primary outcomes of this review are sensitivity and specificity of PMCT. Secondary outcomes are data on scan protocol and evaluation.

The QUADAS-2 tool was used by two of the authors to independently assess the risk of bias [30]. Data extraction was completed independently, and a consensus decision was made in case of disagreement. Data included information on title, authors, publication year, country of origin, study type, compared methods, number of cases, demographic data, scan protocol, and results. Original authors were not contacted in cases of ambiguous or missing data. Data was reported as true positive (TP), false positive (FP), true negative (TN) and false negative (FN) to calculate sensitivity and specificity. For studies where this was not possible, data was only included in the qualitative analysis.
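As an illustration of how the extracted TP, FP, TN and FN counts translate into the two primary outcomes, the following minimal Python sketch computes sensitivity and specificity for a single study with autopsy as the reference test. The class name and the counts are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one study, with autopsy as the reference test."""
    tp: int  # fracture reported on PMCT and confirmed at autopsy
    fp: int  # fracture reported on PMCT, none found at autopsy
    tn: int  # no fracture on PMCT or at autopsy
    fn: int  # fracture found at autopsy but missed on PMCT

    def sensitivity(self) -> float:
        # Proportion of autopsy-confirmed fractures detected by PMCT
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of fracture-free structures correctly called negative
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
example = TwoByTwo(tp=8, fp=3, tn=9, fn=2)
print(f"sensitivity = {example.sensitivity():.2f}")  # 8 / (8 + 2)
print(f"specificity = {example.specificity():.2f}")  # 9 / (9 + 3)
```

Both ratios are undefined when their denominator is zero, for example in a study without any autopsy-confirmed fractures, which is one reason such studies can only contribute to the qualitative analysis.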

Statistical analysis was performed in R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria. Version 4.0.5), and references were organized in Mendeley Reference Manager (Mendeley, Elsevier, London, UK. Version 2.97.0). A continuity correction of 0.5 was used to avoid zero values in the 2 × 2 tables. The R-code by Henningsen et al. [31] was used for the calculations.
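The effect of the continuity correction can be sketched in a few lines of Python. The sketch below assumes one common convention, namely adding 0.5 to every cell of a 2 × 2 table only when at least one cell is zero; whether the R-code by Henningsen et al. applies the correction in exactly this way is not stated here, and the function name and counts are hypothetical.

```python
def apply_continuity_correction(tp, fp, tn, fn, increment=0.5):
    """Add `increment` to every cell if any cell is zero (assumed convention)."""
    cells = (tp, fp, tn, fn)
    if any(cell == 0 for cell in cells):
        return tuple(cell + increment for cell in cells)
    return tuple(float(cell) for cell in cells)


# Hypothetical study with no false positives: the raw specificity is exactly 1,
# which cannot be placed on a logit scale without a correction.
tp, fp, tn, fn = apply_continuity_correction(tp=5, fp=0, tn=10, fn=2)
corrected_specificity = tn / (tn + fp)
print(tp, fp, tn, fn)
print(round(corrected_specificity, 3))
```

The correction pulls extreme proportions slightly away from 0 and 1, which matters most in small studies such as those included here.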

The R-code produces a meta-analysis of the proportional data by use of the “metaprop” function in the “meta” package, producing forest plots with 95% confidence intervals [31]. Higgins’ I² and Cochran’s Q were calculated to assess heterogeneity among the included studies. Because only a small number of studies with small sample sizes were included, the power of these tests is low; they were nevertheless performed to provide complete data [32].
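The heterogeneity statistics can be illustrated with a simplified Python sketch. It is not the R-code used in this review: it assumes fixed-effect inverse-variance pooling of logit-transformed proportions, and the function names and study counts are hypothetical. Cochran's Q is the weighted sum of squared deviations from the pooled estimate, and I² expresses the share of Q that exceeds its degrees of freedom.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    Requires 0 < events < total, e.g. after a continuity correction.
    """
    proportion = events / total
    logit = math.log(proportion / (1 - proportion))
    variance = 1 / events + 1 / (total - events)
    return logit, variance


def pool_proportions(studies):
    """Return (pooled proportion, Cochran's Q, I-squared in percent)."""
    logits, weights = [], []
    for events, total in studies:
        logit, variance = logit_and_variance(events, total)
        logits.append(logit)
        weights.append(1 / variance)  # inverse-variance weight
    pooled_logit = sum(w * y for w, y in zip(weights, logits)) / sum(weights)
    q = sum(w * (y - pooled_logit) ** 2 for w, y in zip(weights, logits))
    degrees_of_freedom = len(studies) - 1
    i_squared = max(0.0, (q - degrees_of_freedom) / q) * 100 if q > 0 else 0.0
    pooled = 1 / (1 + math.exp(-pooled_logit))  # back-transform to a proportion
    return pooled, q, i_squared


# Hypothetical (true positives, fractures at autopsy) pairs, illustration only
pooled, q, i_squared = pool_proportions([(9, 10), (2, 10), (12, 20)])
print(f"pooled sensitivity = {pooled:.2f}, Q = {q:.2f}, I2 = {i_squared:.0f}%")
```

Under the null hypothesis of homogeneity, Q is compared with a chi-squared distribution with (number of studies − 1) degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch free of external dependencies.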

Inclusion criteria

All articles in English, consisting of retrospective studies and prospective studies that compared the use and accuracy of PMCT and autopsy in diagnosing fractures of the HLC were included. Studies using full body PMCT were included, as well as studies only examining HLC explants. For studies also examining PMCT of other parts of the body than the HLC, only the results pertaining to the HLC were included. The cases included in the review involved all ages and both sexes.

Exclusion criteria

Articles in languages other than English were excluded as well as letters, opinions, non-published abstracts, articles in non-peer reviewed publications and reviews. Articles that did not compare PMCT and autopsy in diagnosing fractures of the HLC were excluded, including studies using other types of imaging, as well as studies using micro-CT. Studies describing non-HLC injuries were also excluded. Further, case reports including n ≤ 3 were excluded. Finally, articles from before 1 January 2000 were excluded.

Fig. 1

Selected studies. Demonstrates the inclusion and exclusion process of the original papers. n number. Figure layout from PRISMA guidelines [29]

Results

As seen in Fig. 1, the search strategy yielded a total of 259 results, which were screened for relevance. 52 duplicates were removed. Based on the title, 176 publications were excluded. The abstracts of the remaining 31 articles were screened, and 15 were excluded, resulting in 16 articles being assessed for eligibility by full manuscript review.

At full-text review, seven articles were excluded: five because they did not directly compare autopsy and PMCT (n = 5) and two because they focused on the gas bubble sign rather than fractures (n = 2). Nine articles from the search were included, and cross-referencing produced one additional relevant article for inclusion. In total, 10 studies reporting on the use and accuracy of PMCT in comparison to autopsy in the diagnosis of HLC fractures were included in the systematic review, as shown in Table 1.
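The selection counts reported above can be checked with a short Python sketch. The numbers are those given in the text; the function and key names are hypothetical.

```python
def screening_flow(identified, duplicates, excluded_by_title,
                   excluded_by_abstract, excluded_full_text,
                   added_by_cross_reference):
    """Follow the records through each screening stage."""
    after_duplicates = identified - duplicates
    abstracts_screened = after_duplicates - excluded_by_title
    full_text_assessed = abstracts_screened - excluded_by_abstract
    included_from_search = full_text_assessed - sum(excluded_full_text.values())
    return {
        "after_duplicates": after_duplicates,
        "abstracts_screened": abstracts_screened,
        "full_text_assessed": full_text_assessed,
        "included_from_search": included_from_search,
        "included_total": included_from_search + added_by_cross_reference,
    }


flow = screening_flow(
    identified=259,
    duplicates=52,
    excluded_by_title=176,
    excluded_by_abstract=15,
    excluded_full_text={"no direct comparison": 5, "gas bubble sign only": 2},
    added_by_cross_reference=1,
)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```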

Table 1 Included papers

As seen in Table 2, seven of the 10 included articles were retrospective comparative studies, ranging from six to 203 cases [6, 8,9,10,11,12, 15]. Two prospective studies were included, one cohort and one comparative study, with 54 and 236 included cases respectively [7, 13]. The final publication was a case study including eight cases [14]. The included publications ranged in time from 2005 to 2022. Of the included papers, seven compared full body PMCT and autopsy. One study used postmortem fine preparation (PMFP) and scanned the HLC explant after preparation [7]. Two studies used HE-staining for the HLC histopathology specimens, one of them comparing to full body PMCT [15] and the other comparing to PMCT of the HLC [10]. The age of the decedents varied from 0 to 95 [7, 10]. Two of the studies also investigated PMCT of other parts of the body than the HLC, and in these cases, only the results pertaining to the HLC were included [11, 13].

In all studies, imaging was reviewed by experienced radiologists and autopsies and histology examinations were performed by qualified pathologists.

Table 2 Demographic of included cases

Bias assessment

The bias assessment performed through QUADAS-2 is shown in Table 3. “☺” signifies a low risk of bias, “☹” signifies a high risk of bias and “?” an unknown risk of bias.

In the study by Lyness et al. [6], 13 cases were excluded because the neck structures were not visible on the PMCT images. Additionally, the service pressure of the mortuary affected the decision on whether to perform the PMCT. Both of these factors could introduce selection bias.

Table 3 Bias assessment. Demonstrates the bias possibly introduced by patient selection, index test, reference standard and flow and timing in the original papers, and the comparable applicability to the review question of the patient selection, index test and reference standard. “☺” indicates low risk of bias, “☹” indicates high risk of bias, and “?” indicates unknown risk of bias

PMFP was used in comparison to PMCT by Treitl et al. [7], and both were performed after autopsy, which might cause injuries to the specimen, resulting in diagnostic difficulties or artefacts, e.g. beam hardening artefact.

The studies by Deininger-Czermak et al. and de Bakker et al. [8,9,10] all gave the pathologists access to the radiologists’ reports before autopsy, which is likely to cause bias, by not blinding the pathologists. Furthermore, bias might result from the inclusion of a minor of the age of 11 by de Bakker et al. [10].

Graziani et al. [11] blinded the pathologists to the findings of the radiologists, but had the pathologists train the radiologists for a period of several months, which could increase diagnostic accuracy. Despite this, and the fact that the radiologists gained experience throughout the study, there were no statistically significant differences in the results from the first two years compared to the later years. Some of the cases considered for this study were excluded because a complete autopsy was not performed.

The limited information on the scan protocol in the study by Decker et al. [12] might cause information bias.

Selection bias might have been introduced in the study by Le Blanc-Louvry et al. [13] by the fact that the first PMCTs were not performed on 66 consecutive cases, as PMCT was not available for all cases examined by the institution. Decker et al. [12], Kempter et al. [14] and Yen et al. [15] all described the scan protocol in little detail, which made comparison of protocol data difficult. Finally, the radiologists were aware of the autopsy findings in the study by Yen et al. [15].

In most cases, the index test has a low risk of introducing bias, whereas the reference test has a high risk of bias in many of the included studies. Both the index and reference tests are applicable to this review.

Heterogeneity

The sensitivity for the hyoid bone and the specificity for the thyroid cartilage had a Higgins’ I² of 0%, indicating low heterogeneity. Cochran’s Q was not significant for either, with p = 0.91 for the hyoid bone sensitivity and p = 0.45 for the thyroid cartilage specificity. The specificity for the hyoid bone had a Higgins’ I² of 45%, meaning moderate heterogeneity, and the sensitivity for the thyroid cartilage had a Higgins’ I² of 67%, meaning substantial heterogeneity. Cochran’s Q was significant for both, with p = 0.07 and p < 0.01 respectively.

Sensitivity and specificity

Data on fractures, scan parameters and anatomical variants is presented in Tables 4, 5, 6 and 7. Figures 2, 3, 4 and 5 demonstrate the sensitivity and specificity of PMCT in detection of hyoid bone and thyroid cartilage fractures. The study by Graziani et al. [11] was omitted from these calculations, as it was not specified whether the fractures were in the hyoid bone or thyroid cartilage. Overall, the sensitivity for hyoid bone fractures was 0.70 [0.59; 0.79] and the specificity 0.92 [0.80; 0.97]. For fractures of the thyroid cartilage, overall sensitivity was calculated as 0.80 [0.62; 0.91] and overall specificity as 0.76 [0.63; 0.85].

The individual sensitivity for hyoid bone fractures ranged from 0.55 to 0.88, and the specificity ranged from 0.72 to 0.99. The individual sensitivities and specificities for the thyroid cartilage ranged from 0.50 to 0.94 and from 0.25 to 0.93 respectively. Across both structures, this means that the rate of correctly identifying fractures varied from 50% to 94% between the different publications, and the rate of correctly identifying structures with no fractures varied from 25% to 99%.

Table 4 Fracture types. Demonstrates number and type of fractures discovered by autopsy and PMCT.
Table 5 Fracture overview by number of cases
Table 6 Scan parameters
Table 7 Overview of anatomical variants and gas bubble sign discovered by autopsy and PMCT.

Hyoid bone

Fig. 2

Hyoid bone sensitivity. Sensitivity (grey square with black dot) and confidence interval (solid line) for hyoid bone fractures in each of the papers included in meta-analysis and aggregate sensitivity (dashed line with grey rhomboid), demonstrated by a Forest plot. TP true positives, FN false negatives, C.I. confidence interval

Fig. 3

Hyoid bone specificity. Specificity (grey square with black dot) and confidence interval (solid line) for hyoid bone fractures in each of the papers included in meta-analysis and aggregate specificity (dashed line with grey rhomboid), demonstrated by a Forest plot. TP true positives, FN false negatives, C.I. confidence interval

Thyroid cartilage

Fig. 4

Thyroid cartilage sensitivity. Sensitivity (grey square with black dot) and confidence interval (solid line) for thyroid cartilage fractures in each of the papers included in meta-analysis and aggregate sensitivity (dashed line with grey rhomboid), demonstrated by a Forest plot. TP true positives, FN false negatives, C.I. confidence interval

Fig. 5

Thyroid cartilage specificity. Specificity (grey square with black dot) and confidence interval (solid line) for thyroid cartilage fractures in each of the papers included in meta-analysis and aggregate specificity (dashed line with grey rhomboid), demonstrated by a Forest plot. TP true positives, FN false negatives, C.I. confidence interval

PMCT

As seen in Table 4, fractures of the superior horns of the thyroid cartilages were the most common type of fractures in the included studies. Three of the studies used the same slice thickness of 1.5 mm [6, 14, 15] as seen in Table 6. Lyness et al. [6] also used 2.0 mm. Most of the other included studies used similar thinner slice thicknesses of 0.4–0.625 mm [7,8,9, 13], whereas de Bakker et al. reported a slice thickness of 1.0 mm and 0.5 mm for the larynx explant scan [10]. One study did not report on the slice thickness [12]. Deininger-Czermak et al. [9] also used a 0.6 mm slice thickness in addition to 0.4 mm for the separate larynx scan.

Four of the included studies reported the kernel used [6,7,8, 10], as seen in Table 6. These studies all discriminated between kernel-based reconstructions for bone and soft tissue, except Lyness et al. [6].

Anatomical variants were investigated in three studies [6, 7, 14] as seen in Table 7. Treitl et al. [7] found all variations through both PMCT and PMFP, while Kempter et al. [14] only found one on PMCT. Lyness et al. [6] found that PMCT diagnosed more variants of the hyoid bone than autopsy, but considerably fewer thyroid cartilage anomalies.

Four studies found gas bubbles near fractures [6, 8, 9, 14]. One study found almost as many gas bubbles without a fracture (n = 4) as gas bubbles close to fractures (n = 5) [8]. Lyness et al. [6] also found gas bubbles with no apparent fracture in one case.

Discussion

This paper aims to describe the diagnostic accuracy of PMCT in cases involving traumatic neck injuries and fractures of the HLC based on calculations of sensitivity and specificity. For hyoid bone fractures, overall specificity was higher than overall sensitivity, meaning that PMCT-based diagnosis seems to be better at correctly identifying hyoid bones without fractures than at finding the fractures. The opposite is true for thyroid cartilage fractures, but here the overall sensitivity and specificity are more similar. The calculated overall sensitivities and specificities all lie between 0.70 and 0.92, whereas the individual study results span a much wider range.

The slice thickness used in the studies varies from 0.4 to 2.0 mm. In the studies from 2005 to 2012 there is no apparent correlation between the number of diagnosed fractures and slice thickness, although an almost perfect specificity of 0.99 was calculated for hyoid fractures reported in the study by Le Blanc-Louvry et al. [13]. This specificity is likely caused by the high number of individuals without a hyoid bone fracture in this study [13]. In the studies from 2020 to 2022, Deininger-Czermak et al. [8, 9] and Treitl et al. [7] correctly diagnosed more hyoid and thyroid fractures using a slice thickness of 0.4–0.625 mm than Lyness et al. [6] did with a slice thickness of 1.5–2.0 mm, based on the calculated sensitivities. De Bakker et al. [10] used a slice thickness of 0.5–1.0 mm, in between those of the previously mentioned studies, and the calculated sensitivity for the hyoid bone correlates with this, although the sensitivity for the thyroid cartilage is slightly lower than that of Lyness et al. [6]. This indicates the need for a slice thickness thinner than 1.0 mm. The calculated specificities as seen in Figs. 3 and 5 do not correlate with slice thickness in the same way, indicating that slice thickness is not as important for correctly identifying the hyoid bones and thyroid cartilages with no fractures. For thyroid fractures, the study by Treitl et al. [7], which used a thin slice thickness, had the lowest specificity of 0.25.

As indicated by Bauer et al. who used PMCT for neck trauma, thinner slice thicknesses allow for a more accurate diagnosis and fracture grading [33], and Kettner et al. [34] also mention that it is necessary for fractures to involve two or more slices to be discovered by PMCT. Thus, a thinner slice thickness might increase the number of fractures diagnosed. Since the optimal slice thickness for fractures of the HLC is unknown, it would be valuable to compare different slice thicknesses in future studies.

Only Treitl et al. [7] directly compared different degrees of dislocation and found a much higher kappa level of agreement between PMCT and PMFP for displaced fractures than for fissures and non-displaced fractures. This suggests that dislocation is an important factor for diagnosing fractures on PMCT. It was not possible to calculate the sensitivity and specificity for different age groups due to the small number of cases and the lack of data in some studies. As mentioned previously, there is a greater risk of fractures with increasing age. Therefore, age-dependent occurrence of fractures and possibly dislocation should be investigated in future studies.

As reported, some of the included studies found additional fractures on PMCT, which were not discovered during autopsy. Only Graziani et al. [11] and Kempter et al. [14] mention the possibility of these findings simply being false positives. Deininger-Czermak et al. [8, 9] explain the discrepancies as possibly being due to a lack of discoverable perifocal hemorrhages or instability of the HLC before autopsy, so that fractures were not suspected.

The diagnostic accuracy of the PMCT analysis is also dependent on the experience of the investigator and their knowledge of forensic pathology. Two of the included studies comment on this [11, 12]. Although Graziani et al. [11] did not find an improved diagnostic accuracy after increase in experience among the radiologists, Decker et al. [12] mention that it might require years of training to acquire the skills to properly identify the injuries.

De Bakker et al. [10] additionally scanned the HLC explant separately, but still achieved some of the lowest sensitivities and specificities. This study [10] also found negative kappa values for location and number of fractures of the hyoid bone, meaning that the agreement was worse than would be expected by chance [19].

Some papers describe the use of micro-CT for finding smaller fractures of the HLC and examining decomposed bodies, but so far, no larger studies have been conducted [34,35,36]. Papers regarding micro-CT were not included in the analyses, but are promising for finding smaller fractures of the HLC [34,35,36]. Micro-CT requires scanning the HLC ex-situ, and as previously mentioned, removing it from the body might cause injury to the specimen or produce artefacts.

An alternative postmortem imaging technique is postmortem magnetic resonance imaging (PMMR), which has been examined in traumatic deaths and proven to be better at diagnosing soft tissue injuries [37]. Three of the included studies additionally investigated PMMR in traumatic neck injuries [8, 9, 15] and found promising results compared to PMCT. Further analysis of the PMMR results was outside the scope of this paper. As there are disagreements between autopsy, PMCT and PMMR, with PMCT sometimes finding more fractures than autopsy and PMMR finding more soft tissue injuries, a combination of the different methods might be necessary to increase the diagnostic accuracy [8]. This should be investigated in future studies. The cost and accessibility of PMMR might limit its use.

Only three of the studies investigated anatomical variants of the HLC, with heterogeneous results [6, 7, 14]. PMCT had a perfect detection rate of anatomical variants in one study [7] and found more variations than autopsy in another study [14]. Despite this, a third study had variants misdiagnosed as fractures [6]. The distinction between fractures and variants is very important to ensure a high level of diagnostic accuracy in PMCT, and further studies are needed to determine how useful PMCT is in these cases.

Gas bubbles were found both associated with fractures and not associated with fractures, and the “gas bubble sign” [17] was not proven as a sure sign of HLC fractures in the examined studies.

An advantage of PMCT is its time efficiency. Autopsy, and especially additional examinations like PMFP, are time consuming. Treitl et al. [7] report a mean time of 208.2 min for PMFP versus 28.9 min for PMCT, a substantial reduction in the time used. Another advantage is the possibility to save images and ensure objectivity. This allows for review in the future, possibly by other professionals. Furthermore, some institutions aspire to limit the number of autopsies due to, for example, religious reasons or public opinion [3].

A disadvantage of PMCT is the cost and availability of the equipment. PMCT is routine in Danish institutions [1] but might not be readily available in other institutions.

To limit the risk of bias, case studies with n ≤ 3 were excluded, which in turn limited the number of papers included. One of the included studies was excluded from the sensitivity and specificity calculations, as it did not specify whether the fractures were in the hyoid bone or the thyroid cartilage, and grouped these as one [11].

Many of the studies were not single-blinded or double-blinded. In some cases, pathologists had access to the radiology report and in others, the radiologists had access to the autopsy report. In the future, it would be recommended to perform more prospective blinded studies with larger sample sizes.

Through the review of the literature, a lack of uniform reporting was found with regards to PMCT technical parameters, fracture location and displacement, and statistical analysis. It would be advisable to report these in a similar way in future studies for easier comparison of results.

Conclusion

This systematic review aimed to investigate the sensitivity and specificity of a PMCT scan in fracture diagnosis of the hyoid-larynx complex (HLC) compared to traditional autopsy in cases involving traumatic neck injuries. The individual sensitivity for hyoid bone fractures ranged from 0.55 to 0.88, while the aggregated sensitivity was calculated as 0.70 [0.59; 0.79], and the specificity ranged from 0.72 to 0.99, aggregated at 0.92 [0.80; 0.97]. The individual sensitivities and specificities for the thyroid cartilage ranged from 0.50 to 0.94 and from 0.25 to 0.93 respectively, with aggregated sensitivity and specificity at 0.80 [0.62; 0.91] and 0.76 [0.63; 0.85]. This shows great variation in the results, and a large range between studies.

We found that PMCT is a useful diagnostic method in cases of traumatic neck injuries with fractures of the HLC but should not replace the traditional autopsy. For future studies comparing PMCT and autopsy, we recommend large blinded prospective studies using uniform reporting on fracture details, scan protocols and slice thicknesses.