1 Introduction

The adult human skeleton consists of 206 bones, each of which has proven important for exploring evolutionary, bioarchaeological and forensic questions. Some skeletal elements, like the skull and the pelvis, have received special attention, mainly due to their ability to elucidate issues such as heritability and locomotion in modern and past populations (Martínez-Abadías et al. 2009; Gruss et al. 2015), while others, like the small bones of the hands and feet, have been comparatively neglected. This can be attributed to the poor preservation of such small bones in the archaeological and fossil record and to potential problems of identification and siding. Nevertheless, in forensic investigations they are as important as any other skeletal element for extracting biological information that can lead to identification (e.g. unique conditions, handedness) (Danforth and Thompson 2008a; Varas and Thompson 2011) and/or provide evidence of violent events (e.g. defence cut marks, healed fractures) (Saukko and Knight 2016). The current study focuses on the morphological variation of phalanges in an effort to develop a pair-matching method applicable in commingled contexts.

Pair-matching techniques have been developed to improve the sorting of commingled human remains in various situations, such as archaeological common burials, mass graves, and commingled decomposed remains produced by human atrocities, accidents or natural disasters (Garrido Varas and Intriago Leiva 2012; Karell et al. 2016). Such techniques can be based on visual assessment, osteometric sorting (Adams and Byrd 2006; Thomas et al. 2013) and/or more complex methods of pattern comparison such as geometric morphometrics (Garrido-Varas et al. 2015) and point cloud comparison (Karell et al. 2016). Both virtual models and physical measurements have been used to develop pair-matching methods for the scapula, the calcaneus and the metatarsals (Thomas et al. 2013; Garrido-Varas et al. 2015; Karell et al. 2016; Lynch 2017). Some individualisation techniques have successfully explored matching articulations based on metrics and regression analysis, but these are limited to large articulations of the lower limbs (Anastopoulou et al. 2018a, b). There is a lack of methodological approaches for sorting smaller bones, such as the wrist bones, tarsal bones and phalanges.

Phalanges, in particular, pose a challenge for pair matching because phalanges from different digits of the hand are easily confused when sorted with the naked eye, which complicates identification. No study to date presents a clear methodology for pair-matching phalanges belonging to the same individual accurately and reliably. Thus, the aim of this study was to develop a methodology for pair-matching phalanges based on 3D models reconstructed from Computed Tomography scans, following a methodology similar to that developed by Karell et al. (2016) for the humerus.

2 Materials and Methods

2.1 Materials

The current study employed a sample of 515 phalanges from 41 hands (18 right and 23 left) belonging to 24 individuals. Proximal (PP), middle (MP) and distal (DP) phalanges from every available digit were used. A detailed list of the available samples for each digit can be found in Table 1.

Table 1 Number of bones modelled from 24 individuals

The material of the study derived from individuals that were submitted to postmortem CT as part of a different project (Virtopsy.GR). The Virtopsy.GR is a research project that explores the validity of postmortem computed tomography (PMCT) as an additional technique to the autopsy findings in forensic investigations of death on the island of Crete (Kranioti et al. 2017). Each individual had undergone PMCT just a few hours after death. The CT scan data were anonymized and each case was given an identification number with basic demographic data (age and sex). The project was approved by the Ethics Committee of the University Hospital of Heraklion in Crete in June 2016.

2.2 Methods

2.2.1 Scanning Protocol

The CT scans were acquired by the General University Hospital of Heraklion in Crete, using a Revolution GSI system (General Electric Medical Systems, USA). This system provides up to 128 slices per tube rotation and offers more than 15 applications for routine use in cardiac, oncology, neurology, spine, urology, musculoskeletal and other imaging. In GSI mode the system switches the tube potential between 80 and 140 kVp at a rate of up to 4.8 kHz, which allows the reconstruction of spectral images in the range from 40 to 140 keV. GE’s Smart Spectral tools, such as GSI Assist and GSI Viewer 3D, enable one-click workflow on the console, and GSI ASiR delivers dose-neutral spectral CT protocols. The scanning protocol for the Virtopsy.GR project uses a tube current of 50 mA, a tube voltage of 120 kV, a slice thickness of 0.625 mm and a slice increment of 0.5 mm. Scans with a field of view of 250 × 250 mm (matrix 512 × 512) were made in the coronal (transverse) plane. Voxel size was 0.5 × 0.5 × 0.5 mm. Data were saved in Digital Imaging and Communications in Medicine (DICOM) format and then converted to High Dynamic Range (HDR) files. Only the CT scan data (the HDR files) were used for the purpose of this study.

2.2.2 Scanning Method and Segmentation

Following the completion of these scans, each scan was “cropped” using Amira 6.0 software so as to include only the hands. Each hand was loaded into Amira 6.0 and each phalanx was manually segmented using the “brush” tool. This procedure resulted in 515 three-dimensional phalanx models, which were exported in stereolithography (.stl) and Wavefront (.obj) formats and then randomized prior to analysis.

2.2.3 Model Manipulation

For the purposes of this paper, two datasets were analyzed. D1 included the original 3D models created from the segmentation of the CT scans, and D2 included the same models with the internal material (compact and trabecular bone) removed, so that only the surface was maintained. D2 models were created using a function of Viewbox 4.1 beta software.

2.2.4 Mesh-to-Mesh Value Comparison (MVC) Method

All right phalanx models were mirror-imaged using the free software NetFabb basic and were named mirror-imaged right models (MIR). Two folders containing all left (L) and MIR models were compared automatically using Viewbox 4.1 beta software, following the guidelines set by Karell and colleagues (Karell et al. 2016). The software uses a trimmed ICP algorithm (Besl and McKay 1992; Chetverikov et al. 2002) to compare all homologous points between two models (meshes) and computes a single value which expresses the similarity between the shapes of the two models. This value is called the mesh-to-mesh value (MTMV) and is expressed in mm. The software runs all possible comparisons between the two folders simultaneously using the following settings. The estimated overlap for the scan was set to 100% and the number of initial positions for rough alignment was set to 20. The rough alignment used the nearest neighbor search “Approximate fast” with a point sampling of 1%, matching point to point with 100 iterations. The fine alignment used the nearest neighbor search “Exact with normal compatibility” with a point sampling of 100%, matching point to plane with 100 iterations. On completing the mesh-to-mesh comparison, the program automatically creates an Excel spreadsheet of all the MTMVs for analysis. The lowest MTMVs are hypothesized to belong to true pairs, meaning that the left and mirror-imaged right phalanges belong to the same individual.
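The all-against-all comparison can be illustrated with a minimal Python sketch. It is not the Viewbox implementation. It assumes that the two models have already been registered (the trimmed ICP alignment itself is omitted) and that each model is a plain list of vertex coordinates. The function names `mirror_x`, `mesh_to_mesh_value` and `compare_all` are hypothetical, and the toy coordinates are invented. The value returned is the mean nearest-point distance over the best-fitting fraction of points, which is only one plausible reading of a single similarity value expressed in mm.

```python
import math

def mirror_x(points):
    """Mirror a list of (x, y, z) points across the YZ plane by negating x."""
    return [(-x, y, z) for x, y, z in points]

def nearest_distances(source, target):
    """Distance from each source point to its closest target point (brute force)."""
    return [min(math.dist(p, q) for q in target) for p in source]

def mesh_to_mesh_value(source, target, overlap=1.0):
    """Mean of the smallest `overlap` fraction of nearest-point distances.

    overlap=1.0 keeps every point, mirroring the 100% overlap setting;
    smaller values trim the worst-fitting points, as a trimmed ICP would.
    """
    distances = sorted(nearest_distances(source, target))
    keep = max(1, round(overlap * len(distances)))
    return sum(distances[:keep]) / keep

def compare_all(lefts, mirrored_rights):
    """Compare every left model with every mirror-imaged right model.

    Both arguments are dicts {model_id: list of points}; the result is a
    dict {(left_id, mir_id): value}, i.e. the spreadsheet of all values.
    """
    return {
        (left_id, mir_id): mesh_to_mesh_value(left_pts, mir_pts)
        for left_id, left_pts in lefts.items()
        for mir_id, mir_pts in mirrored_rights.items()
    }

# Toy example (invented coordinates): a perfectly symmetric "pair".
left = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.5)]
right = mirror_x(left)   # the contralateral bone of the same toy individual
mir = mirror_x(right)    # mirror-imaged right (MIR) model
table = compare_all({"L01": left}, {"MIR01": mir})
```

In this sketch the lowest value in each row or column of `table` would mark the candidate pair. A real workflow would replace the brute-force search with a spatial index and perform the registration first.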

2.2.5 ROC Analysis

MedCalc software was used to conduct a Receiver Operating Characteristics (ROC) analysis on the MTMVs. ROC analysis is commonly used to evaluate medical tests (Bewick et al. 2004), such as whether a person is positive or negative for a medical condition, for instance the presence of a virus. Here, ROC curves were employed to evaluate MTMVs between potential pairs of bones as predictors of true pairs (belonging to the same individual). The hypothesis tested was whether a pair (L-MIR phalanx) is a correct match (positive) or not (negative). If both diagnosis (true match) and test (predicted match) are positive, the result is called true positive (TP), whereas if the diagnosis is positive and the test is negative, the result is called false negative (FN). Similarly, a negative diagnosis with a negative test is called true negative (TN) and a negative diagnosis with a positive test is called false positive (FP). The quality of the test can be measured with sensitivity and specificity (Kranioti and Tzanakis 2015). Sensitivity (true positive probability) is the proportion of true matches that are correctly identified by the test, while specificity (true negative probability) is the proportion of non-pairs that are correctly identified by the test. The predictive value of a positive test is defined as PVP = TP/(TP + FP) and the predictive value of a negative test as PVN = TN/(TN + FN) (Bewick et al. 2004; Kranioti and Tzanakis 2015). To help decide whether a pair is a match or not, a cut-off point of the MTMVs is chosen. The ROC curve is widely accepted as a method for selecting an optimal cut-off point for a test and for making comparisons between tests (Akobeng 2007). The ROC curve is created by calculating sensitivity and specificity and plotting y = sensitivity against x = 1 − specificity for the entire range of cut-off points (Kranioti and Tzanakis 2015). A large area under the curve (AUC) reflects good performance of the test (Bewick et al. 2004; Akobeng 2007).
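The definitions above can be written out as a short Python sketch. It is an illustration only and does not reproduce the MedCalc procedure. The function names are hypothetical, the scores are invented, and a comparison is treated as a predicted pair when its MTMV is at or below the cut-off, since low values indicate similarity.

```python
def confusion_counts(scores, is_true_pair, cutoff):
    """Count TP, FP, TN, FN when scores <= cutoff are predicted to be pairs."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_true_pair):
        predicted = score <= cutoff
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def quality_measures(tp, fp, tn, fn):
    """Sensitivity, specificity, PVP and PVN (assumes no denominator is zero)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PVP": tp / (tp + fp),
        "PVN": tn / (tn + fn),
    }

def roc_points(scores, is_true_pair):
    """(1 - specificity, sensitivity) for each observed cut-off, from the origin."""
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(scores, is_true_pair, cutoff)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Invented MTMVs (mm) for eight comparisons, three of which are true pairs.
scores = [0.30, 0.35, 0.42, 0.50, 0.61, 0.75, 0.90, 1.20]
truth = [True, True, False, True, False, False, False, False]

counts = confusion_counts(scores, truth, cutoff=0.45)
measures = quality_measures(*counts)
area = auc(roc_points(scores, truth))
```

With these toy data, one true pair lies above the cut-off of 0.45 mm (a false negative) and one non-pair lies below it (a false positive).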

3 Results

A total of 73,000 comparisons were automatically conducted; MTMVs were calculated and then analyzed using ROC analysis. Sensitivity, specificity, area under the curve (AUC) and cut-off points were calculated for datasets D1 and D2 and compared. For example, for the D1 Digit 2 proximal phalanx, sensitivity was 71.4%, specificity was 98.0% and the threshold was set to 0.434 mm (Table 2). For the middle and distal phalanges, sensitivity was 86.7% and 62.5%, while specificity was 62.9% and 72.1%, respectively. The threshold was 0.588 mm for the middle phalanx models and 0.511 mm for the distal ones. For D2, the Digit 1 distal phalanx showed a sensitivity of 100.0% and a specificity of 63.7%, and the Digit 3 proximal phalanx 85.7% and 92.4%, respectively. As indicated in Table 2, there was no significant difference in the performance of the two datasets.

Table 2 Sensitivity, specificity, cut-off points and AUC for each pair of phalanges

ROC analysis results in a cut-off value which combines the best prediction of true pairs with best rejection of non-pairs. It is possible though to calculate, using thousands of simulations, the values for different thresholds of specificity and sensitivity and the corresponding cut-off points. Tables 3 and 4 illustrate the adjusted values for a fixed sensitivity of 80, 90, 95, 97.5 and 99% (Table 3) and fixed specificity of 80, 90, 95, 97.5 and 99% (Table 4) for D1 and D2 in an effort to predict the highest number of true pairs and to reject the highest number of non-pairs.

Table 3 Specificity, threshold values and 95% confidence intervals for fixed sensitivity at 80, 90, 95, 97.5 and 99% for D1 and D2
Table 4 Sensitivity, threshold values and 95% confidence intervals for fixed specificity at 80, 90, 95, 97.5 and 99% for D1 and D2
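The idea of fixing one rate and reading off the cut-off and the other rate can be sketched in Python as follows. This is a simplified illustration with invented scores and hypothetical function names. It scans only the observed cut-offs and does not reproduce the simulation-based estimates or the 95% confidence intervals reported in Tables 3 and 4.

```python
def rates(scores, is_true_pair, cutoff):
    """Sensitivity and specificity when scores <= cutoff are called pairs."""
    pair_scores = [s for s, t in zip(scores, is_true_pair) if t]
    non_pair_scores = [s for s, t in zip(scores, is_true_pair) if not t]
    sensitivity = sum(s <= cutoff for s in pair_scores) / len(pair_scores)
    specificity = sum(s > cutoff for s in non_pair_scores) / len(non_pair_scores)
    return sensitivity, specificity

def cutoff_for_sensitivity(scores, is_true_pair, target):
    """Lowest observed cut-off whose sensitivity reaches `target`.

    Returns (cutoff, specificity at that cutoff), or None if unreachable.
    """
    for cutoff in sorted(set(scores)):
        sensitivity, specificity = rates(scores, is_true_pair, cutoff)
        if sensitivity >= target:
            return cutoff, specificity
    return None

def cutoff_for_specificity(scores, is_true_pair, target):
    """Highest observed cut-off whose specificity is still at least `target`.

    Returns (cutoff, sensitivity at that cutoff), or None if unreachable.
    """
    best = None
    for cutoff in sorted(set(scores)):
        sensitivity, specificity = rates(scores, is_true_pair, cutoff)
        if specificity >= target:
            best = (cutoff, sensitivity)
    return best

# Invented MTMVs (mm) for eight comparisons, three of which are true pairs.
scores = [0.30, 0.35, 0.42, 0.50, 0.61, 0.75, 0.90, 1.20]
truth = [True, True, False, True, False, False, False, False]

keep_all_pairs = cutoff_for_sensitivity(scores, truth, target=0.99)
reject_safely = cutoff_for_specificity(scores, truth, target=0.99)
```

Raising the target sensitivity pushes the cut-off up and lets more non-pairs through, while raising the target specificity pulls it down and misses more true pairs. This is the trade-off that the two tables quantify.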

The MTMVs for true pairs were analysed for D1 and D2. Table 5 illustrates mean, standard deviation, minimum and maximum values for both groups. A Wilcoxon paired test with 10,000 Monte Carlo simulations was performed to test whether the MTMVs of the true pairs differed between D1 and D2; the test produced negative results, i.e. no significant difference was found.
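A Monte Carlo version of the paired Wilcoxon test can be sketched in Python as below. This is an assumption about the general technique (random sign flips of the paired differences), not a reproduction of the statistical package used in the study. The function names and the example values are hypothetical.

```python
import random

def signed_rank_statistic(diffs):
    """Sum of the ranks of the positive differences.

    Zero differences are dropped and tied absolute values share a mean rank.
    """
    nonzero = [d for d in diffs if d != 0]
    ordered = sorted(nonzero, key=abs)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        ranks[abs(ordered[i])] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return sum(ranks[abs(d)] for d in nonzero if d > 0)

def monte_carlo_wilcoxon(x, y, n_sim=10000, seed=0):
    """Two-sided Monte Carlo p-value for paired samples x and y.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so signs are flipped at random n_sim times.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    centre = n * (n + 1) / 4  # expected statistic under the null hypothesis
    observed = abs(signed_rank_statistic(diffs) - centre)
    hits = 0
    for _ in range(n_sim):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(signed_rank_statistic(flipped) - centre) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)

# Invented MTMVs (mm) of the same six true pairs in two datasets.
d1 = [0.41, 0.38, 0.52, 0.47, 0.44, 0.36]
d2 = [0.40, 0.40, 0.49, 0.51, 0.39, 0.42]
p_value = monte_carlo_wilcoxon(d1, d2, n_sim=2000, seed=1)
```

A large p-value, as in the negative result reported above, would mean that the paired values give no evidence of a systematic difference between the two datasets.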

Table 5 Minimum, maximum, mean and standard deviation of MTMVs for true pairs and Wilcoxon z-scores for paired differences between D1 and D2

Interpretation of the results and potential application of the method

To better explain the results of the analysis, we showcase an example using the Digit 3 PP subsample. We analysed 22 left and 15 right Digit 3 PP, which can be combined into 330 possible pairs, of which only 14 are true pairs. ROC analysis results in AUC = 0.932 (Fig. 1) with sensitivity 85.7% and specificity 91.1% for D1 and similar results for D2 (85.7% and 92.4%, respectively; see Table 2). We assume that the bones were scanned with a CT scanner and use D1 for the purpose of this exercise.

Fig. 1 Area under the curve for Digit 3 proximal phalanx

For a threshold value of 0.497 mm, all MTMVs equal to or less than 0.497 mm indicate a true pair. In our data this resulted in 40 pairs, of which 12 are true pairs. The method thus correctly identified 12/14 true pairs (85.7% sensitivity), but it also flagged 28 pairs that are not correct. Similarly, of the remaining 290 comparisons (MTMV > 0.497 mm), all but two do not belong together; the method therefore rejects two true pairs and 288 non-pairs (91.1% specificity). To identify all pairs, one must fix sensitivity to 99% (Table 3), which raises the MTMV threshold to 0.588 mm. This results in 95 pairs being identified as matches, even though only 14 are true pairs, while 235 pairs are rejected. It is worth mentioning that for 11 of the 14 true pairs the true match showed the lowest MTMV of all comparisons involving those bones. Mean MTMV for true pairs was 0.4399 ± 0.0853 mm (SD). MTMVs above the mean plus 2 SD (0.610 mm) correctly exclude 227 non-pairs, and MTMVs above the mean plus 3 SD (0.6958 mm) can be classified as non-pairs with great confidence.
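The counts in the Digit 3 PP example can be checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted in the text (22 left and 15 right bones, 14 true pairs, 40 comparisons at or below 0.497 mm of which 12 are true pairs, and the mean and SD of the true-pair MTMVs) and derives the remaining quantities from them. The variable names are illustrative.

```python
# Figures quoted in the text for Digit 3 proximal phalanges (dataset D1).
n_left, n_right, n_true_pairs = 22, 15, 14
flagged, true_positives = 40, 12          # comparisons with MTMV <= 0.497 mm
mean_mtmv, sd_mtmv = 0.4399, 0.0853       # true pairs, in mm

# Quantities that follow from those figures.
n_comparisons = n_left * n_right                      # 330
n_non_pairs = n_comparisons - n_true_pairs            # 316
false_positives = flagged - true_positives            # 28
false_negatives = n_true_pairs - true_positives       # 2
true_negatives = n_non_pairs - false_positives        # 288
rejected = n_comparisons - flagged                    # 290

sensitivity = true_positives / n_true_pairs           # 12/14
specificity = true_negatives / n_non_pairs            # 288/316
pvp = true_positives / flagged                        # 12/40

two_sd_threshold = mean_mtmv + 2 * sd_mtmv            # about 0.6105 mm
three_sd_threshold = mean_mtmv + 3 * sd_mtmv          # about 0.6958 mm

print(f"sensitivity = {sensitivity:.1%}")             # 85.7%
print(f"specificity = {specificity:.1%}")             # 91.1%
print(f"PVP         = {pvp:.1%}")                     # 30.0%
```

The low predictive value of a positive result shows why the method is better suited to excluding non-pairs than to confirming pairs: most comparisons below the cut-off are still non-pairs, simply because non-pairs vastly outnumber true pairs.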

Figure 2 illustrates an example of a true match (a, MTMV = 0.376 mm) and of a non-match (b, MTMV = 1.201 mm). A colored distance map has been applied in both cases to show the distances between homologous points on the two models. Blue indicates point distances close to zero and red indicates values close to 2 mm.

Fig. 2 A colored distance map applied to two possible pairs of Digit 3 proximal phalanges, created with Viewbox 4.1 beta: (a) true pair with MTMV = 0.376 mm and (b) non-pair with MTMV = 1.201 mm

Thus, one can conclude:

  1. MTMV values >0.610 mm are highly indicative of non-pairs in Digit 3 proximal phalanges, while values >0.700 mm safely indicate non-pairs.

  2. The lowest MTMV for a pair of Digit 3 proximal phalanges indicates a high probability that the bones belong together.

Naturally, these results need to be confirmed with an independent sample.

4 Discussion

Phalanges are probably the most neglected bones of the human skeleton due to their small size. They have been previously studied for the purpose of creating biometric standards for sex (Smith 1991, 2000; Scheuer and Elkington 1993; Case and Ross 2007; El Morsi and Al Hawary 2013; Mahakkanukrauh et al. 2013), stature (Habib and Kamal 2010) and age estimation (Gilsanz and Ratib 2005) based on both classical osteometry and virtual methods. Yet, there is a substantial lack of population specific standards compared to other bones that are more likely to be recovered in forensic or archaeological settings, such as the skull or long bones. Nevertheless, there are a number of studies examining the hand morphology of modern humans, primates and fossils from an evolutionary and functional perspective (Deane and Begun 2008; Tocheri et al. 2008; Mednikova 2011, 2013; Ward et al. 2014; Almécija et al. 2015; Lorenzo et al. 2015), whereas a few others have examined hand plasticity in relation to activity (Karakostis et al. 2016). In addition, a few studies focused on providing a detailed methodology for identifying and siding phalanges and exploring directional and functional asymmetry (Case and Heilman 2006; Danforth and Thompson 2008b; Christensen 2009; Garrido Varas and Thompson 2011). To date, no other study has explored potential pair-matching techniques for sorting phalanges in commingled situations.

The current study used 515 hand phalanges belonging to 24 individuals to explore the potential value of the symmetry in these bones for sorting pairs. Left and right bones were CT scanned, modelled and compared to each other using a series of software packages combining manual, semi-automated and automated procedures. A single value was used to express the overall shape similarity between two models (L and MIR) and was then used to differentiate between pairs and non-pairs with the aid of ROC curves. Different threshold values were calculated in order to facilitate either the inclusion of all pairs (99% sensitivity) or the safe exclusion of non-pairs (99% specificity). These calculations allow an interactive use of the ROC curve by the user, depending on the circumstances of the case under investigation. For example, excluding possible matches may be as crucial to a forensic investigation as identifying true matches. In fact, the results of this study indicate that an MTMV > 0.61 mm for Digit 3 PP (which coincides with 2 SD above the mean) is highly indicative of a non-match. This is in agreement with a previous study using the humerus, where all but one true pair were identified using a threshold of the mean plus two standard deviations (Karell et al. 2016). By selecting a threshold of 3 standard deviations above the mean, it is almost certain that one can safely identify non-pairs (see the example of Digit 3 PP). This information is very important in a commingled situation, as it would significantly reduce the number of DNA comparisons necessary to ascribe the skeletal element to the correct missing person.

The current method is the first semi-automatic three-dimensional method that attempts to pair-match phalanges. The method was developed in a sample of mixed ancestry and sex, and these factors do not seem to affect its performance, in accordance with previous studies using MTMVs (Karell et al. 2016). Naturally, the sample needs to be further expanded in size, including surface scans and bones in different taphonomic conditions, so that the preliminary findings presented here can be validated. Nevertheless, the results appear very promising, especially if one takes into account the fact that, although phalanges are the least anatomically complex skeletal elements of the human body, the current method achieves notable success in rejecting a substantial number of possible pairs.