Introduction

Estimation of sex from skeletal remains falls within the scope of practice of forensic anthropologists [1, 2]. This process constitutes an important step in the development of the biological profile of an individual from human skeletal remains [1, 3, 4]. While distinct morphological traits of the pelvis and the skull are sexually dimorphic, the accuracy of sex estimation using non-metric methods depends on the level of experience of the examiner [5, 6]. Consequently, measurements of various bones of the human skeleton have been subjected to a number of statistical analyses, including discriminant function and logistic regression analyses [7,8,9,10,11,12,13]. However, the resulting equations have been shown to be population-specific [14,15,16], which means that equations derived for one population group produce lower classification rates when applied to other population groups. Consequently, population-specific equations have been derived for measurements of the skull [17, 18], clavicle [19, 20], long bones of the upper and lower extremities [7, 12, 13, 21, 22], sternum [23,24,25], vertebrae [16], pelvis [6, 26], hand and foot bones [11, 27,28,29,30,31,32,33] and tooth dimensions [34] in different parts of the world with acceptably high average accuracies.

In South Africa, a country with one of the highest murder rates in the world [35], similar efforts have been made to formulate local standards for sex estimation from the skull [10, 18, 36] and postcranial bones [8, 9, 11,12,13, 21, 37, 38]. The average accuracies in correct classification using these equations ranged between 56 and 98% [7,8,9, 18, 32, 37, 39]. These equations have been derived using samples of bones of South Africans of African descent (SAAD) and South Africans of European descent (SAED). Recent attempts at formulating similar equations for Mixed-Ancestry South Africans (MASA), a self-identified social group also known as Coloured, have been successful [7]. The need for population-specific equations is based on the existence of osteometric variations between population groups, and on the degree of sexual dimorphism exhibited by measurements of bones from different population groups [6, 40, 41]. However, Steyn and Patriquin [40] highlighted some major drawbacks in the development and application of population-specific discriminant function equations. These include the lack of bone collections or data for the formulation of population-specific equations in most parts of the world, and the need for prior knowledge of the population group of the skeleton before the appropriate discriminant function equations can be applied.

Thus, Steyn and Patriquin [40] developed global discriminant function equations for sex estimation using measurements of the pelvis and concluded, based on the data from pelvic bones, that population-specific equations are not required. Subsequently, other researchers have shown that global equations from pooled measurement data produce more precise estimates of stature [42] and sex [16, 43], with reasonably high average accuracies that are comparable to, and sometimes better than, those of population-specific equations. It is therefore the aim of this paper to (1) formulate global equations for sex estimation using measurements around the nutrient foramen of the long bones of the arm of South Africans of African descent (SAAD), South Africans of European descent (SAED) and Mixed Ancestry South Africans (MASA) and (2) compare the average accuracies obtained from the global equations with those obtained from population-specific equations. Dimensions around the nutrient foramen of the bone diaphysis offer an alternative to midshaft measurements in forensic investigations because the nutrient foramen is easy to identify and dimensions around it are independent of the maximum bone length [13, 44,45,46].

Materials and methods

Materials

Prior to the commencement of this study, an ethical clearance waiver (Certificate Number W-CJ-140604-1) was obtained from the School of Anatomical Sciences, University of the Witwatersrand, Johannesburg. The data analysed in the current study were obtained from two previously published studies [7, 21], in which a total sample of 988 bones (humeri: 327, radii: 325 and ulnae: 336) from South Africans of African descent (SAAD), Mixed Ancestry South Africans (MASA) and South Africans of European descent (SAED) of known sex and age-at-death was measured. These samples were obtained from the Raymond A. Dart Collection of Human Skeletons [47], housed in the School of Anatomical Sciences of the University of the Witwatersrand, Johannesburg, and the UCT Human Skeletal Collection [48], housed in the Department of Human Biology of the University of Cape Town, South Africa. Skeletons from both collections were mainly derived from cadavers that had been used for dissection as part of the training of medical, dental, physiotherapy and occupational therapy students. The demographic information about the cadavers, including ancestry, is documented in the catalogues of these collections. The distribution of the samples is shown in Table 1.

Table 1 Skeletal sample distribution

Methods

Measurements

Five measurements, namely maximum length (tl), linear distance from the proximal end of the bone to the nutrient foramen (penf), circumference at the nutrient foramen (circ), anteroposterior diameter at the nutrient foramen (apdiam) and mediolateral diameter at the nutrient foramen (mldiam), were taken on each left bone. In the absence of the left bone, the right bone was used, as there were no significant side differences. The measurements are described in detail in the previous studies [7, 21]. Data were described and analysed statistically using SPSS version 23.

Statistical analyses

Descriptive statistics, including means and standard deviations, were calculated for males and females separately, with the populations combined, for the humerus, radius and ulna. Thereafter, a one-way ANOVA test was performed to assess differences between the mean measurements of males and females for each of the bones. In addition, a multivariate analysis of variance (MANOVA) test was performed to assess whether statistically significant differences exist across multiple dependent variables at the same time. After establishing that significant differences exist between male and female mean measurements, combined data for all groups were subjected to stepwise and direct discriminant function analyses following the description of Bidmos and Asala [32]. The validity of the functions generated was assessed using the “leave-one-out” classification procedure. This procedure involves classifying each case in the sample using a function that is generated without the case being tested. For each bone, the top three performing functions with an average accuracy of more than 80% were selected. Each of the nine functions selected was used to predict sex for each of the cases in the three different population groups. The average accuracy in correct sex classification for each of the functions was calculated for each population group separately.
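The leave-one-out procedure described above can be illustrated with a minimal sketch. It is not the SPSS routine used in this study: it assumes a two-group Fisher linear discriminant on only two variables, and the function names and the measurement values (labelled as circumference and mediolateral diameter in mm) are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def fit_discriminant(rows, labels):
    """Fit a two-group, two-variable Fisher linear discriminant.

    Returns (weights, sectioning_point); a score above the sectioning
    point is assigned to group 1 ("male" here), otherwise to group 0.
    """
    groups = {0: [], 1: []}
    for row, label in zip(rows, labels):
        groups[label].append(row)
    means = {g: [mean(col) for col in zip(*pts)] for g, pts in groups.items()}

    # Pooled within-group covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for g, pts in groups.items():
        for p in pts:
            d = (p[0] - means[g][0], p[1] - means[g][1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(rows) - 2
    s = [[v / dof for v in row] for row in s]

    # Weights w = S^-1 (mean_1 - mean_0), using the explicit 2 x 2 inverse.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = (means[1][0] - means[0][0], means[1][1] - means[0][1])
    w = ((s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (s[0][0] * diff[1] - s[1][0] * diff[0]) / det)

    # Sectioning point: midpoint between the two group centroids.
    centroids = [w[0] * means[g][0] + w[1] * means[g][1] for g in (0, 1)]
    return w, sum(centroids) / 2


def leave_one_out_accuracy(rows, labels):
    """Classify each case with a function fitted without that case."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        w, cut = fit_discriminant(train_rows, train_labels)
        score = w[0] * rows[i][0] + w[1] * rows[i][1]
        predicted = 1 if score > cut else 0
        correct += predicted == labels[i]
    return correct / len(rows)


# Hypothetical (circ, mldiam) values in mm; 0 = female, 1 = male.
rows = [(60.0, 18.0), (62.0, 19.0), (61.0, 17.0),
        (63.0, 18.0), (59.0, 19.0), (64.0, 17.0),
        (70.0, 22.0), (72.0, 23.0), (71.0, 21.0),
        (73.0, 22.0), (69.0, 23.0), (74.0, 21.0)]
labels = [0] * 6 + [1] * 6

accuracy = leave_one_out_accuracy(rows, labels)
print(f"Leave-one-out accuracy: {accuracy:.1%}")
```

The toy groups are deliberately well separated, so every held-out case is classified correctly; real osteometric data overlap between the sexes and yield lower rates, as reported in the Results.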

In addition, population-specific stepwise and direct discriminant function equations were formulated for each bone. The best performing population-specific functions for each bone with average accuracies higher than 80% were selected. Each of the population-specific functions for a population group (for example SAAD) was applied to the data from the other two population groups (i.e. MASA and SAED). The average accuracy in correct sex classification was calculated separately for each population group being assessed in order to evaluate the performance of each population-specific function on other population groups.
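The cross-application step described above amounts to scoring every case with a function derived elsewhere and tallying correct classifications per population group. The sketch below illustrates this under stated assumptions: the group labels "A" and "B", the single-variable cutoff function and all measurement values are hypothetical, and accuracy is taken here as the overall proportion of correctly classified cases in each group.

```python
from collections import defaultdict


def accuracy_by_group(records, classify):
    """Return the proportion of correctly sexed cases per population group.

    records: iterable of (group, true_sex, measurements) tuples.
    classify: callable mapping a measurements dict to "M" or "F".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_sex, measurements in records:
        total[group] += 1
        if classify(measurements) == true_sex:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def group_a_function(measurements):
    """Hypothetical function 'derived' on group A: one cutoff on circ (mm)."""
    return "M" if measurements["circ"] > 60.0 else "F"


# Hypothetical cases: group A is the derivation sample, group B is a
# different population to which the same function is applied.
records = [
    ("A", "F", {"circ": 55.0}), ("A", "F", {"circ": 58.0}),
    ("A", "M", {"circ": 64.0}), ("A", "M", {"circ": 66.0}),
    ("B", "F", {"circ": 59.0}), ("B", "F", {"circ": 62.0}),
    ("B", "M", {"circ": 61.0}), ("B", "M", {"circ": 68.0}),
]

rates = accuracy_by_group(records, group_a_function)
for group, rate in sorted(rates.items()):
    print(f"Group {group}: {rate:.0%} correctly classified")
```

In this toy example the function classifies all group A cases correctly but misclassifies one larger group B female, which mirrors the kind of drop in accuracy that is assessed when a population-specific function is applied to another group.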

Results

The descriptive statistics of all measured variables for pooled data are displayed in Table 2. Males consistently showed higher mean measurements than females for all variables. Statistically significant differences were observed between male and female mean measurements at p ≤ 0.05 for all measurements. Supplementary Table 1 shows the descriptive statistics of each of the variables for each population group and for both sexes. The MANOVA test shows a statistically significant interaction between sex and population group for the humeri and radii variables (Supplementary Table 2). However, no statistically significant interaction was observed for the ulnae variables (Supplementary Table 2).

Table 2 Descriptive statistics of measurements around the nutrient foramina of the humerus, radius, and ulna from pooled data

The five humeri, radii and ulnae measurements were analysed using stepwise and direct discriminant functions (Table 3). The unstandardised coefficients, constants, average accuracies, cross-validation in correct sex classification and the sectioning points are presented in Table 3. Functions 1, 4 and 7 were derived from the stepwise analysis of measurements of the humeri, radii and ulnae with average accuracies of 82.4%, 86.5% and 83.5% respectively (Table 3). The other functions were formulated from a combination of measurements using direct discriminant function analysis of measurements of the humeri (Functions 2 and 3), radii (Functions 5 and 6) and ulnae (Functions 8 and 9). The average accuracies in correct sex classification ranged between 80.7% (Function 3, Table 3) and 85.2% (Functions 5 and 6, Table 3). The results of the cross-validation using the leave-one-out classification showed that the average accuracy in correct sex classification for most of the presented functions remained unchanged (Table 3). Functions 2, 5 and 6 showed a minimal and insignificant drop in classification rate of between 0.5 and 0.9% thereby confirming the validity of the derived functions from the pooled data. The pooled within-group covariance matrices by sex are presented in Supplementary Table 3.
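For readers unfamiliar with how such functions are applied, the sketch below shows the general form: a discriminant score is the constant plus the sum of each measurement multiplied by its unstandardised coefficient, and the score is compared with the sectioning point. The coefficients, constant and sectioning point used here are hypothetical placeholders and are not the values in Table 3. The sketch also assumes that males have the higher group centroid, so that scores above the sectioning point are classified as male.

```python
def discriminant_score(measurements, coefficients, constant):
    """Constant plus the coefficient-weighted sum of the measurements."""
    weighted = sum(coefficients[name] * measurements[name]
                   for name in coefficients)
    return constant + weighted


def estimate_sex(score, sectioning_point):
    """Assumes the male centroid lies above the sectioning point."""
    return "male" if score > sectioning_point else "female"


# Hypothetical function using two of the five measurements (mm).
coefficients = {"circ": 0.25, "mldiam": 0.5}
constant = -26.0
sectioning_point = -0.125

case_1 = {"circ": 64.0, "mldiam": 22.0}
case_2 = {"circ": 56.0, "mldiam": 18.0}

score_1 = discriminant_score(case_1, coefficients, constant)
score_2 = discriminant_score(case_2, coefficients, constant)

print(score_1, estimate_sex(score_1, sectioning_point))
print(score_2, estimate_sex(score_2, sectioning_point))
```

With these placeholder values the first case scores 1.0 and is assigned male, while the second scores -3.0 and is assigned female. To apply a published function, the coefficients, constant and sectioning point from the corresponding table row would be substituted.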

Table 3 Unstandardised coefficients, constants, and accuracies for multivariate discriminant function analysis for pooled data

Table 4 shows the average accuracies following cross-validation of Functions 1 to 9 (Table 3) on samples from each of the population groups. In the SAAD group, a decrease in average accuracies following cross-validation was observed, and this ranged between 0.8% (Function 3) and 5.3% (Function 4). The other two groups generally showed an increase in the classification rate. Most of the functions in the MASA group showed an increase in average accuracy, which ranged between 0.7% (Function 8) and 2.8% (Function 4). However, a drop in average accuracy of 0.5% was observed for Function 7 in this population group. All the functions in the SAED population group showed an increase in the average classification rate, which ranged between 0.5% (Function 4) and 6.9% (Function 7). The averages of the observed changes between the original classification rate and the cross-validation rate were 3.5%, 1.3% and 2.7% for the SAAD, MASA and SAED groups respectively.

Table 4 Average accuracies and cross-validation accuracies using pooled functions on samples of South Africans of African Descent (SAAD), Mixed Ancestry South Africans (MASA) and South Africans of European Descent (SAED)

The average accuracies and cross-validation accuracies are presented for the top two functions for the humeri (Functions 1, 2, 7, 8, 13 and 14), radii (Functions 3, 4, 9, 10, 15 and 16) and ulnae (Functions 5, 6, 11, 12, 17 and 18) for each population group (Table 5). The cross-validation of SAAD population-specific functions on the same sample showed a slight drop in average accuracies, which ranged between 0.8% (Function 6) and 1.7% (Functions 4 and 5). However, when SAAD population-specific equations were applied to the MASA and SAED population groups, the drops in average accuracies ranged between 1.3 and 15.4% and between 2.3 and 13.8% respectively. The average drops in accuracy for SAAD population-specific functions were 1.0%, 8.2% and 9.7% for the SAAD, MASA and SAED groups respectively. The average accuracies for most of the MASA population-specific functions remained unchanged after cross-validation (Table 5: Functions 8, 10, 11 and 12). However, the other two functions showed a drop in average accuracies of 0.9% (Table 5). The application of MASA population-specific functions to the SAAD and SAED population groups showed drops in average accuracies of 0.4–10.9% and 1.4–10.4% respectively (Table 5). The average accuracies remained unchanged for two of the SAED population-specific functions, while the others showed a drop in average accuracies that ranged between 0.9 and 3% (Table 5). Testing these functions on a sample of the SAAD population group showed a drop in average accuracies, which ranged between 6.7 and 8.7% (Table 5). A larger drop in average accuracies, which ranged between 8.7 and 22.4%, was obtained when SAED population-specific functions were tested on a sample of the MASA population group.

Table 5 Average accuracies and cross-validation accuracies using population-specific discriminant functions on samples of SAAD, MASA and SAED

Discussion

Estimation of sex remains one of the most vital aspects of the work of forensic anthropologists. Consequently, population-specific equations for the estimation of sex from bone measurements have been published for various bones of the human skeleton [10, 17, 18, 25, 37, 49,50,51]. These population-specific equations display higher average accuracies in correct sex estimation when applied to samples from the population from which they were derived. It has therefore been advised that these population-specific equations should not be applied to other population groups, as the degree of sexual dimorphism varies greatly between populations [14, 52]. Nevertheless, the drawback of population-specific equations is that their application requires prior knowledge of the population group of any skeletal material [40].

In the current study, measurements of the humeri, radii and ulnae were shown to be sexually dimorphic, which is consistent with the results of other studies from different parts of the world [19, 49, 53,54,55,56]. The range of average accuracies obtained for pooled discriminant function equations (DFEs) is comparable to those reported in previous studies in South Africa [37] and in other parts of the world [17]. The average accuracies for the humeri (81–82%), radii (84–86%) and ulnae (83–84%) (Table 3) are consistently lower than those obtained for population-specific DFEs for humeri (81–89%), radii (83–89%) and ulnae (82–91%) (Table 5). Our results also showed increased average accuracies for MASA (0.1–2.8%) and SAED (0.1–6.9%) and decreased average accuracies for SAAD (0.8–5.3%) when pooled DFEs were cross-validated on samples from the respective population groups (Table 4). These observed increases and decreases in the cross-validated average accuracies for samples of different population groups are larger than those obtained for the pooled group (0–0.9%) (Table 3). In addition, a drop in average accuracies was observed when population-specific DFEs for SAAD (1.3 to 15.4%) were cross-validated on samples of MASA and SAED. Similar results were also observed when population-specific DFEs for MASA (0.4–10.9%) and SAED (0.0–22.4%) were applied to the other two groups (Table 5).

The results of the cross-validation of average accuracies of both pooled DFEs and population-specific DFEs indicate that population-specific DFEs provide a higher classification rate compared to the pooled DFEs. This is in support of findings from previous studies confirming population specificity of DFEs [3, 7,8,9,10,11, 14, 17, 18, 32, 57]. However, it should be noted that the average accuracies presented for DFEs from the pooled data (Table 3) are reasonably high and are useful in the estimation of sex. The advantage of application of these functions in forensic cases is that they can be used without any prior knowledge of the population group.

South Africa, with a population of about 58 million people, consists of four major population groups that are spread over nine provinces. These distinct socially identifiable population groups are South Africans of African descent (blacks), South Africans of European descent (whites), Mixed-ancestry South Africans or Coloureds (MASA) and South Africans of Indian extraction (Indians) [9, 35, 37, 58]. In a country with such a diverse population, identification of human remains poses a huge challenge with regard to the application of population-specific equations, whether discriminant function equations for estimation of sex or regression equations for estimation of stature. While it is generally believed that population-specific DFEs and regression equations are associated with increased accuracy of estimation of sex and stature, Albanese et al. [59] argued that the assignment of an unknown individual to a population group is not only problematic but also sometimes impossible.

The possible reasons for this drawback include the lack of biological significance of some traits used in the assignment of population affinity and the difficulty of assigning an individual to a particular group if the individual falls on the boundary between population groups [59]. Subsequently, Albanese et al. [59] presented universal regression equations for estimation of stature from the femur and argued that this approach has both methodological and theoretical benefits compared to the use of population-specific regression equations. In an earlier study, Steyn and Patriquin [40] proposed the same notion of the applicability of universal DFEs for estimation of sex from the pelvic bone.

Steyn and Patriquin [40] assessed the population specificity of DFEs derived from measurements of pelvic bones of South African whites, South African blacks and Greeks. Their study reported that the average accuracies in correct sex classification using all measured variables of the pelvic bone for the combined group, Greeks, SA whites and SA blacks were 94.5%, 94.8%, 94.5% and 94.5% respectively [40]. Similar results were obtained in the direct analysis using pubic and ischial length (89–90%) and acetabular diameter (81.6–84.1%), and the authors concluded that population-specific DFEs did not provide a higher classification rate than that obtained from pooled data [40]. In addition, the study suggested that it was not necessary to use population-specific DFEs for the estimation of sex using pelvic bones [40].

Macaluso Jr. [43] tested the reliability of the pooled data DFEs presented by Steyn and Patriquin [40] on a sample of French pelvic bones. Macaluso Jr. [43] observed that the average accuracies of the pooled data remained unchanged when applied to a French sample and concluded that population-specific equations are not important for the estimation of sex using measurements of the pelvic bone. One of the reasons given for the lack of population specificity of DFEs from pelvic bone measurements is that the pelvis is a highly sexually dimorphic bone that is adapted for parturition, which is common to all population groups [40]. However, this may not be true for bones of the skeleton other than the pelvic bones [40].

Recently, Hora and Sládek [16] evaluated the concept of population specificity of DFEs derived from measurements of the 12th thoracic vertebra (T12) and the first lumbar vertebra (L1). The study showed that while two measurements of T12, i.e. anteroposterior body diameter and mediolateral body diameter, were found to be universally applicable in sex estimation, most of the measured variables of the thoracic and lumbar vertebrae showed population specificity in the assignment of sex [16]. The results of the current study are in agreement with the findings of Hora and Sládek [16] within the context of the diversity of population groups within South Africa. The universal application of the pooled equations presented in this study needs to be tested in different geographical locations of the world. It should be noted that there may be a need to assess the applicability of population-specific DFEs derived from measurements of long bones of the upper and lower extremities, due to the differences observed in the robustness of these bones in different population groups. However, this does not preclude an attempt to derive such equations, which can be very useful, especially during this era of increased international migration. It will become increasingly difficult to determine which equations to use during estimation of sex for migrants in different parts of the world.

In conclusion, the current findings indicate that discriminant function equations generated from measurements of the humerus, radius and ulna of pooled population data of South Africans yield reasonably high average accuracies. Consequently, they are useful in the estimation of sex in cases where the population affinity is either difficult or impossible to ascertain. Their applicability to populations of Southern Africa will require validation studies in individual populations from different countries in the region.