Introduction

The most reliable methods for estimating age in children are those based on dental maturation [1, 2]. For skeletal age, metric analysis is usually preferable, and the length of the long bones provides the best precision [3, 4]. However, infant skeletons degrade easily through taphonomic processes [5], so the ends of the long-bone diaphyses are often altered. In these circumstances, more resistant bones are preferable, among them the petrous part of the temporal bone [6] and the pars basilaris of the occipital bone [7, 8].

For this reason, the pars basilaris has been widely studied with the objective of developing methods for estimating age [9], in addition to metric models [7, 8, 10], assessing the obliteration of sutures [11, 12], or evaluating morphological changes for the estimation of postnatal survival [7, 13]. However, probably due to the difficulties in investigating infant osteological collections, many of these studies show some limitations, such as the use of unidentified archaeological or unrepresentative samples, or ineffective statistical analysis.

The design of systems for age estimation has evolved considerably in recent years. Importantly, prerequisites for the use of age estimation methods have been defined, especially for forensic settings, helping to define professional ethics in physical and forensic anthropology [14–28]. Among other considerations, methods that do not report the error of the estimate, preferably as confidence intervals, should never be used. Moreover, special attention should be given to quantifying the repeatability and reproducibility of the method, and care should be taken not to interpret statistical significance as an indicator of the quality of the method. Priority should be given to methods developed from representative, contemporary populations related to the study subjects, and the method used should have been previously validated.

Advances in computer technology and continuous scientific discussions in this context have also allowed for performing more complex analyses more easily and solving conceptual errors in the analysis methods used in the past. In this case, some of the most relevant examples are the use of classical versus inverse calibration [29], the Bayesian calibration [30, 31], transition analysis [32, 33], and image analysis using geometric morphometrics [34, 35].

These last examples clearly represent an improvement in methods and provide better estimates of age. But, in our view, they also make these methods considerably harder to use, since they require highly specialized computer equipment or extensive statistical knowledge that many anthropologists who routinely estimate age do not possess. For this reason, we believe it is important not to create a rift between the academic and the strictly professional context, and to achieve a balance that allows the discipline to keep growing and improving its methods without compromising their applicability.

The objectives of this study were to validate the methods proposed by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8] for the estimation of age from metric analysis of the pars basilaris and to propose new regression formulae through classical calibration, which can be used as a relatively simple method to estimate age.

Material and methods

The study sample was selected from the Granada osteological collection of identified infants and young children [36]. This collection currently comprises 230 individuals aged between 5 months of gestation and 8 years. Its main strengths are the very good condition of the specimens, since all individuals were buried in isolated niches; its relatively recent date (mid to late twentieth century); and the large amount of antemortem information available, because in most cases there are death certificates, birth certificates, burial certificates, and, where an autopsy was performed, a coroner's report.

This information allowed us to apply the following exclusion criteria when selecting the study sample: individuals whose cause of death was premature birth, because their chronological age does not correspond to their skeletal age; individuals whose cause of death may have altered their skeletal development, such as anencephaly, hydrocephaly, or Werdnig-Hoffmann disease; individuals with obvious alterations in skeletal development (unknown pathology); individuals whose pars basilaris was fully or partially altered by taphonomic factors; and individuals of unknown chronological age, either because official documents were absent or because they were inconsistent with one another. Once these exclusion criteria were applied, a sample of 114 individuals was obtained. The distribution by age and sex is shown in Table 1.

Table 1 Age at death (years) and sex distribution of sample

Data were collected with a digital caliper. Beforehand, several training sessions were held to practice on the pars basilaris of several randomly selected individuals.

In addition to the measures used by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8], four additional measures were adopted in this study, referring to the spheno-occipital synchondrosis and intra-occipitalis anterior suture (right), with the aim of estimating the age in cases where the bone is partially degraded. A total of seven different measures were taken from the pars basilaris of each individual. These are described below (Fig. 1):

Fig. 1 The locations of the measurements on the pars basilaris

  (a) Maximum length: the maximum distance between the posterior edge of the lateral condyle and the spheno-occipital synchondrosis [9]

  (b) Sagittal length: the maximum distance between the foramen magnum and the spheno-occipital synchondrosis [9]

  (c) Maximum width: the greatest distance measured in the line of the lateral angles [9]

  (d) Maximum width of the spheno-occipital synchondrosis

  (e) Height at midshaft of the spheno-occipital synchondrosis

  (f) Maximum length of the intra-occipitalis anterior suture

  (g) Maximum width of the intra-occipitalis anterior suture

To calculate the intra- and interobserver error, 30 randomly selected individuals were re-measured by the main observer and by a second observer, and the concordance correlation coefficient (CCC) was calculated [23].
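As an illustration of the agreement statistic, the following is a minimal sketch of the concordance correlation coefficient, assuming the standard Lin formulation with population (1/n) variances. The function name and the measurement values are hypothetical and are not taken from the study data.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumes the usual definition:
    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    with variances and covariance divided by n.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical repeated measurements of one variable (mm), two sessions.
first_session = [11.2, 12.5, 13.1, 15.0, 16.4]
second_session = [11.4, 12.3, 13.3, 15.1, 16.2]
ccc = concordance_correlation(first_session, second_session)
```

Unlike the Pearson correlation, this coefficient penalizes a systematic shift between observers: two series that differ by a constant offset are perfectly correlated but have a CCC below 1.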

To test the effectiveness of the methods proposed by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8], age was estimated for those individuals whose pars basilaris measures were within the range covered by the original methods. These estimates were compared with the chronological age from official records using the non-parametric Wilcoxon test for related samples, since the data were not normally distributed (p < 0.05, Kolmogorov-Smirnov test). In addition, the mean difference, standard deviation, and number of overestimated and underestimated cases were calculated.
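The descriptive part of this validation can be sketched as follows. This is an illustrative outline only: the function name, the sign convention (real age minus estimated age, so a negative difference is an overestimate), and the ages in days are assumptions, and the Wilcoxon test itself is left to a statistics package.

```python
from statistics import fmean, stdev


def validation_summary(real_ages, estimated_ages):
    """Summarize the differences between real and estimated ages.

    Differences are computed as real - estimated (assumed convention),
    so negative values are overestimates and positive values underestimates.
    """
    if len(real_ages) != len(estimated_ages):
        raise ValueError("real and estimated ages must be paired")
    diffs = [real - est for real, est in zip(real_ages, estimated_ages)]
    return {
        "mean_difference": fmean(diffs),
        "sd_difference": stdev(diffs),  # sample standard deviation (n - 1)
        "overestimated": sum(d < 0 for d in diffs),
        "underestimated": sum(d > 0 for d in diffs),
    }


# Hypothetical ages in days for four individuals.
real = [100, 200, 300, 400]
estimated = [90, 210, 280, 400]
summary = validation_summary(real, estimated)
```

Cases estimated exactly are counted in neither group, so the two counts need not add up to the sample size.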

To propose a new system for age estimation, new regression formulae were calculated using classical calibration, since inverse calibration caused a systematic error in the estimates [29]. For this, a least-squares regression was performed using “age” as the independent variable and each of the measures of the pars basilaris as the dependent variable. The logarithmic model was the one that best explained the results (Fig. 2):

Fig. 2 Logarithmic relationship between the maximum length of the pars basilaris and gestational age

$$ \mathrm{Measure} = a \times \ln\left(\mathrm{age}\right) + b $$

Once this analysis was carried out, the above equation was easily transformed to offer estimation models:

$$ \mathrm{Age} = \mathrm{e}^{\frac{\mathrm{Measure} - b}{a}} $$

Since the logarithmic model does not accept negative values for the variable “age”, as would occur in fetal individuals, the gestational age of individuals in days was used as the independent variable. To calculate this value, 280 days (40 weeks) were added to the chronological age. For this reason, when the formulae are used, the result should be converted to the desired unit. For example, to obtain the estimated age in years, 280 should be subtracted from the value obtained and the resulting value divided by 365.
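The calibration and unit conversion described above can be sketched as follows. This is a minimal illustration rather than the analysis actually run: the function names are hypothetical, and the coefficients a = 5 and b = -15 are invented to generate example data and are not values from Table 4.

```python
import math


def fit_log_model(ages_days, measures_mm):
    """Classical calibration: least-squares fit of measure = a * ln(age) + b.

    Age (gestational, in days) is the independent variable and the
    measure is the dependent variable.
    """
    x = [math.log(age) for age in ages_days]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(measures_mm) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, measures_mm))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def estimate_gestational_age_days(measure_mm, a, b):
    """Invert the fitted model: age = exp((measure - b) / a)."""
    return math.exp((measure_mm - b) / a)


def gestational_days_to_years(ga_days):
    """Convert gestational age in days to postnatal age in years."""
    return (ga_days - 280) / 365


# Hypothetical data generated exactly from a = 5, b = -15 (not real values).
ages = [150, 280, 645, 1375]
measures = [5 * math.log(t) - 15 for t in ages]
a, b = fit_log_model(ages, measures)

ga_days = estimate_gestational_age_days(5 * math.log(645) - 15, a, b)
age_years = gestational_days_to_years(ga_days)  # 645 - 280 = 365 days
```

A negative result from the conversion indicates an individual estimated to be younger than 40 weeks of gestation.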

Numerous previous studies have confirmed the existence of different growth rates in boys and girls [2, 37]. For this reason, to calculate the formulae, the results have been both separated by sex and combined.

Because the error increases progressively with age, it is not possible to provide a single error value for each function. We therefore believe that regression functions should also be used to calculate the error according to the value of the estimated age. These functions were obtained as follows: once the regression formulae for estimating age had been calculated, they were applied to all individuals in the sample to obtain the error assumed for each individual (real age − estimated age). Subsequently, a linear model was fitted by least-squares regression using the real age and the absolute value of this error as variables (Fig. 3). This model provides a 50 % confidence interval for the estimate. Taking the upper limit of the 95 % confidence interval of this model (age and absolute error) gives an error value that provides a confidence of 97.5 % in the estimate.

Fig. 3 Regression function to calculate the assumed error for age estimation with the maximum length
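The error functions can be illustrated with the sketch below. It covers only the central regression line (the 50 % interval); the upper 95 % limit used for the 97.5 % interval is not reproduced. Treating real age as the predictor, the function names, and the ages in days are assumptions made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_error_model(real_ages, estimated_ages):
    """Regress the absolute error |real - estimated| on real age."""
    abs_errors = [abs(r - e) for r, e in zip(real_ages, estimated_ages)]
    return fit_line(real_ages, abs_errors)


def age_interval(estimated_age, slope, intercept):
    """Estimated age plus or minus the error predicted for that age."""
    error = slope * estimated_age + intercept
    return estimated_age - error, estimated_age + error


# Hypothetical gestational ages in days (not study data).
real = [100, 200, 300, 400]
estimated = [95, 210, 285, 420]
slope, intercept = fit_error_model(real, estimated)
lower, upper = age_interval(300, slope, intercept)
```

Because the predicted error grows with age, the interval widens for older individuals, which is the behaviour the single-value error could not capture.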

Results

The results for intra- and interobserver error are shown in Table 2. The results for the intraobserver error show substantial agreement for all measures. In the case of interobserver error, only the measures proposed by the original methods and the maximum length of intra-occipitalis anterior suture show substantial agreement.

Table 2 Concordance correlation coefficients for the intra- and interobserver error of the measures used

The results obtained for the validation of the methods proposed by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8] are shown in Table 3. Significant differences were found for both methods, except for the maximum width when the method proposed by Fazekas and Kósa [7] was applied.

Table 3 Validation of previous studies

Regression formulae for estimating the age and for calculating the corresponding error are shown in Table 4.

Table 4 Regression formulas for estimating the gestational age (GA) of individuals aged between 5 months of gestation and 6 years from the pars basilaris measures

Discussion

The results indicate that the methods for estimating age from metric study of the pars basilaris proposed by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8] are not suitable for application in the context of physical or forensic anthropology. In addition to the significant differences between estimated and actual age, these methods have two disadvantages: they rely on inappropriate statistical analysis, and they provide no information about the assumed error. These elements have justified the proposal of new regression formulae as a novel method for estimating age.

The results obtained for intra- and interobserver error show that the degree of agreement is “substantial” for all measures, except for the interobserver error of those relating to the spheno-occipital synchondrosis and the width of the intra-occipitalis suture, which showed a moderate degree of agreement [38]. It is therefore preferable to use the measures proposed by the original methods [7, 8] when possible, and in cases where the pars basilaris is altered, the length of the suture should be used.

Regarding the validation of the methods tested, the method proposed by Fazekas and Kósa [7] underestimated the age of most individuals tested (Table 3). The differences between the actual and estimated age were statistically significant (p < 0.001) when the sagittal length of the pars basilaris was analyzed, with an average underestimation of 47 days, which can be considered high for prenatal individuals. However, the results obtained by analyzing the width are more positive, because, although a tendency to overestimate was also observed, this was much lower (10.3 days) and the differences were not significant. The method proposed by Scheuer and McLaughlin [8] for the estimation of postnatal age, unlike the previous method, showed a clear tendency to overestimate, with significant differences for the three measures used (p < 0.001). Regardless of the values obtained after validation, most age ranges offered by this method were calculated from single individuals, so that study should be interpreted as a descriptive analysis of the Spitalfields collection [10], and not as a method for age estimation. In both cases, the results show that these methods are not suitable for age estimation in the sample used. It would be risky to attribute this error to a specific cause, since that was not the aim of this study. However, we can hypothesize differences in the distribution of age and sex in the sample, small samples in the original methods, interpopulation variability, differences in statistical methods, or a combination thereof.

The new regression functions showed relatively high R² values, indicating that, in general, this can be considered a good method for estimating age in infant skeletal remains, although it would be desirable to improve the characteristics of the sample in future studies. When data from both sexes were combined, the best results were obtained with the maximum length of the pars basilaris (R² = 0.85) and the maximum length of the intra-occipitalis suture (R² = 0.86). When the data were analyzed separately by sex, higher R² values were obtained in girls than in boys in all cases. This result should be interpreted with caution, because it could reflect greater variability in males and therefore a higher degree of error when estimates are made for this sex. However, it could also be due to differences in the number of individuals analyzed, since females were less represented in the study sample. Sex estimation in juveniles is extremely difficult, so for individuals of unknown sex the recommended option is to apply the combined formulae.

Comparison with methods to estimate age from the length of the long bones [3, 39] or from dental mineralization [2] shows that the metric study of the pars basilaris is not the most accurate system for this age group, as expected, but it will be one of the best options in incomplete or altered skeletons.

Other authors have proposed using spline systems instead of a logarithmic model [3]. In that approach, the curve is divided into more rectilinear segments, with the aim of simplifying the implementation of the method. In this study, we decided to use the logarithmic model for a number of reasons. First, the logarithmic model has a better fit (R² value closer to 1) than the spline system, so the error in the estimate will be lower. Moreover, according to growth patterns in humans, growth is logarithmic until 8 years of age [37]; so, if the age range of the sample is greater, two linear functions may be insufficient, and possibly three or four functions should be used. In this case, if we consider that seven different measures were employed and, ideally, independent formulae by sex should be used for each of them, we would have to propose a method with 84 formulae, including calculations for the respective errors, which would greatly complicate the method. Additionally, the choice of cutoff point in the spline system is arbitrary and has no biological basis. Finally, if the system is divided into several lines, the choice of which one to apply should not be based on age groups, since age is unknown when the method is applied; instead, specific values of the measures should be used as cutoff points. However, this would also create other problems when different functions are applied to individuals of the same age.

Although this study shows positive results compared with previous studies, especially for individuals in the postnatal period, the regression functions proposed in this paper have some limitations, which must be taken into account when they are used as a method for estimating age:

  • These functions should not be applied if the result of the estimate is more than 6 years of age or less than 5 months of gestation.

  • The sample used in this study is one of the best currently available for this age group; however, this is not a fully adequate sample. It remains small and inhomogeneous in terms of distribution by age and sex. For this reason, although confidence intervals at 50 and 97.5 % are offered, the results must be interpreted with caution.

  • These formulae have not been validated as a method for estimating age. Future studies will be needed to actually check their effectiveness.

Some studies have shown that frequentist systems or regression models, as presented here, involve certain methodological drawbacks and, indeed, there are more effective methods of analysis [40]. In our opinion, the existence of better methods of analysis does not mean that regression models are wrong, because they can also have advantages depending on the context in which they are used.

A variety of methodological alternatives are available for designing and validating methods for estimating age. Many of them have been found to be more effective than the classical calibration method used in this study. To mention some examples, geometric morphometrics can analyze a large number of variables simultaneously, in both form and size [34]; transition analysis methods combine various age-related processes to provide estimates that include confidence intervals [32, 33]; Bayesian calibration allows for the use of additional prior information to develop the estimation, reduces errors, and generates probability curves, which are very useful in the forensic context [30, 31]. However, these approaches often require considerable expertise in statistics or specific software that the anthropologist in many cases does not possess. As a result, many anthropologists are still using older models and schemes that are already known to be ineffective but allow them to carry out estimates quickly and easily.

Obviously, to fix this problem, we do not suggest ceasing to improve the analysis process to develop identification methods. The main solution should be to encourage interdisciplinary work and involve expert statisticians. Regarding scientific research, the methods employed in forensic anthropology have improved greatly in recent years, always seeking to increase the precision and accuracy of results, and this line of work must continue. However, it is important to keep in mind that such studies should also try to make relatively simple methods for routine use in forensic practice; ultimately, if we can facilitate the work, it will be done more efficiently. Alternatively, it would be good to guide future research to interdisciplinary collaborations with the goal of designing specific software that combines more sophisticated statistical methods with faster and more intuitive applicability. Until then, we believe that studies such as the one presented here are at a midpoint between the two contexts.

We are aware that after this study, it is essential to validate our results in an independent sample with similar characteristics to determine the effectiveness of the proposed formulas, which will be our main target for near future studies. It would be good for colleagues and researchers who read this work to not limit their interest to the usefulness of the functions presented here, which is obviously very important too, but also to value the method used to obtain them and how to present the results, with the aim of reaching consensus on these issues.

Conclusion

Methods for age estimation from the metric study of the pars basilaris proposed by Fazekas and Kósa [7] and by Scheuer and McLaughlin [8] are not suitable for application in the context of physical or forensic anthropology. New formulae for estimating the age and confidence intervals at 50 and 97.5 % to express the assumed error are presented. Despite the implied limitations of the analysis method used, we believe that these formulae allow for simple and fast application in forensic identification.