Introduction

Estimating the age-at-death of adult skeletal remains is one of the most important, and most difficult, aspects of forensic anthropological analysis. The methods for estimating age in adult skeletal individuals are based on morphological and degenerative changes in bones and teeth throughout life. The rate and degree of change are determined by a complex set of interactions among genes, culture, and environment that contribute to each individual life history [13]. The key to the successful application of a particular method is an understanding of whether the method is accurate (correct), precise (refined), and repeatable from an intra- and interobserver standpoint when applied to unknown individuals outside of the original reference sample [e.g., 48]. However, the reference samples on which many of the original methods were based are among very few known age-at-death collections of sufficient sample size for testing purposes [9–11]. Documented reference samples are even rarer outside of the USA. Additionally, variation in the aging process, both between individuals and within a single skeleton, begins to increase during the third decade of life and continues to increase throughout life [12]. The error in age estimation can be quantified only when a method is tested on a contextualized osteological collection or on individuals of known chronological age. A contextualized collection includes known demographic data (sex, age, year of birth, and geographical area) as well as the socioeconomic and temporal context in which the individuals lived [13].

Two of the most common locations for the examination of the morphological changes related to the aging process are the pubic symphysis and auricular surface of the ilium. Todd [14] developed the first formal standards for determining skeletal age-at-death from pubic symphyseal morphology for white males in the Hamann-Todd collection. Todd later expanded the method to include white females and black males and females [15–17]. More recently, Katz and Suchey [18] refined the Todd phase method using a sample of modern autopsied remains from the Los Angeles County Coroner’s Office. They concluded that sex- and population-based differences have a considerable impact on the reliability of the method. However, for American samples, the resulting Suchey-Brooks method [19] is commonly considered to be the best age estimation method and is widely used in forensic anthropology and bioarchaeological contexts [10]. The Suchey-Brooks reference sample is large and includes a number of modern North American ethnic groups. However, despite its popularity, pubic symphyseal age assessment has not performed well in validation studies outside the USA, including those based on modern French autopsied individuals [20], Canadian pioneers [4], and modern Portuguese and Italian individuals from cemetery collections [10, 21]. These studies have demonstrated biased age estimates and difficulty in determining the age of individuals over 35 years. Furthermore, Sinha and Gupta [22] observed differences in the timing of age-progressive pubic changes in USA and Indian samples; Hoppa [23] observed similar differences between USA and English samples.

The original standards for estimating skeletal age-at-death from the auricular surface of the ilium were developed by Lovejoy et al. [24] using archaeological samples (Libben collection), American cadaver collections from the early twentieth century (Hamann-Todd collection), and forensic cases from the Cuyahoga County Coroner’s Office. In burial contexts, the auricular surface often preserves better than the pubic symphysis and the morphological changes continue well into the sixth decade of life. However, the Lovejoy method is more difficult to apply than the Suchey-Brooks method, and validation studies have shown that the auricular surface method suffers from repeatability problems [e.g., 25, 26]. Saunders et al. [4] used a small, documented population from Belleville, Ontario and reported overall agreement with Lovejoy et al. [24], but the reliability of the method decreased after age 45. On Portuguese and Italian individuals, Santos [21] and Hens and colleagues [10] found similar results. Using the Grant collection at the University of Toronto, Bedford and colleagues [27] found that the auricular surface method overestimated the ages of younger individuals and underestimated the ages of individuals over 50 by as much as 5–10 years. Results for a Thai sample were inaccurate and imprecise enough for Schmitt [2] to conclude that both the Suchey-Brooks and Lovejoy methods should be avoided on Asian samples.

Using 180 individuals of known age-at-death from the Spitalfields collection (London), Buckberry and Chamberlain [28] revised the Lovejoy method and proposed a new methodology. The revised method is based on the characteristics described by Lovejoy and colleagues but recognizes that the age-related features in the auricular surface change independently of one another. In this method, each auricular surface feature is analyzed and scored independently and then combined into a composite score related to a broad age range. This method is the most recent of the three, and although some authors have proposed modifications [26, 29], it has rarely been evaluated using documented osteological collections [30].

Information about the applicability of aging methods to samples from different populations and knowledge of population variation in aging processes are vital to successful adult age estimation. However, few studies have evaluated population differences in the accuracy of aging methods. With the exception of the Buckberry and Chamberlain method, which was developed in London, these pubic symphysis and auricular surface methodologies have been developed and tested on modern skeletal samples (samples from the later nineteenth century to the present) derived from North American populations [4, 25, 27, 31, 32]. As noted above, only a few studies are based on samples from outside the USA, including India [22], Thailand [2], Great Britain [23], France [20], Italy [10], and Portugal [21]. To supplement this literature, the current study evaluates three methods for adult age estimation using the pubic symphysis [19] and the iliac auricular surface [24, 28] on a modern documented Spanish sample. Specifically, our purpose is to analyze the accuracy and applicability of the methods to contemporary Spaniards and inform our understanding of skeletal aging processes in Spanish populations. These three methods (Suchey-Brooks, Lovejoy, and Buckberry and Chamberlain) were selected because of their popularity in forensic and bioarchaeological contexts [10] and because they have never been tested on a Spanish sample. The Lovejoy and Suchey-Brooks methods are among the most popular methods utilized by Spanish anthropologists. In Spanish anthropological manuals, both are highly recommended [33, 34], but they remain largely untested on Spanish populations. Likewise, the more recent Buckberry and Chamberlain method has rarely been evaluated in a documented collection; our goal was to test its performance in relation to the Lovejoy method in the Spanish context.

Material and methods

The skeletal sample

Data were collected from the modern documented skeletal collection housed in the Museo Anatómico de la Universidad de Valladolid (Valladolid, Spain), which comprises 217 individuals interred in the cemeteries of Palencia and Valladolid. This twentieth century collection includes 124 males and 93 females ranging from 20 to 101 years of age-at-death. Demographic information, including age-at-death, was derived from obituary records [35]. Like most modern reference collections, the Valladolid sample is comprised of primarily older individuals with approximately twice as many males as females [13]. Individuals displaying innominate pathologies were excluded from the study, while individuals with non-inflammatory osteoarthritis or diffuse idiopathic skeletal hyperostosis were included as these conditions are commonly related to age. A total of 80 individuals (55 males and 25 females) from 23 to 101 years old were selected for the analysis. As differences between right and left pubic symphyses [10] and auricular surfaces [26, 28] are negligible, the left side was scored in nearly every case, although the right side was used if the left was damaged, pathologic, or unavailable. Table 1 provides descriptive statistics for the individuals of the sample who were selected for analysis by sex. Figure 1 depicts the chronological distribution of females and males examined during the course of analysis. T tests show that the differences in mean ages-at-death for males (55.58 years) and females (63.84 years) bordered on statistical significance (t = 1.80, p = 0.08). The female subsample is slightly older and more evenly distributed than the male subsample.

Table 1 Age-at-death information by sex for the 80 individuals sampled from the Universidad de Valladolid collection
Fig. 1 Age distribution by sex of the 80 individuals sampled from the Universidad de Valladolid collection

While the sample is biased toward older adults, this accurately reflects the composition of contemporary documented samples in Spain [13] and is an opportunity to test the accuracy and reliability of the methods on a population subset that desperately requires additional study. During the laboratory component of the study, the innominates were isolated from the rest of the skeleton and the observations were completed without knowledge of chronological age, avoiding any subjective or objective information that could bias the observations.

Statistical methods

The success in the performance of an aging method can be defined as the proximity of an age estimate to an individual’s actual chronological age [36]. We analyzed the success in the performance of the Suchey-Brooks, Lovejoy, and Buckberry and Chamberlain aging methods in two ways: (1) by scoring the accuracy, that is, whether or not the chronological age of each individual was included in the age ranges provided for each method; and (2) by calculating bias and absolute error for each method. Both bias and absolute error are good indicators of a method’s inaccuracy [26]. Bias is the statistical measure that identifies the direction of the committed error in a method’s misclassification [2, 5, 10, 21, 25], that is, whether the age is over- or underestimated. If the estimated age is older than the chronological age then the bias is positive. If the estimated age is younger than the chronological age then the bias is negative. Bias was calculated as the average difference between estimated age and chronological age using each method (Σ(estimated age − chronological age)/n).

Absolute error is the statistical measure that evaluates the degree of the committed error in a method’s misclassification [2, 5, 10, 21, 25]. The absolute error was calculated as the average absolute difference between estimated age and chronological age using each method (Σ|estimated age − chronological age|/n). In essence, absolute error represents absolute difference; it does not take into account the sign (positive or negative) of the difference between estimated age and chronological age.

Age estimation methods do not produce specific point estimates of age, but rather estimated age intervals (e.g., 45–55 years). Thus, the extreme of the age range nearest to the chronological age was used to calculate the bias and the absolute error of the estimation. For example, if an individual with a chronological age of 65 years has been estimated at between 45 and 55 years of age, then the bias observed in this specific individual is −10 years (55 − 65 = −10) and the absolute error is 10 years (|55 − 65| = 10). Conversely, if the chronological age of the individual was 40 years and the estimated age was 45–55 years, then the bias would be +5 years (45 − 40 = +5) and the absolute error 5 years (|45 − 40| = 5).
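
The nearest-bound rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the three example cases are hypothetical, and individuals whose chronological age falls inside the estimated interval are assumed to contribute an error of zero to both bias and absolute error.

```python
def nearest_bound_error(chron_age, low, high):
    """Signed error measured from the interval bound nearest to the chronological age.

    Positive = overestimate, negative = underestimate, 0 = age inside [low, high].
    """
    if chron_age < low:
        return low - chron_age
    if chron_age > high:
        return high - chron_age
    return 0


def summarize(cases):
    """cases: list of (chronological age, interval low, interval high) tuples."""
    errors = [nearest_bound_error(age, low, high) for age, low, high in cases]
    n = len(errors)
    return {
        "accuracy": sum(1 for e in errors if e == 0) / n,   # share of ages inside the range
        "bias": sum(errors) / n,                            # mean signed error
        "absolute_error": sum(abs(e) for e in errors) / n,  # mean unsigned error
    }


# Hypothetical cases, all scored into a 45-55 year phase.
cases = [(40, 45, 55), (62, 45, 55), (50, 45, 55)]
summary = summarize(cases)
print(summary)
```

Because over- and underestimates partly cancel in the bias but not in the absolute error, the two measures are best read together.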

Differences in the number of correctly and incorrectly classified individuals (accuracy) between methods and sexes were evaluated with chi-square tests of independence. Differences in the value of bias and absolute error between methods were evaluated with ANOVA tests.
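
As a rough check on the chi-square tests of independence, the Pearson statistic can be computed directly from the accurate and inaccurate counts reported in the Results (20 of 73 for Lovejoy, 61 of 71 for Buckberry and Chamberlain, 35 of 49 for Suchey-Brooks). The sketch below is a plain-Python stand-in for the SPSS procedure, not the authors' code; it assumes no continuity correction, and the helper names are hypothetical.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value(stat, df):
    """Upper-tail p-value; this sketch only covers the closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")


# Rows: method; columns: (accurate, inaccurate).
lovejoy = [20, 73 - 20]
buckberry_chamberlain = [61, 71 - 61]
suchey_brooks = [35, 49 - 35]

stat_all, df_all = chi_square_independence([lovejoy, buckberry_chamberlain, suchey_brooks])
stat_two, df_two = chi_square_independence([buckberry_chamberlain, suchey_brooks])
print(round(stat_all, 1), df_all, p_value(stat_all, df_all))
print(round(stat_two, 1), df_two, p_value(stat_two, df_two))
```

Under these assumptions the three-method statistic comes out near 54.9 and the two-method statistic near 3.8, close to the reported values; small differences may reflect rounding or software settings.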

In order to evaluate the applicability of the three analyzed methods to the Spanish population, two types of analyses were conducted:

  1.

    The relationship between trait expressions (or phases) of a particular method and known chronological age was evaluated numerically by Spearman’s correlation coefficient. The Spearman’s correlation coefficient or Spearman’s rho is a non-parametric test of statistical dependence between two variables. It is used when one or both of the variables consist of ranks, like the phases of the adult aging methods. Spearman’s correlation coefficient assesses how well the relationship between two variables can be described using a monotonic function. If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.

  2.

    The extent to which chronological age is capable of predicting membership in the phases for all three aging methods was analyzed via an unrestricted cumulative probit (ordinal) regression analysis [37]. Commonly referred to as transition analysis in the age estimation literature [38], probit regression analyses yield intercepts and slopes for each phase of a respective aging method that can be converted to means and standard deviations with the maximum likelihood function provided by the probit regression. The resulting mean is the maximum likelihood estimate of the age at which an individual is most likely to transition from one phase to the next. Like the original transition analysis [38], the current analyses assume that the developmental trajectory for the phases of each aging method can be broken down into an invariant sequence of “n” distinct, non-overlapping stages. Furthermore, it is assumed that the morphological change is strictly unidirectional with respect to those phases of each method. The assumptions of transition analysis and related approaches fit well with the phase systems used to score age-related changes in adults. For this reason, approaches similar to transition analysis are being used in the anthropological literature to study senescent changes in bone [39]. For a more complete discussion of transition analysis, see Boldsen et al. [38] and Steadman et al. [40]. All statistics were calculated with SPSS 18.0.
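
For the first of the two analyses above, Spearman's rho can be obtained by ranking both variables (tied values share the average rank, which matters because many individuals fall in the same phase) and then taking the Pearson correlation of the ranks. The following is a minimal sketch with hypothetical function names and made-up ages and phases, not the SPSS routine used in the study.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical chronological ages and assigned phases (illustration only).
ages = [25, 34, 41, 52, 60, 71, 83]
phases = [2, 3, 3, 5, 4, 6, 6]
print(round(spearman_rho(ages, phases), 2))
```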

Results

For clarity, the results of the error analyses will be presented first, followed by the results of the test on the applicability of the methods to Spanish skeletal samples.

  1.

    Accuracy of the Suchey-Brooks, Lovejoy, and Buckberry and Chamberlain methods

      Accuracy. For the purposes of this analysis, accuracy is defined as whether or not the chronological age of each individual was included in the age ranges provided for each method. The initial comparison of the accuracy of the three aging methodologies shows that the Lovejoy aging method performed poorly (20 of a total of 73 individuals were accurately estimated) when compared to both the Buckberry and Chamberlain (61 of a total of 71 individuals) and Suchey-Brooks (35 of a total of 49 individuals) methods (Table 2). The null hypothesis for the independence of the two variables (i.e., accuracy and aging method) was rejected with a chi-square value of 54.8 (p < 0.001, df = 2). A second chi-square test comparing the performance of the Buckberry and Chamberlain method, with 86% accuracy (61 accurate of a total of 71 specimens), and the Suchey-Brooks method, with 71% accuracy (35 accurate of a total of 49 specimens), revealed no significant difference between their accuracies, with a test statistic of 3.8 (p > 0.05, df = 1). Therefore, taking into account these results, the Buckberry and Chamberlain and Suchey-Brooks methodologies have comparable accuracies for the current Spanish sample.

      Table 2 also shows the accuracy of each method when segregated by sex. The performance of the Buckberry and Chamberlain method varied significantly by sex with low female and moderately high male accuracy (χ² = 8.29, p = 0.004, df = 1), while the Lovejoy et al. [24] (χ² = 1.27, p = 0.26, df = 1) and Suchey-Brooks (χ² = 0.42, p = 0.84, df = 1) methods were comparable among males and females.

      Figure 2 shows the distribution of the accurate and inaccurate age estimates for each individual in relation to the chronological age of the individual and the phases attributed to the individual in each method. Phase 1 in the three methodologies and phases 2 and 3 in Buckberry and Chamberlain method are not shown in Fig. 2 because they have not been attributed to any individual.

      As shown in Fig. 2, inaccurate estimates for the Buckberry and Chamberlain method were restricted to phase 7 (53–92 years), while the majority of inaccurate Suchey-Brooks estimates occurred during phases 5 (28–83 years) and 6 (42–87 years). By contrast, accurate estimates were almost entirely restricted to the final two phases of the Lovejoy method, phase 7 (50–60 years) and phase 8 (60+ years). Figure 2 also depicts the age-progressive pattern of the phases in all three methods. As anticipated, the variance increases in all three methods in the higher phases.

      Bias and absolute error. Table 3 provides the descriptive statistics for bias associated with the three aging methods. The Buckberry and Chamberlain method performed comparatively well with regard to bias. Of the ten individuals with inaccurate age estimates, four individuals’ ages were underestimated and six were overestimated. By comparison, of the 73 individuals analyzed using the Lovejoy method, the ages of 43 were underestimated and the ages of ten were overestimated. These results demonstrate that the 5-year intervals currently employed by the Lovejoy method are too narrow and hence ineffective for the current Spanish sample. The Suchey-Brooks method overestimated the ages of three individuals and underestimated the ages of 11 individuals.

      Table 4 provides the descriptive statistics for the absolute error associated with the three aging methodologies. The absolute error was significantly different among the three methods (F = 18.88, df = 2,190, p < 0.001). Like the measure of bias, the absolute error for the inaccurate cases was greatest for the Lovejoy method.

  2.

    The applicability of the methods to Spanish skeletal samples

    Table 5 presents the non-parametric Spearman’s correlations for the phases and chronological ages in the Valladolid sample. The levels of correlation for the three aging methods are comparable, suggesting they all capture roughly the same information about the aging process in Spanish populations. With their broad confidence intervals the Buckberry and Chamberlain and Suchey-Brooks methods outperformed Lovejoy in terms of accuracy (see above); however, the levels of association with the aging process are indistinguishable. These findings indicate that broadening of the age intervals associated with the Lovejoy method would result in virtually identical measures of accuracy, bias, and absolute error as the Buckberry and Chamberlain and Suchey-Brooks aging methods.

    An important consideration is the extent to which chronological age is capable of predicting membership in the phases for all three aging methods. With continued calls for population-specific aging methods and legal challenges to the reliability and replicability of scientific methodologies, quantifying the performance of aging methods has never been more important. Using a probit-based model of ordinal regression, the covariate of known chronological age was regressed against age phase membership (transition analysis). The results of the transition analysis for the three methodologies are depicted in Fig. 3. In it, each line is the probability density of one specific age phase of one specific age method throughout the different ages of the individual life. It describes the relative likelihood for this phase to occur at a given age. For example, in the Buckberry and Chamberlain method, the maximum likelihood to have phase 6 is around 60 years of age and the maximum likelihood to have phase 7 is around 100 years (Fig. 3). In this way, we can know the age at which it is most probable to be classified in a specific phase of one specific method, thus indicating the age of transition between the different phases in a specific method. The ideal aging method would have the probability density of each phase well delimited and would exhibit minimal overlap between phases. However, this is not possible in adult age estimation methods; due to the great variability in the aging process, some (more or less great) overlap between phases is usually found. Therefore, the smaller the overlap between the phases of a specific method, the more statistically significant the fitted model is. Significance in the fitted model indicates the applicability of the method in terms of accuracy and precision. Therefore, the more significant the model, the more applicable the method is.

    The principal characteristic shown in Fig. 3 is the overlap between the different phases in each method. The data were consistent with the estimates of the fitted model for the three methods; however, due to the overlap of the phases, the strength of the ordinal regression model is low for all three methodologies (R² = 0.21 for Buckberry and Chamberlain; R² = 0.20 for Lovejoy; R² = 0.14 for Suchey-Brooks), the lowest being the one obtained for the Suchey-Brooks method. As depicted in Fig. 3, the estimated parameters and regression coefficients were significant only for some phases of each method. For the Buckberry and Chamberlain method, only phases 4, 5, and 6 were significant, indicating that this method is applicable to Spanish populations, but that further statistical modeling and research into the covariance of chronological age with morphological change would be necessary.

    In the Lovejoy method, only phases 4, 5, 6, and 7 were significant. Provided that the age intervals associated with the morphological changes were adjusted, the Lovejoy method is potentially applicable to Spanish populations. Additional research on Spanish reference samples is recommended prior to the method’s systematic application in forensic and archaeological contexts.

    In the Suchey-Brooks method, only phase 5 was significant. The Suchey-Brooks method is the weakest of the three methods applied to this sample, though the subsample size was substantially smaller (n = 49 vs. n = 73). Spanish reference samples with additional younger individuals of less than 50 years of age would be necessary to test all the three methods in an appropriate manner.
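
The phase curves described for Fig. 3 can be illustrated with a small cumulative probit sketch. Everything specific here is an assumption made for illustration: the parameterization probit(P(past a transition)) = intercept + slope × age, the hypothetical transition means of 30, 45, and 60 years, and a single shared standard deviation (a restricted simplification of the unrestricted model used in the study, chosen so that phase probabilities cannot become negative).

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def to_mean_sd(intercept, slope):
    """Convert a probit intercept and slope to a transition mean and SD.

    Assumes probit(P(past transition)) = intercept + slope * age.
    """
    return -intercept / slope, 1.0 / slope


def phase_probabilities(age, transition_means, sd):
    """P(phase = k | age) for k = 1 .. len(transition_means) + 1."""
    past = [1.0] + [normal_cdf((age - m) / sd) for m in transition_means] + [0.0]
    return [past[k] - past[k + 1] for k in range(len(past) - 1)]


# Hypothetical four-phase method (values are NOT estimates from the study).
means = [30.0, 45.0, 60.0]
sd = 12.0
for age in (25, 50, 75):
    probs = phase_probabilities(age, means, sd)
    print(age, [round(p, 2) for p in probs])
```

The wider the shared standard deviation relative to the spacing of the transition means, the more the phase curves overlap, which is the pattern the fitted models show for all three methods.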

Table 2 Accuracy obtained when the three methodologies of adult age estimation are applied taking into account the entire sample and the sexes separately
Fig. 2 Distribution of the accurate (asterisks) and inaccurate (circles) estimations of age for each individual in relation to the chronological age of the individual and the phases attributed to that individual for each method (Buckberry and Chamberlain, Lovejoy, and Suchey-Brooks). An age estimate is accurate when the chronological age falls inside the estimated age range and inaccurate when the chronological age falls outside the estimated age range

Table 3 Descriptive statistics for bias associated with the three aging methods
Table 4 Statistics for the absolute error associated with the three aging methods
Table 5 Spearman’s correlation among the aging phases for all three methods and chronological age of each individual
Fig. 3 Probability density of each specific age phase in each specific age method (Buckberry and Chamberlain, Lovejoy, and Suchey-Brooks) throughout the different ages of the individual life

Discussion

This study has evaluated the accuracy and bias of the three methods for adult age estimation based on the pubic symphysis (Suchey-Brooks method) and auricular surface (Lovejoy and Buckberry and Chamberlain methods) from a Spanish skeletal collection. These methods were selected because they are the most popular among Spanish anthropologists [33] and because they have never been tested on a Spanish sample. Unfortunately, in spite of the methods’ popularity in Spain, the present study has shown that the application of the three methods to a Spanish sample may be problematic. According to the results of this study, the methods differ significantly in their performances: the Lovejoy method estimates age poorly (27% accuracy), while both the Suchey-Brooks and Buckberry and Chamberlain methods estimate age with higher accuracy (71% and 86%, respectively). The accuracy of the latter two methods differs by only 15 percentage points. However, while the Suchey-Brooks and Buckberry and Chamberlain methods outperform the Lovejoy method with regard to accuracy, it is important to consider the width of the error intervals associated with the phases for these methods. The Lovejoy method was developed prior to the recommendation of statistically sound 95% confidence intervals. Based on its performance here and in other studies, it is clear that the five-year phase intervals used by the Lovejoy method were overly optimistic about the quality of skeletal data and the consistent rate of age-related change. In contrast, the 95% confidence interval phases for the Buckberry and Chamberlain and Suchey-Brooks methods are very broad and reflect the general quality of information on the aging process contained in the human skeleton. For example, stages IV (29–81 years) and V (29–88 years) in the Buckberry and Chamberlain method have interval widths of 52 and 59 years, respectively, and so cover nearly the entire adult lifespan of humans.

The Spanish sample shows higher levels of accuracy than the Portuguese sample studied by Santos [21] when the Lovejoy and Suchey-Brooks methods were applied. However, the levels of bias and absolute error in the Portuguese sample are lower. The Spanish sample shows similar absolute error to the USA sample reported by Murray and Murray [25] for the Lovejoy technique. The Spanish sample demonstrates lower levels of bias and absolute error than the Thai sample used by Schmitt [2], the Canadian sample of Saunders et al. [4], and the Italian sample of Hens et al. [10] for the Lovejoy and Suchey-Brooks methods. It also shows lower bias and absolute error than the USA sample of Mulhern and Jones [30] for the Buckberry and Chamberlain method. On the other hand, correlation coefficients between age phases and chronological age for the present study are lower than those reported for the Spitalfields sample by Buckberry and Chamberlain [28]. They reported coefficients of Spearman’s correlation around 0.62, whereas the present study reports 0.37 and is similar to those obtained by the Lovejoy and Suchey-Brooks methods. All of these results indicate that the age/indicator relationship is quite variable among populations and support the observations of previous authors [5, 10, 21, 25, 26]. Furthermore, as these previous authors indicated [5, 10, 26] this variability increases with age.

One of the main problems of the adult aging methods is the estimation of age in the elderly. This is due to the great variability expressed by the age markers during the aging process, specifically in older ages. Age-related morphological changes in the skeleton occur as an individual undergoes growth, development, and maturation. The appearance of the age markers in an individual skeleton will vary depending on an individual’s life history. Influencing factors include health status, diet, living environment, cultural practices, and the presence of disease and trauma experienced during life [41, 42]. In sub-adult individuals, this change occurs more predictably but once skeletal development has ended, maturation of the skeleton occurs with less of an age-specific chronology [43–45]. There are no set rates for the maintenance of the adult skeleton [19, 46] and for this reason, the observed variability in the age markers increases and the accuracy of the aging methods decreases with age.

With the intention of reducing the effects of age marker variability on the Lovejoy and Buckberry and Chamberlain aging methods, Osborne et al. [29] and Falys et al. [26] reduced the number of phases and stages of these methods. Osborne collapsed Lovejoy’s eight phases into a six-phase system. Falys reduced Buckberry and Chamberlain’s seven stages to three. In this way, both authors achieved an increase in the accuracy of both methods, specifically in older ages. However, these two new proposals have very broad intervals and reflect the generally poor quality of information on the aging process contained in the human skeleton. For example, the age range of phase III proposed by Falys and colleagues is from 21 to 91, and the age ranges of phases 5 and 6 proposed by Osborne are 24–82 and 29–89, respectively. Thus, these new proposals, together with Suchey-Brooks and Buckberry and Chamberlain, are based on broad intervals with ranges that include most adult ages, therefore making it difficult for the chronological age not to be included in the estimated interval. These methods sacrifice precision for accuracy. However, both precision and accuracy are very important for individual identification, and forensic anthropologists should be committed to improving both. Establishment of the identity of an individual is of the utmost medico-legal significance, in both the living and the dead, especially in cases of murder or mass disasters, where the bodies are grossly mutilated or in advanced stages of decomposition. For identification, apart from sex (which excludes almost half of the population), age is one of the most important criteria for excluding large portions of the population [47].

Accuracy and reliability of older adult age intervals among the Spanish are particularly relevant in Spanish bioarchaeology and forensic anthropology, as such population variation data are particularly important to ongoing human rights investigations of mass graves from the Spanish Civil War era. Since 2000, archaeologists have worked to recover historic memory of the Spanish Civil War by exhuming the remains of victims of extrajudicial executions [cf. 48–50]. Physical anthropologists developing biological profiles of the victims for identification purposes have had to rely on the available skeletal aging standards, most of which were developed on USA reference samples. The magnitude of error involved in applying these methods to Spanish individuals who were likely born around the beginning of the twentieth century is unknown, and great errors have been observed when USA reference standards have been applied to Spanish samples. For example, the method for calculating stature based on USA reference samples fails in the estimation of living height in Catalonia. In Catalonia, the formulae proposed by Pearson [51] at the end of the nineteenth century based on a French sample perform better because of the biological population history of French and Catalan populations [13, 33].

It must be emphasized that precision in forensic anthropology is important for individual identification and broad intervals of estimated age are not very useful. Therefore, the results of this study suggest that future methods of skeletal age estimation should allow for precision and flexibility in both (1) applying different reference collections to different target populations and (2) estimating the age of an individual taking into account the variability observed in the feature. This flexibility can be found in methods based on Bayesian prediction. The success of this mathematical procedure, which generates accurate and less biased age estimates, has been demonstrated by several authors [3, 52–55].
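
To make the idea of Bayesian prediction concrete, the sketch below combines a phase likelihood with a prior age distribution to obtain a posterior distribution of age given an observed phase. It is a generic illustration, not the procedure of any cited study: the cumulative probit likelihood, the hypothetical transition means (30, 45, 60 years), the shared standard deviation, and the uniform prior over ages 20 to 100 are all assumptions. In practice the prior would come from a reference collection suited to the target population.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phase_likelihood(phase, age, transition_means, sd):
    """P(observed phase | age) under a cumulative probit model (phase is 1-based)."""
    past = [1.0] + [normal_cdf((age - m) / sd) for m in transition_means] + [0.0]
    return past[phase - 1] - past[phase]


def posterior_over_age(phase, ages, prior, transition_means, sd):
    """Bayes' rule on a grid of ages: posterior is proportional to likelihood x prior."""
    weights = [phase_likelihood(phase, a, transition_means, sd) * p
               for a, p in zip(ages, prior)]
    total = sum(weights)
    return [w / total for w in weights]


def credible_interval(ages, posterior, mass=0.95):
    """Equal-tailed interval read off the cumulative posterior."""
    tail = (1.0 - mass) / 2.0
    cumulative, low, high = 0.0, None, None
    for a, p in zip(ages, posterior):
        cumulative += p
        if low is None and cumulative >= tail:
            low = a
        if high is None and cumulative >= 1.0 - tail:
            high = a
    return low, high


# Hypothetical setup (values are NOT estimates from the study).
ages = list(range(20, 101))
prior = [1.0 / len(ages)] * len(ages)   # uniform prior; swap in a population-specific one
means, sd = [30.0, 45.0, 60.0], 12.0

post = posterior_over_age(4, ages, prior, means, sd)
post_mean = sum(a * p for a, p in zip(ages, post))
print(round(post_mean, 1), credible_interval(ages, post))
```

Changing the prior changes the estimate without touching the scoring system, which is the kind of flexibility across target populations discussed above.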

Conclusions

This study has evaluated three methods for adult age estimation based on the pubic symphysis (Suchey-Brooks) and the auricular surface (Lovejoy and Buckberry and Chamberlain) in a Spanish sample. The results indicated that the Lovejoy method estimates age poorly (27% accuracy), with clear differences in accuracy from the Buckberry and Chamberlain (86%) and Suchey-Brooks (71%) methods. However, the accuracy of the Buckberry and Chamberlain and Suchey-Brooks methods is a product of the width of the estimated age intervals, which include most of adulthood, making it difficult for the chronological age to fall outside the estimated interval. This study suggests that future methods of skeletal age estimation should allow for precision and flexibility in applying different reference collections to different target populations and estimating age from the observed features in the age markers. This precision and flexibility are found in methods based on Bayesian prediction. Additional research on Spanish reference samples is recommended before the three methods evaluated in the present study are applied systematically in forensic and archaeological contexts.