Introduction

While sex determination in adult specimens is fairly reliable and accurate, adult age estimation is a difficult task. However, it is essential in physical anthropology and in the forensic and osteoarchaeological sciences. Most of the research formulating and evaluating skeletal aging techniques focused on the os coxae (i.e., auricular surface and pubic symphysis) has demonstrated low accuracy and high bias when the methodology is used on different populations [1–13]. These results have highlighted the need for more research to better understand the biology of age markers, design new predictive models, and find new age markers. In this line, and following the experience of two previous studies [14, 15], Rissech et al. [16] proposed a male-specific method of age estimation based on the age-related changes of the acetabulum using Bayesian inference. The method recognized seven morphological variables correlated with chronological age: (1) acetabular groove; (2) rim shape; (3) rim porosity; (4) apex activity; (5) activity on the outer edge of the acetabular fossa; (6) activity of the acetabular fossa; and (7) porosities of the acetabular fossa. Each of these variables was divided into different stages, describing the different morphological conditions of the acetabular region (e.g., acetabular groove can be scored as: no groove (0), groove (1), pronounced groove (2), and very pronounced groove (3)). The method’s accuracy was evaluated in a sample of 242 male individuals from the documented skeletal collection of Coimbra (Portugal) [16]. In 2007, the same authors tested the method on a sample of 394 male individuals from four documented Western European collections: the Coimbra and Lisbon collections from Portugal, the UAB collection from Spain, and the St. Bride collection from England [17]. 
Their results indicated a significant correlation between all the acetabular variables and age, illustrating the potential value of the acetabulum as an age marker for young, middle-aged, and older adults. Thus, the acetabulum appears to be a meaningful indicator of age, with great applicability throughout the human life span. Furthermore, the results demonstrated the applicability of the method to collections other than the one on which it was developed. However, as in other studies on the auricular surface and pubic symphysis [3, 6, 7, 10, 12, 13], Rissech et al. [17] found population variability in the age-related morphological changes of the acetabulum. In other words, the age estimations were less reliable when the reference population and the target population were more biologically distant.

Ideally, for the successful application of a specific age estimation method, it is necessary to study the method in depth, taking into consideration the biology of the age markers as well as possible sex-based and intra- and inter-population differences. Thus, several authors have conducted studies using the Rissech et al. [16] method on different populations. Two studies indicated that the Rissech method performs much better than the methods based on the pubic symphysis [18] and auricular surface [19–21] for both an Iberian sample [22] and a white North American sample [23]. However, other authors have suggested the existence of a low correlation with age [24–26] for variables related to the acetabular fossa. Specifically, these authors reported that variables 5, 6, and 7 of Rissech’s method were difficult to score. Furthermore, although they did not specifically analyze possible sex differences in the Rissech method, some authors have reported no sex differences in their age estimations based on the acetabulum [25, 27, 28]. San-Millán [29] also reported a common aging pattern in acetabular shape, linked particularly with the Rissech et al. [16] variables related to osteophytic proliferation (variables 2, 4, and 5). However, both Mays [26] and San-Millán [29] found differences in acetabular aging rate between males and females in the Rissech method’s acetabular variables. Specifically, females showed a slower rate of aging than males.

Hence, this paper has two main goals: (1) to revise and better define the variables of the Rissech et al. [16] method related to the acetabular fossa, namely, variables 5, 6, and 7, and (2) to extend the applicability of this new approach to both sexes while analyzing age-related sex differences in the acetabular aging process. This latter analysis is herein undertaken in each variable independently for the first time.

Material and methods

Material

The osteological samples come from the Collection of Identified Human Skeletons curated at the Bocage Museum of Lisbon (National Museum of Natural History, Lisbon University, Portugal). The skeletons consist largely of Portuguese individuals who lived in Lisbon from 1805 to 1975 [30]. This collection originates from modern cemetery sources and comprises 1692 skeletons; however, only 699 of them have basic documentary data [30]. Males and females with fused acetabula were chosen for this analysis. We excluded individuals who showed pathologies affecting the acetabulum, but we included those with non-inflammatory osteoarthritis or diffuse idiopathic skeletal hyperostosis (DISH) because these conditions are related to age [31, 32]. By these criteria, 611 left os coxae were analyzed from individuals between 15 and 98 years of age at death (Fig. 1) and born between the second half of the nineteenth century and 1975: 294 males (mean age = 55.57 years, SD = 19.575) and 317 females (mean age = 62.75 years, SD = 20.795). This wide age range was used to illustrate all the morphological changes that occur in the acetabular area during the human life span. The right side was evaluated when the left was damaged, pathological, or unavailable. All observations were made by the first author.

Fig. 1

Age distribution by sex of the 611 individuals sampled from the Lisbon Collection

Description of variables

Seven morphological acetabular variables, the same for males and females, were considered, keeping in mind the original Rissech et al. [16] methodology. Rissech et al. [16] describe seven traits on the lunate surface, the acetabular rim, the area adjacent to the acetabulum, the acetabular horns, the outer edge of the acetabular fossa, and the acetabular fossa itself. The current variables represent a modification of the original (see above). The variables tested in the current study are: (1) acetabular groove; (2) acetabular rim shape; (3) acetabular rim porosity; (4) apex activity; (5) activity on the outer edge of the acetabular fossa; (6) texture and bone density in the center of the acetabular fossa; and (7) activity in the acetabular fossa. The first four variables are identical to the original descriptions [16], although some nuances have been added (Table 1). However, the original Rissech et al. variables 5, 6, and 7, related to the activity on the outer edge of the acetabular fossa, the activity of the acetabular fossa, and the porosities of the acetabular fossa, respectively, were replaced and redefined. Thus, the new variables 5, 6, and 7 are described and analyzed here for the first time. The seven variables and their stages are described in detail in Table 1. The descriptions refer to the anatomical structures shown in Fig. 2. Photographs complementing the descriptions of the original and newly defined variables are shown in Figs. 3, 4, 5, 6, 7, 8, 9, and 10.

Table 1 Morphological description of the seven acetabular variables and their stages
Fig. 2

Acetabular terminology. 1 Acetabular fossa. This is the non-articular surface within the acetabulum. The three arrows highlight the three lobes of the fossa: anterior, superior, and posterior. 2 Lunate surface. This is the articular surface within the acetabulum, between the os coxae and the femoral head. 3 Acetabular rim surrounding the acetabulum. 4 Outer edge of the acetabular fossa. This is the limit between the lunate surface and the acetabular fossa. 5 Apex of the posterior horn of the lunate surface. 6 Acetabular notch. It lies between the two horns of the lunate surface

Fig. 3

Images describing variable 1. a Female, 21 years. There is no groove below the acetabular rim. b Male, 20 years. Arrows indicate a slight groove below the acetabular rim. c Male, 43 years. Arrows indicate a pronounced groove below the acetabular rim. d Male, 65 years. Arrows indicate a very pronounced groove below the acetabular rim

Fig. 4

Images describing variable 2. a Female, 18 years. Arrowheads indicate round and smooth areas of the acetabular rim. b Female, 21 years. The ischial part of the acetabular rim is rounded (arrowheads) and the ilial part is narrow (arrows). c Male, 16 years. The external part of the acetabular rim is rounded (arrowheads) and the internal part has an upright form (arrows). d Male, 21 years. The ilial part of the acetabular rim is rounded (arrowhead) and the ischial part is narrow (arrows). e Male, 37 years. Arrows indicate narrow parts of the acetabular rim. f Male, 45 years. The acetabular rim is rough to the touch due to the presence of little grooves. g Female, 32 years. h Female, 61 years. i Female, 56 years. j Female, 51 years. k Male, 83 years. Figures (g–k) show osteophytic growth on the acetabular rim of different sizes and extensions

Fig. 5

Images describing variable 3. a Male, 16 years. The acetabular rim is smooth without porosities and roughness. b Male, 18 years. Arrows indicate slightly increased microporosity on the area around the acetabulum. c Female, 24 years. Arrow indicates some microporosities on the acetabular rim, but it keeps a round and dense appearance. d Male, 67 years. The acetabular rim is not smooth to the touch. e Female, 85 years. Arrow indicates newly constructed bone with many micro- and macroporosities, suffering subsequent destruction. f Female, 82 years. Arrow indicates the area of the acetabular rim which has suffered bone destruction through micro- and macroporosities. g Female, 94 years. Arrow shows the place where macro- and microporosities of the destructured acetabular rim have partially invaded the lunate surface

Fig. 6

Images describing variable 4. a Female, 21 years. The apex is round and smooth. b Male, 49 years. Arrow indicates a small bone spicule on the apex. c Female, 56 years. Arrow indicates an osteophyte ≥1 mm on the apex. d Female, 76 years. Arrow indicates a bone proliferation ≥1 mm, which covers the entire apex. e Male, 82 years. Arrow indicates an osteophyte ≥3 mm on the apex. f Female, 83 years. An osteophyte ≥3 mm covers the entire horn of the lunate surface. g Male, 68 years. h Male, 65 years. In (g) and (h), an osteophyte ≥5 mm is present on the apex

Fig. 7

Images describing variable 5. a Male, 30 years. b Female, 32 years. In (a) and (b), there is no osteophytic formation growing from the outer edge of the acetabular fossa. c Female, 83 years. d Female, 69 years. In (c) and (d), arrow indicates a small osteophytic visor (<5 mm) formed on a portion of the outer edge of the acetabular fossa. e Male, 71 years. Arrows indicate a small osteophytic visor (<5 mm) formed along the outer edge of the acetabular fossa. f Female, 72 years. Arrow indicates much visible osteophytic bone growth (≥5 mm) parallel to the fossa in a portion of the outer edge. g Male, 78 years. Arrows indicate much visible osteophytic bone growth (≥5 mm) around the entire edge, parallel to the acetabular fossa, obliterating it partially. h Female, 78 years. Arrows indicate extreme visible osteophytic bone growth (≥5 mm) around the entire edge, parallel to the acetabular fossa, obliterating it almost completely. i Female, 81 years. Arrows indicate bone growth from the outer edge, parallel to the acetabular fossa, which joins with the bone proliferation in the acetabular fossa itself, indicated by an arrowhead

Fig. 8

Images describing variable 6. a Female, 20 years. The center of the acetabular fossa is dense, smooth, and mostly unaltered. b Female, 20 years. c Male, 16 years. In (b) and (c), the center of the fossa is dense and smooth; arrowheads indicate features resembling root etching. d Male, 45 years. e Female, 20 years. f Male, 49 years. In (d–f), the center of the fossa is smooth, but there are shallow porosities indicated by arrowheads. g Male, 46 years. The center of the fossa exhibits coarse granularity. h Male, 53 years. The center of the fossa exhibits relief as bone chords. i Female, 76 years. The center of the fossa exhibits relief as shallow microporosity very closely spaced. j Female, 46 years. The center of the acetabular fossa remains smooth, but the fossa has been densified. k Female, 78 years. The center of the acetabular fossa is no longer dense because it has suffered destruction. l Female, 83 years. The center of the acetabular fossa has suffered destruction, making it fragile. m Male, 69 years. The center of the acetabular fossa has suffered destruction, but also production of unstructured bone, indicated by arrowheads. n Female, 78 years. The bone proliferation from the outer edge of the acetabular fossa has obliterated it completely, making it impossible to observe the center of the fossa

Fig. 9

Images describing variable 7. a Female, 16 years. There is no aging porosity in the acetabular fossa. b Male, 15 years. c Male, 21 years. d Male, 20 years. In (b–d), different examples of “orange peel” porosity can be observed, indicated by arrowheads. e Male, 45 years. f Male, 30 years. g Male, 16 years. h Female, 39 years. i Male, 16 years. In (e–i), different examples of swollen porosity can be observed. In (a–i), arrows indicate normal porosity not related to the aging process. j Male, 49 years. k Male, 47 years. l Male, 48 years. In (j–l), different examples of the predominance of microporosity can be observed. m Female, 46 years. The acetabular fossa does not have age-related porosities over a large extent of the fossa, but the texture of the bone is densified. n Male, 65 years. o Male, 82 years. p Female, 63 years. In (n–p), different examples of loss of bone density in the acetabular fossa can be observed, due mainly to a combination of micro- and macroporosity and trabecular bone. q Female, 88 years. r Female, 80 years. In (q) and (r), two examples of loss of bone density and consistency as a result of bone destruction can be observed. s Male, 81 years. t Female, 85 years. In (s) and (t), two examples of the predominance of trabecular bone can be observed, making the fossa fragile. u Female, 88 years. There is a partial obliteration of the acetabular fossa due to bone proliferation within it

Fig. 10

Regular porosity. a Male, 17 years. b Male, 16 years. c Female, 16 years. In (a–c), regular porosity in the posterior lobe is indicated by arrows. d Male, 17 years. Regular porosity in the superior lobe is indicated by an arrow. e Female, 22 years. Regular porosity in the superior lobe is indicated by an arrow and bony bridges around the anterior lobe are indicated by arrowheads. f Male, 21 years. Bony bridges around the anterior lobe are indicated by arrowheads. g Female, 18 years. h Male, 30 years. i Male, 18 years. In (g–i), an irregular and shallow hole not related to aging (not age-related macroporosity) is indicated by arrows

Appropriateness of the variables

To analyze the appropriateness of the variables and quantify their correlation with age, Kruskal–Wallis [33] and Kendall rank correlation [34] tests were performed for males and females separately. Furthermore, box plots of known age within the stages of each variable were explored visually to identify the age range covered by each stage in each sex. The median, first and third quartiles, and maximum and minimum values were considered.
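The rank correlation between an ordinal stage score and known age can be sketched in a few lines. The text does not state which variant of Kendall's coefficient was used, so the tau-b form (which adjusts for the many tied stage scores) is an assumption here, and the stages and ages below are invented for illustration only.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (handles ties)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied on both variables: ignored
            elif dx == 0:
                ties_x += 1         # tied on x only
            elif dy == 0:
                ties_y += 1         # tied on y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Undefined (division by zero) if every value of x or of y is identical.
    denom = sqrt((concordant + discordant + ties_x) *
                 (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

# Hypothetical stage scores of one variable and known ages (not study data).
stages = [0, 0, 1, 1, 2, 2]
ages = [20, 25, 30, 28, 50, 60]
tau = kendall_tau_b(stages, ages)
print(round(tau, 3))
```

A coefficient near 1 means that higher stages are consistently found in older individuals; a significance test would normally be taken from a statistics package rather than coded by hand.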

Intra- and inter-observer error analyses were performed in order to quantify repeatability error. To evaluate intra-observer error during the analysis of the seven acetabular variables, 60 left os coxae were chosen randomly (30 males and 30 females) from the Lisbon Collection and evaluated twice, at different times, 1 month apart, by the first author. To quantify inter-observer error and evaluate the utility of the descriptions and photos, 37 left os coxae from the archeological collection of La Olmeda were examined, under identical conditions, by three different observers: the first author and two additional PhD students of Anthropology. In this case, only the three newly defined variables were evaluated as the first four original variables of the Rissech method had already demonstrated good levels of repeatability [16, 23, 26]. Given the impracticality for the two additional observers to visit the Lisbon Collection with the first author, the whole inter-observer analysis was undertaken by the three observers with the archeological collection of La Olmeda, housed in the Physical Anthropology Unit of the Faculty of Biology at the University of Barcelona. Furthermore, following the same protocol as in the Lisbon Collection, from this collection, we excluded individuals who showed pathologies affecting the acetabulum, but included those with non-inflammatory osteoarthritis or DISH due to their relation with aging [31, 32]. Also, in keeping with protocols used in the Lisbon Collection, the right side was evaluated when the left was damaged, pathological, or unavailable. The three observers evaluated the three newly defined variables using only the descriptions and photos provided here. Both intra- and inter-observer consistencies were evaluated using the weighted kappa statistic for ordinal data [35–37]. Weighted kappa was used instead of the standard kappa to take into account the degree of disagreement between repetitions or observers. 
According to the assessment criteria of Landis and Koch [37], a weighted kappa score of 0.41–0.60 indicates “moderate agreement,” 0.61–0.79 “substantial agreement,” and 0.80–1.00 “almost perfect agreement” between repetitions or observers for the different traits considered.
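As an illustration of how a weighted kappa penalizes large disagreements more than small ones, the following minimal sketch computes it for two sets of ordinal scores. The weighting scheme is not specified in the text, so linear weights are an assumption; the function names and the scores are hypothetical. The agreement bands are those quoted above.

```python
def weighted_kappa(first, second, categories):
    """Linearly weighted kappa for two ordinal ratings of the same items."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(first)
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weight |i - j|; the common factor 1/(k - 1) cancels out.
    num = sum(abs(i - j) * observed[i][j]
              for i in range(k) for j in range(k))
    den = sum(abs(i - j) * row[i] * col[j] / n
              for i in range(k) for j in range(k))
    # den is zero only if both ratings use a single, identical category.
    return 1 - num / den

def agreement_label(kappa):
    """Bands as quoted in the text from Landis and Koch."""
    if kappa >= 0.80:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    if kappa >= 0.41:
        return "moderate"
    return "below the bands quoted in the text"

# Hypothetical stage scores from two scoring sessions (not study data).
session_1 = [0, 0, 1, 1, 2, 2, 3, 3]
session_2 = [0, 0, 1, 2, 2, 2, 3, 3]
kappa = weighted_kappa(session_1, session_2, categories=[0, 1, 2, 3])
print(round(kappa, 2), agreement_label(kappa))
```

With a single one-stage disagreement in eight specimens, the statistic stays high; a disagreement of two or three stages on the same specimen would lower it more sharply.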

In addition, to evaluate possible sex differences in the aging process for these seven variables, a Mann–Whitney U test was performed to compare the age of males and females within each stage of each variable. Non-parametric testing was used because some sample sizes were small and a normal distribution could not be assumed for some of the variables considered in the analysis.
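The statistic behind this comparison can be sketched directly from its definition: U counts, over all female-male pairs within a stage, how often one sex is the older of the two. The ages below are invented for illustration, and the sketch stops at the statistic; the p-value would normally come from a statistics package.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x_i > y_j count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)

# Hypothetical ages at death for one stage of one variable (not study data).
females = [45, 52, 60, 63]
males = [38, 41, 52, 55]

u_females = mann_whitney_u(females, males)
u_males = mann_whitney_u(males, females)
print(u_females, u_males, min(u_females, u_males))
```

The two values always sum to the number of pairs (here 4 x 4 = 16); a strongly unbalanced split, as in this toy example, is what one would expect if females reach a given stage at older ages than males.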

Analysis of the accuracy of the variables in age prediction

The accuracy of the acetabular variables for age estimation was analyzed by applying Bayesian inference, as used by Rissech et al. [16]. Further details of this methodology are given by Lucy et al. [38]. Age estimation was performed through the computer program IDADE2, already used in the original papers of Rissech et al. [16, 17]. In this program, prior probability (the probability that the age at death of an unknown individual falls in an age class before any acetabula have been evaluated) is estimated as the fraction of individuals in the reference collection with known age at death in that age class. Posterior probability (the probability that the age at death of an unknown individual falls in an age class after the acetabula have been evaluated) is based on conditional probability distributions of age (class) at death, given that a particular set of features has been observed in the test specimen [39]. These distributions were estimated based on the frequencies observed in the reference collection. Furthermore, the underlying assumptions are: (1) different variables give independent information about age and (2) test individuals are at least 15 years old and are drawn from a population with similar survivorship to the reference collection. In addition, a relatively narrow 5-year age class was chosen because the variables reported allowed age to be estimated with this level of precision, without sacrificing accuracy. Thus, the reported results are based on 5-year age classes.
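The logic described above (priors from age-class frequencies, likelihoods from stage frequencies within each age class, and independence between variables) can be illustrated with a minimal sketch. This is not the IDADE2 implementation: the function names, the data layout, the absence of any smoothing for unobserved stages, and the toy reference sample are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def age_class(age, width=5, minimum=15):
    """Lower bound of the 5-year class containing `age` (15-19, 20-24, ...)."""
    return minimum + width * ((age - minimum) // width)

def posterior(reference, observed):
    """Posterior probability of each age class given observed stage scores.

    reference: list of (age, stages), stages being one score per variable.
    observed:  tuple of stage scores for the test specimen.
    """
    class_counts = Counter(age_class(age) for age, _ in reference)
    stage_counts = defaultdict(Counter)   # (class, variable) -> stage counts
    for age, stages in reference:
        for i, stage in enumerate(stages):
            stage_counts[(age_class(age), i)][stage] += 1
    n = len(reference)
    unnormalized = {}
    for c, count in class_counts.items():
        p = count / n                               # prior for this class
        for i, stage in enumerate(observed):        # independence assumption
            p *= stage_counts[(c, i)][stage] / count
        unnormalized[c] = p
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed scores never occur in the reference sample")
    return {c: p / total for c, p in unnormalized.items()}

# Toy reference sample with two variables (not study data).
reference = [(17, (0, 0)), (18, (0, 0)), (19, (0, 1)),
             (21, (0, 1)), (22, (0, 1)), (23, (1, 1))]
result = posterior(reference, observed=(0, 1))
print(result)
```

How a single age estimate is then derived from the posterior distribution is not described in the text, so the sketch stops at the distribution itself.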

The age-at-death estimation was performed through three different experiments, taking into account the sex of the test and reference individuals: experiment 1—considering exclusively male individuals; experiment 2—considering exclusively female individuals; and experiment 3—considering the entire sample without separating males and females. To carry out these experiments, the entire sample was randomly divided into test (226 individuals: 110 males and 116 females) and reference (385 individuals: 184 males and 201 females) samples. Thus, in the first experiment, where only males were considered, an analysis of the 110 males was performed as a test sample based on a male reference sample of 184 individuals. In the second experiment, where only females were considered, the analysis of 116 females was performed as a test sample on a female reference sample of 201 individuals. Finally, in the third experiment, the entire sample of 226 test individuals were analyzed (110 males and 116 females) based on a reference sample of 385 mixed individuals (184 males and 201 females). It is important to note that the individuals estimated were always exactly the same, which facilitates the comparison of the obtained results between experiments.
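The partition into test and reference samples can be sketched as follows. The text says only that the sample was divided randomly; drawing the test individuals within each sex, the fixed seed, and the record layout are assumptions for illustration. The sample sizes are those reported above.

```python
import random

def split_by_sex(individuals, n_test, seed=0):
    """Randomly assign a fixed number of individuals of each sex to the test set."""
    rng = random.Random(seed)
    test, reference = [], []
    for sex, size in n_test.items():
        group = [ind for ind in individuals if ind["sex"] == sex]
        rng.shuffle(group)
        test += group[:size]
        reference += group[size:]
    return test, reference

# Placeholder records standing in for the 294 males and 317 females.
sample = ([{"id": i, "sex": "M"} for i in range(294)] +
          [{"id": 294 + i, "sex": "F"} for i in range(317)])

test, reference = split_by_sex(sample, n_test={"M": 110, "F": 116})

# Experiment 1 would pair these male subsets; experiment 3 the full sets.
test_males = [ind for ind in test if ind["sex"] == "M"]
reference_males = [ind for ind in reference if ind["sex"] == "M"]
print(len(test), len(reference), len(test_males), len(reference_males))
```

Because the same test individuals are reused in all three experiments, any difference in error between experiments reflects the reference sample rather than the individuals being estimated.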

To investigate the performance of the age estimation method in the three experiments, the bias and absolute error of the estimations were analyzed. Both bias and absolute error are good indicators of a method’s inaccuracy [40]. Bias (estimated age − chronological age) is the statistical measure that determines the direction of the committed error in a method’s misclassification [9, 41, 42], identifying whether the estimated age is over- or underestimated. The bias is negative if the estimated age is younger than the chronological age and is positive if the estimated age is older than the chronological age. Bias was calculated as the average difference between estimated age and chronological age using each method (Σ(estimated age − chronological age)/n). Absolute error is the statistical measure that evaluates the degree of the committed error in a method’s estimation [9, 41, 42]. Absolute error was calculated as the average absolute difference between estimated age and chronological age using each method (Σ|estimated age − chronological age|/n). In essence, absolute error represents absolute difference, without taking into account the sign (positive or negative) of the difference between estimated age and chronological age.
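The two measures defined above translate directly into code. The sketch below also computes the share of individuals whose absolute error falls below a limit, as tabulated for 5 and 10 years; the function names and the ages are hypothetical and serve only to show why both measures are needed.

```python
from statistics import mean

def bias(estimated, chronological):
    """Mean signed error; positive values mean overestimation on average."""
    return mean(e - c for e, c in zip(estimated, chronological))

def absolute_error(estimated, chronological):
    """Mean unsigned error, ignoring the direction of each miss."""
    return mean(abs(e - c) for e, c in zip(estimated, chronological))

def share_within(estimated, chronological, limit):
    """Fraction of individuals with absolute error below `limit` years."""
    hits = [abs(e - c) < limit for e, c in zip(estimated, chronological)]
    return sum(hits) / len(hits)

# Hypothetical ages in years (not study data).
estimated = [32.5, 47.5, 62.5, 72.5]
chronological = [30, 50, 70, 65]

print(bias(estimated, chronological))             # over- and underestimates cancel
print(absolute_error(estimated, chronological))   # size of the typical miss
print(share_within(estimated, chronological, 5))
print(share_within(estimated, chronological, 10))
```

In this toy case the bias is zero even though every estimate is off by several years, which is why bias alone cannot describe a method's inaccuracy.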

To analyze possible sex differences in bias and absolute error, Student’s t test was applied in each experiment for each age group (15–39 years, 40–64 years, and >65 years). In other words, the possible sex differences were analyzed by comparing males (males based on males, experiment 1) and females (females based on females, experiment 2) and by comparing males (based on a mixed sample, experiment 3) and females (based on a mixed sample, experiment 3). In those cases in which bias and absolute error did not follow a normal distribution or the sample size was small, the Mann–Whitney U test was applied. Furthermore, to evaluate whether the use of sex-specific reference samples improves the results of the age estimation, the inaccuracy obtained within males (experiment 1 vs. experiment 3) and within females (experiment 2 vs. experiment 3) was compared with a Student’s t test of related samples. In other words, the bias and absolute error were compared for the male estimates coming from experiment 1 (male estimates based on a male reference sample) with male estimates coming from experiment 3 (male estimates based on a mixed reference sample). Also, a comparison of the bias and absolute error of the female estimates coming from experiment 2 (female estimates based on a female reference sample) was made with female estimates from experiment 3 (female estimates based on a mixed reference sample). In those cases where the normal distribution cannot be assumed, a Wilcoxon test was applied. Furthermore, Pearson’s correlation was performed for all males and females from the three experiments to examine the relationship between the absolute error and bias of the estimations with the individual’s chronological age. In those cases where the normal distribution for chronological age cannot be assumed, Spearman’s correlation was performed.

Results

Appropriateness of variables

Both Kruskal–Wallis and Kendall rank correlation tests indicated a significant correlation between each variable and chronological age for the male and female samples (Table 2). Box plots of the variables (Fig. 11), indicating the median, first and third quartiles, and the maximum and minimum ages in every stage of each variable, showed that the different stages of the variables develop in a sequential manner throughout the life span, with a slight overlap in most of the cases in both sexes. Regarding the repeatability of the variables, both intra- (Table 3) and inter-observer errors (Table 4) showed good levels of consistency, with values higher than 0.67 with p < 0.001 in all cases. In agreement with Landis and Koch [37], a weighted kappa score of 0.61–0.79 (25 % of the cases evaluated) means substantial agreement, while a score of 0.80–1.00 (75 % of the cases evaluated) represents an almost perfect agreement between repetitions or observers for the different traits considered.

Table 2 Correlation between the seven variables and chronological age in males and females separately by Kruskal–Wallis test (left) and Kendall rank correlation (right)
Fig. 11

Box plot between age and each of the seven variables to illustrate the relationship between the stage of each variable (horizontal axis) and age (vertical axis) for males (dark gray) and females (light gray). The central line in the boxes indicates the median, the bottom and top of the box represent the first and third quartiles, respectively, and the ends of the whiskers correspond to the minimum and maximum values

Table 3 Intra-observer reliability using the weighted kappa statistic (p < 0.001) for each individually assessed variable
Table 4 Inter-observer reliability using the weighted kappa statistic (p < 0.001) for each individually assessed newly defined variable

Analysis of sex differences in the aging process

Table 5 shows the results of the analysis considering how the stages of the variables specifically correspond to known age by sex. The Mann–Whitney U test indicated significant sex differences in mean ages, in some stages, for all variables, especially in the middle-aged stages (Table 5). These results suggest a single aging pattern for the acetabulum within both sexes, with different aging rates between males and females. The mean age of each stage of variables 1, 4, and 5 was always higher in females compared to males, showing females have a slower rate of aging in relation to males. The remaining variables (2, 3, 6, and 7) exhibited this pattern only from middle-aged stages (from stage 3 in variables 2 and 3 and from stage 2 in variables 6 and 7). However, the opposite pattern was observed in some of the first stages of variables 2, 3, 6, and 7 as males showed a slower aging rate for these specific characteristics than females.

Table 5 Descriptive statistics of the mean chronological age for every stage of each variable in females and males separately

Analysis of the accuracy of the variables in age prediction

After the age estimation of the individuals using the IDADE2 program, the bias and absolute error of the estimations were evaluated. Tables 6 and 7 show the differences between chronological age and estimated age based on the sex-specific reference sample (Table 6) and mixed reference sample (Table 7) for both sexes and the different age groups. Regarding the bias based on the sex-specific reference sample, it seems that the age estimates for both sexes are overestimated until the seventh decade of life and underestimated after 70 years (Table 6 and Fig. 12). There was also, on average, a slight underestimation for women in the 20- to 39-year interval. The results also indicated that the mean absolute error based on the sex-specific reference samples was 7.28 years in males and 7.09 years in females (Table 6). When 10-year age groups were taken into account, the behavior of the absolute error changed greatly between the sexes (Table 6). When just males were considered, the lowest absolute error was found in the youngest group and in the 40- to 49-year age group. Worse results were found in the 70- to 79-year group, while the rest of the groups kept similar average intermediate errors. On the other hand, when only females were considered, the best estimations were made on individuals less than 30 years of age and the worst estimations for the 30–49 age groups, keeping intermediate errors similar to males for the rest of the age groups. In this regard, the error was within 10 years in 73.4 % of males and 75.0 % of females, while it was within 5 years in 44.0 % of males and 48.3 % of females (Table 6). In general, when the age estimations were made based on the mixed reference sample, the overall results were fairly similar (Table 6 vs. Table 7), independent of the nature of the reference sample. 
Furthermore, as expected, there is a negative and significant correlation between bias and chronological age in males (male reference sample: N = 109, r = −0.384, p < 0.001; mixed reference sample: N = 109, r = −0.244, p = 0.011) and females (female reference sample: N = 116, ρ = −0.269, p = 0.004; mixed reference sample: N = 116, ρ = −0.353, p < 0.001). However, regarding absolute error and chronological age, the correlation is only positive and significant in males (male reference sample: N = 109, r = 0.230, p = 0.016; mixed reference sample: N = 109, r = 0.245, p = 0.010).

Table 6 Number of individuals in each age group and their percentage of the total number of males and females (both estimated with sex-specific reference sample), with absolute error (|e|) less than specific amounts (<5 years and <10 years), tabulated within age classes, and overall specimens, mean bias, and mean absolute error
Table 7 Number of individuals in each age group and their percentage of the total number of males and females (both estimated with the mixed reference sample), with absolute error (|e|) less than specific amounts (<5 years and <10 years), tabulated within age classes, and overall specimens, mean bias, and mean absolute error
Fig. 12

Amount of age bias recorded for each individual analyzed following the application of current acetabular age estimation technique. Top: males. Bottom: females. Gray: sex-specific reference sample. Black: mixed reference sample

Regarding sex differences, Table 8 shows the differences in bias and absolute error between males and females, taking into account the age group and the reference sample used for the age estimation. There are no significant sex differences in the bias between males and females. The exception is in the overall sample considering the mixed reference sample, where the method generally overestimated age in males and underestimated it in females. With respect to absolute error, the differences between sexes are not significant either in the overall sample or in the different age groups considered. However, individuals older than 65 years, estimated with the sex-specific reference sample, have an absolute error that is significantly lower in females. Finally, Table 9 shows the differences in bias and absolute error within males and within females when the mixed reference and sex-specific reference samples are compared (Fig. 12). The results indicated that there are significant differences regarding bias in some instances. In several cases the bias based on the mixed reference sample is significantly higher than that based on the sex-specific reference sample, and vice versa. Although no consistent pattern emerges when the age groups are considered separately, on average, in the overall sample, the sex-specific reference sample seems to provide significantly better results for males and females (Table 9). With respect to absolute error, no significant differences were found (Table 9), with the exception of females older than 65 years. In this female age category, the error was significantly lower when the sex-specific reference sample was used.

Table 8 Differences in bias and absolute error between the sexes along the age cohorts estimated using the sex-specific and mixed reference samples
Table 9 Differences in bias and absolute error between individuals estimated using either the mixed reference sample or the sex-specific reference sample in both sexes separately

Discussion

This study used 611 os coxae from the Iberian Peninsula to analyze the age-related morphological changes expressed in the acetabular area. The aim of the study was to restructure and better define the variables based on the Rissech et al. [16] methodology for aging the acetabular fossa and to extend the applicability of the method to females. The general results confirm that the acetabulum and the three newly defined variables are good age predictors for both Iberian males and females. In all three experiments, the absolute error of the age estimation was less than 10 years for around 74 % of the sample. The mean absolute errors of the acetabular method in the three experiments were 7.28 years for males and 7.09 years for females based on sex-specific reference samples and 7.35 years for males and 7.15 years for females based on a mixed reference sample. These errors are lower than those reported for the original method applied to Canadian (8 years in [24]) and North American samples (8.61 years for both sexes in [23]). Furthermore, the mean absolute error of our revised method is significantly lower than that obtained when traditional methods are applied to Iberian samples, i.e., Suchey–Brooks (12.38 years in [13]; 14.42 years in [22]) and Buckberry–Chamberlain (11.24 years in [13]; 14.12 years in [22]).
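The summary statistics quoted above (mean bias, mean absolute error, and the percentage of individuals with absolute error below 5 or 10 years) can be illustrated with a minimal sketch. The function name `error_summary`, the sign convention (estimated minus documented age, so that positive bias means overestimation), and the five example ages are assumptions made for illustration only; they are not data or code from this study.

```python
from statistics import mean


def error_summary(estimated, actual, thresholds=(5, 10)):
    """Summarise age-estimation error for paired estimated/documented ages (years)."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    # Signed error; assumed convention: positive = age overestimated.
    errors = [e - a for e, a in zip(estimated, actual)]
    abs_errors = [abs(x) for x in errors]
    summary = {
        "bias": mean(errors),
        "mean_abs_error": mean(abs_errors),
    }
    # Percentage of individuals whose absolute error is strictly below each threshold.
    for t in thresholds:
        within = sum(x < t for x in abs_errors)
        summary[f"pct_within_{t}"] = 100 * within / len(abs_errors)
    return summary


# Hypothetical ages, invented purely to exercise the function.
estimated_ages = [28, 41, 55, 62, 70]
documented_ages = [25, 45, 52, 75, 68]
result = error_summary(estimated_ages, documented_ages)
print(result)
```

Reporting bias and absolute error side by side matters because the two can diverge: over- and underestimations cancel in the bias but accumulate in the absolute error.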

In addition, the errors obtained in this study are lower than those reported by other studies based on the classical age markers of the os coxae (auricular surface and pubic symphysis) in other populations. For example, the Lovejoy et al. auricular surface age estimation method [19] produced error rates of 13.8 years for males and 18.2 years for females in [7]; 11.4 years for males and 12.8 for females in [9], and 11.6 years in [42]. The Buckberry–Chamberlain auricular surface age estimation method [20] produced error rates of 11.8 years in males and 13.9 years for females in [11], 10.14 years in [20], and 13.03 years in [43]. Finally, the Suchey–Brooks pubic symphysis age estimation method [18] produced error rates of 14.2 years for males and 16.4 years for females in [7], 13.6 years for males and 13.8 for females in [9], 10.5 years in [43], and 8.95 years in [44]. This indicates that the revised acetabular method for adult age estimation performs better than the classical methods based on the os coxae.

Based on the described unreliability of the pubic symphysis and auricular surface as estimators of age at death in older adults, some researchers have proposed that it may be nearly impossible to determine age in elderly skeletons [45–47]. Variability is an intrinsic characteristic of aging, and it is a well-documented fact that age markers become progressively more inaccurate with advancing senescence [48]. Age estimation methods tend to provide a narrow age range for juveniles and young adults, with the age ranges gradually and then markedly becoming broader as an individual ages [49]. For example, even using a Bayesian approach to evaluate observations on the auricular surface and the pubic symphysis, Schmitt [32] and Schmitt et al. [50] had to lump together all individuals aged 60 or older into a single age class as they had difficulties in estimating the age of older individuals. In addition, the aging pattern for the different traits of the acetabulum is sequential, but not linear. This means that not every individual necessarily passes through all phases of each acetabular trait. However, even with a general underestimation of age in individuals older than 70 years, the current results confirm that all ages were estimated with reasonably good levels of accuracy. Interestingly, this indicates that observations from the acetabular area may be useful to estimate age at death in adult individuals of any age, including older adults. This accuracy for older individuals is largely explained by the longer maturation and aging time course of the anatomical features of the acetabulum used to estimate age at death. As some authors have suggested, the acetabulum may ultimately be the most reliable indicator for individuals older than 60 years [51].

Additionally, in conjunction with the auricular surface, the acetabulum is one of the best-preserved areas of the pelvis due to its robust and durable nature, as well as its lateral positioning on the os coxa [52]. These characteristics allow the acetabulum to be well preserved and resistant to post-depositional processes, making this method very useful for age estimation. This is an important advantage, particularly in osteoarchaeological contexts where the fragile pubic symphysis is rarely preserved. Even in the numerous cases where the pubis, combined with the nearest anterior area of the acetabulum, including a small part of the anterior apex of the lunate surface, is missing, the “incomplete” acetabula could be scored using this method as none of the variables use this missing part specifically. In this line, some authors have suggested a method using a combination of features of the acetabulum and the auricular surface [27, 53]. Both age markers also have an increased ability to discriminate older individuals compared with the pubic symphysis. However, it should be noted that this methodology, as with those based on the pubic symphysis and auricular surface, is restricted to fully skeletonized os coxae, requiring bones to be devoid of soft tissues. This is especially relevant in forensic cases with putrefied and/or mummified bodies.

While the original work studied age-related acetabular changes in males only [16], this new approach, with newly defined variables, seems to be equally applicable to both sexes. In agreement with San-Millán [29], both sexes appear to follow a single aging pattern; however, the aging rate is different. Although the differences are significant mainly in the middle-aged stages, age-related physiological changes seem to occur earlier in males than in females (Table 5 and Fig. 11). Even though a slower rate of aging of the female acetabulum was already described by Mays [26] and San-Millán [29], the current study is the first in which every acetabular trait was evaluated separately and in detail in males and females. Although there are several hypotheses that try to explain the differences, the slower rate of aging in females could be related to the well-known greater longevity of females [54–56]. With respect to the variables related to osteophytic growth [29], this fact could be associated with the higher male predisposition to form bone compared with females [57–59]. Due to this tendency, males could be forming bone earlier in the acetabular rim (variable 2), the posterior apex (variable 4), and the outer edge of the acetabular fossa (variable 5).

Interestingly, the opposite pattern was observed in variables 2, 3, 6, and 7 around the fourth decade of age, with a mostly significant faster aging process in females vs. males during this phase of life only (Table 5 and Fig. 11). The earlier maturation of females has been broadly described, even in bone development [60], and could explain the faster aging rate in the acetabulum before the shift observed. Males seem to show a regular and sequential aging pattern, whereas in females a noticeable gap is observed around 50 years of age. This difference likely corresponds with menopause, as it radically changes the aging pattern from a faster aging rate to a slower one compared with males. After this remarkable change in the relationship, the slower aging rate of females is maintained until the end of their life span. Nevertheless, with the exception of variables 3 and 7, the last stages of all the variables did not show significant sex differences in mean age, suggesting that the development of the characteristics evaluated becomes roughly equivalent between sexes toward the end of life. This fact could be related to the well-known relative masculinization of some variables with aging due to the ossification of the connective tissues and bone proliferation [61–64]. This is in accordance with the results for variables 2, 4, and 5, which relate directly to bone proliferation. However, only the variables mostly related to the porosity phenomena (variables 3 and 7 and almost significant in variable 6) differed significantly in mean age in the last stages of the life span between sexes, i.e., females reached the last stage at an older age than males. Although Rissech et al. [65] noted that the only acetabular variable that significantly relates to osteoporosis was variable 3 (acetabular rim porosity), the higher level of osteoporosis typically found in females does not appear to affect the use of acetabular rim porosity as an aging variable.

Despite the sex differences previously described in the aging rate of the acetabulum, analyzing both sexes separately, with sex-specific reference samples, does not improve or significantly change the final age estimation in terms of mean absolute error. When age intervals are taken into account, however, males and females have different reliabilities across age groups. Some difficulties in the estimation of age appear in females who are 30–49 years old and in males who are 70–79 years of age. This fact could reflect sex-specific age ranges with high variability in the physiological changes in the acetabulum, or it may be an artifact of sampling bias in these specific age–sex groups. Given the significant sex differences in the biology of these age-related variables and in the aging rate, and the existence of significant sex differences in mean absolute error in adults older than 65 years based on sex-specific reference samples, it is strongly recommended that a sex-specific reference sample be used in the age estimation based on acetabular characteristics.

Although some authors have suggested that the original morphological characteristics of the acetabular fossa are less consistently associated with age [24–26], the new redefined variables related to the fossa correlated significantly with age in this sample. Some authors have also noted a difficulty in assessing these variables; however, the new revised variables have been better defined, with clear and thorough descriptions. The aim was to make the variables more easily applicable to other populations, at least for populations in Western Europe. Thus, the high values of weighted kappa confirmed the usefulness of the new clear descriptions and illustrative images. In addition, the high levels of repeatability have proven that the three newly defined variables have stages that can be consistently evaluated by an untrained but osteologically competent observer. As Rissech and co-workers had described previously [16, 17], both bone growth on the outer edge of the acetabular fossa (original and new variables 5) and porosity and/or proliferation of destructured bone (original variables 6 and 7, restructured in the new variable 7) occurred as an age-related phenomenon. Furthermore, the new variable 6 considers the “activity” or texture change within the center of the fossa and is related to the old variables 6 and 7. The acetabular fossa is normally covered with a pulvinar tissue that cushions the movements of the femoral head inside the acetabulum. This tissue gradually degenerates with age [66] and stops protecting the acetabular fossa against the impact of the femoral head. This causes the fossa to lose consistency and bone density, becoming fragile in older individuals. In some cases, destructured new bone is generated along the fossa, sometimes obliterating it. The current results support the previous results of Rissech and co-workers [16, 17] and demonstrate the usefulness of the acetabular fossa’s physiological changes for age estimation, at least in an Iberian population.
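For readers unfamiliar with the repeatability statistic mentioned above, the following is a minimal sketch of Cohen's weighted kappa for two sets of ordinal stage scores. The weighting scheme actually used in this study is not restated in this section, so the default linear weights are an assumption, and the function name and the example scores are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Cohen's weighted kappa for ordinal scores coded 0 .. n_categories - 1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    if n_categories < 2 or weights not in ("linear", "quadratic"):
        raise ValueError("need at least 2 categories and a known weighting")
    k, n = n_categories, len(rater_a)
    if not all(0 <= s < k for s in list(rater_a) + list(rater_b)):
        raise ValueError("scores must lie in 0 .. n_categories - 1")

    # Observed joint proportions of (score by A, score by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 1 if weights == "linear" else 2

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the most distant pair of stages.
        return (abs(i - j) / (k - 1)) ** power

    obs = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when all scores fall in one category")
    return 1 - obs / exp


# Hypothetical scores for one four-stage variable, recorded in two sessions.
first_session = [0, 1, 2, 3, 1, 2]
second_session = [0, 1, 2, 2, 1, 3]
kappa = weighted_kappa(first_session, second_session, n_categories=4)
print(round(kappa, 3))
```

Weighting penalizes a disagreement of several stages more heavily than a disagreement between adjacent stages, which suits ordinal scoring systems such as the acetabular stages.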

Future analyses should test the new variables on different populations to assess inter-population variability and enhance the applicability of acetabular variables on different populations, both geographically and temporally. In this line, one of the major goals of the IDADE2 program, which is based on Bayesian inference, is the possibility of choosing a representative reference sample based on the characteristics and context of the target sample or specimen being analyzed.
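The role of the reference sample in a Bayesian estimate can be illustrated with a deliberately simplified sketch. It is not the IDADE2 algorithm: it assumes a single scored variable, discrete age classes, a prior taken from the age distribution of the reference sample, and additive smoothing, all of which are choices made here for illustration. The function name and the reference data are hypothetical.

```python
from collections import Counter


def posterior_age_classes(reference, observed_stage, smoothing=1.0):
    """P(age class | observed stage) from a reference list of (age_class, stage) tuples."""
    if not reference:
        raise ValueError("reference sample must not be empty")
    class_counts = Counter(age_class for age_class, _ in reference)
    pair_counts = Counter(reference)
    stages = {stage for _, stage in reference} | {observed_stage}
    total = len(reference)

    unnormalised = {}
    for age_class, n_class in class_counts.items():
        # Assumed prior: relative frequency of the age class in the reference sample.
        prior = n_class / total
        # Smoothed likelihood, so an unseen (class, stage) pair does not force a zero.
        likelihood = (pair_counts[(age_class, observed_stage)] + smoothing) / (
            n_class + smoothing * len(stages)
        )
        unnormalised[age_class] = prior * likelihood

    norm = sum(unnormalised.values())
    return {c: v / norm for c, v in unnormalised.items()}


# Hypothetical reference sample: (age class, stage of one acetabular variable).
reference_sample = (
    [("20-39", 0)] * 6 + [("20-39", 1)] * 2
    + [("40-59", 1)] * 5 + [("40-59", 2)] * 3
    + [("60+", 2)] * 6 + [("60+", 3)] * 2
)
posterior = posterior_age_classes(reference_sample, observed_stage=1)
print(posterior)
```

Because both the prior and the likelihood are read from the reference sample, swapping in a different reference sample changes the posterior for the same observed stage, which is why the choice of a representative reference matters.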

Finally, it is necessary to point out that age estimation concerns biology, where variability is the rule. Even considering population data, every individual may show different aging patterns. Due to this variability, previous experience of applying this method, as in other aging techniques, is valuable, but, due to the demonstrated high repeatability, not essential. In this line, young individuals’ acetabula are actually very porous, so it is important not to confuse normal porosity (Fig. 10) or bone damage from poor preservation with any of the traits considered here. Morphological indicators of age reveal the skeleton’s continuing adaptation to extraneous factors such as biomechanical stress, mineral metabolism, growth, remodeling, and disease [48]. Furthermore, individual differences in physical activity, nutrition, genetics, endocrine activity, body weight, and environmental conditions (climate, latitude, humidity, etc.) could be increasing the variability within and between populations. The relationship between chronological age and skeletal age indicators is neither constant nor linear [59]. Correct age estimation must consider this unavoidable limitation in order to define, with precision, the limits of results provided [51]. In this way, it is indispensable to improve aging techniques, thereby reducing the error and generating population standards based on specific and complete documented samples. Bayesian inference presents new opportunities for improving the accuracy of age estimates and appears to provide robust and appropriate results [16, 17, 38, 67–69].

Conclusions

In the last 10 years, the acetabulum has been established as a robust skeletal age indicator. Based on the results of this study, the acetabulum remodels continuously throughout the life span of an individual. Also, its traits are sequential and follow an unambiguously recognizable aging pattern in both sexes. In addition, the acetabular area presents a high frequency of preservation, both in archeological and forensic contexts. These facts allow the acetabulum to provide accurate estimations over the entire life of an individual, making it a promising alternative to estimate the age at death of older individuals.

Furthermore, the acetabulum performs with lower absolute error than other age indicators of the os coxa, with mean absolute errors of 7.28 years for males and 7.09 for females based on a sex-specific reference sample, and with an absolute error of less than 10 years in around 74 % of the sample. The evaluation of the traits separately has shown that the aging patterns are similar between males and females, but not the aging rates. Due to these differences in the biological process, it is strongly recommended that this method be applied in conjunction with a sex-specific reference sample. While the current database from the Lisbon Collection could be useful to estimate the age at death of samples from similar biological and temporal contexts, it is necessary to conduct further testing of this new approach on more geographically and temporally specific samples to take into account both inter-population differences and secular trends and possibly to create large and complete reference samples from different populations, generating a broadly applicable reference database. Researchers could then apply the correct population reference for an individual by choosing between different reference samples depending on their target material. This will allow researchers to achieve better and more accurate age estimations based on more geographically, ancestrally, and temporally proximate reference samples. It remains to be seen whether a general aging pattern will emerge, with different aging rates in separate populations, as was found in this study between males and females.