Introduction

Treatment of immigration has a twofold dimension: integration of the immigrant into a legal situation, placing the individual and his rights in the centre; and treatment of the illegal immigrant, focusing attention on the following: (a) control of migratory flows (origin, transition, frontiers and interior); (b) protection of the internal labour market, the culture and living conditions of nationals; and (c) the gradual integration of the immigrants into the native society when they have settled. The lack of effective and democratic economic structures, wars and armed conflicts, ethnic tensions, systematic violations of human rights and natural disasters are related to the instability of migratory flows and explain to a great extent the increase of irregular migratory currents. Cross-border movement to new more stable environments has led to a growing demand for estimations of the chronological age of children, youngsters and juveniles that lack official papers or contrastable documents [1,2,3,4,5].

The importance in legal terms is due to the fact that in international treaties and local, regional and national systems, displaced persons are possessors of rights and obligations (protection as victims and liability for criminal activities). The age of children, youngsters and juveniles determines their rights, the scope for containment, the bodies responsible for managing claims, the possibilities of repatriation and/or adoption and the administrative, civil and criminal procedures to be applied in such situations. In Europe, the Schengen agreements and the treaties of Amsterdam and Lisbon allocate competence for immigration policies to the European Union, while responsibility for complying with and managing agreements is given over to the member states. Although states/nations have different regulations, 14 years is the average age established for exemption from criminal liability (minimum 10 years in England and Wales, Northern Ireland and Switzerland; maximum 18 years in Belgium and Luxembourg); 18 years is the age commonly used to separate legal minors and juveniles; and in general terms, jurisdiction for minors is not applicable to juveniles of between 18 and 21 years. Depending on the severity of the crime, different levels of liability are attributed to minors in other age ranges (14–16 years and 16–18 years, amongst others) [6, 7]. In the ethical and medical fields, efforts are geared towards caring for victims of exploitation in transit and at the destination, and for children and youngsters that live in precarious conditions. The biological age is of interest for diagnosing alterations to the rhythm of growth and endocrine, genetic, kidney, metabolic and/or nutritional disorders, and for predicting the final height of the individual. Predicting possible developmental delays can help in furthering preventive therapy and, if necessary, in correction [8,9,10,11].
On the other hand, in cases of death, an effective lack of identification of the body has legal, civil and economic consequences for the deceased and their family [4, 12].

Without contrastable documentation, the chronological age is related to the level of development and degree of somatic maturity of the individual (biological ages) and, in particular, with the bone age, dental age, morphological age (age of growth) and level of sexual maturity [5, 13,14,15,16]. The bone age is determined by the degree of ossification of the wrist bones (carpus) and by the development and degree of fusion of the metacarpal bones, the phalanges and the distal epiphyses of the radius and ulna [17,18,19,20]. In adults, the level of fusion of the proximal epiphysis of the collarbone is assessed to determine if an individual is over 21 years of age [10, 21]. With an orthopantomograph (OPG) and intra-oral dental radiographs, the dental age is assessed by establishing both the state of eruption of the dentition (number and groups of teeth that emerge in the oral cavity) [22, 23] and the state of mineralisation of the dental crowns and roots [24, 25]. The morphological age is diagnosed according to the physical characteristics such as height, weight and the general shape of the body, and the results are contrasted with growth curves [13, 26]. The level of sexual maturity refers to the state of development reached by secondary sexual characteristics and the appearance of menstruation in girls. Sexual maturity is also related to the accelerated general growth of the body that is observed during puberty [13].

Although individuals are grouped according to ancestry, sex and age, the high variability in biological development has been made plain with all the techniques and in all populations [4]. Socio-economic differences, systemic diseases, environmental conditions, nutritional habits and disorders and endocrine and congenital disorders (congenital hypothyroidism, adrenal hypoplasia, precocious puberty, etc.) partly explain the variation [27, 28]. To further best practices when estimating chronological age, groups of international experts have highlighted the possibilities and limitations of the methods, have set out the need to combine techniques (physical, dental and bone examinations) to achieve greater precision and have published directives and recommendations to provide protocols and encourage transparency. Especially noteworthy in this context are the contributions formulated or promoted by the American Board of Forensic Odontologists (ABFO), the International Organization for Forensic Odonto-Stomatology (IOFOS) and the Study Group of Age Estimation of the German Society of Legal Medicine (AGFAD) [5, 29,30,31,32,33,34,35].

Without access to any complete and updated information on estimating the chronological age of the Spanish population and without an explicit formula that enables the results of the bone, dental and physical examinations to be integrated, we have set out the following objectives in these studies: (1) evaluate the systematic error (systematic differences in comparison to the benchmark American population) and the variance in the estimation of the chronological age when applying the Greulich and Pyle Atlas (GPA) to the Spanish population; (2) propose weighting factors to reduce systematic errors in the estimation of chronological age; and (3) obtain a decision rule to predict maturity (18 years) and separate individuals that have a doubtful classification. More specifically, we propose a sequential classification procedure in two or more stages (decision tree): (a) evaluate the degree of bone maturity with the Greulich and Pyle method and, while ensuring specificity and sensitivity, classify the individuals into three categories (mature, minor and undetermined) and (b) using additional techniques (dental and morphological development along with that of other bones) and other more sophisticated techniques (e.g. molecular osteology; histomorphology), focus attention on the individuals classified in the undetermined group. The reason for selecting the Greulich and Pyle Atlas is that it is the most widely used reference for evaluating the bone age of children, youngsters and juveniles. It is simple to apply and easily accessible; the predictive capacity is reasonably high (the hand and wrist present multiple centres of ossification), and when the radiograph is taken, the subject receives a small amount of radiation (0.0001 to 0.1 mSv in each exposure [36]). However, the Greulich and Pyle method was not designed to estimate the forensic age; the standards are old and the subjectivity in recognition of the patterns (reference radiographs) affects the reliability of the measurement.
The systematic error in the estimation of the chronological age can be offset if the bias in each population is known, and the uncertainty/variability in the estimation of the chronological age is similar to the one reported in other methods.

Materials and methods

The database studied was compiled from conventional radiographs of the left hand and wrist of 1150 individuals of Spanish nationality, whose ages range from birth to 18 years in women and from birth to 19 years in men (560 girls and 590 boys: 30 per age class and sex from 1 year to 18 and 19 years, respectively, and 20 for the “under one year” class). In an industrialised environment with universal access to a high-quality public/private healthcare system (HAQ-index), child malnutrition in the Spanish population can be regarded as residual, and the life expectancy is 80 years for men and 86 years for women (Spanish National Institute of Statistics, Spain 2016).

The radiographs were provided by the Hospital Sant Joan de Déu (university hospital of the University of Barcelona, which specialises in paediatrics, gynaecology and obstetrics) and by the Image Diagnostics Service (Servei de Diagnòstic per la Imatge (SDPI)) Pura Fernàndez of the Hospital of Llobregat (Barcelona). All the radiographs selected in this study were taken in line with the established protocol: (a) the patient is seated at the edge of the radiographic bed, resting the hand to be examined on the RDI; (b) to obtain an anteroposterior (AP) projection of the hand, the patient places the outstretched fingers, slightly separated and relaxed, in close contact with the plate, along with the carpus and metacarpus; and (c) the imaging technician centres the X-ray beam on the third metacarpal and takes the radiograph. To avoid the effect of atypical values, only radiographs of patients who attended the hospital for a possible fracture or trauma that did not affect the bone structure of the hand and wrist were used. Therefore, individuals who presented fractures or developmental anomalies were not included, and radiographs in which the bones appeared distorted (poor radiograph quality or incorrect projection) were also discarded. In some exceptional cases, radiographs of the right hand were used when a radiograph of the left hand was not available. The right hand was used to maintain a balanced sample size across all categories, and because the difference in predicted bone age between radiographs of the two hands is not significant [37,38,39].
All the procedures and studies have been carried out in accordance with the ethical standards established by the Ethical Committees of Clinical Research (Hospital Sant Joan de Deu: ECCR attached to the Fundación Sant Joan de Deu; and SDPI Pura Fernàndez of the Hospital of Llobregat: ECCR of the Instituto Universitario de Investigación en Atención Primaria Jordi Golp). In accordance with Spanish legislation, all cases studied were previously anonymised. In particular, the only data available to the authors of this study is as follows: date of birth, date radiograph was taken, chronological age (difference between the date the radiograph was taken and the date of birth), sex and nationality.

Greulich and Pyle Atlas

The Greulich and Pyle Atlas contains a set of standard radiographs of the left hand/wrist, representative of the bone age in each class of age and sex (0 to 18 years in girls and 0 to 19 years in boys), and the indicators of maturity of the hand bones and of the distal epiphyses of the radius and ulna. To define the references or standards, the authors conducted a longitudinal study using a population sample made up of 1000 girls and boys, of high social class (without nutritional or pathological problems that might affect growth), born in Cleveland, OH, USA, during the period 1931–1942 [17]. With radiographs of the left hand and wrist, in a flat position and with a posterior view, the sequential readings of the bones that make up the radiograph are taken in the following manner: (1) the presence or absence of the carpal bones is determined (each of the eight bones has an established time of appearance [40]); (2) the degree of mineralisation and fusion of the distal epiphyses of the radius and ulna is determined; and (3) the degree of ossification of the proximal epiphyses of the phalanges and metacarpals is evaluated. With this qualitative information in mind, and avoiding confusion when irregularities appear in the order of appearance of the bones, the bone age is assigned according to the atlas standard with the highest degree of similarity. In practice, adjustments of standards to each population are unavoidable, since genetic, environmental, socio-economic and time factors influence the level of bone maturity and explain to a great extent the systematic error in estimating age when the method is applied to a population that is different from the benchmark [41].

Statistical treatment

Reliability in measurement has been related to repeatability (inter- and intra-observer) and with an error in application of the method. Lin’s concordance correlation coefficient [42] has been previously calculated to assess the repeatability of the measurement when applying the GPA method. For boys and girls, the value of the coefficient has been calculated thus:

$$ \rho_c = 1 - \frac{1}{n}\cdot\frac{\sum_{i=1}^{n}\left(y_{1i}-y_{2i}\right)^2}{s_{y1}^2+s_{y2}^2+\left(\overline{y}_1-\overline{y}_2\right)^2} $$
(1)

where \( n \) is the sample size, \( y_{1i} \) is the first set of measurements (first observer or first replica), \( y_{2i} \) is the second set of measurements (second observer or second replica) and \( \overline{y}_1 \), \( s_{y1}^2 \) and \( \overline{y}_2 \), \( s_{y2}^2 \) are the mean and the variance of the first and second set of measurements. The mean difference between observations (inter- and intra-observer) and the standard deviation have also been calculated. To ensure that the results are representative, 80 individuals selected at random were included: 40 boys, 2 of each age class; 40 girls, 2 of each age class and two more selected at random to complete the sample. To evaluate inter-observer repeatability, two observers have been included (2 × 80 = 160 measurements in total), and to evaluate intra-observer variability, two replicas per individual have been included (2 × 80 = 160 measurements in total).
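As an illustration of Eq. (1), the following minimal sketch computes Lin's concordance correlation coefficient for two series of bone-age readings. The function name `lin_ccc`, the use of variances with divisor n and the example readings are assumptions made for illustration; they are not taken from the study data.

```python
from statistics import fmean


def lin_ccc(y1, y2):
    """Lin's concordance correlation coefficient, written as in Eq. (1)."""
    if len(y1) != len(y2) or len(y1) < 2:
        raise ValueError("need two equally sized series with at least two values")
    n = len(y1)
    m1, m2 = fmean(y1), fmean(y2)
    # Variances with divisor n (assumption: Eq. (1) does not state the divisor).
    v1 = sum((a - m1) ** 2 for a in y1) / n
    v2 = sum((b - m2) ** 2 for b in y2) / n
    # Mean squared difference between paired readings.
    msd = sum((a - b) ** 2 for a, b in zip(y1, y2)) / n
    return 1.0 - msd / (v1 + v2 + (m1 - m2) ** 2)


# Hypothetical bone-age readings (years) from two observers.
observer_1 = [2.0, 5.5, 9.0, 13.0, 17.0]
observer_2 = [2.0, 5.0, 9.5, 13.0, 17.5]
rho_c = lin_ccc(observer_1, observer_2)
print(f"rho_c = {rho_c:.3f}")
```

Identical series give a coefficient of 1, and any shift in mean or scale between the two series lowers it, which is why the coefficient is preferred to a plain correlation when agreement is of interest.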

A systematic error and a random error are associated with the estimation of chronological age using the GPA method. The systematic error, which has been highlighted in many populations, is explained by the differences between the study populations and the benchmark (American 1931–1942). The random error has been associated with the growth and differential development of the individuals belonging to the same population, age class and sex. Thus, by age class and sex, the relation between the estimated and real chronological ages is formally expressed in the Eq. (2):

$$ {\mathrm{CA}}_{\mathrm{i}/\mathrm{j}}\left(\mathrm{GPA}\right)={\mathrm{CA}}_{\mathrm{i}/\mathrm{j}}\left(\mathrm{real}\right)+{\mathrm{e}}_{\mathrm{i}}\left({\mu}_{\mathrm{j}},{\sigma}_{\mathrm{j}}\right)={\mu}_{\mathrm{j}}+{\mathrm{CA}}_{\mathrm{i}/\mathrm{j}}\left(\mathrm{real}\right)+{\mathrm{e}}_{\mathrm{i}}\left(0,{\sigma}_{\mathrm{j}}\right) $$
(2)

where \( \mathrm{CA}_{i/j}(\mathrm{GPA}) \) and \( \mathrm{CA}_{i/j}(\mathrm{real}) \) are the estimated and real chronological ages corresponding to individual \( i \), and \( e_i(\mu_j,\sigma_j) \) is the error in the estimation of the age corresponding to the individuals assigned to chronological age \( j \). In this context, \( \mu_j \) and \( \sigma_j \) are associated with the systematic and random errors, and the estimation of the systematic error is reduced to:

$$ {\widehat{\mu}}_j=\overline{{\mathrm{CA}}_{i/j}\left(\mathrm{GPA}\right)}-\overline{{\mathrm{CA}}_{i/j}\left(\mathrm{real}\right)} $$
(3)

Therefore, for estimated chronological age and sex, the mean and standard deviation of the real chronological age, the systematic error and the maximum random error with probability of 0.95 (1.96 s) have been calculated, and the mean difference between real chronological and estimated ages has been contrasted (Student’s t test).
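The calculation described above can be sketched as follows for a single age class and sex. This is a minimal sketch under stated assumptions: the function name, the one-sample t statistic computed on the individual differences and the example ages are hypothetical and only illustrate Eq. (3) and the 1.96 s bound.

```python
from math import sqrt
from statistics import fmean, stdev


def bias_and_random_error(ca_gpa, ca_real):
    """Return (bias, s, max_error_95, t) for one estimated-age class and sex."""
    diffs = [g - r for g, r in zip(ca_gpa, ca_real)]
    n = len(diffs)
    bias = fmean(diffs)        # Eq. (3): mean(CA_GPA) - mean(CA_real)
    s = stdev(diffs)           # sample standard deviation of the errors
    max_error_95 = 1.96 * s    # maximum random error with probability 0.95
    # Assumption: the contrast is a one-sample t test on the differences
    # (n - 1 degrees of freedom); the source only names Student's t test.
    t = bias / (s / sqrt(n))
    return bias, s, max_error_95, t


# Hypothetical class: five individuals all assigned a bone age of 10 years.
estimated = [10.0] * 5
real = [9.2, 10.4, 10.9, 9.8, 10.7]
bias, s, max_err, t = bias_and_random_error(estimated, real)
adjustment_months = -bias * 12   # correction applied in the direction opposite to the bias
```

The adjustment has the opposite sign to the estimated bias, so a class in which the atlas underestimates age receives a positive correction.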

To predict maturity (18 years), the classical criterion of classification into two categories has been used, and to separate the individuals of doubtful classification, a classification criterion into three categories has been introduced. The two category criterion consists of assigning the “maturity” category when the chronological age is equal to or more than 18 years and assigning the “minor” classification when the chronological age is equal to or less than 17 years. The three category classification criterion consists of assigning the “maturity” category when the chronological age is equal to or more than 19 years, assigning the “minor” category when the chronological age is equal to or less than 17 years and assigning the “undetermined” category when the chronological age is equal to 18 years. For both methods, the sensitivity (SEN = TP/(TP + FN), TP: true positive, FN: false negative), specificity (SPE = TN/(FP + TN), TN: true negative, FP: false positive) and positive predictive value (PPV = TP/(TP + FP)) have been determined, and the Wilson score intervals have been obtained [43, 44].
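The two decision rules and the Wilson score interval can be sketched as below. The function names are hypothetical, and the handling of estimated ages strictly between 17 and 18 or between 18 and 19 years is an assumption, since the rules above are stated for whole-year atlas standards. The interval uses the standard Wilson formula with z = 1.96.

```python
from math import sqrt


def classify_two(age_est):
    """Two categories: 18 years or more is 'mature', otherwise 'minor'."""
    return "mature" if age_est >= 18 else "minor"


def classify_three(age_est):
    """Three categories: 19 or more 'mature', 17 or less 'minor', else 'undetermined'."""
    if age_est >= 19:
        return "mature"
    if age_est <= 17:
        return "minor"
    return "undetermined"


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion successes / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Example: interval for a hypothetical sensitivity of 46 correct out of 48.
low, high = wilson_interval(46, 48)
```

Unlike the normal approximation, the Wilson interval stays within [0, 1] even when the observed proportion is 0 or 1, which matters for proportions close to 1 such as those reported for the three-category rule.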

Results

The reliability of the measurement when the GPA method is applied presents two components: reproducibility (inter-observer variability) and repeatability (intra-observer variability). Between observers, Lin’s concordance correlation coefficient has been estimated at \( \rho_{c,G} = 0.99 \) for girls and \( \rho_{c,B} = 0.99 \) for boys. The mean difference between observers has little relevance: 0.075 years (equivalent to 27 days), with a standard deviation of 0.27 years (equivalent to 3.2 months), for girls; and 0.05 years (equivalent to 18 days), with a standard deviation of 0.22 years (equivalent to 2.6 months), for boys. Between measurements carried out by a single observer, Lin’s concordance correlation coefficient has been estimated at \( \rho_{c,G} = 0.99 \) for girls and \( \rho_{c,B} = 0.99 \) for boys. The mean difference between replicas has little relevance: 0.03 years (equivalent to 11 days), with a standard deviation of 0.16 years (equivalent to 1.9 months), for girls; and 0.01 years (equivalent to 4 days), with a standard deviation of 0.04 years (equivalent to 0.5 months), for boys (Fig. 1).

Fig. 1 Contrast in the measurement of bone age: a, b with two observers (inter-observer) and c, d with two replicas of the same observer (intra-observer)

For the sample obtained in the Spanish population, the systematic error in the prediction of chronological age significantly affects both sexes (girls and boys). The systematic errors are not uniform and vary between − 0.81 and + 0.92 years for girls (equivalent to − 9.72 and + 11.04 months) and between − 1.15 and + 0.34 years for boys (equivalent to − 13.8 and + 4.08 months). In girls, the greatest differences by default (estimated age less than the one provided for in the atlas) can be clearly seen in the ages between 2.5 and 4.17 years (2 years and 6 months to 4 years and 2 months). On the other hand, the greatest differences by excess were observed from 10 years upwards, especially in the ages between 14 and 16 years. In boys, the differences by default are of greater magnitude and particularly affect individuals with chronological ages between 3 and 6 years and at 14 years. To offset the systematic errors that are produced when applying the atlas to the Spanish population, the adjustments, with sign opposite to the bias, vary between − 11 and + 10 months in girls and between − 4 and + 14 months in boys (adjustment CR-age). For age predictions of 1 year or more, the random error (maximum error with probability 0.95) affects both sexes in most of the age classes and varies between 0.82 and 2.5 years in girls and between 0.72 and 3.52 years in boys. For age predictions of less than 1 year, the random error is much smaller in absolute value but is very high in relative value (Tables 1 and 2). In this context, the contrast between the bone age and the chronological ages is graphically represented (Fig. 2).

Table 1 Estimation of the chronological age by applying the GPA method to girls: description of sample (size, mean ± standard error of the difference between the real and estimated chronological ages); systematic bias and random error in the estimation of the chronological age (probability 0.95, the most important systematic and random errors are highlighted in italics); contrast of measurements by estimated age class (*denotes significant differences in the measurement); and adjustment in the estimation of the chronological age (months)
Table 2 Estimation of the chronological age by applying the GPA method to boys: description of sample (size, mean ± standard error of the difference between the real and estimated chronological ages); systematic bias and random error in the estimation of the chronological age (probability 0.95, the most important systematic and random errors are highlighted in italics); contrast of measurements by estimated age class (*denotes significant differences in the measurement); and adjustment in the estimation of the chronological age (months)
Fig. 2 Contrast of bone ages and chronological ages. a Girls and b boys

For the sample of boys with chronological age between 16 and 19 years, the skeletal ages categorised into three groups (≤ 17 years, 18 years, > 18 years) have been distributed as follows: 29, 1 and 0 for boys with chronological age 16 years; 14, 16 and 0 for boys with chronological age 17 years; 2, 11 and 17 for boys with chronological age 18 years; and 0, 1 and 29 for boys with chronological age 19 years. Therefore, sensitivity in predicting legal age (SEN = TP/(TP + FN)) is similar when using the two classification criteria: 0.97 = (11 + 1 + 17 + 29)/((11 + 1 + 17 + 29) + (2 + 0)) in two categories (“legal age” and “minor”); and 0.96 = (17 + 29)/((17 + 29) + (2 + 0)) in three categories (“legal age”, “minor” and “not decided”). On the other hand, specificity (SPE = TN/(FP + TN)) and the positive predictive value (PPV = TP/(TP + FP)) are much higher when using the decision rule in three categories: SPE = 0.72 = (29 + 14)/((1 + 16 + 0 + 0) + (29 + 14)) and PPV = 0.77 = 58/75 in two categories; and SPE = 1.00 = 43/43 and PPV = 1.00 = 46/46 in three categories. The decision rule in three categories assigns 24.17% of the individuals between 16 and 19 years to the “not decided” category (Table 3).
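The figures in this paragraph can be reproduced from the four count triplets listed above, taken in order of chronological age from 16 to 19 years. The sketch below is only a check of the arithmetic; the variable and function names are hypothetical.

```python
# counts[chronological age] = (bone age <= 17, bone age 18, bone age > 18)
counts = {16: (29, 1, 0), 17: (14, 16, 0), 18: (2, 11, 17), 19: (0, 1, 29)}


def metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and positive predictive value."""
    return {"SEN": tp / (tp + fn), "SPE": tn / (fp + tn), "PPV": tp / (tp + fp)}


adults = [counts[18], counts[19]]   # chronological age of 18 years or more
minors = [counts[16], counts[17]]   # chronological age under 18 years

# Two categories: a bone age of 18 years or more is classified as legal age.
two = metrics(
    tp=sum(c[1] + c[2] for c in adults),
    fn=sum(c[0] for c in adults),
    fp=sum(c[1] + c[2] for c in minors),
    tn=sum(c[0] for c in minors),
)

# Three categories: individuals with a bone age of 18 years are set aside.
three = metrics(
    tp=sum(c[2] for c in adults),
    fn=sum(c[0] for c in adults),
    fp=sum(c[2] for c in minors),
    tn=sum(c[0] for c in minors),
)

# Share of individuals left in the "not decided" category.
undecided = sum(c[1] for c in counts.values()) / sum(sum(c) for c in counts.values())

print({k: round(v, 2) for k, v in two.items()})
print({k: round(v, 2) for k, v in three.items()})
print(f"{undecided:.2%}")
```

The gain in specificity comes entirely from moving the 17 minors with a bone age of 18 years out of the false-positive cell and into the undecided group.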

Table 3 Classification of boys in two and three categories. TP, true positive; FN, false negative; FP, false positive; TN, true negative; + predictive value, positive predictive value

Discussion

The results obtained in the inter-observer repeatability study (\( \rho_c = 0.99 \), 0.075 ± 0.53 years for girls and \( \rho_c = 0.99 \), 0.05 ± 0.43 years for boys) highlighted the robustness of the GPA method when evaluating bone age. The mean difference and standard deviation are concordant with the ones described in previous studies: [45], in a sample of 47 individuals from Central Europe (paediatric radiologist); [46], in a multiethnic sample of 159 individuals residing in Denmark (Asia, Central and Eastern Europe, Middle East, Sub-Saharan Africa and Northern Africa); and [47], in a multiethnic sample of 2614 individuals residing in France. The mean differences and standard deviations observed are similar to those obtained by Groell et al. [45] and Lynnerup et al. [46] and are less than the ones obtained by Chaumoitre [47]. On the other hand, the intra-observer variability is much lower and is similar to that obtained in the above-mentioned studies. In these contexts, the border effect explains the classification errors to a great extent. In particular, no differences in classification are observed when the individuals present the characteristics of a single age category, while the greatest probability of error (doubtful classification) is concentrated in the individuals that present characteristics of two categories [48].

The regularity in bone development of the individuals justifies the relation between chronological age and the level of development and bodily maturity of the individuals (biological age). Taking the American population described in the Atlas as a benchmark, the systematic error (inter-difference) and the application of adjustment factors have been described for a large number of populations: African-American and European-American [49], Caucasian [50], Thai [51], Moroccan [52], Turkish [53], American [54], Italian [55], South African [56], Hindu [57], Scottish [58], Pakistani [59] and French [60], amongst others. In the Spanish population, the systematic error in the prediction of chronological age significantly affects both sexes: \( \overline{X}=0.01 \) years (− 0.81, + 0.92) for girls and \( \overline{X}=-0.33 \) years (− 1.15, + 0.34) for boys. The biases obtained for the Spanish population match those obtained in the European area: similar to those reported by Groell et al. [45], van Rijn et al. [50], Tisé et al. [55] and Hackman and Black [58]; and less than those reported by Zabet et al. [60], with CA < SA. As regards the American population, the systematic differences obtained are similar to the ones reported by Mora et al. [49] and less than those reported by Calfee et al. [54], with CA > SA. Finally, in relation to other geographical areas, the biases are similar to those reported by Chiang et al. [51], Büken et al. [53] and Dembetembe et al. [56] and are less than those reported by Garamendi et al. [52], Patil et al. [57] and Manzoor [59]. Genetic, environmental and socio-economic factors, along with poverty, have been related to the bone development of individuals and to delays in the level of maturity in pre-puberty (inter- and intra-population) [41, 61,62,63,64].
To offset the time imbalance in the inter-population maturation rate (Spanish and benchmark GPA), the correction factors of the atlas have been obtained (in the opposing direction to the systematic error).

In the same class of age and sex, the standard deviation in the estimation of the bone age reflects the heterogeneity in the pace of individual growth. For the Spanish population, the inherent variability in bone age (\( \overline{s}=0.84 \) years, 0.41–1.25 in girls, and \( \overline{s}=0.80 \) years, 0.36–1.76 in boys) also matches the results obtained in the European area: similar to the ones reported by Groell et al. [45], van Rijn et al. [50] and Tisé et al. [55] and less than the ones reported by Hackman and Black [58] and Zabet et al. [60]. As regards the American benchmark population, the inherent variability is higher than the benchmark [17], it is similar to the one observed by Mora et al. [49] and is less than the one reported by Calfee et al. [54]. As regards other territorial areas, the observed variability is less than the one reported by Chiang et al. [51], Büken et al. [53] and Manzoor [59], is similar to the one reported by Garamendi et al. [52] and is higher than the one observed by Patil et al. [57]. Besides the above-mentioned factors (genetic, environmental and socio-economic), the non-uniform differences in the rhythm of growth within each category (intra-population variability) are also attributable to the synergistic action of the growth hormone and sexual steroids (pre-pubertal depression: moderate variability, puberty: high variability and pubertal deceleration: moderate variability). In the forensic sector, this intra-population variability, which is high in all populations, turns into uncertainty/error when the chronological age is estimated (with a probability of 0.95, the maximum error is quantified as ± 1.96 s). Therefore, taking the mean variability observed in the Spanish population (s ≈ 0.82 years) as a reference, the results obtained highlight the limitations in predicting chronological age when only the bone information from the hand and wrist is used (on average, the maximum error is ± 1.96·0.82 = ±1.6 years).

Different alternatives have been described in the literature when using skeletal maturity scores for the hand and wrist bones. Using European standards and descendants of Europeans, the method of Tanner and Whitehouse 3 (TW3-RUS) is the most widely used quantitative method to reduce subjectivity in the recognition of patterns and to minimise the inter-observer and intra-observer variabilities [18]. Alternatively, methods for local use have also been described: for the French population [65] and for the Turkish population [66]. Unfortunately, the results obtained have not been overly encouraging. With the TW3 method, the intra-population variabilities reported and the resulting maximum uncertainty/error committed in the prediction of the chronological age do not significantly differ from the errors committed when the GPA method is applied [67,68,69]. On the other hand, technological advances in image processing and analysis have favoured the development of computational tools that provide automatic reading of results [70, 71]. The automation acts on the bias and on the inter- and intra-observer variabilities, but not on the intra-population variability (which is attributed to differential growth). The bias can be offset with the obtained adjustments, and by eliminating the effect of the observers, the intra-population variability can be estimated by difference of variances. For the Spanish population and by extension to many more populations, the intra-population variance without the effect of observers has been estimated at \( 0.82^2 - 0.25^2 = 0.61 \), and therefore, the error/uncertainty in the prediction of chronological age is not significantly reduced (\( \pm 1.96\cdot 0.61^{1/2} = \pm 1.53 \) years). This rationale is also backed up by experimental results reported in the literature.
Thus, for example, when using the BoneXpert automatic method on a sample of 179 girls (3–15 years) and 226 boys (3–17 years), the bias was reduced after calibration and the intra-population variance was not reduced (\( s_{\mathrm{GPA}} = 0.84 \) and \( s_{\mathrm{BX}} = 1.23 \) in girls, \( s_{\mathrm{GPA}} = 0.87 \) and \( s_{\mathrm{BX}} = 1.05 \) in boys) [72]. The reported increase in variance may be attributable to chance or errors in image recognition.
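The difference-of-variances argument in the preceding paragraph can be checked numerically. In this short sketch, the total standard deviation of 0.82 years and the observer standard deviation of 0.25 years are the values quoted in the text; the variable names are hypothetical.

```python
from math import sqrt

s_total = 0.82      # mean standard deviation observed in the Spanish sample (years)
s_observer = 0.25   # standard deviation attributed to the observers (years)

# Intra-population variance once the observer effect is removed.
var_intra = s_total ** 2 - s_observer ** 2

# Maximum error with probability 0.95, with and without the observer effect.
max_error_total = 1.96 * s_total
max_error_intra = 1.96 * sqrt(var_intra)

print(round(var_intra, 2), round(max_error_total, 2), round(max_error_intra, 2))
```

Because the variances subtract rather than the standard deviations, removing an observer component of 0.25 years only lowers the 95% bound by a few hundredths of a year.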

To achieve greater reliability in predicting chronological age, international organisations have proposed adding additional information and treating it in multivariate form [5, 29,30,31,32,33,34,35]. The results obtained when classifying into the two categories of “minor” and “mature” are compatible with the results reported by Garamendi et al. [52] in a Moroccan population. To achieve greater reliability in prediction, we have proposed a sequential classification criterion based on decision trees. At the first level of hierarchy, we have suggested the GPA method, since it is regarded as physiologically more stable than dental eruption or the maturation of secondary sexual characteristics [73]. The implementation of the decision criterion in three categories has enabled the doubtful individuals to be separated into the category of “undetermined” and the remaining individuals to be satisfactorily classified in the categories of “mature” and “minor” (0.96 in sensitivity, 1 in specificity and 1 in predictive value). At the levels of subordinate hierarchy (second and third, if necessary), the classification focuses solely on the individuals classified in the “undetermined” category. To establish the classification category of the undetermined individuals in the subordinate levels, tests shall be carried out with other commonly used techniques (dental and morphological development along with that of other bones), and if the discriminatory capacity of such techniques is insufficient, the use of other more sophisticated techniques shall be explored (e.g. molecular osteology; histomorphology).
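The sequential procedure can be pictured as a chain of stages in which each stage either decides or passes the individual on. The sketch below is a hypothetical illustration: only the first stage follows the three-category rule described in this study, while `dental_stage`, the dictionary keys and the function names are placeholders, since no rule for the subordinate levels is specified here.

```python
from typing import Callable, Optional, Sequence

# A stage returns "mature", "minor" or None when it cannot decide.
Stage = Callable[[dict], Optional[str]]


def gpa_stage(person: dict) -> Optional[str]:
    """First level: three-category rule on the GPA bone age."""
    bone_age = person["bone_age_gpa"]
    if bone_age >= 19:
        return "mature"
    if bone_age <= 17:
        return "minor"
    return None  # bone age of 18 years: pass to the next level


def dental_stage(person: dict) -> Optional[str]:
    """Hypothetical second level: returns a precomputed decision, if any."""
    return person.get("dental_decision")


def classify_sequentially(person: dict, stages: Sequence[Stage]) -> str:
    """Apply the stages in order and stop at the first one that decides."""
    for stage in stages:
        decision = stage(person)
        if decision is not None:
            return decision
    return "undetermined"


result = classify_sequentially(
    {"bone_age_gpa": 18, "dental_decision": "minor"},
    [gpa_stage, dental_stage],
)
```

With this structure, additional techniques are applied only to the individuals left undecided by the previous level, which is where the procedure concentrates the extra resources.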

Conclusions and future directions

The systematic error (differences with regard to the benchmark population), the adjustment factors to minimise error and random error (basically attributed to differential growth of the individuals) have been obtained for age class and sex. The results obtained have enabled the error in predicting chronological age (with or without adjustment) to be delimited when it is unknown.

In accordance with the guidelines set forward by different international organisations (ABFO, IOFOS, AGFAD), we propose a sequential classification criterion to determine maturity. Using decision trees as a basis, the implementation of a numerical criterion in three categories, “undetermined” (doubtful individuals), “mature” and “minor”, has provided very satisfactory results in the Spanish population. The use of this system has resolved a considerable part of the problem, and attention (additional resources) is focused on the sub-group of doubtful individuals.

In this context, future actions are geared towards the following: (a) in other populations, validate the efficiency of the classification method in three categories (generalise the applicability of the method); (b) try out alternatives to the G&P atlas to reduce the number of individuals classified in the category “not decided” (minimise the human and economic cost); and (c) establish the most adequate strategy for dealing with “doubtful individuals” (completing the classification). The quantitative alternatives to the G&P atlas are geared towards reducing the inter- and intra-observer errors (subjectivity) and towards reducing the effect of resolution/discretisation (interpolation with scores). The application of these methods has contributed to progress, although not enough to significantly reduce the prediction errors. Despite this, it would be interesting to evaluate their possible effect on reducing the number of individuals classified as “doubtful”.