Introduction

Forensic practitioners often face challenging cases where they have to determine whether an individual is a minor or an adult [1, 2]. Human identification for humanitarian reasons in civil and criminal cases involving young people, cases of adoption, mass disaster scenarios, illegal trafficking, migration, unaccompanied minors seeking asylum, early marriage and child labour are just some cases [1, 2].

In this context, several specialists are often involved in providing a reliable age assessment for legal purposes and according to the standard principles of patient confidentiality and professional conduct. For this reason, multiple techniques from different forensic disciplines can be used in order to attain a result that is the closest to reality [3].

According to guidelines published by the International Study Group on Forensic Age Diagnostics (AGFAD: Arbeitsgemeinschaft für Forensische Altersdiagnostik) (https://www.medizin.uni-muenster.de/en/rechtsmedizin/schmeling/agfad/about/home), forensic age estimation in minors is a standardized process in which the radiographic study of the hand-wrist bones and the study of third molar development are often the most useful methods for determining whether an individual has attained the age of majority [3, 4]. In countries where ionizing radiation imaging techniques are forbidden for application in medico-legal cases, other techniques may include specialized modalities, such as computed tomography (CT) or magnetic resonance imaging (MRI) [3].

Several guidelines for assessing the age of suspected minors have been proposed, showing that estimating chronological age in living individuals is complex [5]. Age indicators can be applied, with different degrees of accuracy, throughout age groups [6]. Radiographs of the left hand can be used to analyse the form and size of bone elements and the degree of epiphyseal ossification in adolescents and young adults [3, 7]. One of this method’s limitations is that changes in the hand-wrist bones are not clear after the age of 14–16 years [3, 7].

In a later stage of development, third molar analysis is very important for age assessment between adolescence and adulthood because it is the only tooth still in development during that period. All teeth, except for third molars, finish their development between 12 and 14 years of age, and, in the age span of 15.7–23.3 years, the third molars represent the only teeth still growing. Therefore, assessing legal age can only be carried out by observing and measuring the third molar maturation process [8,9,10].

The third molar is characterized by high variability in formation and large-scale diversity in presence or absence [11, 12]. Sometimes, this tooth cannot be analysed because of its position or rotation. Some individuals do not develop third molars at all. In some cases, mature (stage H) third molars may appear as early as 15 years of age, whereas, in others, they may not have appeared at all, even at the age of 25 [13, 14]. The worldwide average of third molar agenesis is 22.6% [15]. The ratio of bilateral to unilateral third molar agenesis is also significantly higher [16]. In addition, pathologies such as unrestorable caries, non-treatable pulpal and/or periapical pathology, infection, internal or external resorption of the tooth or adjacent teeth, and disease of the follicle, including cysts or tumours, are all well-defined criteria for lower third molar removal [17, 18].

Differences in the development rate of third molars between dental arches and left–right symmetry have also been highlighted in previous studies. Kasper et al. (2009) could not find evidence for left–right symmetry in third molar development [19]. Mincer et al. (1993) reported a slight but random difference between the development of third molars of the same arch [20]. Conversely, Levesque et al. [21], Olze et al. [22], De Angelis et al. [23], and Thevissen et al. [24] reported the absence of a significant difference between the development of left and right third molars.

In 2008, Cameriere et al. [25] developed a new dental method for assessing adult age. This method is based on the relationship between age and the normalised measures of the third molar’s open apices, known as the third molar maturity index (I3M).

This technique records continuous data and is based on ratios between measurements of apical pulp widths and tooth lengths. A cut-off value of 0.08 was determined to assign an individual to a juvenile or adult age [25]. Since I3M was introduced, several authors have tested and validated this method in various populations [26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56]. This method evaluates the left lower third molar, and in cases where it is missing, the right one is examined and measured. Cameriere et al. [25] demonstrated that the I3M discriminates well between adults (18 years and older) and minors (younger than 18 years), both in males and females, and a recent systematic review and meta-analysis reported high accuracy for estimating adulthood for forensic purposes [1].

This study aimed to validate the cut-off value of I3M (I3M < 0.08 suggests that the subject is older than 18 years; I3M ≥ 0.08 suggests that the subject is younger than 18 years) [25] on both the right (I3MR) and left (I3ML) lower third molars and to estimate the risk of error in those cases in which only one of them is available (e.g. hypodontia, unilateral lower third molar agenesis, early extraction or forced removal of the tooth), using a large multi-ethnic sample.

Materials and methods

Sample

A total of 10,181 orthopantomograms (OPGs) from four continents were evaluated: 1631 subjects from Africa; 861 from North and South America; 3919 from Asia; and 3770 from Europe. The sample distribution, according to sex and age, is shown in Table 1. The following I3M range categories were considered to analyse the relationship between the index and the real age (years) of each subject: (0, 0.04); (0.04, 0.08); (0.08, 0.3); (0.3, 0.5); (0.5, 0.7); (0.7, 0.9); (0.9, 1.2); (1.2, 1.5); (1.5, 1.8); (1.8, 2.1); (2.1, 2.4); (2.4, 2.7).

Table 1 Sample distribution according to sex and age

These categories were chosen according to the available published research on the I3M [1]. Their primary function is to graphically represent how the third molar root apices gradually close as a subject's age increases. According to the literature, the narrower the ranges, the more precise the graphic display of the I3M is in relation to the continuous phenomenon of root apex closure [1].
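As an illustration only, assigning an I3M value to one of the range categories listed above could be sketched in Python as follows. The study itself reports using R and SPSS, so the function name `i3m_category` is hypothetical, and the treatment of boundary values (half-open ranges, lower edge included) is an assumption chosen to agree with the rule that I3M ≥ 0.08 suggests a minor.

```python
from bisect import bisect_right

# Edges of the I3M range categories listed in the text.
EDGES = [0.0, 0.04, 0.08, 0.3, 0.5, 0.7, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7]


def i3m_category(i3m):
    """Return the (lower, upper) range category that contains an I3M value.

    Assumption (not stated in the source): ranges are half-open,
    [lower, upper), so a value of exactly 0.08 falls in (0.08, 0.3).
    """
    if not EDGES[0] <= i3m < EDGES[-1]:
        raise ValueError(f"I3M value {i3m} is outside the tabulated ranges")
    # bisect_right returns the index of the first edge strictly greater
    # than i3m, i.e. the upper edge of the category.
    idx = bisect_right(EDGES, i3m)
    return (EDGES[idx - 1], EDGES[idx])
```

Grouping subjects by the returned tuple gives the per-category age distributions that the boxplots and Table 2 summarise.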

The sample selection criteria were as follows: clear OPGs and all permanent teeth present, including left and right third molars. Subjects with any visible dental or bone pathology (e.g. large carious lesions or endodontic treatments), children with any systemic diseases or endocrine anomalies (e.g. dysmorphology, abnormally short roots) and subjects with previous root canal treatment in the lower permanent teeth were excluded from the study. Heavily rotated and impacted third molars with no visible roots were also excluded.

No identification data were collected, and images were saved with high resolution in the JPEG file format and automatically anonymised before the selection process. All data were recorded in an Excel file with the following columns: continent, subject’s identification number, sex, date of birth and date of OPG.

Intra- and inter-observer variabilities

Three forensic odontologists with varying levels of experience performed the observer error analysis. Repeated observations from the first author (NA) were used to determine intra-observer agreement, while inter-observer analysis was based on comparisons with those of two other observers (IG and LGJ). For this purpose, 100 OPGs were randomly selected 1 month after the initial scoring to calculate the percentage of agreement for both the intra- and inter-observer tests. During the process, the observers were blinded to the chronological age and sex of each subject. The intraclass correlation coefficient (ICC) was applied to calculate intra- and inter-observer variability.

Statistical analysis

The real age was classified as a binary variable, divided into adults (18.00 years or older) and minors (13–17.99 years old). The normality of the distribution of both the I3ML and I3MR was evaluated by the Kolmogorov–Smirnov test, which showed that the data were not normally distributed (p < 0.05). The I3ML and I3MR were compared within each age category (adult or minor) and each continent by the Wilcoxon Signed Rank test, in order to evaluate possible differences between the measurements.

Each subject was classified as minor or adult according to the application of a cut-off developed by Cameriere et al. [25] in both third molars (I3ML and I3MR): I3ML and I3MR values < 0.08 classified a subject as an adult, and values ≥ 0.08 classified a subject as a minor. Then, the estimated classifications for both I3ML and I3MR were compared with the real classification of each individual’s age, and the diagnostic values of prediction were observed.
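The classification step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (the analyses were run in R and SPSS): the function names are hypothetical, "adult" is taken as the positive class, and each record is assumed to be a pair of chronological age in years and the I3M of one side.

```python
CUT_OFF = 0.08  # I3M cut-off proposed by Cameriere et al. [25]
LEGAL_AGE = 18.0


def classify(i3m):
    """Estimated class: adult if I3M < 0.08, otherwise minor."""
    return "adult" if i3m < CUT_OFF else "minor"


def true_class(age_years):
    """Real class: adult if 18.00 years or older, otherwise minor."""
    return "adult" if age_years >= LEGAL_AGE else "minor"


def confusion_counts(records):
    """Build the 2x2 contingency table for one side (I3ML or I3MR).

    records: iterable of (age_years, i3m) pairs (hypothetical layout).
    Adult is treated as the positive class, so FP counts minors
    classified as adults and FN counts adults classified as minors.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for age_years, i3m in records:
        predicted, actual = classify(i3m), true_class(age_years)
        if predicted == "adult":
            counts["TP" if actual == "adult" else "FP"] += 1
        else:
            counts["FN" if actual == "adult" else "TN"] += 1
    return counts
```

Running `confusion_counts` once on the I3ML values and once on the I3MR values yields the two tables from which the side-specific diagnostic quantities are derived.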

Specific descriptive statistics (also known as quantities) were calculated according to sex and third molar: contingency tables (confusion matrices) were generated, and the respective sensitivity and specificity (Se and Sp, including 95% CI), positive and negative predictive values (PV+ and PV−, including 95% CI), positive and negative likelihood ratios (LR+ and LR−, including 95% CI) and accuracy (Acc) of estimates were determined. The LR+ is the true positive rate divided by the false positive rate (sensitivity/(1 − specificity)). It shows how much to increase the probability of being an adult, given a positive test result. The LR− is the false negative rate divided by the true negative rate ((1 − sensitivity)/specificity) and indicates how much to decrease the probability of being an adult, given a negative test result [57]. The main advantage of LRs (over other measures of diagnostic accuracy, such as sensitivity and specificity) is that they can be used to quickly compare different diagnostic strategies and thus refine judgment and decision in daily forensic practice.
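For readers who wish to reproduce these quantities from a contingency table, a minimal Python sketch is given below. It follows the definitions stated above, with adult as the positive class; the function name is hypothetical and the 95% confidence intervals reported in the study are not computed here.

```python
def diagnostic_quantities(tp, fp, fn, tn):
    """Point estimates derived from a 2x2 contingency table.

    tp: adults classified as adults      fp: minors classified as adults
    fn: adults classified as minors      tn: minors classified as minors
    Raises ZeroDivisionError for degenerate tables (e.g. Sp equal to 1
    makes LR+ undefined); confidence intervals are not computed.
    """
    se = tp / (tp + fn)  # sensitivity: true positive rate
    sp = tn / (tn + fp)  # specificity: true negative rate
    return {
        "Se": se,
        "Sp": sp,
        "PV+": tp / (tp + fp),
        "PV-": tn / (tn + fn),
        "LR+": se / (1 - sp),    # sensitivity / (1 - specificity)
        "LR-": (1 - se) / sp,    # (1 - sensitivity) / specificity
        "Acc": (tp + tn) / (tp + fp + fn + tn),
    }
```

With hypothetical counts such as `diagnostic_quantities(tp=80, fp=5, fn=20, tn=95)`, the sensitivity is 0.80 and the specificity 0.95, which illustrates the pattern of Se lower than Sp discussed in the Results.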

All the analyses were performed using the R software (version 3.6.3, R Core Team, R Foundation for Statistical Computing, Vienna, Austria) and SPSS software (version 26.0, Armonk, IBM Corporation, NY, USA), adopting a significance level of 5%.

Results

Regarding the measurement process, the intra-observer agreement values were 0.993 (95% CI: 0.990–0.996) and 0.976 (95% CI: 0.964–0.984), for the L3M and R3M, respectively. The values for inter-observer agreement were 0.983 (95% CI: 0.976–0.988) and 0.974 (95% CI: 0.959–0.983), for the L3M and R3M, respectively.

As shown in Figs. 1, 2, 3 and 4, I3M values gradually decreased as age (years) increased across all the age groups in both molars according to each sex.

The distribution of the sample according to sex and measurements of I3ML and I3MR is described in Table 2. Lower third molar mineralisation varied according to sex and occurred earlier in males than in females for both I3ML and I3MR, most evidently in the I3M ranges 0–0.04, 0.04–0.08, 0.08–0.3, 0.3–0.5, 0.5–0.7, and 0.7–0.9 (Table 2).

Fig. 1 Boxplots of the relationship between chronological age and I3ML in females. The horizontal red dotted line is at 18 years of age

Fig. 2 Boxplots of the relationship between chronological age and I3ML in males. The horizontal red dotted line is at 18 years of age

Fig. 3 Boxplots of the relationship between chronological age and I3MR in females. The horizontal red dotted line is at 18 years of age

Fig. 4 Boxplots of the relationship between chronological age and I3MR in males. The horizontal red dotted line is at 18 years of age

Table 2 Descriptive statistics of age distribution for each I3M category according to sex and age

Table 3 shows the sample’s descriptive statistics according to each continent: the I3ML and I3MR were compared within the categories of “adult” and “minor”. The Wilcoxon Signed Rank Test for left and right asymmetry found no significant difference between the two sides, supporting left–right symmetrical third molar development, except for minors in the African group, and for adults in the Asian and European ones.

Table 3 Sample characterisation according to each continent and age categories of adult and minor for both I3ML and I3MR

Regarding the Acc, values of 80% or more were obtained when the I3ML and I3MR were used, except in the Asian sample (Table 4). The Acc values were similar in the African, American and European samples according to the I3ML (around 83%). Meanwhile, American and European samples had similar Acc if the I3MR was used (around 85%) and a slightly better Acc than that obtained for the African sample (82.6%). A lower Acc was achieved in the Asian sample when both the I3ML and the I3MR were considered (73%). Concerning the Se, all the values obtained from both left and right third molars were smaller than the Sp values. PV+ and PV− showed different values, whereas the high values for LR+ (greater than 1) and the low ones observed for LR− (between 0 and 1) indicated greater discriminating ability.

Table 4 Quantities and predictive values for classification of subjects based on I3ML and I3MR measurements

Tables 4 and 5 show the pooled data according to the continent and both lower third molars, and when the I3M was considered for prediction, respectively. According to the quantity statistics derived from the contingency table, it is evident that both classifications demonstrated high values of Sp and PV− in comparison with Se and PV+, and high values of LR+ and low values of LR−.

Table 5 Quantities derived from the contingency table based on I3MR measurements, adopting the I3ML as gold standard for age prediction

In Table 6, the number of subjects correctly classified is indicated when one of the third molars classified them differently from their real age.

Table 6 Real classification of subjects in case of wrong evaluation obtained by using one of the third molars

Discussion

Age estimation of a living person is a complex multidisciplinary process due to the important humanitarian, civil and criminal implications that it often entails [1, 7]. Depending on the age threshold under scrutiny, selecting the most appropriate age indicator for accuracy and availability is necessary.

A specific challenge in forensic age estimation is assessing the age of majority of asylum seekers lacking valid information relative to their identity. In this ethically sensitive scenario, a low number of false positives (minors classified as adults) has higher priority compared with a low number of false negatives (adults classified as minors) [58, 59].

As mentioned before, different age indicators have different applicability throughout age groups, and the third molar is an advantageous indicator in the transition between adolescence and adulthood. Therefore, its use is fundamental for estimating age during these phases and, particularly, to accurately distinguish an adult subject from a minor [8,9,10, 36]. While the application of the third molar for age estimation raises several issues, with some authors vehemently opposing its use for both ethical and methodological reasons, the progress observed in the search for more accurate and reliable methods is undeniable [57, 60].

The I3M method, the focus of this investigation, is a distinct result of these efforts, reinforced by its validation in different populations. Nonetheless, developmental variability still raises issues, for example, regarding differences between sexes, left/right asymmetry and possible lack of this tooth. These issues need to be addressed in order to improve scientific interpretation of discrepant results found in a diachrony of investigations. Thus, this study aims to evaluate I3M between left and right third molars using a multi-ethnic sample and the impact of unilateral absence of the third molar for assessing age of majority.

According to the literature, this is the first dental study targeting both left and right third molar measurements on panoramic radiographs of subjects from different continents.

The results confirmed that the lower third molar’s mineralisation proceeds in a directly proportional manner with increasing age. The rate of development varies according to sex, with a longer delay in females than males for both I3ML and I3MR by a mean of 0.3 years between stages 0 and 0.9. Willershausen et al. [61] obtained similar results by analysing a multi-ethnic sample of 1202 panoramic x-rays and demonstrated that third molar roots generally develop at a faster pace in the maxilla than in the mandible, with delayed development of both the upper and lower third molars in girls as compared to boys by a mean of 0.7 years. Sex differences in third molar development have been reported with contradictory results in studies of samples from different origins. In separate research studies, Deitos et al. [49], Cameriere et al. [51], AlQahtani et al. [54], and Kumagai et al. [31] reported earlier third molar mineralisation in males of different geographic origins. Nevertheless, while other works indicated no statistically significant differences between groups, the I3M cut-off value performance was still assessed for males and females separately [42, 45, 47]. The sex differences in third molar maturation are often attributed to a reverse sexual dimorphism, explained by the fact that these rates contradict the female tendency of presenting a faster maturation, thus the need for sex-specific standards [54, 62]. Nonetheless, it is argued that the absence of sex differences on maturation rates of the third molar found in some research studies may be due to, in a later stage of development, males and females acquiring a similar developmental rhythm [48].

This study found no significant differences in the ability to discriminate between adults and minors using the left or the right I3M separately for each continent. This fact confirms the hypothesis that the development is generally symmetrical for both sides, according to the results obtained in previous research [19, 61].

According to the Wilcoxon Signed Rank Test, some significant differences were observed in the Asian, European and African groups. Although these differences are not observed in the values of medians and values of first and third quartiles, they reflect those cases in which the lower third molars’ development does not occur in a similar way in the same age range of the same population. These results are in agreement with the concept of “fluctuating asymmetry” proposed by Bassed et al. [63]. They stated that when all of the identified asymmetric cases are assessed separately (i.e. minors from the Asian group and adults from the European and African ones), it is seen that asymmetric development is an issue on an individual basis. Therefore, in all these cases in which differences were observed, age estimations based on one of the third molars may not provide the best estimate for an identified subject.

Given the quantity statistics shown in Table 5, it was observed that the accuracy obtained from the assessment of I3M of both sides (> 80%) does not differ significantly from that obtained by analysing the individual sides: in the African, American and European samples, the accuracy values obtained by examining the I3ML separately (about 85%, 86% and 85%, respectively) are very similar to those obtained from the estimate made with the I3MR (about 84%, 88% and 86%, respectively), with a slightly better result for the I3MR of the American group. The most significant discrepancy is evident between the different continents rather than between the right and left sides. In fact, compared to the other continents, the Asian sample obtained a slightly lower accuracy value on both sides (about 75% for the I3ML and 74% for the I3MR).

It is not clear whether this reflects a true ethnic or geographic difference. This heterogeneity may be due to some study-specific factors (such as training and experience, and imaging settings) rather than to real ethnic differences [64].

When the I3M is compared with a staging method, the accuracy of the staging method is not much higher than that obtained in the current research. In Thevissen et al. [65], every available third molar was scored following the 10-point scoring system described by Gleiser and Hunt [66] and modified by Köhler et al. [67] within 9 country-specific populations. The percentage of correctly identified adults ranged between 72 and 93%, the percentage of correctly identified juveniles was between 33 and 87%, and the percentage of correctly identified juveniles and adults (all subjects) ranged between 71 and 85%. In a recent meta-analysis and systematic review [68], the diagnostic accuracy in the entire cohort was 71.3%. In the age range of 15–20.9, diagnostic accuracy was 64.6%.

However, when all quantities are taken into account as a set of connected variables, it is evident that the performed test discriminated very well between the two age categories (i.e., at least 18 years of age and younger than 18). Both classifications demonstrated high values of Sp and PV− in comparison with Se and PV+, and high values of LR+ and low values of LR−, indicating that the age estimation by the I3MR method is as accurate as that obtained by using the I3ML. This low sensitivity and high specificity for the age prediction threshold of 18 years is visible in all the geographic groups, thus indicating that this threshold favours a low number of minors misclassified as adults.

In those cases in which one of the lower third molars is absent, or not completely developed, it is possible to use the opposite side. In this work, the high values of LR+ for both I3MR and I3ML mean that a mature third molar is more likely in a subject of the older category than in one younger than 18 years. Conversely, the low values of LR− show that a negative test is less likely to be found in a subject aged at least 18 than in one in the younger category.

These findings are especially interesting since the problems due to the lack of one of the third molars are unique as regards the anatomical districts evaluable for age (such as hand, clavicle and knee). Third molar agenesis has been reported as the most frequently occurring dental alteration, but it is not the only one [15, 69, 70]. In the literature, the reported reasons for third molar absence include the risk of impaction associated with caries, pericoronitis, periodontal defects in the distal surface of second molars, odontogenic cysts and dental crowding [71,72,73,74,75]. Finally, besides different biological or iatrogenic causes, the lack of a third molar may also be due to the deliberate act of extracting this tooth in an attempt to avoid the routine procedures for assessing age in minors seeking asylum [2].

In case of agreement between the I3MR and I3ML results, the ability to correctly identify a minor subject is approximately 86% (R) to 88% (L) for the African continent, 89% (L) to 90% (R) for the American continent, 93% (L) to 94% (R) for the European continent and 92% (L) to 91% (R) for the Asian continent. By contrast, the ability to identify an adult subject is about 10% lower for both I3MR and I3ML for the American and African samples, 15% for the European sample and 30% for the Asian sample. This means that, in general, the I3M method shows high specificity and low sensitivity, with a higher risk of false negatives in identifying adults than minors. Concerning the LR+ and LR−, these tests showed very high LR+ and very low LR−, thus indicating greater discriminating ability. These values are independent of prevalence and relevant to express the probability of a diagnostic test result at the subject level: the larger the LR+, the more accurately a result below the cut-off identifies an adult. Furthermore, LR− values range from zero to 1, with values closer to zero representing a stronger likelihood that a result at or above the I3M cut-off accurately categorizes the subject as a minor.
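To make the subject-level interpretation of likelihood ratios concrete, the standard odds form of Bayes' theorem can be sketched in Python as below. This is an illustrative aid rather than part of the study's analysis: the function name and the numerical values used to exercise it are hypothetical, and the pre-test probability of adulthood is something the practitioner would have to justify for each case.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update the probability of adulthood with a likelihood ratio.

    Uses post-test odds = pre-test odds x LR. Pass LR+ after a positive
    result (I3M < 0.08) or LR- after a negative result (I3M >= 0.08).
    pre_test_probability must lie strictly between 0 and 1.
    """
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)
```

For a hypothetical pre-test probability of 0.5, an LR+ well above 1 moves the probability of adulthood towards 1, whereas an LR− close to zero moves it towards 0, which is why large LR+ and small LR− values indicate good discriminating ability.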

Bassed et al. [63] assessed the effect of left/right asymmetry in dental and skeletal elements through the analysis of CT scans, in which the third molar was scored according to the stages of Demirjian et al. [76]. This study found that, in terms of left/right asymmetry, 12.6% of individuals differed in development by one stage, but only 1% by two stages, in a sample of 570 individuals who had both lower third molars present [63]. Notwithstanding, the degree of asymmetry between left and right sides demonstrated that there was a good statistical agreement between contralateral sides, which supports this investigation’s findings. These authors also reported that left/right asymmetry had a higher impact on age estimation through the clavicle than through third molar analysis. Additionally, Bassed’s work [63] reported that asymmetry does not induce a significant bias when a sufficiently large sample is evaluated; however, with individual cases, its effect needs to be considered in forensic practice. Therefore, while our results show an agreement between I3MR and I3ML, it is necessary to evaluate its application when aiming to discriminate between adults and minors, thus offering an estimate which is as accurate and reliable as possible.

In fact, although the sample analysed in our study showed a high concordance between right and left, in 8% of the subjects, the evaluation of the I3ML is opposite to the I3MR, i.e. the evaluation of one side identifies an adult subject while the other identifies a minor. These cases of discrepancy are mainly represented by subjects who, in reality, are of an adult age. This finding invokes an interpretation of the scientific data which is very different from that normally given by the law, which states that in cases of doubt, the subject is considered a minor [77, 78]. However, with a greater likelihood of identifying an adult as a minor than the reverse, the minor’s rights are considerably safeguarded, lowering the possibility of producing ethically unacceptable errors in age assessment [59].

When these results are compared with those obtained by applying a qualitative method, outcomes may vary depending on the population under analysis [79]. However, the systematic reviews focusing on the “diagnostic performance” of mature third molars (apex closed) to distinguish between subjects younger or older than 18 years are still scarce. This hampers any reliable comparison between these two procedures [14].

The stage of mineralisation is classified according to various scales; the most commonly used is that of Demirjian et al. (A–H) [76]. On the one hand, it has been demonstrated that the staging techniques are slightly more accurate than those based on measurements and ratios [80]. On the other hand, a study [59] showed that, especially in those cases involving the possible criminal liability of the supposed minor (criminal proceedings), since the staging procedures may increase the number of false positives (ethically unacceptable errors), the forensic personnel have often considered the obtained error as significant.

Thevissen et al. [80] pointed out that ratios and measurements of third molars are less accurate age predictors than the stages of developing third molars and that measurements and ratios added no clinically relevant information to age prediction from third molar stage. In 2013, Thevissen et al. [65] compared third molar development registration techniques showing that the least accurate age predictions were achieved when the model based on the Cameriere’s registration technique was applied.

In general, the most serious issue of developmental stages is trying to assign the stage when the tooth in development has reached somewhere in-between the two available stages of development [45].

The I3M showed a more precise measure of the continuous process of maturation of the third molar, regardless of ethnicity [64]. This index allows the smallest visible open apex of this tooth to be recorded and provides an accurate tool to overcome the wide gap between Se and Sp for Demirjian’s stages G and H.

Cameriere et al. [25] demonstrated that, if the root apices of the third molar are closed (i.e. the third molar is at terminal grade H), then there is a high probability that the subject is at least 18 years of age. In addition, selecting a cut-off of 0.08 for the I3M allows avoiding an increase in the number of false positives when stage G of Demirjian’s technique is considered as a marker of adult age. Sharma et al. [81] demonstrated that, if stage 8 is selected as a predictor of adult age [82], it improves test sensitivity with respect to stage 9, but it evidently increases the number of false positives, which is considered an ethically unacceptable error in the judiciary system. If the I3M is used to estimate the legal adult age of 18 years, the same authors highlighted that it significantly increases test sensitivity with respect to stage 9. Furthermore, it minimises the number of false positive subjects. According to Liversidge and Marsden [83], when the Demirjian’s technique is applied, the probability of being at least 18, if the third molar is totally mature, changes from 0.75 to 0.98.

In summary, these qualitative techniques may be valuable age indicators, but considering the limitations of the last stage on adult age estimation (stage H) [76], and according to the high rate of false positives, they are not likely to be considered as useful as the continuous data in discriminating between minors and adults.

Limitations and future directions of this research

The main strength of this research was the possibility to gather reliable data from the global regions that are represented in these samples of ascertained ethnic backgrounds, from healthy subjects and known age. However, limitations were the relatively uneven age distribution of the sample and the difficulty in collecting data from certain geographical areas. In fact, since the OPGs were selected retrospectively from data obtained for dental care, a limited number of OPGs was collected in Australia and North America, where no further information was available from entire regions.

Concerning the current challenges and future perspectives, this research could be the foundation for a further study in which a combined left–right third molar approach is applied to a potentially larger dataset, using an appropriate statistical methodology to verify its usefulness in obtaining a better age assessment. Further studies can include and analyse all four third molars and compare the results with cases that consider one or both mandibular third molars.

Conclusion

According to the literature, this is the first work in which I3M is explored on both left and right lower third molars, utilizing a vast sample with origins in four different continents: Africa, America, Asia and Europe. The results presented here validated the use of both I3ML and I3MR in the process of assessing age in cases of legal inquiry on the 18-year-old threshold, since the accuracy attained for I3M on both right and left sides was high and did not differ significantly. This work indicates that there is a need to further explore data interpretation in their transition to legal and civil contexts.