Introduction

Estimated age is an important characteristic of an individual’s biological identity and should be differentiated from the chronological age. In a living individual, chronological age generally corresponds to calendar age, whereas the biological age or physiological age covers the growth period, which corresponds to the maturation process of different tissues and organs [1], and the aging phase of an individual [2].

Dental age is one of the indicators of maturity, like skeletal age, or changes in the secondary sexual or somatic characteristics. Dental age is used in clinical practice in orthodontics, pedodontics, pediatrics, and orthopedic surgery. It is also widely used in physical anthropology and forensic science, for the identification of victims of mass disasters, to solve criminal, civil, or administrative issues such as adoption, migration, child abuse, legal consent, asylum proceedings, marriage, social benefits, and age estimation in competitive sports [3,4,5,6,7].

In living individuals, dental age estimation based on dental development is applicable during the period from childhood to young adulthood, that is to say, over a period that extends until all teeth have completed their maturation process, which corresponds to a maximum dental age based on maturation. The estimation methods must therefore be as reliable (range and percentage of the confidence limits) and accurate (mean difference between dental and chronological age) as possible [3, 8,9,10].

Radiological methods for assessing dental age are based on the continuous process of dental mineralization and allow radio-morphological (e.g., Demirjian’s or atlas strategies) or radiometric (i.e., technical camera) observations [11, 12]. The number of tooth development stages described varies from 3 to 27 [13]. The challenge, therefore, lies in choosing a method that has enough stages to follow the continuous growth process as closely as possible and stages that are sufficiently differentiated to be easily discriminated by investigators [14]. Demirjian’s dental maturity method was the first to combine outline drawings, written descriptive criteria, radiographic examples of each stage, and rules of choice to decide on borderline stages [13]. Examples where dental age is either early or late in relation to chronological age have been documented when this method is applied to other populations [10]. Characteristics of the population sample (regional background, sexual dimorphism, chronological age distribution), cultural or environmental factors (e.g., socio-economic status, nutrition, dietary habits), endocrine factors, or even statistical methodology [3] may explain such differences. The dental maturity scoring system is nevertheless accepted worldwide for its ease of use and for its excellent inter- and intra-examiner consistency [10, 15]. Besides the initial use of only 7 permanent teeth in the statistical model (first mandibular incisor to second mandibular molar), the methodology was improved by adding other indicators, such as the third molars, to extend the identifiable age range up to young adults [16,17,18,19]. Whether the aim is to infer chronological age or to position an individual in relation to an age threshold, population-specific standards rather than a universal standard or methods developed on other populations need to be used [10]. 
At the present time, many studies assess the accuracy of a reference method on a population different from that used in the original method [18, 20,21,22,23,24,25,26,27,28,29,30,31,32]. Although these studies may reveal overestimation or underestimation of a method carried out on a sample of the reference population, they do not give a tool that is directly usable in forensic science to answer the questions of age estimation or classification. Depending on the populations studied, the DEM method has a tendency to overestimate dental age and can give average differences between chronological age and dental age ranging from − 2.82 to + 0.10 years for females and from − 3.04 to + 0.23 years for males [33]. Compared to the DEM method, the WILL method overestimates to a lesser extent the age of the children [34].

Machine learning (ML) algorithms are part of artificial intelligence that may represent a powerful tool to develop more accurate algorithmic processes to estimate an individual’s dental age (regression problem) or define an age range (classification problem). ML is useful to discover patterns on large scale and multidimensional datasets that would have been difficult for individuals, even well trained, to identify [35]. Machine learning corresponds to a wide range of algorithm families and complexities, from easily interpretable models (e.g., linear regression, polynomial regression) to other models requiring development of additional explainability pipelines (e.g., support vector machine, neural network). The models can be supervised (the train dataset is “labeled”) or unsupervised (the model works on its own to discover information, dealing with unlabeled data). These techniques therefore make it possible to provide more specific tools for each population and to determine more accurately the equivalent of the standards table for the population [3]. Machine learning methods can be parametrized at several levels: the hyperparameters are parameters that are determined a priori for each model, whereas the parameters are determined a posteriori, after the training stage. If a forensic utilization is envisaged, these parameters must be identified to comply with Ritz-Timme et al. recommendations for age estimation [36]. ML methods are already used for bone age estimation [37] and are starting to develop for dental age [38,39,40]. ML methods provide a solution to formulate models capable of capturing complexity without the need for correspondence tables while preserving the same annotation system. ML methods are described as being more accurate than conventional radiological methods [38, 40]. These methods provide mean absolute error estimates of dental age that are less than 0.99 years relative to chronological age [38,39,40]. 
The advantages of these methods are that they do not require population-specific conversion tables, they can be adapted to each population studied, they directly give an estimate of dental age for use in forensic science, and they can integrate more predictors than the classical methods, thereby widening the age range over which dental age can be estimated from children to young adults. Although these methods have been evaluated in comparison with classical reference methods [38, 40], no comparison between the use of ML methods on different populations has yet been made. In this work, supervised ML methods were used, since the statistical models were created through optimization of the model deduced from the “known” dataset [38].

The objectives of this study are to compare (1) the capacity of ten machine learning algorithms to predict dental age in children using the seven left permanent mandibular teeth compared to reference methods and (2) the capacity of ten machine learning algorithms to predict dental age from childhood to young adulthood using the seven left permanent mandibular teeth and the four third molars.

Materials and methods

Data collection

A sample of 3605 panoramic radiographs (age ranging from 2.18 to 24 years) of 1734 girls and 1871 boys was retrospectively collected from the clinical radiological database of the Department of Oral Medicine of the Toulouse University Hospital, corresponding to a large sample of the population of Southern France irrespective of ethnicity. Radiographs of individuals with systemic diseases or developmental anomalies were excluded. Data were treated as cross-sectional, although a proportion of children from each group was represented by more than one radiograph. According to the guidelines provided by Schmeling et al. [9], we aimed for an even distribution of subjects across all age groups, a balanced number of boys and girls, and a minimum number of subjects 10 times the number of examined features (7 permanent teeth, 4 third molars and gender, i.e., 12 features examined in this study). As a minimum of 10 boys or girls in each category per year of age is required, the 15 three-year-old individuals in the sample were excluded [13]. Since the birth date and the date of the radiological examination are known for each subject, the chronological age was calculated as the number of years elapsed between these two dates and recorded to two decimal places.
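
The age calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the 365.25-day year are assumptions, since the text only states that age is the number of years elapsed between the two dates, recorded to two decimals.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumption: average year length, not stated in the text


def chronological_age(birth: date, exam: date) -> float:
    """Years elapsed between birth and radiograph, rounded to two decimal places."""
    if exam < birth:
        raise ValueError("examination date precedes birth date")
    return round((exam - birth).days / DAYS_PER_YEAR, 2)


# Hypothetical subject: born 15 March 2010, radiograph taken 1 September 2018.
age = chronological_age(date(2010, 3, 15), date(2018, 9, 1))
print(age)
```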

Ethical considerations

According to the French health regulations and law governing ethics in human research (Public Health Code), retrospective studies based on the exploitation of usual care data do not need to be submitted to an ethics committee but have to be declared or covered by reference methodology of the French National Commission for Informatics and Liberties (CNIL). A collection and computer processing of personal and medical data was implemented to analyze the results of this research. Toulouse University Hospital signed a commitment of compliance to the reference methodology MR-004 of the CNIL (number: 2206723 v0).

Radiograph analysis

During the assessment of all panoramic radiographs, the observers were blinded to the chronological age. The developmental stages of each of the seven left permanent teeth of the mandible, from central incisor to the second molar, were defined according to the eight radiographic stages defined by Demirjian et al. [11]. Tooth formation stages from A to D represent maturation stages of the crown, and stages from E to H represent maturation stages of the roots. When an index tooth was absent, its contralateral homologue was staged. The developmental stages of the third molars were assessed according to the same method, to which an extra stage was added for the radiolucent bud (stage 1) [41]. Then each alphabetical stage was converted into a numerical development score (stage 1, 1; stage A, 2; stage B, 3; stage C, 4; stage D, 5; stage E, 6; stage F, 7; stage G, 8; and stage H, 9). When a tooth and its contralateral were both absent, the developmental stage was set to zero (absence, 0). Dental age estimation by Demirjian’s method is only possible for individuals aged between 3 and 16 years. For this reason, a subgroup of the total sample (U24 group) was identified for individuals under 16 years of age (U16 group). To belong to the U16 group, all 7 mandibular teeth must have a maturity score. To belong to the U24 group, the stages of maturation of the 7 permanent teeth and all the third molars were considered. In this case, at least 4 permanent teeth from central incisor to second molar had to be staged, but all third molars could be absent.
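
The stage-to-score conversion and the staging criteria for the two groups can be sketched as below. The score mapping follows the text; the function names and the example record are hypothetical, and the age criteria for each group are deliberately left out.

```python
# Numerical scores as listed in the text; 0 is reserved for an absent tooth.
STAGE_SCORE = {"1": 1, "A": 2, "B": 3, "C": 4, "D": 5, "E": 6, "F": 7, "G": 8, "H": 9}


def encode(stages):
    """Convert recorded stages to scores; None means tooth and contralateral absent."""
    return [0 if s is None else STAGE_SCORE[s] for s in stages]


def eligible_u16(mandibular_scores):
    """Staging criterion for U16: all 7 mandibular teeth have a score."""
    return len(mandibular_scores) == 7 and all(s > 0 for s in mandibular_scores)


def eligible_u24(mandibular_scores):
    """Staging criterion for U24: at least 4 of the 7 mandibular teeth are staged."""
    return sum(s > 0 for s in mandibular_scores) >= 4


# Hypothetical record: central incisor to second molar, then the four third molars.
mandibular = encode(["H", "H", "G", "G", None, "H", "F"])
third_molars = encode(["1", None, "A", None])
print(mandibular, third_molars, eligible_u16(mandibular), eligible_u24(mandibular))
```

In this hypothetical record one mandibular tooth is missing on both sides, so the radiograph would satisfy the U24 staging criterion but not the U16 one.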

Statistical analysis

Three months after the first assessment of all samples by the first observer (AG, confirmed and experienced dental practitioner), intra-observer reliability was tested by re-examining 50 panoramic radiographs randomly selected from the U24 group. Inter-observer agreement was tested with a second observer (LG, dental hospital intern) by assessing 50 randomly selected panoramic radiographs. The agreements were quantified using weighted Kappa statistics.
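
As an illustration of the agreement statistic, the following sketch computes Cohen's weighted Kappa from two lists of stage scores. The text does not say whether linear or quadratic weights were used, so the `power` argument (1 for linear, 2 for quadratic) is an assumption left to the reader, as are the function name and the example ratings.

```python
from collections import Counter


def weighted_kappa(rater1, rater2, n_categories, power=1):
    """Cohen's weighted Kappa with disagreement weights |i - j| ** power."""
    n = len(rater1)
    observed = Counter(zip(rater1, rater2))   # joint counts of (score1, score2)
    marg1, marg2 = Counter(rater1), Counter(rater2)
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) ** power
            num += w * observed[(i, j)] / n                 # observed disagreement
            den += w * (marg1[i] / n) * (marg2[j] / n)      # chance disagreement
    return 1.0 if den == 0 else 1.0 - num / den


# Hypothetical scores (0-9) given to the same eight teeth at two sessions.
first = [9, 8, 8, 7, 6, 5, 9, 2]
second = [9, 8, 7, 7, 6, 5, 9, 3]
print(weighted_kappa(first, second, n_categories=10))
```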

Two reference methods were used for dental age estimation on the U16 group: the one described by Demirjian et al. [11] and the one described by Willems et al. [19]. Demirjian’s method (DEM) to infer dental age is a three-step process. Firstly, each stage of the seven permanent teeth is converted into seven numerical biologic weights with the help of the sex-specific conversion table. Secondly, the weights are added together to give a dental maturity score. Thirdly, sex-specific tables of standards are used to convert the dental maturity score into dental age. Willems’ method (WILL) is easier because all numerical biologic weights of the seven permanent teeth were added together to directly give the dental age.
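
The three-step DEM procedure can be sketched as a pair of table lookups. All numbers below are placeholders invented for illustration and are not the published Demirjian tables; only two teeth are shown, and linear interpolation between tabulated scores is an assumption made to keep the example short.

```python
import bisect

# PLACEHOLDER values, NOT the published tables: weight per tooth and stage (one sex).
TOY_WEIGHTS = {"M1": {"G": 13.0, "H": 15.0}, "M2": {"G": 12.0, "H": 14.0}}
# PLACEHOLDER standards: (dental maturity score, dental age in years), sorted by score.
TOY_STANDARDS = [(20.0, 7.0), (25.0, 9.0), (29.0, 12.0)]


def maturity_score(stages, weights):
    """Steps 1 and 2: convert each stage to its weight and add the weights."""
    return sum(weights[tooth][stage] for tooth, stage in stages.items())


def score_to_age(score, standards):
    """Step 3: convert the maturity score to dental age via the standards table."""
    scores = [s for s, _ in standards]
    if score <= scores[0]:
        return standards[0][1]
    if score >= scores[-1]:
        return standards[-1][1]
    k = bisect.bisect_right(scores, score)
    (s0, a0), (s1, a1) = standards[k - 1], standards[k]
    return a0 + (a1 - a0) * (score - s0) / (s1 - s0)  # assumed linear interpolation


score = maturity_score({"M1": "G", "M2": "H"}, TOY_WEIGHTS)
print(score, score_to_age(score, TOY_STANDARDS))
```

Under the same conventions, a WILL-style estimate would reduce to `maturity_score` alone, because its sex-specific weights are already expressed in years.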

To validate the different ML strategies compared to the reference methods, the U16 and U24 datasets were randomly divided into a training dataset and test dataset in an 80–20% proportion, respectively. Gender was also used as a predictive feature. The ML approaches used training on two sources of information: the gender and the stages allocated by the main observer to the seven lower left permanent teeth and all third molars. The target values were the chronological ages. The following ML supervised regression algorithms were tested [42]: random forest (RF), support vector machine (SVM), decision tree (DT), Bayesian ridge regression (BRR), k-nearest neighbors (KNN), boosting method AdaBoost (ADAB), polynomial regression (POLYREG), and multi-layer perceptron (MLP). A pipeline was implemented for each ML algorithm, with the possibility of adding a variable normalization technique (min-max, standard scaling, or nothing) and a data reduction technique (principal component analysis or nothing). To test the benefit of combining the different techniques of ML mentioned above, stacking (STACK) [43] and voting (VOTE) [44] methods were also tested. The tuning of hyperparameters to obtain the best model was achieved by exploring multiple combinations using GridSearchCV function and a ten-fold cross-validation (to avoid overfitting, ensuring a 20% validation dataset to be used during hyperparameters optimization). The hyperparameters described in Supplementary Table 1 were tuned. To allow comparisons of the different machine learning methods, the entire procedure, starting with the random creation of new training and test sets, was repeated 20 times.
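
The workflow described above (optional scaling, optional PCA, a regressor, grid search with ten-fold cross-validation, and repeated random 80/20 splits) might be set up along the following lines with scikit-learn. This is a sketch on synthetic data, not the study's code: the data, the grid values, the scoring criterion, and the reduced number of repeats are all assumptions, and only random forest is shown.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in data (NOT the study data): sex + 11 tooth scores in 0..9.
rng = np.random.default_rng(0)
n = 200
X = np.hstack([rng.integers(0, 2, (n, 1)), rng.integers(0, 10, (n, 11))]).astype(float)
y = 4.0 + 2.0 * X[:, 1:].mean(axis=1) + rng.normal(0.0, 0.5, n)  # fake ages

pipe = Pipeline([
    ("scale", "passthrough"),   # optional normalization step
    ("reduce", "passthrough"),  # optional dimensionality reduction step
    ("model", RandomForestRegressor(random_state=0)),
])
param_grid = {
    "scale": ["passthrough", MinMaxScaler(), StandardScaler()],
    "reduce": ["passthrough", PCA(n_components=5)],
    "model__n_estimators": [25],    # illustrative values only
    "model__max_depth": [None, 5],
}

N_REPEATS = 3  # the study repeats the whole procedure 20 times
maes = []
for seed in range(N_REPEATS):
    # New random 80/20 split at every repetition.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    search = GridSearchCV(pipe, param_grid, cv=10, scoring="neg_mean_absolute_error")
    search.fit(X_tr, y_tr)  # tunes on the training set, then refits the best pipeline
    maes.append(mean_absolute_error(y_te, search.predict(X_te)))

print(np.mean(maes), np.std(maes))
```

Placing the scaler and PCA inside the pipeline means they are refitted within each cross-validation fold, so no information from the validation folds or the test set leaks into preprocessing.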

To allow ML methods to be compared with a reference method for individuals between 16 and 24 years of age, the age assessment based on the dental maturity score from the 4 third molars described by Hofmann et al. on a Central European population was carried out [45].

The accuracy of age predictions was measured by five indicators: the coefficient of determination (R2), mean error (ME; chronological age minus predicted age), root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE). One-way ANOVA was performed to compare the different indicators across the different ML algorithms, and the multiple pairwise comparisons were adjusted using Tukey’s HSD test [46].
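
The five indicators can be computed as in the sketch below, which uses the sign convention stated in the text (chronological age minus predicted age). The function name and the example ages are hypothetical.

```python
import math


def regression_metrics(chronological, predicted):
    """ME, MAE, MSE, RMSE and R2 with error = chronological - predicted."""
    n = len(chronological)
    errors = [c - p for c, p in zip(chronological, predicted)]
    me = sum(errors) / n                       # negative ME: age is overestimated
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_c = sum(chronological) / n
    ss_tot = sum((c - mean_c) ** 2 for c in chronological)
    r2 = 1.0 - (mse * n) / ss_tot              # 1 - SS_res / SS_tot
    return {"ME": me, "MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


# Hypothetical ages in years.
m = regression_metrics([10.0, 12.0, 14.0], [11.0, 12.0, 13.0])
print(m)
```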

The Bland and Altman method [47] was used to graphically assess the degree of agreement between chronological and biological age. All experiments were performed using Scikit-Learn 0.22.1 libraries [42] and Python 3.7.3. The significance level was set to .05.
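
The quantities summarized by a Bland and Altman plot can be computed as sketched below. The conventional limits of agreement (bias ± 1.96 standard deviations of the differences) are assumed here, since the text does not state which limits were drawn; the function name and example ages are hypothetical.

```python
import statistics


def bland_altman(chronological, dental):
    """Return bias and assumed 95% limits of agreement of chronological - dental age."""
    diffs = [c - d for c, d in zip(chronological, dental)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical ages in years; a negative bias means dental age exceeds chronological age.
bias, lower, upper = bland_altman([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 15.0, 16.0])
print(bias, lower, upper)
```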

Results

A total of 3570 orthopantomograms (1719 girls and 1851 boys) were finally considered for the group U24, from which 2230 orthopantomograms (1111 girls and 1119 boys) were in the subgroup U16. Age and sex distribution of the datasets are plotted in Fig. 1. The intra- and inter-observer weighted Kappa were 0.96 and 0.92, respectively.

Fig. 1

Age and sex distribution for each category of age per year. X-axis represents age groups, 4 indicates all children from 4 to 4.99 years, etc.

Overall age prediction performance in the U16 age group

The age prediction performances are presented in Table 1. The mean error was calculated as chronological age minus predicted age; a negative value therefore indicates that the method overestimates age relative to chronological age. The two reference methods for estimating dental age, DEM and WILL, were used for U16. The DEM method overestimates dental age significantly more than the WILL method (− 0.71 ± 0.07 and − 0.22 ± 0.08, respectively). For all other metrics, the WILL method was more accurate than the DEM method.

Table 1 Mean error (ME), mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and R2 values for the U16 group (4–16 years old) assessing performance of machine learning regression methods and reference methods for chronological age estimation

All tested machine learning methods were significantly more accurate than the two reference methods for all metrics. ADAB and BRR achieved the lowest performance for MAE (Table 1, Fig. 2a).

Fig. 2

Heat maps showing the mean of the mean absolute errors (MAE) calculated from the 20 replicates for each pair of dental age estimation methods for the 4–16 years (a) and 4–24 years group (b). Abbreviations: ADAB, AdaBoost; BRR, Bayesian ridge regression; DEM, Demirjian’s reference method; DT, decision tree; KNN, k-nearest neighbor; MLP, multi-layer perceptron; PAT_AGE, chronological age; POLYREG, polynomial regression; RF, random forest; STACK, stacking regressor; SVM, support vector machine; VOTE, voting regressor; WILL, Willems’ reference method. Standard Deviation (SD) of each method

Overall age prediction performance in the U24 age group

The age prediction performances are presented in Table 2 and Fig. 2b. The SVM and MLP methods result in a slight overestimation of dental age compared to chronological age (− 0.004 ± 0.071 and − 0.009 ± 0.148, respectively), while the other machine learning methods underestimate it. All metrics show significantly poorer performance for the ADAB and BRR methods. A non-significant trend shows better performance for the STACK, RF, MLP, and SVM methods.

Table 2 Mean error (ME), mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and R2 values for the U24 group (4–24 years old), comparing machine learning regression methods for chronological age estimation

Figure 3a and b show Bland and Altman graphs for the two reference methods DEM and WILL. An overestimation of dental age over the whole U16 group is observed for DEM (mean error − 0.70) and WILL (mean error − 0.22). The plots show a tendency of both methods to underestimate dental age up to the age of 6 years and to overestimate dental age from the age of 12 years. The outliers are similar for both methods. The onset of dental maturation is earlier for girls than for boys. A dense cloud of dots is observed for both methods until the age of 8 years, after which parallel alignments of dots become progressively more marked with age. Each of the upward sloping lines corresponds to a combination of stages for the 7 left mandibular teeth. The spread between the minimum and maximum chronological age for a combination of maturation stages can therefore be read visually from the extremities of these alignments. Both reference methods overestimate dental age for the more advanced combinations of tooth maturation and are more centered for the earlier stages.

Fig. 3

Bland and Altman plot of difference in years between chronological age and dental age estimated a Demirjian’s method (for the 4–16 years group), b Willems’ method (for the 4–16 years group), c stacking regressor machine learning method (for the 4–16 years group), and d stacking regressor machine learning method (for the 4–24 years group). Negative values indicate that dental age is higher than chronological age. Purple points represent male individuals and yellow points represent female individuals. These graphs represent the 20 independent experiments performed

Figure 3c and d show the results of the stacking machine learning method in the form of two Bland and Altman plots, for group U16 without the third molars and for group U24 with the third molars, respectively. The dispersion of the measurements is greater for the U24 group from the age of 16 years, with maximal errors from 17 to 20 years, and decreases after 20 years. Bland and Altman plots are available in supplementary data for all methods.

Discussion

Age estimation using the Demirjian and Willems reference methods

Compared to the world population

The first step in this work was to test our sample with two age estimation methods commonly referred to in the literature. Among the reference methods, Demirjian et al. developed their method as a universal tool to assess dental maturity and predict dental age in children [11]. As expected from the existing literature, the results obtained by the DEM method in this study overestimated the age of females by 0.18 years and males by 1.2 years. These results are consistent with the results of the meta-analysis of Jayaraman et al., which shows that the method overestimates the age of females on average by 0.65 years (− 0.10 to 2.82 years) and males by 0.60 years (− 0.23 to 3.04 years) [33]. Several authors have tried to overcome this problem, such as Willems et al., who proposed distinct maturity score tables that give the age directly in years [19]. The results obtained by the WILL method in this study underestimated the age of females by 0.36 years and overestimated the age of males by 0.77 years. These results are consistent with the results of the meta-analysis of Sehrawat et al., which shows that the method overestimates the age of females on average by 0.07 years (− 0.53 to 1.01 years) and males by 0.16 years (− 1.1 to 0.7 years) [34]. Although the meta-analyses of these reference methods are consistent with our results, we can also compare our results with those from a geographically similar population.

Compared to the population of Southwest France

The results obtained on a sample of the population in the South of France with the two reference methods (i.e., Demirjian and Willems) are also in agreement with a previous study by Urzel and Bruznek, who applied the same WILL age estimation method and a variant of the DEM method revised by Demirjian and Goldstein to a similar population [17, 24]: both methods overestimate dental age, and the WILL method is more accurate than the DEM method. For the WILL method, the results found by Urzel and Bruznek are more accurate, with a slight underestimation of 0.09 years for females and an overestimation of 0.14 years for males. This difference can be explained by the age group distribution in the present study, which has more individuals in the age groups below 7 years and above 13 years, namely, the age categories with the largest age differences in our study.

Age estimation using the machine learning methods

The second step of the present work was to develop new standards based on machine learning strategies that can improve on the accuracy of current table standards. Instead of relying on a fixed standards table, machine learning can capture complex patterns in the data. These strategies consider the workflow of ML methods (several algorithms may be chained) as well as the best hyperparameters determined a priori during training of the model.

In the literature, three main studies have used machine learning to estimate dental age and are discussed in this section: (a) the study by Tao et al. [38] with a multi-layer perceptron, using the same annotation system for the stages of tooth maturation; (b) the study by Štepanovský et al. [39] with several machine learning methods (multiple linear regression, support vector machine, multi-layer perceptron, radial basis function neural network, k-nearest neighbors, and k-star); and (c) the study by Vila-Blanco et al. [40] with a convolutional neural network.

Comparison with reference methods

The first part of the evaluation of the ML methods was to compare them with the reference methods DEM and WILL under the same conditions of use of the latter, i.e., for the U16 population without using the third molars. All ML methods for estimating dental age outperformed the DEM and WILL reference methods. Among all machine learning methods tested (4–16 years), the maximum mean differences obtained were an overestimation of 15 days for the multi-layer perceptron method and an underestimation of 25 days for the support vector machine, whereas DEM and WILL overestimated the age by a mean of 257 days and 80 days, respectively. These results are in agreement with the results of other studies on dental age estimation [38, 40].

The study by Metsäniitty et al. [18] assessed dental age in a U24 population using the 7 permanent mandibular teeth and the 4 third molars. The development of these teeth was staged according to the DEM method and Köhler et al. [48], respectively. The reported RMSE was 1.175 years for females and 1.095 years for males. These results are more accurate than those obtained for U24 in the present study. This may be explained on the one hand by the third molar staging system using 10 stages of mineralization and on the other hand by the under-representation of individuals older than 17 years in their sample.

Comparison with a study using a multi-layer perceptron

The study by Tao et al. [38] assessed one particular ML method that is a multi-layer perceptron, using the same annotation system for the stages of tooth maturation, and 7 permanent teeth. The MAE, MSE, and RMSE results of Tao et al. for MLP are 0.990, 1.775, and 1.332 years for males and 1.261, 2.616, and 1.617 years for females, compared to 0.742, 0.907, and 0.952 years for male/female in the present study in U16. These differences can be explained by the broader age boundaries used by Tao et al. For U24, using 7 permanent teeth + 4 third molars, the MLP MAE, MSE, and RMSE results are 1.187, 2.701, and 1.643 years.

Comparison with a study using a different annotation system

The study by Štepanovský et al. used several machine learning methods (multiple linear regression, support vector machine, multi-layer perceptron, radial basis function neural network, k-nearest neighbors, and k-star) [39]. However, Štepanovský used an annotation system described by Moorrees et al. employing more stages of mineralization of the dental organ [1]. The population sample of Štepanovský et al. [39] was 662 males and 314 females aged 2.7 to 20.5 years, with 16 teeth annotated on the left side of the maxilla and mandible on panoramic radiographs (14 stages of maturation). The MAE and RMSE results are under 0.94 and 1.27 years, respectively. The first reason for this better accuracy may be the frequency distribution of the population: 10% of the population was aged over 15 years compared to the 35% of the present study; the 15–20 age group exhibits the largest variations. The second possible reason is the use of more dental maturation stages and more teeth annotated than in our study: 14 versus 8 stages and 16 versus 11, respectively. Thus, the population of Štepanovský et al. may be compared to the U16 group of this study where the MAE and RMSE of all methods tested are under 0.811 and 1.014 years. Indeed, using a simpler annotation standard does not result in a significant performance loss compared to a higher number of teeth annotations.

Comparison with a study using a deep learning approach

The study by Vila-Blanco et al. used a convolutional neural network (CNN, a deep learning strategy) [40]. The authors succeeded in developing a fully automatic method of image analysis. The panoramic radiograph is used directly, without any annotation system. For the U16 group, the CNN obtained a ME and a MAE of − 0.020 ± 0.97 years and 0.75 years, respectively. For group U25, the CNN achieved a MAE of 1.21 ± 1.17 years. These results are similar to those obtained for U16 and U24, respectively, in the present study.

The use of a CNN proposed by Vila-Blanco et al. is particularly attractive because it is fast, easy to use, and less subject to human interpretation. Its disadvantage lies in the explainability of the results. Only areas of interest on the image are identified using a Grad-CAM methodology [49], whose heat map is mainly centered on the teeth. Nevertheless, the authors underline that the Grad-CAM heat map remains “significantly variable according to the images.”

Predictors used in ML methods

A strong advantage in forensic medicine relies on the combination of a proven annotation system (i.e., the maturation stages described by Demirjian) to create a statistical model adapted to the population to study. It was hypothesized that supervised machine learning is capable of interpreting data whose meaning is not accessible to us [35] and that the amount of data matters more than algorithms [50]. In the present study, sexual dimorphism (gender) has also been used as a predictor for ML, to avoid losing power by developing separate models for boys and girls.

Bland and Altman graphs to interpret the results of dental age estimation

The Bland and Altman graphs are informative at several levels. It can be visually observed that the reference methods tend to overestimate dental age for ages above 14 years. The ML methods show a more homogeneous distribution of points around the x-axis. For a given chronological age, we also observe the dispersion of the dental ages, which reflects the individual variability of tooth maturation, the imprecision of the machine learning methods, and the number of maturation stages. Thus, it is observed that even when third molars are used, the variability increases between 16 and 21 years of age (consequently the RMSE and MSE increase due to the square penalty). The Bland and Altman graph also allows us to observe the individual variability for a given combination of maturation stages, which corresponds to an alignment of points whose extremities correspond to individuals in the sample who are either early or late in development relative to the population sample. Finally, such graphs are of interest for the explainability of the ML methods used. Explainability of results is essential in forensic medicine: to be used in legal proceedings, an estimate must be accompanied by elements of explanation. Some methods behave like black boxes (MLP, RF), while others are perfectly explainable (polynomial regression, decision tree). However, the methods that are the least easy to interpret give the best results on all the metrics (SVM, RF, MLP). New tools are being developed to open the black box and provide tangible elements of interpretation.

Drawbacks of machine learning methods tested

There are two approaches to dental age in forensic science. The first one estimates dental age and the second one aims at classifying individuals with respect to a key age [6]. A limitation of our study is that, for the 16–24 age group, our results were compared with literature results obtained using reference methods that estimate dental age, whereas this age group is more often studied using classification methods. Prieto et al. [51] provided results comparable to ours on a Spanish population aged 14–21 years. They observed a mean error of − 0.1 ± 1.3 and − 0.07 ± 1.22 years for the left and right third molar, respectively. Our results extracted from U24 for the 16–24-year-old subgroup give higher mean errors (Supplementary Table 2). Estimating the dental age for individuals aged from 16 to 24 years from the regression equation proposed by Hofmann et al. [45] gives results (RMSE = 2.061 years) comparable with those obtained with ML methods (Supplementary Table 2).

The use of dental age estimation methods, including machine learning methods, in a forensic context must consider individual variability. This is particularly important since the stages of dental maturation were developed in 1973 on a Canadian population [11]. Whatever the statistical approach, the variability of the prediction of dental age reflects individual variability. Adding geographic origin, socio-nutritional conditions, and endocrine factors as predictors in ML algorithms may help to reduce such individual variability [3] and improve dental age estimation. The prediction accuracy of the ML algorithms appears to reach a ceiling. This limit in accuracy may be due to the limitations of the annotation system and/or to inter-individual variability. A solution would be to go beyond the stages or to use additional predictors. Deep learning methods could allow a more detailed analysis of the maturation stages by overcoming the limits of inter- and intra-examiner reproducibility. These methods are already used with CNN to analyze an entire panoramic radiograph to determine dental/bone age [40].

Conclusion

Machine learning methods were used in this study to assess the predictive ability of a dental maturation staging system based on panoramic radiographs. On the population studied in U16, among the reference methods, the WILL method was significantly more accurate than DEM, and all ML methods were more accurate than the best reference method. On the population studied in U24, all ML methods rendered similar results, except for one (BRR). This study confirms that ML methods outperform the classic approaches for age estimation based on radiographic dental staging, from childhood up to early adulthood.