Introduction

Beta-thalassaemia major results from a hereditary defect in the synthesis of the β-chains of haemoglobin, leading to compensatory hyperplasia of the erythroid marrow [1]. If left untreated, it causes distinctive skeletal abnormalities, most noticeable in the short bones of the hands and feet, the ribs, the spine and the skull. In the hand in particular, the shafts of the tubular bones show diffuse medullary expansion with thinning of the cortices and coarsening and thickening of the trabeculae [2]. Conventional management of β-thalassaemia major with regular transfusions, if started early in life, prevents the development of these skeletal deformities [3]. On the other hand, iron-induced endocrinopathies resulting in growth failure and hypogonadism are present in a significant proportion of patients with β-thalassaemia major [4]. Frequent assessment with hand and wrist radiographs to determine skeletal maturation and predict final height is therefore particularly valuable in these patients.

Currently, the two most commonly used methods for bone age assessment are the Greulich and Pyle (G&P) method [5] and the third edition of the Tanner and Whitehouse (TW3) method [6]. Both methods require a radiograph of the left hand and wrist. For the G&P method, most investigators use a modified version of the originally described technique, in which the overall appearance of the studied radiograph is compared to the corresponding standard in the G&P atlas. Although this approach is considerably less complex and time-consuming, it may lack accuracy. The TW3 method has a more rigorous mathematical basis. In this method each bone of the hand and wrist is classified into one of eight or nine stages, and each stage is assigned a score. Summing the scores of the individually studied bones gives a total score, from which skeletal age is read off scoring tables. There are two versions of this method: the first uses the radius, ulna and short bones of the thumb, middle and little fingers (TW-RUS); the second additionally uses the carpal bones (TW-20). The TW3 method is more flexible and accurate; however, it is complex to use and requires more specialist training and experience. For both the G&P and TW3 methods, equations for the prediction of adult height have been developed.
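The TW3 scoring steps described above (stage each bone, sum the scores, convert the total to a skeletal age) can be sketched in a few lines of Python. This is an illustration only: the function names, bone labels, per-bone scores and the score-to-age table are hypothetical placeholders and are not taken from the published TW3 tables, which are sex-specific and must be used in practice. Linear interpolation between table entries is likewise an assumption made for this sketch.

```python
from bisect import bisect_left

# HYPOTHETICAL (total score, skeletal age in years) pairs, sorted by score.
# Real values must come from the published, sex-specific TW3 tables.
SCORE_TO_AGE = [(100, 5.0), (300, 9.0), (600, 12.0), (1000, 16.5)]


def total_score(bone_scores):
    """Sum the maturity scores assigned to the individually staged bones."""
    return sum(bone_scores.values())


def skeletal_age(score, table=SCORE_TO_AGE):
    """Convert a total score to a skeletal age by linear interpolation.

    Scores outside the table are clamped to the first or last entry.
    """
    scores = [s for s, _ in table]
    if score <= scores[0]:
        return table[0][1]
    if score >= scores[-1]:
        return table[-1][1]
    i = bisect_left(scores, score)
    (s0, a0), (s1, a1) = table[i - 1], table[i]
    return a0 + (a1 - a0) * (score - s0) / (s1 - s0)


# Hypothetical per-bone scores for three of the RUS bones.
example = {"radius": 200, "ulna": 150, "metacarpal_1": 100}
print(skeletal_age(total_score(example)))
```

A full TW-RUS assessment would supply one score for each of the radius, the ulna and the short bones of the first, third and fifth fingers; only three bones are shown here to keep the example short.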

This study was conducted in order to retrospectively assess and compare the rapid G&P method with the TW3 method for the determination of skeletal maturation in radiographs of children and adolescents with β-thalassaemia major. Additionally, the respective adult height prediction methods were compared in patients who had attained their final height.

Materials and methods

Skeletal maturity was evaluated by two investigators in 191 radiographs of the left hand and wrist from 58 patients (28 males, 30 females) with β-thalassaemia major. The first investigator, a consultant paediatric radiologist with almost 20 years of experience in this field, assessed skeletal maturity using the rapid modified G&P method. When a patient’s radiograph was thought to be intermediate between two radiographic standards in the atlas, an interpolated age halfway between the two standards was assigned. The second investigator, a paediatrician with a special interest in paediatric endocrinology who had been specially trained in the TW-RUS method and had 5 years of experience with it, assessed the same radiographs using that method. Both investigators knew only the gender of the patients and were blinded to previous estimations and chronological age. To assess intraobserver variation for each method, the same 17 radiographs were reanalysed by the two readers.

Additionally, predicted final height was calculated according to both methods using 47 radiographs (20 from eight female patients and 27 from seven male patients; mean chronological age 12.23 ± 2.8 years). All patients were aged more than 20 years when the radiographs were reassessed for this study and were considered to have attained their final height when two consecutive measurements 6 months apart gave the same height. The height of each patient at the time of initial radiographic assessment was extracted from the medical files. Using the G&P method, based on estimating bone age, final height was calculated by recording the percentage of final height achieved from specific score tables [6]. For TW3 final height prediction, the software provided with the method was employed. This software uses TW-RUS score, height and chronological age at the time the radiograph was obtained to calculate the expected final height using specific equations described in detail in the accompanying book [7].
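The table-based calculation described for the G&P method amounts to dividing the current height by the fraction of final height already achieved at the estimated bone age. A minimal Python sketch follows; the function name is hypothetical and the percentage in the example is an invented value for illustration, not an entry from the published tables.

```python
def predict_final_height(current_height_cm, percent_of_final_height):
    """Predict adult height from current height and the percentage of
    final height achieved (as read from a table for the estimated bone age).
    """
    if not 0 < percent_of_final_height <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    return current_height_cm / (percent_of_final_height / 100.0)


# Hypothetical example: a child measuring 140 cm whose bone age corresponds
# to 87.5% of final height (invented value, not from the tables).
print(predict_final_height(140.0, 87.5))
```

Because the prediction scales inversely with the tabulated percentage, a small error in estimated bone age translates into a larger error in predicted height at younger ages, when the percentage achieved is lower.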

For statistical purposes and graphical demonstrations Microsoft Office Excel 2003 software was employed. Data are presented as means±SD. Bland-Altman scatter plots [8], being easily interpretable, were used to compare methods and intraobserver variations. The one-sample Kolmogorov-Smirnov test was used to assess the normality of the distributions of the studied parameters. As all studied parameters had a normal distribution, the paired Student’s t-test and χ2 test were used to assess the significance of the difference between parameters. P values less than 0.05 were considered statistically significant.
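For readers who wish to reproduce a Bland-Altman comparison outside a spreadsheet, the core calculation is the mean of the paired differences (the bias) and limits of agreement placed a fixed number of standard deviations on either side of it. The sketch below uses only the Python standard library. The function name and the sample readings are hypothetical, and the multiplier `k` is an assumption: 1.96 is conventional and 2 is a common rounding.

```python
from statistics import mean, stdev


def bland_altman(readings_a, readings_b, k=1.96):
    """Return (bias, lower_limit, upper_limit) for paired readings.

    bias is the mean of (a - b); the limits are bias -/+ k * SD of the
    differences, using the sample standard deviation.
    """
    if len(readings_a) != len(readings_b):
        raise ValueError("readings must be paired")
    diffs = [a - b for a, b in zip(readings_a, readings_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - k * sd, bias + k * sd


# Hypothetical bone ages (years) from two methods on the same radiographs.
method_a = [10.0, 11.0, 12.0, 13.0]
method_b = [9.5, 11.5, 12.5, 12.5]
print(bland_altman(method_a, method_b))
```

Plotting each difference against the mean of the two readings, with horizontal lines at the bias and the two limits, gives the familiar scatter plot.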

Results

The mean chronological age of the patients at the time the 191 radiographs were obtained was 10.78 ± 3.96 years (range 1.54–21.03 years). Both the G&P and TW3 methods gave mean bone ages lower than the mean chronological age (10.04 ± 3.69 years and 9.98 ± 3.39 years, respectively). The distribution of the mean differences between the estimated bone ages and chronological ages for the two methods in relation to chronological age groups is presented in Fig. 1. Both investigators demonstrated good intraobserver agreement (95% confidence limits −1.06 to 0.92 years for the G&P method and −0.44 to 0.4 years for the TW3 method, Table 1). A Bland-Altman scatter plot demonstrating the level of agreement between the two methods is presented in Fig. 2. The mean age disparity between the methods was −0.05 ± 1.03 years, with 95% confidence limits ranging from −2.11 to 2 years. Figure 3 is a plot of the differences between the estimated bone ages and chronological ages for the G&P method versus the TW3 method, and indicates that the two methods tend to provide concordant results.
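The reported limits for the comparison between methods appear consistent with the mean disparity plus or minus two standard deviations. The short check below makes that arithmetic explicit; the multiplier of 2 is inferred from the reported figures rather than stated in the text, and the function name is hypothetical.

```python
def limits_from_summary(mean_diff, sd_diff, k=2.0):
    """Limits of agreement from a reported mean difference and SD."""
    return mean_diff - k * sd_diff, mean_diff + k * sd_diff


# Reported between-method disparity: -0.05 +/- 1.03 years.
lower, upper = limits_from_summary(-0.05, 1.03)
print(round(lower, 2), round(upper, 2))  # -2.11 2.01
```

The upper value of 2.01 years agrees with the reported limit of 2 years after rounding.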

Fig. 1 The TW3 method gives more advanced bone ages than the G&P method in thalassaemic patients aged <10 years. In patients aged >10 years, however, the TW3 method gives less advanced bone ages than the G&P method. The bone ages obtained with both methods were younger than chronological ages in all age groups (error bars represent 1 SD from the mean, positive for the TW3 method and negative for the G&P method)

Table 1 Intraobserver variation (mean age disparity is the first minus the second reading; mean absolute error is the absolute value of the difference between the two readings)

Fig. 2 Bland-Altman scatter plot of the variation between the TW3 and G&P methods (mean ± SD age disparity −0.05 ± 1.03 years; 95% CL −2.11 to 2 years)

Fig. 3 Plot of the differences between estimated bone ages and chronological ages for the TW3 method versus the G&P method. The age differences had the same sign (either positive or negative) in 151/191 radiographs (79%) and different signs in 40/191 radiographs (21%)
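The concordance count in the Fig. 3 caption can be reproduced by checking, for each radiograph, whether the two differences (bone age minus chronological age) share a sign. The following Python is a minimal sketch with hypothetical names and invented sample values. Treating a difference of exactly zero as discordant is an assumption, since the text does not say how such cases were handled.

```python
def sign_concordance(diffs_a, diffs_b):
    """Count pairs of differences with the same sign and with different signs.

    A pair is concordant when both values are positive or both negative;
    a pair containing a zero is counted as discordant (an assumption).
    """
    if len(diffs_a) != len(diffs_b):
        raise ValueError("differences must be paired")
    same = sum(1 for a, b in zip(diffs_a, diffs_b) if a * b > 0)
    return same, len(diffs_a) - same


# Hypothetical differences (bone age minus chronological age, in years).
tw3_diffs = [-0.5, 0.3, -1.2, 0.1]
gp_diffs = [-0.2, -0.4, -0.9, 0.6]
print(sign_concordance(tw3_diffs, gp_diffs))  # (3, 1)

# The reported counts correspond to the stated percentages.
print(round(100 * 151 / 191), round(100 * 40 / 191))  # 79 21
```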

Regarding final height prediction, the TW3 method seemed to be more accurate than the G&P method (mean absolute error 3.21 ± 2.51 cm vs. 3.99 ± 2.99 cm, respectively, P=0.048). Both methods had a tendency to overestimate final height of patients with β-thalassaemia major. This was consistent in both genders with the exception of underestimation of the final height in females by the G&P prediction model (Table 2). Finally, as expected, the error in final height prediction was lower in older patients and this was more prominent in girls than in boys (Fig. 4).

Table 2 Error in final height prediction (difference in centimetres between predicted and actual final height) and absolute error (in centimetres) for the two studied methods in boys, in girls and in the total population (values are means±SD)

Fig. 4 Age distribution of error in prediction (predicted minus actual final height) with the G&P method (black circles) and the TW3 method (white triangles) in boys (a) and in girls (b). The error in final height prediction is lower in older patients and this is more prominent in girls than in boys

Discussion

Despite advances in conventional management, growth failure and hypogonadism are still reported among children and adolescents with β-thalassaemia major, especially in their second decade of life [9]. Hence, their stature and pubertal progression should be followed up routinely. Bone age estimation is a useful tool to assess skeletal maturity, to predict final height and to monitor treatment regimens. A mean of 3.3 radiographs (range 1–7) of the hand and wrist were obtained during the growth of each patient participating in this study, and some of the patients were to be investigated further as they had not yet entered adulthood. Although bone age methods are used frequently in this setting, no study has yet evaluated and compared the most commonly used methods in this particular group of patients.

The accuracy of bone age estimation depends upon the investigator’s experience. In this study, both readers were experienced, and the 95% confidence limits for intraobserver variation were satisfactory for both methods (−1.06 to 0.92 years for the G&P method and −0.44 to 0.4 years for the TW3 method). The absolute error between readings was greater for the G&P method than for the TW3 method, although the difference was not statistically significant. This greater error in reproducibility with the G&P method was probably due to the use of the rapid modified version. In this version, the bone age recorded is the age given for the closest reference match, which may result in discrepancies when an individual bone is more or less advanced than its closest match. Although the originally described method included a provision for assigning bone age individually to each bone [1], this technique has seldom been used [10, 11]. Indeed, the majority of investigators have used the rapid G&P method, which is less complex and time-consuming. In a study performed by King et al. [12] the average time taken for age estimation with the G&P method was only 1.4 min, compared to 7.9 min for the TW2 method.

Previous comparative studies in normal populations have shown that bone age is estimated as younger with the G&P method than with the TW method [13–17]. This consistent finding has mainly been attributed to racial and socioeconomic differences between the reference populations used for the two methods. The G&P method was based on a study of American children of high socioeconomic status in the 1940s, whereas the TW method was based on British children of low socioeconomic status in the 1950s. Previous comparative studies have compared the G&P method with either the TW1 or the TW2 method. In the 2001 third edition of the TW method (now termed TW3) there are considerable changes in the reference population, which now includes population data from North America and Europe. Thus, bone ages estimated with the TW3 method are 1 year younger than those estimated with the TW2 method for children aged from 10 years upwards, but show smaller differences at younger ages [7]. A recent study comparing the TW2 and TW3 methods has indeed shown that TW3 estimates of bone age are younger than TW2 estimates [18]. Our study, the first to compare the newly developed TW3 method with the G&P method, supports this observation. Furthermore, our results show that the TW3 method estimates bone ages as more advanced than the G&P method in thalassaemic patients aged <10 years, but this was reversed in older patients (Fig. 1). Bone ages obtained with both methods were younger than chronological ages in all age groups except in thalassaemic patients aged <5 years assessed with the TW3 method. Bone ages estimated with both methods were closer to chronological age in younger patients, whereas in older patients estimated bone ages were significantly younger than chronological ages. This can be mainly attributed to the multifactorial growth and pubertal retardation observed in thalassaemic children and adolescents [19].

Prediction of final height based on bone age estimations is both challenging and clinically important as it influences to a large extent decisions on therapeutic interventions. Our results indicate that both methods generally overestimated final height in our patients, but mean errors were within acceptable limits although with wide ranges (Table 2). Indeed, if we set the confidence limits of attained final height to be predicted final height ±5.0 cm then only 64% of the predicted final heights obtained with the G&P method and 79% of those predicted with the TW3 method were within the limits, with a statistically significant difference between methods (P=0.029). The prediction of final height with the G&P method in girls seems to be particularly inaccurate as only 50% of girls attained a final height within the limits of predicted final height ±5.0 cm. Hence, these large confidence intervals must be taken into account and thalassaemic patients should be informed about the full range of predicted final height and not just the mean value.
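The accuracy measures used here (signed error, absolute error and the proportion of predictions falling within ±5.0 cm of attained height) can be computed as in the following Python sketch. The function name and the sample heights are hypothetical. The signed error is taken as predicted minus actual final height, as defined for Table 2.

```python
from statistics import mean


def height_prediction_summary(predicted_cm, actual_cm, tolerance_cm=5.0):
    """Summarise prediction accuracy for paired predicted and attained heights.

    Returns the mean signed error (predicted minus actual), the mean
    absolute error, and the fraction of predictions within the tolerance.
    """
    if len(predicted_cm) != len(actual_cm):
        raise ValueError("heights must be paired")
    errors = [p - a for p, a in zip(predicted_cm, actual_cm)]
    abs_errors = [abs(e) for e in errors]
    within = sum(1 for e in abs_errors if e <= tolerance_cm)
    return {
        "mean_error": mean(errors),
        "mean_abs_error": mean(abs_errors),
        "fraction_within": within / len(errors),
    }


# Hypothetical predicted and attained final heights (cm).
predicted = [170.0, 165.0, 158.0, 181.0]
actual = [168.0, 159.0, 160.0, 175.0]
print(height_prediction_summary(predicted, actual))
```

A positive mean error indicates overestimation of final height. Reporting the fraction within the tolerance alongside the mean absolute error shows how often a prediction would have been clinically misleading, which the mean alone can hide.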

Conclusion

Both studied methods (G&P and TW3) gave similar results regarding reliability, although the estimates were not equivalent. Thus, for serial assessments in an individual patient, one method should be used exclusively. The TW3 height prediction method seems to be more accurate in patients with β-thalassaemia major than the G&P method; however, large confidence intervals must be taken into account.