Introduction

Growth hormone deficiency (GHD) is a relatively common endocrine disorder (1/3000 to 1/9000 in children [1]) that leads to short stature in childhood and, without adequate hormone replacement therapy, can result in severe short stature in adulthood.

GHD can be divided into acquired GHD and nonacquired (congenital) GHD according to its etiology. Head trauma, central nervous system (CNS) infection, tumor, and cerebral irradiation are relatively common causes of acquired GHD. Although most of these causes are easy to detect on imaging studies, the structural abnormalities in the vicinity of the pituitary gland in some nonacquired GHD cases may not be clearly revealed.

Since the first MRI study of pituitary stalk transection (interruption) was reported in 1987 [2], MRI has been regarded as the superior and essential imaging modality for evaluating sellar and parasellar structural abnormalities in patients with GHD. In addition, in the initial evaluation of patients with GHD, it is important to use MRI to identify midline anomalies that may accompany nonacquired GHD, as well as tumors and other infiltrative diseases that may cause pituitary hormone abnormalities.

Furthermore, MRI provides important information for predicting prognosis and determining the need for screening in patients with nonacquired GHD. According to previous studies, isolated GHD (IGHD) patients with pituitary abnormalities are known to have a higher rate of progression to multiple pituitary hormone deficiency (MPHD) at follow-up than IGHD patients without pituitary abnormalities [3]. In addition, it has been reported that patients with IGHD who had pituitary abnormalities had better GH replacement treatment results than IGHD patients who did not have these abnormalities [4,5,6].

There have been many MRI studies of nonacquired GHD. However, most were conducted in relatively small study populations, and the reported incidence of sellar and parasellar structural MRI abnormalities has varied greatly between studies [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]. For example, the incidence of isolated pituitary hypoplasia in IGHD patients ranges from as little as 4% [8] to as much as 50% [23]. The largest recent population study, published by Maghnie et al. in 2013, included 15,043 children with nonacquired GHD in KIGS (Pfizer International Growth Database) from 1987 to 2011 [24]. This highly valuable study confirmed an important association between pituitary structure and function in a large study population from various countries. However, the exact number of patients in the two groups, IGHD and MPHD, was not reported, so the proportion of MRI abnormalities in each group could not be clearly identified; furthermore, the criteria for pituitary hypoplasia, which was the most common abnormal MRI finding, were difficult to determine from the KIGS data. Moreover, the influences of other variables, such as the imaging protocol, geographical region, and the cutoff serum GH level for the diagnosis of GHD, were not evaluated. These areas are worth further investigation in a large study population. From this point of view, we performed a systematic review and meta-analysis to verify and integrate the MRI phenotype of patients with GHD.

Materials and methods

This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines [25].

Search strategy and selection criteria

PubMed and EMBASE databases were searched up to December 14, 2020, using the following search terms: pituitary AND ((growth hormone deficiency) OR (GH deficiency)) AND ((MRI) OR (MR imaging) OR (magnetic resonance imaging)) AND ((idiopathic) OR (nonacquired) OR (acquired) OR (secondary) OR (congenital) OR (pathologies) OR (structural abnormalities) OR (organic)) AND (children OR pediatric OR paediatric). The literature search was restricted to articles published in English. The literature search was performed by one reviewer (J.S.H.).

The inclusion criteria were as follows: (1) pediatric patients (≤ 18 years old) diagnosed with nonacquired GHD [26] and (2) detailed data sufficient to assess the proportion of sellar and parasellar abnormalities on brain MRI scans. The exclusion criteria were as follows: (1) acquired GHD (craniopharyngioma, microadenoma, history of head trauma, cranial irradiation, or CNS infection) and complex syndromes [24, 26, 27]; (2) conference abstracts, review articles, letters, errata, articles in press, case reports, or books; (3) adult patients; (4) insufficient data for evaluating outcomes; and (5) overlapping study populations and data. When patients and data overlapped between studies, the study with the larger population was chosen.

Data extraction and quality assessment

Data were extracted by two reviewers (J.S.H. and S.W.J.) using a standardized extraction form, and disagreements were settled by consensus. The following items were recorded:

  1. Study characteristics: authors, publication year, study period, institution, country, and study design

  2. Patient characteristics: number of patients, age, sex, mean or median age, diagnostic method of GHD, and type of GHD (IGHD vs. MPHD)

  3. MRI characteristics: MR imaging magnet, matrix size, field of view (FOV), slice thickness, contrast enhancement, and imaging criteria of sellar and parasellar abnormalities

  4. MRI findings: proportion of total and specific sellar and parasellar abnormalities (isolated anterior pituitary hypoplasia (IAPH), empty sella, isolated stalk abnormality, isolated ectopic posterior pituitary (EPP), pituitary stalk interruption syndrome (PSIS), congenital mass), proportion of midline brain anomalies, and proportion of CNS abnormalities outside the pituitary area. PSIS was defined as the triad of thin or interrupted pituitary stalk, aplasia or hypoplasia of the anterior pituitary, and absent or EPP [2, 28].

Two independent reviewers (J.S.H. and S.W.J., with 8 and 10 years of radiology experience, respectively) assessed the quality of each included study using the Risk of Bias Assessment tool for Nonrandomized Studies (RoBANS 2.0) with consensus [29]. The RoBANS tool evaluates the risk of bias in a total of 8 domains: comparability of participants, selection of participants, confounding variables, intervention (exposure) measurement, blinding of outcome assessment, outcome evaluation, incomplete outcome data, and selective outcome reporting. The “comparability of participants” domain was not applicable to most of the included studies and was excluded from the quality assessment. The risk of bias in each domain was classified into three categories: “low risk,” “high risk,” and “unclear risk.” Quality assessment results were presented using RevMan version 5.0 (Cochrane Community, Oxford, UK).

Statistical analysis

The primary outcome of this study was the pooled proportion of sellar and parasellar abnormalities, midline brain anomalies, and CNS abnormalities outside the pituitary area in patients with GHD. Statistical heterogeneity across studies was evaluated with the Q test or the inconsistency index (I²) statistic, and a P value < 0.1 in the Q test and I² ≥ 50% were considered to indicate significant heterogeneity [30]. The Hartung-Knapp adjustment for the random-effects model was applied in the meta-analysis of single proportions [31]. Individual study weights were calculated with the inverse variance method after arcsine transformation [32]. In addition, we performed subgroup analyses according to the type of GHD (IGHD vs. MPHD), MRI magnet, geographical region, and cutoff serum GH level for the diagnosis of GHD (cutoff GH ≤ 5 μg/l vs. cutoff GH=10 μg/l) to explore the potential sources of heterogeneity. P values < 0.05 were regarded as statistically significant [33]. In the subgroup analysis according to cutoff serum GH level, sellar abnormalities were additionally divided into two groups by severity of the MR abnormalities (IAPH vs. isolated EPP or PSIS or any two out of the three abnormalities of anterior pituitary hypoplasia, stalk abnormality, and EPP) [3, 34]. Begg’s test was used to evaluate publication bias [35]. A funnel plot was visually assessed, and P < 0.05 in Begg’s test was taken to indicate publication bias. All statistical analyses were performed by an experienced professional statistician (J.S.L.) using the “meta” package in R software version 4.0.2 (R Foundation for Statistical Computing).
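The pooling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name `pool_arcsine` is hypothetical, the DerSimonian-Laird estimator of the between-study variance is an assumption (the text does not state which estimator was used), and the example counts are invented. The sketch transforms each proportion with the arcsine square-root transformation, computes Cochran's Q and I², pools with inverse-variance random-effects weights, and returns the Hartung-Knapp standard error. A confidence interval would additionally need a t quantile with k − 1 degrees of freedom, which is left out to keep the sketch dependency-free.

```python
import math


def pool_arcsine(events, totals):
    """Random-effects pooled proportion on the arcsine scale (illustrative only)."""
    k = len(events)
    # Arcsine square-root transformation; approximate variance is 1 / (4n)
    y = [math.asin(math.sqrt(e / n)) for e, n in zip(events, totals)]
    v = [1.0 / (4.0 * n) for n in totals]

    # Fixed-effect inverse-variance estimate, needed for Cochran's Q
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = k - 1

    # I^2 (percent) and DerSimonian-Laird tau^2 (assumed estimator)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the transformed scale
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re

    # Hartung-Knapp variance of the pooled estimate
    var_hk = sum(wi * (yi - y_re) ** 2 for wi, yi in zip(w_re, y)) / (df * sw_re)

    return {
        "pooled": math.sin(y_re) ** 2,  # back-transformed proportion
        "Q": q,
        "I2": i2,
        "tau2": tau2,
        "se_hk": math.sqrt(var_hk),
    }


# Invented counts: abnormal MRI findings / patients scanned in three studies
result = pool_arcsine([12, 40, 55], [30, 60, 70])
print(round(result["pooled"], 3), round(result["I2"], 1))
```

Pooling on the transformed scale keeps the study variances independent of the observed proportion, which matters when some studies report proportions close to 0 or 100%.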

Results

In total, 885 articles were initially identified by a systematic search. After a full-text review of 75 articles, 32 studies with 39,060 children were ultimately included (see the flow diagram in Fig. 1) [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, 24, 34, 36,37,38,39,40,41,42,43,44,45,46,47,48,49].

Fig. 1 Flow diagram of study selection

Characteristics of included studies and quality assessment

The study and patient characteristics are described in Table 1. In brief, 12 studies were retrospective, and one study was prospective in design. The number of patients ranged from 18 to 15,043, with mean or median ages of 3.4 to 14.1 years. The diagnosis of GHD was based on two pharmacologic stimulation tests in most studies (supplementary material 1). Regarding peak GH levels of pharmacologic stimulation tests, 17 studies defined GHD when the peak GH was < 10 μg/l. In 6 studies, cutoffs for the peak GH level were ≤ 5 μg/l [12, 16, 22, 39, 40, 44]. Twenty-four studies included both IGHD and MPHD patients. Two studies [8, 14] and one study [40] included only IGHD and only MPHD patients, respectively.

Table 1 Study and patient characteristics

The imaging methods are described in Table 2. MR imaging was performed with 0.5T, 1T, 1.5T, or 3T magnets. Among studies that reported slice thickness, most used slice thicknesses of ≤ 3 mm, except for a few studies [15, 17]. In studies with available FOV data, FOVs varied from 11 to 23 cm. Contrast-enhanced MRI scans were performed in 16 studies. In 13 studies, the use of contrast enhancement could not be assessed due to a lack of information. Imaging criteria for each sellar abnormality are provided in supplementary material 2.

Table 2 Technical MRI features of included studies

The quality of the included studies was generally acceptable, but the risk of bias caused by incomplete outcome data was relatively high (31.3%) (supplementary material 3). This is because there were studies with a relatively large proportion (> 20%) of GHD patients who did not undergo MRI. In the two domains of blinding of outcome assessment and outcome evaluation, the proportion of unclear risk of bias was high (59.4% and 34.4%, respectively). The reasons for unclear risk of bias were that the report did not specify whether blinding was performed regarding clinical information for MRI reviewers or the report did not specify the criteria for abnormal sellar and parasellar MRI findings, respectively.

Sellar and parasellar abnormalities

The proportion of sellar and parasellar abnormalities was available in 32 studies, comprising 17,540 patients, and ranged from 8.3 to 100% [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22, 24, 34, 36,37,38,39,40,41,42,43,44,45,46,47,48,49]. The pooled proportion of sellar and parasellar abnormalities was 58.0% (95% CI, 47.1–68.6%) (Fig. 2). Significant heterogeneity was noted in the Q test (P<0.001) and Higgins I² (98.2%). No publication bias was detected (P=0.94 by Begg’s test), as illustrated by the funnel plot (supplementary material 4). The pooled estimates of each sellar and parasellar abnormality are summarized in Table 3. The most common type of abnormality was PSIS (34.5%; 95% CI, 20.8–49.6%). The sellar and parasellar masses were hypothalamic hamartoma, arachnoid cyst, pituitary cyst, and Rathke cyst [24, 48].

Fig. 2 Forest plot of proportions for total sellar and parasellar abnormalities on brain MRI scans in patients diagnosed with growth hormone deficiency. The green box represents the point estimate, and its area represents the weight given to the study. A horizontal line indicates the 95% CI, and diamonds represent the overall pooled proportions

Table 3 Summary of the meta-analytic pooled proportions for sellar/parasellar abnormalities

Midline brain anomaly and CNS abnormalities outside the pituitary area

The proportion of midline brain anomalies was assessed in 29 studies, comprising 16,947 patients, and ranged from 0 to 45.7%. The pooled proportion of midline brain anomalies was 1.1% (95% CI, 0.1–3.0%) (Fig. 3). Significant heterogeneity was noted in the Q test (P<0.001) and Higgins I² (89.3%). Publication bias was detected by Begg’s funnel plot analysis (P = 0.037) (supplementary material 5a). The adjusted pooled proportion of midline brain anomalies was 2.6% (95% CI, 0.9–5.1%) after applying the trim-and-fill method (supplementary material 5b). The midline brain anomalies, reported in ten studies [11, 13, 17, 20, 24, 36, 44, 46, 48, 49], were hypoplasia of the optic nerve or optic chiasm, septo-optic dysplasia, corpus callosum dysgenesis, abnormality of the septum pellucidum, solitary central maxillary incisor syndrome, midline palate cleft syndrome, GHD with anophthalmia, nasal pyriform aperture stenosis, medially deviated carotid artery, complex anomaly of the hypothalamo-hypophyseal tract, sphenoidal meningocele, vermian agenesis, and craniopharyngeal canal. The pooled proportion of CNS abnormalities outside the pituitary area was 0.4% (95% CI, 0.0–1.2%) (supplementary material 6a). Significant heterogeneity was noted in the Q test (P<0.001) and Higgins I² (82.1%). Publication bias was detected by Begg’s funnel plot analysis (P = 0.024) (supplementary material 6b), and the adjusted pooled proportion of CNS abnormalities outside the pituitary area was 1.2% (95% CI, 0.4–2.3%) after applying the trim-and-fill method.
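For readers unfamiliar with the rank correlation test behind the publication bias results above, the following Python sketch shows one common formulation (Begg and Mazumdar), under stated assumptions. The function name `begg_rank_test` and the input values are hypothetical, ties and continuity correction are ignored, and the sketch is not claimed to reproduce the output of the R “meta” package. Each study effect is standardized against the fixed-effect pooled estimate, and Kendall's tau between the standardized effects and the study variances is then tested with a normal approximation.

```python
import math


def begg_rank_test(effects, variances):
    """Begg-Mazumdar rank correlation test (illustrative; no tie correction)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    pooled = sum(wi * ei for wi, ei in zip(w, effects)) / sw

    # Standardized deviation of each study from the fixed-effect estimate
    std = [
        (e - pooled) / math.sqrt(v - 1.0 / sw)
        for e, v in zip(effects, variances)
    ]

    # Kendall's S: concordant minus discordant pairs (effect vs. variance)
    s = 0
    for i in range(k):
        for j in range(i + 1, k):
            prod = (std[i] - std[j]) * (variances[i] - variances[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1

    tau = s / (k * (k - 1) / 2.0)
    # Normal approximation for S under the null hypothesis of no association
    z = s / math.sqrt(k * (k - 1) * (2 * k + 5) / 18.0)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return tau, z, p_value


# Invented arcsine-scale effects that grow as precision falls (asymmetric funnel)
tau, z, p = begg_rank_test(
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [0.01, 0.02, 0.03, 0.04, 0.05, 0.06],
)
print(tau, round(p, 4))
```

A strong correlation between effect size and variance, as in the invented data, is the pattern that a funnel plot shows as asymmetry.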

Fig. 3 Forest plot of proportions for midline brain anomalies on brain MRI scans in patients diagnosed with growth hormone deficiency. The green box represents the point estimate, and its area represents the weight given to the study. A horizontal line indicates the 95% CI, and diamonds represent the overall pooled proportions

Subgroup analysis

The results of subgroup analyses by type of GHD, geographical region, MRI magnet, and cutoff serum GH level for the diagnosis of GHD are summarized in Tables 4 and 5 and supplementary materials 7–10. Patients with MPHD showed a higher proportion of sellar and parasellar abnormalities than those with IGHD (91.4% vs. 40.1%, P<0.001). In the analysis of specific sellar and parasellar abnormalities, the pooled proportion of PSIS was higher in patients with MPHD than in those with IGHD (65.3% vs. 20.1%, P<0.001). The type of GHD did not influence heterogeneity in other specific sellar and parasellar abnormalities or midline brain anomalies (P≥0.103). Geographical region, MRI magnet, and cutoff GH level for the diagnosis of GHD were not significant factors of heterogeneity regarding sellar and parasellar abnormalities (P=0.668, 0.228, and 0.101, respectively). When sellar abnormalities were divided into two groups by severity (IAPH vs. severe MR abnormalities [isolated EPP or PSIS or any two out of the three abnormalities of anterior pituitary hypoplasia, stalk abnormality, and EPP]), the pooled proportion of severe MR abnormalities was significantly higher in studies with cutoff GH values ≤ 5 μg/l than in studies with a cutoff GH value of 10 μg/l for provocation tests (72.8% vs. 38.0%; P < 0.001) (supplementary material 11).
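The subgroup comparisons above rest on a test for differences between pooled subgroup estimates. A minimal Python sketch of the two-subgroup case follows, assuming the comparison is made on the arcsine scale with a between-subgroup Q statistic (1 degree of freedom). The function name `subgroup_difference` is hypothetical, and the standard errors in the example are invented because the text reports only the pooled percentages; the printed value therefore illustrates the calculation and does not reproduce the reported P values.

```python
import math


def subgroup_difference(p1, se1, p2, se2):
    """Q_between test for two subgroup estimates (illustrative only).

    p1, p2   : pooled proportions of the two subgroups
    se1, se2 : standard errors of the pooled estimates on the arcsine scale
    """
    y1 = math.asin(math.sqrt(p1))
    y2 = math.asin(math.sqrt(p2))
    w1 = 1.0 / se1 ** 2
    w2 = 1.0 / se2 ** 2

    # Weighted mean across subgroups, then between-subgroup heterogeneity
    y_all = (w1 * y1 + w2 * y2) / (w1 + w2)
    q_between = w1 * (y1 - y_all) ** 2 + w2 * (y2 - y_all) ** 2

    # Chi-square survival function with 1 degree of freedom
    p_value = math.erfc(math.sqrt(q_between / 2.0))
    return q_between, p_value


# Pooled proportions from the text (MPHD 91.4% vs. IGHD 40.1%);
# the standard errors of 0.05 are invented for illustration.
q_b, p_b = subgroup_difference(0.914, 0.05, 0.401, 0.05)
print(round(q_b, 1), p_b < 0.001)
```

With more than two subgroups, such as geographical regions, the same statistic is referred to a chi-square distribution with (number of subgroups − 1) degrees of freedom, which requires a general chi-square routine not included here.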

Table 4 Subgroup analyses of MR findings by type of growth hormone deficiency (isolated growth hormone deficiency vs. multiple pituitary hormone deficiency)
Table 5 Subgroup analyses of sellar and parasellar MR abnormalities by geographical region, MRI magnet, and cutoff serum growth hormone level for a diagnosis of growth hormone deficiency

Discussion

In our systematic review and meta-analysis, the pooled proportion of sellar and parasellar abnormalities was 58.0% in children diagnosed with GHD, and significant heterogeneity was noted among the studies. Approximately 1.1% (2.6% after adjustment) of patients with GHD showed midline brain anomalies. Patients with MPHD showed a higher proportion of sellar and parasellar abnormalities compared with those with IGHD (91.4% vs. 40.1%). In the analysis of each sellar and parasellar abnormality, the pooled proportion of PSIS was higher in patients with MPHD than in those with IGHD (65.3% vs. 20.1%). Other variables of geographical region, MRI magnet, and cutoff GH level for the diagnosis of GHD did not influence heterogeneity.

The consensus guideline published by the Growth Hormone Research Society recommended performing brain imaging for all children diagnosed with GHD [50]. Despite the increasing use of brain MRI for children with GHD, the real-world incidence of abnormal pituitary anatomy in this population is not well known due to variability in the diagnostic criteria, the lack of a unified reference standard for pituitary size, and different opinions on the role of MRI in the diagnosis of GHD [51, 52]. To our knowledge, this is the first systematic review and meta-analysis to report the proportion of sellar and parasellar abnormalities on MRI, and the pooled estimate was 58.0% in children with GHD.

Previous studies have identified a relationship between pituitary abnormality phenotype and endocrine profile in children with GHD; a patient with normal MRI findings or anterior pituitary hypoplasia is more likely to have IGHD, whereas PSIS is associated with MPHD [51]. In children with MPHD, normal pituitary anatomy is uncommon [51]. This study also confirmed via meta-analysis that the pooled proportion of sellar and parasellar abnormalities was as high as 91.4% in the MPHD group and was significantly higher than that in the IGHD group (40.1%). Moreover, we performed a meta-analysis to confirm the pituitary phenotype-endocrine profile correlation. The pooled estimate of PSIS incidence was significantly higher in patients with MPHD (65.3%). In addition, patients with MPHD showed a higher proportion of EPP and complex sellar abnormalities (i.e., any two out of the three abnormalities of anterior pituitary hypoplasia, stalk abnormality, and EPP); conversely, patients with IGHD showed a higher proportion of IAPH, although the differences were of borderline significance (P=0.109, 0.103).

The anterior and posterior pituitary, the hypothalamus, the optic nerves, and the forebrain share a common origin, the anterior neural plate, in the developing embryo [53]. Midline developmental anomalies are regarded as a spectrum disorder with variable clinical phenotypes from mild to severe forms [54]. In this study, various midline brain anomalies were identified in children with GHD and the pooled proportion was 1.1% (2.6% after adjustment). There was no significant difference in the pooled estimates between the IGHD and MPHD groups. This result might be limited because only a few included studies (n=10) reported cases with concomitant midline brain anomalies, so the incidence may have been underestimated. Although these anomalies appear to be rare, all radiologists and researchers involved in the diagnosis and management of children with GHD should have good knowledge of the association between the pituitary gland and other midline brain defects. The interpretation of brain MRI data should be performed by neuroimaging experts who are familiar with midline brain anomalies.

Previous studies have demonstrated conflicting results regarding the association between the severity of GHD and brain MRI abnormalities. According to Acharya et al. [55] and Frindik et al. [52], severe GHD with a low peak GH level on stimulation tests was more strongly associated with pituitary abnormalities on MRI than mild GHD. In contrast, a recent paper by Alba et al. demonstrated that the severity of GHD did not predict pituitary abnormalities, and they concluded that brain MRI should be performed in all children with GHD regardless of the severity of the deficiency [56]. In our subgroup analysis, the pooled proportion of sellar and parasellar abnormalities was not significantly different between subgroups defined by the cutoff GH level on stimulation tests. However, severe MR abnormalities were more frequent in studies with a low cutoff GH level (≤ 5 μg/l) than in studies with a cutoff GH level of 10 μg/l. This result could be limited, because we categorized the studies by a cutoff GH level of “10 μg/l” rather than using a constraint of “5 to 10 μg/l” due to insufficient data, and patients in the subgroup could have a wide range of peak GH levels on stimulation tests.

There are limitations in our study. First, the diagnostic criteria of abnormal MRI were variable among the included studies, and the prevalence of pituitary abnormalities in each study could be influenced by the different criteria. In most of the included studies, the diagnosis of pituitary hypoplasia was based on pituitary height. However, the diagnostic criteria used for pituitary hypoplasia were not consistent, evaluating height based on the normal population (below 2 standard deviations) or using an absolute value (e.g., 2 mm or 3 mm). Worldwide normative morphological data using artificial intelligence algorithms or volumetric measurements using three-dimensional MRI might improve the quality of pituitary assessment in future studies [57]. Second, gene defects are increasingly recognized as factors of abnormal pituitary development; however, we could not perform subgroup analysis due to insufficient data. Third, the diagnostic method of GHD by the GH stimulation test was not uniform in the included studies. Therefore, we conducted subgroup analysis according to the peak GH cutoff values.

In conclusion, this systematic review and meta-analysis demonstrated that sellar and parasellar abnormalities were common in children diagnosed with GHD. Patients with GHD rarely had concomitant midline brain anomalies. Patients with MPHD showed more frequent and more complex sellar and parasellar abnormalities than patients with IGHD. In addition, this study confirmed a significant association between the frequency of severe MR abnormalities and low cutoff serum GH levels on stimulation tests in a large population with nonacquired GHD. The observed correlation between anatomic abnormalities and endocrine profile supports the view that brain MRI is essential to identify congenital structural abnormalities of the sellar and parasellar regions in children with nonacquired GHD.