Introduction

Growth hormone deficiency (GHD) in adults has been found to result in important cognitive and metabolic abnormalities [1–3]. As those abnormalities are nonspecific, the diagnosis of GHD is based on hormonal assessments. A low serum concentration of insulin-like growth factor-I (IGF-I) in patients with a hypothalamic-pituitary disorder raises suspicion of GHD. This may be misleading, however, because serum IGF-I concentration is affected by various factors, including age, gender, obesity, starvation, hypothyroidism, serum estrogen level, chronic systemic diseases, and genetic factors [4, 5]. Basal GH measurement is of no use in diagnosing GHD, and a 24-h integrated GH profile is not practical [6]. Therefore, provocative tests are recommended for the diagnosis of GHD [7, 8].

Although tests using GH-releasing hormone (GHRH) involve very potent stimulators [9, 10], they are not widely used because of the low availability of GHRH [11]. Therefore, the most frequently used tests at present are the insulin tolerance test (ITT) and the glucagon stimulation test (GST). ITT has been regarded as the gold-standard test for the diagnosis of GHD, because induced hypoglycemia is considered a powerful stimulator of GH release [6, 7]. However, many patients with obesity or impaired glucose tolerance show falsely blunted GH responses to ITT, resulting in overdiagnosis of GHD [12]. In addition, healthy subjects may also fail to achieve an adequate peak GH level (>5.0 µg/L), probably because of a recent GH pulse before ITT [13]. Such an inadequate GH response in normal subjects has also been reported with other GH stimulation tests [14–16]. Furthermore, ITT is contraindicated in patients with ischemic heart disease or a seizure disorder. Although GST has been reported to lead to a false diagnosis of GHD in subjects who are obese or >50 years old [17, 18], it appears to be a good alternative to ITT [19–23]. A diagnostic GH threshold of 3 µg/L has been the traditional cut-off value for GST [17, 20]; however, lower cut-off values have been suggested to discriminate patients with GHD from normal individuals with higher diagnostic accuracy [19, 24].

As testing for GHD requires further research, we retrospectively evaluated the diagnostic value of GST, ITT, and IGF-I level, and aimed to find optimal GH cut-off values for GST in a relatively large number of patients with pituitary disorders. In addition, we assessed the effects of factors including patient age, gender, body-mass index (BMI), and additional pituitary hormone deficiencies (PHDs) on the diagnosis of GHD.

Subjects and methods

Patients and study design

The study included all patients with a pituitary disease who had been evaluated for GHD by ITT and GST at the Department of Endocrinology of Erciyes University Medical School since 2000. Approval was obtained from the local Ethics Committee before the study was conducted. Age, gender, BMI (kg/m²), medical histories, and hormonal data were obtained from patient records. The results of ITT and GST, together with basal hormone levels including IGF-I, free T3, free T4, thyroid stimulating hormone (TSH), cortisol, adrenocorticotropic hormone (ACTH), prolactin, follicle stimulating hormone (FSH), luteinizing hormone (LH), estradiol (in females), and total testosterone (in males), were recorded. Patients were excluded if they had chronic systemic disease, diabetes mellitus, liver failure, renal disease, malnutrition, or pregnancy, or if they were taking medications that could affect the results of hormonal analyses. As different assay methods and commercial kits have been used over the last 14 years, only patients whose IGF-I and GH were assayed using the same method and commercial kit (described below) were included in the study. As a result, 216 patients who met these criteria were enrolled. In addition, 26 healthy subjects (18 women, 8 men) were enrolled as a control group; all gave written informed consent prior to the study. Age, gender, BMI, medical histories, and hormonal data, including the above-mentioned basal hormone levels and GST but not ITT, were assessed in the healthy subjects. The healthy and patient groups were comparable in terms of age, BMI, gender, and estrogen status. Women with estrogen deficiency, defined as postmenopausal women or premenopausal women with gonadotropin deficiency, had not taken estrogen replacement for at least 3 months before the hormonal assessments.

Hormonal assessments

Blood samples for all hormonal assays and stimulation tests were drawn in the morning; all medications were withheld prior to sampling. Serum IGF-I and GH were measured by immunoradiometric assay (IRMA) using commercial kits from Immunotech SAS (Marseille, France). The IRMAs were sandwich-type assays, and the kits used mouse monoclonal antibodies against IGF-I and GH. The monoclonal antibodies against GH recognize the 22 kDa monomeric isoform, the dimer, and GH bound to its binding protein. A dissociation step was used to release IGF-I from its binding proteins before the serum IGF-I level was measured. The intra-assay and inter-assay coefficients of variation were 1.5 and 14 % for GH, and 6.3 and 6.8 % for IGF-I. The analytical sensitivity was 0.10 µg/L for GH and 2 ng/ml for IGF-I. Normal reference ranges of IGF-I were decade-based: 232–385 ng/ml in patients aged 20–29 years; 177–382 ng/ml in patients aged 30–39 years; 124–290 ng/ml in patients aged 40–49 years; 71–263 ng/ml in patients aged 50–59 years; 94–269 ng/ml in patients aged 60–69 years; and 76–160 ng/ml in patients aged ≥70 years. IGF-I standard deviation scores (SDS) were calculated from the reference mean and standard deviation (SD) values [25], using the following formula: SDS = (patient’s value − mean value) / SD value.
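An SDS expresses how many reference SDs a measured value lies from the reference mean, that is, the deviation from the mean divided by the SD. The following Python function is a minimal sketch of that calculation. The function name is hypothetical, and the reference mean and SD in the example are illustrative placeholders, not values taken from the reference data [25].

```python
def igf1_sds(value_ng_ml: float, ref_mean_ng_ml: float, ref_sd_ng_ml: float) -> float:
    """Return the IGF-I standard deviation score: (value - mean) / SD."""
    if ref_sd_ng_ml <= 0:
        raise ValueError("reference SD must be positive")
    return (value_ng_ml - ref_mean_ng_ml) / ref_sd_ng_ml


# Hypothetical reference mean and SD for one age band (illustrative only).
example_sds = igf1_sds(value_ng_ml=95.0, ref_mean_ng_ml=150.0, ref_sd_ng_ml=50.0)
print(round(example_sds, 2))
```

A negative score indicates a value below the age-specific reference mean.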

Stimulation tests

ITT was performed after an overnight fast. Blood samples were obtained prior to the intravenous administration of 0.1 U/kg (0.2 U/kg if BMI > 30 kg/m²) soluble regular insulin, immediately after hypoglycemia was reached (minute 0), and after 30, 60, 90 and 120 min. All patients had biochemically confirmed hypoglycemia (<40 mg/dL). In the ITT, the peak GH cut-off for GHD was set at 3.00 µg/L [6], and the peak cortisol cut-off for adrenal insufficiency at 18 µg/dL [26]. GST was performed by subcutaneous injection of 1 mg (1.5 mg in patients >90 kg) glucagon (Glucagen, Novo Nordisk, Denmark). Blood samples for the measurement of GH and cortisol were obtained before glucagon injection (baseline value) and after 90, 120, 150, 180, 210, and 240 min. A peak cortisol concentration ≥9.1 µg/dL was regarded as a normal response to GST [27]. Two cut-off values of peak GH on GST were regarded as diagnostic for GHD: 1.07 and 3.00 µg/L. The 1.07 µg/L cut-off value was derived from a classification and regression tree (CART) analysis, in which the peak GH values of the 26 healthy subjects, who were assumed not to have GHD, and of the 52 patients with ≥3 PHDs, who were assumed to definitely have GHD, were used as the input data. The latter GH level (3.00 µg/L) has been used as the standard cut-off value for GST [17, 20]. Thus, three cut-off values for peak GH responses to stimulation tests were evaluated in the current study: (a) 3.00 µg/L on ITT; (b) 3.00 µg/L on GST; and (c) 1.07 µg/L on GST. There was an interval of 2 or 3 days between the ITT and GST.
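The three diagnostic criteria can be illustrated with a short Python sketch that takes the timed GH samples of each test, finds the peak, and compares it with each cut-off. The function names and the sample values are hypothetical. Treating a peak exactly equal to the cut-off as adequate is an assumption, since the text does not specify how the boundary was handled.

```python
# Peak GH cut-offs (µg/L) taken from the text; a peak below the cut-off is
# treated here as an inadequate response (assumption about the boundary).
CUTOFFS_UG_L = {
    "ITT_3.00": 3.00,
    "GST_3.00": 3.00,
    "GST_1.07": 1.07,
}


def peak_gh(samples_ug_l):
    """Return the highest GH value among the timed samples of one test."""
    return max(samples_ug_l)


def classify_ghd(itt_samples, gst_samples):
    """Map each criterion to True (GHD suggested) or False (adequate response)."""
    itt_peak = peak_gh(itt_samples)
    gst_peak = peak_gh(gst_samples)
    return {
        "ITT_3.00": itt_peak < CUTOFFS_UG_L["ITT_3.00"],
        "GST_3.00": gst_peak < CUTOFFS_UG_L["GST_3.00"],
        "GST_1.07": gst_peak < CUTOFFS_UG_L["GST_1.07"],
    }


# Hypothetical patient: GH values (µg/L) at the ITT and GST sampling times.
result = classify_ghd(
    itt_samples=[0.2, 0.9, 1.4, 1.1, 0.6],
    gst_samples=[0.1, 0.4, 0.8, 1.5, 2.2, 1.9, 1.0],
)
print(result)
```

In this made-up case the patient would be labeled GH deficient by the ITT and by the GST at 3.00 µg/L, but not by the GST at 1.07 µg/L.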

Diagnosis of PHDs

In order to find the number of deficient pituitary hormones other than GH, four other anterior pituitary hormone axes were also evaluated in the present study. To diagnose ACTH deficiency, ITT and GST were evaluated together. An adequate peak cortisol on one or both of these tests was regarded as a normal response, while ACTH deficiency was defined as blunted responses on both tests. TSH deficiency was diagnosed by low free T3 and free T4 levels with inappropriately normal or decreased TSH levels. Gonadotropin deficiency was based on reduced basal estradiol (in females) or testosterone (in males) with low or inappropriately normal gonadotropin levels. Prolactin levels less than lower reference limits (2.4 ng/mL for postmenopausal, 3.3 ng/mL for premenopausal females, and 2.7 ng/mL for males) were regarded as prolactin deficiency.
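The rule for ACTH deficiency (blunted cortisol on both tests) and the count of deficient axes can be sketched in Python as follows. The class and function names are hypothetical. Treating a peak cortisol of exactly 18 µg/dL on ITT as adequate is an assumption, and the TSH, gonadotropin, and prolactin flags are supplied directly rather than derived from hormone values.

```python
from dataclasses import dataclass

# Cortisol cut-offs (µg/dL) taken from the text.
ITT_PEAK_CORTISOL_CUTOFF = 18.0
GST_PEAK_CORTISOL_CUTOFF = 9.1


def acth_deficient(itt_peak_cortisol: float, gst_peak_cortisol: float) -> bool:
    """ACTH deficiency is flagged only when cortisol is blunted on both tests."""
    blunted_on_itt = itt_peak_cortisol < ITT_PEAK_CORTISOL_CUTOFF
    blunted_on_gst = gst_peak_cortisol < GST_PEAK_CORTISOL_CUTOFF
    return blunted_on_itt and blunted_on_gst


@dataclass
class PituitaryAxes:
    """Deficiency flags for the four non-GH anterior pituitary axes."""
    acth: bool
    tsh: bool
    gonadotropin: bool
    prolactin: bool

    def phd_count(self) -> int:
        """Number of deficient axes other than GH (0 to 4)."""
        return sum([self.acth, self.tsh, self.gonadotropin, self.prolactin])


# Hypothetical patient: cortisol peaks of 12.0 (ITT) and 10.5 (GST) µg/dL.
patient = PituitaryAxes(
    acth=acth_deficient(itt_peak_cortisol=12.0, gst_peak_cortisol=10.5),
    tsh=True,
    gonadotropin=True,
    prolactin=False,
)
print(patient.acth, patient.phd_count())
```

Because the GST cortisol peak is adequate in this example, the patient is not counted as ACTH deficient despite the blunted ITT response.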

Statistical analyses

SPSS for Windows, version 15.0 (IBM Inc, USA), was used for all statistical analyses. Data are reported as mean ± SD and, where appropriate, with the range (minimum–maximum). Data were tested for normal distribution before comparison and correlation analyses. The Mann–Whitney test for quantitative values and the Chi-square test for binomial values were used to compare groups, and the Spearman test was used for correlation analyses. A probability value <0.05 was considered statistically significant. CART analysis was used to determine a cut-point of peak GH value on GST. In addition, receiver operating characteristic (ROC) curve analysis was used to find the sensitivity and specificity values of the tests.
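To make the meaning of sensitivity and specificity at a given peak GH cut-off concrete, the following Python sketch computes both from two lists of peak values. It is not the SPSS procedure used in the study: the function name is hypothetical and the peak values are invented for illustration only.

```python
def sensitivity_specificity(peaks_with_ghd, peaks_without_ghd, cutoff_ug_l):
    """Return (sensitivity, specificity) when a peak below the cut-off means GHD."""
    true_positives = sum(peak < cutoff_ug_l for peak in peaks_with_ghd)
    true_negatives = sum(peak >= cutoff_ug_l for peak in peaks_without_ghd)
    sensitivity = true_positives / len(peaks_with_ghd)
    specificity = true_negatives / len(peaks_without_ghd)
    return sensitivity, specificity


# Invented peak GH values (µg/L), for illustration only.
assumed_ghd_peaks = [0.1, 0.3, 0.5, 0.9]
assumed_normal_peaks = [1.2, 2.5, 3.4, 6.0]

for cutoff in (1.07, 3.00):
    sens, spec = sensitivity_specificity(assumed_ghd_peaks, assumed_normal_peaks, cutoff)
    print(f"cut-off {cutoff:.2f} µg/L: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data set, raising the cut-off leaves sensitivity unchanged but lowers specificity, because more subjects without GHD fall below the threshold.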

Results

Of the 216 patients, 163 (75.5 %) were female and 53 (24.5 %) were male. Mean patient age was 45.3 ± 14.4 years (range 20–79 years), mean serum IGF-I concentration was 128.1 ± 75.1 ng/ml (range 6.1–358 ng/ml) and mean BMI was 28.8 ± 5.5 kg/m² (range 18.0–51.8 kg/m²). BMI was ≤25 kg/m² in 61 (28.2 %) patients, between 25 and 30 kg/m² in 66 (30.6 %) patients, and ≥30 kg/m² in 89 (41.2 %) patients. The underlying pituitary diseases in the patients are shown in Table 1. The most frequent cause was pituitary adenomas and related treatments (68.5 %), followed by Sheehan’s syndrome (13.9 %). In the healthy control group, mean age (45.4 ± 12.9 years) and BMI (28.2 ± 3.0 kg/m²) were similar to those of the patient group (p > 0.05). Similarly, the groups did not differ in gender distribution or in the proportion of estrogen-deficient women. Estrogen deficiency was present in 9 of 18 (50 %) women in the control group and in 77 of 163 (47.2 %) women in the patient group (p > 0.05).

Table 1 Underlying pituitary diseases in the patient cohort

According to the ITT, GST with 3.00 µg/L cut-off, and GST with 1.07 µg/L cut-off, GHD was present in 186 (86.1 %), 161 (74.5 %), and 117 (54.2 %) patients, respectively. Table 2 shows a detailed distribution of patients according to GH responses to stimulation tests. All patients with an adequate GH response to ITT also had an adequate response to GST with either 3.00 or 1.07 µg/L cut-off. Furthermore, all patients with an inadequate GH response to GST with 1.07 µg/L cut-off showed inadequate responses to ITT and GST with 3.00 µg/L cut-off, too.

Table 2 The distribution of patients according to GH responses to ITT and GST with 1.07 and 3.00 µg/L cut-offs

Mean peak GH responses to ITT and GST in patients were 1.30 ± 2.30 and 2.29 ± 3.37 µg/L, respectively (p < 0.001). On the ITT, GH peaks occurred at minutes 0, 30, 60, 90 and 120 in 30 (13.9 %), 56 (25.9 %), 52 (24.1 %), 35 (16.2 %), and 43 (19.9 %) patients, respectively. On the GST, GH peaks occurred at minutes 90, 120, 150, 180, 210, and 240 in 15 (6.9 %), 17 (7.9 %), 32 (14.8 %), 84 (38.9 %), 50 (23.2 %), and 18 (8.3 %) patients, respectively (Fig. 1). In the control group, 12 of 26 (46.2 %) healthy subjects failed the GST at the 3.0 µg/L peak GH cut-off; the lowest and mean peak GH values on GST were 1.18 and 4.03 ± 0.60 µg/L, respectively. Mean age and BMI did not differ significantly between control subjects whose peak GH value on GST was <3.0 µg/L and those whose value was >3.0 µg/L. Peak GH levels of the healthy subjects and pituitary patients are shown in Fig. 2.

Fig. 1 Percentages of patients according to the minutes at which peak GH concentrations were obtained during the stimulation tests

Fig. 2 Peak GH levels on the GST and ITT, with the cut-off values of 3.0 and 1.07 µg/L

The investigation of PHDs other than GH showed that, of the 216 patients, 83 (38.4 %) had one or more PHDs, whereas 133 (61.6 %) were not deficient in any other pituitary hormone (Table 3). Of the 216 patients, 16 (7.4 %), 15 (6.9 %), 20 (9.3 %), and 32 (14.8 %) had 1, 2, 3, and 4 PHDs, respectively. Analysis of the relationship between the number of deficient pituitary hormones and GHD showed that all 52 patients deficient in 3 or 4 hormones had inadequate GH responses, and thus GHD, on the ITT and on the GST with both the 3.00 and 1.07 µg/L cut-offs. GHD, however, did not always occur in the other 164 patients with fewer than three deficient hormones. The number of deficient pituitary hormones showed a significant negative correlation with serum IGF-I levels (p < 0.001, r = −0.605). The mean IGF-I levels in patients with 0, 1, 2, 3, and 4 deficient pituitary hormones were 162.3, 114.1, 93.8, 58.0, and 52.9 ng/ml, respectively (Table 3). Similarly, a significant negative relationship was observed between the number of deficient hormones and mean peak GH concentrations on the ITT (p < 0.001, ρ = −0.619) and GST (p < 0.001, ρ = −0.718).

Table 3 The frequency of deficient anterior pituitary hormones other than GH, and mean IGF-I levels (ng/ml) according to number of PHDs

BMI did not correlate significantly with IGF-I concentration, IGF-I SDS, or peak GH concentrations on the ITT and GST when all patients were considered together. However, when patients with ≥3 PHDs, in whom BMI was assumed to have a minimal effect on those parameters, were excluded, BMI in the remaining patients correlated significantly and negatively with IGF-I concentration, IGF-I SDS, and peak GH concentrations on the ITT and GST. In ROC curve analysis, the GST with the 1.07 µg/L cut-off had 100 % specificity and 100 % sensitivity, while the GST with the 3.00 µg/L cut-off had 100 % sensitivity and 84 % specificity for the diagnosis of GHD. Of note, those rates of specificity and sensitivity were obtained by treating the 52 patients with ≥3 PHDs as definitely having GHD and the healthy controls as definitely not having GHD in the ROC curve analysis.

Mean IGF-I concentrations and GH peaks did not differ significantly between male and female patients. Patient age and IGF-I level showed a significant inverse correlation (p < 0.001, r = −0.466), as did age and peak GH concentrations on the ITT (p < 0.001, ρ = −0.413) and GST (p < 0.001, ρ = −0.446). IGF-I concentrations showed a significant positive correlation with peak GH concentrations on both the ITT (p < 0.001, ρ = 0.580) and GST (p < 0.001, ρ = 0.733), as shown in Fig. 3. Among the 216 pituitary patients, all those with an IGF-I concentration ≤95 ng/ml or an IGF-I SDS value ≤−1.1 had GHD, as determined by the ITT and by the GST with both the 3.00 and 1.07 µg/L cut-offs. The mean serum IGF-I concentration of the healthy group (213.9 ± 97.2 ng/ml) was significantly higher than that of the patient group (128.1 ± 75.1 ng/ml), and the IGF-I values of all healthy subjects were within the normal reference ranges.

Fig. 3 Positive correlations between IGF-I concentrations and peak GH concentrations on the ITT (left panel) and GST (right panel)

Discussion

The cognitive and metabolic abnormalities resulting from adult GHD can be at least partially reversed by GH replacement therapy in most patients [28, 29]; the diagnosis is therefore important. However, GHD is prone to overdiagnosis and thus to needless replacement therapy. Advanced age, obesity, a recent GH pulse before testing, and certain diseases and medications may lead to a false positive biochemical diagnosis, a frequent problem in clinical practice. Therefore, the recent guideline of the Endocrine Society has recommended testing for GHD in selected groups of patients, including those with a known hypothalamic-pituitary disease, other PHDs, or a previous history of pituitary surgery, cranial irradiation, central nervous system tumor treatment, head trauma, or subarachnoid hemorrhage [7]. In the present study, all patients had a known pituitary disease, a valid indication for GHD testing, and, as expected, the most frequent cause was pituitary adenomas and their treatment with surgery, radiotherapy, or medical therapy (68.5 %). The second most frequent cause was Sheehan’s syndrome (13.9 %), and similar frequencies of pituitary diseases were found by a recent cohort study in Turkey [30].

The current guideline suggests that a low serum IGF-I level is a strong indicator of GHD, especially in patients with three or four PHDs [7]. Since normal IGF-I levels do not exclude the diagnosis of GHD, stimulation tests are recommended. ITT and GHRH-Arginine are suggested as first-line tests, whereas GST is suggested only when GHRH is not available or ITT is contraindicated [7]. However, GHRH is expensive and its production is limited, and ITT is cumbersome and has diagnostic challenges [12, 13, 31]. Therefore, another stimulation test is clearly needed. Although GST has been recommended as a second-line diagnostic test after the ITT and GHRH-Arginine test [7], previous studies showed that GST is a convenient alternative to ITT [19–23]. In the present study, the frequency of GHD was highest (86.1 %) using ITT and lowest (54.2 %) using GST with a threshold GH level of 1.07 µg/L. Furthermore, the mean peak GH level on GST (2.29 µg/L) was significantly higher than that on ITT (1.30 µg/L). Although mean peak GH values on the ITT were generally higher than those on the GST in previous studies [17–19], some investigators found similar GH peaks when comparing ITT and GST [8, 32]. Brabant et al. [33] revealed striking country-specific variations in the underlying etiology and BMI status of the patients tested for GHD. The frequencies of underlying etiologies, except for Sheehan’s syndrome, and the mean age and BMI of our patients were similar to those of the patients in the studies that found ITT to have a greater GH secretory potency [17–19]; therefore some other factors, probably genetic factors, might be responsible for the higher peak GH response to GST found in our study.

In a recent study, Dichtel et al. [24] reported that a GH cut-off level of 1.0 µg/L for GST may reduce the overdiagnosis of adult GHD in overweight or obese patients. Berg et al. suggested a GH cut-off value of 2.5 µg/L on GST, with 95 % sensitivity and 79 % specificity for the diagnosis of GHD based on ITT [19]. Gomez et al. [20] found that GST with a GH peak cut-off level of 3.0 µg/L had a sensitivity and a specificity of 100 % each in the diagnosis of GH deficiency. Other studies also suggested a diagnostic GH threshold of 3.0 µg/L for GST [17, 32, 34]. To clarify this issue, this study also compared GSTs with GH thresholds of 1.07 and 3.0 µg/L. All patients with a normal GH peak on ITT were found to have normal GH peaks on GST with both the 3.00 and 1.07 µg/L cut-offs, whereas none of the patients with blunted responses to GST with the 1.07 µg/L cut-off had normal GH peaks on ITT or on GST with the 3.00 µg/L cut-off. Furthermore, all healthy subjects had peak GH responses >1.07 µg/L, while 12 (46.2 %) of them had peak values between 1.07 and 3.00 µg/L on GST. Therefore, those 12 controls could be misdiagnosed with GHD according to the GST with the 3.00 µg/L cut-off.

According to the results presented here, GST with the 1.07 µg/L cut-off seems to have a higher diagnostic accuracy for GHD than GST with the 3.00 µg/L cut-off, for two reasons. First, evaluation of the 44 patients with adequate GH responses to GST with the 1.07 µg/L cut-off, but blunted responses to ITT and to GST with the 3.00 µg/L cut-off, showed that none of them had 3 or 4 PHDs and only 1 of them had 2 PHDs. It would be beneficial if a precise test, such as the GHRH-Arginine test, could be used to determine whether those 44 patients had GH deficiency, and thereby to compare the diagnostic accuracies of ITT, GST with the 3.00 µg/L cut-off, and GST with the 1.07 µg/L cut-off. Second, the lowest peak GH value on GST was 1.18 µg/L in the control group, and 12 of 26 (46.2 %) healthy subjects failed the GST at the 3 µg/L peak GH cut-off. An important limitation of this study was the lack of ITT in the healthy controls, which prevented us from determining lower GH cut-off values for ITT and thereby from comparing ITT and GST properly. Another limitation was the use of the Immunotech GH and IGF-I assays, which were not calibrated against the international standard [35]; our results may therefore not be generalizable to other assay methods because of interassay variation.

In the study carried out by Hartman et al., the likelihood of GHD in patients with 0, 1, 2, 3, and 4 deficient pituitary hormones was 41, 67, 83, 96, and 99 %, respectively [36]. Moreover, the positive predictive value for an IGF-I cut-off level of 84 µg/L was 96 %. Almost all patients with an IGF-I < 84 µg/L had blunted peak GH responses (<2.5 µg/L) on all of the GH stimulation tests, and therefore did not require a GH stimulation test. An important limitation of that earlier study was the use of different GH stimulation tests, all of which used the same GH cut-off value for the diagnosis of GHD. In addition, that study evaluated the absence of antidiuretic hormone (ADH) rather than prolactin as a PHD [36]. Aimaretti et al. [37] also reported that although IGF-I levels may be normal even in patients with severe GHD, very low levels of IGF-I can be considered as definitive evidence of GHD in patients with panhypopituitarism who therefore do not require a GH stimulation test.

Biochemical diagnosis of adult GHD has been reported to be affected by various factors, including age, sex, phase of the menstrual cycle, BMI, number of other PHDs, concomitant diseases, and certain medications [4, 38–40]. In the present study, patient age and number of other PHDs, but not gender, were found to be influential factors. When patients with ≥3 PHDs were excluded, BMI had significant inverse correlations with IGF-I and peak GH values. The reason for excluding patients with ≥3 PHDs was that in those patients the effect of BMI on IGF-I and peak GH might be minimal, because most of them had very low IGF-I and peak GH values. Patient age was also negatively correlated with IGF-I levels and GH peaks on ITT and GST. Furthermore, the number of other anterior PHDs was closely related to the diagnosis of GHD. In all patients with ≥3 PHDs, GHD was diagnosed by all three stimulation test criteria, and the number of PHDs was inversely correlated with both IGF-I level and GH peaks. In addition, all patients with an IGF-I concentration ≤95 ng/ml or an SDS value ≤−1.1 were found to have GHD according to the ITT and the GST with both the 3.00 and 1.07 µg/L cut-offs.

It is well known that GH secretion decreases physiologically with aging after adolescence [38]. However, the diagnostic GH thresholds currently used for stimulation tests are not age dependent. In the present study, GH peaks and IGF-I levels were clearly age dependent, and mean patient age showed a positive relationship with the number of stimulation tests yielding blunted responses (Table 2). Micmacher et al. [18] also reported that GST was age dependent and concluded that GST did not differentiate between healthy subjects and patients with GHD after the age of 50 years. Monitoring GH for only 3 h may be responsible for the lower GH peak responses to GST than to ITT in that study. Leong et al. [21] observed that the majority of GH peaks occurred within 3 h and suggested omitting the GH measurement at minute 240. However, in the present study, peak GH values on GST occurred at minutes 210 and 240 in 50 (23.2 %) and 18 (8.3 %) patients, respectively. Although the current guideline suggests monitoring GH levels for at least 3 h during GST [7], our findings are in good agreement with other results indicating that the time should be prolonged to at least 4 h [24, 34].
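The share of GST peaks that fell after minute 180 can be checked directly from the counts reported in the Results. The following Python lines are a small arithmetic sketch with illustrative variable names. They assume that a 3-h protocol would end sampling at minute 180.

```python
# Number of patients whose GH peak on the GST fell at each sampling time (min),
# as reported in the Results.
gst_peak_counts = {90: 15, 120: 17, 150: 32, 180: 84, 210: 50, 240: 18}

total_patients = sum(gst_peak_counts.values())
late_peaks = sum(n for minute, n in gst_peak_counts.items() if minute > 180)
late_share = 100 * late_peaks / total_patients

print(f"{late_peaks} of {total_patients} patients ({late_share:.1f} %) peaked after 180 min")
```

Under that assumption, roughly one patient in three would have had the true peak fall outside the sampling window.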

In conclusion, diagnostic challenges in adult GHD include the lack of pathognomonic clinical findings and of precise biochemical test results. The withdrawal of GHRH by its manufacturer particularly complicates the diagnosis. The findings presented here indicate that patient age, BMI, IGF-I level, and number of PHDs are influential factors associated with the diagnosis of adult GHD. A 4-h GST with a lower GH cut-off level, such as 1.07 µg/L, seems to be a suitable alternative to ITT. With 1.07 µg/L as the GH cut-off during GST, a considerable number of patients diagnosed with GHD according to GST with the 3.00 µg/L cut-off might be shown to have normal GH status.