Introduction

Currently, late-night salivary cortisol (LNSC) is recommended by the Endocrine Society’s Clinical Practice Guideline as a first-line screening test for Cushing’s syndrome (CS) [1]. LNSC is a practical indicator of free cortisol concentration because sampling is non-invasive and the samples are stable at room temperature for at least 1 week, which allows collection at home [1, 2]. Moreover, we and others have suggested that LNSC can be used to assess the outcome of transsphenoidal surgery in Cushing’s disease, with adequate sensitivity and specificity [3, 4].

There are few and contradictory reports comparing the reproducibility or accuracy of one or two measurements of LNSC. Some researchers recommend that at least two measurements of salivary cortisol should be obtained to increase the confidence of the test [1, 5]. However, few publications have evaluated intra-individual variability of LNSC in CS [4, 6–8] or compared the accuracy of more than one saliva sample [8]. Not all available assays have been widely evaluated, and the majority of published studies have used RIA [6, 7, 9–16].

Recently, good accuracy of an automated electrochemiluminescent immunoassay in the diagnosis of CS was reported [17–19]. The advantages of this immunoassay are that it requires small volumes of saliva (which is collected easily), has a low cost (similar to that of urine or serum cortisol), and achieves high diagnostic accuracy.

The aim of this study was to evaluate the variability and reproducibility of LNSC using electrochemiluminescence immunoassay (ECLIA) and compare the accuracy of one or two samples in diagnosing CS.

Materials and methods

Patients and controls

The study was conducted between August 2009 and March 2010. Participants were prospectively recruited and assigned to one of three groups: healthy volunteers (HV), patients with suspected CS (suspected group), and patients with CS.

The protocol was approved by the Institutional Review Board and the Medical Ethics Committee of our institution. Informed consent was obtained from all participants before samples were obtained, in accordance with the Declaration of Helsinki.

Healthy volunteers group (n = 64)

Sixty-four healthy volunteers (20 males and 44 females; age 43 ± 14.8 years, range 19–71) were recruited through brochures in our institution. Exclusion criteria were BMI >30 kg/m², chronic diseases (diabetes mellitus, hypertension, renal, hepatic, or cardiac failure), diagnosed depression, alcoholism, infection, pregnancy, or use of drugs known to interfere with pituitary–adrenal axis secretion, such as antidepressants or exogenous corticoids. All volunteers were evaluated by one of the researchers, who recorded the use of oral estrogens, smoking habit, comorbidities, clinical findings suggesting CS, blood pressure, weight, and height.

Suspected group (n = 35)

Thirty-five patients were referred to us by their physicians for clinical suspicion of CS. They met at least three clinical features suggestive of CS, namely central obesity, hirsutism, diagnosis of polycystic ovarian syndrome, muscle weakness, buffalo hump, facial plethora, or purple striae. Exclusion criteria were use of corticoids during the last year, pregnancy, cancer, or infectious disease.

Cushing’s syndrome group (n = 26)

Twenty-six patients were referred with clinical history and signs suggesting CS. To be included, they had to have either two elevated 24-h urinary free cortisol (UFC) measurements, or a serum cortisol >50 nmol/l after a 1 mg overnight dexamethasone suppression test (DST) together with one elevated UFC.

A definitive diagnosis of Cushing’s disease was confirmed by the histological study of a pituitary adenoma in 15 patients. One patient had a petrosal sinus sampling suggestive of an ectopic tumor. Three patients had neuroendocrine carcinoma confirmed by histological study. The diagnosis of adrenal Cushing’s was confirmed in two patients with adrenocorticotropin (ACTH)-independent Cushing’s syndrome and hypocortisolism after adenoma resection. One patient had an ACTH-dependent Cushing’s syndrome of undetermined etiology, with UFC persistently three times as high as the normal values and DST of 324 nmol/l.

We also included four patients with suspicion of recurrence of Cushing’s disease because of elevated UFC or DST or both. To be included, patients had to have histological confirmation of corticotroph adenoma in the second surgery (Table 1).

Table 1 Characteristics of 26 Cushing’s patients included in the study

Methods

All participants were instructed to collect urine for UFC and two saliva samples, one at 23:00 on each of two consecutive nights (LNSC1 and LNSC2), with a commercial Salivette device (Sarstedt No. 51.1534.500). Patients were taught how to collect saliva using the cotton swab from the Salivette tubes. They were asked to keep the cotton swab under the tongue for 1–2 min and then place it back in the plastic container. Brushing their teeth, smoking, eating, or drinking anything but water was prohibited for at least 120 min prior to sampling. LNSC2 was collected on the same day as the UFC.

Samples of urine and saliva were centrifuged for 5 min at 3,600 rpm. They were then aliquoted and stored at −20°C until assayed. In all cases, a brief medical enquiry was recorded. Blood samples were taken from suspected patients and healthy volunteers on the morning they delivered their samples. All healthy volunteers had normal serum glucose, creatinine, transaminases (alanine transaminase/aspartate transaminase), and C-reactive protein.

Patients in the suspected group were evaluated with DST in addition to UFC and LNSC.

Assays

Adrenocorticotropin was measured after 30 min of rest and before 10:00 am by ECLIA (IMMULITE® 2000, Siemens). Normal values are 2.22–13.32 pmol/l.

LNSC, serum cortisol, and UFC were measured on a Roche Modular EP170 automated analyzer (Roche Diagnostics GmbH, Mannheim) according to the manufacturer’s specifications. The cortisol assay is a competitive ECLIA with a measurement range of 0.496–1749 nmol/l and an analytical sensitivity of 0.50 nmol/l. Inter-assay precision is 11.6% at 5.3 nmol/l, 5.4% at 59.3 nmol/l, and 5.0% at 279 nmol/l. We characterized the analytical performance of the cortisol assay in saliva to understand its capabilities and limitations and to confirm that it was suitable for this purpose. The lower limit of quantification (LoQ), defined as the lowest concentration at which the coefficient of variation (CV) is below a target of 20%, was estimated following the latest edition of the EP17-A protocol. In brief, eight saliva specimens, with mean measured concentrations from 1.6 to 276 nmol/l, were assayed repeatedly to obtain an imprecision profile in which the CV (relative standard deviation) was plotted against the mean concentration of the analyte. The function of the profile curve and its confidence limits were assessed with the computer program EP Evaluator® 9.2.430. Based on the fitted model, the estimated LoQ (functional sensitivity) was 2.7 nmol/l, with a CV of 18.1% (95% CI 16.1–20%).
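The imprecision-profile idea can be illustrated with a minimal Python sketch. It is not the EP Evaluator procedure used in this study: the fitted profile model is replaced here by simple linear interpolation between adjacent specimens, and the function names (`cv_percent`, `imprecision_profile`, `estimate_loq`) are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(replicates):
    """Coefficient of variation (relative SD) of repeated measurements, in %."""
    return 100.0 * stdev(replicates) / mean(replicates)


def imprecision_profile(specimens):
    """Return (mean concentration, CV%) pairs, sorted by concentration.

    `specimens` is a list of replicate lists, one list per specimen.
    """
    return sorted((mean(reps), cv_percent(reps)) for reps in specimens)


def estimate_loq(profile, target_cv=20.0):
    """Lowest concentration at which the CV reaches the target.

    Assumption: linear interpolation between the last point above the
    target and the first point at or below it (a simplification of the
    fitted-curve approach). Returns None if the target is never reached.
    """
    previous = None
    for conc, cv in profile:
        if cv <= target_cv:
            if previous is None:
                return conc  # lowest specimen already meets the target
            conc0, cv0 = previous
            return conc0 + (cv0 - target_cv) * (conc - conc0) / (cv0 - cv)
        previous = (conc, cv)
    return None
```

A fitted model, as used in the study, smooths specimen-to-specimen noise and yields confidence limits, which this interpolation does not.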

Statistical analysis

The χ2 test was used to compare qualitative variables, the Mann−Whitney test to compare quantitative variables, and the Spearman coefficient to assess correlations. The Bland–Altman plot was used to assess the absolute variability between measurements of LNSC1 and LNSC2. Sensitivity and specificity at different cut-off values for LNSC were obtained from receiver operating characteristic (ROC) curves, and overall accuracy was summarized by the area under the curve (AUC). All analyses were two-sided, and P values <0.05 were considered significant. Calculations were performed using the SPSS software package version 15.0 (SPSS, Inc., Chicago, IL).
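For readers unfamiliar with the Bland–Altman approach, the quantities behind the plot can be sketched in a few lines of Python. This is an illustration, not the SPSS procedure used here. The function name `bland_altman` is hypothetical, and the 1.96 SD limits of agreement are the conventional Bland–Altman summary rather than a statistic reported in this study.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Bland-Altman summary for paired measurements (e.g. LNSC1, LNSC2).

    Returns the per-pair means (x axis), the per-pair differences
    second - first (y axis), the mean difference (bias), and the
    conventional 95% limits of agreement (bias +/- 1.96 SD).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series of at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    means = [(a + b) / 2 for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }
```

Plotting `diffs` against `means` gives the scatter; a spread that widens with the mean indicates that the absolute difference depends on the magnitude of the measurement.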

Results

LNSC in Cushing’s patients, suspected group, and healthy volunteers

Baseline and clinical characteristics of the Cushing’s patients, suspected group, and the healthy volunteers are presented in Table 2. All patients included in the suspected group had two normal UFC and one DST <50 nmol/l, so CS diagnosis was excluded.

Table 2 Baseline and clinical characteristics of Cushing’s patients, suspected group, and healthy volunteers

Cushing’s patients had significantly higher UFC, LNSC1, and LNSC2 than the suspected group (all P < 0.001) (Table 2). No differences were found between the healthy volunteers and the suspected group in UFC (P = 0.764), LNSC1 (P = 0.913), or LNSC2 (P = 0.698), but the healthy volunteers were older than the suspected group (P = 0.009).

We did not find a correlation between BMI and the highest LNSC, either in the suspected group (Spearman rho = +0.067, P = 0.705) or in the HV group (Spearman rho = +0.043, P = 0.738). Age and LNSC were not correlated.

Intra-individual variability of LNSC

The Bland–Altman plot was used to assess absolute variations between LNSC1 and LNSC2 in the entire participating group (Fig. 1). The overall mean difference (LNSC2−LNSC1) was +1.086 nmol/l; nevertheless, the scattering of the differences increased as the LNSC average increased. The mean difference among patients in the suspected group was +0.223 nmol/l (range −2.015 to +5.104), while it was +2.249 nmol/l (range −27.590 to +40.833) in the CS group (Fig. 1).

Fig. 1 Bland–Altman plot of absolute variation of late night salivary cortisol. This figure shows that the scattering of the differences increases as the LNSC average increases. The plot distinguishes suspected patients and Cushing’s patients

The median intra-individual absolute variability was 22% (0–90.5%) in healthy volunteers, 32% (0–144%) in the suspected group, and 51% (1.6–156%) in the CS group. Variability was higher among CS patients than among healthy volunteers (P < 0.001) and suspected patients (P = 0.05) (Fig. 2).

Fig. 2 Intra-individual variability of LNSC in healthy volunteers, suspected group, and patients with CS. Patients with Cushing’s had higher variability than healthy volunteers and suspected patients (Mann−Whitney test). The horizontal bar represents median of variability
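One way to compute a percentage variability between two samples is sketched below. The exact formula used in this study is not stated in the text, so the definition here (absolute difference divided by the mean of the two samples) is an assumption, and the function names are hypothetical.

```python
from statistics import median


def percent_variability(lnsc1, lnsc2):
    """Absolute difference between two samples as a % of their mean.

    Assumption: variability = 100 * |LNSC1 - LNSC2| / mean(LNSC1, LNSC2).
    With this definition the value ranges from 0 to 200%.
    """
    average = (lnsc1 + lnsc2) / 2
    if average == 0:
        return 0.0  # both samples are zero: no measurable variation
    return 100.0 * abs(lnsc1 - lnsc2) / average


def group_median_variability(pairs):
    """Median percentage variability over (LNSC1, LNSC2) pairs of a group."""
    return median(percent_variability(a, b) for a, b in pairs)
```

Group medians computed this way can then be compared with a rank-based test such as the Mann−Whitney test.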

The accuracy of one or two measurements of LNSC as a CS screening test

In clinical practice, suspected patients undergo screening tests, so we compared the performance of LNSC measured on day 1 with that of the highest LNSC in suspected and CS patients. The AUC (area under the ROC curve) of LNSC1 was 0.945 (95% CI 0.880–1.004); when the highest value of LNSC for each patient was considered, the AUC was 0.980 (95% CI 0.954–1.007).
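The AUC has a direct interpretation: it is the probability that a randomly chosen CS patient has a higher LNSC than a randomly chosen non-CS patient, with ties counted as one half. The following Python sketch computes it from that definition. It is illustrative only and not the SPSS routine used in this study. The function name `auc_by_pairs` is hypothetical.

```python
def auc_by_pairs(cases, controls):
    """Area under the ROC curve from pairwise comparisons.

    Counts, over all (case, control) pairs, how often the case value is
    higher (1 point) or tied (0.5 point), divided by the number of pairs.
    This equals the Mann-Whitney U statistic scaled to the 0-1 range.
    """
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    score = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                score += 1.0
            elif case_value == control_value:
                score += 0.5
    return score / (len(cases) * len(controls))
```

To compare one sample with the highest of two, the function would be applied once to the LNSC1 values and once to the per-patient maximum of LNSC1 and LNSC2.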

Given that the objective of a CS screening test is to achieve maximum sensitivity, we used the most stringent cut-off. With one measurement of LNSC, a cut-off value of 1.9 nmol/l had a sensitivity of 100% and a specificity of 40%. Considering the highest measurement of LNSC, a cut-off value of 4.2 nmol/l had a sensitivity of 100% and a specificity of 83% (Table 3).

Table 3 Sensitivity and specificity of late night salivary cortisol for different cut-off values
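The comparison between a single sample and the highest of two samples at a given cut-off can be sketched as follows. This is an illustration under stated assumptions: a value strictly above the cut-off is treated as a positive test (the text does not say whether the boundary is inclusive), and the function names are hypothetical.

```python
def sensitivity_specificity(values, has_cs, cutoff):
    """Sensitivity and specificity of `value > cutoff` as a positive test.

    `values` are LNSC results (nmol/l); `has_cs` are the matching true
    diagnoses (True for CS). Assumes strictly-above-cut-off is positive.
    """
    tp = sum(1 for v, d in zip(values, has_cs) if d and v > cutoff)
    fn = sum(1 for v, d in zip(values, has_cs) if d and not v > cutoff)
    tn = sum(1 for v, d in zip(values, has_cs) if not d and not v > cutoff)
    fp = sum(1 for v, d in zip(values, has_cs) if not d and v > cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one CS and one non-CS subject")
    return tp / (tp + fn), tn / (tn + fp)


def highest_of_two(pairs):
    """Per-subject maximum of the two samples (LNSC1, LNSC2)."""
    return [max(a, b) for a, b in pairs]
```

Taking the maximum can only raise each subject’s value, so at a fixed cut-off it can increase sensitivity and decrease specificity, never the reverse. This is why the highest-value strategy is paired with a higher cut-off (4.2 rather than 1.9 nmol/l) in the results above.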

Reproducibility of LNSC

Reproducibility is the ability of a test to yield the same result when repeated; in this case, to consistently diagnose CS (both samples elevated) or exclude it (both samples normal). With a cut-off value of 4.2 nmol/l, we found discordant results in 6, 23, and 11% of participants in the healthy volunteer, suspected, and CS groups, respectively.

Overall results showed that 26% of patients in the suspected group and 17% of healthy volunteers had at least one elevated LNSC, while the three patients in the CS group with one normal LNSC had mild CS (UFC <2 times normal value).
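The discordance rate described above can be expressed as a short Python sketch. It assumes that a result strictly above the cut-off counts as elevated, and the function name `discordance_rate` is hypothetical.

```python
def discordance_rate(pairs, cutoff):
    """Fraction of subjects whose two samples fall on opposite sides
    of the cut-off (one elevated, one normal).

    `pairs` holds (LNSC1, LNSC2) values in nmol/l. Assumes a value
    strictly above the cut-off is classified as elevated.
    """
    if not pairs:
        raise ValueError("pairs must be non-empty")
    discordant = sum(1 for a, b in pairs if (a > cutoff) != (b > cutoff))
    return discordant / len(pairs)
```

Applied per group with the 4.2 nmol/l cut-off, this gives the proportion of participants whose classification would change depending on which night was sampled.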

Discussion

Failure to diagnose CS can have severe consequences for a patient, with implications for morbidity and mortality. A screening test should be safe, cheap, and offer the maximum possibility of detecting CS. Our study found significant improvements in the diagnostic accuracy of the LNSC measurement by obtaining two samples.

The need for a second sample depends on the intra-individual variability of LNSC. Viardot et al. [6] reported a variability of 22% between two samples of LNSC collected from healthy subjects on different days. Cardoso et al. [7] found a variability of 17%, with no differences between healthy subjects and CS patients. Finally, Nunes et al. [4], using the same RIA assay, found a variability of 35% between two samples, suggesting a lower reproducibility than that reported by Cardoso et al. We found that variability between two consecutive LNSC measurements in CS patients is higher than previously reported [7] and significantly higher than in healthy volunteers (51 vs. 22%, P < 0.001), whereas variability in healthy volunteers is similar to that in previous publications [6, 7]. We cannot know whether the differences are secondary to assay variability or to variability in tumor secretion. Cardoso et al. [7] evaluated variability in CS, but using an RIA assay and not ECLIA, so methodological differences could explain our results. We did not find a correlation between the value of LNSC and variability, so the variation is not secondary to the severity of CS. Variability of tumoral secretion could also explain our results, as cortisol secretion in CS is not always constant and can even be cyclic.

Clinical conditions suggestive of CS can cause a pseudo-Cushing state. However, none of our patients referred with suspicion of CS had abnormal UFC or DST. Most previous reports are retrospective, were performed in specialized centers, and included patients who were initially referred with at least one elevated UFC or DST [11, 14]. In some cases, inclusion criteria included not only clinical signs of CS but also frank diabetes, hypertension, and mood disorders [20]. Putignano et al. [14] included patients with excessive alcohol intake, poorly controlled diabetes, and severe depression. Our patients were prospectively recruited and referred by their physician only because of clinical suspicion. As seen in Table 4, obesity was the principal definitive diagnosis. Similar to Nunes et al. [4], none of the obese patients had abnormal DST or UFC.

Table 4 Characteristics of 35 suspected patients included in the study

Reproducibility is affected in both suspected and CS patients. All patients in the suspected group had two consecutive normal UFC and a normal DST, but nine of them had at least one elevated LNSC. False positive LNSC in non-CS patients has been described and can be caused by inappropriate collection time, severe stress, contamination with corticoids, and comorbidities. We specifically asked patients about pathologies known to interfere with the pituitary–adrenal axis; we excluded infection and renal or hepatic disease by biochemical evaluation and confirmed the collection time. However, we did not perform psychiatric evaluations. It is known that the hypothalamic–pituitary–adrenal axis can be activated in depressive disorders, causing pseudo-CS [21]. Follow-up of the patients with false positive LNSC showed that six were being treated for depression 1 year after inclusion in this protocol, and we cannot rule out that these patients had undiagnosed depression when they were included. We also know that comorbidity can affect LNSC measurement. Liu et al. [22] studied 187 males without CS and found that 16.6% had elevated salivary cortisol values; LNSC was significantly higher in men over 60 years of age and in diabetics than in non-diabetics. Unfortunately, our sample size was too small to evaluate whether the performance of LNSC varies with the clinical spectrum of the disease.

The accuracy of one or two samples of LNSC as a screening test for CS was recently evaluated by Zerikly et al. [8] using liquid chromatography tandem mass spectrometry (LC–MS/MS). They did not find differences in the diagnostic performance of LNSC using one or two samples, with areas under the ROC curve of 0.977 and 0.971, respectively (P = 0.64). However, the authors specified that their results should not be applied to patients with subclinical CS. Our results agree with recent recommendations, as we found improvements in the diagnostic accuracy of the LNSC measurement by obtaining two samples. One issue with ECLIA, as opposed to LC–MS/MS, is the potential for cross-reaction with synthetic steroids such as prednisone. However, ECLIA is standardized in most laboratories and does not require expensive and sophisticated equipment.

Although previous studies have evaluated the accuracy of LNSC in outpatients [6, 11] or in a suspected control group [6, 9, 11–15], few reports have compared the reproducibility of LNSC in normal [6] and Cushing’s patients [7] or evaluated one versus two measurements of LNSC as recommended by the guidelines [8]. Our results suggest that LNSC measured by ECLIA is variable, and reproducibility is affected in both CS and non-CS patients. We found significant improvements in the diagnostic accuracy of the LNSC measurement by obtaining two samples and choosing the highest value.