Introduction

The transsphenoidal approach to the sellar region was first developed for resection of sellar pathology by Schloffer in 1907 and later popularized by Cushing without the aid of lens magnification [14, 17, 60, 77]. Introduction of the operating room microscope for transsphenoidal surgery by Jules Hardy in the 1960s greatly improved intra-operative visibility and surgical outcomes [13, 40, 60]. Since around the turn of the twenty-first century, the introduction of the endoscope has allowed for improved illumination and panoramic visualization of the anterior skull base, with many skull base centers rapidly adopting this new technology [14, 46].

Despite this, the choice between endoscopic transsphenoidal surgery (eTSS) and microscopic transsphenoidal surgery (mTSS) remains controversial in the neurosurgical community, and no randomized controlled trial has compared the two approaches in terms of efficacy or safety. Whereas mTSS requires either a sublabial incision or removal of the nasal septum, eTSS is most frequently performed transnasally with less disruption of the nasal anatomy [43, 57]. Perhaps as a result, some studies have shown that mTSS may be associated with a longer postoperative hospital stay than eTSS [36]. On the other hand, the majority of endoscopic approaches utilize two-dimensional endoscopic lenses and are associated with a considerable learning curve [3, 12, 49, 58]. Some experts have also claimed that eTSS operations may last longer or result in higher rates of postoperative cerebrospinal fluid (CSF) leak than mTSS [66, 79]. Overall, no true consensus exists, and many factors may play a role in choosing between the modalities. Patient care could be improved by a more uniform practice and more objective comparative data.

With regard to surgical outcomes, gross total resection (GTR) remains of key importance, particularly for functioning adenomas. The presence of residual disease can necessitate adjuvant medical therapy or radiosurgery and place the patient at a greater future risk of visual decline or pituitary dysfunction. Although previous systematic reviews and meta-analyses have failed to show a significant difference in GTR for pituitary adenoma resection using either mTSS or eTSS [1, 36, 76, 81], we set out to update the estimated pooled rate of GTR after each method and to identify which patient and tumor-related factors were associated with higher rates of GTR.

Methods

Search strategy and paper selection

A systematic review of the literature was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to identify studies reporting GTR in patients harboring pituitary adenomas in the PubMed, Embase, and Cochrane databases [65]. A search strategy was designed in consultation with a librarian, using relevant keywords for identification of articles reporting both approaches (Appendix 1).

All databases were searched on July 25, 2017 and duplicates were removed. All articles were screened for title and abstract relevance by two authors, independently, to identify articles reporting GTR for mTSS, eTSS, or both. Discrepancies in study selection were resolved by discussion and consultation with a senior author. Selected articles were subject to full-text screening. Only articles that reported GTR specifically for pituitary adenomas were included. Case reports, commentaries, abstracts, reviews, animal studies, studies with an endoscopically assisted approach or extended approach, studies in pediatric patients (< 18 years old), re-operations, and cadaveric studies were excluded. Only literature in English was reviewed.

Data extraction

Study characteristics were extracted from the full text of selected studies including authors, publication year, country of origin, sample size, study design, and duration of study. Patient characteristics were extracted including number, sex, age, type of pituitary adenomas (nonfunctioning pituitary adenomas [NFPA] vs. functioning pituitary adenomas [FPA]), histological type, number of macroadenomas, number of microadenomas, surgery type, and rate of GTR.

Meta-analysis

Data analysis was performed using Comprehensive Meta-Analysis (CMA) version 3 (Copyright 1998–2014. Biostat, Inc.). The fixed-effect model using the inverse variance method was used to obtain the overall rate and the 95% confidence intervals. The random-effects model that accounts for the within- and the between-study variances according to the method of DerSimonian and Laird was also used for comparison [25]. Pooled rate estimates of GTR together with 95% confidence intervals were used to assess the efficacy of transsphenoidal surgery among patients with any pituitary adenoma, FPA, and NFPA [25]. Heterogeneity was evaluated among studies by using Cochran’s Q test (P < 0.10) and I2 percentage. An I2 value > 50% was considered to be high [41]. Potential sources of heterogeneity were explored using subgroup analyses by categorical covariates: surgery type (eTSS; mTSS), tumor type (FPA vs. NFPA), continent (Asia, Australia, Europe, North America, South America), center (single vs. multiple), surgeon (single vs. multiple), male percent (high, defined as ≥ median value of 50%, vs. low < 50%), age in categories (25–35, 36–40, 41–50, 51–55, 56–60, and 61–65), study design (cohort; case series), microadenoma (low percent, defined as < median percent; high percent, defined as ≥ median percent), macroadenoma (low percent; high percent), FPA type (ACTH-producing, GH-producing, and prolactinoma), and publication after 2000. It is important to note that the p-interaction resulting from the subgroup analyses should be interpreted with caution because the original studies are case series and comparing two groups of studies based on a specific covariate will not resolve all the other potential differences among the studies being compared. Meta-regression was conducted on continuous covariates including international journal impact factor and year of publication.

Publication bias was assessed using funnel plots, Egger’s linear regression test, and Begg and Mazumdar’s rank correlation test. If publication bias was identified, the number of missing studies was evaluated by the trim-and-fill method. A P value < 0.05 was considered significant except where otherwise specified.
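
The pooling and heterogeneity steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CMA implementation: it assumes that GTR proportions are pooled on the logit scale (a common choice for rates that is not stated in the text), and the study counts, function names, and variable names are hypothetical.

```python
import math

def logit_effects(events, totals):
    """Per-study logit proportions and their approximate variances.

    Assumes 0 < events < total for every study (no continuity correction).
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        effects.append(math.log(e / (n - e)))
        variances.append(1.0 / e + 1.0 / (n - e))
    return effects, variances

def pool(effects, variances):
    """Inverse-variance fixed-effect and DerSimonian-Laird random-effects pooling."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I2: share of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird moment estimate of the between-study variance
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    random_est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    return {
        "fixed": fixed, "fixed_se": math.sqrt(1.0 / sw),
        "random": random_est, "random_se": math.sqrt(1.0 / sw_re),
        "Q": q, "I2": i2, "tau2": tau2,
    }

def inv_logit(x):
    """Back-transform a logit to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical studies: patients with GTR and total patients per study
events = [45, 80, 30, 150]
totals = [60, 100, 50, 220]
result = pool(*logit_effects(events, totals))
print("fixed-effect pooled rate:", round(inv_logit(result["fixed"]), 3))
print("random-effects pooled rate:", round(inv_logit(result["random"]), 3))
print("I2 (%):", round(result["I2"], 1))
```

When the between-study variance estimate is zero, the two models coincide; when it is large, the random-effects weights become more equal across studies and the confidence interval widens, which is one reason a difference can be significant under the fixed-effect model but not under the random-effects model.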

Results

Search results

The systematic search resulted in 1641 articles after duplicates were removed. After title and abstract screening, 1514 articles were excluded, resulting in 127 articles for full text evaluation. After full-text screening, a total of 57 case series were included in the meta-analysis, with a total of 7896 patients who had undergone surgery for pituitary adenomas (Fig. 1, Appendix 2) [4,5,6,7,8,9,10,11, 15, 16, 18,19,20,21,22,23, 27, 30,31,32,33, 35, 37,38,39, 42, 44, 45, 47, 48, 50,51,52,53,54,55,56, 59, 64, 68,69,70,71, 74, 75, 78, 80, 82,83,84, 86,87,88,89,90,91,92]. The median percentage of males was 53.0% (range: 0–72.2%). Mean age per study ranged from 31.6 to 63.5 years (median of means = 50.0 years) (Table 1). The median percentage of macroadenomas was 86.3% (Table 2, Appendix 3). The median percentage of FPA was 47.3% (range: 0–100%).

Fig. 1. Flowchart. Study selection process of the identified studies

Table 1 : Characteristics of studies included in the analysis of gross tumor resection (GTR)
Table 2 : Patient characteristics in the selected studies

Pituitary adenomas

GTR was available for n = 8257 patients (Table 3). Using the fixed-effect model, the pooled rate of GTR among all studies was 71.0% (95%CI: 69.9–72.1%, I2 = 91.2%; P-heterogeneity < 0.01) (Table 4) [4,5,6,7,8,9,10,11, 15, 16, 18,19,20,21,22,23, 27, 30,31,32,33, 35, 37,38,39, 42, 44, 45, 47, 48, 50,51,52,53,54,55,56, 59, 64, 68, 70, 71, 74, 75, 78, 80, 82,83,84, 86,87,88,89,90,91,92]. When eTSS and mTSS were compared, GTR rate was higher in eTSS (n = 50 studies, GTR=74.0%, 95%CI: 72.6–75.3%, I2 = 92.1%; P-heterogeneity < 0.01) than in mTSS (n = 20 studies, GTR=66.4%, 95%CI: 64.5–68.2%, I2 = 84.0%; P-heterogeneity < 0.01) (Fig. 2). This difference was significant in a fixed-effect model (P-interaction < 0.01), but not in a random-effect model (P-interaction=0.40). To further assess the considerable heterogeneity in GTR observed in the pituitary adenomas overall, functioning pituitary adenomas (FPA) and nonfunctioning pituitary adenomas (NFPA) were assessed separately.
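
As a rough plausibility check of the fixed-effect interaction reported above, the subgroup comparison can be approximated from the published pooled rates and confidence intervals alone. The sketch below assumes that the intervals are symmetric on the logit scale and that the two subgroups are independent (some studies report both approaches, so this is only an approximation). With two subgroups, the squared z statistic corresponds to a between-subgroup Q statistic with one degree of freedom. Function names are hypothetical.

```python
import math

Z975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def logit(p):
    return math.log(p / (1.0 - p))

def se_from_ci(lower, upper):
    """Back out the logit-scale standard error from a 95% CI on a proportion."""
    return (logit(upper) - logit(lower)) / (2.0 * Z975)

def subgroup_z_test(p1, ci1, p2, ci2):
    """Two-sided z-test for the difference between two pooled logit rates."""
    diff = logit(p1) - logit(p2)
    se = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = diff / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Fixed-effect pooled GTR rates and 95% CIs as reported in the text
z, p = subgroup_z_test(0.740, (0.726, 0.753), 0.664, (0.645, 0.682))
print("z =", round(z, 2), "p =", p)
```

Under these assumptions the narrow fixed-effect intervals yield a large z statistic, in line with the reported P-interaction < 0.01; the wider random-effects intervals would shrink it.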

Table 3 : Result of gross tumor resection in pituitary adenoma
Table 4 : Results of gross tumor resection rate and 95% confidence interval in the following case (combine subgroups using fixed- and random-effect model)
Fig. 2. Subgroup analysis by the type of TSS, forest plot of gross tumor resection rate and 95% CI for patients with PA who had transsphenoidal surgery

Functioning pituitary adenomas

Eighteen studies reported GTR rate among FPAs (n = 1170) [5,6,7,8,9,10, 18, 21, 23, 30, 33, 35, 37, 39, 42, 45, 51, 55, 56, 74, 80, 87, 92]. Using the fixed-effect model, the overall GTR rate was 75.7% (95%CI: 73.1–78.2%, I2 = 67.5%, p-heterogeneity < 0.01). In a subgroup analysis for eTSS vs. mTSS, GTR rate was not significantly different comparing eTSS (GTR=75.8%, 13 studies) and mTSS (GTR=75.5%, five studies) using both the fixed- (P-interaction=0.92) and the random-effect models (P-interaction=0.67, Fig. 3).

Fig. 3. Subgroup analysis by the type of TSS, forest plot of gross tumor resection rate and 95% CI for patients with functional PA who had transsphenoidal surgery

All 13 studies reporting GTR after eTSS were published after 2000, and only three studies reported GH-producing adenomas as the type of FPA. Using the fixed-effect model, significant sources of heterogeneity were identified for microadenoma percent (P = 0.04; high percent: 67.6%, three studies, which had a lower GTR than studies with low percent microadenoma: 80.1%, two studies), number of centers (P = 0.01; single center: 74.9%, multiple centers: 87.2%), age (P = 0.01; 36–40 years: 82.7%, 41–45 years: 70.5%, 46–50 years: 71.5%, 56–60 years: 83.2%), and study design (P < 0.01; cohort: 66.7%; case series: 78.3%). Nonsignificant interactions were identified for continent, country, male percent, and number of surgeons (all P > 0.05). No significant sources of heterogeneity were identified using the random-effect model (not shown). Meta-regression on journal impact factor and year of publication was not significant in either the random- or the fixed-effect model (P > 0.05 for all).

All five studies reporting GTR after mTSS were case series, conducted in a single center and published after 2000. Using the fixed-effect model, significant interactions were identified for age category (p = 0.03; category 51–55: 77.3%, one study, which had a higher GTR than each of 46–50: 72.4%, two studies; 41–45: 42.1%, one study), type of FPA (p < 0.01; one study with prolactinoma patients had a higher GTR rate of 88.9% than one study with GH-producing adenomas: 42.1%), and continent (p < 0.01; GTR in Asia: 86.7%, two studies, which was higher than in Europe: 72.4%, one study; and North America: 59.5%, two studies). Using the random-effect model, however, sources of heterogeneity could be identified only for age category (p < 0.01) and type of FPA (p < 0.01). Other variables such as continent, male percent, single surgeon, and microadenoma percentage were not significant sources of heterogeneity. Meta-regression on journal impact factor was significant in a fixed-effect model (slope = −0.74, 95% CI: −1.47 to −0.01, p = 0.046), which suggested that a lower GTR percent was associated with a higher journal impact factor, but this association was not significant in a random-effect model (P = 0.38). Meta-regression on year of publication was not significant in either model (p > 0.05 for both).
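
A fixed-effect meta-regression slope such as the one reported above can be read as the slope of an inverse-variance weighted least-squares fit of the study-level outcome on a single covariate. The following sketch illustrates that idea only; it assumes a logit-scale outcome with known within-study variances and a Wald-type z-test, the data are hypothetical, and it is not the procedure implemented in CMA.

```python
import math

def fixed_effect_meta_regression(x, y, variances):
    """Weighted least squares of study effects y on one covariate x.

    Weights are inverse within-study variances; under the fixed-effect
    model the slope variance is 1 / sum(w * (x - weighted mean of x)^2).
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return slope, intercept, se_slope, p_value

# Hypothetical data: journal impact factor, logit GTR rate, and its variance
impact_factor = [1.2, 1.8, 2.5, 3.1, 4.0]
logit_gtr = [1.30, 1.10, 0.95, 0.60, 0.40]
variances = [0.05, 0.08, 0.04, 0.10, 0.06]
slope, intercept, se_slope, p_value = fixed_effect_meta_regression(
    impact_factor, logit_gtr, variances
)
print("slope:", round(slope, 3), "SE:", round(se_slope, 3))
```

A negative slope in this setup would correspond to lower GTR rates in studies published in higher impact factor journals. A random-effects meta-regression would add a between-study variance to each weight, which typically enlarges the standard error of the slope.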

Nonfunctioning pituitary adenomas

Twenty-seven studies reported GTR for NFPA (n = 2655) [5, 7, 8, 11, 19, 20, 23, 32, 35, 45, 53, 56, 64, 78, 89,90,91]. Under the fixed-effect model, the overall GTR rate for NFPA was 67.3% (95%CI: 65.3–69.2%, I2 = 87.7%, p-heterogeneity < 0.01). In a subgroup analysis for eTSS vs. mTSS, GTR rate was significantly higher in eTSS (GTR=71.0%, 19 studies) than in mTSS (GTR=60.7%, eight studies) (P-interaction < 0.01), although this difference was not significant in the random-effect model (P-interaction= 0.13, Fig. 4).

Fig. 4. Subgroup analysis by the type of TSS, forest plot of gross tumor resection rate and 95% CI for patients with nonfunctional PA who had transsphenoidal surgery

All 19 studies reporting GTR after eTSS were conducted in a single center. Using the fixed-effect model, significant interactions were identified for the following variables: continent (P < 0.01; GTR in North America: 78.2%, six studies, which was higher than in Europe: 68.4%, six studies; Asia: 70.5%, four studies; South America: 73.3%, two studies; and Australia: 48.7%, one study); age category (P < 0.01; age category 46–50: 74.5%, two studies, which had a higher GTR than each of 51–55: 46.5%, one study, and 56–60: 73.7%, seven studies); publication after 2000 (P = 0.02; before 2000: 43.8%, one study, vs. after 2000: 71.3%, 18 studies); and study design (P < 0.01; cohort: 58.5%, three studies; case series: 73.6%, 16 studies). Nonsignificant interactions were identified for microadenoma percent (P = 0.41) and male percentage (P = 0.66). Using the random-effect model, only study design was identified as a significant source of heterogeneity (P < 0.01). Other variables such as number of surgeons were not available in many studies and were therefore not used for stratification. Meta-regression on year of publication was significant in a fixed-effect model (P < 0.01, beta: 0.03), suggesting an increased GTR with later publication year, but not in a random-effect model (P = 0.15). Meta-regression on journal impact factor was not significant in a random-effect model (P = 0.20), yet it was significant in the fixed-effect model (beta: −0.13; P < 0.01), suggesting that studies published in a higher impact factor journal tended to report a lower GTR than studies published in lower impact factor journals.

Among the eight single center studies reporting GTR after mTSS, significant interactions were identified with the following variables using the fixed-effect model: continent (P < 0.01; GTR in North America 71.8%, three studies; GTR in Europe 59.0%, five studies), age (P = 0.013; age category 71–75: 75%, one study, which had a higher GTR than 56–60: 54.4%, five studies), study design (P < 0.01; cohort: GTR=49.2%, two studies; case series: GTR=63.5%, six studies), and publication before 2000 (P = 0.019; before 2000: 41.7%, one study, after 2000: 61.6%, seven studies). However, using the random-effect model, no significant sources of heterogeneity could be identified for the following variables: continent (P-interaction=0.15), age category (P = 0.96), publication before 2000 (P = 0.141), and study design (P = 0.524). While seven studies did not report the microadenoma percentage, only one study indicated it had a higher macroadenoma percentage; six other studies reported a higher male percentage. Meta-regression on journal impact factor was significant with the fixed-effect model (slope = 0.13, 95% CI: 0.008–0.25, p = 0.04), indicating a direct association between a higher journal impact factor and a higher GTR rate, but this association was not significant in a random-effect model (p = 0.79). Meta-regression on study year was not significant in random- (P = 0.42) or fixed-effect (P = 0.41) models.

Publication bias

A symmetrical inverted funnel plot suggested the absence of publication bias in the GTR analysis for pituitary adenomas (Appendix 4). Furthermore, no significant publication bias was identified using Begg’s (P = 0.29) and Egger’s tests (P = 0.52). In the analysis for FPA, a symmetrical funnel plot suggested the absence of publication bias (Appendix 5), which was also confirmed by Begg’s (P = 0.91) and Egger’s tests (P = 0.82). In the analysis for NFPA, a slightly asymmetrical inverted funnel plot suggested the presence of publication bias where smaller studies showing a lower GTR rate could have been unpublished (Appendix 6); however, Begg’s (P = 0.11) and Egger’s tests (P = 0.07) did not indicate significant publication bias. After imputing four studies to the left of the pooled estimate using the trim-and-fill method, the new pooled GTR rate slightly decreased from 67.3% to 66.5% under the fixed-effect model.
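
For readers less familiar with Egger’s test, its core is an ordinary least-squares regression of the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero indicates funnel plot asymmetry. The sketch below is illustrative only, with hypothetical names and data, and it stops at the intercept and its standard error rather than reproducing the exact output of CMA.

```python
import math

def egger_regression(effects, std_errors):
    """Egger's regression: standardized effect on precision.

    Returns (intercept, slope, se_intercept). Needs at least three studies
    with standard errors that are not all identical.
    """
    x = [1.0 / s for s in std_errors]                   # precision
    y = [e / s for e, s in zip(effects, std_errors)]    # standardized effect
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept

# Hypothetical logit GTR rates and standard errors for five studies
effects = [0.90, 1.10, 0.70, 1.40, 0.80]
std_errors = [0.10, 0.20, 0.15, 0.35, 0.12]
intercept, slope, se_intercept = egger_regression(effects, std_errors)
# The test statistic is intercept / se_intercept, referred to a
# t distribution with n - 2 degrees of freedom (p-value not computed here).
print("intercept:", round(intercept, 3), "SE:", round(se_intercept, 3))
```

The slope of this regression estimates the underlying effect, while the intercept captures the tendency of smaller, less precise studies to report systematically different rates.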

Discussion

This meta-analysis indicates that among patients who were not randomly allocated to either approach, eTSS results in a higher rate of GTR compared to mTSS, for all pituitary adenomas and for NFPA in fixed-effect models. For all FPAs, however, eTSS does not offer a significantly higher rate of GTR in either model. Despite these significant associations, the great heterogeneity among studies reporting both approaches could not be corrected by meta-regression, indicating that the results should be interpreted with caution.

Despite detailed meta-regression by both study and patient-level characteristics, the heterogeneity between studies of both modalities could not be alleviated. Due to the relatively low quality of evidence of the included studies, which mostly consisted of retrospective case series, this heterogeneity is not surprising. Some of the reasons for the great heterogeneity may include the learning curve associated with endoscopic resection, with more and less experienced surgeons reporting significantly different rates of GTR.

One recent survey among neurosurgeons found a significant correlation between the number of pituitary adenoma resections performed and postoperative complication rates (p < 0.05) [12]. For GTR specifically, one study found a significant relation when comparing the first 40 patients with the last 40 patients in their case series (52.5 vs. 75.0%, p = 0.036), while another study only found a nonsignificant trend towards higher rate of GTR with growing experience [5, 9]. However, another study comparing an inexperienced neurosurgeon performing eTSS to an experienced neurosurgeon performing mTSS showed no significant difference in GTR (p = 0.67), suggesting that the learning curve may not always compromise GTR [89]. In a multivariate model, however, the same study showed that larger pituitary adenomas were associated with a lower extent of resection [89]. In this meta-analysis, only a difference in percentage of macroadenomas was identified as a source of heterogeneity for mTSS NFPA resection, and difference in microadenoma percentage as a source of heterogeneity for eTSS FPA resection. The lack of a significant difference in GTR for eTSS and mTSS may be explained by longer experience with mTSS, despite improved visibility with eTSS.

One other meta-analysis also reported a significant difference in GTR between eTSS and mTSS, but heterogeneity was not described (79% vs. 65% respectively, p < 0.01) [24]. Similarly, one study examining 15 cohort studies also reported a higher rate of GTR for eTSS (OR = 1.86, 95% CI: 1.36–2.54) [34]. Another study found similar results for pituitary adenomas invading the cavernous sinus (47% vs. 21% respectively, p < 0.01) [26]. Three other systematic reviews have suggested no significant differences in GTR between the two modalities [76, 79, 81].

Although it remains unclear which of the two treatment modalities, eTSS or mTSS, is superior for GTR, other factors may also play a key role in outcomes for patients with pituitary adenomas, and this meta-analysis cannot fully address these concerns. For example, eTSS may be associated with shorter length of stay and lower costs [24, 36, 73, 76]. Other experts have suggested, however, that eTSS, which generally requires longer operative times, may adversely affect both patient and financial outcomes [66]. Furthermore, one meta-analysis found an association between eTSS and vascular complications when compared to mTSS (1.58 vs. 0.50%, p < 0.01) [1]. Proposed reasons for this difference include more aggressive surgical excision in patients undergoing eTSS, perhaps due to the superior visualization permitted by this modality. Other patient-related factors that may influence the choice include quality of life and visual improvement after surgery, which have not been directly compared between the modalities [62, 85], although one meta-analysis showed that eTSS was associated with more postoperative visual improvement [24]. Remission of hypersecretion of FPA may also form an indication for either of the modalities, although one meta-analysis showed a nonsignificant difference [36, 67]. This is particularly relevant because for most FPAs the main goal of surgery is to achieve hormonal recovery rather than GTR [29, 67]. Hormonal recovery could therefore be viewed as a more relevant outcome than GTR for FPA patients. Nevertheless, GTR is suggested to be predictive of hormonal recovery [42]. However, the exact contribution of GTR to the postoperative hormonal recovery rate remains to be elucidated, as many other factors also contribute to this outcome (e.g., dopamine agonists for prolactinoma) [72]. Finally, recurrence, progression-free survival, and overall survival, which were also not directly compared, could further aid decision making.
Recently, an analysis of nearly 6000 operations demonstrated that eTSS was associated with higher rates of complications, longer postoperative hospital stays, and increased costs when compared to mTSS. It is important to remember that these economic factors may also play a role in decisions regarding methodological choice, beyond just patient- and prognosis-related variables [2].

Strengths of this meta-analysis include the systematic search strategy and fully updated reference list. This is the largest meta-analysis conducted to date on this topic, and the second to identify a significant difference in GTR between the two modalities [24]. Additionally, this meta-analysis reported and attempted to address heterogeneity via subgroup analysis by numerous study and patient-level characteristics.

Limitations of this meta-analysis include the high heterogeneity identified among the studies for both eTSS and mTSS. Additionally, odds ratios or relative risks could not be calculated due to the study design of the included studies, as the vast majority were retrospective case series. Furthermore, due to inconsistent reporting among the studies included, meta-regression by Knosp score, Hardy-Wilson tumor grading, or asymmetric suprasellar extension was not possible [26]. Furthermore, using both fixed- and random-effect models may help determine the true difference in GTR, but a random-effect model is often not significant when a fixed-effect model is. As with any meta-analysis, its strength is determined only by the strength of the studies included within it. The literature on this topic mostly consists of retrospective case series of varying size; thus, pooled analysis is limited in showing causality. Furthermore, it was not possible to incorporate surgeon experience, which may also influence GTR rate [5, 9]. Surgical outcomes after giant pituitary adenoma resection could not be compared separately, as only outcomes after eTSS were reported in five studies and after eTSS, mTSS, and craniotomy in one study [16, 18, 21, 48, 54, 68]. The latter study suggests that eTSS results in a significantly higher GTR rate among giant pituitary adenomas [18]. This study also examined only GTR and not the many other factors that determine selection of surgical modality. GTR is an important but limited marker for surgical success, especially when resecting FPAs, for which hormonal recovery determines surgical success, and when stereotactic radiosurgery (SRS) is available [28, 67]. This limits the implications of this meta-analysis for FPA patients.

As the technology for eTSS continues to advance, it is likely that eTSS will continue to displace mTSS as the primary approach for sellar lesions, regardless of whether carefully collected evidence indicates superiority. The gold standard for comparison between the two modalities would of course be a prospective, randomized, controlled trial comparing eTSS to mTSS for a large number of patients, as suggested by the IDEAL (Idea, Development, Exploration, Assessment, Long-term Follow-up, Improving the Quality of Research in Surgery) Framework [63]. The IDEAL criteria require careful introduction accompanied by prospective evaluation for initial patients. This should then be followed by a randomized controlled trial to show true benefit [63]. There are many reasons why such a study is unlikely to occur, including surgeon preference and difficulties with patient enrollment. In light of these difficulties and the unlikelihood of such high-quality data, meta-analyses of currently existing studies represent the highest quality data available. Further studies may be improved by focusing on smaller subsets of these reports with the aim of reducing heterogeneity and identifying more granular differences between the two approaches. Furthermore, a focus on evaluation of outcomes relevant to patients, such as hormonal recovery for FPA, visual recovery, and quality of life, is of vital importance. Also, alternative trial designs may aid in finding methodologically sound ways of comparing these surgical modalities [61, 63].

Conclusion

The pooled GTR rate in all pituitary adenoma patients undergoing eTSS (74.0%) was significantly higher than the GTR rate in patients undergoing mTSS (66.4%) in a fixed-effect model. For NFPA, eTSS also resulted in a significantly higher GTR rate (71.0%) than mTSS (60.7%) in a fixed-effect model. However, none of these differences were significant in random-effect models. A direct comparison between the two modalities was impossible due to the high heterogeneity among studies.