Introduction

The Royal Library of Alexandria, founded in the third century BC, was one of the first and the greatest repositories of medical literature [1]. The library is thought to have been destroyed in an attack by Julius Caesar in 48 BC with the resulting loss of some 400,000 parchments. If we were to repopulate that collection with current academic articles, it would take only 3 months before the shelves were overflowing: in the past 12 months, some 1.26 million articles have been added to PubMed. For the average clinician, keeping up to date with all the relevant literature has become an almost impossible task.

The sheer breadth of data drove the development of systematic reviews and meta-analyses, which synthesise a meaningful conclusion from multiple trial outcomes. These first appeared in the 1970s, with early examples including a review of the use of vitamin C for the common cold [2]. Paradoxically, these tools have become so popular that they have contributed to the boom in article publishing. Between 1991 and 2014, there was a 153% increase in the total number of articles indexed on PubMed; by contrast, there was a 2728% increase in the number of systematic reviews and a 2635% increase in meta-analyses [3]. Rather than condensing the academic literature, reviews of the literature have disproportionately added to the volume of scientific publishing.

In comparison to pathologies such as coronary artery disease or neoplasms, chronic rhinosinusitis (CRS) is relatively “small fry” in the field of publishing. The comparative lack of epidemiological-scale studies into the causes, treatment, and outcomes of CRS means that we are dependent on good quality scientific reviews to gather small study groups together and draw robust, evidence-based conclusions. Initiatives to improve the academic rigour of reviews include reporting checklists such as PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) [4], which is largely aimed at guiding reviews of randomised data, and MOOSE (Meta-analysis of Observational Studies in Epidemiology) [5], a tool for reporting observational or non-randomised data.

The sheer volume of reviews that the scientific reader must keep up to date with may be daunting, but of equal concern is the quality of these reviews. With this in mind, we set out to appraise the available reviews in CRS, both in number and in quality.

Methods

A PubMed search was performed using the following search terms and filters:

  • “Chronic rhinosinusitis” (filters: review; systematic review; meta-analysis)

  • “Chronic rhinosinusitis” AND “review”; “systematic review”; “meta-analysis”

  • “Nasal polyp” OR “nasal polyposis” (filters: review; systematic review; meta-analysis)

  • “Nasal polyp” OR “nasal polyposis” AND “systematic review”; “meta-analysis”

The ten most recent English language systematic reviews and meta-analyses of both “chronic rhinosinusitis” and “nasal polyp*” were identified. The journal impact factor (JIF) was identified for each article from Journal Citation Reports: JCR Science Edition 2010, and number of citations from the PubMed record, and the mean JIF and number of citations were recorded for each group of articles. Additionally, each article was analysed for its adherence in reporting the items recommended by the PRISMA checklist. For the purpose of this study, “adherence to the principles of PRISMA” was defined as containing more than 90% of the items on the checklist (more than 24 items from a total of 27).
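The adherence rule and the handling of missing impact factors can be expressed as a minimal sketch. The function names below are hypothetical and are not taken from the study; the only values drawn from the text are the 27-item checklist and the "more than 90%" threshold.

```python
from statistics import mean
from typing import Optional, Sequence

PRISMA_TOTAL_ITEMS = 27  # checklist length stated in the Methods


def adheres_to_prisma(items_reported: int) -> bool:
    """Return True if strictly more than 90% of the 27 items are reported.

    Integer arithmetic is used (items * 10 > 27 * 9) so the comparison
    is exact; this is equivalent to requiring at least 25 items.
    """
    if not 0 <= items_reported <= PRISMA_TOTAL_ITEMS:
        raise ValueError("items_reported must be between 0 and 27")
    return items_reported * 10 > PRISMA_TOTAL_ITEMS * 9


def mean_jif(jifs: Sequence[Optional[float]]) -> Optional[float]:
    """Mean journal impact factor, skipping articles with no JIF (None).

    Assumption: articles without a JIF are excluded from the mean
    rather than counted as zero.
    """
    available = [j for j in jifs if j is not None]
    return mean(available) if available else None
```

Under this reading, an article reporting 24 items (88.9%) is classed as non-adherent and one reporting 25 items (92.6%) as adherent.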

Results

Chronic Rhinosinusitis

The results of the PubMed search for CRS are described in Table 1 and illustrated in Fig. 1.

Table 1 Number of systematic reviews and meta-analyses published pertaining to “chronic rhinosinusitis”
Fig. 1 Trends over time of review publishing in chronic rhinosinusitis (CRS). SR systematic review. Number of published reviews shown

The characteristics of the ten most recent English language articles identified in the PubMed search using CRS search terms are described in Table 2.

Table 2 Characteristics of reviews published pertaining to chronic rhinosinusitis (no journal impact factor (JIF) was available for Laryngoscope Investig Otolaryngol, resulting in exclusion from the JIF analysis of *, 2 articles; #, 3 articles; and ^, 1 article)

Nasal Polyps/Polyposis

The results of the PubMed search for nasal polyps/polyposis are described in Table 3 and illustrated in Fig. 2. The characteristics of the ten most recent English language articles identified are described in Table 4.

Table 3 Number of systematic reviews and meta-analyses published pertaining to nasal polyps or polyposis (search term “nasal polyp” OR “nasal polyposis”)
Fig. 2 Trends over time of review publishing in nasal polyps/polyposis. SR systematic review. Number of published reviews shown

Table 4 Characteristics of reviews published pertaining to nasal polyps or polyposis (search term “nasal polyp” OR “nasal polyposis”). JIF journal impact factor

Discussion

Number of Articles Published

The publication of reviews in CRS mirrors that of the wider scientific field, with ballooning numbers over the last decade. The earliest identified article was a 1977 narrative review of aspirin intolerance and nasal polyps [6]. Since that paper, there has been a startling acceleration in the number of reviews published, with almost all analysed search fields seeing more reviews published in the last 5 years than in the 30-year period between that 1977 article and 2007.

Analyses of scientific publishing have suggested that the rate of increase of articles published per year has begun to taper off [7], but our analysis suggests that reviews in rhinology show no sign of going out of fashion. The root cause of this boom in review publishing is unclear; proposed reasons include an increasing number of journals and a perception of systematic reviews as being safe, uncontentious publications that are unlikely to fall prey to research fraud. In the wider scientific field, some have linked the increase in reviews with researchers located in China whose funding is intimately linked with the number of papers published [3]. However, this does not appear to be a factor in CRS publishing, as our analysis demonstrated that the vast majority of publications were in the English language by first authors located in Europe and the USA. The reason for the spiralling numbers of reviews may be debatable; what is clear is that the trend for publishing reviews of evidence in CRS and nasal polyps shows no signs of abating.

Type of Articles Published

Even in the context of this spiralling number of SRs and meta-analyses, their number is dwarfed by non-systematic, or narrative, reviews, with nearly three times as many identified in CRS search terms and five times as many in nasal polyposis. These reviews tend to be cited far less often than their more rigorous counterparts, although they are published in journals with comparable impact factors. This suggests that these articles may be read widely and published in reputable journals, but on the whole, they tend not to be referenced as keystone pieces of evidence. This is most likely attributable to their methodology: systematic reviews and meta-analyses attempt to gather all available literature and synthesise an independent conclusion from the aggregated evidence, whereas narrative reviews tend to reflect an author’s interpretation of the evidence and may be prone to significant selection bias. This expert opinion is often useful and informative for clinical practice, which may explain their popularity in publication, but tends not to advance the scientific discussion. Systematic reviews also outnumber meta-analyses, possibly because the rigorous study selection process and advanced maths required for meta-analysis make them less attractive to perform. Another problem inhibiting meta-analysis is the marked heterogeneity in outcomes used in CRS trials, which prevents pooling of data [8].

Quality of Reviews

In contrast to narrative reviews, SRs and meta-analyses do attempt to contribute to scientific discussion through robust methodology that evaluates all available evidence. Protocols for performing these types of review have been suggested by bodies such as the Cochrane Collaboration (http://training.cochrane.org/handbook), while many journals insist upon use of checklists such as PRISMA. Despite this, it is debatable whether the increase in the number of scientific reviews has been matched by maintenance of their quality. In our small sample, 15% of the SRs and meta-analyses did not adhere to the PRISMA checklist (in fact, several reviews that stated they were performed according to the checklist contained only half of the suggested items). Recognition of the increasing contribution of scientific reviews has led to tools for their independent evaluation, such as AMSTAR2 (A MeaSurement Tool to Assess systematic Reviews) [9••], which seeks to grade these articles on the basis of any minor or critical flaws that may be present. There is a degree of interpretation to AMSTAR2 (although there is good inter-observer reliability), and its authors have sought to avoid making it a scoring system, to reflect the fact that one critical flaw may completely undermine a review, whereas robust conclusions may be possible despite a number of minor flaws. As the number of scientific reviews increases, principles for evaluating their value such as those espoused by AMSTAR2 will become increasingly valuable to the reader.
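The distinction between critical and minor flaws can be illustrated with a short sketch. It assumes the four-level overall confidence rating set out in the published AMSTAR 2 guidance, as we understand it; the function name and the use of simple flaw counts as input are our own simplifications, and in practice deciding which items count as critical requires appraiser judgement.

```python
def amstar2_confidence(critical_flaws: int, non_critical_weaknesses: int) -> str:
    """Map flaw counts to an overall confidence rating (illustrative only).

    Assumed scheme, following the AMSTAR 2 guidance as we understand it:
      high            no critical flaw, at most one non-critical weakness
      moderate        no critical flaw, more than one non-critical weakness
      low             exactly one critical flaw
      critically low  more than one critical flaw
    """
    if critical_flaws < 0 or non_critical_weaknesses < 0:
        raise ValueError("flaw counts cannot be negative")
    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if non_critical_weaknesses > 1:
        return "moderate"
    return "high"
```

Note that this is a rating rule and not a summed score: a review with five minor weaknesses is still rated above a review with a single critical flaw.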

One particular limiting factor in our field of rhinology is the quality of evidence that is available to review. This reflects the difficulty in precisely defining disease endotypes [10]; the variability in endotype by population (such as neutrophil/Th1/Th17 predominant polyp disease in Chinese populations vs eosinophil/Th2 predominant polyp disease in Caucasians) [11]; and the problem, common to all surgical specialties, of the practicalities of conducting a trial where treatment arms include randomisation to operation. A Cochrane review performed in 2014 identified only four trials with a total of 231 patients suitable for inclusion when evaluating the evidence for operative or medical interventions in CRS [12]. As a consequence of the difficulty in performing RCTs in CRS, the most abundant data come from large-scale observational studies. Although observational data are useful, particularly in the context of drawing conclusions about epidemiological characteristics of CRS, the large numbers of patients that can be studied completely overwhelm the small numbers enrolled in interventional trials. Scientific reviews that include both types of study will therefore naturally have conclusions that are biased towards the findings of the observational studies. That is critical in diseases such as CRS where endotypes are not precisely defined and large observational cohorts may contain patients with many different pathophysiological processes. For example, one of the largest reviews of aspirin-exacerbated respiratory disease (AERD), the study cited more than any other in this sample, reviewed 27 studies and over 16,000 patients, but only 25% of patients and three studies were from researchers based in Asia, with the remainder from the USA and Europe, and no subgroup analysis was performed [13]. The different pathophysiological processes responsible for nasal polyps in Asian and Caucasian patients mean that a systematic review that draws conclusions from an aggregated population of these two different groups may be questionable.
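The way in which large cohorts can dominate a pooled result may be seen in a small, purely illustrative sketch. It borrows the two patient counts quoted above (231 and approximately 16,000), which come from different reviews and are combined here only to show the orders of magnitude involved. Weighting by raw sample size is a simplifying assumption; formal meta-analyses typically use inverse-variance weights, and the function name is hypothetical.

```python
def sample_size_weights(sizes: dict[str, int]) -> dict[str, float]:
    """Return each group's share of the pooled sample.

    Assumption: weight is proportional to the number of patients,
    a simplification of the weighting used in a formal meta-analysis.
    """
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("total sample size must be positive")
    return {name: n / total for name, n in sizes.items()}


# Illustrative figures only: counts quoted in this section, drawn from
# two different reviews and never actually pooled together.
weights = sample_size_weights({"interventional": 231, "observational": 16000})
for name, weight in weights.items():
    print(f"{name}: {weight:.1%}")
```

On these figures the interventional patients would contribute under 2% of the pooled sample, so any pooled estimate would be determined almost entirely by the observational data.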

Cochrane reviews are widely regarded as having a robust and rigorous methodology that lends them a high degree of credibility. It would be expected that they would be a frequently cited evidence source, but one large-scale analysis of systematic reviews in medical literature found that the Cochrane Database of Systematic Reviews and Health Technology Assessment, the two most prolific publishers of reviews, were infrequently cited compared to those published in journals such as Annals of Internal Medicine and JAMA [14]. Indeed, only one Cochrane review made their “top 50” of the most cited systematic reviews, which is particularly surprising given that access to the database is free or funded in many countries. It has been hypothesised that their rigour may lead to excessive length, with some evaluations of oncological therapies stretching to over 200 pages. By contrast, Cochrane reviews in CRS and polyposis were very highly cited, disproportionately contributing to the citation count of the ten most recent reviews of nasal polyposis. It seems that the rhinology community values the rigour and tightly defined criteria of the Cochrane reviews.

Cochrane have recently published a suite of reviews on CRS [15,16,17,18,19], evaluating the effectiveness of antibiotics, corticosteroids, and saline irrigation. Of course, treatments are rarely used in isolation, and effectiveness may change when combination therapy is used as part of “appropriate medical therapy”, whereas most RCTs evaluate only a single intervention. A major limiting factor in these reviews was the heterogeneity in outcomes used in CRS studies; a systematic review of patient-reported outcome measures for CRS identified 15 different validated disease-specific tools [20]. Cochrane therefore commissioned a project to identify the outcomes felt to be the most important from the perspective of patients and healthcare providers [8]; symptom severity, quality of life, side effects of treatment, and avoidance of surgery were considered important outcomes. This is now being formally developed into a core outcome set for CRS trials, using the COMET (Core Outcome Measures in Effectiveness Trials) methodology [21], with the aim of facilitating meta-analysis and enhancing the value of future research.

Interpreting the Evidence

Even when reviews are conducted rigorously and in accordance with best practice, there is no formula that can be applied for inputting evidence and outputting a standardised conclusion. There is always a degree of interpretation placed upon the evidence by the authors in the choice of search terms, databases, and exclusion criteria. One particularly illustrative example is that of two systematic reviews examining the use of image-guided surgery (IGS), published within months of each other, both by very well regarded groups, and yet reaching very different conclusions. The first review, by Ramakrishnan et al. [22], reached the conclusion that there was no evidence that IGS improved surgical outcomes or reduced complications from endoscopic sinus surgery (ESS). The second review, published 3 months later by Dalgorf et al. [23], concluded that there is evidence of benefit in the application of IGS to ESS. How could these two groups reach such contrasting conclusions? Both covered similar time periods and searched the same databases. However, the exclusion criteria for the two reviews were very different: Ramakrishnan et al. had very tight criteria for inclusion, excluding cadaveric studies, those where trainees performed surgeries, and any that included extended skull base approaches. As a result, they considered only six studies for review. By contrast, Dalgorf et al. had a more inclusive approach to evidence and identified 15 studies for quantitative analysis, including extended approaches where lesions were extradural and, perhaps most critically, permitting the analysis of a study where trainees performed surgery [24]. This study was the only randomised, single-blinded study of IGS and thus was heavily weighted in meta-analysis. These two reviews both draw clear and robust conclusions from the evidence that they have selected to include, but together they very neatly illustrate the subjectivity that is inherent in systematic reviews.

Accessing the Evidence

Finally, a key finding from this investigation was the variability in search results with the terms used and the method of searching. Using the PubMed filter for articles tagged as “systematic review” identified vastly more articles in both CRS and nasal polyposis; by contrast, including the search term “meta-analysis” was more effective at returning results in CRS. This highlights the difficulty that authors may have when conducting scientific reviews, requiring a variety of keywords and MeSH terms applied across a number of databases. The burgeoning field of grey literature has also been highlighted as an important component for inclusion in systematic reviews [25], but, as yet, there is no single repository for accessing this type of report. Accessibility of evidence is also affected by the journal policy on open access or subscription, which may place constraints on which authors are able to access the articles and thus perform a thorough review.

Conclusions

There has been an explosion of systematic reviews in CRS. While many of these are of great value in helping physicians practise evidence-based medicine, we still need to assess the quality of reviews and consider their search strategies and potential selection bias before implementing their findings. Core outcome sets, initiatives such as AMSTAR2, and organisations such as the Cochrane Collaboration will help ensure that SRs and meta-analyses reduce future research waste and do not become part of the problem.