Introduction

The prognosis of upper gastrointestinal cancer (UGIC) in the UK is extremely poor, with 5-year survival rates for esophageal cancer (EC) and gastric cancer (GC) of 12 and 16 %, respectively [1]. The poor prognosis of UGIC relates to it usually presenting at an advanced stage, with only one-third of UGIC subjects suitable for curative treatment [2]. The prognosis for subjects with early-stage disease, who are eligible for curative resection, has improved [3, 4] with 80 % alive at 1 year [5]. However, the prognosis for subjects with locally advanced and metastatic disease remains poor.

Selected single-institution studies in Western populations, including a total of 908 subjects, have reported that 4.6–14.0 % of UGIC subjects have had an EGD which did not identify UGIC in the 3 years prior to their eventual UGIC diagnosis [6–10]. These events can be termed post-EGD upper gastrointestinal cancer (PEUGIC) following the same principle as post-colonoscopy colorectal cancer [11]. Subjects who presented with alarm symptoms, including dysphagia, anemia, hematemesis, weight loss, or vomiting, at the time of EGD have been reported to be at increased risk of PEUGIC [7, 8]. In addition, squamous cell carcinoma in the proximal esophagus [8] and taking fewer biopsy specimens [7, 9] were reported to be associated with PEUGIC. Subject characteristics such as age and gender did not appear to affect the likelihood of PEUGIC [7–10].

Using a UK primary care dataset, the aims of this study were to determine the PEUGIC rate at a national level in an unselected sample and to identify associated risk factors for these events. The treatment and survival outcomes for PEUGIC subjects were also studied.

Methods

Study Design and Data Source

A retrospective nested case–control study was performed using The Health Improvement Network (THIN) database (Cegedim Strategic Data Medical Research UK, London). THIN is a primary care database which includes computerized anonymized longitudinal records from over 300 primary care centers in the UK [12]. Over 5 million subjects are registered with THIN primary care centers, and they are regionally and demographically representative of the UK. The data are organized by individual primary care centers, and each subject is identified by a computer-generated unique identifier within the center. Participating primary care practitioners systematically record each healthcare episode as part of their routine practice, and these records are anonymized and prospectively captured by the THIN software. No identifying information (such as name, address, date of birth, postcode) leaves the individual primary care center. Clinical diagnoses are recorded in THIN as diagnostic Read codes (diagnosis dictionary). There is a potential delay in secondary care clinical information (a new diagnosis or procedure carried out) being recorded on the primary care system and THIN. Each Read code therefore carries an “event date” (when the event occurred) and a separate “system date” (when it was recorded). The event date is backdated to the actual diagnosis or the procedure date.

Subject Definition

UGIC subjects were identified as any subjects over 18 years with either a GC code (“Appendix 1”) or EC code (“Appendix 2”) recorded in THIN between 2002 and 2009 (for GC subjects) and 2002 and 2012 (for EC subjects). The diagnosis date of GC or EC was defined as the first record of a GC or EC diagnosis code in THIN. Cases and controls with less than 36 months of retrospective follow-up available prior to their UGIC diagnosis were excluded, as it was not possible to ensure they had not undergone EGD in the 36 months prior to diagnosis. All subjects with a diagnosis of Barrett’s esophagus prior to UGIC diagnosis were also excluded to prevent confounding due to surveillance EGDs. Subjects with small intestinal cancers were not included in the study.

PEUGIC cases were defined as all UGIC subjects in the THIN cohort who underwent EGD between 12 and 36 months prior to eventual UGIC diagnosis. Controls were defined as UGIC subjects who did not undergo EGD between 12 and 36 months prior to UGIC diagnosis. Study variables were related to the “diagnostic EGD” when UGIC was diagnosed in controls and the “PEUGIC EGD,” the EGD which did not detect UGIC at least 1 year prior to eventual UGIC diagnosis, in PEUGIC subjects. If a PEUGIC subject had multiple EGDs in the 12- to 36-month period prior to their UGIC diagnosis, then the PEUGIC EGD was the EGD nearest in date to when UGIC was diagnosed. In order to take into account the potential administrative delay in primary care in UGIC diagnoses being recorded in THIN, the period within 12 months of UGIC diagnosis was excluded. The PEUGIC rates were calculated by dividing the number of PEUGIC subjects by the total number of UGIC subjects.
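The case and control definitions above can be sketched in code. This is a minimal illustration rather than the study's actual extraction logic: the function names are hypothetical, the window is measured in calendar months, and both window boundaries are assumed to be inclusive.

```python
import calendar
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`, clamping the day."""
    total = d.year * 12 + (d.month - 1) - months
    year, month = divmod(total, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def classify_subject(diagnosis_date, egd_dates):
    """Return ("PEUGIC", peugic_egd_date) or ("control", None).

    A subject is a PEUGIC case if any EGD falls 12 to 36 months before
    the UGIC diagnosis (boundaries assumed inclusive). EGDs within
    12 months of diagnosis are ignored.
    """
    window_start = months_before(diagnosis_date, 36)
    window_end = months_before(diagnosis_date, 12)
    in_window = [d for d in egd_dates if window_start <= d <= window_end]
    if not in_window:
        return ("control", None)
    # With multiple EGDs in the window, the PEUGIC EGD is the one
    # nearest in date to the UGIC diagnosis, i.e. the latest.
    return ("PEUGIC", max(in_window))
```

For a subject diagnosed on 15 June 2008 with EGDs in January 2006, March 2007, and February 2008, the sketch selects the March 2007 EGD as the PEUGIC EGD and ignores the February 2008 EGD because it falls inside the excluded 12-month period.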

Subject Demographics

Only birth years (rather than actual date of birth) are recorded in THIN, and age was therefore rounded to the nearest whole year prior to analysis. Mean age and standard deviation were calculated to analyze the effect of age. The Charlson comorbidity index was calculated using diagnostic Read codes for medical conditions recorded in THIN prior to the diagnostic EGD date in controls and the PEUGIC EGD date in PEUGIC subjects [13]. Subjects were divided into three categories: 0 (no comorbidity), 1–4 (low comorbidity) and 5 or greater (high comorbidity). Socioeconomic status was derived at aggregate level by postcode from the subjects’ place of residence. This is recorded in THIN as the Townsend deprivation index [14], and it was separated into quintiles. For the purpose of analysis, four groups were compared: the least deprived quintiles 1 and 2 combined, quintile 3, the most deprived quintiles 4 and 5 combined, and subjects with no recorded Townsend score. Where there was more than one Townsend score recorded in THIN, the recorded score closest to the diagnostic EGD or PEUGIC EGD was used for analysis.
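The comorbidity and deprivation groupings described above amount to two simple lookups. The following sketch is illustrative only; the function names and group labels are hypothetical, and a missing Townsend score is represented as `None`.

```python
def charlson_category(score: int) -> str:
    """Map a Charlson comorbidity index to the three study categories."""
    if score < 0:
        raise ValueError("Charlson score cannot be negative")
    if score == 0:
        return "none"   # no comorbidity
    if score <= 4:
        return "low"    # score 1-4
    return "high"       # score 5 or greater


def townsend_group(quintile):
    """Map a Townsend quintile (1 = least deprived) to the analysis groups."""
    if quintile is None:
        return "not recorded"
    if quintile in (1, 2):
        return "least deprived (quintiles 1-2)"
    if quintile == 3:
        return "quintile 3"
    if quintile in (4, 5):
        return "most deprived (quintiles 4-5)"
    raise ValueError("Townsend quintile must be 1-5 or None")
```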

Presenting Symptoms

Diagnostic Read codes for upper gastrointestinal symptoms (abdominal mass, anemia, anorexia, dysphagia, hematemesis or melena, gastro-esophageal reflux disease (GERD), vomiting, and weight loss) which were recorded by primary care practitioners within the 12 months prior to the diagnostic EGD or PEUGIC EGD date were extracted. Alarm symptoms or signs included abdominal mass, anemia, dysphagia, hematemesis or melena, and weight loss.
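As an illustration of how an alarm-symptom flag could be derived from symptom records, the sketch below checks whether any alarm symptom was recorded in the look-back period before the relevant EGD. The record layout (pairs of symptom label and date), the function name, and the use of 365 days to approximate 12 months are all assumptions, not details taken from THIN.

```python
from datetime import date, timedelta

# Alarm symptoms or signs as listed in the text.
ALARM_SYMPTOMS = frozenset({
    "abdominal mass",
    "anemia",
    "dysphagia",
    "hematemesis or melena",
    "weight loss",
})


def has_alarm_symptom(symptom_records, egd_date, lookback_days=365):
    """Return True if any alarm symptom was recorded in the look-back window.

    `symptom_records` is an iterable of (symptom_label, recorded_on) pairs.
    The 365-day window is an approximation of the 12-month period.
    """
    window_start = egd_date - timedelta(days=lookback_days)
    return any(
        symptom in ALARM_SYMPTOMS and window_start <= recorded_on <= egd_date
        for symptom, recorded_on in symptom_records
    )
```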

Endoscopic Findings on PEUGIC EGD

The endoscopic findings at PEUGIC EGD were extracted. The endoscopic diagnoses included esophageal stricture, esophageal ulcer, esophagitis, gastritis, gastric ulcer, duodenitis, and duodenal ulcer. The UGIC location was recorded for EC as upper or middle esophagus, lower esophagus, and location unknown and for GC as proximal, body, distal, and location unknown. In the majority of subjects, the UGIC location was not recorded in THIN; therefore, the “free text entry” attached to the diagnostic Read code was examined to extract the anatomical location where available.

Treatment Outcomes and Survival for UGIC Subjects

The number of UGIC subjects undergoing resectional surgery, chemotherapy, or radiotherapy post-UGIC diagnosis was obtained by treatment Read codes. Survival was calculated from the EC or GC diagnosis date until the end of database registration, death, or end of data capture in THIN, whichever was soonest. Unadjusted and adjusted (for EC or GC, gender, age, deprivation, comorbidity, and alarm symptoms on presentation) survival at 1 year was calculated for PEUGIC subjects and controls.
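The follow-up rule above (diagnosis date until the earliest of end of registration, death, or end of data capture) can be sketched as follows. The function and parameter names are hypothetical, and treating 365 days as 1 year is an assumption.

```python
from datetime import date


def follow_up(diagnosis_date, death_date, registration_end, data_capture_end):
    """Return (days_followed, died), ending follow-up at the earliest endpoint.

    `death_date` is None when no death is recorded.
    """
    endpoints = [registration_end, data_capture_end]
    if death_date is not None:
        endpoints.append(death_date)
    end = min(endpoints)
    died = death_date is not None and death_date == end
    return (end - diagnosis_date).days, died


def one_year_status(days_followed, died, horizon_days=365):
    """Classify a subject's status at the 1-year horizon."""
    if died and days_followed <= horizon_days:
        return "died within 1 year"
    if days_followed >= horizon_days:
        return "alive at 1 year"
    # Follow-up ended before 1 year without a recorded death.
    return "censored before 1 year"
```

Subjects whose follow-up ends before 1 year without a recorded death are kept as a separate censored group in this sketch, since their 1-year status is unknown.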

Changes in PEUGIC Incidence with Time

In order to assess the change in the incidence of PEUGIC over the study period, subjects with EC and GC were separated according to their PEUGIC EGD date for cases and diagnostic date for controls into tertiles. The PEUGIC rate for each tertile was then compared.

Statistical Methodology

Statistical analysis was carried out with SPSS v20.0 (IBM, New York, USA). Independent t test and χ² test were used to compare differences in continuous and categorical variables, respectively. Unconditional logistic regression analysis was used to calculate odds ratios and 95 % confidence intervals (CI) of the influence of type of UGIC (EC or GC), gender, age, Charlson comorbidity index, socioeconomic status, presence of alarm symptoms, individual upper gastrointestinal symptoms, UGIC location, surgery, chemotherapy, radiotherapy, and survival at 1 year on PEUGIC. For tests of significance, p values <0.05 were considered significant.

A multivariate logistic regression model was constructed to determine associations with PEUGIC after adjustment for confounding factors including UGIC type (EC or GC), gender, age, Charlson comorbidity index, socioeconomic status, and the presence of alarm symptoms. Treatment and survival outcomes were analyzed in separate multivariate regression models, each adjusted for UGIC type (EC or GC), gender, age, Charlson comorbidity index, socioeconomic status, and the presence of alarm symptoms on presentation. Unadjusted Kaplan–Meier analysis was used to compare survival in PEUGIC subjects and controls.
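For readers unfamiliar with the Kaplan–Meier method, the product-limit estimator can be sketched in a few lines. The analysis in this study was run in SPSS; the function below is only an illustration of the estimator, and the input values in the usage example are toy numbers, not study data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    `times` are follow-up durations; `events` are True for death and
    False for censoring. At each event time t the survival estimate is
    multiplied by (1 - deaths_at_t / number_at_risk_at_t).
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy example (hypothetical follow-up times in months).
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Censored subjects contribute to the number at risk up to their censoring time but never trigger a step down in the curve.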

Ethics Approval

In the UK, all research involving data collected from National Health Service patients must be approved by a Research Ethics Committee. The THIN Data Collection Scheme was approved by the South-East Multicentre Research Ethics Committee (SE-MREC) [12].

Results

There were 11,966 UGIC subjects during the study period, with 5473 GC and 6493 EC subjects. Following exclusion of subjects who did not meet the study criteria, 4249 GC (44.8 %) and 5238 EC subjects (55.2 %) were included.

Subject Characteristics

The PEUGIC subject characteristics are given in Table 1. There were 633 PEUGIC subjects, 279 with EC and 354 with GC. The overall PEUGIC rate was 6.7 %, with the PEUGIC rate for EC and GC being 5.3 and 8.3 %, respectively. PEUGIC subjects were more likely to have GC than EC. This was less marked when adjusted for other variables but remained a significant association.
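The rates above follow directly from the counts given in the text, as the short calculation below shows. The crude odds ratio for GC versus EC is derived here from those same counts using the standard Woolf (log) method; it is an unadjusted illustration and not necessarily the value reported in Table 1. The function names are hypothetical.

```python
import math


def rate_percent(cases: int, total: int) -> float:
    """PEUGIC rate: PEUGIC subjects divided by all UGIC subjects, in percent."""
    return 100.0 * cases / total


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio and Woolf 95 % CI for a 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


# Counts taken from the text.
gc_total, ec_total = 4249, 5238
gc_peugic, ec_peugic = 354, 279

overall_rate = rate_percent(gc_peugic + ec_peugic, gc_total + ec_total)
ec_rate = rate_percent(ec_peugic, ec_total)
gc_rate = rate_percent(gc_peugic, gc_total)

# Exposure = GC (versus EC); cases = PEUGIC, controls = non-PEUGIC UGIC.
crude_or, ci_low, ci_high = odds_ratio_with_ci(
    gc_peugic, gc_total - gc_peugic,
    ec_peugic, ec_total - ec_peugic,
)
```

Rounded to one decimal place, the three rates reproduce the 6.7, 5.3, and 8.3 % quoted above.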

Table 1 The subject characteristics of post-EGD upper gastrointestinal cancer cases and upper gastro-intestinal cancer controls

Younger age and female gender were associated with PEUGIC. When UGIC subjects were separated into EC and GC subjects, the age association was only observed in GC subjects and the female gender association was only observed in EC subjects.

Increasing medical comorbidity was associated with PEUGIC. Subjects with a Charlson comorbidity score of 1–4 were at modestly increased risk compared with subjects without comorbid illnesses in univariate and multivariate analyses.

Increasing deprivation was associated with PEUGIC, with more deprived postcodes (Townsend score fourth and fifth quintiles) more likely to be associated with PEUGIC compared with Townsend score first and second quintiles. This association remained statistically significant following adjusting for confounding factors.

Presenting Symptoms Prior to UGIC Diagnosis

Presenting symptoms prior to UGIC diagnosis are given in Table 2. Subjects who presented with alarm symptoms within 12 months of their EGD were much less likely to be associated with PEUGIC. This effect was even more notable in subjects with EC compared with subjects with GC. Alarm symptoms remained strongly associated even after adjusting for potential confounding factors.

Table 2 Consultations with upper gastrointestinal symptoms in the 12 months prior to post-EGD upper gastrointestinal cancer endoscopy and prior to upper gastrointestinal cancer diagnosis in controls

In subjects with EC, PEUGIC subjects were most likely to present with GERD symptoms (45.2 %), whereas controls were most likely to present with dysphagia (44.8 %) in the 12 months prior to their PEUGIC EGD and diagnostic EGD, respectively. EC subjects who presented with dysphagia, weight loss, or vomiting were all less likely to be associated with PEUGIC. In contrast, EC subjects with GERD symptoms were nearly three times more likely to be associated with PEUGIC.

In subjects with GC, both PEUGIC subjects (40.1 %) and controls (20.4 %) were most likely to present with GERD symptoms. However, presenting with GERD symptoms increased the risk of GC PEUGIC more than twofold. Symptoms of anemia, vomiting, weight loss, dysphagia, or anorexia were all negatively associated with PEUGIC in GC subjects.

Endoscopic Findings

The endoscopic findings from PEUGIC EGDs are given in Table 3. The most common finding was esophagitis in 19.4 % of PEUGIC subjects with EC and gastritis in 22.6 % of PEUGIC subjects with GC. Endoscopic findings recognized to be associated with EC (esophageal stricture and ulcer) were reported in 5.7 % of EC PEUGIC cases, and findings associated with GC (gastric ulcer) were reported in 10.5 % of GC PEUGIC cases. Of the PEUGIC subjects with EC who had an esophageal stricture or ulcer reported at PEUGIC EGD and PEUGIC subjects with GC who had a gastric ulcer reported at PEUGIC EGD, only 50.0 and 64.6 %, respectively, had a follow-up EGD within 90 days. PEUGIC subjects who presented with alarm symptoms were significantly more likely to have esophageal stricture and gastric ulcer reported at their PEUGIC EGD.

Table 3 Endoscopic findings at post-EGD upper gastrointestinal cancer endoscopy

Subjects with EC in the lower esophagus appeared to be at lower risk of PEUGIC compared with subjects with EC in the upper and mid-esophagus, but there was no significant association, in part due to the large number of subjects with unknown UGIC location (Table 4). There was no difference in the site of GC in PEUGIC subjects, with equal proportions of proximal and distal GC in PEUGIC subjects and controls.

Table 4 Site of esophageal and gastric cancers

UGIC Treatment Outcomes and Survival

The UGIC treatment outcomes and survival are given in Tables 5, 6 and 7. PEUGIC subjects were more likely to undergo surgery than controls on univariate analysis. However, this association was confined to male subjects with GC. There was no difference between PEUGIC subjects and controls undergoing chemotherapy. However, when separating subjects with EC and GC by gender, female PEUGIC subjects, PEUGIC subjects with EC, and particularly female PEUGIC subjects with EC were more likely to have chemotherapy. In contrast, male PEUGIC subjects with GC were less likely to undergo chemotherapy. Following adjusting for confounding factors, PEUGIC subjects were marginally more likely to undergo radiotherapy compared with controls, but there was no overall difference in the likelihood of undergoing surgery or chemotherapy.

Table 5 Treatment outcomes and adjusted survival for post-EGD upper gastrointestinal cancer subjects and upper gastrointestinal cancer controls
Table 6 Treatment outcomes and adjusted survival for post-EGD upper gastrointestinal cancer subjects with esophageal cancer and esophageal cancer controls
Table 7 Treatment outcomes and adjusted survival for post-EGD upper gastrointestinal cancer subjects with gastric cancer and gastric cancer controls

When comparing PEUGIC subjects with controls, there was no difference in 1-year survival and overall survival (Fig. 1). When sub-analysis was carried out by separating subjects with EC and GC, PEUGIC subjects with GC were more likely to survive at 1 year compared with controls.

Fig. 1 Unadjusted survival from date of diagnosis for subjects with post-EGD upper gastrointestinal cancer and controls

Change in PEUGIC Incidence with Time

EC subjects undergoing EGD prior to 2008 were between 2 and 3 times more likely to be associated with PEUGIC than subjects undergoing EGD after 2008 (p < 0.0001, p = 0.0001) (Table 8). The difference in time period was less marked in subjects with GC, with subjects undergoing EGD prior to 2005 being 1.5 times more likely to have PEUGIC compared with subjects undergoing EGD after 2005 (p = 0.014, p = 0.003).

Table 8 The frequency of post-EGD upper gastrointestinal cancer by time period

Discussion

EGD is the gold standard for investigating upper gastrointestinal symptoms and diagnosing UGIC. In a recent meta-analysis, PEUGIC was found to be relatively uncommon, occurring in approximately 1 in every 400 EGDs [15]. However, PEUGIC was relatively common among UGIC subjects, with 4.6–14.0 % having had an EGD which did not detect UGIC in the preceding 3 years [6, 8–10, 16]. Overall, PEUGIC occurs in 6.4 % of UGIC subjects within 1 year of diagnosis and in 11.3 % of UGIC subjects up to 3 years before diagnosis [15]. Two recent population-based UK studies have reported that 8.3 % of GC and 7.7 % of EC subjects have had an EGD up to 3 years prior to eventual UGIC diagnosis [17, 18]. An interval of 3 years is derived from the assumption that the doubling time for mucosal GC is 2–3 years from a Japanese study from the 1970s [19], and this interval is commonly used to define a false-negative endoscopic examination in the detection of UGIC. The PEUGIC rate from this study, the largest to date on this issue, was 6.7 %.

In the current study, younger age and female gender were more likely to be associated with PEUGIC. Similar findings have been reported in a recent UK series based on a national gastric cancer audit [17]. This could potentially be explained by younger subjects [20–22] and women [21, 23] reportedly having a lower tolerance for EGD examination, which may in turn lead to a reduction in EGD diagnostic quality. Another possible explanation might be the lower expectation of UGIC in women and younger subjects by endoscopists, due to the lower incidence of UGIC in younger and female subjects. The increased risk of PEUGIC in women in the present study was only related to EC and not GC. Squamous cell EC accounts for 65.4 % of all EC in women but only 28.6 % in men, in whom esophageal adenocarcinoma is much more common [24]. Unlike the readily recognizable signs of early esophageal adenocarcinoma such as Barrett’s esophagus, the early signs of squamous cell EC may be less readily recognized in Western populations [8, 25]. This may explain the gender difference found in the current study.

Subjects with increasing medical comorbidity were more likely to have an episode of PEUGIC, which might relate to a lower tolerance of the procedure due to their associated medical conditions and therefore quality of EGD examination. Alternatively, subjects with multiple comorbidities may be more likely to undergo EGD than subjects without comorbidity for conditions such as anemia related to their comorbidities, when a relatively small, asymptomatic early UGIC might not be detected.

UGIC subjects who presented with alarm symptoms within 12 months of their EGD were much less likely to be associated with PEUGIC. Alarm symptoms suggest a more advanced case of UGIC, and thus, the UGIC would be more likely to be detected during EGD examination. In contrast, presenting with hematemesis or melena or GERD symptoms is not usually associated with UGIC, and therefore, this may potentially affect the endoscopist’s awareness of early UGIC during EGD. Surprisingly, the opposite finding to this has been reported in series in Scotland and Western Australia with subjects presenting with alarm symptoms being more likely to experience PEUGIC [7, 8]. The difference in the findings from these studies is likely to relate to identifying PEUGIC subjects within 6 months of UGIC diagnosis, rather than 12 months in the present study, and diagnosis being delayed in advanced UGIC cases due to food residue or blood obscuring the view, inadequate biopsy sampling, or follow-up arrangements. In the present study, PEUGIC subjects who presented with alarm symptoms were more likely to have endoscopic findings such as esophageal stricture and gastric ulcer reported at their PEUGIC EGD that are known to be associated with UGIC. Such endoscopic lesions were reported in up to 8.3 % of PEUGIC cases. Of these, only 50.0 % of subjects with esophageal stricture or ulcer and 64.6 % of subjects with gastric ulcer had a follow-up EGD within 90 days in the current study. A lack of adequate follow-up of these lesions is likely to be a contributing factor to PEUGIC cases.

PEUGIC subjects appeared more likely to undergo surgery following UGIC diagnosis; however, this was likely to be due to confounding factors (such as younger age) as there was no association after adjusting for other variables. Overall, there was no difference in either unadjusted or adjusted survival at 1 year between PEUGIC subjects and controls. The same findings were also reported in a Finnish cohort and in recent UK cohorts [10, 17, 18]. This should not be surprising given the very poor overall survival in UGIC patients, although the situation might be very different if the PEUGIC had been diagnosed at an earlier opportunity.

Encouragingly, the PEUGIC rate in the UK has fallen over the study period from 7.9 to 2.7 % for EC and from 9.0 to 6.5 % for GC. There are likely to be a number of factors behind this fall including improvements in endoscopic pathways, such as routinely following up esophageal strictures or ulcers and gastric ulcers (which has improved from 55.9 to 69.8 % when comparing periods before and after 2000 in the dataset), endoscopists taking more biopsies from suspicious lesions, improvements in the quality of endoscopic imaging, and endoscopists becoming more aware of early signs of UGIC. The reasons for PEUGIC being more commonly associated with GC than EC cannot be identified in the present study. The esophagus has a smaller surface area and simpler anatomy, and its mucosa is less likely than that of the stomach to be contaminated with food, debris, or bile impeding the endoscopic view. Endoscopists in the UK are also likely to be less aware of gastric premalignant changes such as gastric atrophy or intestinal metaplasia than the more widely recognized premalignant condition Barrett’s esophagus, and this may contribute to more early GC than early EC not being recognized at EGD.

The large sample size and its unselected nature are the obvious strengths of the present study, making it the largest study on PEUGIC to date. The total of 9487 UGIC subjects included was greater than the sum of all subjects included in previous studies of PEUGIC. The THIN database spans over two decades, allowing changes in PEUGIC incidence to be examined. The THIN primary care centers are spread across the UK, and subjects are regionally and demographically representative of the UK. In addition, as patients must be registered with a primary care practitioner in order to access secondary care services, this allowed unbiased subject selection. Selection bias is a potential problem in most previous studies because their subject cohorts were recruited from a single healthcare provider. Furthermore, the data captured in THIN have previously been validated in a number of studies [26, 27].

Despite the above advantages, there are a number of limitations including specific issues related to the THIN dataset. The lack of ability to link the THIN dataset to the national cancer registry data is a significant disadvantage. However, primary care practitioners contributing data to THIN follow a standardized process and codes for cancer would not be entered without histological confirmation from a secondary care provider. In order to further validate the dataset in the current study, the surgical rates for EC (14.5 %) and GC (15.1 %) in 2010 from THIN were compared with the national esophagogastric cancer audit. The national audit reported a surgical rate of 20.0 % in EC subjects and 22.4 % in GC subjects during the same period with a case ascertainment of 71.1 % [28]. Furthermore, the 1-year survival rate in the present study was similar to national survival rates reported in cancer registry data. In THIN, the survival rate for EC subjects diagnosed between 1997 and 1999 was 36.1 % and for subjects diagnosed between 2000 and 2002 was 33.8 %, which is comparable to cancer registry rates of 33.3 and 38.0 %, respectively [29]. The possibility of administrative delays in primary care in recording the UGIC diagnosis date led us to exclude the period within 12 months of UGIC diagnoses for analysis of PEUGIC, potentially excluding some PEUGIC cases. However, although addressing the reasons for patients undergoing an EGD that did not diagnose UGIC within a few months of their diagnosis is an important issue, it is much less likely to improve the prognosis of UGIC than diagnosing the UGIC at an earlier stage or as a premalignant lesion years before the diagnosis date in PEUGIC cases. THIN only captures diagnostic outcomes from EGDs; data potentially relevant to PEUGIC, such as whether sedation was used, the grade and specialty of the endoscopist, H. pylori status, whether biopsies were taken, and the number of biopsies taken, are not recorded, limiting conclusions on why PEUGIC cases occurred. Furthermore, the lack of complete data on UGIC histology and UGIC staging further limited analysis of potential causes of PEUGIC and the degree to which an endoscopist could potentially be responsible for a case of PEUGIC. For example, there may be virtually no changes at EGD 3 years before later diagnosis with an early-stage UGIC, whereas 13 months before presenting with an advanced UGIC it is very likely that the endoscopist missed an existing malignant lesion.

We would recommend that national bodies with responsibility for endoscopy should encourage research into EGD quality and set quality standards for EGD that are similarly stringent to the quality standards for colonoscopy that have improved outcomes for colonoscopy and colorectal cancer over the last decade. We would also recommend that individual endoscopy units undertake regular audit of PEUGIC rates and undertake root cause analysis of identified cases.

In summary, in the largest study to date, the risk of PEUGIC among UGIC subjects was 6.7 %. PEUGIC was associated with younger age, female gender, increasing comorbidity, increasing deprivation, and a lack of alarm symptoms at presentation. PEUGIC was more common among GC subjects. Endoscopic findings such as stricture and ulceration that are known to be associated with UGIC were recorded in 8.3 % of PEUGIC EGDs, representing potential missed opportunities for early UGIC diagnosis.