Introduction

Graft-versus-host disease (GVHD) is a common, serious complication of hematopoietic stem cell transplantation (HCT), occurring in 30–50 % of HLA-matched sibling transplants and up to 60–90 % of mismatched transplants [1]. The gastrointestinal (GI) tract is the second most common organ system involved in GVHD, with manifestations that include diarrhea, nausea, and vomiting. Endoscopic biopsies have played an important role in confirming the diagnosis of GVHD since the initial use of endoscopy in the 1970s. Rectal biopsies from rigid sigmoidoscopy, and later, flexible sigmoidoscopy (Flex-Sig), were the standard for making the diagnosis of GI GVHD [2]. However, in the late 1980s and early 1990s, a few studies suggested that upper endoscopy (EGD) biopsies were more sensitive than those obtained from lower endoscopy. In 1985, Snover et al. reviewed 24 patients and found that rectal biopsies missed five of six GVHD diagnoses found on gastric biopsies and one of four diagnoses found on duodenal biopsies [3]. Snover et al. followed up these results in 1991 when they reviewed 77 patients who received simultaneous upper and lower GI tract biopsies. Among patients whose upper GI biopsies were positive for GVHD, only 59 % had positive rectal biopsies [4].

More recent data have not supported the results of these older studies and instead suggest that rectal biopsies may in fact be more sensitive. In a prospective study in 2006, Thompson et al. [5] showed that in 24 patients, biopsies from the distal colon had a higher yield (82 %) than those taken from the stomach (71 %) and duodenum (65 %) for the histologic diagnosis of GVHD. Another, more recent study from MD Anderson evaluated 112 patients who all underwent biopsies from the stomach, duodenum, and rectosigmoid and showed that rectosigmoid biopsies were the most sensitive [6]. A smaller retrospective study from 2008 showed similar findings, and a study performed in a pediatric population also showed that rectal biopsies were more sensitive [7, 8].

Both Flex-Sig and EGD are extremely safe and routinely performed procedures with an approximately 0.02 % risk of complications in healthy patients [9]. There is a risk of bleeding from endoscopic biopsies, especially in this population of patients who are commonly thrombocytopenic, and some studies describe significant duodenal hematomas, an occurrence that has not been as routinely described with biopsies from other sites in the GI tract [10, 11]. EGD is also associated with a risk of aspiration that is not as commonly associated with Flex-Sig. Whereas Flex-Sig can be routinely performed without sedation, EGD typically requires moderate sedation which adds cost as well as a 0.2–0.5 % risk of cardiopulmonary complication and a sedation-associated mortality rate up to 0.05 % [12, 13].

It is also not known whether the nature of the patient’s symptoms reflects the site of their GI GVHD. Given the higher risk associated with EGD with sedation than with unsedated Flex-Sig, there has been increased interest in determining the most sensitive and safest site(s) to biopsy when evaluating for GVHD. Our current clinical approach utilizes symptom-guided endoscopy: EGD for upper tract symptoms (prototypically nausea and anorexia), Flex-Sig for lower tract symptoms (including diarrhea), and bidirectional endoscopy with both procedures for patients with simultaneous upper and lower tract symptoms. However, there are no data that clearly support this approach.

Goals

Currently, the diagnosis of GVHD is a clinicopathologic undertaking that ultimately relies on the expert opinion of a hematologist–oncologist after assessing the clinical presentation, laboratory findings, endoscopic findings, and pathology. Using the clinical diagnosis of GVHD as the reference standard, our primary goal was to determine the sensitivity and specificity of biopsies from different sites within the GI tract for diagnosing GI GVHD in our HCT population.

Secondary goals were to evaluate whether the nature of patients’ GI symptoms predicts the sensitivity of the site of biopsy and to review whether these procedures caused any significant complications, including perforation or bleeding requiring transfusion or admission.

Methods

Patients

After obtaining approval from our institutional review board (IRB), all adult patients who underwent HCT at Duke University Medical Center between 1/1/05 and 1/1/11 were identified through an institutionally maintained database. The charts of those patients who were clinically suspected to have GI GVHD based on signs or symptoms and who underwent endoscopic evaluation and biopsy, regardless of the time from their HCT, were reviewed. Only patients who underwent allogeneic HCT were included in this analysis. For included patients who underwent more than one procedure at different time points for evaluation of potential GVHD, all procedures were included in this analysis.

Endoscopic Evaluation

All procedures were performed using standard Pentax endoscopes. This was a retrospective review, and therefore, no standard endoscopy protocol was followed. At our institution, general practice for these evaluations is as follows: Upper endoscopy is performed using a 29-French gastroscope following moderate sedation with midazolam and fentanyl. The procedure involves passage of the gastroscope into the duodenum with biopsies taken from one or multiple sites within the stomach and/or duodenum. Flex-Sig is also most typically performed with a 29-French gastroscope, though some providers utilize a standard adult or pediatric colonoscope. If performed alone, Flex-Sig is usually done with minimal or no sedation. The only prep typically utilized is tap water enemas. The endoscope is typically advanced into the sigmoid colon, and biopsies are taken from the sigmoid colon and rectum. Full colonoscopy is performed with either a standard adult or pediatric colonoscope following moderate sedation with midazolam and fentanyl. The decision to prep these patients is practitioner dependent, with some patients receiving a full, polyethylene glycol-based, purgative prep and others receiving no prep if the exam is being done for voluminous diarrhea. The colonoscope is advanced to the cecum, and sometimes, the terminal ileum is accessed. Biopsies are taken from throughout the colon and, sometimes, from the ileum. Our general practice is to require a platelet count of at least 50,000/µL (50 × 10^9/L) in order to take biopsies.

Pathologic Review and Scoring

An individual patient could have undergone one or more endoscopies, each of which may have sampled one or more sites. Sites of biopsy included stomach, duodenum, terminal ileum, right colon, left colon, and rectosigmoid. Upper GI tract sites were stomach and duodenum; middle sites were right colon and terminal ileum; lower sites were left colon and rectosigmoid.

All available hematoxylin-and-eosin-stained biopsy slides were jointly re-reviewed by two expert pathologists (DC and MS) blinded to the clinical information. Utilizing the 2006 NIH consensus guidelines for the diagnosis of GVHD, a pathologic diagnosis was rendered for each biopsy site independently of the others [14].

Accordingly, four possible diagnostic categories were utilized: 1 (not GVHD), 2 (possible GVHD), 3 (consistent with GVHD), and 4 (definite/unequivocal GVHD). For the purpose of statistical analysis and in keeping with the most recent 2015 NIH consensus guideline, a category of 3 or 4 was considered positive for pathologic GVHD [15]. In addition to looking at specific biopsy sites, an overall patient pathologic diagnosis for all biopsies obtained during the same endoscopy was determined by the highest diagnostic category. In other words, an endoscopic episode was considered histologically positive if any site biopsied was consistent with or definite for GVHD and considered negative if all sites biopsied were negative for GVHD.
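The scoring rule described above can be illustrated with a minimal Python sketch. The names used here (Category, overall_category, is_positive) and the example episode are hypothetical and serve only to illustrate the rule; they are not part of the study's analysis, which was performed in SAS.

```python
from enum import IntEnum


class Category(IntEnum):
    """Four diagnostic categories as described in the text."""
    NOT_GVHD = 1
    POSSIBLE = 2
    CONSISTENT = 3
    DEFINITE = 4


# A category of 3 or 4 counts as positive for pathologic GVHD.
POSITIVE_THRESHOLD = Category.CONSISTENT


def is_positive(category):
    """Return True when a category counts as histologically positive."""
    return category >= POSITIVE_THRESHOLD


def overall_category(site_categories):
    """Overall diagnosis for one endoscopy: the highest site category."""
    if not site_categories:
        raise ValueError("at least one biopsy site is required")
    return max(site_categories.values())


# Hypothetical episode: two sites sampled during one endoscopy.
episode = {"stomach": Category.POSSIBLE, "duodenum": Category.CONSISTENT}
overall = overall_category(episode)
print(overall.name, is_positive(overall))
```

Under this rule, an episode in which every site is category 1 or 2 is treated as negative, because "possible GVHD" falls below the positivity threshold.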

Clinical Review and Scoring

For the purposes of this retrospective study, all cases were formally reviewed by clinicians and pathologists who were blinded to the original clinical diagnosis at the time of the chart review. The diagnosis of GVHD was based on a combination of the impression of the treating transplant physician, the histology, the initiation of empiric steroid treatment aimed at treating presumed GVHD, and the exclusion of other etiologies such as medications, CMV, and enteric pathogens including Clostridium difficile infection. The latter three etiologies are excluded by immunohistochemical tissue staining of biopsies (CMV), stool cultures (enteric pathogens), and PCR toxin assay (Clostridium difficile), respectively.

Statistical Methods

The primary endpoint was histologic positivity for GVHD. Secondary endpoints were specificity and positive and negative predictive values.

Sensitivity, also called the true positive rate, was the probability that the site-specific biopsy was called biopsy positive when the clinical diagnosis was positive. Specificity, sometimes called the true negative rate, was the probability that the site-specific biopsy result was called biopsy negative when the clinical diagnosis was negative. Positive predictive value (PPV) was the probability that the site-specific biopsies that were called positive were clinically positive. Negative predictive value (NPV) was the probability that the site-specific biopsies that were called biopsy negative were clinically negative.
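As a worked illustration of these four definitions, the following Python sketch computes them from a 2 × 2 table of biopsy result against clinical diagnosis. The class and function names and the counts are hypothetical, chosen only to make the arithmetic concrete; they are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Biopsy result versus clinical diagnosis (the reference standard)."""
    tp: int  # biopsy positive, clinically positive
    fp: int  # biopsy positive, clinically negative
    fn: int  # biopsy negative, clinically positive
    tn: int  # biopsy negative, clinically negative


def _ratio(numerator, denominator):
    """Proportion, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(table):
    """Sensitivity, specificity, PPV, and NPV as proportions in [0, 1]."""
    return {
        "sensitivity": _ratio(table.tp, table.tp + table.fn),
        "specificity": _ratio(table.tn, table.tn + table.fp),
        "ppv": _ratio(table.tp, table.tp + table.fp),
        "npv": _ratio(table.tn, table.tn + table.fn),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = TwoByTwo(tp=70, fp=10, fn=30, tn=15)
for name, value in diagnostic_metrics(example).items():
    print(f"{name}: {value:.1%}")
```

Note that sensitivity and specificity condition on the clinical diagnosis, whereas PPV and NPV condition on the biopsy result, so the latter two depend on how common clinical GVHD is among the patients biopsied.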

Because not all sites were evaluated during each endoscopy, it was not possible to statistically compare sites on measures of association. Therefore, we present assessments descriptively using incidence and probability with their respective 95 % confidence intervals (CI), the latter of which were computed using exact binomial methods. To compare upper versus lower GI tract sites, we included only cases that had at least one upper and at least one lower site biopsied.
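For readers unfamiliar with exact binomial intervals, the sketch below shows one common exact method, the Clopper–Pearson interval, implemented with the Python standard library only. Whether this is the specific exact method applied in SAS for this study is an assumption; the function names and the example counts are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


# Hypothetical example: 72 positive biopsies among 100 clinically positive cases.
low, high = clopper_pearson(72, 100)
print(f"72/100 -> 95% CI {low:.3f} to {high:.3f}")
```

The direct summation in binom_cdf is adequate for sample sizes of a few hundred, as in this study; for much larger n, a library routine based on the beta distribution would be numerically safer.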

To quantify the agreement between overall histologic diagnosis of GVHD and site-specific biopsy results, we used Cohen’s kappa. Serious complications of endoscopy collected were any perforation or bleeding requiring transfusion or hospital admission.

Analyses were performed using SAS 9.2 (Cary, NC).

Results

Between 1/1/05 and 1/1/11, 169 adult patients underwent 250 endoscopic procedures to evaluate for GVHD. The majority of patients (68 %) underwent one endoscopic evaluation for GVHD over the study period, while some were evaluated 2 (19 %), 3 (11 %), 4 (2 %), and 5 (1 %) times. The characteristics of these patients are given in Table 1. Of these patients, 56 % were male, 48 % were at least 50 years of age, and 65 % underwent HCT for leukemia. The biopsies were taken a median of 84 days after HCT, with more specific time interval information given in Table 2.

Table 1 Patient characteristics
Table 2 Timing of endoscopic evaluation for GVHD

Of the 250 endoscopic procedures, nearly half (49 %) sampled one site, 28 % sampled two sites, 22 % sampled three sites, and two endoscopies each sampled four sites, for a total of 435 biopsies, Table 3. The most frequent site of biopsy was the stomach (148), followed by the duodenum (120) and rectosigmoid (120), Table 4.

Table 3 Type of endoscopic procedure performed
Table 4 Procedure and biopsy site characteristics

The number of tissue fragments (specimens) taken from each site ranged from a mean of 3 from the duodenum to a mean of 7 from the left colon, Table 5. Histologic positivity for GVHD was 73 % for upper tract sites (stomach and duodenum) compared with 65 % and 64 % for middle (right colon and terminal ileum) and lower tract (left colon and rectosigmoid) sites, respectively. The overall sensitivity for clinical GVHD was 76 % (95 % CI 68–82) for biopsies taken from the stomach and/or duodenum. When both sites were biopsied simultaneously, the sensitivity increased to 82 %. The sensitivity for clinical GVHD for biopsies from lower tract and middle tract sites was 72 % (95 % CI 63–78) and 65 % (95 % CI 41–81), respectively. Because not every procedure included biopsies from all possible sites, direct comparison is not feasible. We therefore report 95 % confidence intervals around the sensitivity proportions. Although these intervals cannot serve as formal tests, they are informative: because they are wide and overlap greatly, it is unlikely that the sensitivities of the three regions are substantively different. The positive and negative predictive values for clinical GVHD were 87 % and 27 % for biopsies obtained during EGD and 98 % and 33 % for those obtained during Flex-Sig, Table 6.

Table 5 Number of tissue fragments taken from each site
Table 6 Comparison of path diagnosis to the clinical diagnosis by region (%)

When compared to clinically diagnosed GVHD, the terminal ileum was the single most sensitive site, 86 %, but it was the least commonly biopsied. Of the commonly biopsied sites, the stomach, duodenum, and rectosigmoid all had similar sensitivities, 69–72 %. The site-specific details are depicted in Table 7.

Table 7 Comparison of path diagnosis to the clinical diagnosis by site (%)

Symptoms that motivated endoscopy are given in Table 7. The most frequently reported symptoms were watery diarrhea (77 %), nausea (74 %), and anorexia (56 %); all other symptoms were reported in fewer than half the cases. The nature of symptoms has traditionally guided the endoscopic approach within our institution, with EGD performed to evaluate upper tract symptoms such as nausea and vomiting and Flex-Sig performed to evaluate lower tract symptoms such as diarrhea. In the presence of nausea, the prototypical upper GI symptom, biopsies were positive for GVHD in 65 % (95 % CI 56–73 %) of cases from the stomach and 70 % (95 % CI 60–79 %) from the duodenum, compared with 61 % (95 % CI 38–80 %) from the left colon and 70 % (95 % CI 59–78 %) from the rectosigmoid. In the presence of watery diarrhea, the prototypical lower tract symptom, biopsies were positive for GVHD in 65 % (95 % CI 46–81 %) of cases from the left colon and 64 % (95 % CI 54–72 %) from the rectosigmoid, compared with 64 % (95 % CI 55–73 %) from the stomach and 69 % (95 % CI 59–78 %) from the duodenum, Table 8.

Table 8 Biopsy positivity by presenting symptoms (site-specific biopsy) (%)

For patients with both upper and lower tract symptoms, these were typically given equal consideration, and bidirectional endoscopic evaluation was typically performed. Sixty-five endoscopies simultaneously sampled both upper and lower tract sites. In 39 (60 %) cases, both upper and lower site biopsies agreed with the overall histologic diagnosis, whether positive or negative. This left 26 (40 %) discordant cases. All 26 endoscopic events yielded an overall pathologic diagnosis of GVHD; however, in 10 (15 %) cases the upper tract biopsies were negative and in 16 (25 %) cases the lower tract biopsies were negative, Table 9. The kappa coefficient, indicating the degree of agreement between the overall histologic diagnosis and upper site biopsies, was moderately strong at 0.614 (95 % CI 0.410–0.817); the corresponding kappa coefficient between overall and lower site biopsies was moderate at 0.461 (95 % CI 0.267–0.654). Although the agreement with overall diagnosis was stronger for upper than for lower site biopsies, the difference was not statistically significant.
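To make the kappa calculation concrete, the following Python sketch computes Cohen's kappa for two binary ratings from the four cell counts of a 2 × 2 agreement table. The text reports 39 concordant cases, 10 cases with negative upper biopsies, and 16 with negative lower biopsies, but it does not state how the 39 concordant cases divide into both-positive and both-negative. The split of 27 and 12 used below is an assumption, inferred only because it reproduces both reported kappa values; the function name is likewise hypothetical.

```python
def cohens_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Cohen's kappa for two binary ratings A and B from 2 x 2 cell counts."""
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    if n == 0:
        raise ValueError("table is empty")
    observed = (both_pos + both_neg) / n
    a_pos = both_pos + a_pos_b_neg
    b_pos = both_pos + a_neg_b_pos
    # Agreement expected by chance if A and B were independent.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# ASSUMPTION: 27 concordant-positive and 12 concordant-negative cases
# (only their sum, 39, is given in the text).
CONCORDANT_POS, CONCORDANT_NEG = 27, 12
UPPER_NEG_ONLY, LOWER_NEG_ONLY = 10, 16  # discordant cases from the text

# Rating A = overall diagnosis (positive if either region is positive).
# Rating B = upper-site result, then lower-site result.
kappa_upper = cohens_kappa(CONCORDANT_POS + LOWER_NEG_ONLY, UPPER_NEG_ONLY,
                           0, CONCORDANT_NEG)
kappa_lower = cohens_kappa(CONCORDANT_POS + UPPER_NEG_ONLY, LOWER_NEG_ONLY,
                           0, CONCORDANT_NEG)
print(f"upper: {kappa_upper:.3f}, lower: {kappa_lower:.3f}")
```

Because the overall diagnosis is positive whenever either region is positive, the cell "overall negative, region positive" is zero by construction, which is why all disagreement falls in a single off-diagonal cell.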

Table 9 Histologic GVHD diagnosis in patients with simultaneous upper and lower tract biopsies

All specimens were evaluated for cytomegalovirus (CMV) infection by immunohistochemical tissue staining at the time of the initial biopsy review, and 11 patients were CMV positive. CMV was detected in a single site in six patients and in multiple simultaneous sites in five. CMV was found most commonly in the stomach (eight patients) but was also detected in the duodenum (four patients), colon (three patients), and esophagus (one patient).

There were no perforations noted during any of the endoscopic procedures. No procedure-related adverse cardiopulmonary complications were seen. Despite the prevalence of thrombocytopenia in this population, no duodenal hematomas or other bleeding complications were seen during these procedures.

Discussion

The diagnosis of GVHD in patients who have undergone HCT remains a clinical challenge, and recent data show that there is poor correlation between clinical, endoscopic, and histologic scoring [16]. Thus, despite the importance of histology, the diagnosis ultimately rests on the clinical, endoscopic, and pathologic correlation made by an experienced transplant physician. At our institution, the transplant physician typically waits for histologic confirmation before initiating therapy directed against GVHD, but rarely, if the clinical circumstances call for immediate action, empiric treatment will be initiated prior to histologic confirmation. In these cases, empiric therapy is typically discontinued if the histology does not support the diagnosis of GVHD.

This large review includes 169 adult patients whose biopsies were read by two expert pathologists, blinded to the clinical diagnosis, who applied the 2006 NIH consensus guidelines for the histologic diagnosis of GVHD. The overall sensitivity of biopsies taken during EGD, a procedure that typically requires the use of moderate sedation, was slightly higher (76 %) than that of biopsies taken during Flex-Sig (72 %). It is difficult to assess the accuracy and yield of our current symptom-driven endoscopic approach with this retrospective study, as patients who have biopsies from upper and lower sites would typically have had overlapping symptoms such as nausea and diarrhea. Despite this limitation, our data do not clearly support using symptoms to direct the site of biopsy, as upper tract biopsies were diagnostic for GVHD in 64–69 % of cases with diarrhea, which was similar to those taken from the colon in these patients, 58–65 %. Similarly, colorectal biopsies showed GVHD in 57–70 % of patients with nausea and vomiting, which was similar to biopsies from the upper tract, 65–70 %.

The 26 cases with discordant findings between upper and lower site biopsies were carefully reviewed in an attempt to better understand this interesting and important cohort. Though sampling error was initially suspected as one possible explanation for the discordance, this did not appear to be the case, as these patients had a mean of 7 biopsies taken from upper sites, with both the stomach and duodenum biopsied in the majority of these patients, and 6 taken from lower sites. A potential confounder in these cases is the fact that the biopsies were re-evaluated for this study in an independent fashion, so our pathologists were blinded not only to clinical information but also to the concomitantly taken biopsies from different sites in the same patient. This would never be the case in a standard clinical context, where the entire case is evaluated as a whole, thereby allowing findings from one site to have some influence on the interpretation of other sites, especially when changes are equivocal. For example, four of the discrepant procedures featured biopsies that were CMV positive, one from the duodenum, one from the stomach, and two from the colon. For the purpose of this study, the other biopsies taken during the same procedure were evaluated independently and, if any GVHD-like features were identified, were classified as either consistent with or unequivocal GVHD. However, in clinical practice the presence of CMV infection would have influenced the interpretation of the other biopsy sites.

This study is limited by its retrospective nature. There was no formal endoscopy or biopsy protocol, and these procedures were performed by multiple physicians across our division, each of whom approached these procedures differently. Because of this, there was variation in the number of sites biopsied per endoscopic event. Although 250 endoscopic procedures were undertaken, approximately half of these sampled only one anatomic site. To formally compare different sites on measures of association and discrimination, such as sensitivity and specificity, would require that each endoscopic event sampled each site. As such, we consider these results exploratory and descriptive. We did, however, restrict certain comparisons to simultaneous biopsies of at least one upper and one lower site. Just as there was no formal endoscopic or biopsy protocol, there was also no clear protocol for who was sent for endoscopic evaluation, when it was performed in relation to the onset of symptoms, and what type of intervention was undertaken. These decisions varied with the practice of the attending transplant physician.

The optimal endoscopic approach is still unclear. One can clearly argue that because it can be performed quickly and without sedation or overnight fasting and it allows for the appropriate diagnosis of GVHD with high sensitivity, Flex-Sig is the optimal initial test in this cohort, regardless of the symptom complex. Furthermore, work by Crowell et al. [17] demonstrated that in a pediatric HCT cohort being evaluated for GVHD with both EGD and full colonoscopy, GVHD was diagnosed on colonic biopsies taken from within reach of a Flex-Sig 100 % of the time. Because of these facts, Flex-Sig is the modality our institution favors to obtain lower tract biopsies. However, not performing simultaneous upper endoscopy may lead to a diagnostic delay in a small number of patients who have to undergo upper endoscopy if symptoms persist and their histology was not suggestive of GVHD. Despite a higher theoretical risk with performing upper endoscopy compared with Flex-Sig, complications were not seen in this study so one could also advocate performing both endoscopic procedures simultaneously in order to expedite care. To truly answer the question of the optimal diagnostic approach, a prospective study with patients randomized to EGD, Flex-Sig, or both arms regardless of symptoms would be required. Until then, this challenging clinical question should remain a decision made by a team of experts based on the unique clinical characteristics of an individual patient.