Introduction

Microorganisms living on and in different areas of the human body, such as the skin, respiratory tract, mouth, and gastrointestinal tract, are known as the human microbiota or human microbiome (Sender et al. 2016). Over 70% of the human microbiome is contained within the gastrointestinal tract, including thousands of microbial species that together weigh approximately 1.5 kg. These gut microbes play a pivotal role in the regulation of numerous physiological and biochemical mechanisms through the manufacture of various substances and metabolites (Yadav et al. 2018). The gut microbiota, in particular, are critical to immune function and metabolic processes, including lipid and glucose metabolism, production of vitamins and short chain fatty acids (SCFAs), and antimicrobial secretion (Adak and Khan 2019; Burger-van Paassen et al. 2009). Several studies have demonstrated that alterations in several microbial species within the gut are associated with increased risks of various diseases, including cardiovascular disease, type 2 diabetes, and certain cancers (Muñoz-Garach et al. 2016; Salgaço et al. 2019; Wang and Zhao 2018). Accordingly, interest in the characterization and assessment of the gut microbiota as a metric for better understanding human health has increased.

Traditional methods for microbial identification are culture based; however, due to their significant limitations, recent studies on the gut microbiota have used more advanced techniques, such as whole-genome shotgun sequencing and 16S ribosomal RNA (rRNA) gene analysis, to map the composition of gut microbial communities (Laudadio et al. 2018). In most cases, fecal specimens obtained from patients have been used to characterize and identify gut microbial enterotypes in a clinical setting, revealing at least 16 different bacterial phyla (Pollock et al. 2018). However, few studies have investigated whether data obtained from stool specimens are consistent with the real enteric environment. Furthermore, the human gut is functionally subdivided into a number of compartments, and previous studies have indicated that the microbiota within each of these parts might differ (Hale et al. 2017). This compartmentalization, however, can be missed when exclusively analyzing stool specimens. To address this, we aimed to identify and characterize microorganisms from four different locations within the large intestine and to compare these results with microbiota identified from stool specimens and rectal swabs. Additionally, we sought to determine whether rectal swabs and/or stool specimens are sufficient for identifying microbes within the colon.

Methods

Study population

From October 2019 to October 2020, a total of 100 participants who visited the Yongin Severance Check-up Center (Yongin, South Korea) for a general medical check-up volunteered for this study, and we included those who planned to undergo colonoscopy screening. Participants were excluded if they were taking antibiotics for treatment of an infectious disease. This study was approved by Yongin Severance Institutional Review Board (IRB No 9–2019-0014) and was performed in compliance with the Declaration of Helsinki. Written consent was obtained from all patients prior to participation.

Lifestyle factors (e.g., smoking, drinking, exercise) were assessed via a self-administered questionnaire. Smoking status was recorded as non-smoker or current smoker. Alcohol consumption was defined as “yes” for those who currently drink alcohol. Physical activity was defined as “yes” for participants who regularly exercise two or three times a week. Past histories of diabetes, hypertension, and dyslipidemia were also obtained from the self-reported questionnaire.

Bowel preparation and sample collection

A total of six samples were collected from each subject: stool, a rectal swab, and aspirates from four segments of the large intestine (i.e., ascending colon, descending colon, sigmoid colon, and rectum). Participants underwent standard bowel preparation that included consuming polyethylene glycol solution prior to colonoscopy (Fig. 1). Immediately before the colonoscopy, a rectal swab was collected by the doctor and stored in a collecting tube at −80 °C for further processing. Swabs were inserted into the anal canal (± 3 cm) (Budding et al. 2014). During colonoscopy, distilled water was pushed through the biopsy channel of the scope, and samples from each segment were collected by aspiration. Mucosal jet wash from four segments of the large intestine (ascending colon, descending colon, sigmoid colon, and rectum) was collected from all study participants. Segment samples obtained from colonoscopy were collected in four 15-mL test tubes and immediately stored at −80 °C for further processing. Participants collected stool samples themselves at home before beginning colonoscopy preparation. They were advised to collect stool 1 day before the colonoscopy. Stool samples were collected in a stool collection tube using the AccuStool Collection Kit (AccuGene, Incheon, Korea) and stored at −20 °C in the participants’ freezer. During transport from the participants’ home to a medical facility, the samples were exposed to room temperature.

Fig. 1 Schematic overview of the sampling method and bioinformatics analysis strategy in this study

DNA extraction and sequencing

DNA extraction from large intestinal aspirate samples was performed using the AccuBuccal DNA Preparation Kit (AccuGene), and DNA extraction from stool samples was performed using the AccuStool DNA Preparation Kit (AccuGene), both in accordance with the manufacturer’s instructions. The sequencing library was prepared using the Ion 16S™ Metagenomics Kit (Thermo Fisher Scientific, Waltham, MA, USA) to amplify the hypervariable regions of the 16S rRNA gene (Primer set V2, V4, V8 and Primer set V3, V6–7, V9), according to the manufacturer’s instructions (https://www.thermofisher.com). Input gDNA (3 ng) was amplified, and sequencing was performed using the Ion 530™ Chip Kit in an Ion Chef™ and Ion S5XL™ (Thermo Fisher Scientific) using the Chef Protocol—400 bp.

The Ion 16S Metagenomics Kit includes a variety of primers, and sequencing reads from seven variable regions are provided as one FASTQ file. We used USEARCH to extract regions that bind with individual primers (i.e., V2, V3, V4, V67, V8, V9) (Edgar 2010). A schematic overview of our sequencing and data analysis strategy is provided in Fig. 1.
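The region-extraction step can be illustrated with a minimal sketch. It is not the USEARCH procedure used in this study: it simply bins reads by the primer found at the start of each read, allowing one mismatch. The primer sequences, the `PRIMERS` table, and the function names are hypothetical placeholders, because the kit's actual primer sequences are not given here.

```python
from collections import defaultdict

# Hypothetical placeholder primers, for illustration only; the kit's real
# primer sequences are not stated in the text.
PRIMERS = {
    "V2": "AAACCCGGGT",
    "V3": "TTTGGGCCCA",
    "V4": "ACGTACGTAC",
}


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_region(seq, primers=PRIMERS, max_mismatch=1):
    """Return the region whose primer matches the start of seq, else None."""
    for region, primer in primers.items():
        prefix = seq[:len(primer)]
        if len(prefix) == len(primer) and mismatches(prefix, primer) <= max_mismatch:
            return region
    return None


def split_by_region(reads):
    """Group read names by assigned region; unassigned reads go under None."""
    bins = defaultdict(list)
    for name, seq in reads:
        bins[assign_region(seq)].append(name)
    return dict(bins)
```

Reads that match no primer are kept under the `None` key rather than discarded, so the fraction of unassigned reads can be inspected.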

Denoising and taxonomic binning

Barcode and adapter sequences were eliminated using Cutadapt (v.2.8) (Kechin et al. 2017), and denoising was performed using DADA2 (v.1.14) (Callahan et al. 2016) in QIIME2 (Bolyen et al. 2019). The FASTQ files, separated into seven regions, were filtered to remove 15 bp from the 5′ end and all sequences < 100 bp, and reads were quality-filtered to retain only those with a quality score > 25. The National Center for Biotechnology Information was used for taxonomy reference. We used the classify-consensus-vsearch algorithm in QIIME2 to conduct taxonomic binning with a 99% identity threshold. All sequence data have been deposited in BioProject (ID: PRJNA735579).
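The read-filtering criteria above can be sketched in plain Python. This is an illustration under stated assumptions, not the QIIME2/DADA2 implementation: the length cutoff is assumed to apply after trimming, the quality criterion is assumed to be the mean Phred score of the trimmed read, and the quality string is assumed to use Phred+33 encoding. The function names are hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_read(seq, quality, trim_5p=15, min_len=100, min_mean_q=25):
    """Trim the 5' end, then apply the length and quality filters.

    Returns the trimmed (seq, quality) pair, or None if the read is rejected.
    Assumption: the length cutoff is checked after trimming.
    """
    seq, quality = seq[trim_5p:], quality[trim_5p:]
    if len(seq) < min_len:
        return None
    # Assumption: "quality score > 25" refers to the mean Phred score.
    if mean_phred(quality) <= min_mean_q:
        return None
    return seq, quality
```

For example, a 120-bp read with uniform Phred 40 is kept as a 105-bp read, whereas a 110-bp read is dropped because only 95 bp remain after trimming.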

Defining taxon abundance

Determining the absolute or relative abundance of actual gut microbial taxa based on 16S rRNA gene sequencing can be challenging. Generally, the 16S amplicon sequences generated using this technique are clustered into operational taxonomic units (OTUs), representing the typical working definition of a bacterial species by the OTU selection algorithm, and the number of features observed represents the number of various taxa observed in the sample. Some researchers use these proportions to calculate the relative abundance of each taxon, with the number of features observed corresponding to the sum of the observed read counts. However, in the case of 16S rRNA gene amplicons, there may be differences in PCR efficiency, depending on the quality of the genomic DNA extracted from the sample. The Ion 16S™ Metagenomics Kit used in this study generates six amplicons and can observe seven hypervariable regions. We found that the majority of OTUs were observed in the V3/4 region, although OTUs not identified here were observed in other regions (specific strains were only identified in V9). It is possible that bacteria actually present in the sample may not be observed in this analysis, so it is necessary to assess a wide region of the 16S gene. This is thought to be particularly important if there are low-abundance bacteria present in a sample that could adversely affect health.

The use of more than two amplicons, however, can yield false abundance estimates. As shown in the example in Fig. 2, the same abundance should be observed in each hypervariable region of OTU 1/2/3, but this is rare in experiments. Critically, if this difference is not recognized, the estimated abundance of OTU 1/2/3 will not be accurate. Therefore, the normalization method must take into account the difference in PCR efficiency to avoid false conclusions. Here, abundance was calculated as the maximum value observed among hypervariable regions for each OTU.
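The maximum-across-regions rule can be sketched as follows. The data layout (a dictionary of per-region OTU counts) and the function name are assumptions made for illustration; only the rule itself, taking the largest value observed among hypervariable regions for each OTU, comes from the text.

```python
def max_region_abundance(counts_by_region):
    """Collapse per-region counts to one value per OTU by taking the maximum.

    counts_by_region: {region: {otu: read_count}}
    Returns {otu: largest count observed in any region}.
    """
    abundance = {}
    for region_counts in counts_by_region.values():
        for otu, count in region_counts.items():
            abundance[otu] = max(abundance.get(otu, 0), count)
    return abundance


# Illustrative counts only (not data from this study).
example = {
    "V2": {"OTU1": 100, "OTU2": 40},
    "V3": {"OTU1": 80, "OTU2": 55, "OTU3": 10},
    "V9": {"OTU3": 30},
}
print(max_region_abundance(example))
```

Summing across regions instead would count OTU1 as 180 reads in this toy input, inflating taxa that happen to amplify in several regions; taking the maximum avoids that double counting.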

Fig. 2 Method for measuring bacterial abundance

Statistical analysis

Clinical data from the study population are presented as mean ± standard deviation (SD) or number (%). Cumulative sum scaling (CSS) normalization was used in this study (Paulson et al. 2013). OTUs found in fewer than 10 samples were removed. We calculated beta diversity metrics to reflect the shared diversity between bacterial populations in terms of ecological distance within each sample population: unweighted and weighted UniFrac distances (Lozupone et al. 2011). Repeated-measures analysis of variance (R v.3.6.3) was used to test for differences in the five phyla according to location. We calculated similarity using intraclass correlation analysis (R Statistical Package, “ICC”; Institute for Statistics and Mathematics, Vienna, Austria, ver 3.6.3, www.R-project.org). Alpha-diversity, measured using the Shannon index, was compared using the Wilcoxon rank-sum test. We also compared differences in distances among the locations using adonis multivariate analysis of variance (R Statistical Package, “vegan”; Institute for Statistics and Mathematics, Vienna, Austria, ver 3.6.3, www.R-project.org).
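Two of the steps above, the prevalence filter and the Shannon index, are simple enough to sketch directly. The following is a minimal illustration rather than the R code used in this study. The function names and the table layout are hypothetical, and the natural logarithm is an assumption because the log base is not stated.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def filter_by_prevalence(table, min_samples=10):
    """Keep OTUs detected (count > 0) in at least min_samples samples.

    table: {otu: [count in sample 1, count in sample 2, ...]}
    """
    return {
        otu: row
        for otu, row in table.items()
        if sum(1 for c in row if c > 0) >= min_samples
    }
```

A sample with four equally abundant taxa has H = ln 4 ≈ 1.386, and a sample dominated by a single taxon has H = 0, which matches the usual reading of the index as a combined measure of richness and evenness.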

Simple Pearson correlation analyses were performed between individual sample sites, with Bonferroni correction. The level of statistical significance for all analyses was set as \(P < 1 \times 10^{-10}\). Statistical analyses were performed using the scipy.stats module (Python v.3.6.12).
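The correlation and multiple-testing steps can be sketched without external dependencies. This is an illustration, not the scipy.stats code used in this study: the Pearson coefficient is computed from its definition, and the Bonferroni correction multiplies each p-value by the number of comparisons (capped at 1). The function names are hypothetical, and p-values are taken as inputs rather than computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With six sample types there are 15 pairwise comparisons, so under this correction a raw p-value would be multiplied by 15 before being compared with the significance threshold.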

Results

Subjects and samples

General characteristics of the study population are described in Table 1. A total of 55 men and 45 women participated in this study. The mean age ± SD of the study population was 58.8 ± 10.9 years. In total, 18 (18.0%), 79 (79.0%), and 81 (81.0%) participants were current smokers, current drinkers, and regular exercisers, respectively. The proportions of individuals with diabetes, hypertension, and dyslipidemia were 31.0%, 61.0%, and 42.0%, respectively.

Table 1 General characteristics of the study population (n = 100)

Confirmation of bacterial diversity by sample

We first analyzed 16S rRNA sequencing data from the ascending colon, descending colon, sigmoid colon, rectum, rectal swabs, and stool in order to determine the bacteria present in each sample. The number of OTUs from each phylum present in the various samples is shown in Table 2. We detected an average of 640 OTUs (ranging from 622–670) in the ascending colon, descending colon, sigmoid colon, and rectum. A total of 649 OTUs were identified from rectal swabs, which is similar to the average number found in the colon aspirate samples. In contrast, only 572 OTUs were identified in the stool samples, which is 10.5% lower than the average number detected in the colon and rectum. These results suggest that conventional stool sampling could miss bacteria that are present in the large intestine. In addition, approximately 20–25% fewer Firmicutes, Proteobacteria, and Actinobacteria were detected in stool samples than in the large intestine. Conversely, approximately 25% more Bacteroidetes and 85% more Tenericutes were identified in stool samples than in the large intestine. These results suggest that stool samples do not accurately reflect the microbiome of the large intestine. In contrast, the number of OTUs from each phylum detected by rectal swabs was similar to that present in the large intestine.

Table 2 Number of phylum-level operational taxonomic units (OTUs) detected in each of the different samples

Comparison of bacterial abundance in the colon, stool, and rectal swabs

Next, we calculated the relative abundances of the phyla identified in the ascending colon, descending colon, sigmoid colon, rectum, rectal swab, and stool to determine if and how microbial compositions vary in these regions (Fig. 3A). We found that the microbiota in the large intestine, rectal swab, and stool mainly comprise five phyla: Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, and Fusobacteria. Bacteroidetes were proportionally more abundant in stool samples than in the other specimens (p = 1.10e−136), whereas Firmicutes (p = 0.02) and Proteobacteria (p = 1.23e−81) were less abundant in the stool samples than in the other specimens. Additionally, we compared similarities in abundance values for 21 phyla. Stool samples matched rectal swabs and colonoscopy specimens from the ascending colon, descending colon, sigmoid colon, and rectum by 8 to 13%. Rectal swabs matched colonoscopy specimens from the ascending colon, descending colon, and sigmoid colon by 72 to 75% (Supplementary Table S1).

Fig. 3 Relative abundance and beta diversity of microbiota in samples from distinct regions of the colon, in rectal swabs, and in stool. A Relative microbial abundance at the phylum level for each sample. B Beta diversity by location, unweighted UniFrac principal coordinate analysis (PCoA). C Beta diversity by location, weighted UniFrac PCoA

The microbial composition of rectal swab samples was further found to be similar to those of the sigmoid colon and rectum. Notably, however, the microbial composition of stool samples differed from that of the large intestine. These results were similar at the family and genus levels (Supplementary Fig. S1). The increased abundance of Bacteroidetes, in particular, suggests that overgrowth of these organisms may have occurred during the storage and transportation of feces from the patients’ homes to the hospital.

We further compared microbial ecosystems in the different samples using principal coordinate analysis (PCoA). Analysis of beta diversity in the ascending colon, descending colon, sigmoid colon, and rectum samples; rectal swab samples; and stool samples revealed two completely separate clusters (Fig. 3B, C). One cluster contained samples from the ascending colon, descending colon, sigmoid colon, rectum, and rectal swabs, which were intermixed, indicating a similar environment; the stool samples formed the other cluster. Similar results were obtained from unweighted UniFrac PCoA (Fig. 3B) and weighted UniFrac PCoA (Fig. 3C). These data further indicate that rectal swabs could sufficiently reflect the colon microbiome. The detailed results are presented in Supplementary Table S2. We also analyzed alpha-diversity with the Shannon index, which is commonly used to measure species abundance and diversity. The alpha-diversity of stool samples was significantly lower than that of other specimens (Fig. 4).

Fig. 4 Alpha-diversity results according to the Shannon index. *P < 0.05, **P < 0.01, ***P < 0.001

Observation of compositional comparison between samples

Next, we assessed the compositional similarity of microbial clusters within samples from the large intestine (ascending colon, descending colon, sigmoid colon, rectum), rectal swabs, and stool by quantifying individual OTUs at the class level (Fig. 5). We then measured correlations between the different samples to determine which are most suitable for identifying the actual colon microbiome. Samples from the ascending colon, descending colon, sigmoid colon, and rectum were found to be highly correlated (R-values ranging from 0.88–0.93). In particular, the ascending colon was most highly correlated with the descending colon, while the descending colon was most highly correlated with the sigmoid colon. In general, R-values were highest between adjacent regions. This result indicates that strains shift downstream from the ascending colon to the rectum, in the same direction as intestinal transit (Supplementary Fig. S2).

Fig. 5 Spearman correlation plot among the samples (ascending colon, descending colon, sigmoid colon, rectum, rectal swab, and stool) using operational taxonomic unit (OTU) counts. The X- and Y-axes represent the specific number of OTUs at the class level in each sample: ascending colon (A), descending colon (D), sigmoid colon (S), rectum (R), rectal swab, and stool. *P < 0.05, **P < 0.01, ***P < 0.001

Overall, we detected high correlations between the composition of rectal swab samples and those from different regions of the large intestine (rectal swab and ascending colon, R = 0.65; rectal swab and descending colon, R = 0.66; rectal swab and sigmoid colon, R = 0.64; and rectal swab and rectum, R = 0.65). However, correlations between stool and the other samples were lower (stool and ascending colon, R = 0.24; stool and descending colon, R = 0.23; stool and sigmoid colon, R = 0.28; stool and rectum, R = 0.27; and stool and rectal swab, R = 0.25).

Discussion

The findings in this study reveal local microbiome variations within different regions of the large intestine and further show that sequencing of stool samples does not fully recapitulate the gut microbiome. Additionally, using data from a large population-based cohort, we successfully demonstrated that rectal swabs can be used as an alternative sampling method to study the gut microbiome.

Since the inception of the Human Microbiome Project (Turnbaugh et al. 2007), many researchers have uncovered associations between the gut microbiome and human health and diseases (Fan and Pedersen 2021). Owing to their relatively easy accessibility, stool samples have commonly been used for characterizing the gut microbiome in clinical settings (Human Microbiome Project Consortium 2012). However, stool samples are typically collected by individuals in their homes. Thus, they may not be immediately stored at −20 °C, and samples may be exposed to higher temperatures during transport to the laboratory. Whether the stool microbiome is altered by different storage methods is still debated. In particular, several studies have demonstrated that prolonged storage of feces at room temperature impacts the microbial composition of stool (Bahl et al. 2012; Cardona et al. 2012). However, another study reported that the phylogenetic structure and diversity of communities in human stool samples are not significantly influenced by storage temperature or the duration of storage (Lauber et al. 2010).

Several studies have also demonstrated that stool reflects the contents of the gastrointestinal lumen, and this represents a microbial niche that is distinct from the mucosal-associated microbiota (Rangel et al. 2015; Ringel et al. 2015). In addition, recent studies have revealed variations in the composition of bacterial species in different regions of the human gut (McHardy et al. 2013; Vaga et al. 2020). For example, in a study of five healthy individuals, Vaga et al. showed that although feces reflect the average gut mucosal microbiome, mucosal biopsy uncovers variations within local microbial communities of the large intestine (Vaga et al. 2020). McHardy et al. further reported consistent compositional shifts in the gut microbiota from 42 healthy subjects and five Crohn’s disease patients (McHardy et al. 2013). Our results are consistent with those of McHardy et al., as well as with a study analyzing the gut microbial community in healthy subjects in Korea (Eun et al. 2016), which reported that the microbial composition of mucosal tissue differs from that in feces. Specifically, we found relative abundances of Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, and Fusobacteria in the large intestine of approximately 31–35%, 17–27.5%, 29–35%, 6–8%, and 4.5–7%, respectively, and this microbial composition was quite different from that detected in stool. These results could be explained by the bile salt gradient, transit speed, secretion of IgA and antimicrobial material from Paneth cells, and oxygen tolerance (Zhou et al. 2020). Overall, the observed distribution of microorganisms in the large intestine suggests compartmentalization within this organ.

Endobiopsy has been used to investigate the mucosal microbiota in different gastrointestinal sites (Vuik et al. 2019). However, this biopsy method is invasive, expensive, time-consuming, insufficient for obtaining substantial biomass, and not suitable for healthy subjects (Tang et al. 2020). Herein, to identify an alternative, we examined microbial compositions in various regions of the large intestine using colonic lavage samples aspirated with a suction tip from the ascending colon, descending colon, sigmoid colon, and rectum during colonoscopy. Watt and colleagues previously suggested that colonic lavage samples could be representative of biopsy microbiota composition (Watt et al. 2016). However, colonic lavage sampling during colonoscopy is itself still invasive and time-consuming.

To address this, several studies have suggested rectal swabs as an alternative sampling method for gut microbiota analysis. Bassis et al. (2017) reported that the microbial compositions of stool and rectal swab samples from the same subject are highly similar, based on a study of eight in-patient samples. In that study, rectal swabs were inserted 1–2 cm past the anal verge. Budding et al. (2014) analyzed a total of 38 subjects who underwent colonoscopy and 10 inflammatory bowel disease patients and showed that rectal swabs provide a good method for producing highly reproducible microbiota profiles. In that case, rectal swabs were inserted into the anal canal (± 3 cm). Another recent study (Biehl et al. 2019) confirmed that rectal swabs are a practical and high-adherence method for microbiome sampling in a longitudinal cohort of hematological and oncological patients. Thus, these studies indicate that rectal swabs are a convenient and reliable method for investigating the gut microbiota in patients with complex medical problems in a hospital setting and are also self-collectable in out-patient settings. Our results here are consistent with these previous studies.

However, unlike previous studies that reported similarities between microbial profiles obtained from rectal swabs and stool, we found that the microbial composition of rectal swabs was similar to those from large intestine samples obtained during colonoscopy and less similar to that of stool. This difference may be due to the fact that we harvested rectal swabs from patients that were prepped for colonoscopy, and thus, both the large intestine aspiration samples and rectal swabs were obtained after colonoscopy prep. In contrast, the observed difference in microbial compositions between aspirate samples seems to represent an actual difference within the large intestine compartmentation.

Our study has several limitations. First, as samples were aspirated after bowel preparation, we should consider the confounding effect of this preparation on our findings. Several studies have shown that bowel preparation may alter the mucosal-adherent microbiota (Harrell et al. 2012; O'Brien et al. 2013). Therefore, further studies are needed to assess and compare the microbial profiles obtained from unprepped rectal swabs and stool. Second, the colon is a dynamic organ with various physiologic functions, including the absorption of water and electrolytes and the transport of luminal contents as feces (Szmulowicz and Hull 2011). Colonic activity increases in order to foster the expulsion of stool, and the remaining liquid in the colon shifts naturally through the bowel with peristalsis and constant colon movement. Here, we also found that the microbial composition shifts along the colon lumen. This finding suggests that we may be able to infer the microbial composition of the small intestine based on microbiota detected in aspiration samples from the ascending colon. Because it is relatively difficult to obtain samples from the small intestine (Kastl et al. 2020), our results could provide an alternative sampling method for studying the small intestine in patients with diseases such as small intestinal bacterial overgrowth. Third, although 16S rRNA gene sequencing has been a mainstay of sequence-based bacterial analysis for decades, this method generates OTUs and representative OTU sequences that are compared with reference databases (Johnson et al. 2019; Poretsky et al. 2014). Therefore, the results are relative, rather than absolute, and the actual quantity of a particular bacterium is uncertain. Also, the results may differ depending on the choice of reference database. In future studies, high-throughput sequencing of the full 16S gene or culturomics is needed because, owing to depth bias, small populations of bacteria cannot be detected using metagenomic methods. Fourth, previous studies have shown that feces and mucosal-associated microbiota have distinct microbial niches (Rangel et al. 2015; Tap et al. 2017). Therefore, we should interpret our results with caution, considering that the luminal gut microbiota might differ from the mucosal microbiota. Samples collected by aspiration and rectal swabs could contain both remaining luminal gut microbiota and mucosal microbiota. Finally, a rectal swab does not fully reflect the intestinal microflora, although it reflects the intestinal environment better than stool samples do. More studies are needed to find a method that can reflect the gut environment in a simple and non-invasive way.

Despite these limitations, our study has a number of important advantages, including the use of a relatively large sample size (n = 100) compared to previous studies. In addition, we included apparently healthy participants who visited our health check-up center for routine examination. In conclusion, this study provides evidence for the heterogeneity of gut microbiota in defined compartments within the large intestine. Our data further indicate that although fecal specimens provide a representation of the bacterial communities that interact at the gut mucosa, they cannot be used to accurately determine the composition of the gut microbiota. Although stool sampling has been accepted as the sampling method of choice for gut microbiota analysis, rectal swabs can be used as a sampling method when stool cannot be readily obtained.

Data sharing statement

Data from individual participants are protected for confidentiality reasons and can be made available only upon approval of the corresponding author.