Introduction

Helicobacter pylori (H. pylori) is a slow-growing, microaerophilic, Gram-negative, spiral-shaped, flagellated bacterium that colonizes the mucus layer of the gastric epithelium [1]. Approximately one-half of the global human population is colonized by this organism, especially in developing countries [2]. Although H. pylori infection can induce inflammatory responses in host cells, the majority of infected individuals remain asymptomatic throughout life. It is reported that, among H. pylori-infected individuals, 15–20% are at risk of developing peptic ulcers, 5–10% dyspepsia, 1% gastric cancer, and 0.1% primary gastric mucosa-associated lymphoid tissue lymphoma [3]. However, the mechanisms underlying the different disease outcomes of H. pylori infection remain unclear.

Currently, multiple factors (i.e., pathogen, host, and environment) are thought to account for the distinct clinical phenotypes. Differences in bacterial genotype and virulence are the main pathogen-related factors. H. pylori has been shown to be one of the most genetically diverse bacterial species known [4]. Geographic differences in gastric cancer incidence could be partly attributed to differences between H. pylori strains [5]. Biological diversity of isolates has been found in patients with different diseases [6]. Polymorphisms of H. pylori reflect geographic origin and correlate with the status of virulence factors [7]. Furthermore, genetic heterogeneity has also been identified in genes encoding outer membrane proteins (OMPs) [8]. Phylogenetic analysis revealed that OMP genes (i.e., homB and homA) were heterogeneously distributed worldwide, with a marked difference between East Asian and Western strains [9]. Therefore, the differential prevalence of H. pylori genes and the subsequent alterations in OMPs may play a vital role in the different clinical outcomes of chronic infection [10].

The H. pylori genome is predicted to encode > 60 different OMPs, but most have not been studied in detail [8, 11]. Specifically, the blood group antigen-binding adhesin A (babA) gene is generally associated with H. pylori adhesion [12]. Attachment to the gastric mucosa is the first step in bacterial colonization and is crucial for persistent infection. Subsequently, the continuous colonization of H. pylori mediates the chronic inflammatory response of gastric epithelial cells. The protein encoded by outer inflammatory protein A (oipA) was identified as a factor that induces a pro-inflammatory response [13]. In addition, the sialic acid-binding adhesin A (sabA) gene is important for the induction of the inflammatory reaction in the gastric mucosa [14]. The helicobacter outer membrane B (homB) gene encodes a putative OMP that was shown to induce interleukin-8 secretion in vitro and was found to be associated with peptic ulcers [15].

It is known that babA, oipA, sabA, and homB genes are widely present in different strains of H. pylori [16]. The genetic polymorphism and functional status of these four genes affect the expression of the corresponding OMPs, which may be associated with the regional prevalence of gastroduodenal diseases. However, the available results on the relationship between these four OMP genes and gastroduodenal diseases are controversial. Based on similar studies [17,18,19], the inconsistent associations between these OMP genes and disease outcomes may be related to geographic diversity, sample size, and/or detection method (e.g., PCR, western blot, incomplete sequencing). Herein, we examined more than one hundred H. pylori strains isolated from Chinese patients with different gastroduodenal diseases. The full-length babA, oipA, sabA, and homB genes were amplified and sequenced. Findings on the relationship of the babA, oipA, sabA, and homB genes with clinical outcomes would benefit the study of pathogenesis, biomarkers, therapeutic targets, and vaccine candidates.

Materials and methods

Strains and genomic DNA extraction

One hundred and seventy-seven preserved H. pylori strains (Table S1) used in this study were previously isolated from Chinese patients with different gastroduodenal diseases (49 chronic gastritis, 19 gastric ulcer, 33 gastric cancer, 76 duodenal ulcer); pathological information was available for 94 of them (41 chronic superficial gastritis, 24 intestinal hyperplasia, and 29 gastric cancer) [20]. The strains were cultured on Campylobacter agar base (OXOID Co., UK) plates containing 5% defibrinated sheep blood and selective antibiotics. The plates were then incubated in a microaerobic incubator under 5% O2 and 10% CO2 at 37 °C. The identity of H. pylori was confirmed by colony morphology and Gram staining. Genomic DNA of the isolates was extracted with the QIAamp DNA Mini Kit (QIAGEN Co., France) according to the manufacturer's instructions.

Amplification and detection of babA, oipA, sabA, and homB genes

Amplification of babA, oipA, sabA, and homB genes was performed with a multi-primer PCR assay. All amplifications were carried out in a reaction volume of 25 μl containing 2 μl (20 ng/μl) of H. pylori genomic DNA, 5 μl PrimeSTAR buffer, 2 μl dNTP mixture (2.5 mM each), 1 μl (10 μM) forward primer, 1 μl (10 μM) reverse primer, 0.5 μl PrimeSTAR polymerase, and 13.5 μl sterile water. The thermal cycler program for babA, oipA, sabA, and homB consisted of five steps: initial denaturation at 95 °C for 5 min; 35 cycles of denaturation at 95 °C for 45 s, annealing (45 s at 64 °C for babA, 30 s at 62 °C for oipA, 30 s at 58 °C for sabA, or 40 s at 60 °C for homB), and extension at 72 °C for 30 s; and a final extension at 72 °C for 5 min. The primers for each amplicon are summarized in Table 1. All amplification products were examined by 1.5% agarose gel electrophoresis (TransGen Biotech Co., China) and visualized with a ChemiDoc MP System (Bio-Rad, USA).
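The reaction mix and cycling conditions described above can be written down as plain data, which makes it easy to check that the component volumes add up to the stated 25 μl. The following is a minimal sketch, not the authors' protocol file: the names REACTION_MIX_UL, ANNEALING, cycling_program, and total_hold_seconds are hypothetical, and ramp times between temperatures are ignored.

```python
# Reaction components (volumes in microlitres) as listed in the text.
REACTION_MIX_UL = {
    "genomic DNA (20 ng/ul)": 2.0,
    "polymerase buffer": 5.0,
    "dNTP mixture (2.5 mM each)": 2.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "polymerase": 0.5,
    "sterile water": 13.5,
}

# Gene-specific annealing step as (seconds, degrees Celsius), from the text.
ANNEALING = {
    "babA": (45, 64),
    "oipA": (30, 62),
    "sabA": (30, 58),
    "homB": (40, 60),
}


def cycling_program(gene):
    """Return the five-step program as (step, seconds, temp_C, repeats) tuples."""
    anneal_s, anneal_c = ANNEALING[gene]
    return [
        ("initial denaturation", 300, 95, 1),
        ("denaturation", 45, 95, 35),
        ("annealing", anneal_s, anneal_c, 35),
        ("extension", 30, 72, 35),
        ("final extension", 300, 72, 1),
    ]


def total_hold_seconds(gene):
    """Sum of programmed hold times only (ramping is not modelled)."""
    return sum(seconds * repeats for _, seconds, _, repeats in cycling_program(gene))


total_volume_ul = sum(REACTION_MIX_UL.values())  # should equal the stated 25 ul
template_ng = 2.0 * 20                           # 2 ul of template at 20 ng/ul

print(total_volume_ul, template_ng, total_hold_seconds("babA"))
```

Keeping the annealing step in a separate per-gene table mirrors the text, where only that step differs between the four amplicons.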

Table 1 Primers used for the detection of H. pylori oipA, babA, sabA, and homB genes

The full-length sequence of babA, oipA, sabA, and homB genes

Samples in which the target babA, oipA, sabA, or homB gene was not detected were discarded at the PCR product identification stage. Positive PCR products were purified and submitted to the Beijing Genomics Institute (BGI Tech., China) for full-length sequencing of the babA, oipA, sabA, and homB genes.

Analysis of genetic polymorphism and the functional status

Sequences of babA, oipA, sabA, and homB genes were aligned with ClustalW in MEGA v7.0 software (MEGA Inc., USA) and then adjusted manually. DnaSP v6.0 software (Julio Rozas & Universitat de Barcelona, Spain) was used to analyze genetic polymorphism. The ratio of the nonsynonymous (Ka) to the synonymous (Ks) substitution rate (ω = Ka/Ks) can be used as an estimator of selective pressure on DNA sequence evolution (negative/purifying selection, Ka/Ks < 1; neutral selection, Ka/Ks = 1; positive/adaptive selection, Ka/Ks > 1) [21]. The Ka/Ks and Pi(a)/Pi(s) ratios were calculated by the modified Nei–Gojobori method with the Jukes–Cantor correction in MEGA to assess evolutionary selection. In addition, the number of haplotypes (H), haplotype diversity (Hd), nucleotide diversity (Pi), average number of nucleotide differences (K), and Tajima’s D were obtained to estimate the functional status of the four OMP genes. The presence of a premature stop codon (TAA/TGA/TAG) in the 5′ region of a sequence (the signal peptide-coding fragment) means that translation of the gene into protein is switched off. In addition, expression of the oipA and sabA genes is regulated by the slipped-strand repair mechanism, based on the number of CT dinucleotide repeats in the 5′ region of the gene (switch on = functional, switch off = non-functional) [22].
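The two decision rules in this subsection (the Ka/Ks thresholds and the on/off call from a premature stop codon) can be illustrated with a short sketch. This is an illustration under stated assumptions, not the procedure used in MEGA or DnaSP: the function names, the 60-codon scanning window, and the toy sequences are all hypothetical, and the reading frame is simply taken from the first ATG in the sequence.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def classify_selection(ka, ks, tol=1e-9):
    """Label selective pressure from omega = Ka/Ks."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "positive" if omega > 1.0 else "purifying"


def switch_status(cds, window_codons=60):
    """Return 'off' if an in-frame stop codon occurs early in the 5' region.

    The frame starts at the first ATG; window_codons is an assumed cut-off
    for what counts as the signal peptide-coding region.
    """
    seq = cds.upper()
    start = seq.find("ATG")
    if start < 0:
        return "off"
    end = min(len(seq), start + 3 * window_codons)
    for i in range(start + 3, end - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return "off"
    return "on"


# Toy sequences: one extra CT repeat shifts the frame onto a TGA codon.
in_frame = "ATGAAA" + "CT" * 3 + "GTGACCTTT"
shifted = "ATGAAA" + "CT" * 4 + "GTGACCTTT"

print(classify_selection(0.02, 0.01))                 # Ka > Ks
print(switch_status(in_frame), switch_status(shifted))
```

The toy pair shows why the number of CT repeats matters: adding a single dinucleotide changes the downstream reading frame, which here exposes a stop codon and turns the call from on to off.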

Phylogenetic analysis of babA, oipA, sabA, and homB genes

To compare the differences between strains, a total of 27 reference strains with complete genome sequences (listed in the GenBank database) and geographic information were obtained by reviewing the literature [23, 24]. The reference strains were as follows: hspEAsia: 51, 52, 83, 35A, F16, F32, F57; hspAmerind: cuz20, Pecan4, Puno135, sat464, shi169, shi417, shi470, v225d; hpEurope: 26695, B8, HPAG1, G27, Lithuania75, SJM180; hspWAfrica: J99, Pecan18; hspSouthIndia: India7, SNT49; hpAfrica1: SouthAfrica7, SouthAfrica20. We performed the phylogenetic analysis of babA, oipA, sabA, and homB genes in isolates from two aspects: different gastroduodenal diseases (177/177 isolates) and different pathological parameters (94/177 isolates). The phylogenetic trees were reconstructed in MEGA by the neighbor-joining method with bootstrap analysis of 1000 replications. To enhance the image contrast, the phylogenetic trees were further edited with an online tool (Evolview, http://www.evolgenius.info/evolview).

Statistical analysis

One-way analysis of variance, the Chi-square test, or Fisher’s exact test was used to compare the corresponding variables across groups in IBM SPSS Statistics 21.0 software (IBM Corp., USA). All tests were two-sided, and a P value of < 0.05 was considered statistically significant.
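For readers unfamiliar with Fisher’s exact test, the two-sided P value for a 2 × 2 table can be computed directly from the hypergeometric distribution. The sketch below uses only the Python standard library and is not the SPSS implementation; the function name and the example counts are hypothetical and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Add up every table that is no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Illustrative counts only: gene present/absent in two hypothetical groups.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4), p_value < 0.05)
```

The small relative tolerance in the comparison guards against floating-point ties between tables that are equally probable in exact arithmetic.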

Results

Prevalence of babA, oipA, sabA, and homB genes

The prevalence of babA, oipA, sabA, and homB genes in the 177 isolates from Chinese patients with different gastroduodenal diseases (49 chronic gastritis, 19 gastric ulcer, 33 gastric cancer, and 76 duodenal ulcer) was 91.5%, 100%, 94.0%, and 95.5%, respectively. In addition, the prevalence of babA, oipA, sabA, and homB genes in the 94 isolates with pathological information (41 superficial gastritis, 24 intestinal hyperplasia, and 29 gastric adenocarcinoma) was 92.5%, 100%, 89.4%, and 97.9%, respectively. However, neither gastroduodenal disease nor pathological parameters were related to the prevalence of these four OMP genes (all P > 0.05).

Genetic polymorphism and evolutionary selection

Among the 177 oipA sequences (912 bases), 250 (27.4%) variable sites were found, comprising 141 (15.5%) parsimony-informative sites and 109 (11.95%) singleton variable sites. The proportions of variable sites in babA, sabA, and homB were 55.3%, 43.94%, and 43.16%, respectively. The Pi of the oipA gene among Chinese isolates was 0.023, which was far lower than that in the reference strains (0.271). The average K of the oipA gene was also lower in the Chinese isolates than in the reference strains (20.82 vs. 246.13). The details of these parameters for the remaining three genes (babA, sabA, and homB) are shown in Table 2. In addition, the Ka/Ks ratios were calculated to identify the selection status of babA, oipA, sabA, and homB genes. In general, all four OMP genes were under positive selection (Ka/Ks > 1). As shown in Table 3, neither genetic polymorphism nor evolutionary selection of the four OMP genes was related to the pathological parameters (all P > 0.05).
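The polymorphism parameters reported here are linked by simple arithmetic: K is the mean number of pairwise nucleotide differences, and Pi is commonly defined as K divided by the number of sites compared. The sketch below computes both for a toy, gap-free alignment and then checks the oipA figures quoted in the text. It is an illustration rather than the DnaSP algorithm; the function names and the toy sequences are hypothetical, and gap handling is deliberately left out.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count mismatched positions between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def diversity(alignment):
    """Return (variable_sites, K, Pi) for a list of aligned sequences."""
    length = len(alignment[0])
    variable_sites = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    pairs = list(combinations(alignment, 2))
    k = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    return variable_sites, k, k / length


# Toy alignment of four 8-base sequences (not data from this study).
toy = ["ATGCATGC", "ATGCATGA", "ATGTATGC", "ATGCATGC"]
toy_variable, toy_k, toy_pi = diversity(toy)

# Consistency check on the reported oipA values (912 aligned bases).
oipa_sites = 912
fraction_variable = 250 / oipa_sites   # about 0.274, matching 27.4%
k_over_length = 20.82 / oipa_sites     # about 0.023, matching the reported Pi

print(toy_variable, toy_k, toy_pi)
print(round(fraction_variable * 100, 1), round(k_over_length, 3))
```

Under this definition, the reported K of 20.82 over 912 bases reproduces the reported Pi of 0.023 for the Chinese isolates.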

Table 2 Polymorphic analysis of oipA, babA, sabA, and homB genes
Table 3 Polymorphism of oipA, babA, sabA, and homB genes and its correlation with pathological parameters

The functional status of babA, oipA, sabA, and homB genes

All oipA sequences showed an on functional status, with 12 repeat patterns of the CT dinucleotide (Table 4). The CT dinucleotide repeat patterns in the coding region were identified and classified. Of the 166 sabA sequences, 128 (77.1%) had a switch on status and 38 (22.9%) had an off status. The CT dinucleotide repeat patterns of sabA varied among sequences and differed from those of oipA (Table 5). Of the 147 babA sequences, 132 (89.8%) had an on functional status and 15 (10.2%) had an off status. Of the 169 homB sequences, 161 (95.3%) were on and 8 (4.7%) were off. No CT dinucleotide repeat pattern or other regular pattern was found in the babA and homB sequences with an off status; these sequences mainly showed irregular single nucleotide polymorphisms.

Table 4 The switch status of the signal sequence-coding region for the oipA gene
Table 5 The switch status of the signal sequence-coding region for the sabA gene

Phylogenetic analysis of four OMP genes

Phylogenetic analysis based on the full-length sequences of the oipA, babA, sabA, and homB genes was performed separately for each gene to examine the correlation between isolates and gastroduodenal diseases. The phylogenetic tree of the oipA gene comprised six clades (Fig. 1): three main clades that included most Chinese strains and all HpAsia strains (Asian clade); an independent clade containing a single strain from a patient with duodenal ulcer; a clade in which two strains clustered with Western strains (Western clade); and a clade of hspAmerind strains (hspAmerind clade). The phylogenetic tree of the babA gene (Fig. 2) showed features similar to those of oipA. However, the phylogenetic trees of the sabA and homB genes displayed a different pattern from that of oipA. The phylogenetic tree of the sabA gene was mainly grouped into three clades (Fig. 3): the first clade was a single strain isolated from a patient with gastric cancer; the second clade included half of the Chinese strains; the third clade included a few Chinese strains and all reference strains. Similarly, the phylogenetic tree of the homB gene (Fig. 4) showed the same features as that of the sabA gene. However, sequences from patients with the same disease did not cluster together in the phylogenetic analysis, contrary to expectation.

Fig. 1 Phylogenetic analysis of oipA gene sequence. HpChina: isolates from Chinese patients with different clinical information and pathological parameters; the others: reference strains from the GenBank database; CG: chronic gastritis; GU: gastric ulcer; DU: duodenal ulcer; CSG: chronic superficial gastritis; IM: intestinal metaplasia; GC: gastric cancer

Fig. 2 Phylogenetic analysis of babA gene sequence. HpChina: isolates from Chinese patients with different clinical information and pathological parameters; the others: reference strains from the GenBank database; CG: chronic gastritis; GU: gastric ulcer; DU: duodenal ulcer; CSG: chronic superficial gastritis; IM: intestinal metaplasia; GC: gastric cancer

Fig. 3 Phylogenetic analysis of sabA gene sequence. HpChina: isolates from Chinese patients with different clinical information and pathological parameters; the others: reference strains from the GenBank database; CG: chronic gastritis; GU: gastric ulcer; DU: duodenal ulcer; CSG: chronic superficial gastritis; IM: intestinal metaplasia; GC: gastric cancer

Fig. 4 Phylogenetic analysis of homB gene sequence. HpChina: isolates from Chinese patients with different clinical information and pathological parameters; the others: reference strains from the GenBank database; CG: chronic gastritis; GU: gastric ulcer; DU: duodenal ulcer; CSG: chronic superficial gastritis; IM: intestinal metaplasia; GC: gastric cancer

Discussion

It has been reported that some OMPs play a vital role in the pathogenesis of H. pylori [8, 25]. Moreover, OMPs can cooperate with other virulence factors to produce diverse disease outcomes, including gastric cancer. However, the strains in these studies were isolated from different geographical regions, and the four OMP genes were investigated by different methods. Results from PCR analysis were the most commonly reported, but the correlation between OMP genes (i.e., babA, oipA, sabA, and homB) and different gastroduodenal diseases was controversial [26, 27]. A possible reason is that these genes had not been sequenced at full length and the relationship between single nucleotide polymorphisms and different diseases had not been well analyzed.

In our study, the babA, oipA, sabA, and homB genes were detected with the multi-primer PCR assay and sequenced at full length; their prevalence in isolates from Chinese patients was 91.5%, 100%, 94.0%, and 95.5%, respectively. These results were largely consistent with reports from other Asian countries (> 90%; South Korea, Japan) [28, 29], but higher than those from Western countries (< 80%; America, Germany) [17, 19, 30]. The prevalence of gastric cancer varies with geographic area, and the high incidence of gastric cancer in East Asian countries may be explained, at least in part, by differences in the genotypes of H. pylori strains [5]. Therefore, the differences between strains could be further explained by the different prevalence of H. pylori OMP genes.

The high prevalence of babA, oipA, and sabA genes has been reported to influence the correlation between H. pylori and gastroduodenal diseases [18, 31]. Yamaoka et al. [17] found that oipA was associated with gastritis, peptic ulcer, and gastric cancer, while sabA was associated with gastric cancer and intestinal metaplasia in American patients. Gerhard et al. [32] found that babA was associated with peptic ulcer and gastric cancer in German patients. The homB gene might be associated with susceptibility to gastric cancer development in American and Colombian patients and contribute to the determination of clinical outcomes in patients from East Asian and Western countries [33]. However, not all reported relationships between the four OMP genes and gastroduodenal diseases were consistent with the above results [19, 34,35,36]. In line with these latter reports, no correlation between the prevalence of babA, oipA, sabA, and homB genes and gastroduodenal diseases or pathological parameters was observed in the present study (P > 0.05).

It was therefore necessary to analyze genetic diversity and evolutionary selection to further examine the relationship between OMP genes and gastroduodenal diseases. Compared with the oipA gene (Pi < 0.1), the other three OMP genes (babA, sabA, and homB) had higher genetic diversity (Pi > 0.1). The genetic diversity of the oipA and babA genes was lower than that of the reference sequences, while that of the sabA and homB genes was higher. H. pylori strains with genetic polymorphism have an extraordinary capability to survive under intense selective pressure and to colonize the hostile gastric environment [37, 38]. H. pylori genes were highly divergent between East Asian and non-Asian strains, and 86% of them exhibited positive selection [39]. Our results showed that all four OMP genes were under positive selection (Ka/Ks > 1). One study found that the Asian strain was more evolutionarily capable than the European strain, and suggested that the number of genes under positive selection was proportional to the stress of adaptation [40]. These results indicate that the pathogenic factors of H. pylori are under positive evolutionary selection, thus gaining the ability to adapt and survive in the gastric mucosa.

We also considered all mutation events and predicted functional states at the genetic level, including one or more non-synonymous mutations. The switch status of most of the four OMP genes in this study was predominantly on (> 90%). The switch status of a gene may affect bacterial characteristics such as virulence [22]. The OipA protein was identified as a factor that induces a pro-inflammatory response, and the functional status of the oipA gene was relevant to gastritis and gastric cancer. Expression of oipA and sabA is regulated by the number of CT dinucleotide repeats in the N-terminal signal peptide-coding region [41, 42]. All strains in this study had an on functional status for the oipA gene, which was consistent with the results from Japanese strains. A total of 13 different CT patterns were found in the 177 oipA sequences, none of which had more than 6 CT repeats, and the most prevalent CT pattern was 3 + 1. Ando et al. [7] studied 109 strains from nine countries and found that 67% (31/46) of strains from East Asia had no more than 6 CT repeats, while 70% (23/33) of strains from Western countries had 6 to 12 CT repeats. These results reveal a distinct difference in CT repeat numbers between East Asian and Western strains, implying a geographical distribution characteristic of the oipA gene.

Unlike the oipA gene, various CT patterns of the sabA gene were detected in the 166 sequenced genes, suggesting that the sabA gene switches between on and off status with great flexibility. It has also been reported that the sabA gene is highly flexible and diverse. A study from Taiwan showed that 80% of strains were sabA gene-positive but only 31% expressed SabA protein [43]. SabA was first found to mediate binding to the sialyl-Lewis x (sLex) antigen, although this interaction was weaker than BabA-Leb adherence. Both the changeable on/off switching and the weak binding to sLex may help H. pylori escape the host’s immune response, allowing long-term colonization of the stomach that leads to a chronic inflammatory response and subsequent gastric mucosal lesions.

A phylogenetic tree was constructed separately for each of the four OMP genes from sequences with different disease information, but no clustering by disease was observed. These results indirectly showed that none of the four OMP genes could effectively distinguish between gastroduodenal diseases. H. pylori has gradually evolved into different geographical clusters alongside human evolution and the development of society [39]. When reference strains from other regions of the world were included, the phylogenetic trees of both the oipA and babA genes showed regional clustering. However, the clustering of the sabA and homB genes was relatively dispersed. In addition, almost all Chinese strains and East Asian strains clustered together and could be classified as the main branch. The four OMP genes showed multiple subbranches in the clustering of Chinese and East Asian strains, which seemed to have more evolutionary diversity than the reference strains from the West. The evolutionary diversity of OMP genes may be related to the regional environment, allowing H. pylori to better adapt to its living conditions.

Our study had some limitations that should not be ignored. First, we were unable to determine the temporal order between infection with the isolated strains and the development of gastroduodenal diseases. Second, there was no corresponding pathological information for duodenal ulcer in our sample. Finally, because of the pathogenic characteristics of H. pylori, we could not obtain strains from individuals without disease and used strains from patients with chronic superficial gastritis for direct comparison instead.

Conclusion

The babA, oipA, sabA, and homB genes of H. pylori strains isolated from Chinese patients were highly polymorphic and under positive selection (Ka/Ks > 1), but showed no significant association with gastroduodenal diseases or pathological changes in the present study. Nevertheless, other OMPs that could modulate interactions between H. pylori and gastroduodenal diseases deserve further investigation.