Introduction

The genus Vibrio consists of 103 species [1]. Of these, only ten have been implicated in gastrointestinal and extra-intestinal diseases in humans. Vibrio species generally inhabit marine niches. They have been isolated from human stool, vomitus, blood, and wound infections, and from environmental sources such as seawater, sediments, plankton, and shellfish (oysters, clams and crabs) [2, 3]. The species of greatest medical importance are V. alginolyticus, V. carchariae, V. cholerae, V. cincinnatiensis, V. fluvialis, V. furnissii, V. metschnikovii, V. mimicus, V. parahaemolyticus, and V. vulnificus [4]. V. parahaemolyticus is transmitted to humans through contaminated seafood and causes acute gastroenteritis with diarrhea [2]. V. cholerae and V. vulnificus are responsible for other serious, life-threatening infections in humans [2, 5].

Identification of Vibrios

Vibrio cultures are identified by colonial appearance, Gram stain, serology (agglutination with specific antisera), and biochemical tests such as the oxidase test, the Voges–Proskauer test, and sensitivity to pteridine O129 [6, 7]. For species-level identification, matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) is being employed [8]. This approach is effective in distinguishing very closely related organisms, for example Photobacterium damselae and Grimontia hollisae isolates from Vibrio species [8]. The highly conserved rrs gene (16S rRNA) is the most widely used gene for detecting bacteria. Although quite effective and precise, it does have some limitations. Species-specific genes allow distinction between pathogenic and non-pathogenic strains. Amplification and sequencing of the dnaJ gene has been instrumental in identifying V. alginolyticus, V. cholerae, V. mimicus, V. parahaemolyticus, and V. vulnificus, whereas toxR amplified by real-time quantitative PCR was found to be useful for detecting V. vulnificus in patients with skin and soft tissue infections [9, 10]. To distinguish Vibrio from Aeromonas species in patients showing cholera-like symptoms, a duplex PCR directed at the genes rrs and gcat (encoding cholesterol acyltransferase) has been used [11]. The notI and sfiI genes have also proved helpful in distinguishing different species of Vibrio [12, 13]. Multiplex PCR sequencing of rpoB along with hsp60, sodB, and flaE was employed to distinguish four species of Vibrio: V. cholerae, V. mimicus, V. parahaemolyticus, and V. vulnificus [14]. A pPCR assay to simultaneously detect virulent and non-virulent strains of V. vulnificus and V. parahaemolyticus was based on the viuB, tdh, trh, vvhA, and tlh genes as biomarkers [15].

A loop-mediated isothermal amplification (LAMP) protocol targeting the ompW gene, which encodes an outer membrane protein, was designed to detect V. cholerae, whereas LAMP targeting the cytolysin/hemolysin gene (vvhA) rapidly identified V. vulnificus with tenfold higher sensitivity than the conventional PCR method [16, 17]. An innovative combination of LAMP and a lateral flow dipstick targeting the vhhP2 and rpoX genes allowed rapid and sensitive detection of V. alginolyticus and V. harveyi [18, 19]. LAMP-based detection has also targeted the rpoS and vcgC genes of V. vulnificus, the α subunit gene of RNA polymerase of Vibrio coralliilyticus, and the thermolabile hemolysin gene (tlh) of V. parahaemolyticus [20–23].

A few other methods are employed for typing clinical isolates: (i) multi-locus sequence typing, (ii) multiple-locus variable number tandem repeat analysis, and (iii) whole genome sequencing [24–26]. In spite of their high accuracy, discriminatory power, and reproducibility, these methods are limited to reference laboratories and are not easy to implement for routine assays. They are costly, time-consuming, and require special equipment [27]. Whole genome sequencing is relatively more promising as a rapid, accurate, and comprehensive technique with much wider implications and utility [28]. Rapid and accurate identification of pathogenic bacteria has always been a challenge, and molecular tools have proven helpful in meeting it. A range of recently developed genomic tools has enabled elucidation of the latent features of the highly conserved gene rrs [29–34]. However, this gene has not proved effective in identifying organisms that possess multiple copies of rrs, e.g., Clostridium and Yersinia [35, 36]. Identification of Vibrio has been a difficult task. Different researchers have used a variety of genes, including rrs, as biomarkers for distinguishing Vibrio species, but rrs alone has not proved very effective. What is needed is a consensus gene with unique features that can serve as a biomarker for rapid diagnosis. Here, we segregated the genes that were common to all species within a genus and digested them in silico with various type II restriction endonucleases (RE). Species within each genus could be segregated by different sets of gene-RE combinations.

Materials and Methods

Sequence Data and Comparative Genome Analysis

Completely sequenced genomes of eight Vibrio species were retrieved from NCBI (http://www.ncbi.nlm.nih.gov/): V. anguillarum, V. cholerae, V. fischeri, V. nigripulchritudo, V. parahaemolyticus, V. tasmaniensis, V. tubiashii, and V. vulnificus. The characteristics of these genomes are presented in Table S1. Genes common to all the Vibrio genomes were identified by pair-wise comparisons (Table S2). Among the 8 genomes, 108 common protein-encoding genes could be segregated. Of these 108, we selected 24 to represent the whole range of gene sizes, from 113 nucleotides (nts) to 3494 nts (Tables S2 and S3). The highly conserved non-protein-coding gene rrs was taken as the reference because it is widely used for identifying bacteria.
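The selection step can be illustrated with a minimal Python sketch. It assumes that genes have already been matched across genomes by name, which simplifies the pair-wise comparison described above, and the even-by-rank sampling in `spread_by_size` is a hypothetical stand-in for the manual choice of 24 genes; both function names are illustrative.

```python
from functools import reduce

def common_genes(genomes):
    """Return the gene names present in every genome.

    genomes: dict mapping genome label -> iterable of gene names.
    Assumes orthologous genes share a name across genomes.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return []
    return sorted(reduce(set.intersection, gene_sets))

def spread_by_size(gene_lengths, k):
    """Pick k genes that span the size range (hypothetical rule).

    gene_lengths: dict mapping gene name -> length in nucleotides.
    Genes are ranked by length and sampled at evenly spaced ranks,
    so the shortest and longest genes are always included when k >= 2.
    """
    ranked = sorted(gene_lengths, key=gene_lengths.get)
    if k >= len(ranked):
        return ranked
    if k == 1:
        return ranked[:1]
    step = (len(ranked) - 1) / (k - 1)
    return [ranked[round(i * step)] for i in range(k)]
```

Applied to the data of this study, `common_genes` would correspond to the step that yields the 108 shared genes, and `spread_by_size` to a reproducible way of drawing a size-representative subset from them.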

Restriction Endonuclease Analysis of Common Genes

All the selected genes were subjected to digestion with ten Type II REs: (i) eight four-base cutters, AluI (AG’CT), BfaI (C’TA_G), BfuCI (_GATC’), CviAII (C_AT’G), HpyCH4V (TG’CA), RsaI (GT’AC), TaqI (T_CG’A), and Tru9I (T_TA’A), and (ii) two six-base cutters, HaeI (WGG’CCW) and Hin1I (GR_CG’YC) [36]. RE digestion patterns of all 24 gene sequences along with rrs (Table S3) were analysed with Cleaver (http://cleaver.sourceforge.net/). Data matrices of REs generating 5–15 fragments were considered for consensus RE patterns [35, 36]. Vibrio species were then identified on the basis of unique gene-RE combinations.

Results

The completely sequenced genomes of Vibrio spp. (V. anguillarum, V. cholerae, V. fischeri, V. nigripulchritudo, V. parahaemolyticus, V. tasmaniensis, V. tubiashii, and V. vulnificus) varied in size from 4.03 to 6.32 Mb. Each genome is composed of 3656 to 5807 genes, with an overall GC content in the range of 43.87–47.49 mol% (Table S1).

In Silico rrs Gene Analysis of Vibrio Species

The number of rrs copies per genome of the Vibrio strains varied from 7 to 11. Within each genome, the rrs copies showed high similarity. Multiple sequence alignments of 69 copies of rrs from the eight Vibrio genomes allowed us to conclude that these can be represented by ten groups containing 1–11 copies each, i.e., 67 copies are highly similar among themselves. RE digestion of the rrs sequences showed that only a few copies in each species can be designated as unique: V. anguillarum (3/7 copies), V. cholerae (1/8 copies), V. fischeri (3/11 copies), V. nigripulchritudo (4/8 copies), V. parahaemolyticus (1/10 copies), V. tasmaniensis (2/7 copies), V. tubiashii (2/10 copies), and V. vulnificus (1/8 copies) (Table S4). Thus, rrs is not a good candidate gene for distinguishing Vibrio species unless all its copies are sequenced, and other gene sequences are needed to derive meaningful conclusions.
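The per-species tally of unique rrs copies can be expressed as a short Python sketch. It assumes that a copy counts as unique when its fragment pattern does not occur in any copy of another species; this definition and the function name are illustrative, since the exact criterion is not restated here.

```python
from collections import defaultdict

def unique_copy_counts(patterns_by_species):
    """Count, per species, the gene copies with a species-specific pattern.

    patterns_by_species: dict mapping species -> list of fragment-length
    lists, one list per gene copy.
    Returns a dict mapping species -> (unique copies, total copies).
    """
    owners = defaultdict(set)          # pattern -> species that show it
    for species, copies in patterns_by_species.items():
        for pattern in copies:
            owners[tuple(pattern)].add(species)
    return {
        species: (
            sum(1 for p in copies if owners[tuple(p)] == {species}),
            len(copies),
        )
        for species, copies in patterns_by_species.items()
    }
```

For a species with seven rrs copies of which three have patterns seen nowhere else, the function would return `(3, 7)`, matching the "3/7 copies" style of reporting used above.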

In Silico RE Digestion Patterns of Common Genes

Because unique RE digestion patterns in rrs could not be deduced for any of the Vibrio genomes, genes that were common among them were analyzed. Genome-wide comparison led to the identification of 108 common genes in these 8 Vibrio genomes. Out of these 108 genes, we selected 24, which varied in size from 113 to 3494 nts, in such a manner that genes of all sizes were represented (Tables S2 and S3).

In silico RE digestion patterns of the 24 common genes with 10 different REs revealed some very interesting features. Of these 24 genes, 9 could be used for distinguishing most of the genomes: dapF, fadA, hisD, ilvH, lpxC, recF, recR, rph and ruvB (Tables S5–S13). The RE digestion patterns of the remaining 15 genes are presented as supplementary material (Tables S14–S28). For these genes, however, digestion generated a large number (ranging from 10 to 40) of small fragments, which made it difficult to deduce meaningful conclusions. Hence, they were not considered significant enough for further evaluation.

A comparative analysis of the nine genes and their RE digestion patterns revealed that fadA, hisD, and recF are the potential candidate genes that can be used as biomarkers. These three genes had unique RE digestion patterns with the REs AluI, BfuCI, CviAII, HpyCH4V, RsaI, TaqI and Tru9I. HaeI, Hin1I and BfaI did not prove very effective, as they scarcely cleave these nine genes.

(i) hisD, recF and fadA genes

In silico digestion of the hisD gene with the REs AluI, HpyCH4V, RsaI, and TaqI generated unique digestion patterns for all eight Vibrio genomes, whereas BfuCI and Tru9I allowed identification of seven species of Vibrio. Digestion of the recF gene with AluI, CviAII and Tru9I, on the other hand, resulted in unique digestion patterns for all eight Vibrio genomes, whereas BfuCI, HpyCH4V and TaqI were helpful in distinguishing seven species. It is interesting to note that these two genes showed contrasting behavior with different REs. The RE digestion patterns of the fadA gene with BfuCI, CviAII, HpyCH4V, and Tru9I were unique and thus could be used as distinct biomarkers, whereas AluI, RsaI and TaqI were effective in distinguishing 5–6 species of Vibrio. Three Vibrio genomes showed resistance to digestion with certain REs: V. cholerae (AE003852) to AluI, Tru9I and TaqI; V. tasmaniensis (FM954972) to TaqI and Tru9I; and V. vulnificus (BA000037) to BfuCI and RsaI (Tables 1, 2, 3).
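The statements above, that a given gene-RE combination separates all eight or only seven species, amount to counting the species whose fragment pattern is shared with no other species. A minimal Python sketch of that count follows; the function names and the ranking rule are illustrative assumptions, not part of the published workflow.

```python
from collections import Counter

def distinguished(patterns):
    """Return the species whose fragment pattern is shared with no other.

    patterns: dict mapping species -> list of fragment lengths (5' to 3')
    for a single gene-RE combination.
    """
    counts = Counter(tuple(p) for p in patterns.values())
    return sorted(sp for sp, p in patterns.items() if counts[tuple(p)] == 1)

def rank_combinations(table):
    """Rank gene-RE combinations by the number of species they separate.

    table: dict mapping (gene, enzyme) -> {species: fragment lengths}.
    Returns (number distinguished, gene, enzyme) tuples, best first.
    """
    scored = [
        (len(distinguished(patterns)), gene, enzyme)
        for (gene, enzyme), patterns in table.items()
    ]
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))
```

A combination in which every species has its own pattern scores the full number of species, whereas a combination in which two species share a pattern loses both of them from the count.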

Table 1 Unique in silico restriction endonuclease digestion pattern (5′–3′) of hisD gene of Vibrio genomes
Table 2 Unique in silico restriction endonuclease digestion pattern (5′–3′) of recF gene of Vibrio genomes
Table 3 Unique in silico restriction endonuclease digestion pattern (5′–3′) of fadA gene of Vibrio genomes

(ii) dapF, ilvH, lpxC, recR, rph and ruvB genes

The digestion of these genes allowed segregation of 5–8 genomes of Vibrio. The genomes which could not be digested with most of the REs were: V. anguillarum (CP002284) and V. fischeri (CP000020) (Tables 4, 5, 6, 7, 8, 9).

Table 4 Unique in silico restriction endonuclease digestion pattern (5′–3′) of ruvB gene of Vibrio genomes
Table 5 Unique in silico restriction endonuclease digestion pattern (5′–3′) of lpxC gene of Vibrio genomes
Table 6 Unique in silico restriction endonuclease digestion pattern (5′–3′) of dapF gene of Vibrio genomes
Table 7 Unique in silico restriction endonuclease digestion pattern (5′–3′) of rph gene of Vibrio genomes
Table 8 Unique in silico restriction endonuclease digestion pattern (5′–3′) of ilvH gene of Vibrio genomes
Table 9 Unique in silico restriction endonuclease digestion pattern (5′–3′) of recR gene of Vibrio genomes

(iii) The rest of the genes (15)

The remaining 15 genes distinguished only certain genomes, and at low frequency. The information on their RE digestion patterns can be used to supplement that generated with the other genes. Thus, although only occasionally informative, these genes have some potential as biomarkers (Tables S14–S28).

Discussion

Bacterial identification based on the rrs gene has turned out to be quite effective. However, organisms having multiple copies of this gene show high intra-genomic heterogeneity, which may lead to overestimation of the existing variability [37, 38]. A high level of similarity among the different copies of rrs present in different Vibrio strains further complicates the distinction of closely related organisms. In the case of Vibrio species, a host of genes have been employed for their identification from time to time: dnaJ, flaE, hsp60, notI, ompW, recA, rpoA, rpoB, rpoX, rpoS, sfiI, sodB, tdh, tlh, toxR, trh, vcgC, vhhP2, viuB, and vvhA. This indicates that no consensus gene is available so far. It further highlights that rrs has not been very fruitful for accurate identification. The markedly lower sequence similarity of the dnaJ gene (77.9 %) compared with that of the rrs gene (97.2 %) implied its high discriminatory power for Vibrio species [9]. Our study has also shown that rrs alone cannot be used for identifying Vibrio up to the species level. In fact, a LAMP assay targeting the tlh gene identified 143 V. parahaemolyticus strains but did not detect 33 other Vibrio spp. or a large number of non-Vibrio strains [39]. A LAMP assay targeting the toxR gene correctly detected 36 V. parahaemolyticus strains [40]. Multiplex PCR sequencing of rpoB along with hsp60, sodB and flaE was employed to distinguish four species of Vibrio: V. cholerae, V. mimicus, V. parahaemolyticus, and V. vulnificus. In that work, the rrs gene was used as a positive internal control [14], which implies that this gene alone was not sufficient for identifying Vibrio species.

Some of the genes reported in the literature are among the common genes detected in our study; these include flaE, recA, rpoA, rpoB, and sodB. In silico RE digestion of these genes (Tables S29–S33) revealed that, with the exception of rpoB, they can also be used to distinguish all eight Vibrio species. RE digestion of the rpoB gene leads to an unmanageable number of fragments (25–30), which are difficult to analyse. The latter also cannot be recommended as a candidate gene because its large size (4029 nts; Table S30) makes amplification difficult. Our study allows us to conclude that the genes fadA, hisD, and recF, varying in size between 1080 and 1296 nts, are, in combination with certain REs, the most suitable candidates for identification of all 8 Vibrio species. In practice, the specific gene is amplified by polymerase chain reaction and the amplicon is digested with the defined RE. The second category of genes, dapF, ilvH, lpxC, recR, rph and ruvB (495–1004 nts), provides reasonably good information, which can also be exploited for identification of Vibrio. Because these genes can be expected to be present in other related genera, for example hisD in Escherichia coli, we analysed the RE digestion patterns obtained with the ten REs used here (data not shown); they revealed clear-cut differences between the two genera. This protocol can thus be exploited for rapid diagnosis of Vibrio species.
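The proposed use of the protocol, amplifying a gene, digesting the amplicon, and comparing the fragments with the in silico reference patterns, can be sketched as a simple lookup in Python. The sketch assumes that a gel yields fragment sizes without their 5′–3′ order, so both patterns are compared after sorting; the `tolerance` parameter (allowed size error in nucleotides) and the function name are hypothetical and not part of the study.

```python
def identify(observed, reference, tolerance=0):
    """Return the species whose reference pattern matches the observed one.

    observed: fragment sizes estimated from a gel, in any order.
    reference: dict mapping species -> in silico fragment lengths for the
        same gene-RE combination.
    tolerance: hypothetical allowed deviation per fragment, in nucleotides.
    """
    obs = sorted(observed, reverse=True)
    hits = []
    for species, fragments in reference.items():
        ref = sorted(fragments, reverse=True)
        same_count = len(ref) == len(obs)
        if same_count and all(abs(a - b) <= tolerance
                              for a, b in zip(obs, ref)):
            hits.append(species)
    return sorted(hits)
```

An empty result means that no reference pattern matches, and more than one hit means that the chosen gene-RE combination does not separate those species at the given tolerance, in which case a second combination would be needed.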