Introduction

The study of human vaginal microbiota for the health of pregnant women and neonates is still in its infancy. Both traditional cultivation and culture-independent methods have shown similar patterns indicating that the indigenous vaginal microbiota in healthy pregnant women is typically dominated by Lactobacillus spp. [1, 2]. Using cultivation and Gram-staining, a prospective cohort study (at the first, second, and third pregnancy trimesters) of pregnant women in Ghent classified the vaginal microbiome into four categories. The first category (I) is composed mainly of Lactobacillus, in which Ia and Iab were dominated by L. crispatus, while Ib was predominantly L. iners and L. gasseri [3]. A combination of Gram-staining and terminal restriction fragment length polymorphism (tRFLP) methods suggested that the presence of L. crispatus during early gestation ensured a stable microflora, whereas L. gasseri and L. iners were likely to vary over time and strongly predispose the vagina to bacterial overgrowth during pregnancy. Richard et al. [4] demonstrated that vaginal microbiota composed solely of Lactobacillus spp. at the time of embryo transfer yielded the best prospect for a successful outcome during an IVF-ET procedure. Lactobacillus species play key protective roles by lowering the environmental pH through lactic acid production [5], thus stimulating the local innate immune system and decreasing symptoms and complications during pregnancy [6]. In contrast, a disturbed vaginal ecosystem is thought to be associated with adverse pregnancy outcomes, such as preterm labor, preterm rupture of membranes, and an increased risk of maternal and fetal morbidity [7, 8].

The recent development of novel methods to determine 16S rRNA gene tags using next generation sequencing (NGS) techniques provides a more detailed and integrative view of human microbiomes. Studies have suggested that a dominance of Lactobacillus might not be the only state of a normal vaginal microbiome, and that the vaginal microbial community in individual women may change dramatically over time [9]. Whether such changes in the vaginal microbiota occur during pregnancy, and how vaginal microbiota types correlate with prenatal health, remain largely unknown. Using pyrosequencing, Aagaard et al. [10] demonstrated that the overall diversity and richness of the vaginal microbiome were reduced during pregnancy. Recently, the vaginal microbiota in normal pregnant women was reported to be more stable than in non-pregnant women [11]. These conflicting reports suggest that the vaginal microbiome during pregnancy is understudied.

Sampling the vaginal microbiome involves physical intervention, which may pose a potential threat to the final delivery outcome. Particularly under the one-child policy in China, the sampling process, and especially the sampling site, is of concern for pregnant women, as sampling at different vaginal anatomic sites carries different risks of physical injury to the vagina. The complex structure of the female genital tract ecosystem can be divided into several different microenvironments, such as the lower part of the endocervix, the ectocervix and the vagina [12]. Culture-based studies have reported that the majority of women harbor distinctive bacterial populations in the cervix and vaginal canal [13]. These studies also demonstrated that the vaginal flora was a dynamic ecosystem that was subject to change and that the cervix represented a unique ecologic niche [14]. These observations are consistent with the results reported by Ling et al. that the total numbers of bacteria were significantly lower in the ectocervix than in the vagina [15]. Using 16S rRNA gene sequencing, Kim et al. demonstrated heterogeneity in microbial populations across the cervix, fornix, and outer vaginal canal in non-pregnant women [16]. In addition, a cross-sectional pyrosequencing study demonstrated that taxa varied across the vaginal subsites (introitus, posterior fornix, and mid-vagina) [10]. However, Forney et al. demonstrated that self-collected vaginal swabs from the mid-vagina reflect the same microbial diversity as physician-collected vaginal specimens [17]. Overall, whether vaginal microbial populations differ across anatomic sites remains controversial, and more evidence is needed.

In the present study, we used the barcoded Illumina paired-end sequencing (BIPES) technique [18] to characterize the vaginal microbial communities at three different subsites: the cervix, posterior fornix and vaginal canal. We sampled women who were not pregnant, women in each of the three trimesters and women who were postpartum to evaluate whether there were any sampling-site variations across these pregnancy conditions. These results provide a direct comparison of the vaginal microbiome diversity across pregnancy stages.

Materials and Methods

Ethical Statement

The study was approved by the Ethical Committee of Southern Medical University, and all participants provided written informed consent.

Sample Collection

Women were recruited during a routine obstetrical visit at Southern Medical University in Guangzhou, China. All of the subjects were Chinese, with ages ranging from 19.4 to 39.2 years. Individuals who were asymptomatic and showed no clinical signs of vaginal disease upon examination by an obstetrician (Y.W.), including evidence of vaginal discharge, amine or fishy odor, and a vaginal pH of >4.5, were included in the study. Individuals who had taken antibiotics or antifungal drugs in the past 30 days, or who had had sexual intercourse or used douches or vaginal medications in the 48 h prior to sample collection, were excluded from the study.

A sterile speculum examination was performed by a single obstetrician (Y.W.) to collect vaginal fluid. For each individual, 9 sterile plastic swabs (JiangSuKangJian Medical Treatment Articles Co., Ltd.) were obtained, in triplicate from each of the cervix, posterior fornix and vaginal canal. A total of 306 vaginal swabs were collected from 34 subjects between June and July 2012 in the obstetrical department at Southern Medical University. The 34 subjects were divided into 5 groups: non-pregnant (5 subjects), T1 (6 subjects), T2 (6 subjects), T3 (12 subjects), and postpartum (5 subjects; Table 1). Swabs were frozen within 4 h after collection and stored at −80 °C until use.

Table 1 Participants and samples included in the study

Total Bacterial Genomic DNA Extraction

Bacterial DNA was extracted from the vaginal swabs using the DNA MAGNETICS and EXTRACT kit (Shenzhen BioEAsy Biotechnologies. Co., Ltd., China) according to the manufacturer’s instructions [19]. The bacterial cells retrieved on the swabs were submerged in 250 μl of TNCa buffer and vigorously agitated to dislodge the cells. A total of 20 μl of proteinase K solution (20 mg/ml) was added, and the mixture was vortexed and then incubated at 56 °C for approximately 15 min. The lysis-binding buffer provided in the kit (200 μl) was added, followed by 200 μl of absolute ethyl alcohol and 40 μl of magnetic beads, and the mixture was agitated for 20 s. The samples were left to stand at room temperature for 10 min and were agitated every 2 min. The mixtures were left on a magnetic shelf for 20 s to settle, and the supernatants were discarded. Then, 500 μl of W1 wash buffer was added; the mixture was agitated for 15 s and then placed on a magnetic shelf for 20 s to settle. After discarding the supernatant, 700 μl of W2 wash buffer was added, and the mixture was agitated for 15 s and then placed on a magnetic shelf for 20 s to settle. The supernatant was discarded and the sample tube was kept at 56 °C for 7 min to dry the magnetic beads. One hundred microliters of elution buffer was added, and the solution was agitated for 15 s. The sample tube was incubated in a 65 °C water bath for 7 min, agitated for 15 s, and placed on a magnetic shelf for 20 s to extract the DNA, which was stored at −20 °C before PCR analysis.

PCR Amplification

We used the barcoded V4F 5′ GTGCCAGCMGCCGCGGTAA 3′ and V6R 5′ ACAGCCATGCANCACCT 3′ primers to amplify bacterial 16S rRNA V4-V6 fragments. The PCR cycle conditions were as follows: an initial denaturation step at 94 °C for 2 min; 24 cycles of 94 °C for 30 s, 52 °C for 30 s, and 72 °C for 30 s; and a final extension step at 72 °C for 5 min. Each 25 μl reaction consisted of 2.5 μl of Takara 10× Ex Taq Buffer (Mg2+ free), 2 μl of dNTP mix (2.5 mM each), 1.5 μl of Mg2+ (25 mM), 0.25 μl of Takara Ex Taq DNA polymerase (2.5 units), 1 μl of template DNA, 0.5 μl of 10 μM barcoded primer V4F, 0.5 μl of 10 μM primer V6R, and 17.75 μl of ddH2O. Equimolar amplicon suspensions were combined and subjected to paired-end 101 bp sequencing on an Illumina MiSeq sequencer at Novogene.

Data Analysis

We filtered out sequences containing ambiguous bases or mismatches in the primer regions. Because PE 101 bp sequencing is not able to span the V4 to V6 regions of the 16S rRNA gene, we used 30 Ns to concatenate the two single-ended sequences for the following analyses. UCHIME was used to remove chimeras in de novo mode (parameters were set as: –minchunk 20 –xn 7 –noskipgaps2) [20]. UCLUST was used to cluster the sequences using the default parameters, with the identity parameter set to 0.97. The RDP classifier was used to classify these sequences into specific taxa using the default database [21]. The Shannon index was applied to evaluate the alpha-diversity, and UniFrac distance was used to analyze the beta-diversity (multiple alignments were performed using PyNAST with the Greengenes core set as the template file; the two single-ended sequences of each gapped sequence were aligned separately and the alignments were merged thereafter) [22]. All of the analyses from clustering to alpha- and beta-diversity were performed with QIIME (1.5.0) [23]. Statistical analysis of the relative abundance of the genera and of the diversity indices and estimators was performed using SPSS version 17.0 [24]. Differentially abundant features were determined using linear discriminant analysis effect size (LEfSe) [25]. LEfSe is an algorithm for high-dimensional biomarker discovery and explanation that identifies genomic features characterizing the differences between two or more biological conditions. It determines the features most likely to explain differences between classes by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect size [25]. The threshold on the logarithmic LDA score for discriminative features was 4.0.
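As a minimal sketch of the alpha-diversity measure named above, the Shannon index of a single sample can be computed from its OTU counts as H = −Σ p_i log(p_i), where p_i is the proportion of reads assigned to OTU i. The base-2 logarithm used here is an assumption (it is the convention we understand QIIME to follow), and the function name and count vectors are hypothetical illustration values rather than data from this study.

```python
import math


def shannon_index(otu_counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with nonzero counts.

    The logarithm base is an assumption; change `base` to math.e for nats.
    """
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for count in otu_counts:
        if count > 0:  # OTUs with zero reads contribute nothing
            p = count / total
            h -= p * math.log(p, base)
    return h


# Hypothetical OTU count vectors (not data from this study).
dominated = [980, 10, 5, 5]    # one OTU dominates: low diversity
even = [250, 250, 250, 250]    # four equally abundant OTUs: higher diversity

h_dominated = shannon_index(dominated)
h_even = shannon_index(even)
```

A community dominated by a single Lactobacillus OTU therefore scores lower than a mixed community, which is the sense in which the index is used in the comparisons that follow.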

We used de novo [20] clustering and taxonomic assignment of 16S rRNA gene sequences that grouped sequences into OTUs. We further examined the species-level classification of Lactobacillus, as different Lactobacillus species can predispose the vagina to bacterial overgrowth and other vaginal imbalances during pregnancy [3, 4]. It has been reported that the prevalent Lactobacillus spp. in the vagina of White and Asian women comprise four distinct species, namely L. crispatus, L. iners, L. gasseri, and L. jensenii [1, 9, 11]. We downloaded the 16S rRNA sequences of these four species from NCBI, sliced out paired-end 80 bp reads from V4 to V6, which corresponds to the region used in our dataset, added 30 “Ns” to fuse the forward and reverse reads, and then used multiple sequence alignment (Clustal X) to compare the sequences. We found that 80 bp reads from V4 to V6 of the 16S rRNA gene can successfully distinguish the four Lactobacillus spp. (see the phylogenetic tree in Supplementary S6). The species-level classification of the four most abundant Lactobacillus OTUs in our dataset was done by BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) against the 16S rRNA database [1, 3]. As expected, the representative sequence of each of our Lactobacillus OTUs matched only one of the species L. iners, L. crispatus, L. gasseri, and L. jensenii, without ambiguity.
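The read-fusing step described above can be sketched as follows: take a fixed number of bases from each end of an amplicon and join them with a run of 30 Ns. This is only an illustration under stated assumptions. The function name, the toy amplicon, and the choice to keep the reverse-end segment in the same strand orientation as the forward end are all hypothetical and are not taken from the original pipeline.

```python
def simulate_joined_read(amplicon, read_len=80, n_pad=30):
    """Fuse the first and last `read_len` bases of an amplicon with `n_pad` Ns.

    Assumption: the reverse-end segment is kept in amplicon (forward-strand)
    orientation; a real pipeline may need to reverse-complement the read first.
    """
    if len(amplicon) < 2 * read_len:
        raise ValueError("amplicon is shorter than two non-overlapping reads")
    forward_end = amplicon[:read_len]
    reverse_end = amplicon[-read_len:]
    return forward_end + "N" * n_pad + reverse_end


# Toy amplicon (hypothetical sequence, 500 bp) standing in for a V4-V6 fragment.
toy_amplicon = "A" * 100 + "C" * 300 + "G" * 100
joined = simulate_joined_read(toy_amplicon)
```

With the defaults, every fused sequence has the same length (80 + 30 + 80 = 190 positions), so reference-derived and sample-derived sequences can be aligned column by column.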

A vaginal community state type (CST) is a cluster of community states (the species composition and abundance of a vaginal community) that are similar in terms of the kinds and relative abundances of observed phylotypes [9]. The community states were clustered by hierarchical clustering based on the Euclidean distances between all pairs of community states, with complete linkage. Four CSTs (CST I, II, III, and IV-A) were identified in the dataset, consistent with the CSTs proposed by Gajer et al. [9]. The bacterial communities of CST I, CST II and CST III were dominated by L. crispatus, L. gasseri, and L. iners, respectively. Communities of CST IV-A were generally characterized by modest proportions of L. crispatus and L. iners, or other Lactobacillus spp., along with low proportions of various anaerobic bacteria such as Atopobium, Gardnerella, Hallella, Prevotella, and Streptococcus. The corresponding four clusters are depicted in Fig. 1 and are labeled I, II, III, and IV-A, respectively.
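To make the clustering rule concrete, the following is a minimal pure-Python sketch of agglomerative clustering with Euclidean distance and complete linkage, in which the distance between two clusters is the largest pairwise distance between their members. It is not the software used in the study; the function names and the relative-abundance profiles are hypothetical, and stopping at a fixed number of clusters is a simplifying assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length abundance profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage_clusters(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    Complete linkage: cluster distance = maximum pairwise member distance.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical profiles; columns: L. crispatus, L. gasseri, L. iners, other taxa.
profiles = [
    [0.95, 0.01, 0.02, 0.02],
    [0.90, 0.02, 0.05, 0.03],
    [0.02, 0.93, 0.03, 0.02],
    [0.03, 0.02, 0.92, 0.03],
    [0.05, 0.01, 0.90, 0.04],
    [0.20, 0.05, 0.25, 0.50],
]
groups = complete_linkage_clusters(profiles, n_clusters=4)
```

In this toy input the two L. crispatus-dominated profiles and the two L. iners-dominated profiles merge first, leaving the L. gasseri-dominated and mixed profiles as their own clusters, which mirrors how dominance by one species drives the grouping.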

Fig. 1

Heatmap of the percentage abundance of microbial taxa found in the vaginal microbial communities of 34 subjects. Complete linkage hierarchical clustering of Euclidean distances identified four community state types (CST I, II, III, and IV-A). The upper color bar shows the pregnancy stage of each sample (NP non-pregnant; T1; T2; T3; PP postpartum), while the lower color bar shows the community state types (CSTs)

Datasets were deposited into the European Bioinformatics Institute (http://www.ebi.ac.uk/) with accession numbers from ERS371314 to ERS371619.

Results

General Pattern of the Sampled Vaginal Community

A total of 34 individuals were recruited in the present study (Table 1). All of the study subjects were Chinese, with ages ranging from 19.4 to 39.2 years. From each individual, we took 9 swabs, in triplicate for each subsite. All sampling was performed by a single obstetrician (Y.W.). After sequencing with MiSeq, we performed quality control procedures on the raw reads using QIIME [23]. A total of 720,601 high-quality 16S rRNA gene sequences were obtained for the 306 samples, with an average of 2,354 sequences per sample. Of these, 35 samples were removed because they had fewer than 1,000 reads, leaving 271 samples with more than 1,000 reads each (Table 2).
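The depth filter described above can be illustrated with a short sketch. The sample identifiers and read counts are hypothetical, and treating a sample with exactly 1,000 reads as retained is an assumption, since the text does not state which side of the threshold is inclusive.

```python
def filter_by_depth(read_counts, min_reads=1000):
    """Split samples into those kept and those dropped by sequencing depth.

    Assumption: a sample with exactly `min_reads` reads is kept.
    """
    kept = {sample: n for sample, n in read_counts.items() if n >= min_reads}
    dropped = sorted(sample for sample in read_counts if sample not in kept)
    return kept, dropped


# Hypothetical per-sample read counts (not data from this study).
read_counts = {"T1.1_C_1": 2500, "T1.1_C_2": 640, "T1.1_C_3": 1000}
kept, dropped = filter_by_depth(read_counts)
```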

Table 2 Distribution of samples in each community state type (CST)

Overall, Lactobacillus spp. were the dominant bacteria, with Atopobium, Fusobacterium, Gardnerella, Hallella, Prevotella, and Streptococcus present in much lower proportions (Fig. 1). Within the genus Lactobacillus, we observed four major species, namely L. crispatus, L. gasseri, L. iners, and L. jensenii. According to the dominant bacteria, the vaginal communities could be classified into four community state types (CSTs), using the nomenclature established by Gajer and colleagues [9]. As shown in Table 2, CST I is dominated by L. crispatus (26.9 %), CST II is dominated by L. gasseri (6.3 %), and CST III is dominated by L. iners (55.0 %). CST IV-A (11.8 %), however, was characterized by a relatively low abundance of Lactobacillus along with various anaerobic bacteria, such as Atopobium, Gardnerella, Hallella, Prevotella, and Streptococcus, which have previously been shown to be associated with bacterial vaginosis (BV). In general, we observed that these communities grouped according to their CSTs, but not according to their sampling subsites or pregnancy stages.

Comparison of Communities Across Sampling Subsites

Samples from the three subsites, namely cervix (C), posterior fornix (P) and vaginal canal (V), belonged to the same CST within each subject (Fig. 2a, b, c, d). According to the above analyses, 18 subjects were grouped into CST III, 10 into CST I, 2 into CST II, and 4 into CST IV-A. Regardless of the subject’s CST or pregnancy stage, samples from all three subsites were consistent within an individual.

Fig. 2

The homogeneous vaginal microbial composition among the three sampling sites. The microbial community structures of 34 subjects were clustered into CST I (a), CST III (b), CST II (c), and CST IV-A (d). The name of each subject combines the pregnancy stage with the subject number. For each subject, the first bar represents the average of three repeated samples collected from the cervix, the second the posterior fornix, and the third the vaginal canal. Differences in composition among subsites within a subject, as identified by LEfSe, are marked with an asterisk on the bar. e No differences existed among the subsites of the cervix (C), posterior fornix (P), and vaginal canal (V) in overall alpha-diversity, as measured by Shannon diversity (P = 0.525) and PD whole tree (P = 0.108)

When assessing the overall alpha-diversity using the Shannon diversity index and the PD (phylogenetic diversity) whole tree value, no significant differences existed among the subsites C, P and V (Kruskal–Wallis one-way ANOVA, P = 0.525 and P = 0.108, respectively; Fig. 2e). We further analyzed the subsite alpha-diversity in each subject, and the results demonstrated that only one subject (T1.1) exhibited a difference in the Shannon diversity index across anatomic sites (cervix = 2.00, posterior fornix = 5.33, and vaginal canal = 7.67, Kruskal–Wallis one-way ANOVA, P < 0.05).
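For readers unfamiliar with the test used here, the following is a minimal sketch of the Kruskal–Wallis H statistic for three groups, such as diversity values from the cervix, posterior fornix and vaginal canal. It is not the SPSS procedure used in the study. The function names and input values are hypothetical, and the P value relies on the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(−H/2) and is only approximate for small samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(g1, g2, g3):
    """Tie-corrected H statistic and chi-square (df = 2) approximate P value."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_sizes = {}
    for v in pooled:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_sizes.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    p = math.exp(-h / 2)  # survival function of chi-square with 2 df
    return h, p


# Hypothetical diversity values for three subsites (not data from this study).
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
```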

However, because of the complexity of microbial communities, we did observe variation in some specific taxa among the different sampling sites. We used LEfSe [25], a statistical tool for identifying genomic features, to characterize the differences in the community structures at the three sampling sites for each individual. The majority of subjects (82.4 %) did not have any significantly different taxa. In subject T2.2 (subject No. 2 in trimester T2), the vaginal canal had fewer L. iners but more L. crispatus. The cervix of T3.5, NP.2 and T3.8 differed in L. iners, L. gasseri, and L. crispatus compared with the posterior fornix and vaginal canal. The posterior fornix exhibited a higher abundance of L. gasseri and Gardnerella in T1.2, and of Hallella, Prevotella, and Fusobacterium in PP.3.

Comparison of Pregnancy Stages

Both the clustering (Fig. 1) and PCoA (Fig. 3a) results consistently showed that the postpartum vaginal microbiome is notably different compared to all other stages. The postpartum samples had the highest variation within the groups (Fig. 3b, Kruskal–Wallis one-way ANOVA, P < 0.05) and largest distance compared with non-pregnancy (Fig. 3c, Kruskal–Wallis one-way ANOVA, P < 0.05). In addition, the Shannon-diversity in the postpartum group was significantly higher than the other groups (Fig. 4, Kruskal–Wallis one-way ANOVA, P < 0.05). Specifically, three of the five postpartum subjects were grouped into CST IV-A, while the remaining two were within the CST III. During the three trimesters, the samples in T2 and T3 grouped together tightly, whereas the T1 samples spanned the entire space (Fig. 3a) and showed the highest variation compared with the other samples (Fig. 3c, Kruskal–Wallis one-way ANOVA, P < 0.05).

Fig. 3

The beta-diversity of the microbial communities across the five stages. a The PCoA profile of the five stages displayed with weighted UniFrac distances and Abund Jaccard distances. Each dot represents one sample from each stage. Red dots indicate samples in NP (non-pregnant), green in T1, orange in T2, blue in T3, and purple in PP (postpartum). b The variation of weighted UniFrac distances and Abund Jaccard distances within each stage. c The weighted UniFrac distances and Abund Jaccard distances compared with the NP (non-pregnant) group

Fig. 4

The Shannon-diversity index across stages. The Shannon-diversity index in PP (postpartum) was higher than in NP (non-pregnant; Kruskal–Wallis one-way ANOVA, P < 0.05)

To further understand the differences in the microbial communities during the various pregnancy stages, we performed LEfSe analyses with a logarithmic LDA score threshold of 4.0. A total of 25 taxa were found to be significantly different among the NP, T1, T3, and PP groups (Fig. 5). The abundance of Streptococcaceae was higher in T1. The PP group showed the most distinctive microbiota, characterized by a lower abundance of Lactobacillus spp. along with the presence of diverse taxa, including Bacteroides, Fusobacterium, Gardnerella, Hallella, Incertae, and Prevotella.

Fig. 5

Differentially abundant taxa across the five stages identified by LEfSe with an LDA score threshold of 4.0. The differences are represented by the color of the most abundant class (red indicating taxa in NP (non-pregnant); green in T1; blue in T3; purple in PP (postpartum))

To further test the differences observed during pregnancy stages, we tested an additional 24 subjects, with replicate samples for each of them. The results were consistent with the above results, with the most significant differences observed in the postpartum samples, while the T2 and T3 samples were similar to non-pregnancy samples (Supplementary Fig. S1).

Discussion

Using sequence-based methods, the present study characterizes the vaginal microbial communities during non-pregnancy, pregnancy and postpartum at vaginal subsites in Chinese women. Our results demonstrate that Lactobacillus spp. (CST I, CST II and CST III) are the predominant bacteria in the vagina. This result is consistent with previous studies conducted in North America [26] and Europe [27], and confirms the existence of a core vaginal microbiota regardless of geographic separation and ethnic variation. Our results suggest that the vaginal microbiota varied substantially among subjects, even within CSTs, but was homogeneous throughout the vaginal subsites (cervix, posterior fornix and vaginal canal) within individuals. Our subjects included non-pregnant, pregnant and postpartum women, which may explain the variation in composition observed between individuals. In addition, although 17.6 % (6/34) of subjects had minor taxa differences among the three subsites, the core community structure was consistent within each individual. Therefore, we propose that samples collected from any of the three subsites may be sufficient to reflect the complexity of the vaginal microbiota. These results provide a reference for specimen collection protocols to evaluate the entire vaginal community, thus enabling large-scale cohort studies of the vaginal ecosystem, particularly in pregnant women.

We observed vaginal microbiota heterogeneity across pregnancy stages, which is consistent with previous reports detailing the differences between pregnant and non-pregnant women [10, 11]. However, our study details more specific variations. The microbial composition of the T2 and T3 samples is consistent with that of the NP samples, which are characterized by a relatively high abundance of L. crispatus (CST I) and L. iners (CST III; Table 2). The T1 samples had a higher microbial diversity. Moreover, the relative abundance of Streptococcus was distinctly high in subject T1.6, which contributed to T1 having the most diverse microbiome during gestation. As the beginning of pregnancy, T1 has been reported to be characterized by fluctuating concentrations of estradiol and progesterone [28]. Sex steroid hormones play a major role in driving the composition and stability of the vaginal microbiota [9]. Thus, fluctuating concentrations of estradiol and progesterone in each individual may contribute to the diversity of the CSTs in T1.

Postpartum (PP) is the unique stage in which women who have undergone pregnancy recover to the non-pregnant state [29]. Our results demonstrated that the microbiota of two postpartum subjects (PP.4 and PP.5) were dominated by L. iners, while the other three subjects (PP.1, PP.2, and PP.3) were grouped into CST IV-A. It has been reported that L. crispatus appears to ensure normal vaginal microflora and effectively inhibit the growth of pathogenic microorganisms, while L. iners predisposes the vagina to bacterial overgrowth [1]. Ferris et al. have reported that L. iners was the predominant species in all patients after BV treatment [30]. These results are consistent with the proposal of L. iners as a risk factor for BV recurrence [31]. In addition, Ferris et al. suggested that L. iners may become a dominant part of the vaginal microflora when the microflora is in a transitional stage between abnormal and normal [32]. Postpartum factors such as douching, use of feminine hygiene products [33], hormonal changes [34], sexual behavior, and gynecologic hygiene [9] are associated with fluctuations in vaginal microbial community composition. Therefore, these factors that disturb the vaginal community may be responsible for the discrete CSTs observed postpartum. However, owing to the lack of metadata for the women studied, we cannot equate these variables with abnormal vaginal flora. Gondo et al. [35] reported that a decrease in members of the genus Lactobacillus during postpartum was not associated with clinical symptoms. Therefore, we postulate that the CSTs observed during postpartum may be normal during the recovery from pregnancy, but additional investigations are warranted to verify this hypothesis.

In the present study, we primarily focused on the effects of the sampling sites. Therefore, our current data are not sufficient to correlate the microbiome with prenatal health. Moreover, we only procured snapshot samples from each individual, whereas variations in the vaginal microbiota during pregnancy would be better evaluated using cohort studies. Recently, a cohort study in pregnant women suggested that the microbiome was stable during gestation [11]. Our present study supports the sampling rationale for further large-scale studies of the vaginal microbiome during pregnancy.