1 Introduction

Microorganisms are ubiquitous in the environment, including on and inside the human body. These unseen microorganisms are present in numbers similar to the total number of cells in a human body (Sender et al. 2016). Therefore, this plethora of microorganisms can be likened to a multicellular organ (Plutchik 2001). The entire assemblage of microorganisms present in a habitat is termed the “microbiome,” which can be further specified on the basis of microbial types, such as the bacterial microbiome (bacteriome), viral microbiome (virome), and fungal microbiome (mycobiome). Given the long-standing one microorganism-one disease concept, how does a single microorganism exhibit the potential to affect the overall health of the human host? The answer also involves the nearby host cells as well as other microbial cells. Most oral diseases are the consequence of polymicrobial associations that include not only various bacterial species but also microorganisms of other domains, such as yeasts and other fungi (Yarieva 2022). Therefore, it is highly important to understand the overall microbiome dynamics. The count, as well as the diversity, of microorganisms differs among the niches of the human body. With a total count of 10–100 trillion bacterial cells, the gut is considered the most densely populated bacterial habitat in the human body. It is followed by the oral cavity, which harbors more than 700 bacterial species (Verma et al. 2018). Besides, microbial counts in the lung (Man et al. 2017), skin (Skowron et al. 2021), armpits (Akani et al. 2021), vagina (France et al. 2022), liver (Gola et al. 2021), kidneys (Mertowska et al. 2021), and other visceral organs are also significantly high. The roles of bacteria as commensals (Khan et al. 2019), symbionts (Henriquez et al. 2021), and providers of other beneficial effects (Mohajeri et al. 2018) in sustaining human well-being are well established in the literature.
In the past few years, commendable research has been done to understand microbiome dynamics in the context of human health. Because the human body contains several microhabitats that interact with their inhabitant bacteria as well as with the external environment, the microbiota can be considered a dynamic ecological community (Khalighi et al. 2022). These microhabitats exhibit specific community compositions that depend on the localized physiological environment of the organs (Foxman et al. 2008). Despite harboring several microbial habitats, the human body serves as an excellent ecosystem by maintaining a dynamic equilibrium. Moreover, keeping this microbial structure unaltered is now regarded as a prerequisite for keeping the body healthy. This dynamic equilibrium can be altered by exposure to environmental interferences (Dethlefsen et al. 2006). An altered microbiome (dysbiosis) leads to the onset of several diseases (Fig. 11.1). The gut and oral habitats have been extensively studied for such dysbiotic states that result in diseased conditions. It has also been reported that most of the abundant bacteria in the gut are associated with a healthy status. The opposite has been reported for the vaginal microbiome, where higher diversity is associated with the diseased state. This indicates that the microbial profile can be correlated with the physiopathological states of the human body. It would therefore be interesting to use microbiome-based information for a better understanding of the diseased or healthy status of a host. However, the cultivation of microorganisms in pure form is the biggest bottleneck. Even with the best traditional cultivation techniques, it is not possible to capture the entire microbial community of any habitat, including humans, on Petri dishes for extensive analysis. As a result, microbiologists miss more than 99% of the community and obtain only an incomplete analysis.
Metagenomics has emerged to cope with these limitations, as it offers direct cloning of community DNA for extensive analysis. Thus, despite the uncultivability of many microorganisms, advanced microbiological tools have made it possible to produce massive amounts of information on the inhabitant microbes of a habitat. This chapter describes various tools and techniques for exploring such hidden microorganisms, with a view to their role in the diagnosis of infectious diseases and in the development of rapid and cost-effective biomarkers.

Fig. 11.1
The major habitats (gut, oral cavity, and skin) of bacteria in human hosts, showing the dominant bacterial communities at the phylum and genus levels and the reported consequences of microbial dysbiosis for the development of diseases

2 Traditional Microbiology

Antonie van Leeuwenhoek's introduction of the microscope to the world paved the way for remarkable progress in the field of microbiology. Thereafter, several microbiological techniques evolved over time to explore the limited information available on microorganisms of various domains (Muthukumar et al. 2008). Traditional microbiology includes serology, antigen detection, microscopy, and isolation (Laupland and Valiquette 2013). These methods provide a rapid way of analyzing the morphological parameters of bacteria, fungi, and several protozoan parasites. Traditional tests can be qualitative or quantitative. Qualitative analysis relies on colony morphology, Gram staining, endospore staining, and the biochemical activities of the bacteria. Quantitative analysis includes the enumeration of the culture via methods such as pour plate, spread plate, surface drop, agar droplets, and microdilution. The prerequisite of traditional or conventional microbiological testing is an isolate/culture that is inoculated from the bacterial sample for studying various parameters. Different media are also used to support bacterial identification, essentially through biochemical testing (Gracias and McKillip 2004). One of the most significant applications of microbiology, along with biotechnology, is the successful production of vaccines that protect humans as well as animals from many lethal diseases (Opal 2010). Today’s pathological testing relies on pure culture-based investigations and their reports.
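The plate-based enumeration methods mentioned above (pour plate, spread plate) usually end in a simple back-calculation from colony counts to the viable count of the original sample. The following is a minimal sketch of that arithmetic, assuming the commonly used relation CFU/mL = colonies / (dilution × plated volume). The function name, parameter names, and example numbers are illustrative assumptions, not values from the cited studies.

```python
def cfu_per_ml(colony_count: int, dilution: float, plated_volume_ml: float) -> float:
    """Estimate the viable count (CFU/mL) of the undiluted sample from one plate.

    colony_count     -- colonies counted on the plate
    dilution         -- total dilution of the plated suspension (e.g. 1e-4)
    plated_volume_ml -- volume spread or poured on the plate, in mL
    """
    if colony_count < 0:
        raise ValueError("colony_count must be non-negative")
    if dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution and plated_volume_ml must be positive")
    # Each colony is assumed to arise from one viable cell (or clump) in the
    # plated volume; dividing by dilution and volume scales back to the sample.
    return colony_count / (dilution * plated_volume_ml)


# Hypothetical example: 150 colonies from 0.1 mL of a 10^-4 dilution.
estimate = cfu_per_ml(150, 1e-4, 0.1)
print(f"Estimated viable count: {estimate:.2e} CFU/mL")
```

Only viable, culturable cells form colonies, so an estimate of this kind inherits the limitations of cultivation discussed in the next section.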

3 Problems Associated with Traditional Microbiology

The foremost problem of traditional microbiology is the loss of the major fraction of microorganisms in a habitat. The limitations arise from several facets: (1) incomplete information on the nutritional parameters of the habitats required to cultivate the microbes; (2) differences in the physiological parameters (pH, temperature, salt, metals, ionic state, etc.) of the habitats; and (3) loss of microbe–microbe and microbe–host interactions at the level of commensalism, symbiosis, protocooperation, and other mutual interactions (Pickup et al. 2003). Cultivation conditions in the laboratory are highly limited: microorganisms are typically grown in nutrient-rich media (Keer and Birch 2003) under stringent parameters. Even with the best of efforts, environmental conditions cannot be fully simulated in the laboratory (Zhang et al. 2018). Besides, traditional cultivation approaches are time-consuming and tedious when the goal is an isolate in its pure state. Microbial diagnosis and therapy demand a high level of accuracy, whereas traditional microbiological practices are mostly conducted manually, which lowers the accuracy of the tests and is a major drawback for accurate diagnosis (Van Belkum et al. 2020). Contamination at any stage leads to the failure of the entire diagnosis and increases the possibility of relocation errors (Nathan et al. 2018).

Studying obligate and even facultative anaerobic microorganisms is another big challenge, where prior information on the source samples is a must to avoid losing precious human samples intended for diagnosis. Along with the restraints mentioned above, another major limitation related to cell-based products is the need for manual and visual examination of cultures to detect growth (Peris-Vicente et al. 2015). Hence, classical methods of microbiological testing demand keen and active observation, which is itself a laborious task. Moreover, cultivation-based methods are unable to reveal the community structure of a habitat (Rhoads et al. 2012), because most of the time only the dominant bacteria/microbes of that environment appear under the defined cultivation conditions (Dawodu and Akanbi 2021). Another limitation concerns the viability of the bacteria being cultured: viable bacteria are enumerated along with nonviable cells to give a total count, and therefore the proportion of viable bacteria is not known.

Traditional culturing methods are therefore often viewed as slow and outdated, although they still deliver an internationally accepted evidence-based analysis/diagnosis. In contrast, molecular tools have the potential for rapid analysis, and their operational utility and associated limitations and uncertainties should be assessed considering their use for regulatory monitoring (Oliver et al. 2014; Rhoads et al. 2012).

4 Molecular Techniques for Analyzing Bacteria for Human Diagnosis

Molecular techniques have contributed immensely to the classification and identification of bacteria in all major fields, such as food, medicine, and health diagnosis. Since 1983, many molecular methods have been developed for the detection and genotyping of bacteria (Hallin et al. 2012). Several molecular typing methods, such as ribotyping, restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and amplified ribosomal DNA restriction analysis (ARDRA), find applications in identifying bacterial types. However, these are time-consuming and tedious techniques that demand highly trained manpower. The development of rapid molecular techniques has made the identification of the bacteria under consideration faster and more specific.

4.1 PCR and Derived Approaches

The polymerase chain reaction (PCR) is well known for diagnosing infectious pathogens of all types, as it can amplify millions of identical gene copies from a very small amount of clinical sample. It is widely used in the diagnosis of viruses, bacteria, parasites, and fungi (Kurkela and Brown 2009). Several modifications of the technique have made it more useful for the diagnosis of human diseases. For example, quantitative PCR (qPCR) depends on a fluorescence reader that detects and quantifies the PCR product in real time. This technique is more convenient because no post-PCR processing is needed: the amplified product is detected automatically by the fluorescence monitoring of real-time PCR. Nowadays, Light Cycler and TaqMan are commercially available advanced real-time PCR versions. Light Cycler and Smart Cycler use SYBR Green dye for fluorescence monitoring, whereas TaqMan uses fluorescent probes that bind specifically to their binding site within the amplified target sequence (Dwivedi et al. 2017). Nummi and coworkers employed qPCR for the detection of Mycoplasma pneumoniae and Chlamydia pneumoniae infections, as both cause pneumonia with a similar presentation (Nummi et al. 2015). qPCR also enables detection of the most common mutations in the 23S rRNA gene associated with macrolide resistance in Mycoplasma pneumoniae. The technique is widely used for respiratory pathogens in clinical specimens. Fungal diseases have also been successfully screened by amplifying the signature sequences of pathogenic strains such as Aspergillus fumigatus and Aspergillus flavus. Real-time PCR has been widely used in the diagnosis of Propionibacterium spp., Chlamydia spp., Legionella pneumophila, and Listeria monocytogenes (Dwivedi et al. 2017; Mackay 2004).
The application of PCR can be seen in the worldwide detection of COVID-19 infection, both for detecting the coronavirus and for identifying SARS-CoV-2 variants carrying mutations such as N501Y, 69-70del, K417N, and E484K (Vega-Magana et al. 2021). Multiplex PCR is another version of PCR that is widely used for the simultaneous detection of multiple DNA sequences in a single PCR reaction. Here, multiple sets of primer pairs are used in a single reaction, often with multiple DNA templates. Each primer pair binds its specific target DNA, and the primer pairs are designed to yield amplicons of different sizes. This technique generates information on multiple genes in a single run (Kurkela and Brown 2009). Dwivedi et al. (2017) used multiplex PCR for detecting the infectious pathogenic bacterial strains Brucella abortus and Brucella melitensis in a very short time of 2–3 h with genus-specific primer-probe sets. Loens et al. also used this technique for the detection of M. pneumoniae, C. pneumoniae, and Legionella species in respiratory specimens with the use of molecular beacons (Loens et al. 2008).
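The amplification and real-time quantification described in this section can be illustrated with a small calculation. The sketch below assumes an idealized reaction in which every cycle multiplies the product by (1 + E), where E is the amplification efficiency (E = 1 means perfect doubling). The function names and numbers are illustrative assumptions and do not describe any specific instrument or assay.

```python
import math


def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: N = N0 * (1 + E)^n."""
    if initial_copies <= 0 or cycles < 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid PCR parameters")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_threshold(initial_copies: float, threshold_copies: float,
                        efficiency: float = 1.0) -> float:
    """Fractional cycle at which the product reaches a detection threshold.

    This plays the role of the threshold cycle (Ct) read out by real-time PCR:
    the more template at the start, the earlier the threshold is crossed.
    """
    if initial_copies <= 0 or threshold_copies <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid PCR parameters")
    return math.log(threshold_copies / initial_copies) / math.log(1.0 + efficiency)


# Hypothetical example: 10 template copies, 30 cycles, perfect doubling.
print(f"Copies after 30 cycles: {copies_after_cycles(10, 30):.3e}")

# A tenfold difference in starting template shifts the threshold cycle
# by log2(10) cycles when the efficiency is 1.
ct_low = cycles_to_threshold(10, 1e10)
ct_high = cycles_to_threshold(100, 1e10)
print(f"Ct shift for a tenfold dilution: {ct_low - ct_high:.2f} cycles")
```

Real reactions plateau as reagents are consumed, so the exponential model only holds for the early cycles from which threshold values are taken.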

Reverse transcription PCR (RT-PCR) is widely used in the detection of viruses. This technique synthesizes complementary DNA (cDNA) from an mRNA template with the help of a reverse transcriptase enzyme. He et al. (2015) used this technique for the detection of porcine circovirus from domestic pigs in China. Reverse transcription quantitative PCR has contributed substantially to the investigation of viruses such as flaviviruses, hepadnaviruses, herpesviruses, orthomyxoviruses, parvoviruses, papovaviruses, paramyxoviruses, pestiviruses, picornaviruses, poxviruses, retroviruses, rhabdoviruses, and TT virus (Bookout et al. 2006). It is also used to measure viral load in order to assess the infectious interaction between the virus and the host (Mackay 2004).

4.2 Fluorescence In Situ Hybridization

Fluorescence in situ hybridization (FISH) is a rapid technique for the identification of pathogenic bacterial DNA sequences, the diagnosis of genetic diseases, gene mapping, and the identification of novel oncogenes that play a role in various types of cancer. Besides, FISH is utilized as a cytochemical technique for genetic detection and for locating the loci targeted by nucleic acid-based probes (Cui et al. 2016). The technique is advantageous owing to its accuracy, safety, and specificity toward nucleic acids (Dwivedi et al. 2017). It is a low-cost molecular hybridization assay that has been used with DNA probes to detect Chlamydia trachomatis or Neisseria gonorrhoeae. This technique has also been used in the detection of respiratory infections, gastrointestinal diseases, mycobacterial infections, and fastidious bacteria such as spirochetes (Jensen et al. 2001). Prudent and Raoult reported the FISH technique for the detection of Q fever caused by Coxiella burnetii infection, which may cause serious complications in humans and animals. Similarly, Maiwald et al. (2003) used FISH for the detection of two strains of Tropheryma whipplei from the cerebrospinal fluid of two patients with Whipple’s disease. The FISH assay has also provided confirmation that toxin/antitoxin elements work as pathogenic factors in bacterial cells (Audoly et al. 2011). Zhang et al. (2012) used peptide nucleic acid (PNA) probes for the rapid detection of Listeria spp., L. monocytogenes, and L. ivanovii within 1 h; such early detection can decrease the infection potential of L. monocytogenes, which causes meningoencephalitis (Goulet et al. 2012). A few studies reported the PNA-FISH technique for detecting tuberculosis-causing mycobacteria (MT) and nontuberculous mycobacteria (NMT) strains (Soini and Musser 2001). It was successfully used for the rapid detection of several Mycobacterium spp., such as M. leprae, M. avium, and M. kansasii, within 3 h (Lefmann et al. 2006).
The technique has been used for visualizing the microbes and their distribution in oral biofilm (Malic et al. 2009).

4.3 Microarray

Microarray is a DNA hybridization-based biochip technology that analyzes thousands of genes simultaneously and detects specific DNA and RNA sequences. The technique is widely used to detect single-nucleotide polymorphisms and mutations in genomic DNA (You et al. 2008). The microarray technique has been employed for the identification of bacteria at the species and subspecies level and for assessing their pathogenicity, for example, in E. coli, Vibrio cholerae, Salmonella enterica, Campylobacter jejuni, Shigella spp., Yersinia enterocolitica, and Listeria monocytogenes (You et al. 2008). A few studies have also reported microarrays for the identification of yeasts and molds by targeting the ITS region of the fungal rRNA genes (Huang and Zheng 2006). Wang and colleagues used a 16S-based microarray technique for the detection of intestinal bacteria in fecal samples; with it, 20 predominant intestinal bacteria were directly detected (Wang et al. 2002). Lee and coworkers detected 44 pathogenic bacterial strains using the Pathochip DNA microarray in clinical samples of blood, sputum, cerebrospinal fluid, pus, and urine. Bacteria-specific genes such as housekeeping genes, virulence factors, and antibiotic resistance genes were also used in place of the 16S rRNA sequence for the detection of pathogens such as S. aureus, E. coli, and Pseudomonas aeruginosa (Lee et al. 2003). DNA microarrays were also implemented in the detection of pathogenic species in food samples (Cleven et al. 2006). Palka-Santini et al. (2009) employed a DNA microarray for DNA-DNA hybridization with specific oligonucleotides for the identification of contaminating microorganisms. Besides bacterial species, some viruses and fungi are also detected by the microarray technique. A quantitative microarray was applied to the early detection of hepatitis virus, which causes chronic liver disease and diseases such as hepatocellular carcinoma (Sakai et al. 2012).

4.4 Nucleic Acid Sequence-Based Amplification

In this isothermal technique, reverse transcriptase, RNase H, T7 RNA polymerase, and primers carrying a T7 promoter are used to amplify an RNA template through a double-stranded DNA intermediate. The technology is used for the detection of Mycoplasma pneumoniae, Chlamydophila pneumoniae, and Legionella spp. in clinical respiratory specimens (Loens et al. 2008). Lau and coworkers used two detection formats of the nucleic acid sequence-based amplification (NASBA) technique, NASBA-electrochemiluminescence (NASBA-ECL) and an enzyme-linked oligonucleotide format, for the rapid identification of foot-and-mouth disease virus (FMDV) (Lau et al. 2008). Prateek et al. (2010) used this technique for the detection of cytomegalovirus (CMV) infection. Similarly, Shan et al. (2003) used the NASBA-ECL technique for the detection of avian influenza virus subtype H5 in allantoic fluid harvested from inoculated chick embryos. Guoshuai and colleagues detected classical swine fever virus (CSFV) without interference from other viral RNA by using G4-THT-NASBA, a highly sensitive, easy-to-use, and rapid technique for RNA detection (Guoshuai et al. 2022; Jia et al. 2022).

5 Advanced Tools and Techniques of Microbiology

It took around 400 years to develop the techniques needed to explore the hidden and uncultivable microorganisms in the human body. Remarkable progress has been made in human microbiome-based research. However, prior to next-generation sequencing (NGS), such research was quite challenging due to the inability to capture the entire microbial community of a habitat. Several reasons have been discussed in Sect. 11.3 under the heading “Problems associated with traditional microbiology.” Research never waits for new techniques; it pursues the available facilities. Thus, the human gut microbiome was first studied by culturing gut inhabitants in large numbers on various media under varying physiological conditions, and a great deal of information on gut bacterial diversity had been gathered prior to the advent of NGS (Guarner and Malagelada 2003). For example, significant research has been done on Helicobacter pylori, a cancer-causing bacterium of the gut. In 1984, Simon and Gorbach predicted that the gut microflora is more populated with anaerobes than with aerobic bacteria and that the gut harbors more than 500 bacterial species (Simon and Gorbach 1984). Firmicutes, Proteobacteria, Bacteroides, Clostridium, Fusobacterium, Eubacterium, Ruminococcus, Peptococcus, Peptostreptococcus, and Bifidobacterium were already known as dominant taxa of the human gut microbiome (Harmsen et al. 2000). The admirable research on pro- and prebiotics in the context of gut health must also be noted. Several lactic acid bacteria (LAB) were established for their role in improving gut health. Similarly, the salutary effect of various Clostridium spp. in attenuating inflammation and allergic diseases was well known (Samarkos et al. 2018). Today’s microbiome-based research on Clostridium difficile is strongly influenced by prior research on its isolated cultures (Britton and Young 2012). Several diseases have been diagnosed using various NGS tools (Table 11.1).
Although NGS-based investigations yield enormous amounts of information, traditional cultivation approaches remain irreplaceable and play an integral role in reaching a concrete solution (Vishwakarma and Verma 2021).

Table 11.1 Various NGS platforms used for the identification of human diseases

5.1 Metagenomics

Metagenomics is defined as the collective genomic analysis of microorganisms by directly extracting and cloning DNA from an assemblage of microorganisms present in a particular environment. The term metagenomics was coined by Jo Handelsman in 1998 during her research on the discovery of natural products through biosynthetic gene clusters (BGCs). Handelsman recognized that the DNA of an entire sample can be used for exploring novel BGC loci (Handelsman et al. 1998). Unlike traditional microbiology, metagenomics has enabled the study of unculturable microorganisms in their native habitats by directly extracting DNA from the respective samples. Metagenomics explores microbial genes and genomes by either function-based or sequence-based approaches (Culligan et al. 2014). Metagenomics-derived studies are gaining interest in various fields for the taxonomic and functional annotation of the microbiomes of agricultural, environmental, human, and clinical samples (Zhou et al. 2020; Chiu and Miller 2019). Functional metagenomics is the activity-based screening of metagenomic libraries for desired bioactive molecules. Several successes have been reported in retrieving novel genes for various industrially relevant enzymes (Cui et al. 2019), antibiotics (De Coster et al. 2019), bacteriocins (Pal and Srivastava 2014), and other antimicrobial compounds (de Abreu et al. 2021). Among the sequence-based approaches, amplicon-sequencing metagenomics utilizes marker genes such as the bacterial/archaeal-specific 16S rRNA gene, the eukaryotic 18S rRNA gene, or ITS (internal transcribed spacer) regions for taxonomic and functional profiling of microbes, whereas shotgun-based metagenomics relies on whole-metagenome or whole-genome (unculturable) sequences (Pérez-Cobas et al. 2020) and thus provides better resolution and more reliable information about the taxonomic and functional characteristics of the inhabitant microorganisms of a particular environment or habitat.
In most studies, amplicon sequencing is used because it is cost-effective and highly accurate (Callahan et al. 2019). Various tools are available to study the microbiome and its functional activity in various environments (Galloway-Pena and Hanson 2020). Microbiome-based studies, for example, are the outcome of sequence-based metagenomic approaches. Commendable research has been done in the last decade to understand the microbiome dynamics of humans (Sehli et al. 2021; Baker et al. 2021).

5.2 Microbiome-Based Tools

The term “microbiome” can be defined as the total genome of all the microorganisms (commensal, parasitic, symbiotic, pathogenic, or nonpathogenic) present in a particular habitat or environment (Berg et al. 2021). Microbiome analysis explores entire microbial communities and has been successfully employed to study the microorganisms of various environments, such as soil, water, and air, as well as humans (Cullen et al. 2020). Several global projects on metagenomics and microbiome analysis have been successfully accomplished, such as the Earth Microbiome Project (EMP) (Gilbert et al. 2010, 2014), the Human Microbiome Project (HMP) (Turnbaugh et al. 2007), and the European MetaHIT (Qin et al. 2010). Kho and Lal (2018) regard the gut microbiome as the controller between wellness and disease. A direct role of the human microbiota in shaping the host immune system has been established. The effects of environmental factors on human microbiome dynamics have also been studied. These factors include geographical variation, antibiotic doses, temperature, vegetation, lifestyle, food habits, and mental status (Rodríguez et al. 2015; Biedermann et al. 2013; Tyakht et al. 2013). Maurice et al. (2013) reported alterations in the physiology and gene expression of the human gut microbiome due to antibiotic doses. Both short- and long-term disruption of the microbial balance has been reported following antibiotic treatment (Jernberg et al. 2007). In an interesting report, microbiota depletion was strongly correlated with changes in serotonin and bile acid metabolism that consequently resulted in delayed GI motility (Ge et al. 2017). Besides, antibiotic-treated mice were also more susceptible to antibiotic-associated pathogens such as S. Typhimurium and C. difficile (Ng et al. 2013). Gut bacteria, notably lactic acid bacteria (LAB), also play a crucial role in the de novo synthesis of essential vitamins (LeBlanc et al. 2013). LAB were particularly known for the synthesis of vitamin B12 (LeBlanc et al. 2013).
Besides, folate is constitutively produced by Bifidobacterium spp. (Pompei et al. 2007). Several other vitamins, such as vitamin K, riboflavin, biotin, nicotinic acid, pantothenic acid, pyridoxine, and thiamine, are chiefly produced by the gut microbiota (Hill 1997). A role of gut bacteria in metabolizing bile acids that are hard to reabsorb has also been reported. Alteration of such transformations in the respective bacteria can lead to human illnesses such as obesity and type 2 diabetes (Palau-Rodriguez et al. 2015). Therefore, microbiome-based investigations have the potential to support the diagnosis of human diseases. Studying the microbiome requires certain tools and techniques. Marker gene analysis (amplicon sequencing), shotgun metagenomics, metatranscriptomics, metaproteomics, and metabolomics are a few that can be used individually or combined in microbiome-based analysis.

5.3 Marker Gene Analysis

Marker genes are defined as specific regions of DNA used to identify microbes in metagenomic samples. Commonly used methods are 16S ribosomal RNA gene sequencing for bacterial identification and internal transcribed spacer (ITS) region sequencing for fungal identification. Both markers possess hypervariable regions, which help in assigning genera and species. Bacterial or fungal markers are usually amplified with their respective specific primers, followed by NGS-based sequencing. The raw sequences undergo demultiplexing, quality assessment with tools such as FastQC (Andrews 2010), and quality filtering or denoising with tools such as DADA2 (Callahan et al. 2016), Deblur (Amir et al. 2017), and UNOISE3 (Edgar 2016) to obtain quality reads. Sequences are then processed either by picking OTUs (operational taxonomic units) at a similarity threshold of 97% or 99% or by inferring ASVs (amplicon sequence variants). ASVs can resolve even single-nucleotide differences and thus provide more precise and detailed information on microbial diversity. These ASVs/OTUs are used for taxonomic assignment, diversity analysis, and functional profiling of the microbes present in the study samples (Hamady and Knight 2009). Several microbiome analysis packages or pipelines are available, such as QIIME1 (Caporaso et al. 2010), QIIME2 (Bolyen et al. 2019), Mothur (Schloss et al. 2009), DADA2 (Callahan et al. 2016), and SILVAngs (https://ngs.arb-silva.de/silvangs/). Taxonomy is mostly assigned using classifiers such as the RDP classifier, a naive Bayesian classifier (Wang et al. 2007), together with reference databases such as Greengenes (McDonald et al. 2012) and SILVA (Yilmaz et al. 2014); for fungi, the UNITE database (Kõljalg et al. 2005) is used. The command-line tool Quantitative Insights into Microbial Ecology (QIIME) has been among the most widely used tools in recent times due to several advantages.
Note that QIIME1.9.1 is now obsolete and has been superseded by its advanced version, QIIME2 2020.1. QIIME2 offers multiple user interfaces and wraps many of the tools required for the downstream analysis of sequences.
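The difference between OTU picking and ASV-style resolution, and the diversity analysis that follows, can be sketched in a few lines. The example below is a toy illustration under strong assumptions: reads are error-free and of equal length, identity is a simple position-by-position comparison, OTUs are formed by greedy centroid clustering, and exact dereplication stands in for ASV inference (real tools such as DADA2 use statistical error models). All function names and sequences are hypothetical.

```python
import math
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def exact_variants(reads):
    """ASV-like table: every distinct sequence is kept as its own unit."""
    return Counter(reads)


def greedy_otus(reads, threshold=0.97):
    """Greedy centroid clustering: abundant sequences seed OTUs first."""
    otus = {}  # centroid sequence -> total read count
    ordered = sorted(Counter(reads).items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, count in ordered:
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += count  # absorbed by an existing OTU
                break
        else:
            otus[seq] = count  # no centroid close enough: new OTU
    return otus


def shannon(counts) -> float:
    """Shannon diversity index H' = -sum(p_i * ln p_i)."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


# Hypothetical 100-nt reads: a reference, a one-base variant, a distant taxon.
base = "ACGT" * 25
variant = "C" + base[1:]  # 99% identical to base
distant = "TTGCA" * 20    # far below the 97% threshold
reads = [base] * 6 + [variant] * 2 + [distant] * 2

asv_table = exact_variants(reads)
otu_table = greedy_otus(reads, threshold=0.97)
print("exact variants:", len(asv_table), "| OTUs at 97%:", len(otu_table))
print("Shannon (variants):", round(shannon(asv_table.values()), 3))
print("Shannon (OTUs):", round(shannon(otu_table.values()), 3))
```

In this toy data set the single-base variant is merged into its parent OTU at 97% but remains a separate unit at exact-sequence resolution, which is why variant-level tables tend to report finer diversity.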

5.4 Shotgun Metagenomics

In contrast to amplicon sequencing, which uses a short stretch of the 16S or 18S rRNA gene, shotgun sequencing deals with the sequencing of the entire metagenomic DNA in a sample. Thus, it yields massive information on bacteria, fungi, archaea, and other microorganisms, and it simultaneously provides information on their taxonomy as well as their functionally active genes (Mukhopadhya et al. 2019; Moreno Gallego et al. 2019). The advantage of shotgun metagenomics is that it provides concrete information for functional profiling, whereas marker-based analysis (16S and 18S) only predicts the functionality of the microbial community from information in available databases. PICRUSt is a highly recommended tool for predicting the functional potential of a community from 16S rRNA gene data during marker gene analysis (Langille et al. 2013; Douglas et al. 2020). Shotgun-based analysis is still evolving and will become more reliable in the near future as the databases grow. In shotgun metagenomics, short reads are assembled into contigs using strategies such as layout or consensus assembly (Ayling et al. 2019). These contigs and scaffolds are then used to retrieve information from the available databases. However, the analysis of shotgun metagenome sequences involves more tedious steps than amplicon analysis on the QIIME platforms. It also requires complex bioinformatic tools and methods for analyzing reads; bioinformatic pipelines and software in use include Megahit (Li et al. 2015), StrainPhlAn (Truong et al. 2017), MetaPhlAn (Beghini et al. 2021), HUMAnN (Beghini et al. 2021), HOME-BIO (Ferravante et al. 2021), MetaVelvet (Namiki et al. 2012), IDBA-UD (Peng et al. 2012), and metaSPAdes (Nurk et al. 2017). Profiling of marker or representative genes is mostly done by read-based taxonomy assignment and gene annotation, which rely on sequence composition and patterns such as k-mers, gene homology, and GC content (Claesson et al. 2017).
Kraken, which is based on exact k-mer matching, is frequently used for taxonomy assignment (Wood and Salzberg 2014).
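The idea behind k-mer-based read classification can be sketched as follows. This is a simplified illustration, not Kraken's actual algorithm: Kraken maps each k-mer to the lowest common ancestor of the genomes containing it within a taxonomy tree, whereas this sketch simply ignores k-mers shared between taxa and takes a majority vote over taxon-specific k-mers. The reference sequences, taxon names, and k value are hypothetical.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Yield all overlapping substrings of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(references: dict, k: int) -> dict:
    """Map every k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(taxon)
    return index


def classify(read: str, index: dict, k: int) -> str:
    """Assign a read to the taxon with the most taxon-specific k-mer hits."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer)
        if taxa and len(taxa) == 1:  # skip k-mers shared between taxa
            votes[next(iter(taxa))] += 1
    if not votes:
        return "unclassified"
    return votes.most_common(1)[0][0]


# Hypothetical toy references (real databases hold whole genomes).
references = {
    "taxon_A": "ACGTACGGTCAGTTAGCCATGA",
    "taxon_B": "TTGACCGTAAGGCTTAACGGAT",
}
K = 5
index = build_index(references, K)

for read in ("CGGTCAGTTAGC", "CCGTAAGGCTT", "GGGGGGGGGG"):
    print(read, "->", classify(read, index, K))
```

Because the lookup is an exact dictionary match, classification speed depends mainly on read length and not on the number of reference genomes, which is one reason k-mer approaches scale to shotgun data.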

5.5 Metaproteomics

Metaproteomics is the study of microbial communities through the analysis of their total proteins, thus exploring the microbial community of a specific habitat at the molecular level. The term metaproteomics was first used by Rodríguez-Valera in 2004 to describe the identification of proteins and related genes that are abundantly expressed in an environmental sample (Rodríguez-Valera 2004). The approach relies on functional information rather than gene-level information alone. Its main advantage is that it studies the expressed proteins, which determine the overall physiology of the bacteria. It has now become an integral branch of proteomics that has enabled the large-scale identification of proteins in microbial populations/communities (Kleiner 2019). Several studies have used metaproteomics to explore the taxonomic and functional role of the complex microbiomes of specific habitats (Abram et al. 2011; Kan et al. 2005; Ram et al. 2005; Wilmes and Bond 2006). Metaproteomics correlates diseases or environmental parameters with the functions and taxa of a specific environment (Erickson et al. 2012; Heyer et al. 2016; Kolmeder et al. 2016). Metaproteomic data analysis primarily identifies peptides through mass spectrometry and searches them against protein sequence databases. These peptides are then assigned to proteins or protein groups that may be unique to one taxon or shared between several taxa, and these groups are in turn assigned to functional groups using several databases (Blank et al. 2018). Some of the algorithms/tools and databases used are eggNOG-mapper (Huerta-Cepas et al. 2017), MEGAN (Huson and Weber 2013), MetaGOmics (Riffle et al. 2017), MetaProteomeAnalyzer (MPA) (Muth et al. 2018), ProPHAnE (https://www.prophane.de), and Unipept (Gurdeep Singh et al. 2019). Earlier, a few drawbacks were associated with metaproteomics, related to the bioinformatic evaluation of data (Muth et al. 2013).
First, it demands considerable computational resources: powerful processors, large memory and storage capacity, and efficient tools and algorithms. Second, the identification of redundant proteins reduces the accuracy of taxonomic and functional assignment (Herbst et al. 2016). Third, when the taxonomic composition is unknown, it is difficult to identify the functional role of community members in a particular environment. Nevertheless, considerable advances have since been made in metaproteomics in the context of microbiome-derived information.
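
The handling of peptides that are unique to one taxon or shared between several taxa can be sketched as follows. The sketch assumes a lowest-common-ancestor (LCA) rule, one common way of resolving shared peptides; it is not presented as the method of any particular tool cited above. The peptide sequences, the simplified four-rank lineages, and the function names are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages (each listed root first)."""
    lca = None
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break  # lineages diverge at this rank
        lca = ranks[0]
    return lca


# Hypothetical, simplified lineages (root first).
LINEAGES = {
    "Escherichia coli": ["Bacteria", "Proteobacteria", "Escherichia", "Escherichia coli"],
    "Salmonella enterica": ["Bacteria", "Proteobacteria", "Salmonella", "Salmonella enterica"],
    "Bacillus subtilis": ["Bacteria", "Firmicutes", "Bacillus", "Bacillus subtilis"],
}

# Hypothetical database search result: peptide -> taxa whose proteins contain it.
PEPTIDE_HITS = {
    "PEPTIDEA": ["Escherichia coli"],                          # unique peptide
    "PEPTIDEK": ["Escherichia coli", "Salmonella enterica"],   # shared within a phylum
    "PEPTIDER": ["Escherichia coli", "Bacillus subtilis"],     # shared across phyla
}


def assign_peptides(peptide_hits, lineages):
    """Assign each peptide to the LCA of all taxa in which it was found."""
    return {
        peptide: lowest_common_ancestor([lineages[t] for t in taxa])
        for peptide, taxa in peptide_hits.items()
    }


for peptide, taxon in assign_peptides(PEPTIDE_HITS, LINEAGES).items():
    print(peptide, "->", taxon)
```

Unique peptides keep species-level resolution, while shared peptides are pushed to a higher rank, which is one reason redundant protein identifications reduce taxonomic accuracy.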

5.6 Metabolomics

Metabolomics is another valuable tool for exploring the dynamics of microbiomes at the level of their total metabolites. Metabolites serve as a direct signature of metabolic reactions and ongoing biochemical activity, and can therefore be employed as health indicators (Cavill et al. 2009; Tang et al. 2019). Recent developments have uncovered a significant influence of the metabolome of near and distant body sites on the microbiome. Research on liver diseases and their association with the gut microbiome encourages further development of the technique, using extended metabolite profiles as therapeutic targets or biomarkers (Del Chierico et al. 2017; Canfora et al. 2019). Metabolomics is the study of small molecules or metabolites (<1500 Da), including amino acids and related amines, lipids, sugars, nucleotides, and other intermediary metabolites, within an organism, cell, tissue, or biofluid. The full set of these metabolites, together with their interactions, in a biological system is known as the metabolome. Thus, unlike proteomics or genomics, metabolomics measures molecules that differ widely in polarity, solubility, chirality, and other physicochemical properties (Kuehnbaum and Britz-McKibbin 2013). Metabolomics relies on various analytical techniques, such as nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS), the latter often coupled to liquid or gas chromatography, as in liquid chromatography–mass spectrometry (LC-MS) and high-resolution mass spectrometry (HRMS). These techniques are frequently used in metabolomics research to analyze thousands of metabolites with high accuracy. Metabolomics studies have enabled researchers to understand how diet and disease are related, for example through the correlation of the gut microbiome with cardiac diseases (Newgard et al. 2009; Koeth et al. 2013).
Mass spectrometry-based metabolomics is the most sensitive approach for analyzing diverse compounds; however, it has limitations in standardization and quantitation. NMR-based metabolomics is less sensitive than mass spectrometry-based metabolomics, but it provides absolute concentrations of the detected compounds and is useful for elucidating molecular structure more accurately. Metabolomics can be broadly categorized into targeted and untargeted types. Targeted metabolomics analyzes a predefined set of compounds or metabolites, whereas untargeted (global) metabolomics measures as many of the metabolites extracted from a sample as possible and can be used to study novel biological perturbations; it works most effectively with a high-resolution mass spectrometer, which allows better structural characterization of the compounds or metabolites (Johnson et al. 2016). Localization of specific metabolites within cells or tissues can be achieved with imaging metabolomics, which uses imaging mass spectrometry techniques such as MALDI (matrix-assisted laser desorption ionization), NIMS (nanostructure-initiator mass spectrometry), DESI (desorption electrospray ionization mass spectrometry), and SIMS (secondary ion mass spectrometry) (Palmer et al. 2016; Fletcher et al. 2013). Metabolomics not only provides biological information but also has potential applications in the discovery of novel therapeutic molecules (Clish 2015). Advanced sequencing tools and extensive research on microbiomes have enabled metabolite profiles to be compared between healthy and diseased states of the host.
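
The targeted workflow, in which a predefined set of metabolites is searched for in the measured data, can be sketched as a mass-matching step. The example below is a simplified illustration rather than a complete pipeline: it matches observed m/z values to a target list within a parts-per-million (ppm) tolerance and ignores retention time, adducts other than [M+H]+, and isotopes. The target m/z values are approximate and given for illustration only; the peak list, the 10 ppm tolerance, and the function names are assumptions.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical target list: metabolite -> approximate m/z of the [M+H]+ ion.
TARGETS = {
    "glucose": 181.0707,
    "tryptophan": 205.0972,
    "phenylalanine": 166.0863,
}


def match_targets(observed_peaks, targets, tol_ppm=10.0):
    """For each target, keep the closest observed peak within the tolerance."""
    matches = {}
    for name, target_mz in targets.items():
        best = None
        for peak in observed_peaks:
            err = ppm_error(peak, target_mz)
            if abs(err) <= tol_ppm and (best is None or abs(err) < abs(best[1])):
                best = (peak, err)
        if best is not None:
            matches[name] = best
    return matches


# Hypothetical observed peaks from one sample.
peaks = [166.0861, 181.0709, 250.1234]
for name, (mz, err) in match_targets(peaks, TARGETS).items():
    print(f"{name}: observed m/z {mz:.4f}, error {err:+.2f} ppm")
```

An untargeted analysis would instead start from all detected peaks, including the unmatched one at m/z 250.1234, and attempt to annotate them afterwards, which is where high mass resolution becomes most useful.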

5.7 Metatranscriptomics

Bashiardes et al. (2016) thoroughly reviewed the use of metatranscriptomics in microbiome research. In recent years, advanced sequencing tools such as RNA-Seq have led to a surge in transcriptome-based analysis and provided deep insights into this line of research. Metatranscriptomics can be understood as the study of the gene expression of complex microbial communities within their natural environments/habitats. The technique was first introduced in 2000, and the use of RNA sequencing, or metatranscriptomics, has since increased significantly, enabling researchers to characterize microbial communities and their interactions (Bashiardes et al. 2016; Bikel et al. 2015) and to detect gene expression in order to understand the microbe–host relationship (Moniruzzaman et al. 2017). The major goal of a metatranscriptomics study is to explore the functional activity of the microbiome of a particular habitat. Functional annotation can be performed on either reads or assembled contigs. Tools such as MetaCLADE (Ugarte et al. 2018), HMM-GRASPx (Zhong et al. 2016), and UProC (Meinicke 2015) are read-based functional profilers that take ORFs as input and are used in metatranscriptomics-based investigations. Alternatively, assembled contigs can be used for functional annotation: genes are first predicted with programs such as Prodigal (Hyatt et al. 2010) and FragGeneScan (Rho et al. 2010), and functions are then assigned primarily through similarity searches, using tools such as DIAMOND (Buchfink et al. 2015) against functional databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa and Goto 2000), COG (Clusters of Orthologous Groups) (Tatusov et al. 2000), NCBI RefSeq (O'Leary et al. 2016), and UniProt (Uniprot 2019). Other tools available for analyzing transcriptome data, such as Prokka (Seemann 2014), EDGE Bioinformatics (Li et al. 2015), and MG-RAST (Wilke et al. 2016), combine several similarity searches against various databases into pipelines and platforms.
Annotated genes and enzymatic functions are further mapped onto metabolic pathways using tools such as MinPath (Ye and Doak 2009), iPath (Yamada et al. 2011), MAPPS (Riaz et al. 2020) (https://mapps.lums.edu.pk), and MetaCyc, a multiorganism database of metabolic pathways and enzymes (Caspi et al. 2010). Targeted approaches typically do not provide direct evidence of the functional potential of the microbial population of a specific habitat; however, tools such as PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) enable researchers to infer the functional profile of a microbiome directly from marker gene (such as 16S rDNA) taxonomic profiling (Aguiar-Pulido et al. 2016; Langille et al. 2013). Gosalbes and coworkers used 16S rRNA transcripts to obtain an extensive bacterial profile of the GI tract (Gosalbes et al. 2011). In another investigation, transcriptome profiling of the gut microbiome revealed that specific strains of Eggerthella lenta carry a cytochrome-encoding operon that is upregulated by digoxin and consequently inactivates this cardiac drug. Similarly, metatranscriptomics has been employed to assess the microbial community (Maurice et al. 2013), microbiome-immune interactions (Cullender et al. 2013), and microbiome antisense RNAs (Bao et al. 2015).
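
The principle of inferring a functional profile from marker gene data can be sketched in a few lines. This is a simplified illustration of the general idea, not PICRUSt's implementation: the actual tool predicts the gene content of unsequenced taxa by ancestral state reconstruction on a phylogeny, whereas here the gene content and 16S copy number of every taxon are assumed to be known. All taxon names, gene family names, and numbers are hypothetical.

```python
def predict_functions(otu_counts, copy_number, gene_content):
    """Estimate gene family abundances from 16S read counts.

    Each taxon's read count is divided by its 16S copy number to approximate
    organism abundance, then multiplied by its per-genome gene family counts.
    """
    profile = {}
    for taxon, count in otu_counts.items():
        organisms = count / copy_number[taxon]
        for gene_family, copies in gene_content[taxon].items():
            profile[gene_family] = profile.get(gene_family, 0.0) + organisms * copies
    return profile


# Hypothetical inputs for a single sample.
otu_counts = {"taxon_A": 40, "taxon_B": 10}   # 16S reads per taxon
copy_number = {"taxon_A": 4, "taxon_B": 2}    # 16S gene copies per genome
gene_content = {                              # gene family copies per genome
    "taxon_A": {"gene_family_1": 1, "gene_family_2": 2},
    "taxon_B": {"gene_family_2": 1, "gene_family_3": 3},
}

profile = predict_functions(otu_counts, copy_number, gene_content)
for gene_family, abundance in sorted(profile.items()):
    print(gene_family, abundance)
```

Because the result reflects predicted gene content rather than measured expression, such inferred profiles describe functional potential, in contrast to the expressed functions captured by metatranscriptomics.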

6 Development in Next-Generation Sequencing Platforms

It took approximately 400 years to understand that the entire microbial community of a habitat cannot be analyzed with the traditional techniques of microbiology because of their several limitations (Behjati and Tarpey 2013). Advances in sequencing technology, especially the emergence of NGS, have revolutionized genomics (Buermans and Dunnen 2014). Several bacteria and archaea have been successfully sequenced and assembled using NGS-based technology (Goh et al. 2017; Panda et al. 2019). The technique is widely used to compare microbiome community structure between healthy and diseased states (Malla et al. 2019). NGS-based platforms have evolved significantly. Earlier, Roche 454 GS FLX was widely used for NGS-based sequencing; however, owing to its low coverage and high cost, the platform is now obsolete. At present, four NGS technologies are prevalent: Illumina, Ion Torrent, Pacific Biosciences, and Oxford Nanopore sequencing. Illumina platforms dominate over the other sequencing platforms for exploring human microbiomes using whole-metagenome sequencing as well as 16S rRNA amplicon-based sequencing (Chan et al. 2015; Saxena et al. 2017). NGS has also made metagenome-based sequencing possible, which offers better resolution for understanding the functional and genetic diversity of unculturable communities (Lopez-Lopez et al. 2013; Sharon and Banfield 2013). Using shotgun metagenomics, Chan et al. (2015) added new information on the metabolic activity and dynamics of the bacteria inhabiting the Sungai Klah (SK) hot spring, which make this hot spring unique. As expected, the bacterial diversity profile was nearly identical in amplicon and shotgun sequencing. NGS also made it possible to uncover the hidden microorganisms present in the human body, as shown by the successful completion of the Human Microbiome Project (Turnbaugh et al. 2007).
Ion Torrent technology, an efficient sequencing platform, has also been used for studying several environments and for understanding bacterial genomics (Bhalla et al. 2013; Mangrola et al. 2015; Pap et al. 2015). LeBlanc et al. (2013) studied fecal microbiota composition using Ion Torrent technology and recommended a better DNA extraction strategy to avoid bias in bacterial community composition. However, a higher error rate has been reported for Ion Torrent technology than for the Illumina MiSeq platform, which limits its applicability. Pacific Biosciences single-molecule real-time (SMRT) sequencing technology is another major player in the sequencing world and is widely used for detecting specific DNA methylations (Ardui et al. 2018; Straub et al. 2018). Short-read massively parallel sequencing has emerged as a standard tool in several medical diagnostic applications (Loomis et al. 2013; LeBlanc et al. 2013). However, the technique has several limitations, such as GC bias, difficulty in mapping repetitive elements, and difficulty in differentiating paralogous sequences. Therefore, for such applications it has gradually been replaced by long-read single-molecule sequencing, as implemented in PacBio’s SMRT sequencing technology (Ardui et al. 2018). Loomis et al. (2013) were the first to report sequencing of the FMR1 CGG repeat using SMRT sequencing technology for the diagnosis of repeat expansion in Fragile X syndrome. The Oxford Nanopore Technologies (ONT) MinION long-read sequencer has emerged as an advanced sequencing tool (Bowden et al. 2019) that has been used for sequencing several microorganisms (Kato et al. 2020). Matsuo et al. (2021) used the technique to sequence full-length 16S rRNA amplicons and achieve species-level classification. Although the technique is still evolving to reduce its high error rate, it is well suited to the efficient and rapid diagnosis of infectious diseases.
Several investigations have been carried out to diagnose diseased states of the human microbiome, for example pulmonary sepsis (Guillen-Guio et al. 2020) and the dysbiotic stages of the pre- and post-antibiotic-treated human microbiome (Leggett et al. 2020). In addition, nanopore sequencing has been successfully used to assess species engraftment after fecal microbiota transplantation (Benítez-Páez et al. 2020). At present, Illumina (HiSeq and MiSeq) technology has made remarkable progress in data output and accuracy in a cost-effective manner (Dohm et al. 2008; Reuter et al. 2015) and has therefore come to dominate the NGS-based sequencing market. The technique has been used extensively to understand human microbiome alterations (Evans et al. 2014; Lambeth et al. 2015; Yasir et al. 2015). Illumina HiSeq technology was employed to identify bacteria that differ significantly in abundance in the oral cavity of patients with oral squamous cell carcinoma (Srivastava et al. 2022). Several studies have also used Illumina-based sequencing technology for taxonomic and functional profiling of the microbiota associated with smokeless tobacco (Sajid et al. 2021; Vishwakarma et al. 2023).

7 Conclusion and Prospects

With the remarkable development of omics technologies, researchers are now able to explore microbial dynamics. Metagenomics has attracted considerable attention and changed the perception of researchers. It is gaining great interest in biotechnology and is substantially increasing the range of industrial products (Lorenz and Eck 2005). In addition, various novel bioactive molecules with applications in human wellness, such as terragines, violacein, indirubin, cytarabine, and cephalosporins, have been retrieved using metagenomics (Coughlan et al. 2015). Shotgun metagenomics has gained interest in the field of human health (oral microbiome, gut microbiome) and in the discovery of new drugs for the treatment of diseases and of new genes and proteins from noncultivable microorganisms. Metabolomics is another noteworthy technique that has gained great interest in biomedical and pharmaceutical science and has broadened microbiome-metabolome research. The technique has identified several metabolite biomarkers and causes of diseases such as cancer, diabetes, and Alzheimer’s disease, for which therapeutic targets and strategies were previously unknown (Wishart 2016). Thus, it has brought greater precision to medicine worldwide through monitoring of individual drug responses, phenotyping of tumors, and identification of targets for cancer therapies. MS-based metabolomics studies have recently increased and have been widely used to study drug effects, toxins, nutritional effects, and several diseases, including kidney, bladder, breast, and gastric cancers as well as metabolic diseases (Zhang et al. 2013a, b, 2014; Feng et al. 2018). In the future, metabolomics, with the help of combined chromatographic techniques, will help elucidate various metabolic pathways and advance metabolite-related research. Metatranscriptomics provides functional profiling of the microbial communities of an environment, giving information on all the genes expressed in that community.
The combined study of metagenomics and metatranscriptomics allows researchers to reveal the up- and downregulation of particular functions and the roles of the microbes present in an environment (Mason et al. 2012; Maurice et al. 2013; Duran-Pinedo et al. 2014). At present, therefore, we are well equipped with tools and techniques for exploring both beneficial and harmful microorganisms on a broad scale. Taxonomic shifts and their correlation with hidden microorganisms can be explored for developing biomarkers in the near future.