1 Introduction

The evolution of sequencing technology from Sanger sequencing to whole genome sequencing (WGS) and its broad use in diverse fields of microbiology have started a new era of “precision microbiology”, allowing an unprecedentedly detailed characterization of microorganisms and microbial communities. WGS is an established process: according to a report from the European Centre for Disease Prevention and Control (ECDC), 50% of the public health laboratories in the European Union member states were already using this technology in 2016 for daily routine diagnostics, surveillance and epidemiological investigations, outbreak investigation, and tracking and identification of infection sources [1]. WGS-based typing, either by single nucleotide variant (SNV) analysis or by gene-by-gene allelic profiling, is currently the most powerful diagnostic tool for detailed characterization of microorganisms and has replaced previous typing tools such as pulsed-field gel electrophoresis, Sanger sequencing methods, and serotyping. The general benefits of WGS-based strain characterization approaches are their robustness and high discriminatory power, and the fact that they provide a basis for geographic differentiation of microorganisms and yield evolutionary information on outbreak isolates [2]. The high data quality, reproducibility and accuracy of WGS technologies have been demonstrated in public health and food safety [1,2,3,4]. In addition, the value of WGS has been highlighted for the detection, epidemiology and screening of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants, which was a key component of public health efforts during the COVID-19 pandemic.
Another benefit of WGS technologies is their backward compatibility with results obtained by traditional methods: the serotype, classical multi-locus sequence type (MLST), virulence, toxigenicity, antibiotic resistance, mobile genetic elements (MGEs) and plasmids of an isolate can all be extracted from its genome in a single experiment [5, 6] (Fig. 1).

Fig. 1. WGS workflow for characterization of bacterial species and prediction of their properties.

The high resolution achievable by WGS makes it possible to monitor even small genetic variations arising during the course of an outbreak, which allows transmission to be followed over time and person-to-person transmissions and infection sources to be identified.

WGS is also superior to any other method in terms of the generation and management of global databases and nomenclature. Examples with standardized nomenclatures for bacteria include Enterobase [https://enterobase.warwick.ac.uk, 7], GenomeTrakr [https://www.fda.gov/food/whole-genome-sequencing-wgs-program/genometrakr], Global Microbial Identifier (GMI) [https://www.globalmicrobialidentifier.org], PubMLST [8], cgMLST [https://cgmlst.org] and the EFSA One Health WGS system [9]. For viruses, GISAID [https://gisaid.org] serves influenza viruses and coronaviruses, including SARS-CoV-2, and Pango provides the SARS-CoV-2 nomenclature [10]. Nextstrain [11] covers both viruses and bacteria. The setup of openly accessible databases allows data sharing between public health and food laboratories worldwide and facilitates international source tracking, multinational outbreak investigation [1, 9, 12] and analysis of microorganisms from diverse environments under a One Health approach [13]. The resulting information is the basis for the correct decisions required in outbreak situations to stop further transmission and to terminate outbreaks [12, 14].

Amplicon-based and shotgun metagenomics are promising WGS approaches that, compared with current microbiological methods, allow an unbiased and culture-independent analysis of diverse sample matrices and provide nearly complete information about microbial communities (Fig. 2) [15, 16].

In contrast to culture-dependent methods, metagenomics also allows the identification of slow-growing, difficult-to-cultivate and non-cultivable microorganisms [16]. In addition, it allows the extraction of further information for microbial risk assessment, such as the occurrence of antimicrobial resistance, virulence and toxin genes and, when linked to transcriptomic or proteomic data, the identification of functional capabilities and biochemical activities of microbial populations (Fig. 2) [16].

Fig. 2. Whole genome and metagenomics sequencing. Left side: culture-based WGS. Right side: culture-free shotgun metagenomics.

In conclusion, WGS approaches have clearly improved the detection and characterization of microorganisms, the investigation of outbreaks, the detection of infection sources, and the elucidation of transmission chains, enabling authorities, hospitals and food companies to implement appropriate control and preventive measures in a timely and targeted manner. In the future, the qualitative and quantitative improvement of strain and genome databases, in combination with further technological advancements and the development of new bioinformatics and machine learning algorithms, will allow the accurate analysis and prediction of the characteristics of microorganisms, such as virulence, resistance and pathogenic properties, or health-related benefits and probiotic and technological properties, and will also support patent protection (Fig. 1).

2 Whole Genome Sequencing in Food Microbiology

With the increasing industrialization of food production and the internationalization of food trade, ensuring food safety has become a global challenge [17]. Food contamination and malnutrition have a worldwide impact on public health and economies. Many factors, such as the adaptation of microorganisms to new environments or hosts, human lifestyle and demographic changes, economic development, climate change, environmental pollution and excessive water and soil consumption, may cause the emergence of new microbial threats and the re-emergence of old ones [18]. Microbial food contamination can occur at any stage of the farm-to-fork chain: during food storage, processing and handling, but also at the origin, i.e. at the farm (e.g. through feed contamination in the case of livestock) or in the environment. Moreover, in our globalized world, and as impressively demonstrated by the SARS-CoV-2 pandemic, pathogens can easily spread worldwide. Therefore, local, national and international surveillance systems are required for efficient disease monitoring [18]. The progress in sequencing technologies from Sanger sequencing to WGS, in combination with surveillance programs, has markedly improved the timely detection of outbreaks and the implementation of appropriate control measures to terminate them and to prevent further transmission and morbidity [1].

WGS yields concordant results for phylogenetic clustering independently of the analysis pipeline, i.e. whether a single nucleotide polymorphism (SNP) based or a gene-by-gene based approach is used [16, 19]. Thus, WGS allows the responsible infection source in a foodborne outbreak scenario to be identified with a high level of confidence [26, 27] and has improved correct decision-making in such situations [12, 20]. Moreover, the set-up of open access databases hosting whole genome sequences and associated metadata has made international source tracking and multi-country outbreak investigation easier [21].
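
To make the two analysis routes concrete, the following minimal Python sketch computes a gene-by-gene (allelic) distance and a SNP distance for the same set of isolates. It is an illustration under stated assumptions, not a description of any published pipeline: the isolate names, allele numbers, sequences and function names are all invented for this example, and real analyses use thousands of loci and quality-filtered alignments.

```python
from itertools import combinations

# Hypothetical toy data: allele numbers at five cgMLST loci per isolate.
# None would mark a locus that could not be called.
ALLELE_PROFILES = {
    "isolate_A": [1, 4, 2, 7, 3],
    "isolate_B": [1, 4, 2, 7, 3],
    "isolate_C": [1, 4, 2, 8, 3],
    "isolate_D": [5, 9, 6, 8, 2],
}

# Hypothetical toy data: the same isolates as aligned core-genome sequences.
ALIGNED_SEQUENCES = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAC",
    "isolate_C": "ACGTACGAAC",
    "isolate_D": "TCGAACGATC",
}


def allelic_distance(profile_a, profile_b):
    """Count loci with different allele numbers, skipping uncalled loci."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both bases are unambiguous and differ."""
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a in "ACGT" and b in "ACGT" and a != b
    )


def pairwise(data, distance):
    """Return {(name_1, name_2): distance} for every pair of isolates."""
    return {
        (x, y): distance(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


def rank_pairs(distances):
    """Order isolate pairs from most to least similar (ties broken by name)."""
    return sorted(distances, key=lambda pair: (distances[pair], pair))


if __name__ == "__main__":
    allelic = pairwise(ALLELE_PROFILES, allelic_distance)
    snp = pairwise(ALIGNED_SEQUENCES, snp_distance)
    for pair in rank_pairs(allelic):
        print(pair, "allelic:", allelic[pair], "SNP:", snp[pair])
```

In this toy data set both distances rank the isolate pairs in the same order, which is the kind of concordance described above. The example only illustrates the idea and does not show that concordance holds in general.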

WGS allows efficient tracking of microorganisms and their distribution from farm to table. Subpopulations of bacterial pathogens can be transmitted into food processing facilities from diverse outside sources such as animals, incoming raw materials, soil, dust and water [22]. Once inside the manufacturing facilities, microorganisms can persist on diverse surfaces and equipment and in cold storage areas over long periods of time [23] and can spread to food, and subsequently to consumers, via aerosols, food processing workflows and contaminated materials in contact with food [23, 24]. WGS therefore enables the characterization of bacterial subpopulations at each step of the food chain, from the environment to suppliers, food-processing facilities, final products and consumers [22, 24].

The high level of discrimination of WGS-based analysis allows a more targeted risk assessment and risk management of microorganisms added to food products as probiotics and protective cultures [25]. WGS is also useful for the systematic selection of suitable food microorganisms, i.e. the so-called non-starter and starter cultures used for manufacturing food products such as cheese, meat, vegetables and fish [26]. In these food products, they contribute to texture, specific aroma and taste, and enhance biological safety through the production of a variety of antimicrobial substances such as bacteriocins, organic acids and hydrogen peroxide, which inhibit the growth of diverse foodborne pathogens and spoilage microorganisms [27, 28]. WGS is also useful for the identification and characterization of so-called endemic food microorganisms, i.e. microorganisms obtained from a country's artisanal food products that are produced without industrial starter cultures. These artisanal-food-typical strains may have specific properties that are essential for the typicity of traditional food products and might be a valuable resource for future food production [27]. Moreover, artisanal-food-specific strains might be useful markers to prove the origin of traditional food products [28].

However, although the use of microorganisms for the production and preservation of food has a long tradition and their safe application in food has been demonstrated, their use is controversial because of the emergence of toxigenic, virulent and, in recent years, multidrug-resistant strains. Therefore, strains added to food or used for manufacturing food products must obtain Qualified Presumption of Safety (QPS) [29] or Generally Recognized As Safe (GRAS) status following recommended schemes and guidelines. WGS methods clearly improve and facilitate the safety assessment of food microorganisms, which is based on a case-by-case investigation [30], by analyzing the toxigenic, resistance and virulence potential of the respective strains [29, 30].

The application and further development of WGS-based technologies in food microbiology and food safety will provide a superior tool for risk managers to assess and evaluate food risks and the properties of strains.

3 WGS in Medical and Public Health Microbiology

WGS is already an established laboratory process in public health microbiology for daily routine diagnostics, surveillance, outbreak investigation, and identification of infection sources [1, 12, 31]. However, despite the demonstrated usefulness and benefit of WGS for pathogen detection and variant screening of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), it is still not a standard technology in many microbiology laboratories in hospitals and health care units [32]. Although sequencing costs have decreased to around $0.010 per megabase of DNA [https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data], other barriers hamper the general use and acceptance of WGS in clinical microbiology laboratories. The main factors are a long turnaround time, a lack of knowledge among laboratory staff, a lack of bioinformaticians, a lack of data for creating accurate models for genotype-phenotype prediction, and a lack of workflows for genome data analysis [33]. Therefore, in clinical settings, WGS and metagenomics are mainly used for isolate detection and identification, isolate characterization, microbiome characterization, detection of antimicrobial resistance, virulence and toxigenic targets, and the detection and characterization of plasmids that play a role in the transmission of resistance and/or virulence (Fig. 2) [34].

However, WGS is more useful for clinicians if meaningful interpretations can be made directly from genome data, e.g. prediction of antimicrobial resistance or of the clinical impact of a virulence gene [32]. For several bacterial species, including the ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.), workflows with or without machine-learning algorithms have been developed and validated for the prediction of antimicrobial resistance from genomes [35,36,37]. Moreover, with respect to inter-laboratory reproducibility, WGS-based prediction of antimicrobial resistance already outperforms phenotypic antimicrobial susceptibility testing for some pathogens [38]. For tuberculosis, WGS-based diagnostics is already an established process, showing 93% accuracy for the detection and characterization of multidrug-resistant Mycobacterium tuberculosis at 7% lower cost than traditional culture-based methods [39]. However, for the majority of pathogens, the accuracy of resistance phenotype prediction from genomes is still hampered by the lack of curated, high-quality databases of drug resistance-conferring determinants [39].
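
The simplest form of genotype-to-phenotype prediction, without machine learning, is a rule-based lookup of detected genes in a catalogue of known resistance determinants. The Python sketch below illustrates this under explicit assumptions: the three-entry catalogue is a toy example built from well-known gene and drug associations, not a curated database, and the function names and the example isolates are hypothetical.

```python
# Illustrative toy catalogue, NOT a curated database: a few well-known
# resistance determinants and the drug class each is associated with.
TOY_CATALOGUE = {
    "mecA": "beta-lactams (methicillin)",
    "vanA": "glycopeptides (vancomycin)",
    "blaKPC": "carbapenems",
}


def predict_resistance(detected_genes, catalogue=TOY_CATALOGUE):
    """Return the sorted drug classes for which resistance is predicted.

    Genes that are absent from the catalogue are ignored, which is exactly
    where an incomplete database leads to missed predictions.
    """
    return sorted({catalogue[g] for g in detected_genes if g in catalogue})


def categorical_agreement(predicted, observed):
    """Fraction of isolates whose predicted phenotype matches the lab result.

    Both arguments are equally long lists of booleans (True = resistant).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one isolate is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


if __name__ == "__main__":
    # Hypothetical gene content of one isolate; gyrA is not in the toy catalogue.
    print(predict_resistance(["mecA", "gyrA"]))
    # Hypothetical predictions versus phenotypic test results for four isolates.
    print(categorical_agreement([True, False, True, True],
                                [True, False, False, True]))
```

The lookup can only be as good as its catalogue: a determinant that is missing from it produces no prediction at all, which mirrors the database limitation noted above.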

In addition, as mentioned above, WGS is backward compatible with historical datasets obtained with classical methods.

In hospital settings, the high resolution achievable by WGS has considerably improved the detection and investigation of outbreaks [40]. WGS allows the monitoring of even small genetic variations, enabling the detection of person-to-person transmissions [14], the tracking of transmission over time, the detection of distinct clusters and sporadic unrelated cases, and the identification of infection and contamination sources [4, 41].
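
One common way to turn pairwise genetic distances into clusters and sporadic cases is single-linkage clustering with a distance threshold. The Python sketch below shows the idea under stated assumptions: the patient labels, the SNP distances and the threshold of 5 SNPs are invented for illustration (appropriate thresholds depend on the species and the setting), and the function name is hypothetical.

```python
# Hypothetical pairwise SNP distances between isolates from four patients.
TOY_DISTANCES = {
    ("P1", "P2"): 2,
    ("P1", "P3"): 6,
    ("P1", "P4"): 45,
    ("P2", "P3"): 4,
    ("P2", "P4"): 47,
    ("P3", "P4"): 44,
}
TOY_ISOLATES = ["P1", "P2", "P3", "P4"]
TOY_THRESHOLD = 5  # assumed cut-off, chosen only for this example


def single_linkage_clusters(isolates, distances, threshold):
    """Group isolates connected by chains of pairs with distance <= threshold."""
    parent = {name: name for name in isolates}

    def find(name):
        # Follow parent links to the cluster representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), distance in distances.items():
        if distance <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in isolates:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    for members in single_linkage_clusters(
        TOY_ISOLATES, TOY_DISTANCES, TOY_THRESHOLD
    ):
        label = "cluster" if len(members) > 1 else "sporadic case"
        print(label, members)
```

With these toy values, P1, P2 and P3 form one cluster even though P1 and P3 differ by 6 SNPs, because both are within the threshold of P2, while P4 remains a sporadic, unrelated case. This chaining behaviour is a property of single linkage and is one reason such clusters are interpreted together with epidemiological data.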

However, to establish routine use of WGS in clinical microbiology, certain requirements need to be fulfilled, such as stable funding, the use of optimized and harmonized sequencing protocols, prospective sequencing and integration of WGS data into hospital management systems, and bridging the gap between public health, hospital infection prevention and the clinical laboratories. In public health, utilizing the full potential of WGS requires harmonized protocols and open access national and international databases for sharing, comparing and depositing WGS data and minimal epidemiological metadata. These harmonized datasets will considerably improve risk assessment, risk prediction, disease prevention and public health actions.

4 Conclusion

With the advent and implementation of WGS in microbiology, a new era of “precision microbiology” began. In food safety, public health and clinical microbiology, WGS provides superior discriminatory power for strain characterization, together with robustness and stability. This is crucial in cluster detection and source tracking, for gaining knowledge on microbial evolution, and for risk assessment and safety evaluation of food microorganisms. WGS technologies benefit not only clinical, public health and food agencies but also the food industry along the entire farm-to-fork chain. Future improvements in WGS, i.e. in technology, metadata, and new bioinformatics and machine learning algorithms, will enable the detection of current and emerging infectious diseases in real time and the international exchange of data and information in a standardized manner. We can also expect a paradigm shift from classical/molecular microbiology towards predictive microbiology.