Introduction

Urban wastewater treatment plants (UWWTPs) have been used over recent decades to treat wastewater from several human activities (households, industry, hospitals, etc.). The treatment consists of several steps in which the organic load (including pollutants) is partially or totally removed and degraded. Conventional UWWTPs combine physico-chemical processes (primary treatment), biological processes (secondary treatment), and an advanced oxidation process (tertiary treatment), whose objective is to remove recalcitrant or non-biodegradable pollutants so that the water reaches a high quality before it is discharged or reused (Oller et al. 2011; Ferro et al. 2016).

The presence of microorganisms in these water purification systems is fundamental to reducing organic matter and pollutants in the water. Indeed, secondary treatment is based on the metabolic activity of a complex microbial community (sludge), in which the relations between species or strains directly determine the overall effectiveness of the biological treatment (Cydzik-Kwiatkowska and Zielińska 2016). These communities are still under active investigation because they are highly complex and many aspects that may be critical for better UWWTP operation at large scale remain unknown (Cydzik-Kwiatkowska and Zielińska 2016). In this context, microbial genetic-based techniques are emerging as one of the most important tools for increasing knowledge of the biological activity of this type of community, referred to as bioremediation.

The same molecular-biology techniques have a second important application: the investigation of emerging microbial pathogens of recent concern that appear in secondary urban wastewater (UWW) and whose hazardous impact on the environment, including human health, is still undetermined. Many efforts are being made to investigate the presence of these pathogens in the influent and effluent of UWWTPs to ensure “safety,” especially when the water is reused. This is the case of antibiotic-resistant bacteria (ARB) and the related antibiotic resistance genes (ARGs) (Rizzo et al. 2013; Ferro et al. 2016).

This contribution reviews the current scenario of conventional and advanced techniques used for water quality management and for monitoring microorganisms naturally occurring in UWW in two areas of research and application: bioremediation and UWW reuse. It also points out the main implications of developing advanced techniques to investigate unknown micro-pollutants, given the limitations of standard techniques. Furthermore, the different massive sequencing platforms and the methodology used in each are described. Finally, a bibliometric analysis is presented that examines parameters such as number of publications, country, institution, and source, giving an overview of the direction that scientific research on bioremediation and wastewater reuse is likely to take in the future.

Overall legislation and water quality monitoring in urban wastewater treatment plants

International and national regulations for UWW reuse establish the legal framework to ensure the correct monitoring, use, and management of water resources. The different regulations describe, among other aspects, the analysis of physicochemical and biological parameters according to standardized analytical methods. Although most of them rely on the same basis, the maximum permitted levels of contaminant discharge vary between regulations (or countries) and also depend on the final end-use, including agriculture, industry, recreational uses, etc.

The most relevant guidelines on UWW management and reuse are established by the US Environmental Protection Agency (USEPA 2012) and the World Health Organization (Gorchev and Ozolins 2011). At the European level, Directive 2000/60/EC establishes the framework for community action in the field of water policy. However, there are many national regulations in Europe, based mainly on the WHO guidelines, that include microbial load limits depending on the type of microorganism and the type of final use. Spain has one of the most restrictive European regulations on UWW management and reuse, the Spanish RD 1620/2007 (RD 1620/2007). Other countries like Jordan, South Africa, and Australia (Seder and Abdel-Jabbar 2011; National Water Quality Management, S 2006; DWA 2011) have developed their own standards in this matter.

Water quality monitoring to meet the criteria of water legislation is key to ensuring safe water and improving subsequent water management. This monitoring includes the analysis of several physicochemical parameters, such as pH, electrical conductivity, color, odor, total suspended solids (TSS), turbidity, biochemical oxygen demand (BOD5), ammonium and Kjeldahl nitrogen, anion and cation concentrations, chemical oxygen demand (COD), water hardness, and/or free chlorine.

The average physicochemical characteristics typically found in UWW effluents can be summarized as follows: pH 7–7.5, conductivity ca. 1530 μS cm−1, turbidity from 7.78 to 10 NTU, total organic carbon (TOC) around 15 mg L−1, and TSS from 250 to 850 mg L−1 (Giannakis et al. 2016). UWW contains a variable number of chemical and microbiological pollutants mainly derived from daily human activities (Asano 1998). In line with this, a great number of toxic inorganic substances have been detected, including elements like arsenic, copper, lead, mercury, zinc, etc. (Pescod 1992). Other important chemical pollutants in UWW are the so-called ‘contaminants of emerging concern (CECs)’ or priority substances, which co-exist in this water matrix at very low concentrations (from ng L−1 to μg L−1) but nevertheless represent a challenge for UWW treatment and potential reuse. CECs include pharmaceuticals, antibiotics, personal care products, hormones, industrial chemicals, heavy metals (arsenic, copper, lead, mercury, zinc, etc.), and many other emerging chemical substances that cannot be completely removed by conventional treatments (Miralles-Cuevas et al. 2016).

UWW also contains a wide variety of microorganisms, including bacteria, viruses, and protozoa. Table 1 summarizes the most common naturally occurring waterborne pathogens detected in UWW. In contrast to this wide presence of microorganisms in water, only a few microbial indicators are used to assess water quality. Escherichia coli (total coliforms/fecal coliforms) is the most widely used microbial parameter in water legislation, including that for drinking water and for UWW reuse. Other pathogens, such as Legionella sp., Salmonella sp., viruses, and protozoa (Cryptosporidium sp., Giardia sp., etc.), are rarely included or required as criteria.

Table 1 Summary of representative waterborne pathogens detected in secondary effluents of UWW

Nevertheless, anthropogenic activity has led to the development of microorganisms considered ‘emerging concerns’, such as antibiotic-resistant bacteria (ARB) (Rizzo et al. 2013; Ferro et al. 2016). They have only recently appeared or been detected in UWW effluents, thanks to the development of advanced detection techniques, and their presence raises new issues and concerns for the reuse of treated UWW (Gorchev and Ozolins 2011).

For these reasons, guidelines are under constant evaluation and modification to cover any new aspect related to the reuse of UWW effluent, including UWWTP management, new end-uses, new potential risks, and the addition of new parameters for water quality monitoring. Along these lines, the European Commission has announced a new legislative proposal for 2017 on minimum requirements for water reused for irrigation and groundwater recharge. It will include aspects such as risk management plans, treatment standards and process controls, and water quality benchmarks.

Standardized and conventional detection methods of microorganisms in water

Standardized methodologies permit the measurement of any water parameter according to proper analytical methods, so that water quality data can be generated and compared between different laboratories. The most widely used manual of practice for water and wastewater analysis is the well-known “Standard Methods for the Examination of Water and Wastewater.” This manual provides a number of standardized, EPA-approved laboratory tests for physicochemical and microbiological water analysis (APHA 2005). Regarding microorganisms, the most commonly accepted procedures for the detection and enumeration of water pathogens are explained in detail in the aforementioned manual and briefly summarized in Table 2.

Table 2 Main standardized methods used for detection of microorganisms in water (source: APHA 2005)

Non-standardized and advanced analytical methods

The interest in knowing all the microorganisms present in a given environment is not new. Their ability to live in extreme environments, their capacity to adapt to adverse conditions, and their near ubiquitous presence have aroused the curiosity of the scientific community in recent decades. Their analysis, however, is not always simple. The main problem is that, as several studies have shown, only about 1% of the microorganisms found in a given environment can be cultivated in the laboratory (Rappé and Giovannoni 2003). In addition, even though culture-dependent methods are extensively used for pathogen detection in water, they show several limitations, such as low sensitivity and the excessive time needed to obtain reliable results (Yazmín Ramírez-Castillo et al. 2015).

Consequently, other (non-standardized) analytical methods are widely used to investigate many microbe-related aspects. They allow scientists to increase their knowledge and obtain valuable information on pathogens in water, for example through the investigation of human pathogens that exist in a viable but non-cultivable (VBNC) state or the detection of genetic modifications, mutations, symbiosis between strains or species, etc.

To investigate the relationships between microorganisms responsible for pollutant removal from wastewater, a new set of molecular techniques was developed during the 1990s. Such techniques eliminate the need to culture organisms for detection and remedy shortcomings of traditional techniques by allowing rapid, sensitive, and specific identification of target microorganisms responsible for the elimination of specific contaminants as well as required pathogens.

The most common advanced techniques in many research areas are summarized as follows:

  1. Flow cytometry. This technology is used to analyze several parameters of cells, including relative granularity (internal complexity), size, and fluorescence intensity, as well as surface and intracellular molecules. It permits the characterization of different cell types in a heterogeneous population by means of fluorescently labeled cells. It is a powerful and effective technology for assessing bacteria in water samples because it is accurate and rapid, detects both cultivable and uncultivable microorganisms, and is relatively easy to perform (Van der Mark et al. 2013). Currently, this methodology appears to be one of the advanced analytical methods that may compete with the standard plate count to improve water quality monitoring (Van Nevel et al. 2017).

  2. Fluorescence in situ hybridization (FISH). This is a powerful technique that permits the in situ detection, identification, and quantitative description of microbial communities such as those in activated sludge and wastewater (Gilbride et al. 2006). It allows scientists to investigate possible mechanisms of survival and of infection at the cellular level, and to detect emerging pathogens in water, sewage, and sludge. FISH is based on the specific hybridization of rRNA-targeted oligonucleotide probes, labeled covalently at one end with a fluorescent dye, to the genetic material of the cells. The technique is used in combination with confocal microscopy, fluorescence microscopy, or flow cytometry to obtain qualitative and quantitative results (Allegra et al. 2008).

  3. Biosensors. Traditional chemical and physical tests for contaminants in urban wastewater should be combined with bioassays to evaluate their biological availability and bio-toxicity and, consequently, to determine their potential effects on human health and the aquatic biota. In addition, the effect of bioaccumulation of contaminants over time could be assessed.

Recently, the development of biosensors has opened up great prospects for the onsite, simplified, and cost-effective monitoring of water quality (Brayner et al. 2011; Lagarde and Jaffrezic-Renault 2011). In a biosensor, a biological recognition element is combined with a physical transducer to convert the biological response into a signal that depends on the analyte concentration (Jianrong et al. 2004). However, a few drawbacks, such as low selectivity, poor detection limits, risk of contamination with other microorganisms, and mass transfer limitations, still have to be tackled.

The majority of available biosensors are enzymatic and operate via electrochemical processes. They offer high selectivity towards the target analyte, but they are also time consuming. Other drawbacks should be mentioned: they require costly enzyme purification and immobilization protocols, and they have a short lifetime and poor stability. Microbial biosensors are sensitive to a larger variety of analytes, thanks to the consortium of enzymes contained in the cells (Park et al. 2013). Electrochemical approaches, i.e., amperometry, potentiometry, and conductometry, are usually implemented for microbial sensors. In addition, optical microbial biosensors are also widely employed (Chouler and Di Lorenzo 2015).

Microbial fuel cell technology is very promising, though still at the basic research stage (Lee et al. 2015); such devices directly convert the chemical energy of organic matter into electricity via the metabolic processes of microorganisms. No external transducer is required to convert the biological response into a signal, as the presence of a pollutant in the feed stream is immediately detected as a distinct change in the current from the system. Apart from pure cultures, mixed cultures of naturally available microorganisms have also been tested (Aracic et al. 2015).

  4. Techniques based on molecular biology. The most widespread technique in water research is the well-known quantitative polymerase chain reaction (qPCR). This method has proven to be an effective tool to detect, identify, and quantify microorganisms in water with high sensitivity and in a short time. It is based on the amplification and detection of specific DNA fragments of a microbial strain. To do so, different fluorescent probes can be used; they hybridize within the target DNA sequence and generate a signal that is detected by specialized systems, which permits quantification based on specific DNA sequences, analysis of mutations, etc. Given the increased knowledge and the new development of methodologies based on DNA quantification, this approach is described in more detail in the next sections (Sanz and Köchling 2007; Cydzik-Kwiatkowska and Zielińska 2016).

Metagenomics

In accordance with the above, it is widely accepted in the scientific community that determining the microorganisms present and their activity under different conditions is key to the successful operation of an urban wastewater treatment plant. Near real-time monitoring of the activity of the microbial population (biomass) in activated sludge should be mandatory, as this activity determines the metabolic pathways that may occur in the system and thus affects the final quality of the treated wastewater for reuse.

Therefore, the need has arisen for a technique that facilitates the study of the microbial diversity of a sample by analyzing all of the prokaryotic DNA, without having to isolate and cultivate each microorganism in the laboratory beforehand. That is what metagenomics does. The origin of this discipline goes back to the work of Pace (Pace et al. 1985), who attempted to read microbial DNA introduced into cloning vectors. Those early studies had little success, and it was not until 1998 that the term metagenomics was first used (Handelsman et al. 1998) to refer to the totality of the genomes found in a certain environment. Thus, the goal of a metagenomic analysis is the study of microorganisms, through their DNA, in the context of their community.

Metagenomics studies can be tackled from two different approaches, both of which rely on massive sequencing, also called next-generation sequencing (NGS) (Mardis 2008): targeted metagenomics (Yang et al. 2014) and shotgun metagenomics (Liebl et al. 2014). The fundamental differences between them lie in methodology and objective. In targeted metagenomics, a gene or a few genes are sequenced and used primarily for phylogenetic studies, while in shotgun metagenomics, all the DNA present is sequenced and used in functional gene analysis (Morgan et al. 2013).

In targeted metagenomics studies, the libraries that will later be sequenced are constructed by polymerase chain reaction (PCR), amplifying mainly the hypervariable regions of the DNA that encodes ribosomal RNA (rRNA) (Amaral-Zettler et al. 2009; Caporaso et al. 2011). This type of assay is usually used in ecology for taxonomic studies that shed light on the biological diversity of an environment. In shotgun metagenomics studies, on the other hand, the genomic library is constructed by fragmenting the DNA with restriction enzymes (Venter et al. 2004) or by a physical method, and sequencing it in its entirety. In this type of sequencing, the depth (the number of times each nucleotide is read) is lower than in targeted metagenomics, since much more genetic material is sequenced from each microorganism. In return, it provides a global view that makes it easier to establish relationships between different elements of the genome and to draw functional conclusions about the genes.
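The difference in depth between the two approaches can be illustrated with a rough calculation. The sketch below assumes the usual approximation that mean depth equals the total number of sequenced bases divided by the length of the DNA being covered, with reads spread evenly; the read count, read length, marker size, and genome sizes are hypothetical values chosen only for illustration, not figures from any cited study.

```python
def mean_depth(num_reads, read_length, target_length):
    """Average number of times each nucleotide of the target is read."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return num_reads * read_length / target_length


# Hypothetical sequencing run: 1,000,000 reads of 250 bases each.
reads, length = 1_000_000, 250

# Targeted: every read falls on one marker region of ~1,500 bases (illustrative).
targeted = mean_depth(reads, length, 1_500)

# Shotgun: the same reads are spread over 100 genomes of ~5 Mb each (illustrative).
shotgun = mean_depth(reads, length, 100 * 5_000_000)
```

With these made-up numbers, the same run reads the marker region well over 100,000 times, while each position of the community genomes is read 0.5 times on average, which is the trade-off between depth and breadth described above.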

In any case, regardless of the metagenomic approach, once the library is built, the rest of the process is similar and consists of DNA sequencing. The birth of high-throughput sequencing technologies was therefore fundamental to the development of metagenomics, and it is easy to understand why the number of articles has continued to grow since the first work of this type was published (Rondon et al. 2000). It is also important to note that, in parallel with the development of the different massive sequencing platforms, a multitude of bioinformatic tools have been developed that make it possible to analyze the large amount of data obtained (Oliver et al. 2015; Garrido-Cardenas and Manzano-Agugliaro 2017). Finally, the greater capacity to read DNA and the better understanding of the data have led to a reduction in the price of these analyses. Whereas at the beginning of the twenty-first century the price of sequencing a complete genome was around 100 million dollars, the same genome can be sequenced today for no more than 1000 dollars. The parallelization of sequence reads associated with NGS methodology has made the sequencing of large amounts of DNA very accessible. For this reason, metagenomic analyses are increasingly common in studies related not only to food or agriculture (Li 2011) but also to human health and well-being (Wang et al. 2015).

Massive sequencing platforms

The different massive sequencing platforms and their main features are shown in Table 3 (Garrido-Cardenas et al. 2017). The first four platforms (Roche 454, SOLiD, Illumina, and Ion Torrent) use short-read sequencing technology, also known as second-generation technology, and appeared on the market from the 2000s (the first machine of the Roche 454 platform) to 2010 (the first equipment of the Ion Torrent platform). The other two platforms, Pacific Biosciences and Oxford Nanopore, are more recent and are known as single-molecule real-time long-read, or third-generation, sequencing technologies.

Table 3 Different massive sequencing platforms with their most important characteristics

As already pointed out, the Roche 454 platform was the first to appear on the market and to be used for massive sequencing analysis. Its technology is known as pyrosequencing (Hyman 1988), and it is based on the measurement of the light emitted as a consequence of a secondary reaction that accompanies DNA replication (Chowdhury et al. 2012). When DNA is copied to be read, a pyrophosphate molecule is released each time a nucleotide is incorporated into the newly synthesized strand. This pyrophosphate is essential for the transformation of the luciferin reagent into oxyluciferin, which releases light that can be measured with a charge-coupled device (CCD) camera. The equipment of the 454 platform has been widely used in metagenomic analyses (Tun et al. 2012; Gonzalez-Silva et al. 2017) and other massive sequencing analyses, although at present it is being abandoned, mainly because of its high price but also because of its high error rate when reading homopolymers (Luo et al. 2012).
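The homopolymer problem follows from how pyrosequencing reads runs of identical bases. In the minimal sketch below, nucleotides are assumed to be supplied one at a time in a fixed cyclic order (the order TACG and the function name are assumptions made for illustration), and the light signal of each flow is idealized as exactly proportional to the number of bases incorporated.

```python
def flowgram(new_strand, flow_order="TACG"):
    """Idealized pyrosequencing flowgram for the strand being synthesized.

    Returns a list of (nucleotide, bases_incorporated) pairs, one per flow.
    In a real instrument the second value is not observed directly: it has
    to be inferred from the light intensity of that flow.
    """
    if any(base not in flow_order for base in new_strand):
        raise ValueError("strand contains a base absent from the flow order")
    signals = []
    position = 0
    while position < len(new_strand):
        for nucleotide in flow_order:
            run = 0
            # Every consecutive matching base is incorporated in the same flow.
            while position < len(new_strand) and new_strand[position] == nucleotide:
                run += 1
                position += 1
            signals.append((nucleotide, run))
            if position == len(new_strand):
                break
    return signals


signals = flowgram("TTTAG")
```

Here the run TTT produces a single flow with three incorporations. Telling a run of, say, seven bases from a run of eight means distinguishing two similar light intensities, which is why long homopolymers are the error-prone case.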

The SOLiD massive sequencing platform detects fluorescence signals produced by the sequential ligation of fluorescent probes. In practice, it carries out successive ligation cycles with the 16 different probes obtained by combining the 4 nucleotides in the first two positions of the probe. These probes are labeled with four different fluorophores, so each fluorescence measurement, taken on its own, admits several interpretations; only the combination of all the signals yields a unique and unequivocal sequence (Valouev et al. 2008). The great advantages of this platform compared to others are its high throughput and the low price of the sequencing reactions, but its drawbacks are that it generates very short reads and that the equipment is very expensive.
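The idea that a single fluorescence measurement is ambiguous while the whole series is not can be sketched as follows. The assignment of the 16 dinucleotides to four colors used here (bases coded A=0, C=1, G=2, T=3 and combined by XOR) is the scheme commonly described for two-base encoding, but it is stated as an assumption, since the text does not specify the mapping; decoding also assumes that the first base is known.

```python
BASES = "ACGT"  # assumed 2-bit codes: A=0, C=1, G=2, T=3


def color_of(dinucleotide):
    """Color (0-3) of a pair of adjacent bases: XOR of their 2-bit codes."""
    first, second = dinucleotide
    return BASES.index(first) ^ BASES.index(second)


def encode(sequence):
    """Translate a base sequence into colors, one per overlapping pair."""
    return [color_of(sequence[i:i + 2]) for i in range(len(sequence) - 1)]


def decode(first_base, colors):
    """Recover the base sequence from a known first base plus all colors."""
    sequence = [first_base]
    for color in colors:
        previous = BASES.index(sequence[-1])
        sequence.append(BASES[previous ^ color])
    return "".join(sequence)


# Group the 16 possible dinucleotides by the color they would produce.
pairs_by_color = {color: [] for color in range(4)}
for first in BASES:
    for second in BASES:
        pairs_by_color[color_of(first + second)].append(first + second)

colors = encode("ATGGC")
recovered = decode("A", colors)
```

Each color is shared by four dinucleotides, so one measurement alone cannot identify a base pair; the chain of overlapping pairs, anchored on a known first base, removes the ambiguity.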

In the Illumina platform, chemically modified nucleotides that cause reversible termination of the DNA polymerization reaction are used. These modified nucleotides are also labeled with a fluorophore, and their fluorescence is measured by total internal reflection fluorescence (TIRF) (Bentley et al. 2008). This platform is the most widely used in massive sequencing projects, including those of metagenomics (Lazarevic et al. 2009; Caporaso et al. 2012). Its main advantages and disadvantages are similar to those described for the SOLiD platform, although the Illumina platform offers a greater range of equipment to suit the characteristics of the project to be carried out.

The last of the short-read platforms to appear was Ion Torrent, which introduced the novelty of using semiconductor materials and abandoning optical detection systems. It is based on the measurement of the small pH changes produced by the release of protons (H+) during DNA synthesis (Merriman et al. 2012). Although there are not many Ion Torrent instruments on the market, the platform does offer different chips from which researchers can choose depending on the needs of their study. Each chip has a different number of wells, and DNA polymerization takes place inside each well. In short, the chip is the machine. The two major advantages of this platform are the low cost of both the equipment and the sequencing reactions and the simplicity of the equipment. Its great drawback is that it is not yet at the level of Illumina or SOLiD in terms of read accuracy.

Recently, two new sequencing platforms have appeared with different theoretical bases but a great common novelty: the sequencing of single molecules in real time (Lee et al. 2016). These are the Pacific Biosciences and Oxford Nanopore platforms. The Pacific Biosciences platform uses immobilized enzymes that polymerize DNA molecules while reading them (Rhoads and Au 2015), while the Oxford Nanopore platform uses nanosensor channels that separate the two sides of a compartment and measures the change in potential produced when DNA traverses the pore (Loman and Watson 2015). Both platforms share advantages and disadvantages. Among the advantages are the low cost of sequencing and the great read length, while the fundamental disadvantage is their very low accuracy.

Application of metagenomics to UWW

Bioremediation

These advanced detection techniques have also become key tools for monitoring microbial populations in bioremediation, that is, wastewater treatment in which naturally occurring organisms break down hazardous substances into less toxic or non-toxic substances. Aerobic, anaerobic, and anoxic biological systems are widely employed in UWWTPs as cost-efficient treatment techniques. Historically, however, they have been implemented as a “black box” engineering solution in which amendments are added and the pollutants are degraded (Chakraborty et al. 2012). Consequently, highly accurate techniques are needed to identify the specific microbial populations in activated sludge systems that are responsible for certain enzymatic and degradation activities, with the aim of improving advanced bioremediation processes focused on the possible adaptation of microorganisms to newly detected pathogens and contaminants of emerging concern.

In this sense, the use of genetic engineering to create organisms specifically designed for bioremediation has great potential, as does the addition of matched microbe strains to the activated sludge medium to enhance the resident microbe population’s ability to degrade contaminants (Lovley 2003).

UWW reuse

Several examples illustrate how genetic techniques have been applied to assess the capability of UWWTPs to remove pathogens. Lin et al. (2014) and Lu et al. (2015) used qPCR to quantify the dynamics of the complex microbial communities present in water samples through the different stages of wastewater treatment (including coagulation-flocculation, sedimentation, sand filtration, and disinfection) as well as in biofilms (in the secondary treatment step). By applying molecular monitoring methods, they demonstrated that Arcobacter butzleri, Aeromonas hydrophila, and Klebsiella pneumoniae present in wastewater were efficiently eliminated during biological treatment.

One of the current concerns related to UWWTPs is that these plants act as “hotspots” of antibiotic-resistant bacteria (ARB) and antibiotic resistance genes (ARGs), facilitating their spread in the environment. In line with this, several studies using a metagenomics approach have demonstrated the great potential of this technology to investigate ARB and ARGs.

Huang et al. (2014) investigated tetracycline-resistant bacteria (TRB) and ARGs in activated sludge of sewage treatment plants treated specifically with tetracycline, using 454 pyrosequencing and Illumina high-throughput sequencing. Pyrosequencing of the 16S rRNA gene identified several bacterial genera (Sulfuritalea, Armatimonas, Prosthecobacter, Hyphomicrobium, Azonexus, Longilinea, Paracoccus, Novosphingobium, and Rhodobacter) as potential TRB. The metagenomic analysis indicated an increase in the abundance and diversity of the Tet genes, together with a reduction in the occurrence and diversity of non-tetracycline ARGs, especially the sulfonamide resistance gene Sul2 (Huang et al. 2014).

In another study, the effect of WWTP effluents on ARGs in a receiving river was assessed by functional metagenomics. Libraries constructed in E. coli revealed a significant increase downstream of the WWTP in the number of clones resistant to amikacin, gentamicin, neomycin, ampicillin, and ciprofloxacin (Amos et al. 2014). Moreover, Zhang et al. (2015) investigated the capacity for ARG removal of thermophilic and mesophilic anaerobic digestion of sludge in bench-scale reactors. Using metagenomic analysis, they detected 35 ARGs in the sludge, with a reduction of > 90% for 8 and 13 ARGs after thermophilic and mesophilic anaerobic digestion, respectively. Another study investigated the antibiotic resistance profiles of Pseudomonas aeruginosa in a hospital wastewater treatment plant (HWTP) (Santoro et al. 2015). In this work, metagenomic analysis was combined with P. aeruginosa isolates used as a bio-indicator to assess the antimicrobial susceptibility, viability, and diversity of antibiotic-resistant Pseudomonas species.

More recently, shotgun metagenomic sequencing was applied to evaluate the different treatment steps of three UWWTPs in Sweden. The OXA-48 gene, which is related to resistance to carbapenems, one of the most critically important classes of antibiotics, was found to be enriched in surplus and digested sludge. The authors concluded that comprehensive analyses of resistant/non-resistant strains within relevant species are warranted with metagenomics (Bengtsson-Palme et al. 2016).

Bibliometric analysis

In the development of the bibliometric analysis, a search was carried out in Scopus, the Elsevier database, with the following parameters: TITLE-ABS-KEY (wastewater) OR TITLE-ABS-KEY (sewage) AND TITLE-ABS-KEY (metagenomic) OR TITLE-ABS-KEY (“dna sequencing”) OR TITLE-ABS-KEY (pyrosequencing). The time range used was from 1994 to 2016. It should be noted that a search query changing any of these parameters or time range can give different results.
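Because the query string as printed mixes OR and AND without parentheses, its result depends on how the search engine groups the operators. The sketch below builds one plausible reading, in which a record must match at least one water-matrix term and at least one sequencing term; this grouping is an assumption, not something stated in the text, and the helper name is hypothetical.

```python
def any_of(field, terms):
    """OR several terms together under one field code, wrapped in parentheses."""
    return "(" + " OR ".join(f"{field}({term})" for term in terms) + ")"


FIELD = "TITLE-ABS-KEY"
matrix_terms = ["wastewater", "sewage"]
method_terms = ["metagenomic", '"dna sequencing"', "pyrosequencing"]

# Assumed intent: (matrix term 1 OR 2) AND (method term 1 OR 2 OR 3).
query = any_of(FIELD, matrix_terms) + " AND " + any_of(FIELD, method_terms)
print(query)
```

Writing the parentheses explicitly makes the search reproducible regardless of the operator precedence applied by the database.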

The main aspects studied were number of publications per year, distribution by country and affiliation, and source. The records were processed using spreadsheets, and graphs were generated to facilitate the visualization of the results.

Evolution of scientific output

The search returned 732 documents. Figure 1a shows the evolution of scientific output in the period 1994–2016. As can be seen, scientific production grew progressively over this period, with the most notable increase since 2012, reaching a maximum of 170 documents on metagenomics in wastewater published in 2016.

Fig. 1 a Trends in publications on metagenomics on wastewater from 1994 to 2016. b Trends in publications on metagenomics on wastewater from 1994 to 2016 with the y-axis in logarithmic scale

In Fig. 1b, the same values are plotted on a logarithmic scale and fitted with a linear trend line, giving a coefficient of determination of R² = 0.9627. This confirms the impression from the previous graph of exponential growth from the beginning of the twenty-first century to the present day.
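The reasoning behind this check is that exponential growth becomes a straight line once the counts are log-transformed. A minimal sketch of such a fit is given below, using ordinary least squares on log10 of the yearly counts; the counts are made-up values for illustration only and are not the Scopus data behind Fig. 1, so the resulting R² is not the one reported above.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, returned with R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical yearly publication counts (illustrative, not from the study).
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
counts = [5, 9, 18, 33, 70, 130, 260]

# Years with zero publications would have to be excluded before taking logs.
log_counts = [math.log10(c) for c in counts]
slope, intercept, r2 = linear_fit(years, log_counts)

# On a log10 scale, output doubles every log10(2) / slope years.
doubling_time = math.log10(2) / slope
```

An R² close to 1 on the log-transformed data supports an exponential trend, and the slope translates directly into a doubling time.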

Publication distribution by countries and institutions

Figure 2a shows the scientific production on the subject distributed by country. Only the eight countries with at least 30 publications are represented. Two countries stand out above the others: China and the USA, with 264 (36.07%) and 160 (21.86%) publications, respectively. It is noteworthy that three of the top five countries in the ranking are Asian, which demonstrates the great interest in environmental issues in that region. Several aspects help to explain the indisputable leadership of China in this ranking. First, the Chinese Government is making a great effort to increase investment in R&D (Qiu et al. 2014): in 1998, R&D expenditure was 0.65% of GDP, while in 2013 it was 2.08% of GDP. This reflects the firm commitment made over the last 30 years to avoid relying technologically on other countries. Second, in China there is a strong public and private commitment to the challenges posed by water management. There are currently two major national innovation centers dedicated to water: the China Institute of Water Resources and Hydropower Research (IWHR) and the Nanjing Hydraulic Research Institute (NHRI). In addition, there is a watershed innovation platform of a scientific nature, as well as several national laboratories and research centers for water technology and engineering.
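The percentages quoted for each country follow from the 732 documents returned by the search. As a small sketch of that arithmetic (the function and variable names are illustrative):

```python
TOTAL_DOCUMENTS = 732  # documents returned by the Scopus search


def share(count, total=TOTAL_DOCUMENTS):
    """Percentage of the total output, rounded to two decimals."""
    return round(100 * count / total, 2)


country_counts = {"China": 264, "USA": 160}
shares = {country: share(n) for country, n in country_counts.items()}
print(shares)  # {'China': 36.07, 'USA': 21.86}
```

The same function can be applied to the per-institution and per-journal counts to check the remaining percentages.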

Fig. 2

a Distribution of publications by country. b Distribution of publications by institution

In the same vein, it is not surprising that, when the number of publications is broken down by institution (Fig. 2b), most of the institutions are also Asian. The figure shows the 17 institutions with at least 10 publications in the period studied, and the top five are all Chinese. In fact, among the top ten, the only non-Chinese institutions are a Spanish university (the University of Granada), an Australian university (the University of Queensland), and a Japanese university (the University of Tokyo).

Distribution of output in journals

Finally, the distribution of publications by journal is shown in Fig. 3. Only journals with at least 10 publications are represented. Most of these journals (13) are published in Europe, three are published in America, and only one is published in Asia. Of the European journals, eight are English, two are Swiss, two are Dutch, and one is German. All of the American journals are from the USA, and the single Asian journal is Chinese. The first three positions in this ranking are occupied by European journals: Bioresource Technology (82 publications, 11.20%), Applied Microbiology and Biotechnology (53 publications, 7.24%), and Water Research (53 publications, 7.24%). All of these journals share a focus on environmental biotechnology, microbiological analysis of environmental media, and other technologies associated with biological factors.
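The percentage shares quoted for countries and journals can be cross-checked with simple arithmetic. The sketch below assumes a total of 732 records. That total is not stated in this section; it is back-calculated here from the reported counts and percentages and should be treated as an assumption. The helper name `share_percent` is hypothetical.

```python
def share_percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


# ASSUMPTION: 732 records in total, back-calculated from the reported
# percentages; the total is not stated in this section.
ASSUMED_TOTAL = 732

# Publication counts as reported in the text.
reported = {
    "China": 264,
    "USA": 160,
    "Bioresource Technology": 82,
    "Applied Microbiology and Biotechnology": 53,
    "Water Research": 53,
}

shares = {name: share_percent(n, ASSUMED_TOTAL) for name, n in reported.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

Under that assumed total, the computed shares agree with the percentages reported for both the country and the journal rankings, which suggests that both were calculated against the same set of records.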

Fig. 3

Distribution of publications by source

Conclusion

Wastewater treatment is an activity of increasing interest. It is carried out in dedicated plants that pursue the partial or total elimination of the organic load through different physico-chemical, biological, and oxidative treatments. Before these organic and inorganic contaminants can be eliminated, they must first be monitored. After the different treatments, the measured values of these contaminants must comply with the limits established by legislation.

For microbiological parameters, the situation is more complicated. First, 99% of the microorganisms present in wastewater cannot be cultured in the laboratory and therefore have not traditionally been studied. Second, there are still no qualitative or quantitative quality thresholds for reusable water. For these reasons, a large number of standardized, non-standardized, and advanced methods have been used for several decades to detect microorganisms in water, although they have proved insufficient.

It is therefore essential to have methods for analyzing microorganisms that do not require isolating each of the species present in a medium, and that allow the microbiological profile of one environment to be compared with that of another. This type of metagenomic analysis, in which all the DNA extracted from a medium is processed in its entirety to characterize the microbial diversity present in that medium, has been successfully performed in humans over the last decade. The first studies were limited to highlighting the role of microorganisms in maintaining human health (Ordovas and Mooser 2006) and the relevance of knowledge of the intestinal microbiota for the future of personalized healthcare. Over the last 10 years, a large number of articles and reviews pointing in this direction have been published. Others go even further, pointing out the importance of the human microbiome in pathologies related to the immune system (Kau et al. 2012), obesity (Ley 2010), or even cancer (Zeller et al. 2014). Metagenomic methodologies are also being applied in other areas, such as ecology (Kimes et al. 2013).

On the other hand, since the beginning of the twenty-first century, high-throughput massive sequencing techniques have been developed that are reducing DNA sequencing costs and enabling ever-greater read depth. There are currently six major massive sequencing platforms: four use short-read sequencing technology and the other two use single-molecule, real-time, long-read technology. All of them have been used in metagenomics projects whose objective is the study of all the microorganisms present in a given environment, in the context of their own community.

Wastewater quality monitoring has not been left out of this methodology. Metagenomic libraries have been built to identify genes related to the microbial degradation of aromatic compounds (Suenaga et al. 2007), and metagenomic analyses have been performed in municipal wastewater treatment plants to study the main methanogenic pathways in anaerobic sludge digestion (Yang et al. 2014). These analyses have been carried out using different massive sequencing platforms: Ion Torrent (Cao et al. 2016), Illumina (Ma et al. 2015), and Roche 454 (Ranasinghe et al. 2012). In all cases, the results have been satisfactory and future prospects are encouraging.

The problem remains that most of the detected microorganisms are still uncharacterized, so a high percentage of them cannot be assigned to a species, and sometimes not even to a genus or a family. That is why it is essential to continue working along this line and to try to characterize, if not all the microorganisms in an environment, at least the profile of those that determine whether the quality of the analyzed product is acceptable.

The bibliometric study carried out in this work shows that interest in the analysis of wastewater by means of metagenomic techniques is remarkable, with exponential growth over the last 15 years and with China as the country with the highest scientific output on this subject. This is consistent with the policy pursued by the Chinese Government and with the network of water-related and scientific infrastructure built around wastewater treatment. However, the keyword analysis of the published studies shows that the use of massive sequencing technologies is not yet widespread. Most metagenomic analyses are still performed with standard Sanger sequencing, regardless of the throughput and cost advantages that NGS may offer. Nevertheless, massive sequencing has become more prominent in metagenomic analyses in recent years, which suggests that it may be the future trend.

In bioremediation, the use of metagenomics allows the in-depth study of the effect of different interventions, facilitating the optimization of the processes (Eyers et al. 2004). That is, metagenomics can help us understand whether remediation strategies are appropriate. However, from the current perspective, the metagenomic approach is only useful for studying changes in microbial diversity in response to an action. The values obtained are not yet of great importance in absolute terms. That is why it is crucial to combine these new approaches with classical ones. Together, all these measurements and the combination of this enormous amount of data should help us optimize the mechanisms and processes involved in wastewater purification, with the added value of increasing the final quality of the wastewater before it is reused for any activity. Several authors have also demonstrated the great potential of this approach for analyzing antibiotic-resistant bacteria and related genes in UWW, opening a door for future research that should allow the development of enhanced treatments that reduce these bacteria and genes and limit their dissemination in the environment.