1 Introduction

Chemistry is a fascinating field of study with applications in nearly every aspect of our lives. The beauty of this open and diverse universe lies in the possibility of changing lives for the better. As part of this mission, chemists have come to cooperate with the biological and medical sciences to elucidate complex biological systems through chemistry, mostly by using analytical instrumentation. Human curiosity and the need for science to answer relevant questions of biological interest continue to push the evolution and improvement of existing analytical tools. In the meantime, analytical chemists need to stay alert to the progress of the biological sciences (Horvai et al. 2011). The nineteenth and twentieth centuries were marked by chemists’ efforts to make precise, reliable chemical quantitation a reality. Now, we are living in a biological revolution. These fields, together, will clear obstacles and bring mutual benefits to researchers working in Bioanalytical Chemistry (Horvai et al. 2011; Wake 2008).

2 Bioanalytical Chemistry: Accessing the Chemistry of Life

One of the greatest mysteries of life is its origin. The distinction between animate and inanimate matter is surrounded by questions of how and why organic molecules and inorganic compounds interact through sophisticated molecular recognition, as in a perfect symphony. It is known that chemical reactions are vital to maintaining the structure and function of living organisms, guaranteeing their survival and continuous evolution. As a consequence, a vast number of different chemical compounds are repeatedly created as products of those reactions, telling us the history behind each living organism (Pross and Pascal 2013; Datta et al. 2020).

The chemical compounds that originate from a biological organism are called biomolecules. Most biomolecules are organic, and typical examples are nucleic acids, amino acids, peptides, proteins, carbohydrates, and lipids. While the study of life at the molecular level is the concern of biochemistry, bioanalytical chemistry explores the identification, characterization, quantification, and time-dependent monitoring of biomolecules from sensitive sample matrices through many specific analytical methods that will be further explored in this book (Fig. 1) (Labuda et al. 2018; Kogikoski et al. 2018; Roat-Malone 2007).

Fig. 1

Schematic representation of the landscape of bioanalytical chemistry—the integration between analytical chemistry and biology. Created with BioRender (https://biorender.com/)

The chemical study behind biological functions provides the knowledge needed to diagnose diseases, to discover novel biomarkers and biomaterials, to improve drug design based on a specific mechanism of action, and to develop biosensors for specific targets. All of these purposes carry an enormous responsibility and require a high level of accuracy and reliability, which must be ensured by both the analytical procedure and the analyst (Hersel et al. 2003).

2.1 Biomolecules Composition and Properties

Biomolecules are made of a few critical elements, such as carbon, nitrogen, oxygen, hydrogen, calcium, phosphorus, and sulfur. Moreover, sodium, potassium, magnesium, chlorine, and transition metals such as iron and copper regulate critical biochemical pathways in biological systems. The chemical interactions provided by those elements dictate the structure of the matter in the organism and give rise to different physical properties. From an analytical point of view, it is crucial to understand these chemical and physical properties and to exploit them in analytical methods that help elucidate the differences between biological molecules and samples.

For example, discriminating between plant-derived compounds with similar composition yet different biological properties (Kubinyi 2002; Atanasov et al. 2015) is sometimes only possible thanks to the capability and selectivity of separation methods (Atanasov et al. 2015; Srikoti et al. 2020; Gault and McClenaghan 2009), the assessment of the exact mass by mass spectrometry, and the determination of shape and structure by spectroscopic methods (Gault and McClenaghan 2009).

Among biomolecules, nucleic acids and proteins are groups with intricate compositions and functions that play essential roles in the maintenance of life. As science learns more about the specific functions and pathways of these biomolecules, the importance of their bioanalysis grows continuously. There are two types of nucleic acids, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). If there is a “code of life,” it is certainly DNA. DNA contains all the genetic information transmitted to the new generation of cells or organisms, while RNA takes part in the synthesis of proteins. Both are composed of nucleobases (purine and pyrimidine bases), carbohydrates, and phosphoric acid. Covalent bonds link these constituents, forming nucleotides and nucleotide chains. Hydrogen bonds are responsible for base pairing, which gives DNA its 3D structure with a twisted double-helix shape (Selzer et al. 2018; Manz et al. 2004).

Proteins, in turn, consist of amino acids connected by peptide bonds. In living organisms, about 20 different amino acids are arranged in distinct sequences, forming a chain. Interactions between the amino acid residues then enable the chain to fold into a 3D structure and shape, giving the protein a specific activity and function. In general, proteins provide organisms with structure and tissue support (fibrous proteins) and perform specific functions in the immune system and metabolism (antibodies and enzymes) (Manz et al. 2004). The research field dedicated to elucidating proteins, together with the methods it uses, is called proteomics (Amiri-Dashatan et al. 2018). Other biomolecules also play an essential role in identifying diseases through the omics sciences (genomics, transcriptomics, proteomics, and metabolomics), each one using a different technology (Bedia 2018).

An important aspect to consider when working with biological samples is the nature of the matrix. Samples from animal tissue need different treatment than plant material, for example. Each type of analysis usually has a sample preparation protocol to avoid degradation and loss of analytes. At the same time, the targeted compounds need to be extracted from the sample matrix to prevent interference from contaminants or matrix constituents during instrumental analysis. The extraction process is challenging and has to be chosen carefully, taking into account the properties of the analyte, the cost-benefit ratio, and the compatibility with the analytical instrumentation (Bedia 2018; Poole et al. 2016; Clark et al. 2016).

2.2 Importance of Bioanalytical Studies

Modern biomedical research is interested in deeply understanding the protein universe, since proteins are directly related to the expression of different genes and may differ across conditions and diseases. Nevertheless, is it enough to discover a protein and predict its clinical aspects? In science, the answer is usually no. Recently, a new branch has emerged, named chemical proteomics. This field integrates synthetic chemistry, cellular biology, and mass spectrometry. Concisely, chemical proteomics uses affinity chromatography to identify the interactions of proteins with other molecules, whether or not the protein has been characterized. This approach has the advantage of using a target-fishing method, which allows the identification not only of new proteins but also of the kinds of molecules that interact with them. Chemical proteomics has become an excellent way to test drug response (Amiri-Dashatan et al. 2018; Chen et al. 2020; Eberl et al. 2019). Besides, knowing the proteins involved in receptor-binding domains and understanding their characteristics is crucial when infections need to be overcome and the race for a medicine or a vaccine needs to be fast (Qiao and de la Cruz 2020).

To be clinically useful, biomarker discovery needs a robust and reproducible method, supported by computational processing that makes the results understandable to both scientists and clinicians. That is the goal of translational medicine: to narrow the gap between basic and applied science (Bravo-Merodio et al. 2019; Wang 2012).

2.3 Separation Methods for Biomolecules

The study of biomolecules involves complex manipulation of these analytes. The researcher is interested in having each biomolecule separated so that its physical and chemical properties can be evaluated without interference from other molecules. The difficulty in separating biomolecules comes from the fact that many substances have very similar properties and are usually highly diluted in their matrices. To overcome these conditions, separation and purification methods have been developed and enhanced over the years (Glad and Larsson 1991). The most important of these methods are covered in this book, including electrophoresis, chromatography, and centrifugation (e.g., membrane centrifugation and ultracentrifugation) (Fig. 2).

Fig. 2

The most commonly used separation techniques for biomolecules. Image adapted from BioRender (https://biorender.com/)

Electrophoresis is a technique for separating and identifying high molecular weight biomolecules, such as proteins and nucleic acids. This separation occurs through an electric field that moves charged macromolecules depending on their molecular weights, charge magnitude, and tertiary and quaternary structures. The basic scheme for electrophoretic separation consists of an anode, a cathode, and a medium filled with a buffer solution and supported by inert material. When voltage is applied, a current is generated by the movement of positively charged molecules to the cathode and negatively charged to the anode (Mikkelsen and Cortón 2016). In electrophoresis, the fundamental parameter, characteristic of each molecule (under given conditions), is the electrophoretic mobility, μ, given by the equation:

$$ \mu = \frac{v}{E} = \frac{q}{f} $$
(1)

where v is the migration velocity, E the electric field strength, q the net charge, and f the frictional coefficient.
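Taking the mobility as the ratio of migration velocity to field strength, or equivalently of net charge to frictional coefficient, the calculation can be sketched in Python. This is only an illustration: the Stokes expression for the frictional coefficient of a rigid sphere, the function names, and the numerical values of the analyte are assumptions that do not come from the text.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per unit charge


def mobility_from_velocity(velocity_m_s: float, field_v_m: float) -> float:
    """Electrophoretic mobility mu = v / E, in m^2 V^-1 s^-1."""
    if field_v_m == 0:
        raise ValueError("electric field strength must be non-zero")
    return velocity_m_s / field_v_m


def stokes_friction(viscosity_pa_s: float, radius_m: float) -> float:
    """Frictional coefficient f = 6*pi*eta*r (assumption: rigid sphere)."""
    return 6.0 * math.pi * viscosity_pa_s * radius_m


def mobility_from_charge(net_charge_c: float, friction_kg_s: float) -> float:
    """Electrophoretic mobility mu = q / f, in m^2 V^-1 s^-1."""
    if friction_kg_s <= 0:
        raise ValueError("frictional coefficient must be positive")
    return net_charge_c / friction_kg_s


# Hypothetical analyte: net charge of +5, 2 nm radius, water-like buffer.
friction = stokes_friction(viscosity_pa_s=1.0e-3, radius_m=2.0e-9)
mu = mobility_from_charge(5 * ELEMENTARY_CHARGE_C, friction)

# Migration velocity in a 20 kV/m field; dividing it by the field
# gives back the same mobility.
field = 2.0e4
velocity = mu * field
print(f"mu = {mu:.3e} m^2/(V s), v = {velocity:.3e} m/s")
```

The sketch makes the two dependencies explicit: for a given field, a molecule migrates faster when it carries more net charge and slower when friction (size, buffer viscosity) increases.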

Over the years, electrophoresis has developed considerably (Campa et al. 2006; Lechner et al. 2020), and three types are primarily used: moving-boundary, zone, and steady-state electrophoresis. The choice is based on the compatibility between the analyte and its matrix and on the specific requirements of the experiment.

Capillary electrophoresis (CE) is a more recent advance in this area. It relies on the same principle described above, with the difference that the separation is carried out inside a capillary column or tube. This simple change gives the technique very high resolution, owing to the small volumes involved and the control of the electroosmotic flow rate. Other advantages over conventional electrophoresis are more efficient heat dissipation per unit volume, higher electric fields, faster separations, and less diffusional band broadening. The technique also makes it possible to analyze very small samples, such as single cells, with high efficiency. Some limitations include low concentration sensitivity, that is, poorer limits of detection (LOD) than other similar techniques, and a strong dependence on solution pH and temperature. The resulting need for concentrated samples to obtain a clean response is another weakness of this approach (Mikkelsen and Cortón 2016; Fekete and Schmitt-Kopplin 2007).

Chromatography is widely used for separating biomolecules. The technique is based on the flow of a mobile phase (containing the sample) through a stationary phase. There are multiple types of chromatography, but since this chapter concerns biomolecules, this topic focuses on liquid chromatography (LC). In this case, the mobile phase is a liquid, typically an aqueous solution, and the stationary phase can be a particulate solid, a semi-solid porous gel, or a porous monolith. The separation occurs as the mobile phase carries the analytes through the stationary phase. The strength of the interaction between each sample component and the stationary phase determines its retention time, that is, the time between injection and elution, so each molecule elutes at a characteristic time (Mikkelsen and Cortón 2016).
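Retention times are commonly turned into a few dimensionless figures that describe a separation. The Python sketch below uses the standard textbook definitions of retention factor, selectivity, and resolution. These formulas are not given in this chapter, and the peak values are hypothetical, so the block should be read as an illustration rather than as a prescribed procedure.

```python
def retention_factor(t_r: float, t_0: float) -> float:
    """k = (t_R - t_0) / t_0, where t_0 is the time of an unretained compound."""
    if t_0 <= 0:
        raise ValueError("t_0 must be positive")
    return (t_r - t_0) / t_0


def selectivity(k_first: float, k_second: float) -> float:
    """alpha = k_2 / k_1 for two adjacent peaks, with k_2 >= k_1 > 0."""
    if k_first <= 0:
        raise ValueError("k_first must be positive")
    return k_second / k_first


def resolution(t_r1: float, t_r2: float, w1: float, w2: float) -> float:
    """R_s = 2 (t_R2 - t_R1) / (w_1 + w_2), using baseline peak widths."""
    return 2.0 * (t_r2 - t_r1) / (w1 + w2)


# Hypothetical chromatogram, all times in minutes.
t0 = 1.0
peak_a = {"t_r": 4.0, "width": 0.5}
peak_b = {"t_r": 5.0, "width": 0.5}

k_a = retention_factor(peak_a["t_r"], t0)
k_b = retention_factor(peak_b["t_r"], t0)
alpha = selectivity(k_a, k_b)
rs = resolution(peak_a["t_r"], peak_b["t_r"], peak_a["width"], peak_b["width"])
print(f"k_a = {k_a:.2f}, k_b = {k_b:.2f}, alpha = {alpha:.2f}, Rs = {rs:.2f}")
```

A stronger interaction with the stationary phase shows up as a larger retention factor, and two components are separated when their retention times differ by more than their combined peak widths.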

For the separation of biomolecules, four main mechanisms are relevant: partition, size exclusion (gel filtration), affinity, and ion exchange. In partition, the analyte distributes between two liquid phases (mobile and stationary) according to its solubility, and the analytes most soluble in the mobile phase elute first. Size exclusion separates molecules by size and shape: smaller, compact molecules enter the pores of the stationary phase and are retained longer, while larger, open-structure molecules elute first. The affinity mechanism involves particular biomolecules, such as enzymes and antibodies, that interact specifically with their substrates and antigens, respectively. Ion-exchange separation is specific for charged molecules: the stationary phase carries a charge opposite to that of the analyte and attracts it (Fig. 3) (Mikkelsen and Cortón 2016).

Fig. 3

Relevant chromatography separation mechanisms for biomolecules: Partition, Ion-exchange, Affinity, and Size exclusion. Images adapted from Biorender (https://biorender.com/) and Servier Medical Art licensed under CC 3.0, https://smart.servier.com

The most advanced LC technology is high- and ultra-high-performance liquid chromatography (HPLC and UHPLC). These systems operate at high pressures, which increases the efficiency and resolution of the analysis. The high pressure is required by the very small size of the stationary phase particles, and it comes with other improvements in the whole apparatus, including coupling with robust detectors such as the mass spectrometer (MS) (Matuszewski et al. 2003; Yoon et al. 2019).

Among the strengths of this technique, versatility stands out, considering the wide range of molecules that can be separated and analyzed. High resolution, selectivity, specificity, and efficiency are also obtained in the most modern chromatographs. Its limitations are the high costs involved and the large amount of hazardous waste generated. Besides, low-boiling analytes and analytes of very high mass (>10⁶ Da) are better investigated using other techniques (Lottspeich and Engels 2018).

Centrifugation is a well-established technique that is still broadly used. It uses centrifugal force to separate particles from their liquid medium. The method follows the same principle as sedimentation under gravity, in which suspended particles denser than the medium settle downward. Sedimentation depends on the size, shape, and mass of the particle and on the viscosity and density of the supporting fluid. The technique is widely applied to biochemical samples such as cells, organelles, DNA, and proteins. With the right exposure time and rotation speed, analytical centrifugation can separate those biomolecules without harming them (Mikkelsen and Cortón 2016; Lottspeich and Engels 2018).
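How rotation speed and the properties of the particle and the fluid enter the separation can be sketched in Python. The block assumes a spherical particle obeying Stokes' law, which the chapter does not state, and all numerical values and function names are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def angular_velocity(rpm: float) -> float:
    """Convert revolutions per minute to radians per second."""
    return 2.0 * math.pi * rpm / 60.0


def relative_centrifugal_force(rpm: float, radius_m: float) -> float:
    """RCF = omega^2 * r / g, expressed as a multiple of gravity."""
    return angular_velocity(rpm) ** 2 * radius_m / STANDARD_GRAVITY


def sedimentation_velocity(diameter_m: float, rho_particle: float,
                           rho_fluid: float, viscosity_pa_s: float,
                           rpm: float, radius_m: float) -> float:
    """Stokes sedimentation velocity in a centrifugal field, in m/s.

    v = d^2 (rho_p - rho_f) omega^2 r / (18 eta); assumes a rigid sphere
    in a dilute suspension.
    """
    acceleration = angular_velocity(rpm) ** 2 * radius_m
    return (diameter_m ** 2 * (rho_particle - rho_fluid) * acceleration
            / (18.0 * viscosity_pa_s))


# Hypothetical run: 10,000 rpm at 10 cm from the rotor axis,
# 1 micrometer particle slightly denser than a water-like medium.
rcf = relative_centrifugal_force(rpm=10_000, radius_m=0.10)
v_sed = sedimentation_velocity(diameter_m=1.0e-6, rho_particle=1100.0,
                               rho_fluid=1000.0, viscosity_pa_s=1.0e-3,
                               rpm=10_000, radius_m=0.10)
print(f"RCF = {rcf:.0f} x g, sedimentation velocity = {v_sed:.2e} m/s")
```

In this simplified picture, a particle with the same density as the medium does not sediment at all, and a more viscous medium slows sedimentation, which is consistent with the dependence on particle and fluid properties described above.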

In a centrifuge, the main component is the rotor, which allows high-speed separations. Three types of rotors are used in this technique: swinging-bucket, fixed-angle, and vertical rotors. The difference between them is the direction of the applied forces and, consequently, the angle of deposition. Each rotor is used with different techniques directed to the desired application. Among the techniques, it is possible to mention the differential, zonal, isopycnic, density, and fractional centrifugations. These mechanisms differ according to the exploration of sedimentation, viscosity, and density differences inside of a mixture (Mikkelsen and Cortón 2016; Lottspeich and Engels 2018).

Advances in centrifugation include temperature control and high-speed and ultracentrifuges (Laue and Stafford 1999). They are usually applied to specific biological samples that require the control of certain factors to avoid damage. Care must be taken with this technique: depending on the sample, temperature control is essential to avoid denaturation. Besides, centrifugation may change the structure and shape of the particles, which is another limitation for some experiments. Nevertheless, centrifugation is a straightforward and widely applicable technique, which explains its importance for the vast field of biochemical analysis (Mikkelsen and Cortón 2016; Lottspeich and Engels 2018).

2.4 Microfluidics: Bringing Solutions to Bioanalytical Chemistry

Microfluidics, also known as micro total analysis systems or lab-on-a-chip, is a growing field offering application in many areas such as sensors, chemical synthesis, biological field, and engineering, among others. Microfluidic devices were initially implemented to study the behavior of fluid through microchannels. The main goal was to miniaturize known processes, enabling the use of lower sample and material volumes, resulting in cheaper and greener processes (Li 2010).

The field of microfluidics was born in 1975, with the fabrication of a silicon-based gas chromatograph (Terry et al. 1979), and it was established with the miniaturization of pumps, valves, and sensors in later works (Hodge et al. 2001; Reyes et al. 2002; Auroux et al. 2002). During the 1990s, not only chromatography and its components but also other analytical tools moved toward miniaturization, with electrophoresis as the highlight (Manz et al. 1992; Duarte et al. 2012). To this day, the field keeps attracting remarkable attention, with the recent development of advanced devices for pharmaceutical studies, the so-called organs-on-chip, and for diagnosis, with microdevices able to screen for several diseases (Fig. 4) (Reyes et al. 2002; Ingber 2018; Zhang et al. 2018).

Fig. 4

Microfluidic devices are applied to target compound analysis, sample preparation, and organs-on-chips. Images adapted from Biorender (https://biorender.com/) and Servier Medical Art licensed under CC 3.0, https://smart.servier.com

Bioanalytical chemistry has always been related to the microfluidics field, since few conventional methods are capable of intensively treating and analyzing biomolecules. Samples such as plasma, saliva, and tumor cells, among others, are typically limited to a few microliters, which frequently makes their evaluation by some techniques, e.g., LC-MS and GC-MS, unfeasible. Therefore, microfluidic devices are bringing enormous advances to separations (e.g., chromatography and electrophoresis), cell-based assays, nucleic acid and PCR analysis, immunological and biological reactions, biochemical investigation, protein crystallization, and clinical investigation (Gomez 2010; Imamura et al. 2020; de Oliveira et al. 2020; Bounab et al. 2020; Roman and Kennedy 2007).

The advantages of using microfluidics as a bioanalytical tool include the reduction of materials, reagents, waste, and costs. Improvements in throughput, efficiency, and sensitivity are also attractive, besides the portability, dynamicity, and disposability it offers. Among its limitations are the compatibility between materials, samples, and analytical methods, the integration of components into complex systems, and the high investment in microfabrication facilities (Roman and Kennedy 2007).

3 Omics Sciences: Guided Bioanalysis

About 20 years ago, the world awoke to a new era. No, we are not talking about the twenty-first century or the popularization of the internet, but about the era of human genomics, ushered in by the sequencing of the entire human genome, the achievement of the Human Genome Project (HGP). From then on, scientists believed they had found the perfect tool to answer all disease-related questions: an enormous achievement for humanity, the solution for all diseases, and the elucidation of profound questions. Reading these beliefs in the 2020s may sound naive, but back then scientific thinking was guided by the idea that, since DNA holds all the information needed for life, it should also hold all the information related to diseases, behavior, and therapeutic responses (Lander et al. 2001; Emmert-Streib et al. 2017; Gibbs 2020).

Indeed, without DNA, life as we know it would not be possible. Since Crick’s statement of the central dogma, DNA has been seen as the chest that keeps the information needed to keep the cellular machinery working. Several genes are responsible for different diseases and conditions, for example, the tumor suppressor gene p53, known to be an essential gene in several types of cancer (Ozaki and Nakagawara 2011). The p53 gene encodes a protein that checks for DNA damage during the cell cycle. Once the p53 gene is mutated, the p53 protein may lose the ability to detect damaged DNA and stop it from being copied, thus allowing the flawed cell to proliferate. In this way, after somatic mutations in the DNA, the damaged cell turns into a cancer cell, i.e., a cell able to evade the cell cycle checkpoints, multiplying fast and irregularly and forming a tumor (Pollard et al. 2004; Whibley et al. 2009). Other genes related to different diseases were discovered in the past decades, such as BRCA1 and BRCA2 for breast cancer and the IDDM1 locus for type 1 diabetes (Gomes et al. 2017; Narod and Foulkes 2004).

Life, however, is not simple. Time and immense scientific effort have shown that life is considerably more complex than DNA nucleotides alone can tell us. Following the central dogma of biology, proteins also play a crucial role in cellular responses, as in viral infections. Here, proteins such as cytokines signal the presence of the virus to immune system cells, and antibodies, also proteins, mark viruses and bacteria for elimination by immune system cells or mechanisms (Pollard et al. 2004). Proteins also regulate many critical cellular processes. Examples are apoptosis, which is guided by several proteins, including BH3 domain proteins, and the cell-division cycle, a process controlled in part by cyclin-dependent kinase (CDK) proteins that regulate each step of the cell cycle (Chipuk et al. 2006; Shamas-Din et al. 2013; Galderisi et al. 2003; O’Connor and Adams 2010).

Proteins, however, are not the answer to everything either, and metabolites enter the field, since they are the final products of the cellular machinery. Metabolites take part in several essential reactions in cellular operation, for example, the ATP and NADH molecules that carry the energy that runs the entire human organism, or the amino acids, the so-called building blocks of proteins. Metabolites operate in many different pathways in the cellular workflow, so many that the most recent KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway map, Fig. 5, could be mistaken for a drawing of an electronic circuit, given the complexity of the network and its connections, a complexity that would leave even the largest existing subway system behind (Kanehisa 2019). Metabolites are also directly affected by the environment; for example, diet or the use of pharmaceuticals and drugs directly affects the metabolome, that is, the complete set of metabolites in the organism. It is also highly attractive to search the metabolome for biomarkers of conditions and diseases. As a good example, di-tyrosine and 2-pyrroloylglycine appear to be biomarkers for the progression of pre-diabetes to diabetes, just as cortisol is a biomarker for stress (Zeng et al. 2008; Yan-Do and Macdonald 2017; Hellhammer et al. 2009).

Fig. 5

Diagram of the metabolic pathways in human physiology. In an analogy with a metro system, each node, or station, represents a metabolite, while each color represents a different train line, or metabolic pathway. Lines represent the path traveled by the trains, showing that for a train to go from the beginning to the end of the line, it has to pass through all the stations on that line. Many of the pathways are linked to each other, which is why metabolic changes in a specific route can disrupt the entire system. Images adapted from KEGG Pathway Maps (Kanehisa 2019)

None of the three groups of molecules, however, can alone offer a conclusive and final evaluation of all diseases or be used as the single tool for explaining human organisms. Science has now realized that no single perspective of an organism stands alone. The three groups of molecules work together in a highly complex network, where a single missing piece of the puzzle may disrupt the entire system and, in the extreme, cause death. This complexity led scientists to investigate proteins and metabolites with the same approach as that given to genes. Just as genes led to genomics, transcriptomics studies RNA, proteomics evaluates the proteome, and metabolomics investigates the complete metabolome (de Anda-Jáuregui and Hernández-Lemus 2020; Rahman and Rahman 2018; Joyce and Palsson 2006).

Nowadays, the three sets of biomolecules (nucleotides, proteins, and metabolites) are studied with an omics approach for the continued advancement of our knowledge about life. This approach means that, using bioanalytical tools, comparisons between different conditions could (1) explain the emergence of a condition or disease, (2) lead to a biomarker, or (3) be employed as a prognostic tool or treatment target (de Anda-Jáuregui and Hernández-Lemus 2020; Karczewski and Snyder 2018). The complexity of each omics field can be described by the complexity of its biological matrix: the difficulty of accessing the biomolecules, their range of concentration and localization, and the number of different analytes. Figure 6 shows how the omics fields are related. Nucleotides (DNA and RNA), despite the complexity of sequencing, are biomolecules with low molecular diversity and a narrow range of concentration, which translates into a smaller analytical challenge. DNA and RNA encode the next group of biomolecules, the proteins, which have a broader range of concentration and greater diversity within the same organism than the nucleotides, making proteomics studies more demanding from the analytical point of view. Metabolites, the next link in the chain, are present in any organism in a broader range of concentration than the others. They also include a great variety of compounds with different physicochemical properties, which makes metabolomics highly challenging from the bioanalytical perspective. In the next subtopics, the reader will find the main points of the omics approaches for polynucleotides, proteins, and metabolites, including the most common bioanalytical techniques employed and the challenges of each one (de Anda-Jáuregui and Hernández-Lemus 2020; Joyce and Palsson 2006; Karczewski and Snyder 2018).

Fig. 6

Omics sciences applied to molecular biology: from nucleotides to proteins, from proteins to metabolites. Image created with Biorender (https://biorender.com/)

3.1 Nucleotides: Genomics and Transcriptomics

Genomics can be considered the precursor of all the omics fields and is defined as the comparison between genomes by sequencing them and checking their similarities and discrepancies. In the past, genomics was one of the most expensive fields of science, with complete genome sequencing costing billions of dollars. Fortunately, sequencing technology has made huge progress in the past decades, so that nowadays DNA analysis can be performed at affordable prices, which has even given rise to a new business field of individual DNA analysis (de Anda-Jáuregui and Hernández-Lemus 2020; Raja et al. 2017; Vernon et al. 2019).

As mentioned, “Genomics 101” rests on DNA sequencing for later comparison and analysis of the sequences. During the Human Genome Project (HGP), sequencing was performed by a technology called Sanger sequencing, named after the process developed by Frederick Sanger, Fig. 7. It applied DNA polymerase, deoxynucleotides, and fluorescently labeled dideoxynucleotides to synthesize DNA chains of different sizes, which were then separated by capillary gel electrophoresis and read by a fluorescence detector (Carrilho 2000). As electrophoresis separates the polynucleotide chains by size, the successive DNA sizes generate the pattern of the DNA sequence. In a sense, bioanalytical tools made the entire field of genomics possible since, without electrophoresis, no sequencing was possible (Carrilho 2000; Schuster 2008; Dewey et al. 2013; Mamanova et al. 2010; França et al. 2002). More than that, we can say that analytical chemists saved the Human Genome Project by designing new instrumentation (multiple capillary array electrophoresis), new enzymes (Taq DNA polymerase), new fluorophores (higher quantum yields and convenient spectroscopic properties), and improved separation media (high molecular weight poly(acrylamide)) (Zubritsky 2002).

Fig. 7

Sanger sequencing. The fragmented DNA is combined with a primer, DNA polymerase, deoxynucleotides, and labeled dideoxynucleotides (the Sanger’s chain terminator nucleotide) to allow the synthesis of a new DNA strand, originating different sizes of DNA chains. Synthesized DNA chains are later separated by the size of the chain employing capillary electrophoresis using polymer solutions, and read by a four-color fluorescence detector. The electropherogram, combined with the fluorescence measurement, allows the DNA sequence determination. Image created with Biorender (https://biorender.com/)
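The readout logic of Sanger sequencing can be illustrated with a short Python sketch. It is a deliberately idealized model: the primer is omitted, exactly one terminated chain is produced per length, and the template is assumed to be written 3'→5' so that the new strand reads 5'→3'. The function names and the example template are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_terminated_fragments(template: str) -> list[str]:
    """Return one chain-terminated fragment per possible length.

    Each fragment is a prefix of the newly synthesized strand; its last
    base stands for the labeled dideoxynucleotide that stopped synthesis.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [new_strand[:n] for n in range(1, len(new_strand) + 1)]


def read_by_size(fragments: list[str]) -> str:
    """Mimic electrophoresis plus detection.

    Fragments are ordered from shortest to longest (the order in which
    they reach the detector) and only the terminal base is read.
    """
    ordered = sorted(fragments, key=len)
    return "".join(fragment[-1] for fragment in ordered)


template = "TACGGATC"  # hypothetical template, written 3'->5'
fragments = synthesize_terminated_fragments(template)

# The reaction tube holds the fragments as an unordered mixture.
random.Random(0).shuffle(fragments)

read = read_by_size(fragments)
recovered_template = "".join(COMPLEMENT[base] for base in read)
print(read, recovered_template)
```

Sorting by length plays the role of the size-based separation, and reading only the last base of each fragment plays the role of the four-color fluorescence detector.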

Nowadays, Sanger sequencing is still widely employed, but next-generation sequencing (NGS) technology has taken the field by storm, as it is faster and cheaper than Sanger sequencing by capillary array electrophoresis. In NGS, Fig. 8, the DNA is extracted and purified, fragmented, and joined to an adapter, a specific sequence of nucleotides. The adapter-linked DNA is then added to a chip, also called a flow cell, that contains many molecules of a nucleotide sequence (oligonucleotide) fixed on its surface. This fixed oligonucleotide is complementary to the adapter linked to the DNA, which leads to the pairing of both sequences, a process called library hybridization. After this process, DNA polymerase is added to the flow cell, together with deoxynucleotides, leading to DNA synthesis using the fixed oligonucleotide as a primer. This step is a polymerase chain reaction (PCR) process and is repeated, leading to DNA amplification (Schuster 2008; Dewey et al. 2013; Mamanova et al. 2010; França et al. 2002; Medvedev et al. 2009).

Fig. 8

One of the next-generation sequencing technologies. The process of NGS starts with library preparation (1) by the DNA fragmentation, followed by the addition of adapters to the DNA sequences. The DNA fragments are inserted into the flow cell, and the DNA is amplified (2). After amplification, fluorescently labeled nucleotides are added in turns, and the sequence of all fragments is acquired (3). As the final step, specific software turns several fragments into a single DNA sequence (4). Image created with Biorender (https://biorender.com/)

Accordingly, several copies of the DNA fragments are fixed to the surface, and different sequences are located in different clusters on the flow cell. Once the DNA fragments are fixed to the surface, all the previously added deoxynucleotides are washed away, and fluorescently labeled terminator deoxynucleotides are added to the flow cell. DNA polymerase then performs the DNA synthesis with the labeled deoxynucleotides. After one cycle of synthesis, the equipment reads the fluorescence in each of the clusters. Since each labeled deoxynucleotide has a different color, the equipment can determine which nucleotide was added in each cluster. After each measurement, chemicals are added to remove the terminator and fluorescent groups from the labeled deoxynucleotide, allowing DNA polymerase to continue the DNA synthesis. The chemicals are then washed away, and the process of adding a new labeled deoxynucleotide is repeated over and over. The equipment takes a measurement after each nucleotide is added, which makes it possible to determine the sequence of all the DNA fragments. Bioinformatics is later used to assemble the fragments into a single DNA sequence (de Anda-Jáuregui and Hernández-Lemus 2020; Vernon et al. 2019; Carrilho 2000; Schuster 2008; Dewey et al. 2013; Mamanova et al. 2010; Medvedev et al. 2009).
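The cycle-by-cycle reading described above can be sketched in Python. The model is idealized: every cluster incorporates exactly one labeled nucleotide per cycle, there are no read errors, and the color assigned to each base is an arbitrary assumption, as are the function names and cluster contents.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye assignment; real instruments use their own chemistry.
DYE_OF_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF_DYE = {dye: base for base, dye in DYE_OF_BASE.items()}


def image_cycle(cluster_templates: dict[str, str], cycle: int) -> dict[str, str]:
    """Return the color observed in each cluster for one synthesis cycle.

    The incorporated nucleotide is the complement of the template base at
    that position. Clusters whose template is exhausted give no signal.
    """
    colors = {}
    for cluster_id, template in cluster_templates.items():
        if cycle < len(template):
            incorporated = COMPLEMENT[template[cycle]]
            colors[cluster_id] = DYE_OF_BASE[incorporated]
    return colors


def sequence_by_synthesis(cluster_templates: dict[str, str],
                          n_cycles: int) -> dict[str, str]:
    """Run n_cycles of incorporate, image, cleave and collect the reads."""
    reads: dict[str, list[str]] = {cid: [] for cid in cluster_templates}
    for cycle in range(n_cycles):
        for cluster_id, color in image_cycle(cluster_templates, cycle).items():
            reads[cluster_id].append(BASE_OF_DYE[color])
        # Cleavage of the terminator and dye, and the wash step, would
        # happen here; in this model they have no state to update.
    return {cid: "".join(bases) for cid, bases in reads.items()}


# Two hypothetical clusters, each holding copies of one fragment.
clusters = {"cluster_1": "TACG", "cluster_2": "GGCA"}
reads = sequence_by_synthesis(clusters, n_cycles=4)
print(reads)
```

Because all clusters are imaged in every cycle, the reads of all fragments grow in parallel, one base per cycle, which is where the throughput advantage over fragment-by-fragment separation comes from.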

In NGS, electrophoretic separation was abandoned in favor of a different sequencing approach, but bioanalytical tools still make the whole process possible. The entire NGS process relies on microfluidic devices that afford lower reagent consumption and quicker analysis (França et al. 2002). Therefore, genomics remains closely tied to bioanalytical advances, and further advances in this field will greatly benefit genomics technologies (Zubritsky 2002). Regardless of the technology applied in DNA sequencing, genomics is currently a widespread science that has enabled several discoveries about how organisms work, as well as the discovery of different genetic markers for disease development, such as the use of the BRCA1 gene as a biomarker of breast cancer occurrence probability and of the TMSB15A gene as a biomarker for triple-negative breast cancer treatment progress (Davis et al. 2014; Martínez-Jiménez et al. 2020; Carrasco-Ramiro et al. 2017).

Transcriptomics, in parallel, is the evaluation and comparison of transcriptomes, i.e., the RNA transcripts of the DNA. This field remains highly attractive because RNA transcripts give information about the genes that are differentially expressed under a specific condition or treatment. Unlike DNA, which contains a vast proportion of non-coding sequences, the RNA transcripts of a given cell are more closely related to the condition and encode information that is going to be used in that specific scenario. Therefore, transcriptomics can be said to be closer to the phenotype (de Anda-Jáuregui and Hernández-Lemus 2020; Sandberg 2014).
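As an illustration of what "differentially expressed" can mean in practice, the sketch below computes a log2 fold change between transcript counts under a treatment and under a control. The gene names, the counts, the pseudocount, and the threshold of one log2 unit are all assumptions made for this example. A real analysis would also normalize for sequencing depth and test for statistical significance.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """Log2 ratio of treated to control counts; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical transcript counts: gene -> (treated, control).
counts = {"geneA": (7, 7), "geneB": (31, 7), "geneC": (3, 15)}

for gene, (treated, control) in counts.items():
    lfc = log2_fold_change(treated, control)
    if lfc >= 1.0:
        label = "up-regulated"
    elif lfc <= -1.0:
        label = "down-regulated"
    else:
        label = "unchanged"
    print(f"{gene}: log2FC = {lfc:+.2f} ({label})")
```

With these made-up counts, geneA is unchanged, geneB is four times more abundant after treatment, and geneC is four times less abundant.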

Although quite a few techniques permit RNA sequencing or RNA analysis directly, such as microarray technology, transcriptomics is commonly achieved by extraction of the RNA, followed by fragmentation of the RNA chain and reverse transcription of the RNA to cDNA (complementary DNA). After the cDNA is obtained, sequencing is performed by the same techniques explained previously for DNA sequencing. Recently, bioanalytical technologies have been employed in transcriptomics, allowing even single-cell transcriptome analysis. In this approach, the cells of an organism are separated by microfluidic devices, followed by cDNA synthesis and sequencing, bringing the science even closer to personalized medicine (de Anda-Jáuregui and Hernández-Lemus 2020; Sandberg 2014; Li et al. 2016).
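The fragmentation and reverse transcription steps can be mimicked on a toy sequence. In the sketch below, the RNA is cut into fixed-size pieces and each piece is converted into its first-strand cDNA by base pairing (A with T, U with A, G with C, C with G), written 5' to 3'. The function names, the example sequence, and the fixed fragment size are hypothetical. Real fragmentation is not this regular, and priming, second-strand synthesis, and adapters are omitted.

```python
# Base-pairing rule for reverse transcription: RNA base -> complementary DNA base.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def fragment(rna, size):
    """Cut an RNA string into consecutive pieces of at most `size` bases."""
    return [rna[i:i + size] for i in range(0, len(rna), size)]


def reverse_transcribe(rna):
    """Return the first-strand cDNA (5'->3') of an RNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


rna = "AUGGCCUAA"  # made-up transcript
print(reverse_transcribe(rna))                         # prints "TTAGGCCAT"
print([reverse_transcribe(f) for f in fragment(rna, 3)])  # prints ['CAT', 'GGC', 'TTA']
```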

Nevertheless, the sequencing of nucleic acids, whether DNA or RNA, still has an enormous impact in different fields, such as medicine and agriculture. Genomics and transcriptomics have been widely developed in crop species in order to understand which loci and genes are responsible for better growth, better adaptability to a given soil, or a better response to pathogens. As a result, we have several transgenic species of crop plants, which have yielded better harvests and impacted the food production market (Varshney et al. 2015).

It is interesting to observe that the recent growth in publications, clinical diagnoses, therapies, and agricultural products born from genomics and transcriptomics studies can be closely related to the development of better and more advanced bioanalytical tools, which made the progress of these fields possible and allowed scientists to transform the world.

3.2 Proteins–Proteomics

The term proteomics, first published in 1995, is defined as the study of the total protein content that a given genome can encode (Humphery-Smith 2015). The term is still in use, and this field of bioanalytical chemistry remains a promising one that attracts considerable research effort (Yates et al. 2009; Boggio et al. 2011). The importance of proteomic studies is related to the vital role of proteins in cell functions. Proteins act as building blocks for cells and are responsible for numerous functions, such as enzymatic reactions, signaling, transcription and translation processes, and structural support, making them decisive for cell regulation (Vaz and Tanavde 2019; Aizat and Hassan 2018).

Proteomics can be divided into four different areas: sequence, structural, functional and interaction, and expression proteomics. As the names suggest, each of them deals with different properties and aspects of proteins and uses different analytical tools for identification and evaluation. In short, sequence proteomics determines the amino acid sequences of proteins using chemicals to cleave and tag them. This was classically done by Edman sequencing (Edman and Begg 1967). Nowadays, mass spectrometry (MS), nuclear magnetic resonance (NMR), electrophoresis, and HPLC have replaced Edman's technique, reducing analysis time. Structural proteomics studies the structure of proteins and its implications for their functions. Many analytical methods are used to determine protein structure, with protein crystallization followed by X-ray diffraction and 2D NMR being the most common (Sali et al. 2003; Woolfson 2018; Paganelli et al. 2016). The third area of proteomics studies the functions of proteins and the interactions between proteins and other biomolecules, mostly through in vitro analysis and microarrays (LaBaer and Ramachandran 2005; Sutandy et al. 2013). The functional and interaction area is considered the most targeted area of proteomics. Finally, expression proteomics elucidates the expression of a protein in a very complex system containing many other biomolecules. This elucidation is essential to understand the role of a specific protein in cells, tissues, or entire organisms (e.g., the human body) (Kim et al. 2014), and protein expression is usually analyzed and quantified by MS.

Applications of proteomics are vast and include multiple areas such as diagnosis, drug screening, epidemiology, pathogenicity, oncology, food chemistry, agriculture, and others (Chandrasekhar et al. 2014; Aslam et al. 2017). Like every growing field, proteomics brings many challenges. However, as technology develops, new insights emerge, leading to advances in this area.

3.3 Metabolites–Metabolomics

The biological sciences are always trying to better understand the conditions of an organism, from a healthy, functional one to one affected by an infection, a genetic disease, or cancer. Genomics, transcriptomics, and proteomics have a lot to tell about these conditions. However, there is a set of compounds that is even closer to the phenotype, that is, the observable characteristics of the condition, for example, a cough, brain dysfunction, or tumor growth. The compounds that are the final products of cellular metabolism and, therefore, the closest step in the relationship between cellular functioning and phenotype are the metabolites (Ahmad 2008).

The full set of metabolites is called the metabolome and is composed of numerous molecules with different properties, including inorganic compounds, hydrophilic carbohydrates, organic acids, hydrophobic lipids, complex natural products, and volatile molecules, among others (Fiehn 2002). The comparison between the metabolome of a healthy organism and that of a sick one, known as metabolomics, can lead to the discovery of biomarkers for the detection of several types of diseases at early stages, allowing the best medical intervention for the patient (Lindahl et al. 2017). In the field of personalized medicine, biomarkers can also be used as prognostic markers, guiding physicians to the best therapeutic approach (Villas-Bôas and Gombert 2006; Nordström et al. 2006; Wishart 2007).

Metabolomics is an extremely useful tool in different fields, such as plant science. Here, metabolomics can lead to the discovery of different secondary metabolites released by a plant in the presence of a pathogen. Alternatively, metabolomics can show the differences in the metabolites of a species after a pesticide application. These discovered compounds can lead to new pharmaceuticals or to the use of lower-impact pesticides and are of great interest to natural products science (Funari et al. 2013; Razzaq et al. 2019; Jorge et al. 2016).

Metabolomics studies can be performed on several types of samples, e.g., in vitro (cell culture), in vivo (urine, blood, saliva, and feces), and from different plant parts (Funari et al. 2013; de Zawadzki et al. 2017). Normally, these studies are carried out in three main steps: sample preparation, data acquisition, and data analysis (Fig. 9). Sample preparation, especially the extraction step, is a crucial procedure in untargeted metabolomic approaches, that is, when the goal is not a specific set of metabolites but a comprehensive fingerprint of the metabolome (Duportet et al. 2012). Data acquisition can be performed by different analytical techniques, such as liquid chromatography coupled to mass spectrometry (LC-MS), gas chromatography coupled to mass spectrometry (GC-MS), capillary electrophoresis coupled to mass spectrometry (CE-MS), direct infusion mass spectrometry (DIMS), matrix-assisted laser desorption/ionization (MALDI), and nuclear magnetic resonance (NMR) (Mesquita et al. 2012; Zawadzki et al. 2018). Each technique has its own strengths and disadvantages, which are discussed in more detail in the next chapters of this book. However, it is essential to say that metabolomics studies would not be possible without the recent advances in analytical chemistry (Duportet et al. 2012; Mesquita et al. 2012; Kim and Verpoorte 2010; Ferreira et al. 2016).

Fig. 9

Metabolomics workflow. After sample collection, a few sample preparation steps are carried out in order to extract the metabolites and remove interferents. Next, the sample is analyzed by analytical tools, such as LC-MS, GC-MS, CE-MS, or NMR. Finally, the acquired data are analyzed by statistical protocols in order to avoid bias and establish the correct interpretation. Image created with Biorender (https://biorender.com/)

Data analysis comprises the statistical and chemometric evaluation of the data to differentiate the groups studied. This evaluation involves, for example, Principal Component Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA), amongst others. Once a metabolite is statistically shown to differentiate the conditions, it is considered a potential biomarker. The following step is to identify the compounds using metabolomics databases, such as HMDB and METLIN for the human metabolome and NuBBE and the Dictionary of Natural Products for plant metabolomics (Lindahl et al. 2017; Triba et al. 2015).
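To make the idea of PCA concrete, the following minimal sketch finds the first principal component of a small, made-up metabolite table and projects each sample onto it. It uses only the Python standard library and power iteration on the covariance matrix, which is one of several ways to obtain the leading component. The data, the function names, and the number of iterations are assumptions for illustration, and the scaling that is usual in metabolomics is omitted.

```python
import math


def center(rows):
    """Subtract the column mean from every value (mean-centering)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def scores(centered, loading):
    """Project each sample onto the component."""
    return [sum(x * l for x, l in zip(row, loading)) for row in centered]


# Made-up intensities of three metabolites: first three rows "healthy", last three "sick".
samples = [
    [1.0, 5.0, 2.0], [1.2, 4.8, 2.1], [0.9, 5.1, 1.9],
    [3.0, 2.0, 2.0], [3.2, 1.9, 2.1], [2.9, 2.2, 1.9],
]

centered = center(samples)
pc1 = first_component(covariance(centered))
for group, score in zip(["healthy"] * 3 + ["sick"] * 3, scores(centered, pc1)):
    print(f"{group}: PC1 score = {score:+.2f}")
```

In this toy data set the two groups fall on opposite sides of the first component, and the loadings in `pc1` indicate which metabolites drive the separation.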

Despite being one of the newest omics sciences, metabolomics has already proved extremely useful in the clinic, e.g., in the diagnosis and prognosis of different types of cancer, such as breast cancer, where glycerophosphocholine and glucose levels can be used to differentiate tumors from healthy tissue, and prostate cancer, where phosphocholine, lactate, and alanine can be used for diagnosis. Metabolomics discoveries can help physicians even further when the biomarkers are applied in new advanced bioanalytical tools. One of them is the MasSpec Pen, a handheld sampling probe capable of extracting metabolites from tissue directly at the operating table and sending them to a mass spectrometer kept inside the operating room. The MasSpec Pen technology allows the use of biomarkers to differentiate tumor tissue from healthy tissue, allowing surgeons to remove as little healthy tissue as possible (Spratlin et al. 2009; Zhang et al. 2017).

3.4 Bioinformatics and Chemometrics Tools

The vast amount of data generated by the advances in analytical instrumentation has brought new challenges for data analysis and interpretation. Bioinformatics had to be revisited in order to differentiate distinct groups of samples, molecules, or methods from one another. In addition, all study designs had to be properly modeled using statistical tools. Nowadays, chemometrics helps researchers to understand and categorize the data obtained. Moreover, unknown molecules can also be identified with the help of several databases (Bravo-Merodio et al. 2019; Rotroff 2020; Parry-Smith 2019).

Specialized software has been developed in recent years to help researchers perform data analysis, such as Galaxy® for genomics and transcriptomics data, msInspect® for proteomics research, and MetaboAnalyst® for metabolomics data analysis. In addition, software such as Statistica® and Minitab® is commonly used for the design of experiments in order to enable the development of better chromatographic separations and comprehensive sample preparation steps. The software packages mentioned above were developed to be user-friendly and do not require previous programming skills from the user. However, users with programming training can use software such as R® and MATLAB® to refine their data analysis by changing the mathematical algorithms, improving the model fit to their own data (Enot et al. 2011).

Bioinformatics and data analysis protocols vary broadly from lab to lab, and to date there is no consensus on the most suitable algorithms for all research data. Is it suitable to apply PLS-DA or a neural network to this set of data? Were the results genuinely unbiased? It is therefore highly important that analysts take some time to truly understand the statistical models they are considering, in order to keep the data interpretation consistent with the acquired data. To this end, in the next chapters of this book, the reader will find essential information on chemometrics and data analysis.
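One simple way to probe whether an apparent difference between two groups could have arisen by chance is a label permutation test. It is shown here only as an illustration and is not a procedure prescribed in this chapter. The sketch shuffles the group labels many times and counts how often the shuffled difference in means is at least as large as the observed one. The function names, the data, the number of permutations, and the fixed random seed are assumptions made for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often random relabeling gives a mean difference >= the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up intensities of one candidate biomarker in two groups of five samples.
healthy = [1.0, 1.1, 0.9, 1.2, 1.0]
sick = [3.0, 3.1, 2.9, 3.2, 3.0]
print(permutation_p_value(healthy, sick))
```

A small value suggests that the separation is unlikely to be an artifact of how the samples happened to be labeled. The same relabeling idea is often applied to the performance of a whole classification model rather than to a single variable.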

In summary, all the omics fields have contributed significantly to the advance of new therapies, diagnostic tools, better crop plantations, and food quality control, and even to the concept of personalized medicine. It is essential to state that none of the fantastic advances in the omics fields would have been possible without equal advances in bioanalytical technologies. Improvements in separation techniques, including better chromatographic columns and more durable pumps, the evolution of microfluidic devices, and enhancements in detectability and separation in MS and ion-mobility MS and in NMR and chromatographic NMR were of fundamental importance for the advances in omics sciences, which illustrates the importance and relevance of the other chapters of this book.