Introduction

The knowledge derived from genomic and computational technologies is growing geometrically. Making sense of this avalanche of data is closely linked to the formidable development of bioinformatics. By enabling the overall assessment of this extraordinary amount of data, bioinformatics has considerably accelerated scientific discovery. One consequence of this growth is a large supply of products, services, and information, so that keeping up to date with, locating, and using the latest innovations has become a full-time activity.

Although we initially tried to create a complete profile of all available bioinformatics resources, it quickly became evident that the dynamism and constant updating of this field put that goal out of reach. We therefore divided the resources into four categories: sequence analysis software, protein structure prediction software, online resource servers, and, finally, a list of places of interest on the Internet that can shorten search time. We chose these categories because we believe that together they cover the central dogma of molecular biology in a comprehensive way.

Bioinformatics, as a scientific area, gathers techniques and tools from three subjects: molecular biology, the source of the information to be analyzed; informatics or computer science, which provides the hardware for analysis and the networks to share the results; and mathematics, the origin of the algorithms used in data analysis. The interrelationship of these three areas creates the basis for bioinformatics applications in molecular biology, as can be seen in the following diagram (Li et al. 2013) (Fig. 1).

Fig. 1 Relationship of biological “-omics” with bioinformatics (Li et al. 2013)

Many reviews of bioinformatics tools have been published, but in this review we want to present, in a simple way, the fundamental and most useful tools for the life science researcher, integrating several tools that together cover all the omics. These tools are used from the design of experiments, through the acquisition and deposit of biological data, to the mining of these data in order to deduce phenotypes and their interrelationships, with applications in molecular cloning, pharmacy, and medicine. This mini-review should therefore be useful for biotechnology researchers as well as for agronomists, zootechnicians, ecologists, pharmacists, and medical professionals (Martins et al. 2014, 2019).

Determining the order of nucleotides in DNA is essential for basic biological research and has several important applications in biotechnology. The high throughput of modern DNA sequencing technologies has made possible the complete sequencing of many genomes, including the human genome. The first sequencing method, the Sanger sequencing technique, is based on the selective incorporation of chain-terminating dideoxynucleotides (ddNTPs) by DNA polymerase; its automated implementation with capillary electrophoresis was developed by Applied Biosystems (namely the AB370). These automated instruments, with their significant sequencing capacity, were the main tool in the sequencing of various genomes, including the human genome. These first genome projects were, in turn, a stimulus for the development of new and powerful sequencing platforms, called next-generation sequencing (NGS) (Heather and Chain 2016).

Next-generation sequencing systems

Next-generation sequencing (NGS) is a high-throughput methodology that allows massive base-pair sequencing of DNA or RNA samples. It makes a large number of applications possible, including the full sequencing of numerous genomes and the study of gene expression profiles, epigenetic changes, and mutations, as well as the molecular analyses that underpin the future of personalized medicine (Goldman and Domschke 2014).

NGS systems include multiple platforms. The so-called second generation of sequencers comprises platforms with different approaches and sequencing capacities, such as Life Technologies’ SOLiD/Ion Torrent PGM, Illumina’s Genome Analyzer/HiSeq 2000/MiSeq, and Roche’s GS FLX Titanium/GS Junior. In the third generation of sequencers, the most popular platform is single-molecule real-time (SMRT) sequencing, a parallelized single-molecule DNA sequencing method. Each of the four DNA bases is attached to one of four different fluorescent dyes. When a nucleotide is incorporated by the DNA polymerase, a detector registers the fluorescent signal of the incorporation, and the base call is made according to the corresponding fluorescence of the dye. Other platforms, already belonging to the fourth generation of sequencers and based on nanopores, are being developed to generate more data in less time and at lower cost. Whichever platform is used, millions of data points are generated in hours, so obtaining data is no longer the problem; this has led to a paradigm shift in which data processing, storage, and analysis become the most relevant tasks. It is at these points that bioinformatics, with its ability to analyze large amounts of data with diverse objectives, assumes its essential role, also considering that each of the platforms mentioned incorporates a series of bioinformatics tools for processing its output data (Goldman and Domschke 2014; Kulski 2016).

Primary analysis of DNA sequences

The primary analysis of DNA sequences is essential in the daily life of the biotechnology laboratory: in the detection of mutations, the establishment of phylogenies, the elaboration of restriction maps for cloning, and the construction of gene-silencing cassettes used to study the role of genes in cell metabolism.

Among genome analysis software, several program packages accompany the entire process, from receiving the sequencer traces to publishing the data in online databases. These features, along with free access for academics, file compatibility, and date of conception, were the main factors in the choices made here.

Many of the services provided by these programs are also offered by programs available online. The disadvantage is that each query requires a network connection; the advantage is that these online resources are updated regularly.

Staden package (http://staden.sourceforge.net/) (Bonfield and Whitwham 2010; Rodger et al. 2003a, b)

This very complete program package for nucleotide sequence analysis is free for students and researchers and can be requested by mail or obtained directly over the network. Staden is very powerful and lends itself to automated data processing; it is not very intuitive and requires some learning, but it is certainly an excellent work tool.

The Staden package was developed at the Medical Research Council (MRC) Laboratory of Molecular Biology, Cambridge, England, by Rodger Staden’s group. The package was converted to open source in 2004, and several new versions have been released since.

The authors describe the current version of the sequence analysis package developed at the MRC Laboratory of Molecular Biology, which has come to be known as the “Staden package”: “the package covers most of the standard sequence analysis tasks such as restriction site searching, translation, pattern searching, comparison, gene finding, and secondary structure prediction, and provides powerful tools for DNA sequence determination.”

This package contains the following programs:

  • Gap4 and Gap5: These programs are the main tools of the package; they perform sequence assembly, contig joining, assembly checking, and read-pair analysis, and allow the assembled sequences to be edited (Fig. 2);

  • Pregap4: Receives and analyzes the information coming from the sequencers, acting as the data input port for the package;

  • Trev: A fast and effective viewer for sequence traces in ABI, ALF, or SRF formats;

  • Trace diff: Automatically locates mutation points by comparing the sequence under study with reference sequences. It supports any number of sequences and allows the results to be visualized in Gap4;

  • Sip4: Compares pairs of sequences in various ways, often displaying the results graphically. It allows comparisons between nucleotide sequences, between protein sequences, and between protein and nucleotide sequences;

  • Nip4: Analyzes nucleotide sequences to find genes and restriction sites; it also performs translation, etc.
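Restriction site searching, one of the tasks listed above, can be illustrated with a minimal sketch. The code below is not taken from the Staden package: the function names and the three-enzyme table are assumptions chosen for illustration, and only the forward strand is scanned, which is sufficient here because the three recognition sites are palindromic.

```python
# Illustrative sketch only; not the Staden/Nip4 implementation.
# Recognition sites of three common type II restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, site):
    """Return the 1-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)  # report positions in 1-based coordinates
        start = sequence.find(site, start + 1)  # allow overlapping matches
    return positions


def restriction_map(sequence):
    """Map each enzyme name to the list of positions where its site occurs."""
    return {name: find_sites(sequence, site)
            for name, site in RECOGNITION_SITES.items()}


if __name__ == "__main__":
    for enzyme, hits in restriction_map("ttGAATTCaaGGATCCggGAATTC").items():
        print(enzyme, hits)
```

A dedicated package additionally handles degenerate recognition sequences, cut positions, and circular molecules, none of which are covered by this sketch.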

Fig. 2 Gap Interface (Staden Package Handbook) (Bonfield and Whitwham 2010; Rodger et al. 2003a, b)

pDRAW32 (https://www.acaclone.com/)

pDRAW32 is a free program for the Windows platform with a pleasant and intuitive interface, available at https://www.acaclone.com.

With this program, it is possible to perform various operations, such as annotating the DNA under study, planning DNA cloning, editing sequences, selecting restriction enzymes, exporting graphics and text, calculating the optimal PCR temperature, and calculating homologies between two DNA fragments; it also contains scientific help files. It is possibly one of the best programs for cloning strategies: extremely intuitive, easy to use, and capable of producing very attractive, simple, and complete images.

GenBeans (http://www.genbeans.org/)

GenBeans is an integrated stand-alone platform for bioinformatics based on NetBeans (developed by Apache Software Foundation) open-source software. It focuses on molecular biology and provides a fully integrated toolbox in a rich, easy-to-use graphical interface for analyzing and visualizing sequences. Another interesting program based on NetBeans is geneinfinity (http://www.geneinfinity.org/) that we describe in Table 1.

Table 1 Tools for biological sequences characterization

DNASTAR™ (https://www.dnastar.com/)

Another software package that has been gaining wide use is DNASTAR™; it includes programs with which we can edit and compare sequences, deduce physicochemical characteristics, and design genetic constructs, restriction maps, etc.

Serial Cloner (http://serialbasics.free.fr/Serial_Cloner.html)

This program was developed at the Institut Curie by Franck Perez. Serial Cloner is designed to provide molecular biology software for Macintosh and Windows users. It reads and writes DNA Strider-compatible files and imports and exports files in the universal FASTA format. It consists of graphical display tools and simple interfaces that help you analyze sequences and build constructs in a very intuitive way.

“The user interface is relatively simple to operate, in that it is within the reach of any user with advanced biology knowledge, who should be especially impressed with the huge amount of options available. With Serial Cloner you can, among other things, join DNA fragments obtained through PCR, manipulate the shRNA, or simply assemble fragments of different chains.”

Sequencher (https://www.genecodes.com/)

“Gene Codes Corporation is a privately-owned international firm, which specializes in bioinformatics software for genetic sequence analysis. Its flagship software product, Sequencher, is a sequencing software used throughout the world. Its targeted use is by researchers at academic and government labs as well as for biotechnology and pharmaceutical companies for DNA sequence assembly.”

Sequencher is a simple but useful program that allows us to:

  • Analyze nucleic acid sequences in editing modes;

  • Align sequences, with optional visualization of the corresponding chromatograms;

  • Perform manipulations and restriction maps.

The latest release of Sequencher highlights Gene Codes’ goal of providing researchers with powerful, easy-to-use DNA analysis software tools. Sequencher 5.3 adds RNA-Seq analysis to its long list of DNA sequence analysis features, as well as improvements to Sequencher Connections, its newest architecture for DNA sequence analysis.

FastPCR (https://primerdigital.com/fastpcr.html)

FastPCR is an integrated tool for PCR primer or probe design, in silico PCR, oligonucleotide assembly and analysis, alignment, and repeat searching, developed by PrimerDigital (Kalendar et al. 2017a, b, c). PrimerDigital is a biotechnology company specializing in high-quality primer and probe design services and in software development, delivering state-of-the-art PCR software. From our wide experience in using FastPCR, we agree with the company’s description of this software, which we summarize: “The FastPCR software is an integrated tools environment that provides comprehensive and professional facilities for designing any kind of PCR primers for standard, long-distance, inverse, real-time PCR (TaqMan, LUX-primer, Molecular Beacon, Scorpion), multiplex PCR, Xtreme Chain Reaction (XCR), group-specific (universal primers for genetically related DNA sequences) or unique (specific primers for each from genetically related DNA sequences), overlap extension PCR (OE-PCR) multi-fragments assembling cloning and Loop-mediated Isothermal Amplification (LAMP); single primer PCR (design of PCR primers from close located inverted repeat), automatically detecting SSR loci and direct PCR primer design, amino acid sequence degenerate PCR, Polymerase Chain Assembly (PCA), design multiplexed of overlapping and non-overlapping DNA amplicons that tile across a region(s) of interest for targeted next-generation sequencing (Molecular Tagging) and much more.”

Primer design has to be done very rigorously to guarantee the future of a PCR project: design errors may be noticed only after much effort and money have already been spent. That is why it is recommended to use software that allows the reaction conditions and primer design to be established in silico beforehand; FastPCR is free, user-friendly, and extremely versatile software that helps avoid wasting time and money.
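As a simple illustration of the kind of in silico check that precedes a PCR experiment, the sketch below computes the GC content of a primer and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C. This is a rule of thumb for short oligonucleotides and is not the method implemented in FastPCR; dedicated tools generally rely on more elaborate thermodynamic models. The function names are assumptions made for this example.

```python
# Illustrative sketch only; not the FastPCR algorithm.

def gc_content(primer):
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature (degrees C) by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); a rough estimate for short primers.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "ATGCATGCATGC"  # hypothetical primer used only as input data
    print(f"GC content: {gc_content(primer):.0%}")
    print(f"Wallace Tm: {wallace_tm(primer)} C")
```

In practice, the two primers of a pair are usually chosen so that their estimated melting temperatures are close to each other.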

Biological sequences characterization

Annotation is the process of characterizing genes and their biological products in a DNA sequence. This process had to be automated because the number of genes is too large to be annotated by hand. Automated annotation was made possible by the fact that genes have recognizable start and stop regions. Sequence analysis refers to the study of the different characteristics of molecules such as nucleic acids or proteins that underlie their specific functions. In the first instance, the sequences of the molecules are deposited in public biological databases (Mehmood et al. 2014).

Then, several tools can be used to predict, with high precision, characteristics related to their function, structure, or evolutionary history, or to identify homologous counterparts. These analyses are quite popular owing to their many applications in the biological sciences, their simplicity, and the quantity of information they provide about the gene or protein under study. Table 1 presents a list of tools for the characterization of biological sequences (Mehmood et al. 2014).

A standard procedure for deducing genetic information consists of obtaining fragments of genomic DNA, by mechanical fragmentation or with restriction endonucleases, or of cDNA, obtained from messenger RNA with the enzyme reverse transcriptase; these fragments can be cloned into cloning vectors for sequencing with the help of programs like pDRAW32 (https://www.acaclone.com/). The cloned sequences obtained are separated from the vector sequences (which served as a reference for the design of the sequencing primers) with the simple but useful VecScreen program (https://www.ncbi.nlm.nih.gov/tools/vecscreen/). The partial sequences obtained by sequencing multiple clones can be assembled in programs such as Sequencher (https://www.genecodes.com/) to obtain larger contigs. These sequences can then be searched for open reading frames (ORFs) with programs such as the ORF finder described in Table 1 (https://www.ncbi.nlm.nih.gov/orffinder/). The homology of the proteins deduced from the ORFs to protein sequences deposited in the databases is then investigated using programs such as BLAST, FASTA, or CLUSTAL. Finally, the physicochemical characteristics, the 3D structure, and the fate of the proteins in the cell, as well as their function, are established with the programs and methodologies described below.
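The ORF search step in this workflow relies on the recognizable start and stop signals of genes. A minimal sketch of the idea follows, assuming the standard genetic code, ATG as the only start codon, and a hypothetical minimum length parameter (`min_codons`); it is not the algorithm of the NCBI ORF finder, which supports alternative genetic codes and start codons.

```python
# Illustrative sketch only; not the NCBI ORFfinder implementation.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, protein) for every ATG...stop ORF found.

    All six reading frames are scanned; `frame` is the 0-based offset on the
    scanned strand, and ORFs shorter than `min_codons` residues are skipped.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            protein = None  # None means no start codon has been seen yet
            for i in range(frame, len(s) - 2, 3):
                aa = CODON_TABLE.get(s[i:i + 3], "X")  # X for ambiguous codons
                if protein is None:
                    if aa == "M":
                        protein = "M"
                elif aa == "*":
                    if len(protein) >= min_codons:
                        orfs.append((strand, frame, protein))
                    protein = None
                else:
                    protein += aa
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGGCTAAATAGCC"):
        print(orf)
```

The deduced protein sequences returned by such a search are the natural input for the homology searches mentioned above.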

Prediction of subcellular protein location

Subcellular localization prediction uses computational methods to infer, from the protein sequence, the fate of a protein in the cell.

Several software tools are publicly available that use different features to predict the location of proteins (amino acid composition, signal peptide composition, and physicochemical composition, among others). This prediction is a very important part of the bioinformatics prediction of protein function and of genome annotation (Nielsen et al. 2019).

The following software can be used for protein location prediction:

As written on the website “SignalP 3.0 server predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks and hidden Markov models” (Bendtsen et al. 2004a, b).

“Cello2go is a publicly available, web based system for screening various properties of a targeted protein and its subcellular localization” (Yu et al. 2014).

“The method, LocTree2, predicts the location of all proteins in all areas of life. Similar to the previous method, LocTree, incorporates a system of Support Vector Machines organized hierarchically to mimic the mechanism of protein trafficking in cells” (Goldberg et al. 2012).

Euk-mPLoc 2.0 predicts the subcellular localization of eukaryotic proteins, including those with multiple sites (Chou and Shen 2010).

ESLpred is a tool for predicting the subcellular localization of proteins using support vector machines. The predictions are based on dipeptide and amino acid composition and physicochemical properties (Horler et al. 2009).

The SecretomeP 2.0 server produces ab initio predictions of non-classical, i.e., not signal peptide-triggered protein secretion. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into the final secretion prediction (Bendtsen et al. 2004a, b).
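Several of the predictors above take amino acid and dipeptide composition as input features. The sketch below shows how such feature vectors can be computed from a protein sequence; it only builds the features and does not train any classifier. The function names are assumptions for illustration and are not the code of ESLpred or of any other server mentioned here.

```python
# Illustrative sketch only; feature extraction, no classifier.
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(protein):
    """Return 20 values: the fraction of each standard residue."""
    protein = protein.upper()
    counts = Counter(protein)
    return [counts[aa] / len(protein) for aa in AMINO_ACIDS]


def dipeptide_composition(protein):
    """Return 400 values: the fraction of each ordered residue pair."""
    protein = protein.upper()
    pairs = Counter(protein[i:i + 2] for i in range(len(protein) - 1))
    total = len(protein) - 1  # number of overlapping dipeptides
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    peptide = "MKKA"  # hypothetical toy sequence
    print(amino_acid_composition(peptide))
    print(sum(dipeptide_composition(peptide)))
```

Fixed-length vectors of this kind (20 and 400 values, respectively) are convenient because they do not depend on the length of the protein, which is one reason they are commonly used as input to machine learning methods such as support vector machines.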

Proteins characterization

After decoding the open reading frame of a gene, a series of bioinformatics tools can be used to characterize the deduced protein sequence. Starting from a nucleotide sequence, the tools on the Expasy Proteomics Server website (http://expasy.org/tools) allow us to identify and characterize proteins; identify motifs, patterns, and profiles; infer their stability, cell location, or function; make predictions of secondary and tertiary structures; look for similar sequences deposited in databases and compare them; and establish phylogenetic relationships.

The physicochemical characteristics of proteins can be determined with PROSITE (http://prosite.expasy.org/scanprosite/), with the neural network system of the Pôle BioInformatique Lyonnais/Network Protein Sequence Analysis, or with the DiANNA 1.1 application (http://clavius.bc.edu/~clotelab/DiANNA/); post-translational modifications can be predicted on the Center for Biological Sequence Analysis website (http://www.cbs.dtu.dk/services).

ProDom is a comprehensive database of protein domain families generated from the global comparison of all available protein sequences. Recent improvements include the use of three-dimensional (3D) information from the SCOP database and a completely redesigned web interface (http://www.toulouse.inra.fr/prodom.html).

Phylogenetic analysis

Phylogenetic analyses are procedures used to reconstruct the evolutionary relationships among a group of related molecules or organisms and to predict certain characteristics of a molecule whose functions are not known (Mehmood et al. 2014). The underlying principle of phylogeny is to group living organisms according to their degree of similarity. Phylogenetic comparative analysis is usually used to control for the lack of statistical independence between species. Phylogenetic tools are commonly used to test various evolutionary hypotheses, and they are indispensable to functional genomics (Mehmood et al. 2014; Khan et al. 2014).

Molecular Evolutionary Genetics Analysis (MEGA) is computer software for conducting statistical analyses of molecular evolution and for constructing phylogenetic trees, and it is highly recommended for sequence alignment and phylogeny inference (https://www.megasoftware.net/). We share the opinion of Kumar et al. (2016) that MEGA includes a large repertoire of programs for assembling sequence alignments, inferring evolutionary trees, estimating genetic distances and diversities, inferring ancestral sequences, computing time trees, and testing selection. Over the last 25 years, MEGA’s use in evolutionary analysis has been cited in over one hundred thousand studies in diverse biological fields (Kumar et al. 2018).

  • MOLPHY: Molecular phylogenetic analysis tool (https://sbgrid.org/software/titles/molphy).

  • PAML: Package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood (https://bio.tools/paml).

  • PHYLIP: The PHYLogeny Inference Package (PHYLIP) is one of the most useful and widely used free packages of programs for inferring evolutionary trees (phylogenies). Its author is Joseph Felsenstein, Professor at the University of Washington, Seattle. It consists of 35 programs that include parsimony, distance matrix, and maximum likelihood methods, as well as the calculation of statistical support for clades (bootstrapping) and consensus trees, based on the following types of data: molecular sequences, gene frequencies, restriction sites and fragments, and distance matrices (http://evolution.genetics.washington.edu/phylip.html).

  • Jalview: Program for multiple sequence alignment editing (https://www.jalview.org/).
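Distance matrix methods, mentioned above, start from a table of pairwise distances between aligned sequences. The sketch below computes the simplest such measure, the p-distance (the proportion of aligned sites at which two sequences differ), for a small alignment. It is an illustrative assumption of how such a matrix can be built, not the code of MEGA or PHYLIP, which also offer corrected distances based on substitution models; the function names and the gap-handling choice (sites with a gap in either sequence are ignored) are assumptions for this example.

```python
# Illustrative sketch only; not the MEGA or PHYLIP implementation.

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are excluded.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Return {(name_i, name_j): p-distance} for every pair of sequences."""
    names = list(alignment)
    return {(i, j): p_distance(alignment[i], alignment[j])
            for i in names for j in names}


if __name__ == "__main__":
    # Hypothetical toy alignment used only as input data.
    alignment = {"a": "ACGT", "b": "ACGA", "c": "AC-T"}
    for pair, d in distance_matrix(alignment).items():
        print(pair, round(d, 3))
```

A matrix of this kind is the input expected by tree-building methods such as neighbor joining.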

Biological sequence databases

Biological sequence databases are vast collections of biological data, such as nucleotide sequences, protein sequences, and macromolecular structures. The information stored in these databases is not only important for future applications but also serves as a tool for primary sequence analysis. The need to submit and store this information so that it is freely available to the scientific community led to the development of several databases worldwide. Because databases contain varied information, they are classified as primary or secondary according to the information stored. Primary databases are composed of information derived directly from basic scientific research such as sequencing. SWISSPROT, UniProt, GenBank, and PDB are examples of primary databases. Secondary databases contain information derived from the interpretation of information stored in primary databases. SCOP, CATH, PROSITE, and eMOTIF are examples of secondary databases (Koonin and Galperin 2003) (Table 2).

Table 2 Biological databases

Proteins structure prediction tools

Proteins are composed of polypeptides, which in turn are polymers of amino acids that fold into a three-dimensional (3D) structure. Correct folding is a prerequisite for any protein to perform its biological function; therefore, to understand the functions of a specific protein, information about its three-dimensional structure is needed (see Table 3).

Table 3 Tools for analyzing protein structures and functionality

Molecular interactions

Proteins rarely perform their functions in isolation; they interact with other molecules to carry out a particular process. Understanding how biomolecules interact with other molecules can be exploited in purification techniques as well as in drug development. It is also essential to understand the interactions between molecules in order to elucidate the biological functions of a molecule. For example, interactions between proteins have a key role in cellular activities such as signaling, transport, metabolism, and various biochemical processes (Table 4).

Table 4 Tools for studying molecular interactions

Molecular dynamics simulations

Biological activity is the result of molecular interactions. This behavior of molecules can be studied with the use of bioinformatics tools, usually referred to as simulation tools for molecular dynamics. They aim to provide detailed information on the dynamics of processes that occur in biological systems (refer to Table 5).

Table 5 Molecular dynamics simulation tools

Drug design

Before bioinformatics tools were available, scientists relied on chemistry, pharmacology, and the clinical sciences to discover new compounds. These traditional processes are time-consuming and costly. Bioinformatics facilitates this complex process and plays a vital role in the discovery and design of new drugs, because molecules can be analyzed far more quickly on a computer than by experimentation (Lekamwasam and Liyanage 2013; Chordia and Kumar 2018) (refer to Table 6).

Table 6 Databases for target drugs

Integrative bioinformatics modules

As already mentioned, the amount of biological data is growing exponentially; these data are spread over a multitude of public and private repositories and are stored in different formats. This makes it difficult to search for these data and to carry out the analyses necessary to deduce new knowledge from the set of deposited data. Integrative bioinformatics attempts to solve this problem by providing unified access to life science data.

The several directions which may lead to breaking the bottleneck of Integrative Bioinformatics are described by Chen et al. (2019) and include:

  • “Integration of multiple biological data towards systems biology. Different omics data is reflecting different aspects of the biological problem. Often, to solve a problem, there are many different methods developed by many groups. These methods may perform differently, some good, some bad. Combing with big data, and other approaches, artificial intelligence (AI) has been successfully applied in bioinformatics, especially in the field of biomedical image analysis;

  • Computing infrastructure development. Integrative Bioinformatics in the big data era requires a more advanced IT environment. To facility the related computing and visualization demands, both hardware (e.g. GPU) and software (e.g. Tensor flow) are developing. Supercomputers are used. Cloud services are provided by more and more institutes and big companies.”

Many ready-made professional commercial bioinformatics programs are offered by the companies that develop modern sequencing technologies. Bioinformatics groups usually prefer to use ready-made modules and to write scripts that pass data between them.

There are therefore two parallel approaches to data analysis: ready-made commercial products, and scripts that link different ready-made mini-programs. Both are necessary, so the most important thing is that ready-made programs and modules are supported and kept up to date. In the following list, we present the most important modules for integrated bioinformatics:

  • Unipro UGENE: UGENE is free bioinformatics software for multiple sequence alignment, genome sequencing data analysis, and amino acid sequence visualization. It is multiplatform open-source software whose main goal is to assist molecular biologists without much expertise in bioinformatics in managing, analyzing, and visualizing their data. It provides visualization modules for biological objects such as annotated genome sequences, next-generation sequencing (NGS) assembly data, multiple sequence alignments, phylogenetic trees, and 3D structures. Availability and implementation: UGENE binaries are freely available for MS Windows, Linux, and Mac OS X (Okonechnikov et al. 2012);

  • Vista: “Vista is a comprehensive suite of programs and databases for comparative analysis of genomic sequences. There are two ways of using VISTA - you can submit your own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species” (Frazer et al. 2002). http://genome.lbl.gov/vista/index.shtml;

  • Qlucore: Qlucore Omics Explorer (QOE) is next-generation bioinformatics software for research in the life sciences. It is built for fast and easy analysis of many different types of data, and a wide range of application areas is supported: with Qlucore Omics Explorer, you can examine and analyze data from gene expression experiments, DNA methylation data, proteomics data, and next-generation sequencing (NGS) data (https://www.qlucore.com/);

  • CIBI: The CRCM’s Integrative BioInformatics (Cibi) is a technological platform of the Centre de Recherche en Cancérologie de Marseille. The Cibi platform offers a wide range of expertise in bioinformatics (large-scale data integration, development of specific analyses) and develops state-of-the-art bioinformatics pipelines, such as pipelines for NGS (next-generation sequencing) data analysis and integration (ChIP-Seq, RNA-Seq, SC-RNA-Seq, variant analysis for research and cancer diagnostics) (https://cibi.marseille.inserm.fr/);

  • iMAP: iMAP is an integrated bioinformatics and visualization pipeline for microbiome data analysis. According to Buza et al., the iMAP tool wraps functionalities for metadata profiling, quality control of reads, sequence processing and classification, and diversity analysis of operational taxonomic units. This pipeline is also capable of generating web-based progress reports for enhancing an approach referred to as review-as-you-go (RAYG). The iMAP pipeline integrates several functionalities for better identification of microbial communities present in a given sample. The pipeline performs in-depth quality control that guarantees high-quality results and accurate conclusions. The vibrant visuals produced by the pipeline facilitate a better understanding of the complex and multidimensional microbiome data (Buza et al. 2019);

  • MIGenAS: Migenas is a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max-Planck Society available at www.migenas.org (Rampp et al. 2006);

  • Methy-Pipe: Methy-Pipe is an integrated bioinformatics pipeline for whole-genome bisulfite sequencing data analysis. According to Jiang et al., Methy-Pipe uses the Burrows-Wheeler transform (BWT) algorithm to directly align bisulfite sequencing reads to a reference genome and implements a novel sliding window-based approach with statistical methods for the identification of differentially methylated regions (DMRs). Methy-Pipe is a useful pipeline that can process whole-genome bisulfite sequencing data in an efficient, accurate, and user-friendly manner. Software and test dataset are available at http://sunlab.lihs.cuhk.edu.hk/methy-pipe/ (Jiang et al. 2014);

  • IGV: The Integrative Genomics Viewer (IGV) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data. It supports flexible integration of all the common types of genomic data and metadata, investigator-generated or publicly available, loaded from local or cloud sources (https://igv.org/);

  • Bioconductor: Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language and is open source and open development. It has two releases each year, and an active user community. Bioconductor is also available as an AMI (Amazon Machine Image) and Docker images (https://www.bioconductor.org/);

  • Geneious: Geneious is a very useful and popular software platform for DNA, RNA, and protein sequence alignment, assembly, and analysis, integrating bioinformatics and molecular biology tools into a simple interface. These tools are created by the company Biomatters, headquartered in New Zealand with offices in the USA and users in 125 countries worldwide; its solutions enhance productivity in more than 4000 universities, research institutes, and businesses. Biomatters creates powerful, integrated, and visually appealing bioinformatics solutions, with a strong emphasis on ease of use and overall user experience (https://www.geneious.com/).

Trends and future of bioinformatics

Bioinformatics has developed significantly since the establishment of molecular cloning methodologies and the automation of DNA sequencing methods. With the development and application of next-generation sequencing platforms, large-scale sequencing of genomes and transcriptomes began, which has driven the development of bioinformatics methodologies and tools to a level that goes beyond academic centers and includes medical biotechnology, gene therapy, agricultural biotechnology, animal biotechnology, environmental biotechnology, and forensic biotechnology. Currently, bioinformatics is widely applied in genomics, proteomics, metabolomics, transcriptomics, and molecular phylogenomics. The development of biomarkers for the creation of safer and more personalized drugs is leading to greater development and use of bioinformatics. The sequencing of personal genomes and metagenomics projects will increase significantly in the coming years, with the consequent involvement of bioinformatics. We think that the future of bioinformatics will involve specialization in different areas that go into greater scientific depth, down to the level of nanopores and even the atom itself.

Conclusion

Bioinformatics is a relatively new discipline that has progressed very quickly in recent years. It makes it possible to test hypotheses virtually, which provides better knowledge before proceeding with expensive studies. Despite the development of many tools for genomic and proteomic analysis, structure inference, drug design, and molecular dynamics simulation, none can be considered the “perfect” tool. Bioinformatics tools nevertheless provide increasingly accurate results, which allows reliable interpretations. Perspectives in the field of bioinformatics include contributions to understanding the human genome, leading to the discovery of new drugs and specific therapies. It is essential that bioinformatics and other disciplines move side by side to understand biological systems and to support the consequent improvement of human well-being. During the first years of biotechnology, the most important task was to obtain biological data. With the development of next-generation sequencing (NGS) methods and techniques, the paradigm shifted to the ability to analyze the large amount of data resulting from the sequencing of genes and genomes. Now, with the development of bioinformatics tools in recent years, the most important thing is to know what we want from our research, choose the appropriate tool, work hard with it, and know how to correctly interpret the results it provides.