Introduction

Advances in science and genetic engineering are needed to drive novel methods for synthesizing microbial products. Surfactants of microbial origin are plausible substitutes for chemically synthesized ones, given the growing awareness of the need to protect the environment [1]. Most commercial synthetic surfactants are toxic, carcinogenic, and environmentally unfriendly [2]. In recent years, research efforts in the scientific and industrial communities have centered on environmentally friendly bio-products, namely biosurfactants (BioSs). BioSs are surface-active compounds (heterogeneous groups of secondary metabolites) that reduce interfacial tension (IFT) and surface tension (ST), which gives them potential benefits over chemical surfactants [3, 4]. BioSs are advantageous over synthetic surfactants owing to their effectiveness, specificity, high biodegradability, low toxicity, biocompatibility, and usefulness in pollution control. These attributes allow their biotechnological application in cosmetic and pharmaceutical products, food additives, agriculture, and the petroleum industry [1, 5, 6]. Other promising properties include emulsification, excellent detergency, foaming, enhanced oil recovery, and metal sequestration. They also play essential roles in microbial adhesion to interfaces, modification of surface-active phenomena, spreading, wetting, flocculation/dispersion, flotation, and solubilization of hydrophobic organic compounds [5, 7].

However, factors such as low product yields, high costs, severe foaming, and complex cultivation set-ups limit the industrial and biotechnological applications of BioSs [8]. The connections between the production conditions of these biomolecules, their structures, and their capabilities are central to their improvement for industrial, environmental, and biotechnological use [9]. Thus, a central part of future research is the improved synthesis of BioSs with high-value properties through genetic engineering or modification. Accordingly, the low product yields that limit biotechnological and environmental applications have lately drawn increasing attention to genetic engineering as a means of improving BioS synthesis [8]. BioS synthesis by certain microorganisms may be ascribed to specific genes or enzymes that are activated in the presence of hydrocarbons and other carbon substrates [10]. Notably, bacteria are capable of synthesizing BioSs while degrading hydrocarbon compounds. Nevertheless, relatively little is known about the genetics, molecular features, and functional characterization of the degradative and BioS synthetic systems [11,12,13].

Notably, the industrial and biotechnological applications of genetically enhanced and hyper-producing recombinant strains have not been properly established, even though hyper-BioS producers have been reported [14]. Hyper-producing microbial strains are developed by using new-age biotechnology to screen for potent proteins and enzymes in microorganisms, which may help improve their output of bio-products [10, 15]. This is achieved through several approaches, including the introduction or manipulation of genetically engineered microorganisms, single/naked genes, and operons; in silico computation for the discovery of novel metabolic pathways; bio-prospecting using high-throughput screening; genome mining; and metagenomic screening [15, 16]. Thus, a significant focus of BioS research is the use of conventional genetic improvement approaches. Genetic engineering involves modifying microbial genetic material to obtain new or improved product capabilities of biotechnological and environmental importance. Improved BioS surface activity and more cost-effective BioS synthesis are also the general objectives in developing hyper-producing microorganisms [17]. Different hyper-producing microbial strains have been developed for improved BioS activity, such as B. subtilis (pHT43-comXphrC), B. subtilis fmbR, B. subtilis fmbR-1, B. subtilis THY-7, Bacillus subtilis THY-7/Pg3-srfA, B. subtilis TS593, B. subtilis TS662, B. subtilis, B. subtilis 168S1, 168S2 [18,19,20,21]. In recent years, many studies on enhancing BioS production have focused on the use of inexpensive substrates, optimization of media components and growth conditions, fermentation strategies, statistical model optimization of process and environmental parameters, and optimization of downstream processes [13, 22, 23].
Reductions in the total cost of BioS production will hinge on developing cheaper strategies, including the use of low-cost substrates and of genetically engineered microbial strains for enhanced production yields. The possible use and improved application of these hyper-producers, together with novel cost-effective bioprocesses, pose challenges and offer prospects for BioS competitiveness in the world market. This review provides a general synopsis of the biosynthesis of BioSs and of genetic engineering strategies for improving BioS synthesis, with emphasis on the use of novel, recombinant, and hyper-producing microbial strains in bioremediation applications.

Biosurfactant Classifications

BioSs are produced by a wide diversity of microorganisms and have structures with differing chemical and surface properties [6]. Microorganisms can produce different types of BioSs, which include glycolipids (mannosylerythritol lipids, rhamnolipids, sophorolipids, xylolipids, cellobiose lipids, trehalose lipids), lipopeptides (subtilisin, viscosin, serrawettin, surfactin, polymyxin, iturin), polysaccharide–protein complexes, flavolipids, phospholipids, fatty acids, polymeric surfactants (liposan, alasan, emulsan), and lipids [6, 24]. The most frequently produced low molecular weight surface-active compounds are glycolipids and lipopeptides. Bioemulsifiers are another group whose name has often been used interchangeably with BioSs to denote surface-active biomolecules [25]. The different types of BioSs and their structural compositions are discussed below.

Glycolipids

Numerous glycolipids, consisting of simple fatty acids esterified to a carbohydrate moiety, have been described from different microorganisms [26]. Their structural composition ranges from simple sugars with fatty acyl substituents to complex carbohydrates that can in turn be connected to aromatic compounds, nucleosides, or terpenoids, and they can be attached at different points to saturated, mono-, or polyunsaturated fatty acids through glycosidic or ester linkages. Glycolipid BioS structures include rhamnolipids, sophorolipids, mannosylerythritol lipids, trehalose lipids, and xylolipids.

Rhamnolipids

Rhamnolipids are glycolipid BioSs synthesized during secondary metabolism by different microorganisms, including but not limited to Pseudomonas aeruginosa, Renibacterium salmoninarum, Serratia rubidaea, Burkholderia thailandensis, and Acinetobacter sp. YC-X 2 [27,28,29,30,31]. An oily glycolipid BioS synthesized by Pseudomonas pyocyanea was first described in 1946. Edwards and Hayashi [32] explained the chemical structure of the rhamnolipid as glycosides comprising one (mono-rhamnolipids) (Fig. 1a) or two (di-rhamnolipids) (Fig. 1b) rhamnose sugars connected by an O-glycosidic bond to lipid moieties. The hydrophobic component of the rhamnolipid group consists of one or two (in uncommon cases three) β-hydroxy fatty acid chains, which may be saturated, mono-, or polyunsaturated and range in length from C8 to C16. The hydrophilic component comprises one or two rhamnose sugars connected by an α-1,2-glycosidic bond [33].

Fig. 1 Main glycolipid biosurfactants produced by microorganisms, namely (a) mono-rhamnolipid, (b) di-rhamnolipid, (c) lactonic sophorolipid, (d) acidic sophorolipid, (e) mannosylerythritol lipid, (f) cellobiose lipid, (g) trehalolipid, and (h) xylolipid [26, 43, 67, 125]

Sophorolipid

Sophorolipid is a glycolipid complex synthesized by a few non-pathogenic yeast species. It comprises a hydrophilic sugar head called sophorose and a hydrophobic unsaturated fatty acid tail of 16 or 18 carbons. Sophorose is a glucose disaccharide joined by an unusual β-1,2 bond and acetylated at the 6′- and 6″-positions [34]. The carboxylic end of the sophorolipid can be either lactonized (Fig. 1c) or free, giving the acidic form (Fig. 1d).

Mannosylerythritol and cellobiose lipids

Mannosylerythritol lipids are functional glycolipids that are also synthesized abundantly by yeast strains. They comprise fatty acids joined to 4-O-β-D-manno-pyranosyl erythritol or 1-O-β-D-manno-pyranosyl erythritol as the hydrophilic head group (Fig. 1e) [35]. Cellobiose lipid is another glycolipid BioS, whose main product has been identified as 16-O-(2″,3″,4″,6′-tetra-O-acetyl-β-cellobiosyl)-2-hydroxyhexadecanoic acid (Fig. 1f). Yeasts and mycelial fungi have been shown to produce a few extracellular glycolipids, including cellobiose lipids and mannosylerythritol lipids [36].

Trehalolipids

Trehalolipids consist of unsaturated fatty acid chains (the hydrophobic component) combined with a carbohydrate group (the hydrophilic component) (Fig. 1g). The hydrophobic parts of trehalolipids are highly diverse, comprising aliphatic acids and hydroxylated branched fatty acids of varying chain lengths. Trehalose lipids usually occur as mono-, di-, or tetra-esters, in which each long-chain fatty acid is connected to the sugar by an ester bond [26].

Xylolipids

Xylolipid is a recently discovered class of BioS in which a hydrophilic component, methyl-2-O-methyl-D-xylopyranoside, is connected to a hydrophobic octadecanoic acid chain (Fig. 1h) [37].

Lipopeptides

Lipopeptides are biomolecules comprising a lipid connected to a peptide group (short chains of amino acid monomers joined by peptide bonds). Lipopeptides are synthesized by various bacterial genera such as Bacillus, Streptomyces, and Pseudomonas, and by fungi such as Aspergillus [38]. Lipopeptides have received substantial attention for their antimicrobial and surfactant properties. Bacillus subtilis produces the cyclic lipopeptide surfactin, which is one of the most recognized BioSs [38]. The major lipopeptide groups are discussed further below.

Surfactin

The surfactin group is the most prominent lipopeptide group (Fig. 2a). Surfactin comprises a peptide loop of seven amino acids (L-valine, two L-leucines, L-aspartic acid, glutamic acid, and two D-leucines) and a hydrophobic fatty acid chain 13 to 15 carbons long. In recent studies, surfactin has shown potent antibacterial, antitumoral, antibiofilm, and antiviral activities, as well as usefulness in bioremediation and other environmental applications [39, 40].

Fig. 2 The chemical structures of (a) surfactin, (b) iturin, and (c) fengycin biosurfactants. Each cyclic lipopeptide contains a fatty acid chain linked to amino acids. The variants within each group arise from differences in their amino acid constituents [41]

Iturin

Iturin is another lipopeptide group, in which a hydrophobic fatty acid is joined by an amide bond to a peptide moiety made up of amino acid residues [41]. Its members share a typical arrangement and show variability at four different positions (Fig. 2b) [42]. The groups associated with iturin include bacillopeptin, mycosubtilin, iturin A, C, D, and E, and bacillomycin D, F, and L. As a lipopeptide group, iturin has also been useful in antimicrobial, pharmaceutical, and biotechnological applications [42].

Fengycin

Fengycin is another lipopeptide group, consisting of a lipidic fraction connected to the N-terminal end of a peptide of ten amino acids. This group differs from iturin and surfactin in containing distinctive amino acids such as allo-threonine and ornithine [41]. Like the iturin group, fengycin possesses robust antifungal action: it inhibits the development of a wide variety of plant pathogens, and it has also been applied to improve diesel biodegradation. Variation in the peptide component (a characteristic alanine/valine dimorphism at position six of the peptide ring) also permits the classification of a new fengycin B within the fengycin family (Fig. 2c).

Fatty Acids and Phospholipid

Fatty acid and phospholipid BioSs comprise unsaturated fatty acids of C12 to C14 chain length as well as complex unsaturated fatty acids containing hydroxyl groups and alkyl branches. Different bacteria produce large quantities of fatty acid and phospholipid surfactants, and the fatty acids are suitable as BioSs because of their surface activity [38].

Flavolipid

Flavolipids are a class of BioS with stable interfacial activity and emulsifying capacity. The polar end of this group (Fig. 3) possesses two cadaverine molecules and citric acid, which distinguishes it from the polar groups of other reported BioSs. Flavolipid BioSs are of interest because of their potential in environmental, biotechnological, and industrial applications [43].

Fig. 3 Structures of flavolipid biosurfactant isolated from Flavobacterium sp. strain MTN

Polymeric Biosurfactants

Polymeric BioSs are generally high molecular weight biopolymers with characteristics such as rigidity, high viscosity, and shear resistance. Emulsan and liposan, synthesized by Acinetobacter calcoaceticus and Candida lipolytica, respectively, are the best-studied polymeric BioSs [38]. Other examples, the particulate BioSs, are extracellular vesicles of microbial cells that aid hydrocarbon emulsification [44]. Emulsan has a backbone comprising a 2-amino-2-deoxy-hexuronic acid, amino sugars, glucose, and galactosamine (2-amino-2-deoxy-galactose), with fatty acids connected to the main chain through amide and ester bonds (Fig. 4) [45].

Fig. 4 The structural composition of emulsan, a major microbial surface-active compound with high molecular weight [44]

Functional Characterization of Different Biosurfactant Biosynthetic Genes in Microorganisms

The molecular characterization and biosynthetic regulation of the Bacillus subtilis lipopeptide BioS [46] and the Pseudomonas aeruginosa glycolipid (rhamnolipid) BioS [47] were the first to be reported. Other BioSs whose molecular characteristics have also been described include iturin and lichenysin from Bacillus species [48, 49], emulsan from Acinetobacter species [50], arthrofactin from Pseudomonas species [51], and mannosylerythritol lipids from Candida [48]. The biosynthetic characterization of less-known BioSs such as viscosin, amphisin, serrawettin, hydrophobin, lokisin, and tensin remains largely unreported [48].

Surfactin Synthetase Genes

Surfactin biosynthesis proceeds through a non-ribosomal peptide synthetase mechanism. It involves a multienzyme peptide synthetase complex comprising four enzymatic subunits: SrfA, SrfB, SrfC, and SrfD. SrfA carries out the activation and incorporation of the amino acids Glu, Leu, and D-Leu, while SrfB catalyzes the activation and incorporation of Val, Asp, and D-Leu. Subsequently, SrfC activates Leu and carries the type I thioesterase motif necessary for peptide termination (Fig. 5). Finally, SrfD, which is located terminally, is a type II thioesterase required for the lactonization process. The srfA operon also includes the sfp gene, which encodes the phosphopantetheinyl transferase required for activation of surfactin synthetase [52, 53]. These enzymes, called surfactin synthetases, are needed for surfactin biosynthesis and are encoded by the srf operon [46].

Fig. 5 The diagrammatic representation of different operons of NRPSs responsible for the biosynthesis of lichenysin and surfactin of the lipopeptide groups [49, 126]

Lichenysin Synthetase Operon

The non-ribosomal peptide synthetases (NRPSs), also known as multimodular peptide synthetases, are responsible for synthesizing lichenysin [54]. The lichenysin synthetase operon has been described as containing seven amino acid activation-thiolation domains, two epimerization domains, and one thioesterase domain, an organization similar to that of surfactin and other peptide synthetases [48, 49]. The respective amino acids, namely L-Gln, L-Leu, D-Leu, L-Val, L-Asp, D-Leu, and L-Ile/L-Val, are recognized, activated, and incorporated by the functional modules of LchAA, LchAB, and LchAC (Fig. 5). During lichenysin synthesis, LchAA is the starting unit, and the modules in LchAB and LchAC carry out peptide chain elongation. Additionally, the putative thioesterase at the terminal end of LchAC is responsible for the cyclization and release of the peptide product. In contrast, the stimulation and initiation of BioS synthesis are carried out by LchA-TE [55].
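The domain counts quoted for the lichenysin synthetase operon can be checked against the residue list with a few lines of Python. This is a minimal sketch, assuming one activation-thiolation module per residue and one epimerization domain per D-configured residue; these two rules are a simplifying assumption about NRPS collinearity made for illustration, and the names `LICHENYSIN_RESIDUES`, `REPORTED_DOMAINS`, and `expected_domains` are hypothetical.

```python
# Residue order as listed in the text (position 7 may be L-Ile or L-Val).
LICHENYSIN_RESIDUES = ["L-Gln", "L-Leu", "D-Leu", "L-Val", "L-Asp", "D-Leu", "L-Ile/L-Val"]

# Domain counts quoted in the text for the lichenysin synthetase operon.
REPORTED_DOMAINS = {"activation_thiolation": 7, "epimerization": 2, "thioesterase": 1}


def expected_domains(residues):
    """Predict domain counts from a peptide under the simplified collinearity assumption."""
    return {
        # Assumption: one activation-thiolation module per incorporated residue.
        "activation_thiolation": len(residues),
        # Assumption: each D-residue requires one epimerization domain.
        "epimerization": sum(1 for r in residues if r.startswith("D-")),
        # A single terminal thioesterase releases and cyclizes the peptide.
        "thioesterase": 1,
    }


predicted = expected_domains(LICHENYSIN_RESIDUES)
print(predicted == REPORTED_DOMAINS)
```

Under these assumptions the seven listed residues, two of which are D-Leu, reproduce the reported seven activation-thiolation domains and two epimerization domains.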

Non-Ribosomal Peptide Synthetase

Non-ribosomal peptides are assembled by NRPS enzymes comprising modules that are responsible for the sequential selection, activation, and condensation of precursor amino acids, fatty acids, alpha-keto acids, and alpha-hydroxy acids, as well as polyketide-derived units [56, 57]. These peptides are structurally diverse; they are typically cyclic or branched compounds containing small heterocyclic rings, proteinogenic amino acids, and other uncommon variations in the peptide backbone [58]. The functional characterization of the BioS biosynthesis gene clusters that direct the non-ribosomal synthesis of bioactive compounds in Bacillus amyloliquefaciens, Bacillus subtilis, and Bacillus tequilensis has been discussed in previous reports [59,60,61].

Iturin Synthetase Genes

Four open reading frames, ituD, ituA, ituB, and ituC, constitute the major components of the iturin synthetase operon. Disruption of the ituD gene, which encodes a putative malonyl coenzyme A transacylase, was confirmed to cause a specific deficiency in iturin A production [62]. The ituC gene encodes a peptide synthetase that has two adenylation domains, one epimerization domain, and a thioesterase domain that helps in peptide cyclization [48, 62]. The peptide synthetase encoded by the ituB gene, on the other hand, possesses four amino acid adenylation domains. Finally, the ituA gene product features three functional regions homologous to aminotransferase, β-ketoacyl synthetase, and amino acid adenylation domains [48].

Arthrofactin Synthetase Gene Cluster

Cloning and recombinant technology have revealed that the overall modular architecture of the arthrofactin synthetase gene cluster follows the collinearity rule. The arthrofactin operon contains three genes, namely arfA, arfB, and arfC, which encode ArfA, ArfB, and ArfC with two, four, and five functional modules, respectively; together these eleven modules correspond to the cyclic lipo-undecapeptide BioS. None of these modules contains an epimerization domain, but each features the condensation, adenylation, and thiolation domains that are characteristic of such multienzymes [51].

Emulsan Synthetase Genes

Five different emulsan synthetase genes (wza, wzb, wzc, wzx, wzy) are required for the biosynthesis of emulsan by Acinetobacter lwoffii RAG-1 [50]. The importance of wzc (protein tyrosine kinase) and wzb (protein tyrosine phosphatase) in the emulsan synthetase cluster was later confirmed: deletion of either of the two genes resulted in an emulsan-defective phenotype [63].

Rhamnosyl-Synthetase Genes

Rhamnolipid biosynthesis involves three main enzymatic reactions, and β-oxidation has been confirmed to play a significant role in rhamnolipid production (Fig. 6) [47]. The activated rhamnose moiety needed for both mono- and di-rhamnolipids is supplied by the RmlBCAD pathway, which depends on the catalytic activity of the enzyme AlgC and the enzymes of the rmlBCAD operon. In the synthesis of the rhamnose sugar precursor, D-glucose is converted to D-glucose-1-phosphate by the phosphomannomutase AlgC, which also participates in the biosynthesis of the glucose and rhamnose needed for the formation of core-LPS [64]. RmlA then synthesizes dTDP-D-glucose, which RmlB converts to dTDP-4-oxo-6-deoxy-D-glucose and RmlC subsequently converts to dTDP-6-deoxy-L-lyxo-4-hexulose. RmlD converts dTDP-6-deoxy-L-lyxo-4-hexulose to dTDP-L-rhamnose, the sugar substrate that the rhamnosyltransferases RhlB and RhlC need for mono- and di-rhamnolipid biosynthesis. Hypothetically, the enzyme RhlG functions by channelling fatty acid synthesis intermediates into the rhamnolipid pathway [47, 53].

Fig. 6 Biosynthesis pathway of mono-rhamnolipid and di-rhamnolipid biosurfactant

Alasan Synthetase Genes

Alasan from Acinetobacter radioresistens KA53 is a complex of an anionic polysaccharide containing covalently bound alanine (apoalasan) and three proteins, namely AlnA, AlnB, and AlnC. The amino acid sequence of recombinant AlnA is homologous to that of the E. coli protein OmpA [65]. Likewise, the AlnB amino acid sequence has strong homology to the family of antioxidant enzymes known as peroxiredoxins. Further insight into the mode of action of alasan BioS is expected once the still-unknown genetic details of AlnC are resolved [66].

Physiology, Pathways, and Kinetics of Biosurfactant Production

Biosurfactant Physiology and Metabolic Pathways

BioSs are synthesized intracellularly or extracellularly, or remain attached to microbial cells, when the cells are cultured in liquid medium containing an immiscible substrate as a source of carbon and energy. The cellular function of BioSs is not fully understood, although it has been speculated that they serve in the emulsification of hydrophobic organic pollutants with low solubility [67]. BioSs act by reducing the surface tension at the interface, thus increasing the availability of the substrate for uptake and metabolism [25, 68,69,70]. The different pathways for the biosynthesis of BioSs are discussed below, ranging from glycolipids (rhamnolipids, sophorolipids, mannosylerythritol lipids, trehalose lipids) and phospholipids to lipopeptides (surfactin) and polymeric BioSs (emulsan). BioSs are amphiphilic compounds comprising joined hydrophilic (polar) and hydrophobic (non-polar) ends. Microorganisms exploit the hydrophilic polar moieties for cell metabolism, whereas utilization of the hydrocarbon portion depends entirely on the hydrophobic moieties [67, 71]. Studying the respective metabolic pathways gives an understanding of how BioSs are synthesized from different substrates. The synthesis of precursors for BioS production involves different metabolic pathways, depending on the carbon substrates initially provided and utilized in the production culture medium. In the synthesis of glycolipids, the flow of the primary carbon source (carbohydrates) is regulated by the lipogenic pathways, while the glycolytic pathway enables the formation of the hydrophilic moiety (Fig. 7a) [67]. Glucose 6-phosphate, a significant precursor of the carbohydrates present in the hydrophilic component of glycolipid BioSs, is formed from the degradation of carbohydrate substrates such as glucose or glycerol. Subsequently, acetyl-CoA is produced from pyruvate and, together with oxaloacetate, gives rise to malonyl-CoA. This is followed by conversion into fatty acids, which are important precursors for the synthesis of lipids [71]. When petroleum hydrocarbons are used as the substrate, metabolism is directed mainly through the lipolytic and gluconeogenesis pathways, allowing the substrate to be used to produce fatty acids or sugars (Fig. 7b) [67].

Fig. 7 The intermediate metabolism involved in the synthesis of glycolipid biosurfactant precursors from (a) carbohydrate substrates, with the enzymes (i) phosphofructokinase, (ii) pyruvate kinase, and (iii) isocitrate dehydrogenase responsible for the flow of carbon, and (b) hydrocarbon substrates, with the enzymes (iv) isocitrate lyase, (v) malate synthase, (vi) phosphoenolpyruvate carboxykinase, and (vii) fructose-1-phosphatase responsible for the flow of carbon

The biosynthesis of sophorolipid BioSs involves the successive transfer of activated glucose molecules (UDP-glucose) to a hydroxy fatty acid in reactions catalyzed by two separate glycosyltransferases. An acetyltransferase further acetylates the glucose molecule, and the fatty acid constituents can be produced by modifying hydrocarbons or de novo from acetate in the growth medium [72]. In the case of mannosylerythritol lipid biosynthesis, the genes required were first identified in the smut fungus Ustilago maydis, which produces both mannosylerythritol lipids and cellobiose lipids [73]. Mannosylerythritol BioS is synthesized through several enzymatic reactions. The mannosyltransferase required for the synthesis of mannosylerythritol is encoded by emt1, while mat1 encodes an acetyltransferase that catalyzes acetylation of mannosylerythritol at both the C-4′ and C-6′ hydroxyl groups of mannose. In addition, the acylation of mannosylerythritol requires an acyltransferase, which is encoded by mac1 [73, 74].

On the other hand, the biosynthesis of trehalose (for trehalolipids) involves the transfer of glucose from UDP-glucose to glucose-6-phosphate to form trehalose-6-phosphate, with the release of UDP. This reaction is catalyzed by trehalose-6-phosphate synthase. Subsequently, dephosphorylation catalyzed by trehalose-6-phosphate phosphatase generates the free disaccharide [75]. The synthesis of phospholipids, in turn, occurs on the cytosolic side of the membrane, which is associated with proteins that act in allocation (flippase and floppase) and in synthesis (acyl transferases, phosphatase, and choline phosphotransferase). Ultimately, vesicles bud off carrying phospholipids destined for the cytoplasmic side of the cellular membrane on their exterior leaflet and phospholipids destined for the exoplasmic side on their inner leaflet [76].

The biosynthesis of surfactin, one of the prominent lipopeptide BioSs, proceeds through a non-ribosomal peptide synthetase mechanism [46]. In this process, surfactin synthetase catalyzes the joining of amino acids into the peptide component of surfactin through a thiotemplate mechanism, which includes amino acid activation by ATP and assembly of the amino acids into a peptide chain. An acyltransferase enzyme then forms the lipopeptide BioS by linking the hydroxy fatty acid to the peptide group [77]. A second pathway regulates the biosynthesis of surfactin (expression of srfA) through Phr peptides and Rap proteins. B. subtilis encodes eight Phr peptides [PhrA, PhrC (CSF), PhrE, PhrF, PhrG, PhrH, PhrI, and PhrK] and 11 aspartyl-phosphate phosphatase proteins (RapA to RapK), as shown in Fig. 8 [78]. The activity of the co-transcribed Rap proteins is inhibited by the Phr peptides. The activity of RapC, which is responsible for the dephosphorylation of ComA, depends on the concentration of PhrC. However, surfactin synthesis is usually repressed when the intracellular concentration of PhrC is high. In this regard, BioS production also depends on Spo0K (a permease), which is required for peptide transfer across the membrane [79]. Expression of the srfA gene depends on low RapC concentrations, which improve the availability of phosphorylated ComA; the latter triggers transcription by binding the promoter region. The expression of srfA is further regulated by repressor proteins such as AbrB and the GTP sensor CodY, as well as by other transcriptional regulators such as DegU [78].

Fig. 8 The genetic regulation of surfactin biosynthesis in Bacillus species. Negative regulation is indicated by closed circles, and closed-head arrows indicate positive regulation [adapted from Roongsawang et al. [78]]

Kinetics of Biosurfactant Production

BioS production kinetics vary substantially among different systems. The kinetic categories to be considered are listed below:

  (a) Growth-dependent;

  (b) Growth-limiting;

  (c) Synthesis by immobilized or resting cells; and

  (d) Synthesis with precursor supplements [80].

In growth-dependent production, cellular growth, substrate usage, and BioS production rise in parallel. Synthesis under growth-limiting conditions is characterized by a sharp increase in BioS concentration when one or more medium constituents become limiting. In synthesis by immobilized or resting cells, the cells use carbon substrates continuously for BioS synthesis with virtually no cell multiplication. The last kinetic category listed above involves the addition of BioS precursors to the production medium. Precursor addition has often been reported to result in qualitative and quantitative variations in BioS product yield [67].
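The growth-dependent case can be illustrated numerically. The sketch below is a minimal example, assuming Monod growth kinetics and a constant product-per-biomass yield; neither the model form nor any of the parameter values (`mu_max`, `ks`, `y_xs`, `y_px`) comes from the studies cited here, and they are placeholders chosen only for illustration.

```python
def simulate_growth_associated(x0=0.1, s0=20.0, mu_max=0.4, ks=1.0,
                               y_xs=0.5, y_px=0.3, dt=0.01, hours=48.0):
    """Euler integration of Monod growth with growth-associated product formation.

    All parameter values are hypothetical placeholders, not fitted data.
    x: biomass (g/L), s: substrate (g/L), p: biosurfactant (g/L).
    """
    x, s, p = x0, s0, 0.0
    for _ in range(int(hours / dt)):
        mu = mu_max * s / (ks + s)      # Monod specific growth rate (1/h)
        dx = mu * x * dt                # biomass formed during this time step
        dx = min(dx, y_xs * s)          # growth cannot use more substrate than remains
        x += dx
        s = max(s - dx / y_xs, 0.0)     # substrate consumed to form that biomass
        p += y_px * dx                  # product formed in proportion to growth
    return x, s, p


biomass, substrate, product = simulate_growth_associated()
print(f"biomass={biomass:.2f} g/L, substrate={substrate:.4f} g/L, BioS={product:.2f} g/L")
```

In this sketch, product accumulates only while biomass increases, so the three curves move together, which is the defining feature of the growth-dependent category. Modelling the growth-limiting or resting-cell categories would require a different product term.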

Substrate Concentrations and Formulations used for Biosurfactant Production

In the last decade, many studies have addressed media optimization, especially for the most prominent BioS producers: Bacillus, Pseudomonas, Candida, and Acinetobacter species. Parameters such as the type and amount of carbon and nitrogen sources, the ratio of metal supplements, pH, temperature, aeration, dissolved oxygen, and cell density have been studied extensively and found to be fundamental for BioS production [81, 82]. Table 1 gives an overview of a few of the prominent BioS producers and the carbon source concentrations that gave the highest reported BioS yield when grown in shake flasks and large-scale fermenter vessels. The output and productivity of BioS vary significantly across the range of culture conditions, medium compositions, primary substrate concentrations, and operating scales, which at present are relatively small. Techniques such as response surface methodology and statistical designs like Box–Behnken, Taguchi, and Plackett–Burman have therefore been used frequently to handle multiple variables and optimize BioS production processes [83,84,85].
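To make the design-of-experiments step concrete, the following sketch generates the coded runs of a three-factor Box–Behnken design in plain Python. It is a minimal illustration rather than the procedure of any cited study: the function names `box_behnken` and `decode`, the choice of three centre points, and the example factor ranges (a carbon source, a nitrogen source, and pH) are all assumptions.

```python
from itertools import combinations, product


def box_behnken(n_factors=3, n_center=3):
    """Return coded design points (-1, 0, +1) for a Box-Behnken design.

    Each pair of factors is run as a 2x2 factorial at +/-1 while the other
    factors are held at the mid-level (0); centre points are then appended.
    This pairwise construction is used here for three factors only.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs.extend([(0,) * n_factors] * n_center)
    return runs


def decode(point, ranges):
    """Map coded levels to real settings, given a (low, high) range per factor."""
    return tuple(lo + (c + 1) * (hi - lo) / 2 for c, (lo, hi) in zip(point, ranges))


# Hypothetical ranges: carbon source (g/L), nitrogen source (g/L), pH.
ranges = [(10.0, 40.0), (1.0, 5.0), (6.0, 8.0)]
design = box_behnken()
print(len(design))                 # 12 edge runs + 3 centre runs
print(decode(design[0], ranges))   # first run in real units
```

Each run would then be cultivated and its BioS yield recorded, after which a second-order response surface can be fitted to locate the optimum within the tested ranges.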

Table 1 Overview of a few of the prominent BioS producers and concentrations of carbon sources, which showed the highest reported biosurfactant yield

Additionally, artificial intelligence-based techniques have been used for media optimization and yield improvement of BioS [86, 87]. At large scale, BioS synthesis costs are commonly reported to depend fundamentally on bioreactor volume (influenced by productivity and, to some extent, by titer because of startup/shutdown time and batch scheduling), the cost of raw materials (affected by the choice of raw materials and by yield), and the costs of separation and purification [88, 89]. Another key expense at industrial scale is the cost of startup, cleaning, and sterilization incurred when running sequential batch/fed-batch production campaigns. These costs are frequently neglected, yet industrial partners have identified them as a vital target for cost reduction. For a given quantity of BioS, the number of production campaigns required is inversely proportional to the product titer. As such, being able to run a fermentation process for a longer period while maintaining productivity and reaching higher titers will decrease overall production costs [90]. Most biotechnological production processes need to achieve cost-effectiveness through the use of renewable raw materials. This is why many products based on BioSs and bioemulsifiers are very costly and still available only in small amounts. Consequently, despite growing interest in BioSs, their bulk use has not been realized because of their high costs, particularly when compared with surfactants of chemical origin [90, 91].
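The inverse relationship between titer and the number of production runs can be shown with simple arithmetic. The sketch below is illustrative only: the function names, the 1000 kg target, the 10,000 L working volume, the titers, and the 12 h turnaround time are all hypothetical figures, not data from the cited reports.

```python
import math


def batches_needed(target_kg, titer_g_per_l, working_volume_l):
    """Number of batches needed to make target_kg of product at a given titer."""
    kg_per_batch = titer_g_per_l * working_volume_l / 1000.0  # convert g to kg
    return math.ceil(target_kg / kg_per_batch)


def turnaround_hours(target_kg, titer_g_per_l, working_volume_l, hours_per_batch):
    """Total non-productive time (startup, cleaning, sterilization) over all batches."""
    return batches_needed(target_kg, titer_g_per_l, working_volume_l) * hours_per_batch


# Hypothetical case: 1000 kg of BioS from a 10,000 L working volume.
for titer in (10.0, 20.0):
    n = batches_needed(1000.0, titer, 10_000.0)
    idle = turnaround_hours(1000.0, titer, 10_000.0, 12.0)
    print(f"titer {titer:.0f} g/L: {n} batches, {idle:.0f} h of turnaround")
```

With these assumed figures, doubling the titer from 10 to 20 g/L halves the number of batches from 10 to 5 and halves the associated turnaround time from 120 h to 60 h.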

Genetic Engineering Strategies for Enhancing Biosurfactant Production

Significant breakthroughs in BioS production have been hard to achieve with traditional techniques such as breeding, fermentation optimization, and statistical optimization [23]. Current genetic engineering approaches can therefore be used to meet the need for new, competitive, and environmentally friendly BioS products, since the improvement of microbial strains offers a great prospect for reducing production costs [92, 93]. Compared with parent strains, enhanced strains use the same quantity of raw materials but synthesize a higher amount of the desired product [93]. The genetics of the producer organism is an essential factor affecting the yield of all biotechnological products, since the ability to produce a metabolite is conferred by the organism's genes [94]. Genetic engineering involves the modification of microorganisms, naked genes, or plasmids carrying BioS genes using biotechnological techniques. It allows the manipulation of a single organism's genes or operons, including heritable and non-heritable DNA, the construction of biosynthetic pathways, and the sequence modification of existing BioS synthetic genes. It is therefore of great significance to discuss the genetic methods used for improved synthesis of BioSs.

Recombinant DNA Technology

Recombinant microorganisms can be obtained by introducing an exogenous nucleic acid encoding BioS synthesis into a host microorganism. The host can be chosen freely, provided it is able to produce the BioS compounds of interest. The nucleic acid is typically incorporated into a suitable vector, which can be introduced into the host cell by any appropriate method. A vector is a carrier nucleic acid molecule that is transported into the cell and is capable of replication and expression in the host. Recombinant DNA technology has been used successfully to increase BioS production yield. For instance, a metabolic engineering strategy led to increased rhamnolipid production by recombinant E. coli expressing rhlAB compared with the parent strain (Pseudomonas aeruginosa) and other E. coli strains [95].

Similarly, the rate of fatty acid synthesis in recombinant E. coli increased 1.3-fold when RhlA was expressed, confirming that RhlA is necessary and sufficient for the formation of the rhamnolipid acyl moiety [96]. Anburajan et al. [97] also reported the heterologous expression of the surfactin synthetase gene from Bacillus licheniformis NIOT-06 in E. coli M15. Phosphopantetheinyl transferases are fundamental in activating polyketide synthases, fatty acid synthases, and non-ribosomal peptide synthetases. It was therefore proposed that the Sfp phosphopantetheinyl transferase is an important component of peptide synthesis systems and is indispensable in the biosynthesis of lipopeptide BioSs [13]. Porob et al. [60] reported that an sfp gene cloned from Bacillus tequilensis encoded 224 amino acids, and described the role of surfactin genes in BioS synthesis in six different Bacillus species. Additionally, the genes sfp, sfp0, and srfA were cloned into the recombinant microbial strains BioSa, BioSb, and BioSc, leading to improved BioS activity. The results also revealed conserved family characteristics between BioS and esterase genes [98]. The production of lipopeptides with improved properties was facilitated by the cloning and sequencing of the arthrofactin synthetase gene cluster, opening the way to its industrial use [51].

Mutagenesis

Mutagenesis is one of the methods by which the gene sequence of a microbial strain can be altered in order to induce its activity [93]. It is also used extensively to develop hyper-producing microbial strains tailored for improved BioS synthesis. Researchers have proposed different mutation techniques, such as site-directed mutagenesis, treatment with DNA ligase, substitutions, deletions, and insertions. Mutant variants can also be obtained by other techniques, such as ultraviolet irradiation, physical and/or chemical mutagenesis (for example with N-methyl-N'-nitro-N-nitrosoguanidine), or selection based on resistance to ionic detergents such as CTAB [93, 99]. Developing hyper-producing mutants is one strategy for coping with the economic constraints associated with BioS production. According to the literature, hyper-producing mutants for improved BioS synthesis have so far been limited to the genera Acinetobacter, Pseudomonas, and Bacillus, which are the producers of emulsan, rhamnolipid, and surfactin, respectively [94]. Previous reports have shown the overproduction of BioS by mutagenesis. For instance, a B. subtilis E8 mutant obtained by ion beam treatment produced more surfactant [100].

Similarly, gamma irradiation yielded a Pseudomonas aeruginosa MR01 mutant with more than 1.5-fold higher BioS production [101]. Gamma-ray treatment of P. aeruginosa S8 produced a mutant strain with two to three times the BioS production of the wild-type cells [102]. In one report, RhlA and RhlB mutants showed that swarming requires expression of the rhlA gene but does not require rhamnolipid production. It was also demonstrated that using ammonium instead of nitrate as a nitrogen source, together with excess available iron, decreases RhlA expression and swarming motility [103]. The genes responsible for the biosynthesis and control of the heteropolysaccharide BioS emulsan were targeted in a different report. Mutants deleted for several of these genes were defective in emulsification, indicating that the polysaccharide is essential for extracellular emulsifying activity [50]. Mutated proteins defective in catalytic activity performed better in enhancing apoemulsan-mediated emulsifying activity [104]. Another study reported that Acinetobacter venetianus RAG1 also forms emulsan. Removing the protein fraction yields apoemulsan, which shows lower emulsifying activity on hydrophobic pollutants such as n-hexadecane [48]. In another report, an arfB gene disruption mutant did not produce arthrofactin and exhibited low swarming activity, while biofilm formation and extracellular fiber production were enhanced. These results suggest that the arthrofactin synthetase gene may have multiple functions [51].

Overexpression of Extracellular Peptides

The overexpression of small signaling molecules that activate and regulate the biosynthetic cluster of an antibiotic revealed the possibility of enhancing biosurfactant production [19]. As reported, overexpression of the extracellular peptides ComX and PhrC is a crucial means of improving surfactant production while cells are still at low cell density. After 48 h of cultivation, surfactant production in the strain B. subtilis (pHT43-comXphrC) was 6.4-fold higher than in the wild strain. Extracellular signaling factors and different response pathways in B. subtilis were together responsible for improving surfactin production by triggering the development of genetic competence and the quorum response at low cell density [19]. Table 2 lists the microbial strains used for the improved synthesis of BioS, together with the genetic engineering techniques applied and the BioS yields obtained.

Table 2 Biosurfactant yield from different recombinant microbial strains

Substitution, Replacement, and Modifications of Amino Acids

The substitution or modification of an amino acid refers to the replacement of an amino acid residue by another residue whose side chain has similar properties. The method relies on the grouping of side chains into acidic side chains (e.g., aspartic acid, glutamic acid), basic side chains (e.g., lysine, arginine, histidine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Substitution can also target promoters. The strength of a promoter directly affects the transcription level of a target gene, which in turn affects how strongly the gene is expressed. Promoter substitution is therefore one of the most direct and effective ways to regulate the expression of key genes [18]. Two earlier studies investigated surfactin yields after promoter exchange for the srfA operon. The studies were, however, conducted with different surfactin producer strains and substitute promoter sequences, and they gave conflicting results [20, 105]. Coutte et al. [105] described the replacement of the native srfA promoter by the constitutive promoter PrepU in B. subtilis 168 after the integration of a functional sfp gene. Replacement of the surfactin operon promoter by a constitutive one prevented lipopeptide synthetase expression and revealed only a small enhancement of surfactin production. This suggests that precursor supply, rather than synthetase expression, is the limiting factor for surfactin overproduction in B. subtilis 168 derivative strains [105].

In contrast, Sun et al. [20] reported tenfold higher surfactin yields after replacing PsrfA with Pspac, an IPTG-inducible hybrid promoter originating from B. subtilis bacteriophage SP01 and the E. coli lac operon. Another study reported a similar outcome: substituting the constitutive Pveg for the native srfA promoter significantly increased surfactin production in a strain with low native production of the compound [8]. Similarly, when the promoter of the iturin operon was replaced by the repU promoter of the plasmid pUB110 replication protein, a threefold increase in the production of iturin A was observed [62].
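The side-chain grouping given at the start of this section can be expressed as a simple lookup for checking whether a proposed amino acid substitution stays within a class. This is a minimal sketch: the class membership follows the groups listed in the text, while the three-letter codes and the function names `shared_classes` and `is_conservative` are illustrative choices. It says nothing about whether a given substitution actually preserves protein function.

```python
# Side-chain classes as listed in the text, using three-letter residue codes.
SIDE_CHAIN_CLASSES = {
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Met", "Trp"},
    "beta-branched": {"Thr", "Val", "Ile"},
    "uncharged polar": {"Gly", "Asn", "Gln", "Ser", "Thr", "Tyr", "Cys"},
    "aromatic": {"Tyr", "Phe", "Trp", "His"},
}

KNOWN_RESIDUES = set().union(*SIDE_CHAIN_CLASSES.values())


def shared_classes(original, replacement):
    """Return the sorted names of the classes that contain both residues."""
    for residue in (original, replacement):
        if residue not in KNOWN_RESIDUES:
            raise ValueError(f"unknown residue code: {residue}")
    return sorted(
        name
        for name, members in SIDE_CHAIN_CLASSES.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """True if the two residues share at least one side-chain class."""
    return bool(shared_classes(original, replacement))


print(is_conservative("Asp", "Glu"))   # both acidic
print(is_conservative("Asp", "Lys"))   # acidic vs basic
print(shared_classes("Val", "Ile"))    # residues may share more than one class
```

Because several residues belong to more than one class in this grouping, the lookup returns every shared class rather than a single label.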

Gene/Gene Cluster Knock-Out

This genetic technique inactivates one of an organism's genes (the gene is “knocked out” of the organism) to enhance BioS synthesis. The preferred approach is homologous recombination, through which a single gene is deleted without affecting the other genes in the organism. Depending on the number of genes knocked out in a specific microorganism, the result may be a double, triple, or quadruple knock-out [106]. The biosynthesis of other lipopeptides and polyketides unavoidably competes with biosurfactant production for energy, NADPH, and direct precursors. Eliminating these large gene clusters could therefore enhance the synthesis of BioSs. Table 2 shows that deleting competing extracellular matrix formation pathways and other NRPS/PKS pathways improved surfactin production significantly, by 3.3-fold compared with 168S1, possibly because precursor competition was eliminated [21].

Bioremediation Applications of Genetically Engineered Microorganisms with Biosurfactant Properties

Biosurfactant-facilitated bioremediation (BFB) is an approach that involves the inoculation of BioS producers or BioS monomers into contaminated environments. BFB has attracted increasing attention recently, as more research has focused on this innovative strategy. The methodology is adopted to overcome the bioavailability constraints encountered during the transformation of contaminants in complex and harsh environments [107,108,109,110,111,112,113,114]. Microorganisms are proficient in producing different kinds of BioSs, ranging from low to high molecular weight. These BioS-producing microorganisms belong to several genera such as Pseudomonas, Penicillium, Clostridium, Acinetobacter, Bacillus, Aeromonas, Brevibacterium, Lactobacillus, Arthrobacter, Citrobacter, Candida, Corynebacterium, Yarrowia, Ustilago, Aspergillus, Torulopsis, Ochrobactrum, Pseudozyma, Saccharomyces, Gordonia, Enterobacter, Rhodococcus, Halomonas, Serratia, Leuconostoc, and Thiobacillus [44, 115]. BioSs are amphiphilic surface-active agents with both polar and non-polar groups, and they decrease the tension at the interface between two immiscible liquids, such as water and oil [116]. BioSs synthesized by microorganisms are environmentally compatible, biodegradable, non-toxic, effective under extreme environmental conditions, and have higher foaming capacity, making them more suitable than their synthetic counterparts [23]. In many environments, bioremediation can be inefficient because of the low bioavailability of petroleum hydrocarbons. Introducing BioS-producing microorganisms that emulsify hydrocarbons would therefore make remediation more efficient and reliable [23, 117]. The amphiphilic nature of BioSs enables the solubilization of water-insoluble hydrophobic pollutants through emulsification and surface-area reduction [117, 118]. BioS bioaugmentation is an essential strategy for enhancing the degradation rate in contaminated environments. However, for safe and successful use, the toxicity and effectiveness of the BioSs must be assessed before the BioS-producing organism is inoculated. Furthermore, BioS compounds and degradative enzymes, termed biomaterials, can be introduced directly into the contaminated environment. This strategy may minimize the regulatory burden imposed by the direct inoculation of foreign organisms [119].

There is increased demand for effective BioSs because of their biotechnological, industrial, and environmental applications [120]. Production of biodispersan, an extracellular anionic polysaccharide, increased in A. calcoaceticus A2, a mutant strain defective in protein secretion. The reduction in secreted proteins in the extracellular fluid reduced problems in the purification and application of biodispersan. Moreover, recombinant strains often give rise to better product characteristics [120]. The engineered E. coli M15 strain has potential for biotechnological application since it produces BioS at high rates and can avoid the complex downstream processing associated with the conventional bioprocess [97]. In addition, enhanced expression of the rhlAB operon in wild-type strain PG201 resulted in increased rhamnolipid production. Expression of the rhlAB operon on a plasmid led to hyper-production of rhamnolipid BioS and/or polyhydroxyalkanoate, which can be used for the synthesis of biodegradable plastics [121]. In a different study, recombinant E. coli pSKA containing the BioS gene srfA, grown with olive oil as the sole carbon source, showed higher esterase and BioS activity than Bacillus sp. SK320 [122]. In another study, the BioS genes were successfully cloned, expressed, and overexpressed in BioSa, BioSb, and BioSc, giving a twofold increase in activity over the parent strain. The same study reported that cloning the BioS genes from Bacillus subtilis SK320 into E. coli resulted in the expression of BioS activity and conferred enhanced esterase production in the recombinant cells BioSa, BioSb, and BioSc compared with Bacillus subtilis SK320 [98]. Subsequently, the recombinant cells carrying the BioS gene from Bacillus subtilis SK320 were found to form an emulsion in the presence of olive oil, emphasizing that cloning the BioS gene into E. coli cells enhanced their ability to use olive oil as a sole carbon source [98]. The gene encoding AlnB was also cloned, sequenced, and overexpressed in E. coli. Recombinant AlnB had no emulsifying activity, but it stabilized the oil-in-water emulsion generated by AlnA [66]. To further support this finding, a mixture of apoemulsan and recombinant esterase was tested for the emulsification of a wide range of pure and crude oil products from various sources, and its activity was compared with that of fully proteinated emulsan. The results revealed that the esterase–apoemulsan complex was more effective, as it emulsified various hydrophobic substrates that are typically not emulsified by crude emulsan itself [10].

In this context, rhlB has been reported to be a BioS-producing gene. In one study, the BioS-producing gene was amplified from Pseudomonas aeruginosa, and its expression, controlled by a regulatory gene, reduced the surface tension and thereby emulsified oils (petrol, diesel, and kerosene) [123]. The sfp nucleotide sequence was expressed in E. coli, and its putative product was purified for use in antibody production and amino acid sequence analysis. Overproduction of sfp in Bacillus subtilis did not increase the amount of surfactin produced, but it repressed a lacZ transcriptional fusion of the srfA operon, which encodes enzymes that catalyze surfactin synthesis [59]. Stable oil–water emulsions with hydrophobic substrates such as hexadecane were formed from mixtures of apoemulsan with either the catalytically active soluble form of the recombinant esterase isolated from cell extracts or the solubilized inert form of the enzyme recovered from inclusion bodies [104]. The first investigation of a Rhodococcus strain in this regard described genetic enhancement of BioS synthesis and its role in environmental remediation [124]. In another instance, the potential of a B. subtilis mutant BioS for biotechnological use was shown by its stability to environmental factors such as pH and temperature and by its recovery of more than 90% of motor oil adsorbed to a sand sample [93].

Further research is essential to improve environmental-scale applications, taking into account the many ecological complexities and factors that limit BioS synthesis and utilization. To encourage field use of these BioS innovations, substantial tests are anticipated that incorporate heterogeneities in topographical and hydrological features and the bioremediation of contaminated sites. With new developments in this field and a focus on interdisciplinary research, combined with advances in metabolic and genetic engineering, BioSs are expected to become economically viable. Research in this field is progressing quickly and encompasses areas as diverse as textiles, pharmaceutics, cosmetics, petroleum, wastewater treatment, agriculture, natural science, and molecular biology.

Conclusion and Future Perspectives

Genetic engineering can improve not only BioS surface activity but also production yield. Genetically engineered hyper-producing organisms can bring breakthroughs in the production process and give high yields, provided the genetics of microbial surfactant production is known in detail. It is therefore desirable and recommended that future research on BioSs focus on the use of genetic engineering to develop hyper-producing microbial strains. New and emerging concepts in genetic engineering should be employed to produce organisms that enhance production and give better product characteristics. Furthermore, the ability of mutant and engineered hyper-producing microbial strains to grow on a wide range of economical and renewable substrates could give high BioS yields and a cost-effective bioprocess. This approach will benefit from recent advances in strain engineering, structural elucidation, and characterization focused on producing novel BioS compounds with unique properties. Consequently, improving the quality and intensity of research in this field will help increase yields and produce novel types of BioSs, enhancing their utilization in hydrocarbon bioremediation, antimicrobials, microbial enhanced oil recovery, and environmental and industrial processes.