Introduction

The journal’s editors asked me to write a review on what can be paraphrased as “drugs from natural sources.” Since my background is predominantly in anti-infective and anti-tumor agents, both in the pharmaceutical industry and at the National Cancer Institute’s Natural Products Branch within the NCI’s Developmental Therapeutics Program, I have used examples from those experiences.

What provides the overarching link(s) to most of the pharmacological areas that I comment on are examples of the “influence of microbes as sources of some of the actual natural product compounds.” Hence, there will be discussion of a significant number of current discoveries of potential and actual agents that began in the marine world with invertebrates, followed by examples from both free-living microbes and invertebrate-linked microbes. That discussion will include a very important section demonstrating that insects from a specific terrestrial phylum have “uncultivated microbes producing a toxin” whose genomic information was later shown to be related to other organisms found in the most unexpected places.

Embedded in the discussions will be many examples that demonstrate how synthetic chemists have frequently modified a natural product that had biological activity of import and have then produced significant numbers of molecules, where at times one can identify the natural product progenitor’s structure, though in other cases this may not be possible.

To finish, there is a relatively small section using recent papers demonstrating how the revolution in genomic information and associated software and hardware has impacted the ways in which current drug-searching from natural products has evolved.

Pederin: a Link Between Brazilian Beetles and Marine Sponge Metabolites

The insect toxin pederin(e) (Fig. 1; 1), used as a deterrent by the beetle Paederus fuscipes, was first identified by Pavan and Bo in 1952 [1]. This was followed in 1965 by a report by Cardani et al. [2], and the definitive structure was published in 1968 by Matsumoto et al. [3]. In 1999, the German entomologist Kellner published an interesting paper asking what the basis of pederin polymorphism in the rove beetle Paederus riparius was. He suggested, with supporting data, that an endosymbiont was the actual producer of the toxin [4]. Expanding on this hypothesis in 2001, he reported suppression of pederin biosynthesis when antibiotics were used to remove endosymbionts in another closely related species, Paederus sabaeus [5]. Thus, there was significant though indirect evidence implying a bacterial component to the production of pederin. Since this occurred in two different species of the beetle, it was possibly common to all. Then, in two papers published the following year (2002), Kellner reported on the molecular identification of an endosymbiotic bacterium associated with pederin production in Paederus sabaeus [6], together with a demonstration of interspecific transmission of the endosymbiont(s) [7].

Fig. 1 Structures 1–11

From these genetic studies, the producing agent was identified as a Gram-negative bacterium, a pseudomonad that infected about 80% of the tested female beetles. This insect genus is found on all continents except Antarctica, and the dermatitis caused by the toxin is well described in the literature, in particular in a 2013 publication by Cressey et al. that demonstrates the problems encountered when dealing with this toxin [8].

This is where it might well have languished as an interesting molecule to synthesize as a demonstration of novel chemistry, as shown by the following two publications in the synthetic chemistry literature [9, 10].

Mycalamides A and B and Onnamide

So why was this discovery of the probable bacterial source of pederin now an important one? For this, one needs to go back to a discovery in the middle 1980s from New Zealand waters, which was reported in 1988 [11] and 1990 [12]. In these two papers, Perry, who was a graduate student in the Blunt and Munro group at the University of Canterbury in Christchurch (South Island, NZ), reported the isolation of two extremely potent antitumor agents that they named mycalamide A (Fig. 1; 2) and B (Fig. 1; 3), from a Mycale sponge collected in cold water (~ 2 °C) at approximately 30 m depth off South Island, New Zealand. The only difference between the two compounds was the presence of an extra methyl group in B (Fig. 1; 3), which led to a tenfold difference in bioactivity against the murine leukemia cell line P388. Since they were working under a subcontract funded by the NCI, they brought the compounds to the small company SeaPharm (the NCI contractor) in Florida before the formal publications listed above. At SeaPharm, the compounds were shown to have significant activity against human cell lines. These results, including murine in vivo assays, were published in 1989, just as SeaPharm closed down [14], together with data on onnamide (Fig. 1; 4), which had been isolated at SeaPharm from an Okinawan sponge of the genus Theonella collected from warmer water (~ 30 °C) [13].

Thus, a molecule isolated from insects found all over the globe (though initially from Brazil) had two close relatives, in terms of both chemical structure and biological activity, that were isolated from a Mycale sponge collected in cold water (~ 2 °C) at ~ 30 m depth, and a third similar molecule isolated from what was effectively a surface Theonella sponge collected off Okinawa at ambient temperatures around 30 °C.

It was later shown (see above under the “Pederin: a Link Between Brazilian Beetles and Marine Sponge Metabolites” section) that the probable producer in the case of the pederin molecule was a Gram-negative microbe, almost certainly a pseudomonad by inspection of the microbial genes, following a thorough analysis and identification of a relevant polyketide synthase-peptide synthase cluster by Piel, who was then at Jena at the Max Planck Institute for Chemical Ecology. Piel cloned the putative pederin-producing genes from the as yet uncultured bacterial symbiont of Paederus beetles and published his results in 2002 [15].

The next step in the process was to use this knowledge to attempt to isolate similar gene clusters in the Okinawan sponge that produced onnamide using the clusters from Paederus beetles, which are examples of what microbiologists have named “symbiosis islands” as the initial probe(s) [16]. Working in close conjunction with marine chemists from Japan, Piel and his collaborators were able to isolate the onnamide/theopederin (Fig. 1; 5) polyketide synthase genes from the sponge [17, 18]. These gene clusters, as with the beetle, apparently came from an as yet unculturable microbe, techniques/sources that will become even more important as the story continues up to the present day.

Irciniastatin A/Psymberin

The next phase in the complex story of the source(s) of these agents in marine invertebrates was the report by two groups of the complex chemical variation known as irciniastatin A (Fig. 1; 6) or psymberin (Fig. 1; 6). Both reports were published in 2004, but the submission dates give precedence to irciniastatin A (Fig. 1; 6). This compound turned out to be one of the more complex structures based on pederin. A total synthesis that established the stereochemistry was published in 2008 by the Smith Lab at the University of Pennsylvania [19]. Then, in 2013, a review on this agent (under the name psymberin) was published by Bielitza and Pietruszka [20]. This latter paper gives a much fuller account of the many synthetic schemes used to produce this agent and its derivatives.

Recap

As of approximately 2011, genomic techniques, though initially in their infancy at the turn of the twenty-first century, had now become a methodology that could be utilized in cases where the microbe cannot be cultivated by the techniques available.

The Piel Group and Products from “As Yet Uncultivated Microbes” Isolated from Marine Sources

Since 2012, work by scientists in the Piel group and/or utilizing his techniques has led to a complete rethinking of the concept of “uncultivatable microbes and their products.” The paper that caused this major upheaval in thinking was published in early 2014 by Piel and his associates (see below), though three earlier papers prepared the ground. The paper that began the new story was an article in the 2011 Annual Review of Microbiology, which outlined approaches to “capture” bioactive agents from such microbes [21]. Then, in 2012, Hentschel, with Piel and other collaborators, wrote a review on genomic insights into sponge microbiomes [22]. This was followed in 2013 by a paper from Wilson and Piel [23], which covered the exploitation of uncultivated microbes to locate novel biosynthetic systems.

Then, in early 2014, in a paper in Nature [24], the group proved the source(s) of the majority of bioactive agents (including onnamides) that had been isolated by many investigators from an apparently “very bioactive Theonella sponge” in Palau lagoon. This organism had been investigated by many marine natural product chemists, in particular those associated with the Faulkner group at the Scripps Institution of Oceanography in California from the early 1990s. The Faulkner group had postulated that some of the bioactive agents they isolated from this sponge involved microbes in their biosynthesis, but the necessary techniques were not available at the end of the twentieth century.

The 2014 Nature paper effectively demonstrated that well over half of the thirty-plus chemically quite different bioactive molecules reported were from one as yet uncultivated bacterium. These were mainly antitumor agents, since most of the earlier work was funded by the US National Cancer Institute, in particular studies from US investigators. Once the information in this paper became disseminated, there was a significant series of reassessments of already published works, now asking the question “who produced what?,” an initial example the same year being the 2014 paper on the biogenesis of the complex polyketide-peptide hybrid calyculin [25]. Then, in 2018, data on the Japanese (Okinawa-sourced) Theonella sponge were published demonstrating similar uncultivated microbes as the producers [26]. Following on from these papers, in 2020, the New Zealand successors to the Blunt and Munro laboratories reported that the mycalamides and other agents from the Mycale sponge, including the microtubule inhibitor peloruside (Fig. 1; 7), are products of the sponge’s microbiota, arising from a consortium of uncultivated microbes rather than from just one of them [27].

As shown in the preceding discussion, the pederin-like agents have excellent antitumor activities and are almost all produced by as yet uncultivated microbes (all considered to be prokaryotic in nature), including an interesting “stripped genome” system in the production of diaphorin (Fig. 1; 8), which was reported in 2013 by Nakabachi et al. [28]. This is similar to what was reported as the actual source(s) in the workup of the marine-sourced drug ecteinascidin 743 (Fig. 1; 9) by the Sherman laboratory in 2015 [29].

Similar Products from “Cultivatable” Microbes and Pederin Production

Currently, there are two agents in the “pederin-like structure series” that have been found by cultivation of free-living prokaryotes. These are cusperin (Fig. 1; 10), which was first reported from a free-living cyanobacterium by Kust et al. in 2018 [30], and labrenzin (Fig. 1; 11), a pederin-like molecule described in an extremely interesting 2017 report from the PharmaMar group and isolated from the free-living heterotrophic proteobacterium Labrenzia sp. PHM005 [31]. In 2019, Kacar et al. reported further on this agent, identifying a trans-AT PKS biosynthetic gene cluster [32]. By using this “free-living” microbe as the host and inserting the necessary gene clusters, Kacar et al., including PharmaMar scientists, were able to produce pederin microbiologically [33]. Though that report covered the intrinsic problems in scaling up such a process, it is fundamentally easier than obtaining pederin by synthetic means or by “wild collections from many thousands of beetles.” Most non-process chemists assume that scale-up from a lab-level synthetic scheme that might produce < 10 mg is relatively easy; that this supposition is totally incorrect will be demonstrated in a later section dealing with halichondrin derivatives (cf. the “From Synthetic Chemistry to Drugs and Candidates Under cGMP Conditions” section).

From the comments in the papers related to labrenzin, it is quite probable that PharmaMar is considering pederin, in particular because of its well-reported spectrum of antitumor activity, and/or one of the similar agents that can potentially be produced microbiologically by suitable manipulation of the genome of the labrenzin producer, as prime candidates for “warheads” in antibody drug conjugates (ADCs). Prior examples of this usage of very potent natural products are discussed in a later section on dolastatins (cf. the “Dolastatins: Potent Marine-Sourced Peptides Leading to Drugs Against Cancer” section).

A take-home lesson from this section is that “one size does not fit all,” but the real lesson from a natural product aspect is that the sponge (and possibly significant numbers of other marine invertebrate genera) is/are simply a “host” for the microbes that “do the work in producing the bioactive agents.” As a result, drug development from such sources is certainly going to involve some mixture of genomic precursor(s) in cultivatable microbes, coupled to chemical modification of a precursor. This is exactly the case in the production of artemisinin, where artemisinic acid is produced as a precursor and then converted into different molecules by chemical processes. This will also be covered in a later section (cf. the “Artemisinin; Production via Biochemical Engineering” section).

Antitumor Agents

Dolastatins: Potent Marine-Sourced Peptides Leading to Drugs Against Cancer

This class of compounds, first reported by the Pettit group at Arizona State University in the 1980s following isolation in minuscule quantities from the Indian Ocean sea hare Dolabella auricularia, is one of the initial examples of how synthetic chemistry and early NMR techniques/HPLC were absolutely necessary for compounds to be assigned their absolute structures.

The initial work of isolation covered many years and literally tons of the nominal producer, collected mainly in the Indian Ocean. The initial isolates exhibited an ED50 of 46 pg/mL in in vitro assays against the murine leukemia P388 cell line. Using the same tumor in in vivo studies in mice, dolastatin 10 demonstrated protective activity at a dose close to 20 µg/kg. Its flat structure was elucidated by NMR and MS studies in 1987 [34].

The base structure had five amino acids in a linear array. The N-terminus (P1) was N,N-dimethylvaline (dolavaline; Dov), followed by valine (at P2), and then three new amino acids, dolaisoleuine (Dil, P3), dolaproine (Dap, P4), and dolaphenine (Doe) at the C-terminus (P5). However, due to the lack of any stereochemical information, the only valid method in the late 1980s was total synthesis, identifying each center as the synthesis proceeded. Following this strategy, the absolute configuration was published in 1989 [35] with a companion US patent issued in 1990 [36]. Subsequent work with collaborators demonstrated that dolastatin 10 (Fig. 2; 12) and analogues functioned as antitubulin agents binding at the vinca alkaloid site, thus formally functioning in the same manner as the natural products vinblastine, maytansine, and phomopsin. The story up through 1995 was covered by Pettit in a 1997 monograph in Progress in the Chemistry of Organic Natural Products, which gave thorough details of the initial discovery and identification of the dolastatins [37].

Fig. 2 Structures 12–23

Actual Producer of the Dolastatins

In 1998, a report from the Moore laboratory at the University of Hawaii, working in conjunction with other marine natural product chemists, demonstrated that dolastatins, plus very close relatives, were not produced by the host invertebrate that Pettit had used but were in fact products of a free-living cyanophyte, and thus they were prokaryotic products. The first compound reported in 1998 was symplostatin 1 (Fig. 2; 13), which was isolated from the cyanophyte Symploca hydnoides by Harrigan et al. [38]. Two years later, this was followed by the report of the isolation of dolastatin 10 (Fig. 2; 12) from another cyanophyte, Symploca sp. VP642, collected in Guam [39], following observations by marine scientists of Dolabella species feeding on that cyanophyte. Subsequently, the Luesch laboratory and collaborators found both of these compounds (symplostatin 1 and dolastatin 10) when multiple Symploca strains were fermented in laboratory settings [40]. Later, the dolastatin 10 producing cyanophyte was taxonomically reclassified and assigned to a new genus/species, Caldora penicillata [41].

Clinical Trials of Dolastatin and Analogues

Dolastatin 10 (Fig. 2; 12) entered human clinical trials, progressing to phase II, but was dropped in 2010 due to toxicity. This compound was subsequently the structural basis of three very potent agents that progressed to phase II clinical trials: cemadotin (Fig. 2; 14), synthadotin (Fig. 2; 15) (also known as ILX-651 and LU-223651), and auristatin PE (Fig. 2; 16) (also known as soblidotin, TZT-1027, and YHI-201). Though they failed as single agents, auristatin PE (Fig. 2; 16) can be seen to be the progenitor of what a few years later became known as MMAE (Fig. 2; 17) and MMAF (Fig. 2; 18) when slightly modified as warheads for ADCs, directed initially against leukemias (cf. following section).

Dolastatin Derivatives as Warheads for Antibody–Drug-Conjugate MMAE/F

In 2002, Seattle Genetics scientists were granted a base US patent [42] covering a series of new pentapeptides based upon the earlier Pettit molecules. The patent included two molecules based on the auristatin E (Fig. 2; 19) nucleus: monomethylauristatin E (MMAE) (Fig. 2; 17), trade name vedotin, and a close relative, monomethylauristatin F (MMAF) (Fig. 2; 18), trade name mafodotin. Details were subsequently published in 2003 by Doronina et al. [43] covering the initial development of potent ADCs using these “warheads.”

The first ADC based upon MMAE, brentuximab vedotin, was approved by the FDA in 2011. This ADC (Fig. 2; 20) linked MMAE via a cleavable maleimide-based linker to the monoclonal antibody (brentuximab; cAC10). That antibody was directed against CD30, a cell membrane epitope that is present in Hodgkin’s lymphoma and anaplastic large cell lymphoma. Following its initial approval in 2011, it was subsequently approved for lymphomas by relevant agencies in other countries.

Following on from this molecule, a significant number of pharma companies signed licensing agreements of one type or another with Seattle Genetics, now known from the middle of 2020 as Seagen. As of late August 2022, 14 ADCs have received approval worldwide [44], of which four utilize MMAE (Adcetris®, Polivy®, Padcev®, and Aidixi®) and one utilizes MMAF (Blenrep®). As a result, one can see the highly significant biological effects that these derivatives of the linear dolastatins have had on antitumor chemotherapy in the last 10–15 years. The impact of these natural products continues today with a significant number of ADCs that are in the early phases of clinical development and use variations on the original MMAE/F.

Further Chemical Modifications of the Vedotins

The initial findings by the Pettit laboratory in the 1970s, the realization that the dolastatins were bacterial in origin, and the significant abilities of many medicinal chemists worldwide, commencing with the initial synthetic work by Singh in the Pettit laboratory [45], have together yielded agents that, when used as “warheads,” revolutionized the treatment initially of lymphomas and now of other cancers.

These arose as medicinal chemists in large and small pharmaceutical companies began modifications of the base molecules, aiming to discover patentable molecules with similar activities. The following structures, in no particular order, are those reported in the ADC literature as of early 2021 linked to a variety of proprietary monoclonal antibodies. These ADCs are designed to treat different tumor types and are at various stages of preclinical and clinical trials in countries worldwide: duostatin 5 (Fig. 2; 21), amberstatin-269 (Fig. 2; 22), auristatin 0101 (Fig. 2; 23), N-demethyl-N-[4-(6-maleimidohexano-hydrazido)-4-oxobutyl]auristatin W amide (Fig. 3; 24), BAY 1168650 (auristatin W derivative) (Fig. 3; 25), AGL-0182-30 (Fig. 3; 26), the basic structure for the mAb-warhead under the name ZW49 (Fig. 3; 27), and SHR152852 (Fig. 3; 28), the MMAE variant that is the warhead of SHR-A1403 [46, 47].

Fig. 3 Structures 24–28

A review by Cheng-Sanchez et al. [48] was recently published in the open access journal Marine Drugs and contains excellent structural diagrams of further ADCs at all levels of development that have utilized molecules based on the original dolastatin structure, including the derivatives listed above plus a number of others at earlier stages of development. This review paper, together with the 2022 review by Singh [45], is recommended reading for scientists not involved in this type of research. Together they demonstrate how, from a relatively simple structure of 5 amino acids produced by a cyanophyte, multiple drug entities are either under development or have been approved as antitumor agents.

Non-Dolastatin-Based Molecules as ADC Warheads

Although the dolastatins and derivatives were the centerpiece of the preceding section, many other natural product-derived antitumor agents are also being utilized as warheads, as shown by the number of approved agents listed in the 2022 paper by Fu et al. [44]. ADCs with warheads derived from other natural sources are now in use alongside the dolastatin derivatives. Their structures are based upon maytansine and pyrrolobenzodiazepines (which are both microbial in origin), and camptothecin derivatives which might also have a microbial component in their biosynthesis. All are well known from the 1980s or earlier; thus, old agents have new leases on life.

Anthracyclines

These agents were some of the earliest natural-product-derived compounds that became human-use antitumor agents. They are still being investigated, not only as potential leads to novel agents based upon the initial findings many years ago but also for the potential of the base molecules for further bioengineering, as shown by the 2022 review by Hulst et al. in Natural Product Reports [49].

The initial compounds are now known to be aromatic type II polyketides, assembled by sequential condensation of acyl-CoA units. This type of genetic construct covers molecules that now fall into natural product structural classes that include the anthracyclines, angucyclines, aureolic acids, tetracenomycins, tetracyclines, and other (unlisted) base structures, but only the anthracyclines will be covered here.

The first reported anthracyclines were the rhodomycins (considered as potential antibiotics in 1950). Subsequent molecules were reported to show antibiotic activities, but their major and current usage is as antitumor agents. It may well be that the two best known anthracyclines in the scientific and lay press are the very close structural relatives daunorubicin (Fig. 4; 29) [50] and doxorubicin (Fig. 4; 30). In that era, genetic modification of the daunorubicin-producing bacterium to yield the more potent derivative doxorubicin (Fig. 4; 30) [51] required chemical and physical mutagenesis techniques, not the specific genome modification that would be performed today. Inspection of the chemical structures of those two agents and their four close chemical relatives, rhodomycin B (Fig. 4; 31), nogalamycin (Fig. 4; 32), aclacinomycin (Fig. 4; 33), and steffimycin (Fig. 4; 34), demonstrates their common tetracyclic moiety, with the majority of their chemical diversity being due to carbohydrate tailoring enzymes.

Fig. 4 Structures 29–39

Of the numerous anthracyclines reported over the years, five semi-synthetic derivatives of daunorubicin (Fig. 4; 29) are in current clinical use: doxorubicin (Fig. 4; 30), epirubicin (Fig. 4; 35), idarubicin (Fig. 4; 36), pirarubicin (Fig. 4; 37), and valrubicin (Fig. 4; 38). Finally, the totally synthetic amrubicin (Fig. 4; 39), containing a minimal version of daunosamine, is used in Japan. A common problem of all of these molecules is their innate cardiotoxicity; thus, there is a clinical limitation as to the number of treatment courses, often based upon the age of the patient [52]. As mentioned earlier in this discussion [49], there are ongoing investigations involving novel agents related to these earlier structures; but to date, none of the more recent compounds referenced in that review [49] have become approved drugs, though work continues on using genomic searching and modification by “mixing and matching” gene clusters.

Anthracyclines as Warheads (ADC and Lipid Linked)

Anthracyclines were among the initial molecules considered as available “warheads,” with the aim of overcoming the cardiotoxicity mentioned above by linking to lipid carriers. For treating breast cancer, a pegylated-lipid version was approved in various countries [53], with a non-pegylated version approved for HER2-negative breast cancer. In a 2021 paper, Schettini et al. [54] demonstrated that a non-pegylated liposomal doxorubicin could also be a valid treatment under various clinical scenarios in breast cancer treatment. It must be pointed out, however, that to date no anthracycline linked to a monoclonal antibody (ADC) has been approved as a drug.

From Synthetic Chemistry to Candidates and Drugs

Vancomycin and Chemically Modified Natural Product Glycopeptide Antibiotics, Non-cGMP

The glycopeptidic antibiotic vancomycin (Fig. 5; 40) was frequently referred to in its early usage as the “antibiotic(s) of last resort.” This term was also used with its later close chemical relatives. This soubriquet was due initially to “its/their” use when simpler antibiotics, usually orally active, were ineffective against a virulent Gram-positive infection due to mutations in the bacterium. Adding to the problem, the vancomycin class antibiotics have to be given intravenously, requiring close monitoring due to toxicity.

Fig. 5 Structures 40–42

Vancomycin was first introduced in the middle to late 1950s by the pharmaceutical company Lilly, with its fully defined structure published in 1982, when asparagine was confirmed as the final amino acid within the peptide backbone [55].

Resistance to vancomycin was reported early on in its use, but the cause of the resistance was not known at the time, since it was reported well before information was available as to vancomycin binding sites, later known to be due to changes in bacterial cell walls in the resistant bacteria. The information became known (middle to late 1970s) that vancomycin together with its later naturally occurring and/or semi-synthetic “chemical cousins” all bound to the “L-Lys-D-Ala-D-Ala-CO2H” terminal sequences in the cross-links in the Gram-positive cell walls. As a result of such binding, bacterial cell wall growth was inhibited, stopping the growth of the bacterium.

Microbes resistant to these glycopeptides exhibited a series of different responses now known to be linked to changes in their terminal cell wall amino acid moieties. The initial resistant phenotype (VanR) occurred due to a simple change of the terminal D-alanine (D-Ala-CO2H) to D-lactate in the cases of the clinical phenotypes vanA, vanB, and vanD. The terminal residue in the other three clinical phenotypes, vanC, vanE, and vanG, changed to “D-Ser-CO2H.” The vancomycin resistance level due to the D-Lac modification was approximately 1000-fold, and that due to the D-Ser modification approximately 140-fold, compared to nonresistant strains of the same bacterium.

Then came reports that the Gram-positive microbe S. aureus with methicillin resistance (MRSA) had strains that were also becoming resistant to vancomycin. The widespread hypothesis that vancomycin resistance was due to the use of glycopeptide antibiotics in animal feeds was shown to be inaccurate by two reports, both from the Wright group in Canada, first in 2011 by D’Costa et al. [56] and then a year later [57]. These reports demonstrated that microbes isolated from deep core samples taken from Yukon ice fields > 10,000 years old showed resistance phenotypes similar to those of current MRSA strains.

It was not until 2009 that the next glycopeptide antibiotic telavancin, a close structural relative of vancomycin, was approved by the US FDA. Then in 2014 the FDA approved two more “glycopeptide antibiotics,” dalbavancin, derived from part of the known A40926 complex, and oritavancin, derived from chloroeremomycin. In all these cases, synthetic organic chemists made the modifications that led to the approved products starting from the base natural product(s).

Though not approved (to date) in the USA as it is not a single agent but a mixture of closely related compounds, teicoplanin has been used in Europe for a number of years. Interestingly, some VanR phenotypes are not resistant to this mixture though in general most strains are resistant to all. Structures of these agents are not given in this review.

Synthetic Modifications of the Vancomycin Chemical Skeleton

The synthetic chemistry group at the forefront of initially semi-synthetic and then total synthetic work with the vancomycin chemical skeleton is the one led by Boger at the Scripps Research Institute in La Jolla, CA. Over the last 10 to 15 years, this group has published significant synthetic chemistry papers demonstrating at first how “simple changes,” viz. a change at one position within the peptide backbone, achieved through ingenious thought processes and sophisticated synthetic chemistry, yielded a series of related vancomycin molecules with very significant antibiotic activities against resistant MRSA and E. faecalis (both VanA and VanB phenotypes).

The Boger lab then extended their synthetic chemistry to involve other parts of the base molecule by linking relatively small “pharmacologically active molecular parts” from other glycopeptides in clinical use. The subsequent compound (Fig. 5; 41), redrawn from the 2017 paper by Okano et al. [58], demonstrates the substitutions used. The compilation of structural changes and their corresponding MIC tables presented in that paper demonstrates how careful structural modifications, with microbiological analyses at each stage, were capable of converting vancomycin, to which E. faecalis and E. faecium were totally resistant (MICs of vancomycin ≥ 250 µg.mL−1), into molecules with MICs from 5 to 0.005 µg.mL−1 against these VanA/E-resistant microbes.
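
As a rough check on the size of that change, the fold improvement implied by the quoted MIC values can be worked out directly. The short Python sketch below is illustrative only and is not taken from Okano et al. [58]; the function name fold_improvement and the constant names are hypothetical, and because the vancomycin value is quoted as a lower bound (≥ 250 µg.mL−1), the results should be read as minimum estimates.

```python
# Illustrative arithmetic only: fold improvement in MIC relative to vancomycin,
# using just the values quoted in the text above (all in µg/mL).
VANCOMYCIN_MIC = 250.0        # quoted as a lower bound (>= 250), so results are minimums
ANALOGUE_MICS = (5.0, 0.005)  # the two ends of the reported range for the analogues


def fold_improvement(parent_mic: float, analogue_mic: float) -> float:
    """Return how many times lower the analogue MIC is than the parent MIC."""
    if parent_mic <= 0 or analogue_mic <= 0:
        raise ValueError("MIC values must be positive")
    return parent_mic / analogue_mic


for mic in ANALOGUE_MICS:
    fold = fold_improvement(VANCOMYCIN_MIC, mic)
    print(f"analogue MIC {mic} -> at least {fold:,.0f}-fold lower than vancomycin")
```

On these figures, the reported range corresponds to at least a 50-fold and up to at least a 50,000-fold reduction in MIC relative to the parent compound.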

In 2020, the same group reported their results obtained by substituting different guanidino modifications on the C-terminus of the internal vancomycin peptide chain. These modifications further improved the antimicrobial activity. An unexpected synergistic mechanism of action independent of D-Ala-D-Ala was also reported [59]. Here, using structure (Fig. 5; 42) as the base, they added the (4-chlorobiphenyl)methyl (CBP) modification, since earlier work suggested that this modification might give significant increases in activity. Going back to structure (Fig. 5; 42), when X = O and R = a variety of guanidino substituents, a series of modified vancomycins were synthesized exhibiting sub-microgram activities (using their reported MIC levels) against relevant vancomycin-resistant clinical specimens.

I use the following direct quote from that paper [59]: “a prototypical member of the series, G3-CBP-vancomycin (15) exhibits no hemolytic activity, displays no mammalian cell growth inhibition, possesses improved and especially attractive in vivo pharmacokinetic (PK) properties, and displays excellent in vivo efficacy and potency against an especially challenging multidrug-resistant (MRSA) and VanA vancomycin-resistant (VRSA) Staphylococcus aureus bacterial strain.” The structure (15) mentioned in the direct quote above from Wu et al. [59] is shown as CBP-G3 in (Fig. 5; 42).

In addition to the papers above, another recent communication from the Boger lab [60] provides an excellent précis of the modified vancomycin derivatives mentioned above, together with other modified glycopeptides from the Boger lab. They comment that these agents are now known in that laboratory as “maxamycins.” This paper, plus one directed to the synthetic chemistry processes used [61], are worth reading to gain insight into what may occur when current chemical processes are applied to a microbial product that first saw “light of day” in the middle to late 1950s, with the desire to overcome microbial resistance to the original compound. Another report from 2020, but this time published in a synthetic chemistry journal, is well worth reading since there is significant discussion about other microbiological changes that occurred when trimethylammonium cations are present as part of the final structure (not shown) [61].

From Synthetic Chemistry to Drugs and Candidates Under cGMP Conditions

Introduction

As shown above, it has become obvious that experienced synthetic chemists can utilize sophisticated chemical processes to modify and/or produce de novo molecules whose structural genesis was a bioactive natural product. What is also extremely important in this synthetic process is the absolute requirement to demonstrate the ability to produce the desired agent from total or semi-synthesis under current good manufacturing practice (cGMP) and on a scale large enough to provide the agent for preclinical and clinical development and then produce the approved drug.

It is one thing to complete an academic synthesis of a novel natural product or variant on the base structure in a synthetic chemistry laboratory with up to 60-plus steps, giving a final quantity in the range of 1 to 10 mg. However, using such a process to produce the quantities needed for even preclinical and clinical trials requires an entirely different “type of chemist,” one who is experienced in process development and scale-up, first under GLP (good laboratory practice) for use in late preclinical trials, particularly the 2-year toxicity testing required, and then under cGMP for clinical trials, and who can then, if the trials are successful, produce kilograms of the drug compound, still under cGMP.
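A rough sketch may help show why a long linear synthesis delivers so little material: per-step yields multiply, so even good individual steps compound to a very small overall yield. The 90% per-step figure and the function name below are illustrative assumptions only, not values taken from any synthesis discussed in this review.

```python
def overall_yield(step_yields):
    """Overall fractional yield of a linear synthesis (product of per-step yields)."""
    total = 1.0
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each step yield must be a fraction between 0 and 1")
        total *= y
    return total


# Hypothetical case: 60 linear steps, each assumed to proceed in 90% yield.
sixty_steps = overall_yield([0.90] * 60)
print(f"Overall yield after 60 steps: {sixty_steps:.3%}")
```

Under these assumed numbers, well under 1% of the starting material survives to the final product, which is one reason why process chemists aim to shorten and converge such routes before scale-up.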

Halichondrins: Discovery to Approved Synthetic Drugs and Candidates

Chemists have synthesized derivatives of base molecules, but prior to this work, no drug entity that was totally synthetic in nature, with a molecular weight above 700 D and a complex cyclic structure, had been produced in bulk as the drug.

The story of the halichondrin B derivative eribulin, MW 730 D (Fig. 6; 43), is the prime example of how a totally synthetic compound based upon a very potent marine-derived antitumor compound became an approved drug, initially in the USA and then in other countries. The reports by Uemura et al. in 1985 [62] and Hirata and Uemura in 1986 [63] of the isolation of the halichondrins from the Japanese sponge Halichondria okadai “scooped” the Pettit group working in the central Pacific and the Blunt and Munro group working in New Zealand waters, both using different sponge sources. The very small amounts of halichondrin B obtained probably also contained okadaic acid, since the sponge sources all appeared from later studies to co-produce this agent. The Pettit group provided a small amount of halichondrin B to NCI, which demonstrated that it inhibited tubulin using data from direct binding assays, plus the use of the then novel NCI60 cell line patterning data [64]. Thirteen years later, Jordan and Wilson demonstrated the complex interaction of halichondrin B and other derivatives with tubulin [65].

Fig. 6. Structures 43–45

In the mid-1980s, NCI had funded a pure research grant (an R01) to the Kishi group at Harvard for the synthesis of halichondrin B (Fig. 6; 44), MW 1111 D. In early 1992, the Kishi group published a 60-step synthesis of halichondrin B with a very low overall yield [66]. Since NCI had decided early in 1992, before the publication cited above, to further develop this class of compounds as potential antitumor agents, it needed to source these compounds, in particular halichondrin B.

Cutting a long story short, NCI funded the recovery of 1 metric ton of the sponge Lissodendoryx sp. from 200 m depth off the east coast of South Island, New Zealand and financed the extraction and purification of 330 mg of pure halichondrin B from the 1 metric ton of sponge. This sponge was the source of the material that the Blunt and Munro group had used in their earlier studies on the halichondrins, and since it was in NZ waters, the NZ government could give permission for the effort.
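To put that recovery in perspective, the arithmetic can be sketched as follows. The calculation uses only the two figures quoted above (330 mg isolated from 1 metric ton of sponge) and assumes that the 1 metric ton refers to the mass of sponge as collected; the function name is hypothetical.

```python
def mass_fraction(isolated_mg, source_kg):
    """Mass of isolated compound per mass of source material (dimensionless)."""
    source_mg = source_kg * 1_000_000  # 1 kg = 1,000,000 mg
    return isolated_mg / source_mg


# 330 mg of halichondrin B from 1 metric ton (1000 kg) of sponge, as quoted above.
fraction = mass_fraction(isolated_mg=330, source_kg=1000)
ppm = fraction * 1_000_000  # parts per million by mass
print(f"Mass fraction: {fraction:.1e} (about {ppm:.2f} ppm)")
```

At roughly a third of a part per million by mass, wild harvest could never have supplied clinical quantities, which is why a synthetic source became essential.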

What was not known in the middle 1990s was that Kishi had started working with Eisai’s American Institute to scale up his earlier synthesis and to check at each iteration the biological activity demonstrated, with adjustments made to the next synthetic step depending upon the biological results seen from the earlier molecule. Over the next few years, Eisai synthesized over 200 derivatives, learned that the major “active principle” was maintenance of the right-hand ring, and licensed the Kishi patents.

Following work presented (as a poster) at a US AACR scientific meeting by the NCI scientists using the NZ-sourced material, Eisai and NCI established a scientific linkage whereby NCI tested the two most active Eisai compounds related to halichondrin B in both in vitro and in vivo assays against the pure NZ-halichondrin B; none of these samples contained okadaic acid. The two Eisai compounds proved to be superior in both sets of experiments, and following an agreement with Eisai, their most active compound was scaled up under cGMP conditions, with 9 g provided to NCI for preclinical and phase I clinical workup. Following extensive clinical trials, the modified halichondrin derivative known as eribulin (Fig. 6; 43) was approved by the US FDA in 2010 as Halaven®.

At that time, this compound was the most complex totally synthetic molecule yet approved anywhere as a drug. The process chemists at Eisai now have the synthesis trimmed to 17 crystallizations and one HPLC, producing kilogram quantities of the cGMP drug substance. Eribulin is also being investigated as an ADC warhead under the code name MORAb-202, with initial preclinical details published by Cheng et al. in 2018 [67] and a positive report on the first phase I clinical trial published in 2021 by Shimizu et al. [68]. In 2022, a thorough analysis of both eribulin and other marine-sourced warheads was published in the journal Marine Drugs [48].

In 2019, Eisai published their total synthesis of halichondrin B (Fig. 6; 44) and of a derivative named E7130 (Fig. 6; 45) with a MW of 1066 D, producing 10 g of cGMP material [69]; E7130 is currently in phase I clinical trials in Japan at 9 sites for patients with solid tumors. Thus, synthetic organic chemists now have the proven capability to produce, under cGMP conditions, quantities of high molecular weight structures based upon Mother Nature’s chemistry.

Artemisinin and Derivatives

Artemisinin: Initial Discovery

The discovery of artemisinin as a potential treatment for malaria came from consulting commentaries written more than 2500 years ago in traditional Chinese medicine (TCM) records. The initial data series on the discovery and development of this agent came from You-You Tu’s Nobel lecture in 2016 [70] coupled to her later paper in 2019 [71]. The early records covering fevers treated by TCM almost certainly included descriptions of malaria as noted by Diouf et al. in 2014 [72]. These TCM reports dated back some 2000-plus years, with significant outbreaks in southern China and the Indochina peninsula. Contemporary TCM records referred to the use of extracts from plants of the genus Artemisia (Chinese Qinghao) as medicinal agents.

These Artemisia extracts were mentioned initially as a specific remedy for what would now be recognized as malarial symptoms in Ge Hong’s Zhouhou Beiji Fang (or Handbook of Prescriptions for Emergency) dating back to the Eastern Jin Dynasty (317–420 CE). Later works such as the Bencao Gangmu (Compendium of Materia Medica) by Li Shizhen (Ming Dynasty, 1368–1644 CE) recommended the usage of “Qinghao and other techniques” for relief of fevers that were possibly malarial symptoms.

It was not until Laveran’s 1880 discovery of malarial parasites, Ross’ discovery in 1897 that mosquitoes were the vectors for avian malaria, and the demonstration of the comparable mosquito involvement in human malaria by Battista et al. in 1900 [73] that the true cause(s) of malaria were identified.

The most significant “point” from a chemical perspective in Tu’s reports was her realization that use of “normal” TCM methodology gave variable results. On investigating older literature (back to roughly 500 years BCE) for reports of the use of TCM for “fevers,” she realized that lukewarm water extracts were the recommended method for these extractions. Such modifications consistently yielded artemisinin (Fig. 7; 46), and she later avoided heat-induced loss altogether by using ether as the extractant. She also demonstrated that of the six Artemisia species grown in China (listed in the early TCM records), only Artemisia annua contained artemisinin.

Fig. 7. Structures 46–50

The definitive structure of artemisinin (Fig. 7; 46) demonstrates the source of her variability problem, since the essential internal peroxide bridge is heat-labile and decomposes above ~ 45 °C. Further information as to her role and earlier work by Chinese investigators who, over the years, had worked with Professor Tu was published in 2017 by Liu [74] in between the two citations [70, 71] above. This paper is an excellent historical resource for Chinese investigators; however, a significant proportion of the references listed are not easily available to Western scientists as they are in Chinese journals and written in Mandarin with no current translated versions available.

Artemisinin: Production Via Biochemical Engineering

Once artemisinin demonstrated its clinical superiority over the then current anti-malarial treatments, most of which had been “loosely based” upon the chemical structure of quinine, the requirement for large amounts of the initial agent became a major problem that had to be overcome. The plant takes around 6 months to grow in fields, and the harvesting/extraction procedure destroys it; therefore, one requires large successive plantation growth and/or methods to produce either the final compound or a suitable intermediate precursor. The large plantation system was tried under the auspices of the US Army in the 1980s in the southern USA, but it still required significant investment as yields were low and costs were high, particularly when compared to the then conventional antimalarials.

In 2004, the Gates Foundation funded initial work using the current biotechnological methods by the Keasling group at Berkeley. This university-based operation then developed into the commercial spin-off, Amyris. Amyris devised a process using Saccharomyces cerevisiae, which had been bioengineered to contain and subsequently express the necessary early synthetic genes from A. annua in order to produce large quantities of the essential intermediate artemisinic acid (Fig. 7; 47). Subsequently, a semi-synthetic chemical process converted artemisinic acid into artemisinin [75, 76]. This process was licensed to Sanofi for large-scale production with the aim of reducing the cost to $1 US per dose for use mainly by the World Health Organization. Multiple companies are now involved in optimizing production of artemisinic acid and/or investigating conversion to artemisinin, with a 2018 paper from Amyris reporting on further expansion of methodologies [77]. As a marker of commercial interest in artemisinin and its variations, the 2020 paper by Liu et al. [78] lists the substantial number of patents awarded since 1968 that are related to these compounds, amply confirming the value of Professor Tu’s initial discovery.

Properties of Chemically Modified Artemisinin Derivatives

Stereochemistry of Artemisinin

The modification of artemisinin in order to overcome pharmacological flaws (particularly solubility issues) by groups in the West occurred almost from the beginning. There was no doubt that organizations in China were performing similar studies, but those details were not available at the time in non-Mandarin literature. Since artemisinin is a chiral molecule, there was the question as to whether its antiparasitic activity was stereospecific.

The isolated molecule is the (+) antipode; therefore, the (−) antipode needed to be synthesized and tested alongside its natural partner to determine any stereospecificity in its known activities. Although there were theoretical synthetic routes to the (−) isomer, it was in 2018 that Krieger et al. [79] described its synthesis (Fig. 7; 48). That report noted that stereochemistry had no effect on antimalarial activity. This was unexpected, as most interactions with protein targets are stereospecific, with the presence of the wrong antipode frequently acting as an inhibitor. They also commented on reports that (+) artemisinin binds to a significant number of different plasmodial proteins. These comments cast doubt upon the hypothesis that artemisinin bound only to a specific plasmodial protein, plus they demonstrated that both antipodes, when tested against the P. falciparum parasite, gave identical results within assay error.

However, when both antipodes were screened for cytotoxic activity against CCRF-CEM human leukemia cells and the corresponding adriamycin-resistant CCRF-CEM cells, contrary to the reports above on interactions with the whole parasite, in both cell lines, the (+) antipode was close to twice as active as the unnatural (−) antipode.

What is significant from a chemical and pharmacological perspective is that the synthetic pathways leading to both antipodes can be easily modified, producing previously unknown/unreported compounds. Such “unnatural compounds” may well lead to molecules that are active against resistant parasites and perhaps certain tumor cells.

Antitumor Activities of First-Generation ART Derivatives

Prior to 2004, there was some evidence that the metabolite dihydroartemisinin (Fig. 7; 49) might be a lead to a treatment for lupus erythematosus-related nephritis. Its mechanism of action involved inhibition of the production of anti-double-stranded-DNA antibodies, which led to concomitant inhibition of secretion of TNF-α and affected the NF-κB pathway (Li et al. [80]). Though the Li report was published in 2006, the work was performed earlier, as described by Efferth in a 2017 review [81].

In that 2017 review, Efferth provided a series of tables giving the published data (up through 2016) on the pharmacology of artemisinin (Fig. 7; 46), dihydroartemisinin (Fig. 7; 49), and artesunate/artesunic acid (Fig. 7; 50). The latter compound (also a known metabolite of artemisinin) was synthesized early in the investigations of artemisinin, designed to overcome some of the metabolic problems shown with the pure natural product.

As Efferth mentioned [81], these compounds covered a wide range of potential interactions with mammalian cancer cells including but not limited to induction of DNA lesions, oxidative stress, arrest of the cell cycle, antiangiogenesis, induction of various modes of programmed cell death, and interactions with essential signal transduction processes.

A check of the NIH Clinical Trials database (www.clinicaltrials.gov) at the end of September 2022 found 149 listed studies when using “artemisinins” as the search parameter, but only 3 had any link to neoplasm treatment. Of these, two were in the USA, a phase I (NCT03100045) and a phase II (NCT04098744), with the third at the phase I level in Germany (NCT00764036). It is possible that there are other trials ongoing in China not currently listed in the NIH database, although in earlier years some Chinese trials had been listed in that database.

Further Bioactivities of Amino-Artemisinin Derivatives

Metabolism in humans of the first-generation artemisinin-based compounds produced dihydroartemisinin (Fig. 7; 49) as the initial metabolite. If, however, a molecule lacked the oxygen substituent at position C10, this metabolite would be avoided. In 2006, Haynes et al. [82] reported that an artemisinin derivative they synthesized with a cyclic nitrogen-containing substituent at the C10 position (artemisone, Fig. 8; 51) was active in vivo as an antimalarial.

Fig. 8. Structures 51–55

Modifying artemisone (Fig. 8; 51) by addition of a piperazine substituent (Fig. 8; 52) and then using it as the base structure, they produced a series of ferrocene-linked molecules (Fig. 8; 53–55) differing by the type of linkages to the ferrocene ring. Xiao et al., in a review in 2020 [83], published and commented not only on the artemisinin-related compounds but also on other antimalarial agents that have been linked to this aromatic Fe-containing moiety.

When tested against three P. falciparum (Pf) strains (two resistant and one sensitive), these ferrocene-containing molecules demonstrated excellent selectivity against their asexual blood stages. There was also some selectivity demonstrated when assaying them against human normal and tumor cells. These compounds require further refinement but aptly demonstrate that such molecules may well have medicinal potential [84, 85], with more examples discussed in the next section.

Artemisinin-Based Compounds with Antiviral and Other Antiparasitic Activities

Results from early chemical modifications demonstrated that artemisinin-derived molecules exhibited a multiplicity of biological activities in addition to those mentioned above. These included activity against a variety of viruses and against non-malarial parasitic diseases such as toxoplasmosis [86], with an extension to potential anti-leukemia activities. Though many variations on the base molecule have been reported, only a select few are commented on below, and without formal structures.

In a review in 2014, Ho et al. [87] produced a chart of reports from 1980 to early 2013 of these “other activities of artemisinin-derivatives.” In the initial pages of that review, another chart showed that approximately 30–40% of published reports from as early as 1982 extended artemisinin(s) into pharmacological areas other than antimalarials.

From the aspect of antiviral activity, there is reasonable in vitro evidence on the activity of artemisinin and some of its derivatives against DNA viruses of the Herpesviridae and Hepadnaviridae, including human herpes virus 6, herpes simplex viruses 1 and 2, Epstein–Barr virus, and Hepatitis B virus. However, weaker activities were reported against polyomaviruses and papilloma viruses, and little to no inhibitory activity in vitro was listed for RNA viruses such as HIV 1 and 2, hepatitis C, and influenza. More information as to further potential antiviral activities was also given in 2018 in a review by Efferth commenting on the antiviral effects of varied “first generation” artemisinin derivatives [88].

Artemisinin Dimers and Trimers

Relatively recent reports have demonstrated how “nominally simple” chemical modifications (production of dimers and trimers of artemisinin) could yield compounds with antimalarial and other biological activities. In 2018, a paper by Frolich et al. [89] provided evidence of the potential of such molecules if linked to rigid beads and used to “catch” the biological target(s). Once “caught,” identification of the “catch” by mass spectroscopic techniques followed. Though this was not a new technique, as shown by comments in the 2018 paper by Laraia et al. [90], it is one of the ways of directly identifying potential pharmacological targets of a chemical construct and has a long history of use with varying methods of physicochemical identification of the “catch.”

From an antitumor aspect, artemisinin-based dimers and trimers with varying linkers between the molecules have been well described since the late 1990s and demonstrated significant activity against tumor cell lines, though these constructs did not progress further into advanced preclinical studies, as in almost every case there was very significant toxicity against normal cells.

Utilizing clever synthetic chemistry, the Frolich group [89] produced molecules based on linkages to artesunic acid (Fig. 7; 50), and of the thirteen linked compounds described in that paper, the dimer (Fig. 9; 56) and the trimer (Fig. 9; 57) were then used to determine if they bound to any proteins in total lysates of human HCMV-infected fibroblasts. Table 3 in their paper [89] demonstrated the numbers of both cellular proteins and HCMV-related proteins that bound. Whether or not these were all primary targets could not be determined from the initial experiments, but the process can be repeated to further identify the pathway(s) by using timed infection experiments.

Fig. 9. Structures 56–58

BAD: a Human Target of Artesunate

In 2019, Gotsbacher et al. at Macquarie University in Sydney, Australia reported an unusual and unanticipated human target for artesunate (Fig. 7; 50) from use of reverse chemical proteomics [91]. Using a phage display system based on a bacteriophage T7 vector, which permitted an unbiased interrogation of several human cDNA libraries, they demonstrated that one probable human target of artesunate (Fig. 7; 50) was the cell death promoter BAD. Under their experimental conditions, artesunate (Fig. 7; 50) inhibited the phosphorylation of BAD, thereby promoting the formation of the proapoptotic BAD/Bcl-xL complex. The unexpected role of BAD might be a route into clinical exploitation of artemisinin derivatives acting on the Bcl-xL life/death switch. The cytotoxicity of artesunate (Fig. 7; 50) can be abrogated in HeLa cells if BAD activity is knocked down by siBAD. The 2012 paper by Watts and Corey [92] discussed in both general and specific terms methods to prove/disprove specific interactions by this technique.

Their data suggested that binding of artesunate was required for its demonstrated apoptotic effect and that the observed antitumor activity may well be independent of reactive oxygen species. Following cell treatment with the Abbott drug candidate ABT-737 (Fig. 9; 58), a known BH3 mimic that binds to various components of the Bcl-cascade, synergistic activity was seen in HeLa cells in the presence of artesunate. Since HeLa cells are constitutively resistant to ABT-737 due to high intrinsic levels of Mcl-1, this work opens up a whole new area for studying the interactions of artemisinin-derived compounds with human cell lines.

Synthetic Peroxy Compounds with “an Unexpected Result in Cell Assays”

Following the initial report of the peroxy bridge structure of artemisinin, synthetic chemists, in addition to exploring the features of this natural product, began to synthesize potential compounds that contained either that specific peroxy-motif or extended variations of it as potential antimalarial agents. These experiments led to a series of compounds, for example the trioxolane OZ277 (Fig. 10; 59) based upon the 1,2,4-trioxane pharmacophore (Fig. 10; 60) in artemisinin, plus others such as RKA182 (Fig. 10; 61) based on a tetraoxane ring system.

Fig. 10. Structures 59–63

In 2018, Coghi et al. [93] reported their current work from utilizing such synthetic systems. They evaluated a set of peroxides including bridged 1,2,4,5-tetraoxanes, bridged 1,2,4-trioxolanes, and tricyclic monoperoxides for their in vitro antimalarial activity against P. falciparum 3D7, antitumor activities in two human tumor cell lines HepG2 and A549, and in the non-tumor cell lines hepatic LO2 and bronchial BEAS-2B, with the normal human fibroblast line CCD19Lu used as toxicity controls.

The 26 compounds were composed of 11 tetraoxanes, 6 trioxolanes (3 sets of stereoisomers), and 9 monoperoxides, with artemisinin, artesunic acid, chloroquine, and taxol (structures of the latter two not shown) as controls. In this study, the synthetic ozonides exhibited both high cytotoxicity in vitro and selectivity. Thus, ethyl (1R,2R,5S)-2-allyl-1,5-dimethyl-6,7,8-trioxabicyclo[3.2.1]octane-2-carboxylate (Fig. 10; 62) had a selectivity index of ~ 20, and ethyl (1R,2S,5S)-2-hexyl-1,5-dimethyl-6,7,8-trioxabicyclo[3.2.1]octane-2-carboxylate (Fig. 10; 63) a selectivity index of 28 against the HepG2 cancer cell line compared to the LO2 cell line. In contrast, artesunic acid (Fig. 7; 50) demonstrated a selectivity index of only 0.3 against the same cell lines. From an antimalarial aspect, however, all compounds tested were several orders of magnitude less active than the controls.

Interestingly, though designed around what were thought to be active antimalarial pharmacophores using data from a number of laboratories, those two compounds (Fig. 10; 62, 63) turned out to be potential anticancer leads with IC50 values between 360 and 590 nM against the tumor cell line HepG2, hence the implied comment in the title of this section.
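For readers less familiar with the selectivity indices quoted above, a minimal sketch of the usual calculation follows. It assumes the common convention that the index is the IC50 in the non-tumor line divided by the IC50 in the tumor line, which may not match the exact protocol of Coghi et al. [93]; the function name and the IC50 values are hypothetical placeholders, not data from that paper.

```python
def selectivity_index(ic50_normal_nm, ic50_tumor_nm):
    """Ratio of IC50 in a non-tumor cell line to IC50 in a tumor cell line.

    Values above 1 mean the compound is more toxic to the tumor line;
    values below 1 mean it is more toxic to the non-tumor line.
    """
    if ic50_normal_nm <= 0 or ic50_tumor_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_normal_nm / ic50_tumor_nm


# Hypothetical placeholder values (nM), chosen only to illustrate the ratio.
selective = selectivity_index(ic50_normal_nm=10_000, ic50_tumor_nm=500)
non_selective = selectivity_index(ic50_normal_nm=150, ic50_tumor_nm=500)
print(f"selective compound: SI = {selective:.1f}")
print(f"non-selective compound: SI = {non_selective:.1f}")
```

On this convention, an index in the range of 20 to 28 indicates a useful window between tumor and non-tumor cells, whereas an index of 0.3 indicates greater toxicity to the non-tumor line.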

Genomics and Natural Product Drug Discovery

There are many examples in the current literature using genomics to discover/develop agents from natural sources. I will only comment on three aspects in the use of genomics to discover and develop natural product drugs and these are (1) utility of genomics in analyzing TCM medications, (2) genomics coupled to discovery of molecular networks, and (3) the discovery of novel antibiotics.

Utility of Genomics in Analyzing TCM Medications

In a recent critical review by Wang et al. [94] with authors from various PRC metabolomics laboratories in Shanghai, there are excellent figures demonstrating their approach to “deciphering various TCM treatments with the aim to determine their active principle(s).” Table 1 in that review is a relatively short listing of the identified single agent “active natural product” together with the relevant potential cell target. In 2020, the journal Advances in Pharmacology published a requested review by the author covering some of the same topics but not in as much detail, suggesting “modern traditional Chinese medicine” as the name when genomics techniques coupled to new analytical chemical tools could aid in the understanding of classical TCM [95].

Genomics Coupled to Discovery of Molecular Networks

A major problem with natural product derivatives derived from genomic manipulations of microbes, with the “structures” initially identified from mass spectroscopic data coupled to high-resolution chromatographic systems, is how to determine and/or identify any underlying relationships to known compounds and/or close relatives that have been published and/or are in multiple databases in various locations.

One of the major analytical groups in this field is the Dorrestein laboratory at the University of California, San Diego, together with collaborators from many parts of the globe who have aided in the “production” of both identification techniques and compilation of the corresponding open databases. In 2021, there were three significant publications covering these aspects that are effectively “required reading” for anyone needing to utilize these techniques. These papers contain the earlier references to the necessary database systems. The first discusses the systematic classification of metabolites with high resolution mass spectroscopy of fragments [96]. The second covers the integration of genomics and metabolomics to discover and identify non-ribosomal peptides [97]. The third covers the integration of metabolites from use of metabolomics in a natural products environment [98].

As alluded to above, the reference listings, particularly in the third review [98], will permit investigators not familiar with the processes involved to be able to begin to integrate any data that they have produced with what has, or has not, already been reported.

The Discovery of Novel Antibiotics

One of the major groups integrating genomics and metabolomics for the discovery of novel antibiotics is the Müller laboratory at the Helmholtz Institute for Pharmaceutical Research (HIPS) at the University of the Saarland in Germany. Between early 2021 and 2022, that group published three review papers that are effectively required reading when an investigator wishes to use genomic techniques to find and then further develop novel antibiotics from prokaryotic and eukaryotic microbes. The publications are as follows: in 2021, a review on synergizing metabolomics and bacterial genomics appeared in the RSC journal Chemical Science [99]. The same year, Nature Reviews Chemistry published the multiple author review on sustainable discovery and development of novel antibiotics [100], which was then followed by a review in 2022 published in Natural Product Reports looking at inhibitors of bacterial RNA polymerases as a route to novel antibiotics [101].

Conclusion

The examples given above are merely “a taste” of what has been accomplished over the years (predominately from the late 1950s onward) in isolating, identifying, and then developing potential drugs from the base chemical structures of natural products from all sources. Yes, molecules like the beta-lactams (penicillins and cephalosporins), streptomycin, and the early tetracycline antibiotics date from the beginning of WWII to the later 1940s in their initial use, but the majority of natural product-based drugs and their later descendants date from the advent of chromatographic techniques that were reproducible and scalable, though I would exempt aspirin from those comments.

It must also be recognized that, almost without exception, discoveries made using natural product sources are “years in their making.” Two examples from this review would be: (1) the pederine source discovery of microbes that were not fermentable that led to the work by Piel and (2) the discovery that the dolastatins were produced by free-living cyanobacteria. Both discoveries were subsequently utilized, but up to 50 years separated the initial discoveries from drug entities. Many of the reasons for the length of time are listed in the next two paragraphs.

It should be remembered (taught?) from an analytical aspect that the first NMR machines in general use date from the Varian A60 in 1960–1961, which had to be operated by a specialist. This is in contrast to today, where undergraduates are using 60 to 200 MHz solid state machines as part of freshman general chemistry. Likewise, UV and IR machines in 1960 took 30 min to record what were then designated as “high-resolution” spectra on a paper printout. The first HPLC machines were not in general usage until the early 1980s, and mass spectroscopy in the early 1960s required a room full of equipment. Similarly, determination of amino acid composition (not sequences) again required a room full of equipment in the late 1950s. Finally, the first simple electronic calculators were not available until the mid-1970s, and computers required air-conditioned rooms in which to function and were the size of large filing cabinets. Note that IBM PCs were not generally available in labs until early 1982, and then their cost was $1565 (equivalent to $4805 in 2022), though earlier “PCs from other manufacturers” were in use but required programming skills to use.

Thus, chemists and their biological equivalents required both patience and time in those days to identify a compound chemically, and then the initial pharmacology (distribution, metabolism, etc.) for a potential drug entity took months, not a few weeks or so as today.

One might ask how valid are the comments/dates given above? Those dates, aside from the early antibiotic introductions, are taken from my own laboratory experience from 1956 to 2015 working with both natural products from many sources as potential antitumor agents and antibiotics, and as a bench chemist synthesizing porphyrins, photoactive dyestuffs, and organometallic compounds (that might fix nitrogen).