The origin of metabolic pathways is one of the most significant events in the early evolution of life, and it is also one of the most exciting and challenging questions in evolution and biology. The field has attracted researchers since the beginning of the last century. The first coherent hypothesis on the subject was proposed in 1945 by Norman Horowitz (Horowitz 1945); since then, a large number of works have been published. However, few fundamental ideas or hypotheses have managed to set the course of research. Among these are the “forward direction hypothesis” proposed by Sam Granick in the 1950s; the successful idea of “patchwork” assembly, proposed independently by Martynas Yčas (Yčas 1974) and Roy Jensen (Jensen 1976); and the “semi-enzymatic origin” of metabolic pathways, proposed by Antonio Lazcano and Stanley Miller 21 years ago in the Journal of Molecular Evolution (Lazcano and Miller 1999). This last approach proposes that simple metabolic-like routes produced crucial components required by primordial living entities. These routes were based upon a preponderance of non-enzymatic or semi-enzymatic autocatalytic processes that later became fine-tuned as ribozyme-mediated and protein-based enzymatic metabolic pathways.

The following lines review the ideas proposed in this latter work and analyze its influence on studying the origin and early evolution of life.

Metabolism First?

It is hard to find a characterization of life that does not include metabolism and replication, and these are undoubtedly two essential features of living things. However, the question of which of the two came first at the origin of life has led to another chicken-and-egg problem, dividing origin-of-life researchers into two camps. The problem lies in the definition of metabolism and in the bias of the concept used. The very open suggestion that chemical cycles on the primitive Earth led to networks of reactions that should be considered a type of metabolism opens up the possibility of placing these events before the origin of life. Since Yčas (Yčas 1955) suggested that these are non-organismal cycles, the metabolism-first idea has become an active and proliferating line of research that has caught the attention of most chemists and physicists interested in the origin of life. Autocatalytic cycles have also been proposed as a crucial part of the origin of life. Theoretical approaches such as the “Chemoton” by Gánti (Gánti 2003) hold that a hypothetical protocell has three properties or systems: (1) an autocatalytic network for metabolism; (2) a lipid bilayer structure; and (3) replicating machinery, which together give this unit of life the possibility of evolving by natural selection. The Chemoton model has contributed to the discussion of the origin of life, and it is well accepted among geochemists, who consider it an important idea that opens up theoretical and philosophical debate. Autopoiesis, proposed by Maturana and Varela (Varela and Maturana 1973), is an idea that targets the living condition of a system capable of reproducing and maintaining itself. Both of these ideas situate metabolism at the core of life (Peretó 2012).

In recent decades, a series of influential theories highlighting the role of metabolism in life’s origin have been produced. Freeman Dyson proposed a statistical model that describes the transition from disorder to order in small populations (islands) of mutually catalytic molecules. In this model, evolution initially occurs by random genetic drift alone, which acts as the primary process in life’s origin (Dyson 1982, 1999). Stuart Kauffman is recognized for proposing that complex biological systems are the result of self-organization. He also proposed that the emergence of self-replicating systems may be a self-organizing collective property of critically complex protein systems in prebiotic evolution. In other words, he suggests that collectively autocatalytic sets (CAS) originated before the template replication of an RNA-like molecule (Kauffman 1971, 1986, 1993). Wächtershäuser proposed a protometabolism, an autocatalytic carbon fixation process in which pyrite formation was the first energy source for life. This process was mediated by transition metal complexes and driven by the electrons obtained from H2S and FeS (Wächtershäuser 1988, 1990).

Perhaps the most representative idea in the “metabolism first” line was put forward by Harold J. Morowitz, who proposed the Reductive Tricarboxylic Acid Cycle (rTCA) as the primordial metabolic core. This elemental cycle develops radially, adding successive layers of complexity to the original basic version (Morowitz 1999). Similar ideas have been proposed by Robert Shapiro (Shapiro 2000) and Lindahl (Lindahl 2004). As an alternative, Martin and Russell suggest the emergence of reaction networks leading from CO or CO2 to the production of organic molecules essential to life, including the non-enzymatic synthesis of acetate from CO2 (a conversion carried out today through the Wood–Ljungdahl pathway). Their model applies to systems located in primitive hydrothermal vents and assigns critical energetic roles to H2, catalysis by transition metals, and serpentinization (Martin and Russell 2003).

Despite its importance, this area has also raised a series of critiques and problems that remain difficult to overcome. As Anet mentions in his critical analysis (Anet 2004), the metabolism-first theories do not seem robust. Beyond the imprecise use of the term “metabolism” by the proponents of this approach, theories suggesting that metabolism precedes replication face the severe problem that Lazcano and Miller identified in 1999: “such schemes need to explain how a genetic system could arise from such metabolic systems so that Darwinian evolution could take over.”

Unlike the origin of life, the early evolution of metabolic pathways has directly benefited from the development of genomics and molecular evolution. Since the publication of Lazcano and Miller’s article (Lazcano and Miller 1999), more than 40,000 genomes have been completely sequenced (Genomes OnLine Database, GOLD). Thanks to this significant increase and to the development of tools for comparing the complete sequences of prokaryotic and eukaryotic genomes, it is possible to study the origin and evolution of metabolism. The different hypotheses on the origin of metabolism are related, to a greater or lesser degree, to gene duplication, a crucial mechanism for the generation of genomic complexity (Fani 2012; Innan and Kondrashov 2010). The study of gene evolution in relation to contemporary metabolic pathways has generated a series of works that have elucidated some aspects of how these pathways were assembled. Even models of how cellularity and metabolism coevolved in response to early environmental conditions (Takagi et al. 2020) have contributed to this question, but not to that of its primordial origin. As Lazcano and Miller wrote: “the origin of metabolic pathways lies closer to the origin of life than to the last common ancestor” (Lazcano and Miller 1999). Hence, studying the origin and evolution of contemporary metabolic pathways gives limited knowledge about the processes near the beginning of life. It is important to remember that life’s origin cannot be discerned by phylogenomic analysis alone, because comparative genomics cannot be extended beyond a threshold that corresponds to a period of evolution in which protein biosynthesis had already evolved (Becerra et al. 2007, 2014; Lazcano 2010). This does not mean that we cannot use the crucial top-down reconstruction; it is just a reminder that we need to be careful when we extrapolate and infer from contemporary metabolism to very early stages of life. As many authors have mentioned, and as Goldford and Segrè (2018) explain, there is “a massive gap in knowledge regarding the transition from prebiotic geochemical processes to the biochemical complexity of LUCA.”

Horowitz, Granick and the Patchwork Hypotheses

Current proposals about the origin of metabolic pathways address: (i) how the reaction chains were assembled; (ii) where the substrates were obtained; and (iii) how enzymes were recruited (Fani and Fondi 2009; Fani 2012; Scossa and Fernie 2020). The retrograde hypothesis, proposed by Horowitz in 1945 and also described as retrograde evolution, was the first organized attempt to explain the origin and evolutionary history of metabolic pathways. According to it, the pathways were built up backward, one step at a time, in the direction opposite to that of the present-day pathway, using intermediates available under prebiotic conditions (Horowitz 1945). In addition to being the first hypothesis with an evolutionary perspective, in which natural selection is part of the explanation for the origin of metabolic pathways, Horowitz’s proposal manages to connect the processes of prebiotic chemistry with the assembly of the first metabolisms. Years later, this idea opened a line of research in which the sequences of the proteins involved in metabolic pathways could be analyzed to infer their evolutionary history (Falkowski 1997; Benner et al. 2002; Kacar et al. 2017). Unfortunately, very few results have been found that support this hypothesis (Díaz-Mejía et al. 2007; Light and Kraulis 2004; Wilmanns et al. 1991). Some results show only two homologous enzymes catalyzing successive steps (Mayans et al. 2002), for example in histidine biosynthesis, where the prebiotic formation and accumulation of intermediates is improbable (Alifano et al. 1996). Therefore, this hypothesis is historically one of the most important, but it is neither the best supported nor the one that seems to explain this evolutionary phenomenon best.

The Granick hypothesis suggests that biosynthetic pathways were assembled and expanded in the forward direction, with prebiotic compounds playing no significant role (Granick 1957). This less-known “forward pathway evolution” proposal has more important implications than are apparent at first glance, since it is not just a forward version of Horowitz’s proposal. As Noda-Garcia and collaborators noted (Noda-Garcia et al. 2018), the evolution of metabolism includes not only the appearance of enzymes but also the emergence of new metabolites, and this is an essential aspect of Granick’s proposal. Work that supports this proposal includes studies of polyamine biosynthesis (Minguet et al. 2008), bacteriochlorophyll biosynthesis (Bryant et al. 2012), and the isoprene lipid pathway (Ourisson and Nakatani 1994), all of which can be interpreted as Granick-type assembly. However, these examples are isolated and are far from sufficient to generalize this process.

The patchwork assembly hypothesis is an impressive idea proposed independently by Yčas in 1974 (Yčas 1974) and Jensen in 1976 (Jensen 1976). It holds that biosynthetic pathways result from the serial recruitment of promiscuous enzymes endowed with broad catalytic specificity that could act in different steps and routes. Today, with a large collection of genome data available, several published works support this process as one of the most important in the evolution of metabolic pathways. However, as Lazcano and Miller stated, patchwork recruitment could operate only after the emergence of protein biosynthesis and the development of enzymes. Therefore, although it is a successful approach to understanding the assembly of various metabolic pathways, it clearly cannot be applied to all routes, nor to all steps. Nor can it be the process that operated in stages close to the origin of life; it can only be placed very early in the time of the Last Common Ancestor of life (LCA) (Lazcano and Miller 1999; Becerra et al. 2007).

The Semi-Enzymatic Origin of Metabolic Pathways

The idea of a semi-enzymatic origin of metabolic pathways proposes that simple metabolic-like pathways produced crucial components required by primordial living entities. These pathways were originally non-enzymatic or semi-enzymatic autocatalytic processes that later became fine-tuned as ribozyme-mediated and protein-based enzymatic metabolic routes (Lazcano and Miller 1999; Delaye and Lazcano 2005). Lazcano and Miller based their scheme on the following four postulates: (1) A collection of rather stable prebiotic compounds was available in primitive ponds; (2) Due to leakage from existing pathways within cells, several compounds were also available. These compounds need not be remarkably stable because they are produced within the cell and used rapidly; (3) Existing enzyme types are assumed to be available from gene duplication, and they were non-specific; (4) Starter-type enzymes are assumed to arise by non-enzymatic reactions followed by the acquisition of the enzyme (Lazcano and Miller 1999; Fani and Fondi 2009). The authors note that although enzymes mediate most metabolic reactions, some occur spontaneously, and that a chemical step can also take place in the absence of the enzyme if the reaction conditions are changed.

One of the most important contributions of this proposition is that it builds a bridge between prebiotic chemistry and early enzymatic reactions. Most known prebiotic reactions differ from current metabolic pathways, but there are some similarities between abiotic processes and extant biosynthetic routes. Examples of these similarities can be seen in riboflavin synthesis, de novo purine biosynthesis, the synthesis of orotic acid from aspartic acid and urea, the reductive amination of 2-oxoglutarate to produce glutamate, the UV-light-induced cyclisation of δ-aminolevulinic acid to yield pyrrole, and others (Costanzo et al. 2007; Eschenmoser and Loewenthal 1992; Lazcano 2010; Peretó 2012). What matters in all of these examples is that they show that some steps similar to those that occur today can be carried out in the absence of enzymes. One case is the synthesis of imidazole glycerol phosphate (IGP) in bacteria, where the reaction is catalyzed by IGP synthase, a heterodimer formed by the hisH and hisF proteins (Fig. 1). Interestingly, the synthesis of IGP can also occur in vitro in the absence of the hisH protein when NH4+ is present at high concentration (Smith and Ames 1964). Furthermore, the reaction can also take place in vivo in Klebsiella pneumoniae, without the enzyme, at high concentrations of NH4+ (Rieders et al. 1994; Vázquez-Salazar et al. 2018). All these examples support the proposal that the metabolic routes that preceded the current ones could have operated with a smaller number of enzymes, relying on semi-enzymatic activity and on multifunctional generalist enzymes (the patchwork assembly).

Fig. 1 Enzymatic and non-enzymatic incorporation of nitrogen to N1-5′-phospho-ribulosylformimino-5-aminoimidazole-4-carboxamide ribotide to produce IGP (modified from Lazcano and Miller 1999)

These two main axes form the basis of the proposal made by Lazcano and Miller in their 1999 article in the Journal of Molecular Evolution. However, two more aspects deserve to be highlighted. First, their proposal allows us to return to the early stages of the evolution of life, closer to the time of the LCA. They warn us that this is not simply a direct extrapolation in which contemporary metabolic routes derive from prebiotic pathways and differ primarily in the assistance of enzymes, as suggested by previous works (de Duve 1991; Degani and Halmann 1967; Hartman 1975; Morowitz 1992). Second, as Noda-Garcia et al. noticed, the ideas of the patchwork model proposed earlier by Jensen and Yčas were formalized primarily by Lazcano and Miller in this remarkable paper.

Current Status

Because of the explosion of genomic data, together with the metabolic information available and the collaboration between biologists and chemists, the field of primordial metabolism has had a new spark in recent years. This is reflected in the large amount of research and new ideas in the field. As an example, the computational network expansion algorithm simulates the growth of a biochemical network by iteratively adding, to an initial set of compounds, the products of the reactions whose substrates are already present (Goldford et al. 2017, 2019). The authors assert that quantitative modeling of more extensive networks can provide substantial new insights into the origin of life. Their model generates an organo-sulfur-based proto-metabolic network fueled by a thioester- and redox-driven variant of the reductive TCA cycle, capable of producing lipids and keto acids (Goldford and Segrè 2018). Rogier Braakman and Eric Smith proposed a fascinating hypothesis: based on a novel method that integrates metabolic and phylogenetic constraints, they infer a primordial metabolism with carbon fixation pathways at its core.
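The iterative logic of network expansion can be illustrated with a minimal sketch. It is not the implementation used by Goldford and colleagues; it only assumes that each reaction can be written as a pair of substrate and product sets and that a reaction becomes feasible once all of its substrates are present. The function name `expand_network`, the list `TOY_REACTIONS`, and the compound labels A to G are hypothetical placeholders, not real metabolites.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A reaction is modeled as (substrates, products); both are sets of compound names.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def expand_network(
    seed: Iterable[str], reactions: Iterable[Reaction]
) -> Tuple[Set[str], List[Reaction]]:
    """Grow a compound set from `seed` until no further reaction can fire.

    A reaction fires when all of its substrates are already in the set;
    its products are then added. The loop stops at a fixed point.
    """
    compounds: Set[str] = set(seed)
    fired: List[Reaction] = []
    remaining: List[Reaction] = list(reactions)

    changed = True
    while changed:
        changed = False
        still_blocked: List[Reaction] = []
        for substrates, products in remaining:
            if substrates <= compounds:        # every substrate is available
                compounds |= products          # add the products to the network
                fired.append((substrates, products))
                changed = True
            else:
                still_blocked.append((substrates, products))
        remaining = still_blocked

    return compounds, fired


# Hypothetical toy network; the labels do not correspond to real compounds.
TOY_REACTIONS: List[Reaction] = [
    (frozenset({"A", "B"}), frozenset({"C"})),
    (frozenset({"C"}), frozenset({"D", "E"})),
    (frozenset({"E", "F"}), frozenset({"G"})),  # F is never produced, so this stays blocked
    (frozenset({"D"}), frozenset({"A"})),
]

if __name__ == "__main__":
    reachable, used = expand_network({"A", "B"}, TOY_REACTIONS)
    print(sorted(reachable), len(used))
```

In this toy case the compound G is never reached because one of its precursors is absent from the seed, which is the kind of dependence on the initial compound set that such models are used to explore.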

Moreover, Braakman and Smith suggest a coevolution of cofactor functions with an increasingly complex universal metabolic pathway, proposing an important role for cofactors in the first ability to synthesize RNA (Braakman and Smith 2012, 2014). Joseph Moran and collaborators infer from their results that some catalyzed biological reactions and, sometimes, entire biochemical pathways may have had fully prebiotic precursors relying on non-enzymatic catalysts (Muchowska et al. 2017, 2019). As an argument that this option is plausible, the same group remarkably shows that a large part of the steps that make up the reverse tricarboxylic acid (rTCA) cycle, or reverse Krebs cycle, can be catalyzed without enzymes, with the reactions occurring in the presence of metals such as Zn2+, Cr3+ and Fe0 in an acidic aqueous solution (Muchowska et al. 2017). Some models approach very early stages of evolution. They address moments of chemical evolution or even earlier stages, when polymer sequence diversity was generated and selected for function in hydration-dehydration cycles (Walker et al. 2012). This list of research is incomplete and superficial; it is only intended to demonstrate that this research area is extraordinarily vigorous and expanding. For instance, this year, while this paper was being written, exciting publications in the field appeared in the literature. A very dynamic team from ELSI and other research centers presented a continuous reaction network that generates compounds that can lead to RNA synthesis (Yi et al. 2020). Also, a computational systems chemistry approach was used by Andersen and collaborators to explore large chemical reaction networks in the vast space of plausible prebiotic scenarios (Andersen et al. 2017). Moreover, Moran and collaborators published an important review of non-enzymatic metabolic reactions related to the origin of life (Muchowska et al. 2020).
This significant development is partly due to the combination of biological (genomic and biochemical) advances, geochemical data, and computational tools. Above all, however, it rests on the legacy of the work that laid the theoretical foundations of the field, in which the article written by Antonio Lazcano and Stanley Miller in this journal occupies a prominent place.