Autocatalytic Sets and RNA Secondary Structure

Hordijk, Wim

doi:10.1007/s00239-017-9787-7


Original Article
Published: 04 April 2017

Volume 84, pages 153–158, (2017)

Journal of Molecular Evolution


Wim Hordijk ORCID: orcid.org/0000-0002-0223-6194¹


Abstract

The dominant paradigm in origin of life research is that of an RNA world. However, despite experimental progress towards the spontaneous formation of RNA, the RNA world hypothesis still has its problems. Here, we introduce a novel computational model of chemical reaction networks based on RNA secondary structure and analyze the existence of autocatalytic sub-networks in random instances of this model, by combining two well-established computational tools. Our main results are that (i) autocatalytic sets are highly likely to exist, even for very small reaction networks and short RNA sequences, and (ii) sequence diversity seems to be a more important factor in the formation of autocatalytic sets than sequence length. These findings could shed new light on the probability of the spontaneous emergence of an RNA world as a network of mutually collaborative ribozymes.


Introduction

The dominant paradigm in origin of life research is that of an RNA world (Gilbert 1986; Joyce 2002). However, despite experimental progress towards the spontaneous formation of RNA (Powner et al. 2009), the RNA world hypothesis still has its problems (Benner et al. 2012; Szostak 2012), and so far no one has been able to show that RNA can catalyze its own template-directed replication.

What has been shown, though, is that some RNA molecules can catalyze the formation of other RNA molecules from shorter RNA fragments (Horning and Joyce 2016). Moreover, there are experimentally constructed sets of RNA molecules that mutually catalyze each other’s formation (Sievers and von Kiedrowski 1994; Kim and Joyce 2004; Lincoln and Joyce 2009; Vaidya et al. 2012). Rather than each RNA molecule replicating itself, they mutually help each other in being formed from their basic building blocks, in a network of molecular collaboration (Higgs and Lehman 2015; Nghe et al. 2015).

Such a collaborative RNA network is a realization of an autocatalytic set, a concept that was originally introduced by Kauffman (1971, 1986, 1993). Informally, an autocatalytic set (or RAF set, for Reflexively Autocatalytic and Food-generated) is a chemical reaction network in which (i) each reaction is catalyzed by at least one molecule from the set itself, and (ii) all molecules can be built up from an appropriate food source through a series of reactions from the set itself. This concept was made mathematically more rigorous and studied in detail, both theoretically and computationally, as RAF theory (Steel 2000; Hordijk and Steel 2004; Mossel and Steel 2005; Hordijk and Steel 2017). This theory has been applied to analyze computational models of chemical reaction networks (Hordijk et al. 2013), as well as real chemical and biological networks (Hordijk and Steel 2013; Sousa et al. 2015).

The computational models that RAF theory has been applied to are mostly variants of a simple polymer model, where molecules are represented by binary strings. In these models, only the primary sequence is taken into account when determining which molecules can catalyze which reactions. However, in real chemistry, it is often the secondary (or even tertiary) structure that determines a molecule’s catalytic capability. Other (related) computational studies on the emergence and evolution of autocatalytic sets have so far also ignored actual molecular structure (Farmer et al. 1986; Bagley and Farmer 1991; Bagley et al. 1991; Wills and Henderson 2000; Jain and Krishna 2001, 2002; Filisetti et al. 2011; Vasas et al. 2012; Tanaka et al. 2014).

Here, we introduce and analyze a novel model by combining two well-established computational methods: one for predicting RNA secondary structure (Lorenz et al. 2011) and one for detecting and analyzing RAF sets (Hordijk et al. 2015). We then study the existence of autocatalytic sets in random instances of this model, where the catalysis assignments are based on RNA secondary structure. Our main result is that autocatalytic sets are highly likely to exist in such systems, even for very small reaction networks and short RNA sequences. Furthermore, this probability increases rapidly with increasing system (network) size and increasing RNA sequence length, but seems to be mostly driven by sequence diversity rather than sequence length. These findings could shed new light on the probability of the spontaneous emergence of an RNA world as a network of collaborative ribozymes.

Methods

We combine two established computational tools to construct and analyze model instances of RNA reaction networks where catalysis is determined by an RNA’s secondary structure. First, we generate N random RNA sequences of a given length L, where at each sequence position there is an independent and uniform probability of having any of the four nucleotides A, C, G, or U. We then use the ViennaRNA 2.0 package (Lorenz et al. 2011) to fold these RNA sequences into their minimum free energy (MFE) secondary structure. This structure is represented using the common dot-parentheses notation. An example is provided in Fig. 1 for \(L=32\).
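The sampling step can be sketched as follows. This is a minimal illustration, not the author's code: the function names `random_rna`, `random_pool`, and `fold_mfe` are hypothetical, and the folding helper assumes the ViennaRNA Python bindings (module `RNA`) are installed. It is imported lazily so that the sampling part runs on its own.

```python
import random

NUCLEOTIDES = "ACGU"


def random_rna(length, rng):
    """Return one sequence with each position drawn uniformly from A, C, G, U."""
    return "".join(rng.choice(NUCLEOTIDES) for _ in range(length))


def random_pool(n, length, seed=None):
    """Generate N independent random RNA sequences of length L."""
    rng = random.Random(seed)
    return [random_rna(length, rng) for _ in range(n)]


def fold_mfe(sequence):
    """Return (dot-bracket structure, free energy) for the MFE fold.

    Assumption: the ViennaRNA Python bindings are available as module RNA.
    """
    import RNA  # lazy import: only needed when folding is actually requested
    structure, energy = RNA.fold(sequence)
    return structure, energy


# Default parameter values used in the text: N = 20, L = 32.
pool = random_pool(n=20, length=32, seed=1)
```

Fixing the seed makes a model instance reproducible, which helps when a particular RAF set needs to be inspected again.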

Next, we assume that each RNA sequence is broken into two smaller fragments, which can be combined together again through a chemical reaction to (re)form the full sequence. There are at least two such reactions for which empirical support exists: (i) ligation, i.e., a reaction at a phosphoanhydride bond (Bartel and Szostak 1993), and (ii) recombination, i.e., a reaction at a phosphodiester bond (Hayden and Lehman 2006). However, in the model, an RNA sequence can only be broken at a place along the sequence where there is a consecutive subsequence of at least four unpaired nucleotides (corresponding to at least four consecutive dots in the secondary structure). In particular, we choose that subsequence of four unpaired nucleotides that is closest to the center (mid-point) of the full RNA sequence. The black rectangle in Fig. 1 shows this subsequence for the given example, which in this case happens to be exactly at the mid-point of the full sequence. This, then, gives rise to a “ligation template” of four nucleotides (two on each side of the ligation site). In the example in Fig. 1, this ligation template is UAAA.
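One way to read the site-selection rule is sketched below. It is an interpretation rather than the published implementation: every window of four consecutive dots is a candidate, the cut is placed in the middle of the window, and the window whose cut lies closest to the sequence mid-point wins (the leftmost one on ties, which is an assumption). The names `ligation_site` and `fragments` and the toy sequence are hypothetical.

```python
def ligation_site(sequence, structure, run=4):
    """Return (cut, template), or None if no run of `run` unpaired bases exists.

    cut is the index at which the sequence is split; template is the `run`
    nucleotides centred on the cut (two on each side for run=4).
    """
    mid = len(sequence) / 2
    best = None
    for start in range(len(structure) - run + 1):
        if structure[start:start + run] == "." * run:
            cut = start + run // 2
            # Strict comparison keeps the leftmost candidate on ties.
            if best is None or abs(cut - mid) < abs(best - mid):
                best = cut
    if best is None:
        return None
    return best, sequence[best - run // 2:best + run // 2]


def fragments(sequence, cut):
    """Split a full sequence into its two food fragments at the cut."""
    return sequence[:cut], sequence[cut:]


# Toy 12-nt example (not the sequence of Fig. 1):
site = ligation_site("GGGCUAAAGCCC", "(((......)))")  # -> (6, "UAAA")
```

A sequence whose structure has no run of four unpaired nucleotides yields `None` and would contribute no ligation reaction under this reading.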

Furthermore, for each (full) RNA sequence, we extract the subsequences of all its hairpin loops. In the example in Fig. 1, there are two loops (the blue nucleotides), with subsequences AAU and AAAG.
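In dot-parentheses notation, a hairpin loop appears as an opening parenthesis followed directly by a run of dots and a closing parenthesis, so the loop subsequences can be pulled out with a regular expression. The sketch below assumes that reading; the name `hairpin_loops` and the toy sequence are illustrative only.

```python
import re

# "(" + one or more dots + ")" marks a hairpin loop; the dots are the loop.
HAIRPIN = re.compile(r"\((\.+)\)")


def hairpin_loops(sequence, structure):
    """Return the unpaired subsequence of every hairpin loop."""
    return [sequence[m.start(1):m.end(1)] for m in HAIRPIN.finditer(structure)]


# Toy structure with two hairpins (not the sequence of Fig. 1):
loops = hairpin_loops("GGGAAUCCCGGGAAAGCCC", "(((...)))(((....)))")
# -> ["AAU", "AAAG"]
```

Interior loops and bulges are flanked by parentheses of the same orientation on at least one side, so this pattern does not match them.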

We now construct an RNA reaction network as follows. The molecule set consists of the N random RNA sequences and their respective fragments (determined by the ligation sites as described above). The reaction set consists of the N ligation reactions that form the full RNA sequences from their respective fragments. Finally, such a ligation reaction can be catalyzed by a full RNA sequence if one of its loops contains a subsequence that is the complementary base-pair match of the ligation template of the first (to be ligated) sequence. For example, the ligation reaction for the RNA sequence shown in Fig. 1 (with ligation template UAAA) can be catalyzed by another (full) RNA sequence that has a loop containing the subsequence AUUU. On the other hand, the example RNA sequence itself (with a loop subsequence AAAG) can catalyze any ligation reaction of an RNA sequence with a ligation template UUUC. Note that the other loop (of length three) in the example is too short to match a ligation template, and can thus not be used in catalysis.
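The catalysis rule can be written down directly from the two examples in the text (UAAA is matched by AUUU, and AAAG matches UUUC), which imply a position-by-position Watson-Crick complement without reversal. The sketch below follows that reading; `complement` and `can_catalyze` are hypothetical names, and G-U wobble pairs are assumed not to count.

```python
# Watson-Crick complement, applied position by position (no reversal).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complement(template):
    """Return the base-pair complement of a ligation template."""
    return template.translate(COMPLEMENT)


def can_catalyze(loops, template):
    """True if any loop contains the complement of the ligation template."""
    target = complement(template)
    return any(target in loop for loop in loops)


# Loops of the example sequence in the text: AAU and AAAG.
example_loops = ["AAU", "AAAG"]
```

With these loops, `can_catalyze(example_loops, "UUUC")` is true, while `can_catalyze(example_loops, "UAAA")` is false because neither loop contains AUUU. A loop of length three can never contain a 4-nt match, which is the point made at the end of the paragraph above.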

This novel model is inspired by actual experimental systems that consist of catalytic RNA molecules (ribozymes) which are broken into smaller fragments that can then be joined back together again through a chemical reaction, catalyzed by other ribozymes (Kim and Joyce 2004; Lincoln and Joyce 2009; Vaidya et al. 2012). Moreover, Lincoln and Joyce (2009) allowed a 4-nt subsequence (at either end of the molecule) to vary, and Vaidya et al. (2012) varied a 3-nt subsequence that acts as the recognition region for catalysis. In our model, we allow for fully random sequences, but use a 4-nt ligation (or “recognition”) template, in accordance with these experimental systems.

Finally, given a random instance of the RNA reaction network model, we apply the RAF algorithm (Hordijk et al. 2015) to detect and analyze the existence of autocatalytic sets, where the RNA fragments are considered to constitute the food source. This process is then repeated 1000 times (for a given set of parameter values N and L) to collect statistics on the probability and sizes of autocatalytic sets existing in random instances of the model.
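For orientation, the basic reduction idea behind RAF detection can be sketched in a few lines: compute which molecules are reachable from the food set, discard every reaction that lacks an available reactant or an available catalyst, and repeat until nothing changes. This is a simplified illustration and not the optimized algorithm of Hordijk et al. (2015); the data layout (a dict mapping a reaction name to three sets) and the name `max_raf` are assumptions.

```python
def max_raf(reactions, food):
    """Return the maximal RAF as a sub-dict of `reactions` (empty if none).

    reactions: dict name -> (reactants, products, catalysts), each a set.
    food: set of molecules assumed to be freely available.
    """
    current = dict(reactions)
    while True:
        # Closure of the food set under the remaining reactions.
        available = set(food)
        grew = True
        while grew:
            grew = False
            for reactants, products, _ in current.values():
                if reactants <= available and not products <= available:
                    available |= products
                    grew = True
        # Keep reactions whose reactants and at least one catalyst are available.
        kept = {
            name: r for name, r in current.items()
            if r[0] <= available and r[2] & available
        }
        if len(kept) == len(current):
            return kept
        current = kept


# Toy network: A and B catalyze each other's ligation; C needs a catalyst D
# that nothing produces, so its reaction is dropped.
food = {"a1", "a2", "b1", "b2", "c1", "c2"}
toy = {
    "rA": ({"a1", "a2"}, {"A"}, {"B"}),
    "rB": ({"b1", "b2"}, {"B"}, {"A"}),
    "rC": ({"c1", "c2"}, {"C"}, {"D"}),
}
raf = max_raf(toy, food)  # keeps rA and rB
```

In the model described here, the food set is the collection of RNA fragments, every reaction has two fragments as reactants and one full sequence as product, and the catalysts are the full sequences whose loops match the ligation template.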

Results

Taking \(N=20\) and \(L=32\) as default parameter values, the probability of autocatalytic sets existing in random instances of the RNA reaction network model is about Pr[RAF] = 0.5. In other words, even for very small system sizes (\(N=20\)) and short RNA sequences (\(L=32\)), about half of the 1000 model instances considered contain a RAF set. Figure 2 shows an example of one such set found by the RAF algorithm. Note that it contains two independent loops (two molecules mutually catalyzing each other’s ligation), one of which also catalyzes further members of the set. So, each ligation reaction in this set is catalyzed by one of the molecules from the set, and each molecule in the set is produced from the food source (RNA fragments) through a ligation reaction from the set, thus forming a proper autocatalytic set.

The example shown in Fig. 2 contains seven members, i.e., seven of the \(N=20\) random RNA sequences catalyze each other’s ligation from their respective fragments, in a self-sustaining way. On average, the RAF set size is about three (measured over those roughly 500 instances that actually contained a RAF set), with a maximum observed RAF set size of 11. Of course, there are still other catalysis events in the RNA network as a whole, i.e., among the RNA molecules that are not included in the RAF set, but they do not contribute to the self-sustaining and catalytically closed autocatalytic set.

The actual probability of a RAF set existing in a random model instance obviously depends on the two parameters N and L. Figure 3 shows these probabilities Pr[RAF] for different values of the system size (N, in red, using \(L=32\)) and sequence length (L, in blue, using \(N=20\)), again measured over 1000 model instances for each parameter value. These probabilities increase rapidly with increasing system size or sequence length. Notably, though, for the short sequence length of \(L=16\) and the default system size \(N=20\), there is still a 10% probability that a RAF set exists. In other words, even for the smallest systems and sequences, autocatalytic sets have a non-zero probability of existing, and it does not require a very large increase in either of these parameter values to get autocatalytic sets with a very high likelihood, even among completely random RNA sequences.

Looking at the sizes of the RAF sets, however, there is a difference between the two parameters. Figure 4 shows the average and maximum relative RAF sizes for different values of the system size (N, in red, using \(L=32\)) and sequence length (L, in blue, using \(N=20\)). These RAF sizes are shown relative to the total number of reactions in the network (i.e., the system size N), for a fair comparison between the two parameters (given that the system size N is kept fixed when the sequence length L is varied). Solid lines show the average, and dashed lines show the maximum observed.
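The summary quantities used in this section can be computed from one RAF size per model instance, as in the sketch below. It assumes, following the earlier statement about average set size, that averages are taken only over instances that contain a RAF set; the function name `raf_statistics` and the four toy sizes are hypothetical and are not results from the paper.

```python
def raf_statistics(raf_sizes, n):
    """Return (Pr[RAF], mean relative size, max relative size).

    raf_sizes: one RAF size per model instance (0 if no RAF was found).
    n: system size N, i.e. the total number of reactions in each network.
    """
    found = [s for s in raf_sizes if s > 0]
    prob = len(found) / len(raf_sizes)
    if not found:
        return prob, 0.0, 0.0
    mean_rel = sum(found) / len(found) / n
    max_rel = max(found) / n
    return prob, mean_rel, max_rel


# Toy input: four instances with N = 20, two of which contain a RAF set.
stats = raf_statistics([0, 2, 0, 4], n=20)
```

Dividing by N puts networks of different sizes on the same scale, which is what allows the curves for varying N and varying L to be compared.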

As shown in the figure, the average relative RAF size grows significantly faster with increasing system size N (solid red line) than with increasing sequence length L (solid blue line). Furthermore, the maximum observed relative RAF size continues to increase with increasing system size N (dashed red line), while it appears to level off for increasing sequence length L (dashed blue line). This seems to imply that sequence diversity (i.e., system size N) is a more crucial factor for the existence of autocatalytic sets than sequence length (L).

One possible explanation for this can be found in the number and average length of loops in the RNA secondary structures for different sequence lengths. Figure 5 shows these numbers, with the solid line indicating the average number of loops and the dashed line the average loop length (in number of nucleotides). The average number of loops increases only very slowly with increasing sequence length, and the average loop length plateaus quickly at around six nucleotides. In other words, increasing the sequence length does not necessarily increase a molecule’s ability to catalyze more ligation reactions.
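A rough estimate illustrates why the loop-length plateau matters. Assume, as a simplification that need not hold for nucleotides in MFE loops, that loop positions are independent and uniform over the four bases. A loop of length \(\ell \ge 4\) contains \(\ell - 3\) windows of four nucleotides, and each window matches the complement of a given ligation template with probability \(4^{-4}\). By linearity of expectation, the expected number of matching windows in that loop is

\[
E[\text{matches}] = \frac{\ell - 3}{4^{4}} = \frac{\ell - 3}{256}, \qquad \ell = 6 \;\Rightarrow\; \frac{3}{256} \approx 0.012 .
\]

Under this assumption, a molecule's chance of catalyzing a given reaction depends only on its number of loops and their lengths. Since both stay nearly flat as L grows, the chance per molecule barely changes, whereas adding sequences (larger N) adds both potential catalysts and reactions.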

Discussion

The model introduced here is still a simplification of real chemistry, but it is directly inspired by actual experimental RNA systems, and uses an established and reliable method for predicting RNA secondary structure. Moreover, it represents a significant step forward in computational models for studying the existence of autocatalytic sets, by using actual molecular structure to determine catalytic capability, something that is missing in currently existing models.

The main result of this new model is that, even when catalysis is restricted to loops that contain the base-pair complement of a ligation template, determined by an RNA molecule’s folded structure, autocatalytic sets are still likely to exist, even in the extreme case of very small networks of short random sequences. Moreover, this probability increases rapidly with increasing system (network) size and RNA sequence length, but seems to be mostly driven by sequence diversity (N) rather than sequence length (L). These striking results could shed new light on the probability and mechanisms of an RNA world emerging from collections of random RNA sequences, not as individual self-replicators, but as a network of collaborating ribozymes.

Of course, many additional features can be added to the basic model. For example, only ligation reactions between the two fragments of a single fully formed RNA molecule have been considered so far. But if the fragments of different RNA sequences are all present in the same solution, they could also (randomly) combine with each other, creating even more possible RNA molecules of length L in addition to the N that were initially chosen. However, this would simply create a larger reaction network with more reactions that can potentially be catalyzed by the same molecules. If the more restricted reaction networks generated by the current model already have a high probability of containing a RAF set, these extended networks (of which the more restricted ones are a subset) would have an even higher chance of containing RAF subsets, and possibly even larger ones.

Furthermore, catalysis is currently restricted to hairpin loops in the folded RNA molecules, but could in principle occur in any subsequence that contains at least four unpaired nucleotides. However, this would simply increase the probability that a fully formed RNA molecule catalyzes an arbitrary ligation reaction, thus also resulting in an even higher probability that a model instance contains a RAF set, or possibly an even larger one. What is surprising and encouraging here is that already in the basic (restricted) model there is such a high probability of observing autocatalytic sets.

As was noted above, the example RAF set in Fig. 2 contains two independent autocatalytic sub-networks. For the standard binary polymer model (not taking molecular structure into account), we had already shown that autocatalytic sets usually have a hierarchical structure of smaller and smaller autocatalytic subsets (Hordijk et al. 2012). The current example shows that this is also likely to be the case for the model introduced here, taking RNA secondary structure into account. It has been argued elsewhere that the existence of multiple (independent) autocatalytic subsets within a chemical reaction network is one of the requirements for them to be evolvable (Vasas et al. 2012; Hordijk and Steel 2014). This requirement thus seems to be fulfilled in our novel model as well.

It is important to note, though, that the current model is not quite representative of a pre-biotic scenario yet, as it assumes the presence of the shorter RNA fragments. However, as mentioned earlier, this novel model is inspired by actual experimental autocatalytic sets that consist of catalytic RNA molecules or peptides which are broken into smaller fragments that can then be joined back together again (Kim and Joyce 2004; Ashkenasy et al. 2004; Lincoln and Joyce 2009; Vaidya et al. 2012). Moreover, as the very first experimental evidence for autocatalytic sets has shown (Sievers and von Kiedrowski 1994), this process can even start with simple trimers forming hexamers through mutually catalyzed reactions.

The analysis presented here focuses only on network topology, and does not (yet) take actual dynamics into account. However, such dynamical analyses were performed in previous simulation studies of experimental RNA autocatalytic networks (Hordijk and Steel 2013), including with experimentally measured reaction rates (Hordijk et al. 2014). These simulation studies provided more insight into how different RNA autocatalytic sets can come into existence over time, and how environmental influences affect this process. We hope to perform similar dynamical analyses on the model networks introduced here.

Finally, the experimentally verified existence of autocatalytic sets consisting of peptides rather than RNA (Ashkenasy et al. 2004), together with plausible evidence that RNA and peptides interacted and co-evolved very early on in the origin of life (Li et al. 2013; Polyansky et al. 2013), would make the formation of one or more autocatalytic sets even more likely, as they increase sequence diversity. In fact, RAF theory is not restricted to RNA molecules alone (or any single type of molecule), and has already been applied to models of “partitioned” chemical reaction networks, as with RNA and peptides (Smith et al. 2014). Including even more chemical realism in our novel computational model using actual molecular structure (and the catalytic capabilities determined by it), and also combining different types of molecules, seems a promising direction for learning more about possible routes to the origin of life.

References

Ashkenasy G, Jagasia R, Yadav M, Ghadiri MR (2004) Design of a directed molecular network. PNAS 101(30):10872–10877
Bagley RJ, Farmer JD (1991) Spontaneous emergence of a metabolism. In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial life II. Addison-Wesley, Redwood City, pp 93–140
Bagley RJ, Farmer JD, Fontana W (1991) Evolution of a metabolism. In: Langton CG, Taylor C, Farmer JD, Rasmussen S (eds) Artificial life II. Addison-Wesley, Redwood City, pp 141–158
Bartel DP, Szostak JW (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261(5127):1411–1418
Benner SA, Kim HJ, Yang Z (2012) Setting the stage: the history, chemistry, and geobiology behind RNA. Cold Spring Harb Perspect Biol 4:a003541
Farmer JD, Kauffman SA, Packard NH (1986) Autocatalytic replication of polymers. Phys D 22:50–67
Filisetti A, Graudenzi A, Serra R, Villani M, Lucrezia DD, Füchslin RM, Kauffman SA, Packard N, Poli I (2011) A stochastic model of the emergence of autocatalytic cycles. J Syst Chem 2:2
Gilbert W (1986) The RNA world. Nature 319:618
Hayden EJ, Lehman N (2006) Self-assembly of a group I intron from inactive oligonucleotide fragments. Chem Biol 13:909–918
Higgs PG, Lehman N (2015) The RNA world: molecular cooperation at the origins of life. Nat Rev Genet 16:7–17
Hordijk W, Steel M (2004) Detecting autocatalytic, self-sustaining sets in chemical reaction systems. J Theor Biol 227(4):451–461
Hordijk W, Steel M (2013) A formal model of autocatalytic sets emerging in an RNA replicator system. J Syst Chem 4:3
Hordijk W, Steel M (2014) Conditions for evolvability of autocatalytic sets: a formal example and analysis. Orig Life Evol Biosph 44(2):111–124
Hordijk W, Steel M (2017) Chasing the tail: the emergence of autocatalytic networks. BioSyst 152:1–10
Hordijk W, Steel M, Kauffman S (2012) The structure of autocatalytic sets: evolvability, enablement, and emergence. Acta Biotheor 60(4):379–392
Hordijk W, Steel M, Kauffman S (2013) Autocatalytic sets: the origin of life, evolution, and functional organization. In: Pontarotti P (ed) Evolutionary biology: exobiology and evolutionary mechanisms. Springer, Berlin, pp 49–60
Hordijk W, Vaidya N, Lehman N (2014) Serial transfer can aid the evolution of autocatalytic sets. J Syst Chem 5:4
Hordijk W, Smith JI, Steel M (2015) Algorithms for detecting and analysing autocatalytic sets. Algorithms Mol Biol 10:15
Horning DP, Joyce GF (2016) Amplification of RNA by an RNA polymerase ribozyme. PNAS 113:9786–9791
Jain S, Krishna S (2001) A model for the emergence of cooperation, interdependence, and structure in evolving networks. PNAS 98(2):543–547
Jain S, Krishna S (2002) Large extinctions in an evolutionary model: the role of innovation and keystone species. PNAS 99(4):2055–2060
Joyce GF (2002) The antiquity of RNA-based evolution. Nature 418:214–221
Kauffman SA (1971) Cellular homeostasis, epigenesis and replication in randomly aggregated macromolecular systems. J Cybernet 1(1):71–96
Kauffman SA (1986) Autocatalytic sets of proteins. J Theor Biol 119:1–24
Kauffman SA (1993) The origins of order. Oxford University Press, New York
Kim DE, Joyce GF (2004) Cross-catalytic replication of an RNA ligase ribozyme. Chem Biol 11:1505–1512
Li L, Francklyn C, Carter CW (2013) Aminoacylating urzymes challenge the RNA world hypothesis. J Biol Chem 288:26856–26863
Lincoln TA, Joyce GF (2009) Self-sustained replication of an RNA enzyme. Science 323:1229–1232
Lorenz R, Bernhart SH, Höner zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL (2011) ViennaRNA Package 2.0. Algorithms Mol Biol 6:26
Mossel E, Steel M (2005) Random biochemical networks: the probability of self-sustaining autocatalysis. J Theor Biol 233(3):327–336
Nghe P, Hordijk W, Kauffman SA, Walker SI, Schmidt FJ, Kemble H, Yeates JAM, Lehman N (2015) Prebiotic network evolution: six key parameters. Mol BioSyst 11:3206–3217
Polyansky AA, Hlevnjak M, Zagrovic B (2013) Proteome-wide analysis reveals clues of complementary interactions between mRNAs and their cognate proteins as the physicochemical foundation of the genetic code. RNA Biol 10(8):1248–1254
Powner MW, Gerland B, Sutherland JD (2009) Synthesis of activated pyrimidine ribonucleotides in prebiotically plausible conditions. Nature 459:239–242
Sievers D, von Kiedrowski G (1994) Self-replication of complementary nucleotide-based oligomers. Nature 369:221–224
Smith J, Steel M, Hordijk W (2014) Autocatalytic sets in a partitioned biochemical network. J Syst Chem 5:2
Sousa FL, Hordijk W, Steel M, Martin WF (2015) Autocatalytic sets in E. coli metabolism. J Syst Chem 6:4
Steel M (2000) The emergence of a self-catalysing structure in abstract origin-of-life models. Appl Math Lett 3:91–95
Szostak JW (2012) The eightfold path to non-enzymatic RNA replication. J Syst Chem 3:2
Tanaka S, Fellermann H, Rasmussen S (2014) Structure and selection in an autocatalytic binary polymer model. EPL 107:28004
Vaidya N, Manapat ML, Chen IA, Xulvi-Brunet R, Hayden EJ, Lehman N (2012) Spontaneous network formation among cooperative RNA replicators. Nature 491:72–77
Vasas V, Fernando C, Santos M, Kauffman S, Szathmáry E (2012) Evolution before genes. Biol Direct 7:1
Wills PR, Henderson L (2000) Self-organisation and information-carrying capacity of collectively autocatalytic sets of polymers: ligation systems. In: Bar-Yam Y (ed) Unifying themes in complex systems: proceedings of the first international conference on complex systems. Perseus Books, pp 613–623


Acknowledgements

The author thanks the KLI Klosterneuburg for financial support in the form of a fellowship, and two anonymous reviewers for helpful suggestions to improve the original manuscript.

Author information

Authors and Affiliations

Konrad Lorenz Institute for Evolution and Cognition Research, Klosterneuburg, Austria
Wim Hordijk


Corresponding author

Correspondence to Wim Hordijk.


About this article

Cite this article

Hordijk, W. Autocatalytic Sets and RNA Secondary Structure. J Mol Evol 84, 153–158 (2017). https://doi.org/10.1007/s00239-017-9787-7


Received: 08 December 2016
Accepted: 09 March 2017
Published: 04 April 2017
Issue Date: April 2017
DOI: https://doi.org/10.1007/s00239-017-9787-7
