Introduction

The search for efficient non-enzymatic replication of nucleotides is of vital importance for origin-of-life research especially regarding the RNA world hypothesis (Gilbert 1986), based on RNA’s ability to be both information carrier and catalytic agent (Kruger et al. 1982; Paul and Joyce 2002). Although theories vary, it is generally accepted that RNA ensured continuity of genetic information at some point during the prebiotic era. Starting from a series of non-enzymatic template-directed condensation reactions (Orgel 1992; Joyce 1987) and after a few simplifications, an autocatalytic oligonucleotide (von Kiedrowski 1986) was developed. Subsequently, more artificial replicators, either being nucleotides (Kruger et al. 1982; Paul and Joyce 2002; von Kiedrowski et al. 1991, 1989), peptides (Issac and Chmielewski 2002; Li and Chmielewski 2003; Lee et al. 1996) or small molecules (Vidonne and Philp 2009), appeared and began participating in more complex systems (Bissette and Fletcher 2013). But the analytical evaluation of all these replicators is still based on the same principles as discussed by von Kiedrowski in his minimal replicator theory (von Kiedrowski 1993).

The minimal replicator is based on two simplifications used in experiments: a reaction cycle containing only one condensation step, and a self-complementary template. The latter is in contrast to natural nucleic acid replication, which is cross-catalytic: one strand catalyses the formation of the other and vice versa. The minimal replicator model consists of two precursors, \({\mathbf{A}}\) and \({\mathbf{B}}\), and a template molecule \({\mathbf{C}}\), which is complementary to \({\mathbf{A}}\) and \({\mathbf{B}}\) and facilitates the unidirectional ligation reaction between them through the reversible formation of a ternary complex \({\mathbf{ABC}}\). The reversible dissociation of the resulting duplex \({\mathbf{C}} _{2}\) gives two \({\mathbf{C}}\) molecules as product, each capable of beginning a new replication cycle (Fig. 1a).

The analytical treatment provides tools to quantify and compare structurally different catalysts based on thermodynamic and kinetic data. One derived parameter is the autocatalytic reaction order p, describing the autocatalytic behaviour of the system. At \(p=1\), the reaction rate increases linearly with increasing amounts of template, as expected for exponential growth. On the other hand, the square root law of autocatalysis (von Kiedrowski 1986) (\(p=0.5\)) describes the rate of autocatalysis as being proportional to the square root of the initial template concentration. The latter case is a frequently observed phenomenon (von Kiedrowski 1986; Zielinski and Orgel 1987; von Kiedrowski et al. 1991, 1989; Lee et al. 1996) attributed to a strongly bound template duplex and is generally known as template inhibition. Much research has gone into overcoming or avoiding such inhibition by destabilising the \({\mathbf{ABC}}\) complex. Common strategies include distorting the template backbone (Li and Chmielewski 2003; Szostak 2012) and reducing template size (Paul and Joyce 2002; Issac and Chmielewski 2002).

Here we propose and analyse theoretically a chemical scheme to counter template inhibition by reintroducing an old concept. Inspired by the functionality of a few noncoding RNAs (Cech and Steitz 2014), we propose to promote the leaving group (LG) in self-replicating systems from plain activating agent to information carrier. This informational leaving group (ILG) approach is best illustrated by tRNA, which acts as both LG and information carrier during protein synthesis. Other examples from nature can be found among self-splicing introns. Moreover, these oligonucleotide LGs express ribozyme activity besides the recognition elements needed to excise the intron (Hausner et al. 2014; Roitzsch et al. 2010). However, while tRNA acts as a conduit for information transfer, introns serve a more selfish role consisting of self-recognition, cleavage and product release. Implementing an ILG strategy in autocatalysis was first suggested by Song and coworkers in a paper reporting the use of pyrophosphate-linked dinucleotides as substrates for DNA polymerases (Song et al. 2011). Furthermore, these compounds seem to be an important intermediate in non-enzymatic oligonucleotide synthesis (Sulston et al. 1968). Pyrophosphates not only spark the interest of prebiotic scientists, they are also known for playing a role in cellular processes such as energy metabolism (cofactor NAD\(^{+}\)) and DNA repair (poly(ADP-ribose)) (Schreiber et al. 2006). Thus, in light of these old problems and new interests, we propose to expand Song et al.’s LG strategy to non-enzymatic replication and analytically compare this proposed informational leaving group model (ILG model) with the minimal replicator model.

This paper is organised as follows. “Model” section describes the chemical reaction systems and mathematical models used to compare the proposed ILG scheme with the minimal replicator. “Results” section gives the results of this comparison and provides a discussion. Finally, “Conclusion” section concludes the paper. It should be noted that this paper contains equations and results for the minimal replicator model, which are not new material but are recalled here for the sake of clarity and consistency. In particular, Eqs. (20)–(24) are the same as Eqs. (29)–(33) in von Kiedrowski (1993). Likewise, Fig. 2a, top and bottom, mirrors Figs. 8a and 9 in von Kiedrowski (1993), respectively. And because the thermodynamic parameters used in Table 1a are very close to those used in von Kiedrowski (1993), the numerical results in Figs. 5a–c, 6a–c, 7a, 9a–c, 10a–c and 11a are very close to those of Figs. 10, 11, 12, 13, 14 and 15 in von Kiedrowski (1993), respectively.

Model

Chemical Reaction System

Figure 1 illustrates both von Kiedrowski’s original minimal replicator model (von Kiedrowski 1993) and the proposed ILG model. In both models, precursors (\({\mathbf{A}}\), \({\mathbf{B}}\)) and template (\({\mathbf{C}}\)) reversibly form a termolecular complex \({\mathbf{ABC}}\), and a subsequent series of kinetic and/or equilibrated steps leads to template self-replication. As in von Kiedrowski’s original work, the approach followed here is that of a minimal model in which all transient complexes between the template and the precursors are neglected. This results in the minimal replicator and ILG models given below.

Fig. 1

The minimal replicator (a) and the ILG model (b). Participating species are as follows: a precursors (\({\mathbf{A}}\), \({\mathbf{B}}\)), template \({\mathbf{C}}\), termolecular complex \({\mathbf{ABC}}\) and duplex \({\mathbf{C}} _{2}\), b precursors (\({\mathbf{A}}\), \({\mathbf{B}}\)), template \({\mathbf{C}}\), termolecular complexes \({\mathbf{ABC}}\) and \({\mathbf{D}}\), duplex \({\mathbf{C}} _{2}\) and leaving group \({\mathbf{L}}\). \(K_{n}\) denotes the dissociation constants of the respective complexes, k represents the (pseudo) irreversible step where a covalent bond is formed (and broken). a Reproduced from von Kiedrowski (1993), with permission of Springer

Minimal Replicator Model

It consists of two fast quasi-instantaneously equilibrated bidirectional reactions and one unidirectional rate-limiting step:

$${\mathbf{A}} +{\mathbf{B}} +{\mathbf{C}} \rightleftarrows {\mathbf{ABC}}$$
(1)
$$2{\mathbf{C}} \rightleftarrows \mathbf C _{2}$$
(2)
$${\mathbf{ABC}} \rightarrow {\mathbf{C}} _{2}$$
(3)

with associated equilibrium constants for the equilibrated bidirectional reactions:

$$K_{1}=\frac{[{\mathbf{ABC}} ]}{[{\mathbf{A}} ][{\mathbf{B}} ][{\mathbf{C}} ]}$$
(4)
$$K_{2}=\frac{[{\mathbf{C}}_{2}]}{[{\mathbf{C}} ]^{2}}$$
(5)

and with kinetic constant k for the unidirectional rate-limiting step.

ILG Model

It consists of three fast quasi-instantaneously equilibrated bidirectional reactions and one unidirectional rate-limiting step:

$${\mathbf{A}} +{\mathbf{B}} +{\mathbf{C}} \rightleftarrows {\mathbf{ABC}}$$
(6)
$$2{\mathbf{C}} \rightleftarrows {\mathbf{C}}_{2}$$
(7)
$$\mathbf C _{2}+\mathbf L \rightleftarrows \mathbf D$$
(8)
$${\mathbf{ABC}} \rightarrow {\mathbf{D}}$$
(9)

with associated equilibrium constants for the equilibrated bidirectional reactions:

$$K_{1}=\frac{[\mathbf{ABC}]}{[\mathbf{A}][{\mathbf{B}}][\mathbf{C}]}$$
(10)
$$K_{2}=\frac{[{\mathbf{C}}_{2}]}{[{\mathbf{C}}]^{2}}$$
(11)
$$K_{3}=\frac{[{\mathbf{D}}]}{[{\mathbf{C}}_{2}][{\mathbf{L}}]}$$
(12)

and with kinetic constant k for the unidirectional rate-limiting step.

The reactions that were listed above for the ILG model are necessary for template self-replication. However, the formation of unwanted complexes is also possible:

  1. Precursors \(\mathbf{A}\) and \(\mathbf{B}\) may bind to one another in two possible ways: looking at the schematic in Fig. 1b, either \(\mathbf{B}\) binds on top of \(\mathbf{A}\) with equilibrium constant \(K_{2}\), or \(\mathbf{A}\) binds on top of \(\mathbf{B}\) with equilibrium constant \(K_{3}\). This may lead to a chain polymerization giving sequences \((\mathbf{AB})_{n}\), where n is an integer. It can be easily shown that if \(K_{2}[\mathbf{B}]K_{3}[\mathbf{A}]\approx 1\), then \(\mathbf{A}\) and \(\mathbf{B}\) freeze into such chain polymers and are no longer available as free precursors. On the contrary, if \(K_{2}[\mathbf{B}]K_{3}[\mathbf{A}]\ll 1\), then the amount of chain-polymerized \((\mathbf{AB})_{n}\) complexes is negligible. In the remainder of this text, we shall assume the latter.

  2. Complexes \(\mathbf{ABC}\), \(\mathbf{D}\) and \(\mathbf{C}_{2}\) may also chain polymerize by binding to one another in any order, with associated equilibrium constant \(K_{2}\). Precursors \(\mathbf{A}\) and \(\mathbf{B}\) may also participate in such higher-order complexes. In the remainder of this text, we shall assume that the total amount of template c is much smaller than the total amount of precursor, a or b, so that in the above-assumed regime, \(K_{2}[\mathbf{B}]K_{3}[\mathbf{A}]\ll 1\), the contribution of all such complexes can be neglected.

  3. The leaving group \(\mathbf{L}\) may bind to the template \(\mathbf{C}\) to give the complex \(\mathbf{LC}\) with equilibrium constant \(K_{3}\). Such a complex may block further self-replication of \(\mathbf{C}\) and should therefore be taken into account. Precursor \(\mathbf{B}\) may also bind to this complex to give the complex \(\mathbf{LBC}\), but in the above-assumed regime, \(K_{2}[\mathbf{B}]K_{3}[\mathbf{A}]\ll 1\), the contribution of this complex can be neglected.

Within the assumed regime of \(K_{2}[\mathbf B ]K_{3}[\mathbf A ]\ll 1\) the only unwanted reaction that should also be taken into account is

$$\begin{aligned} \mathbf L +\mathbf C \rightleftarrows \mathbf {LC} \end{aligned}$$
(13)

with associated equilibrium constant

$$K_{3}=\frac{[\mathbf{LC}]}{[\mathbf{L}][\mathbf{C}]}.$$
(14)

Rapid Equilibria

The model aims at quantifying the rate of template replication at short times after mixing a solution containing only the template with a solution containing the precursors. It assumes that right after mixing, the chemical system reaches thermodynamic equilibrium (as the result of quasi-instantaneously equilibrated reactions) before the rate-limiting, unidirectional step starts to proceed.

Minimal Replicator Model

This starting equilibrium state can be determined knowing the initial template concentration c (the total amount of template in whichever form, single-strand \(\mathbf{C}\), duplex \(\mathbf{C}_{2}\) or complex \(\mathbf{ABC}\), divided by the total volume) as well as the initial precursor concentrations a and b (the total amounts of precursors \({\mathbf{A}}\) and \({\mathbf{B}}\), respectively, divided by the total volume). This is because the quantities a, b and c are conserved by the equilibrated reactions, which results in

$$a=[\mathbf A ]+[{\mathbf {ABC} }]$$
(15)
$$b=[{\mathbf{B}}]+[{\mathbf{ABC}}]$$
(16)
$$c = [{\mathbf{C}}]+[{\mathbf{ABC}}]+2[{\mathbf{C}}_{2}].$$
(17)

Assuming that the initial amount of template is much smaller than the amount of precursors, \(c\ll a \, {\rm or}\, b,\) we have \([{\mathbf{ABC}}] < c \ll a\, {\rm or}\, b\) so that

$$a \approx [{\mathbf{A}}]$$
(18)
$$b\approx [{\mathbf{B}}].$$
(19)

Defining the dimensionless quantity q as

$$q = \frac{1}{K_{1}ab}$$
(20)

and using it in Eqs. (4) and (5) gives \([\mathbf{C}]\) and \([{\mathbf{C}}_{2}]\) as a function of the single unknown \([\mathbf{ABC}]\):

$$[{\mathbf{C}}] = q[{\mathbf{ABC}}]$$
(21)
$$[{\mathbf{C}}_{2}]=K_{2}(q[{\mathbf{ABC}}])^{2}.$$
(22)

Substituting the above equations into Eq. 17 makes \([\mathbf{ABC}]\) the solution of the following quadratic equation:

$$\begin{aligned} c=(1+q)[\mathbf {ABC} ]+2K_{2}(q[\mathbf {ABC} ])^{2} \end{aligned}$$
(23)

which has a single positive solution:

$$\begin{aligned}{}[\mathbf {ABC} ]=\frac{\sqrt{8K_{2}cq^{2}+(1+q)^{2}}-(1+q)}{4K_{2}q^{2}}. \end{aligned}$$
(24)
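As a quick numerical check of Eqs. (20)–(24), the following sketch evaluates \([\mathbf{ABC}]\) for the minimal replicator and verifies that it satisfies Eq. (23). The concentrations a, b and c are those quoted in the “Results” section; the values of \(K_{1}\) and \(K_{2}\) and the function name are illustrative assumptions, not parameters taken from Table 1.

```python
import math

def abc_minimal(c, a, b, K1, K2):
    """Equilibrium [ABC] for the minimal replicator, Eq. (24)."""
    q = 1.0 / (K1 * a * b)                              # Eq. (20)
    s = math.sqrt(8.0 * K2 * c * q**2 + (1.0 + q)**2)
    # Eq. (24) multiplied top and bottom by (s + 1 + q): same value,
    # but without subtracting two nearly equal numbers when K2 is small.
    return 2.0 * c / (s + 1.0 + q)

# Concentrations from the Results section; K1 and K2 are hypothetical.
a = b = 1e-3        # M
c = 1e-4            # M
K1, K2 = 1e6, 1e4   # M^-2 and M^-1, illustrative only

x = abc_minimal(c, a, b, K1, K2)
q = 1.0 / (K1 * a * b)
residual = (1.0 + q) * x + 2.0 * K2 * (q * x)**2 - c    # Eq. (23)
print(f"[ABC] = {x:.3e} M, residual of Eq. (23) = {residual:.1e} M")
```

The rationalised form is used only for numerical robustness; it is algebraically identical to Eq. (24).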

ILG Model

Besides the conserved quantities a, b and c, the system of equilibrated reactions has an additional conserved quantity, the amount of leaving group l (which might be initially present as the result of residual hydrolysis in the starting precursor solutions). These conserved quantities are given by

$$a=[{\mathbf{A }}]+[{\mathbf{ABC}}]$$
(25)
$$b=[{\mathbf{B}} ]+[{\mathbf{ABC}}]$$
(26)
$$\begin{aligned} c=[{\mathbf{C}}]+[{\mathbf{LC}}]+[{\mathbf{ABC}} ]+2[{\mathbf{C}}_{2}]+2[{\mathbf{D}}] \end{aligned}$$
(27)
$$l=[{\mathbf{D}} ]+[{\mathbf{L}} ]+[{\mathbf{LC}} ].$$
(28)

As for the minimal replicator model, the initial amount of template is also assumed to be much smaller than the amount of precursors, \(c\ll a\,{\rm or}\, b,\) so that

$$a\approx [{\mathbf{A}} ]$$
(29)
$$b\approx [{\mathbf{B}}].$$
(30)

As for the minimal replicator model, the dimensionless quantity q is also defined as

$$\begin{aligned} q=\frac{1}{K_{1}ab}. \end{aligned}$$
(31)

Using it in Eqs. (10), (11), (12) and (14) gives \([{\mathbf{C}}]\), \([\mathbf{LC}]\), \([\mathbf{C}_{2}]\) and \([\mathbf{D}]\) as a function of the two unknowns \([\mathbf{ABC}]\) and \([\mathbf{L}]\):

$$\begin{aligned}{}[{\mathbf C} ]=q[\mathbf {ABC} ] \end{aligned}$$
(32)
$$\begin{aligned}{}[\mathbf {LC} ]=K_{3}q[\mathbf {ABC} ][\mathbf L ] \end{aligned}$$
(33)
$$\begin{aligned}{}[{\mathbf C _{2}}]=K_{2}(q[\mathbf {ABC} ])^{2} \end{aligned}$$
(34)
$$\begin{aligned}{}[{\mathbf D} ]=K_{2}K_{3}(q[\mathbf {ABC} ])^{2}[\mathbf L ]. \end{aligned}$$
(35)

Feeding the above equations in Eqs. (27) and (28) makes \([\mathbf {ABC} ]\) and \([\mathbf L ]\) solutions of the set of coupled equations:

$$\begin{aligned} c=(1+q+K_{3}q[\mathbf {L} ])[\mathbf {ABC} ]+2K_{2}(q[\mathbf {ABC} ])^{2}+2K_{2}K_{3}(q[\mathbf {ABC} ])^{2}[\mathbf L ] \end{aligned}$$
(36)
$$\begin{aligned} l=(1+K_{3}q[{\mathbf {ABC}} ]+K_{2}K_{3}(q[\mathbf {ABC} ])^{2})[\mathbf L ] \end{aligned}$$
(37)

which has a single positive solution, but without any analytical expression.
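Since Eqs. (36) and (37) have no closed-form solution, they have to be solved numerically. One possible approach, sketched below under stated assumptions, is to solve Eq. (37) for \([\mathbf{L}]\) at a given \([\mathbf{ABC}]\), substitute it into Eq. (36), and find the root by bisection on the interval \(0\le [\mathbf{ABC}]\le c/(1+q)\); with \([\mathbf{L}]\) eliminated in this way, the right-hand side of Eq. (36) can be checked to increase with \([\mathbf{ABC}]\). The function name and the values of \(K_{1}\), \(K_{2}\) and \(K_{3}\) are hypothetical; a, b, c and l are the values quoted in the “Results” section.

```python
def solve_ilg(c, l, a, b, K1, K2, K3, iters=200):
    """Return ([ABC], [L]) solving Eqs. (36)-(37) by bisection."""
    q = 1.0 / (K1 * a * b)                              # Eq. (31)

    def free_L(x):
        # Eq. (37) solved for [L] at a given x = [ABC]
        return l / (1.0 + K3 * q * x + K2 * K3 * (q * x)**2)

    def rhs(x):
        # Right-hand side of Eq. (36) with [L] eliminated
        L = free_L(x)
        return ((1.0 + q + K3 * q * L) * x
                + 2.0 * K2 * (q * x)**2 * (1.0 + K3 * L))

    lo, hi = 0.0, c / (1.0 + q)     # rhs(lo) = 0 <= c <= rhs(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) < c:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, free_L(x)

# a, b, c, l from the Results section; K1, K2, K3 are hypothetical.
a = b = 1e-3
c = 1e-4
K1, K2, K3 = 1e6, 1e2, 1e2
for l in (0.0, 10 * c):
    x, L = solve_ilg(c, l, a, b, K1, K2, K3)
    print(f"l = {l:.1e} M: [ABC] = {x:.3e} M, [L] = {L:.3e} M")
```

Bisection is chosen here for robustness rather than speed; any bracketing root finder would serve equally well.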

\(\mathbf {AB}\) Complex Formation

For both the minimal replicator and ILG models, the above analysis neglects the possible formation of complexes by direct association between the precursors \(\mathbf{A}\) and \(\mathbf{B}\). If the formation of such an \(\mathbf{AB}\) complex has an equilibrium constant \(K_{0}\) and is no longer neglected, then the conserved quantities a and b in the above equations should be replaced by the unbound concentrations of \(\mathbf{A}\) and \(\mathbf{B}\), which are given by

$$a=\frac{K_{0}(c_{\text {A}}-c_{\text {B}})-1+\sqrt{\Delta }}{2K_{0}}$$
(38)
$$b=c_{\text {B}}-c_{\text {A}}+a,$$
(39)

where

$$\begin{aligned} \Delta = {K_{0}}^{2}(c_{\text {A}}-c_{\text {B}})^{2}+2K_{0}(c_{\text {A}}+c_{\text {B}})+1 \end{aligned}$$
(40)

and where \(c_{\text {A}}\) and \(c_{\text {B}}\) are the total known concentrations of \(\mathbf{A}\) and \(\mathbf{B}\), respectively, counting both bound and unbound forms. If there are multiple different such complexes (as is the case for the ILG model, see Fig. 8 in “Detailed Comparison for a Hexanucleotide Template Reference Example” section below), then \(K_{0}\) should be replaced in the above equations by the sum of association constants for all such complexes.
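Equations (38)–(40) can be evaluated directly, as in the following sketch. The function name and the value of \(K_{0}\) are hypothetical; the check at the end uses the mass balance \(c_{\text{A}}=a+K_{0}ab\), which follows from the definition of \(K_{0}\) as an association constant.

```python
import math

def free_precursors(cA, cB, K0):
    """Unbound [A] and [B] from total concentrations, Eqs. (38)-(40).

    Requires K0 > 0; for K0 = 0 simply use a = cA and b = cB.
    """
    delta = K0**2 * (cA - cB)**2 + 2.0 * K0 * (cA + cB) + 1.0    # Eq. (40)
    a = (K0 * (cA - cB) - 1.0 + math.sqrt(delta)) / (2.0 * K0)   # Eq. (38)
    b = cB - cA + a                                              # Eq. (39)
    return a, b

# Total concentrations as in the Results section; K0 is hypothetical.
cA = cB = 1e-3      # M
K0 = 1e2            # M^-1, illustrative only
a, b = free_precursors(cA, cB, K0)
ab = K0 * a * b     # concentration of the AB complex
print(f"a = {a:.3e} M, b = {b:.3e} M, [AB] = {ab:.3e} M")
```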

Template Self-replication

Starting from the above equilibrium that is reached quasi-instantaneously after mixing the template and precursor solutions, the total template concentration c (in whichever form—unbound or bound) starts to increase as the result of the unidirectional rate-limiting step. The rate of increase in c is given by

$$\begin{aligned} \frac{\text {d}c}{\text {d}t}=k[\mathbf {ABC} ]. \end{aligned}$$
(41)

Another quantity of interest is the autocatalytic order p defined as the exponent such that

$$\begin{aligned} \frac{\text {d}c}{\text {d}t}=\alpha c^{p} \end{aligned}$$
(42)

which by differentiation gives

$$\begin{aligned} p=\frac{\text {d}\,\text {log}\left(\frac{\text {d}c}{\text {d}t}\right)}{\text {d}\,\text {log}(c)}=\frac{\text {d}\,\text {log}([{\mathbf{ABC}}])}{\text {d}\,\text {log}(c)}=\frac{c}{[{\mathbf{ABC}}]}\frac{\text {d}[{\mathbf{ABC}}]}{\text {d}c}. \end{aligned}$$
(43)

Minimal Replicator Model

Differentiating Eq. 23 gives

$$\begin{aligned} p=\frac{4K_{2}cq^{2}}{8K_{2}cq^{2}+(1+q)^{2}-(1+q)\sqrt{8K_{2}cq^{2}+(1+q)^{2}}}. \end{aligned}$$
(44)
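The sketch below evaluates Eq. (44) and compares it with the form \(p=\tfrac{1}{2}\left(1+(1+q)/\sqrt{8K_{2}cq^{2}+(1+q)^{2}}\right)\), obtained here by combining Eq. (43) with the derivative of Eq. (23); this second form is our own rearrangement rather than an expression from the source, and it makes the bounds \(0.5<p\le 1\) explicit. The function name and the values of \(K_{1}\) and \(K_{2}\) are hypothetical.

```python
import math

def order_minimal(c, a, b, K1, K2):
    """Autocatalytic order p of the minimal replicator (requires K2 > 0).

    Returns (p from Eq. (44), p from the rearranged form).
    """
    q = 1.0 / (K1 * a * b)                              # Eq. (20)
    t = 8.0 * K2 * c * q**2
    s = math.sqrt(t + (1.0 + q)**2)
    p_eq44 = 0.5 * t / (t + (1.0 + q)**2 - (1.0 + q) * s)   # Eq. (44)
    p_alt = 0.5 * (1.0 + (1.0 + q) / s)                 # rearranged form
    return p_eq44, p_alt

a = b = 1e-3        # M, from the Results section
c = 1e-4            # M, from the Results section
K1 = 1e6            # M^-2, hypothetical
for K2 in (1e0, 1e2, 1e4, 1e6, 1e8, 1e10):              # M^-1, hypothetical
    p44, palt = order_minimal(c, a, b, K1, K2)
    print(f"K2 = {K2:.0e}: p (Eq. 44) = {p44:.4f}, p (rearranged) = {palt:.4f}")
```

In this sweep p moves from nearly 1 for a weakly bound duplex towards 0.5 for a strongly bound one, which is the template inhibition described in the Introduction.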

ILG Model

The two conserved quantities c and l both vary as the result of the unidirectional rate-limiting step, and at the same rate:

$$\begin{aligned} \frac{\text {d}c}{\text {d}t}=\frac{\text {d}l}{\text {d}t} \end{aligned}$$
(45)

which is given by Eq. 41. Differentiating Eqs. (36) and (37) gives

$$\begin{aligned} \frac{\text {d}c}{\text {d}t}&=\left(1+q+K_{3}q[{\mathbf{L}} ]\right)\frac{\text {d}[{\mathbf{ABC}} ]}{\text {d}t}+K_{3}q[{\mathbf{ABC}}]\frac{\text {d}[{\mathbf{L}}]}{\text {d}t}+4K_{2}q^{2}[{\mathbf{ABC}} ]\frac{\text {d}[{\mathbf{ABC}}]}{\text {d}t}\nonumber \\&\quad +\,2K_{2}K_{3}q^{2}\left(2[{\mathbf{ABC}}]\frac{\text {d}[{\mathbf{ABC}}]}{\text {d}t}[{\mathbf{L}}]+[{\mathbf{ABC}} ]^{2}\frac{\text {d}[{\mathbf{L}}]}{\text {d}t}\right) \end{aligned}$$
(46)
$$\begin{aligned} \frac{\text {d}l}{\text {d}t}&=(2K_{2}K_{3}q^{2}[{\mathbf{ABC}} ]+K_{3}q)\frac{\text {d}[{\mathbf{ABC}} ]}{\text {d}t}[{\mathbf{L}} ]\nonumber \\&\quad +\,(1+K_{2}K_{3}(q[{\mathbf{ABC}} ])^{2}+K_{3}q[{\mathbf{ABC}}])\frac{\text {d}[{\mathbf{L}} ]}{\text {d}t}. \end{aligned}$$
(47)

Equating the above two expressions (because of Eq. 45) gives the following expression for \(\text {d}[{\mathbf{L}} ]\)/\(\text {d}t\):

$$\frac{\text {d}[{\mathbf{L}} ]}{\text {d}t}=\frac{\text {d}[{\mathbf{ABC}}]}{\text {d}t}\times \frac{4K_{2}q^{2}[{\mathbf{ABC}}]+(1+q)+2K_{2}K_{3}q^{2}[{\mathbf{ABC}}][{\mathbf{L}} ]}{1-K_{2}K_{3}q^{2}[{\mathbf{ABC}} ]^{2}}.$$
(48)

Feeding this back into Eqs. (46) or (47) and rearranging terms gives

$$\begin{aligned} \frac{\text {d}[{\mathbf{ABC}}]}{\text {d}t}=\frac{\text {d}c}{\text {d}t}\times \frac{1-K_{2}K_{3}q^{2}[{\mathbf{ABC}}]^{2}}{Q} \end{aligned}$$
(49)

from which the autocatalytic order p can be determined using Eq. 43:

$$\begin{aligned} p=\frac{c}{[{\mathbf{ABC}} ]}\times \frac{(1-K_{2}K_{3}q^{2}[{\mathbf{ABC}} ]^{2})}{Q}, \end{aligned}$$
(50)

where

$$\begin{aligned} Q&=\big (4K_{2}q^{2}[{\mathbf{ABC}} ]\nonumber \\&\quad +\,(1+q)\big )(1+K_{2}K_{3}q^{2}[{\mathbf{ABC}}]^{2}+K_{3}q[{\mathbf{ABC }}])\nonumber \\&\quad +\,4K_{2}K_{3}q^{2}[{\mathbf{ABC}}][{\mathbf{L}} ]\nonumber \\&\quad +\,K_{3}q[{\mathbf{L}} ](1+K_{2}K_{3}q^{2}[{\mathbf{ABC}} ]^{2}). \end{aligned}$$
(51)
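Equations (50) and (51) can be checked numerically against the definition in Eq. (43). The sketch below computes p from Eqs. (50)–(51) and, independently, from a centred finite difference of \([\mathbf{ABC}]\) taken along the path \(\text{d}c=\text{d}l\) imposed by Eq. (45). The bisection solver, the function names, the step size h and the values of q, \(K_{2}\) and \(K_{3}\) are assumptions made for illustration.

```python
def solve_ilg(c, l, q, K2, K3, iters=200):
    """Bisection for x = [ABC] from Eqs. (36)-(37); returns (x, [L])."""
    def free_L(x):
        return l / (1.0 + K3 * q * x + K2 * K3 * (q * x)**2)   # Eq. (37)

    def rhs(x):
        L = free_L(x)
        return ((1.0 + q + K3 * q * L) * x
                + 2.0 * K2 * (q * x)**2 * (1.0 + K3 * L))      # Eq. (36)

    lo, hi = 0.0, c / (1.0 + q)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) < c:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, free_L(x)

def order_ilg(c, l, q, K2, K3):
    """Autocatalytic order p from Eqs. (50)-(51)."""
    x, L = solve_ilg(c, l, q, K2, K3)
    u = K2 * K3 * (q * x)**2            # K2 K3 q^2 [ABC]^2
    v = K3 * q * x                      # K3 q [ABC]
    Q = ((4.0 * K2 * q**2 * x + 1.0 + q) * (1.0 + u + v)
         + 4.0 * K2 * K3 * q**2 * x * L
         + K3 * q * L * (1.0 + u))      # Eq. (51)
    return (c / x) * (1.0 - u) / Q      # Eq. (50)

def order_ilg_fd(c, l, q, K2, K3, h=1e-9):
    """Same quantity from Eq. (43), differentiating along dc = dl (needs l >= h)."""
    x, _ = solve_ilg(c, l, q, K2, K3)
    xp, _ = solve_ilg(c + h, l + h, q, K2, K3)
    xm, _ = solve_ilg(c - h, l - h, q, K2, K3)
    return (c / x) * (xp - xm) / (2.0 * h)

# c and l from the Results section; q = 1 corresponds to a hypothetical
# K1 = 1e6 M^-2 with a = b = 1e-3 M; K2 and K3 are hypothetical.
c, l, q, K2, K3 = 1e-4, 1e-3, 1.0, 1e2, 1e2
p = order_ilg(c, l, q, K2, K3)
p_fd = order_ilg_fd(c, l, q, K2, K3)
print(f"p (Eqs. 50-51) = {p:.6f}, p (finite difference) = {p_fd:.6f}")
```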

Results

The following numerical values are used in this section for both the minimal replicator and ILG models: \(a=b=1\times 10^{-3}\,{\mathrm{M}}\) and \(c=1\times 10^{-4}\,\mathrm{M}\) as in von Kiedrowski (1993). The total amount of leaving group l is an additional conserved quantity in the case of the ILG model, for which two extreme values are considered: \(l=0\) and \(l=10\times c=1\times 10^{-3}\,\mathrm{M}\). The former assumes that hydrolysis of precursors prior to mixing the precursor and template solutions is negligible. The latter corresponds to a situation where the purity of the initial precursor \({\mathbf{A}}\) solution is only 50 % due to significant hydrolysis prior to mixing the precursor and template solutions. Further details on the source of l will be elucidated in “Detailed Comparison for a Hexanucleotide Template Reference Example” section.

General Comparison

For a given k, the ratio \([{\mathbf{ABC}} ]\)/c quantifies the relative rate of increase of the total template quantity c (see Eq. 41). Figure 2 shows the autocatalytic order (top) and the ratio \([{\mathbf{ABC}} ]\)/c (bottom) as a function of \(K_{1}\) and \(K_{2}\) for the minimal replicator model, and of \(K_{1}\) and \(K=K_{2}=K_{3}\) for the ILG model, for two different choices of the total amount of leaving group, \(l=0\) or \(l=10\times c\). The rationale for the choice \(K_{2}=K_{3}\) for the ILG model is that the bonds involved in the corresponding equilibrated reactions are comparable. This will be made clearer with the reference example described in the following section.

Fig. 2

Autocatalytic order p (top) and \([{\mathbf{ABC}}]\)/c ratio (bottom) as a function of the decimal logarithm of \(K_{1}\) and \(K_{2}\) for the minimal replicator model (a), and \(K_{1}\) and \(K=K_{2}=K_{3}\) for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For each plot of p, the vertical axis scale has been adjusted to the range of variation

Comparing the bottom plots of Fig. 2b, c, the rate of template replication decreases on increasing the total amount of leaving group, l. This is because the concentration of template bound to a leaving group, \([{\mathbf{LC}}]\), and that of template duplex bound to a leaving group, \([{\mathbf{D}}]\), both increase on increasing l, which in turn decreases \([{\mathbf{ABC}}]\) because of the conservation relation (Eq. 27).

Comparing the top plots of Fig. 2b, c, the autocatalytic order p increases on increasing the total amount of leaving group, l. Although surprising at first sight, this may be understood as follows. The lower the initial l, the higher the initial rate of template replication. Any template replication results in an equal leaving group production rate, see Eq. 45. The leaving group thus produced binds to the duplex, reducing the rate of increase of template replication. And the lower the initial l, the larger this inhibition mechanism is in relative terms. In the extreme case where \(l \gg c\), \([{\mathbf{L}}]\approx l\) and, from Eq. 36, \([{\mathbf{ABC}} ]\) is the solution of

$$\begin{aligned} c=(1+q(1+K_{3}l))[{\mathbf{ABC}}]+2K_{2}(q[{\mathbf{ABC}} ])^{2}(1+K_{3}l) \end{aligned}$$
(52)

which has the same form as Eq. 23 for the minimal replicator model, replacing q by \(q(1+K_{3}l)\) (or equivalently, replacing \(K_{1}\) by \(K_{1}\)/\((1+K_{3}l)\)) and \(K_{2}\) by \(K_{2}\)/\((1+K_{3}l)\). Increasing l is thus formally equivalent to decreasing both \(K_{1}\) and \(K_{2}\), which asymptotically makes the autocatalytic order close to unity as can be seen from Fig. 2a.
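The mapping between Eq. (52) and Eq. (23) can be illustrated numerically. The sketch below solves the full ILG equilibrium, Eqs. (36)–(37), for a large leaving-group excess and compares the result with the minimal replicator solution, Eq. (24), evaluated with the rescaled parameters \(q(1+K_{3}l)\) and \(K_{2}/(1+K_{3}l)\). The bisection solver, the function names, the choice \(l=1000\times c\) and the values of q, \(K_{2}\) and \(K_{3}\) are assumptions made for illustration.

```python
import math

def abc_ilg(c, l, q, K2, K3, iters=200):
    """[ABC] from Eqs. (36)-(37), solved by bisection."""
    def rhs(x):
        L = l / (1.0 + K3 * q * x + K2 * K3 * (q * x)**2)      # Eq. (37)
        return ((1.0 + q + K3 * q * L) * x
                + 2.0 * K2 * (q * x)**2 * (1.0 + K3 * L))      # Eq. (36)
    lo, hi = 0.0, c / (1.0 + q)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def abc_minimal(c, q, K2):
    """[ABC] from Eq. (24), written in an equivalent rationalised form."""
    s = math.sqrt(8.0 * K2 * c * q**2 + (1.0 + q)**2)
    return 2.0 * c / (s + 1.0 + q)

q, K2, K3 = 1.0, 1e2, 1e2       # hypothetical parameters
c, l = 1e-4, 1e-1               # M; l = 1000 c, chosen so that l >> c
scale = 1.0 + K3 * l
x_ilg = abc_ilg(c, l, q, K2, K3)
x_map = abc_minimal(c, q * scale, K2 / scale)   # rescaled minimal replicator
print(f"ILG: {x_ilg:.4e} M, rescaled minimal replicator: {x_map:.4e} M")
```

The two values agree only approximately, because Eq. (52) rests on the approximation \([{\mathbf{L}}]\approx l\).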

This is also the reason why, whereas \(p\ge 0.5\) for the minimal replicator model, p may turn negative for the ILG model (which would correspond to a sublinear c trajectory). However, we shall see in the next section that, for a realistic numerical example mirroring that of von Kiedrowski (1993), the autocatalytic order p for the ILG model can actually exceed that for the minimal replicator model.

It should also be noted that the assumed regime of \(K_{2}[{\mathbf{B}} ]K_{3}[{\mathbf{A }}]\ll 1\) (see end of “Chemical Reaction System” section) imposes \(K_{2}<1\)/\(b=1\times 10^{3}\,\mathrm{M}^{-1}\) and \(K_{3}\,<\,1\)/\(a=1\times 10^{3}\,\mathrm{M}^{-1}\). This restricts the applicability of Fig. 2b, c to the quadrant defined by \(2\text {log}_{10}(K)<6\) and \(\text {log}_{10}(K_{1})<6\) (assuming for the latter that \(K_{1}\) is comparable to \(K_{2}K_{3}\), which is itself comparable to \(K^{2}\)).

Detailed Comparison for a Hexanucleotide Template Reference Example

Figure 3 shows the reference hexanucleotide example used for the detailed comparison between the minimal replicator and ILG models. Whereas a DNA hexamer was used in von Kiedrowski (1993), the proposed ILG model relies upon RNA (because of the necessary 5\(^\prime\)–5\(^\prime\) and 3\(^\prime\)–2\(^\prime\) bonds) instead of DNA. For the sake of consistency, we chose to compare the proposed RNA-based ILG model with an RNA-based minimal replicator model using the same hexamer template as in von Kiedrowski (1993) (GCGCGC).

Fig. 3

ILG model. Example template and precursors used for the detailed comparison with the minimal replicator model. Other molecules and complexes that are formed in the course of template self-replication are also shown

Such a palindromic sequence was taken from previous experimental work (von Kiedrowski 1986) where its self-complementary nature was needed in order to create a minimal replicator. In contrast, the example template chosen for the proposed ILG model should not be self-complementary, so as to avoid the possibility of a strongly bound duplex and the resulting template inhibition. Besides, specific binding sites for each monomer are necessary in order to obtain the correct intermediate for ligation, exactly as in the example used for the minimal replicator model. The branched nonanucleotide template shown in Fig. 3 was chosen to exclude other template–precursor associations which would conflict with the regiochemistry of the ligation step. For example, choosing GCGCGC instead of GGCCGG as a template sequence (and CGC instead of GCC as a precursor trimer sequence) could result in the trimer CGC binding in two different locations.

Starting from the chosen template for the ILG model, precursors were chosen as follows:

  1. Precursor \({\mathbf{A}}\) consists of two trimers, GCC and CGG, connected by a 5\(^\prime\)–5\(^\prime\) pyrophosphate GC bond (see Fig. 4). Although pyrophosphates can be made in situ, the choice of a pre-activated monomer simplifies the model. The activated bond makes \({\mathbf{A}}\) more prone to hydrolysis, which can be considered as the possible source of \({\mathbf{L}}\) and, thus, of the previously discussed conserved quantity l.

  2. Precursor \({\mathbf{B}}\) consists of two trimers, GGC and CCG, connected by a 2\(^\prime\)–3\(^\prime\) phosphate CG bond. The ligation of \({\mathbf{B}}\) with the non-LG part of \({\mathbf{A}}\) results in the branched template \({\mathbf{C}}\) (see Fig. 4). Hydrolysis of phosphate bonds is not considered in either the ILG or the minimal replicator model.

Fig. 4

ILG model. Schematic of the 5\(^\prime\)–5\(^\prime\) pyrophosphate GC and of the 2\(^\prime\)–3\(^\prime\) phosphate CG bonds

To the best of our knowledge, the use of branched nucleotides in self-replication has not been explored yet, but such molecules have been used in ligation experiments, be it enzymatically (Mendel-Hartvig et al. 2004) or by chemical ligation (Carriero and Damha 2003). Moreover, the reaction occurring in complex \({\mathbf{ABC}}\), resulting in the branched nucleotide, is reminiscent of the lariat structure in RNA splicing (Padgett et al. 1984; Peters and Toor 2015). The key difference between the self-splicing RNAs that inspired this work and our model lies in replacing the ribozyme activity by a purely chemical model using activated monomers. Although current synthetic methodology (Braich and Damha 1997) allows us to translate our model into a hypothetical nucleotide-based replicator example, one important caveat should be kept in mind. Thermodynamic data are needed to compare the two models, but such experimental data are lacking for branched nucleotides. This is the reason why we used data and estimation methods for unbranched RNA as an approximation (Freier et al. 1985).

The numerical values used for comparing the two models are given in Table 1. The values for the RNA minimal replicator model are very close to those used in von Kiedrowski (1993) for DNA. \(\Delta G_{i}=\Delta H_{i}-T\Delta S_{i}\), \(i=1\) or 2 for the minimal replicator model and \(i=1, 2,\, {\rm or}\, 3\) for the ILG model, is the standard Gibbs energy drop across the corresponding equilibrated reaction identified by its index i, so that the corresponding equilibrium constant is given by \(K_{i}=\text {exp}(-\Delta G_{i}\)/\(\mathcal {R}T),\) where \(\mathcal {R}\) is the ideal gas constant and T is the absolute temperature. As in von Kiedrowski (1993), a best case and a worst case were considered for the estimation of the precursors-to-template association parameters (\(\Delta H_{1}\) and \(\Delta S_{1}\)). The best case assumes a cooperative binding between precursors upon addition to the template, whereas the worst case assumes non-cooperative binding.
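The conversion from \(\Delta H_{i}\) and \(\Delta S_{i}\) to \(K_{i}\) involves a unit change that is easy to overlook, since Table 1 gives enthalpies in kcal mol\(^{-1}\) and entropies in cal mol\(^{-1}\) K\(^{-1}\). The following sketch makes the conversion explicit. The function name and the numerical values of \(\Delta H\) and \(\Delta S\) are hypothetical placeholders and are not the entries of Table 1.

```python
import math

R_CAL = 1.987204    # ideal gas constant in cal mol^-1 K^-1

def equilibrium_constant(dH_kcal, dS_cal, T):
    """K = exp(-dG / (R T)) with dG = dH - T dS.

    dH_kcal in kcal/mol, dS_cal in cal/(mol K), T in kelvin.
    """
    dG_cal = 1000.0 * dH_kcal - T * dS_cal   # convert kcal to cal first
    return math.exp(-dG_cal / (R_CAL * T))

# Hypothetical values, for illustration only (not taken from Table 1).
dH, dS = -10.0, -30.0
for T in (284.15, 298.15, 313.15):
    K = equilibrium_constant(dH, dS, T)
    print(f"T = {T:.2f} K: K = {K:.3g}")
```

For an exothermic association such as this placeholder, K decreases on heating, which is why the complexes in the model dissociate at high temperature.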

Table 1 Numerical parameters used for the comparison between the two models. Enthalpy changes (\(\Delta H\)) are given in kcal mol\(^{-1}\) and entropy changes (\(\Delta S\)) in cal mol\(^{-1}\,\)K\(^{-1}\)

It should be noted that the assumed regime defined by \(K_{2}[{\mathbf{B}} ]K_{3}[{\mathbf{A}} ]\ll 1\) (see end of “Chemical Reaction System” section) restricts the applicability of the analysis to temperatures above 284 K (11 \(^\circ\)C) for the ILG model.

Without Taking AB Complex Formation into Account

Figure 5 shows how the equilibrium distribution of the different species varies with temperature, for the minimal replicator model (left) and for the ILG model (right). For both minimal replicator and ILG models, a main difference between the best case and the worst case is the temperature below which the \({\mathbf{ABC}}\) complex can exist. Unlike for the minimal replicator model, for the ILG model the template duplex \({\mathbf{C}}_{2}\) is never present in any significant amount compared to the \({\mathbf{ABC}}\) complex or to the unbound template \({\mathbf{C}}\). Increasing the amount of leaving group impacts only marginally this distribution.

As in von Kiedrowski (1993), both k and \([{\mathbf{ABC}}]\) contribute to the temperature dependence of the rate r of template formation given by Eq. 41. The rate constant k is expected to vary with temperature as

$$k=A\times \text {exp}\left( -\frac{E_{\text {a}}}{\mathcal {R}T}\right) ,$$
(53)

where A is a constant. Figure 6 shows how the rate r of template formation varies with temperature, for the minimal replicator model (left) and for the ILG model (right), for various activation energies \(E_{\text {a}}\).
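The two contributions to the rate r can be separated as in the sketch below, which evaluates Eq. (53) and then Eq. (41) for a given equilibrium concentration of \({\mathbf{ABC}}\). The function names, the unit chosen for \(E_{\text {a}}\) (kcal mol\(^{-1}\)) and all numerical values are assumptions made for illustration.

```python
import math

R_CAL = 1.987204    # ideal gas constant in cal mol^-1 K^-1

def rate_constant(A, Ea_kcal, T):
    """Arrhenius rate constant, Eq. (53); Ea_kcal in kcal/mol (assumed unit)."""
    return A * math.exp(-1000.0 * Ea_kcal / (R_CAL * T))

def template_rate(A, Ea_kcal, T, abc):
    """Rate of template formation r = k [ABC], Eq. (41)."""
    return rate_constant(A, Ea_kcal, T) * abc

# Hypothetical pre-factor, activation energy and [ABC] values.
A, Ea = 1.0, 15.0
for T, abc in ((288.15, 4e-5), (298.15, 2e-5), (308.15, 5e-6)):
    r = template_rate(A, Ea, T, abc)
    print(f"T = {T:.2f} K: k = {rate_constant(A, Ea, T):.3e}, r = {r:.3e}")
```

Because k rises with temperature while \([{\mathbf{ABC}}]\) eventually falls as the complex dissociates, their product can pass through a maximum.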

Figure 7 shows how the autocatalytic order p varies with temperature, for the minimal replicator model (left) and for the ILG model (right).

von Kiedrowski (1993) distinguishes three regimes:

  1. Strong exponential: most of the template resides in the \({\mathbf{ABC}}\) complex form, the autocatalytic order p is close to unity, and the rate of template formation r increases on increasing temperature.

  2. Weak exponential: most of the template resides either in the \({\mathbf{ABC}}\) complex or in the unbound \({\mathbf{C}}\) forms, but not in the duplex \({\mathbf{C}}_{2}\) form, the autocatalytic order p is intermediate between 0.5 and 1, and the rate of template formation typically decreases on increasing temperature.

  3. Parabolic: most of the template resides either in the \({\mathbf{ABC}}\) complex or in the duplex \({\mathbf{C}}_{2}\) forms, but not in the unbound \({\mathbf{C}}\) form, the autocatalytic order p is close to 0.5, and the rate of template formation either increases or decreases on increasing temperature.

Whereas the minimal replicator model behaves as strong exponential at low temperature, weak exponential at high temperature and parabolic at intermediate temperature, the ILG model behaves as strong exponential at low temperature and as weak exponential at high temperature. The autocatalytic order p stays close to unity throughout the investigated temperature range, and the rate of template formation r reaches its maximum at a temperature that is nearly independent of the activation energy. Besides, this behaviour is only marginally impacted by the initial amount of leaving group.

It thus appears that the ILG model is more favourable than the minimal replicator model because (i) its autocatalytic order remains close to unity over the entire temperature range, and (ii) the rate of template formation r is always higher for the ILG model than for the minimal replicator model. Besides, these properties hold for both best and worst cases (regarding the estimation of \(\Delta G_{1}\)) as well as for a low or high initial amount of leaving group.

This advantage holds even when only temperatures above 284 K (11 \(^\circ\)C) are considered for the ILG model (solid lines in Figs. 5, 6 and 7), a restriction that ensures \(K_{2}[{\mathbf{B}} ]K_{3}[{\mathbf{A}} ]\ll 1\) (see end of the “Chemical Reaction System” section). However, within this restricted temperature range, the maximum self-replication rate strongly depends on the estimated \(\Delta G_{1}\): it is significantly higher in the best case than in the worst case, because the maximum self-replication rate occurs very close to the lowest applicable temperature. Similarly, Fig. 5 shows that for the ILG model, strong exponential self-replication might only be observed close to this lowest temperature (freezing point of precursors into long chain-polymerized assemblies) and for the best case. For the worst case, the template resides mostly in the unbound \({\mathbf{C}}\) form in the applicable temperature range, which results in weak exponential self-replication.

This result is not a general property of the ILG model as seen in Fig. 2. Depending on the values of \(K_{1}\) and \(K=K_{2}=K_{3}\), the autocatalytic order p may even turn negative, whereas it is always higher than 0.5 in the case of the minimal replicator model.

The reason why the ILG model is advantageous for the considered hexanucleotide example may be understood as follows.

Fig. 5

Best case (top) and worst case (bottom) equilibrium distributions of the various species as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C). The vertical scale is the concentration of each species divided by the corresponding conserved quantity: \([{\mathbf{ABC}}]\)/c, \([{\mathbf{C}}]\)/c and \([{\mathbf{C}}_{2}]\)/c for the minimal replicator model; \([{\mathbf{ABC}}]\)/c, \([{\mathbf{C}}]\)/c, \([{\mathbf{C}}_{2}]\)/c, \([{\mathbf{L}}]\)/l, \([{\mathbf{D}}]\)/l and \([{\mathbf{LC}}]\)/l for the ILG model

Fig. 6

Best case (top) and worst case (bottom) decimal logarithm of the rate r of template formation as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C). The different curves correspond to different activation energies \(E_{\text {a}}\) ranging from 10 to 30 kcal mol\(^{-1}\) in 4 kcal mol\(^{-1}\) increments. The multiplicative coefficient in Eq. 53 was arbitrarily taken as \(A=1\times 10^{11}\,\mathrm{s}^{-1}\) as in von Kiedrowski (1993)

Fig. 7

Best case (top) and worst case (bottom) autocatalytic order p as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C)

From Eqs. (35) and (37), we have

$$\begin{aligned}{}[{\mathbf{D}} ]=\frac{K_{2}K_{3}(q[{\mathbf{ABC}} ])^{2}l}{1+K_{3}q[{\mathbf{ABC}} ]+K_{2}K_{3}(q[{\mathbf{ABC}} ])^{2}}<\frac{K_{2}K_{3}(q[{\mathbf{ABC}} ])^{2}l}{1+K_{2}K_{3}(q[{\mathbf{ABC}} ])^{2}}. \end{aligned}$$
(54)

From Eq. 36, we have \(q[{\mathbf{ABC}} ]<c\) which fed into the above equation gives:

$$\begin{aligned}{}[{\mathbf{D}} ]<\frac{K_{2}K_{3}c^{2}l}{1+K_{2}K_{3}c^{2}}. \end{aligned}$$
(55)

The contribution of \([{\mathbf{D}} ]\) to Eq. 36 would thus be negligible if the following condition were verified:

$$\begin{aligned} \frac{K_{2}K_{3}c^{2}l}{1+K_{2}K_{3}c^{2}}\ll c. \end{aligned}$$
(56)

This is indeed the case for the considered hexanucleotide example. At room temperature (T=298 K), we have \(K_{2}\approx 60\,\mathrm{M}^{-1}\) and \(K_{3}\approx 360\,\mathrm{M}^{-1}\). This gives \(K_{2}K_{3}c^{2}\approx 2\times 10^{-4}\,\ll 1\) so that the above condition is equivalent to \(K_{2}K_{3}cl\ll 1\). We have \(K_{2}K_{3}cl\approx 2\times 10^{-3}\) even in the extreme case where \(l=10\times c\). With the contribution of \([{\mathbf{D}} ]\) to Eq. 36 being thus negligible, the rate of template replication given by Eq. 41 approximately takes the same expression for the minimal replicator and ILG models. The key advantage of the ILG model arises from the significantly lower duplex stability (and associated lower \(K_{2}\) equilibrium constant) because it only involves one \({\mathbf{A}} -{\mathbf{B}}\) bond instead of two for the minimal replicator model.
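The orders of magnitude quoted above can be checked with a short Python sketch of the bound in Eq. 55 and the condition in Eq. 56. The total template concentration `c = 1e-4` M is an assumption made here: it is the value implied by \(K_{2}K_{3}c^{2}\approx 2\times 10^{-4}\) together with the quoted \(K_{2}\) and \(K_{3}\). The helper name `d_upper_bound` is likewise illustrative.

```python
# Association constants at 298 K, as quoted in the text (M^-1).
K2 = 60.0
K3 = 360.0

# Assumed total template concentration (M); inferred from
# K2*K3*c^2 ~ 2e-4, not stated explicitly in this section.
c = 1.0e-4

# Extreme case considered in the text: l = 10 * c.
l = 10.0 * c


def d_upper_bound(k2, k3, c_total, l_total):
    """Right-hand side of Eq. 55: upper bound on [D]."""
    x = k2 * k3 * c_total ** 2
    return x * l_total / (1.0 + x)


bound = d_upper_bound(K2, K3, c, l)

print(f"K2*K3*c^2   = {K2 * K3 * c ** 2:.3e}")  # should be << 1
print(f"K2*K3*c*l   = {K2 * K3 * c * l:.3e}")   # should be << 1
print(f"bound on [D] relative to c = {bound / c:.3e}")  # Eq. 56: << 1
```

Under these assumptions the bound on \([{\mathbf{D}} ]\) stays a small fraction of c even for \(l=10\times c\), consistent with neglecting its contribution to Eq. 36.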

Taking \({\mathbf{AB}}\) Complex Formation into Account

For the minimal replicator model, there is only one possible \({\mathbf{AB}}\) complex and the corresponding association constant \(K_{0}\) is given in Table 1. In this case, a and b should be replaced by the expressions given by Eqs. (38) and (39).

For the ILG model, Fig. 8 shows that there are two such possible complexes, with association constants \(K_{2}\) and \(K_{3}\) as given in Table 1. In this case, a and b should be replaced by the expressions given by Eqs. (38) and (39), replacing \(K_{0}\) by \(K_{2}+K_{3}\).

Fig. 8

Two complexes that can be formed by direct association between the precursors \({\mathbf{A}}\) and \({\mathbf{B}}\) for the ILG model

Figures 9, 10 and 11 show how the equilibrium distribution of the different species, the rate r of template formation and the autocatalytic order p, respectively, vary with temperature, for the minimal replicator model (left) and for the ILG model (right).

As stressed in von Kiedrowski (1993) for the minimal replicator model, the formation of complex \({\mathbf{AB}}\) makes it difficult to observe the strong exponential growth that might otherwise be found at low temperatures. For the ILG model, \({\mathbf{AB}}\) complex formation does not result in any significant difference in the applicable temperature range.

Fig. 9

Taking \({\mathbf{AB}}\) complex formation into account. Best case (top) and worst case (bottom) equilibrium distributions of the various species as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C). The vertical scale is the concentration of each species divided by the corresponding conserved quantity: \([{\mathbf{ABC}}]\)/c, \([{\mathbf{C}}]\)/c and \([{\mathbf{C}}_{2}]\)/c for the minimal replicator model; \([{\mathbf{ABC}}]\)/c, \([{\mathbf{C}} ]\)/c, \([{\mathbf{C}}_{2}]\)/c, \([{\mathbf{L}} ]\)/l, \([{\mathbf{D}} ]\)/l and \([{\mathbf{LC}} ]\)/l for the ILG model

Fig. 10

Taking \({\mathbf{AB}}\) complex formation into account. Best case (top) and worst case (bottom) decimal logarithm of the rate r of template formation as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C). The different curves correspond to different activation energies \(E_{\text {a}}\) ranging from 10 to 30 kcal mol\(^{-1}\) in 4 kcal mol\(^{-1}\) increments. The multiplicative coefficient in Eq. 53 was arbitrarily taken as \(A=1\times 10^{11}\) s\(^{-1}\) as in von Kiedrowski (1993)

Fig. 11

Taking \({\mathbf{AB}}\) complex formation into account. Best case (top) and worst case (bottom) autocatalytic order p as a function of temperature, for the minimal replicator model (a) and for the ILG model with a total amount of leaving group of \(l=0\) (b) and \(l=10\times c\) (c). For the ILG scheme, dotted lines are used outside the applicable temperature range (\(T<11\,^\circ\)C)

Conclusion

We have proposed a self-replicating scheme based on an informational leaving group (ILG), inspired by the role of tRNA as leaving group and information carrier during protein synthesis. The potential advantage of such a scheme is the weaker bonding of the template duplex and thus reduced template inhibition. We have carried out a theoretical analysis of this ILG scheme following the same approach as that of von Kiedrowski (1993) for the minimal replicator model, and have compared theoretical predictions for this scheme with those for the minimal replicator model. Although the autocatalytic order p may in theory even turn negative for certain values of dissociation constants, when comparing this ILG scheme with the minimal replicator model for a hexanucleotide template sequence mirroring that used by von Kiedrowski in his original theoretical work (von Kiedrowski 1993), we have found that the ILG scheme is expected to outperform the minimal replicator model, with a higher replication rate and a higher autocatalytic order. Although this ILG scheme is expected to be sensitive to initial hydrolysis, this advantage is expected to hold even with significant hydrolysis in the initial mix. For the minimal replicator model, direct precursor-to-precursor complex formation should prevent the strong exponential growth that might otherwise be found at low temperatures (von Kiedrowski 1993). For the proposed ILG model, the applicable temperature range is further restricted to temperatures above the freezing point of precursors into long chain-polymerized self-assemblies. Whether strong exponential growth might be observed in this applicable temperature range strongly depends on the estimated thermodynamic data for the template–precursors complex, for which experimental data are lacking. This stresses the need to confirm these theoretical predictions by future experimental work.