Main

This study is based on a simplified version of the archetypal origami tile1 and, in particular, on the distribution of observed folds of a ‘dimer’ variant which contains two copies of the template sequence in head-to-tail repeat. The ‘monomer’ tile (Fig. 1) is created by annealing a 2,646-nucleotide (nt) circular template with 90 staples, each designed to hybridize to one or more 15- or 16-base domains of the template. 76 of the staples mediate interactions between pairs of non-contiguous template domains, as follows: 66 U-shaped ‘body’ staples form short-range contacts between domains that are relatively close in the primary sequence of the template; and 5 pairs of ‘seam’ staples form long-range contacts, bridging between positions where the template folds back on itself to form a central seam1. Unlike the interactions between amino acid residues that stabilize a protein, staples mediate interactions between template domains that are highly specific: each staple can be considered to bind stably only to complementary domains of the template. The designed fold of the monomer tile corresponds to an absolute minimum in the free energy landscape. This origami folds with high yield to form discrete rectangular tiles of approximately 80 nm × 40 nm (Fig. 1c); approximately 80% of tiles appear to be well folded.

Figure 1: The monomer tile.

a, 66 body staples (blue) and 5 pairs of seam staples (brown) each hybridize to two non-contiguous domains of the circular template. Edge staples (grey) fill gaps at the top and bottom of the structure. Hybridization of body and seam staples pins the corresponding domains of the template together, determining the unique stable, rectangular fold of this simplified origami tile as indicated in b. c, Atomic force micrograph of the monomer tile (scale bar, 50 nm).

The ‘dimer’ template is also circular. It contains two identical copies of the monomer joined head-to-tail and can therefore bind two copies of each staple (Fig. 2). Each pair of body and seam staples can bind in one of two configurations (Fig. 2a) to form either an internal link within each copy of the monomer sequence or a pair of cross-links between the two copies. The total number of possible domain pairings is 2^76 ≈ 10^23. Although many of these configurations are sterically inaccessible, reducing the specificity of staple binding clearly means that, as in the case of protein folding, the number of possible states of the system is overwhelmingly greater than the number of well-folded structures. However, in contrast to proteins (and to conventional origami structures) there is more than one ‘well-folded’ state (Fig. 2): not one but a handful of well-folded states occupy discrete energy minima in a vast configurational landscape. Remarkably, when the dimer origami is annealed by cooling from 95 °C, a small set of well-folded shapes is formed with good yield: each consists of a pair of rectangular tiles attached on one edge (Fig. 2b, c). The probability of finding well-folded structures by random search of configuration space is negligible8, therefore efficient folding pathways must exist21,22. As in protein folding, assembly is constrained such that the system is highly likely to discover free-energy minima that correspond to well-formed final states.

Figure 2: Folding origami tiles with a dimer template.

a, Left, the base sequence of the green section of the template is the same as that of the pink section, so the dimer template can fully hybridize to two copies of each staple. Arrows indicating staple binding and unbinding transitions are annotated with reaction rates used in a model of assembly. Two identical staples can bind to the template in one of two configurations, binding together pairs of domains within each section or connecting domains in different sections. This gives a vast number of possible configurations, a handful of which are well-folded (shown right, and defined in Fig. 3 and Extended Data Figs 1 and 2). Ordered folds of the dimer template comprise two linked rectangular tiles with a characteristic offset on the long or short edges, for example, b or c, respectively (in each case, an AFM image with a scale bar of 50 nm is shown alongside a schematic of the fold).

The dimer origami tile has 22 template routings that correspond to well-folded configurations in which all staple binding sites are occupied and in which the tile is expected to be planar and unstrained. These give 6 unique shapes, each with a characteristic offset between two linked rectangular components which have essentially the same structure as the monomer tile (Fig. 3a–c and Extended Data Fig. 1). These shapes can be grouped into classes according to the contacts made by the seam staples: fold m:n has m pairs of seam staples that connect domains within each half of the template and n pairs of seam staples that form connections between domains in opposite halves (Fig. 3a–c). Folds m:n and n:m are related by symmetry and are therefore not distinguished in our experiments or analysis (Extended Data Fig. 1b). A set of non-planar folds adds a seventh shape to the six defined above and a further 52 template routings (Fig. 3d and Extended Data Fig. 2). Fink and Ball23 have estimated the maximum number of distinct, compact configurations that can be encoded into a single polymer sequence: for a polymer of 168 unique domain types on a square lattice23,24 the theoretical limit is 13. A major factor in allowing the large number of folds in our system is the extensive re-use of structural motifs within distinct folds, a possibility not considered by Fink and Ball.

Figure 3: Classification of well-folded shapes.

a–c, Folds of the dimer origami tile can be classified by the pattern of interactions mediated by the seam staples: these contacts are shown schematically in diagrams in which the dimer template is represented as a circle. Fold m:n has m seam contacts between domains within each copy of the monomer sequence and n seam contacts that form connections between the two copies (that is, connecting template domains of different colour). The fold 5:0 can be further divided into shapes that differ in the offset along the long edge of the two tiles (5:0i, 5:0ii and so on: c). d, The set of legal folds allowed by the model includes the configurations shown in a–c and an additional set of non-planar configurations (NP), one of which is shown (see Extended Data Fig. 2 for the complete set). e, Tiles observed in folding experiments can be classified according to the fractional offset along the short or long edge of the tile (w/W and l/L respectively). f, Seven unique shapes corresponding to well-folded configurations (see also Extended Data Fig. 1). g, Gallery of shapes observed by AFM in a typical experiment with measured fractional offsets (each image is 300 × 300 nm). A bin size of 0.1 is used in histograms of fitted fractional offsets (Fig. 4).

Atomic force microscopy (AFM) enables us to distinguish different configurations of the template and this provides a unique opportunity to study folding pathways. Samples of annealed origami were imaged by AFM. Most observed shapes are consistent with the classification scheme shown in Fig. 3, and the outlines of 44% of objects identified as candidate dimer tiles were successfully fitted to measure the offset between the two component monomer tiles (Fig. 3e and Extended Data Fig. 3).

The distribution of tile shapes was compared to predictions made using a Markov chain model of folding in which each transition corresponds to binding or unbinding of a single staple domain (Fig. 2, Methods section ‘Folding model’). An unbound staple at concentration c binds to the template with a rate k+c (where k+ = 10^6 M−1 s−1, ref. 25). After one half of a staple has bound, the second half can bind with a rate (k+ceff) that depends on its effective concentration, ceff, at the corresponding template domain. The effective concentration depends on the proximity of the template domain which, in turn, depends on the contacts between template domains already established by hybridization of other staples. We expect folding to be dominated by short-range interactions because staples are more likely to connect two template domains that are spatially close, either because they are closely spaced along the template or because the previous binding of other staples is holding them together. To determine the effective concentration, the shortest path through the part-assembled origami that connects the complementary template and staple domains is identified. This connection is modelled as a heterogeneous freely jointed chain with double-stranded (ds) and single-stranded (ss) DNA components. The effective concentration of the part-bound staple at the complementary template domain is related to the probability that the ends of the chain lie spontaneously within a (short) interaction range. Unbinding of a staple domain is treated as a two-state transition, with a configuration-independent rate: k− = k+ exp{ΔG0,duplex/RT} × (1 M), where ΔG0,duplex is the change in standard free energy on forming the duplex at standard concentrations of 1 M. In order to represent steric constraints on folding, the state space of the model is restricted to patterns of staple binding in which each segment of the partially folded origami occurs in one of a set of pre-defined, well-ordered folds.

The histograms in Fig. 4 show distributions of offset values, measured by fitting AFM data (Extended Data Figs 3 and 4), and the corresponding distributions between the discrete shapes shown in Fig. 3 that are predicted by the model. Figure 4a corresponds to the staple set described above (see Fig. 1): structures with each of the seam configurations 5:0, 4:1 and 3:2 are observed. The model suggests that the folding pathway depends on competition between body and seam staples. If local interactions mediated by body staples were to form first and dominate the outcome, the system would prefer the 5:0i fold (see Fig. 3f for nomenclature) in which all body staples are bound to two domains that are as close as possible along the template. In this fold, no staples link the two halves of the template. However, strong seam connections that are inserted early in the folding pathway favour a more uniform distribution between all possible seam configurations: for example, once the part-folded structure 1:1 has formed, the 5:0 fold is inaccessible unless at least one seam connection is broken (Extended Data Fig. 5). With the staple set shown in Fig. 4a, each seam contact is bridged by two staples. The cooperative binding of seam staple pairs offsets the increased entropic cost of forming long-range contacts, with the result that seam staples are incorporated at a similar temperature to body staples in both model (Fig. 4a) and experiment (Extended Data Fig. 6). Consequently the model predicts that all seam configurations should be observed, consistent with experimental observations.

Figure 4: Folding can be guided by modifying staples to steer the folding pathway.

The reference staple set (a) folds to give a distribution of shapes that are characterized by the fractional offset between the two component tiles along the long or short edge. Modifications to the reference staple set (b–e) were designed to fold into specific target shapes. Left-hand panels show the staple configurations and the seam-staple contacts in the target structures. The top left rectangle of each target shape is used to highlight modified staples in red. The distance between the two template domains linked by a staple depends on fold: in the bottom right rectangle, staples are grouped and coloured according to the distance spanned (see key in a: short-range body staples are blue, seam staples are brown to yellow; lighter shades indicate larger distances). Graphs in the central panels show the calculated fraction of contacts formed for each staple group (whether or not as part of the target structure) as a function of temperature during assembly. The right-hand panels show the distribution of shapes predicted and observed for each set: a histogram representing the continuous distribution of fitted offset values is plotted above the distribution between discrete shapes predicted by the model (indicated by silhouettes). The number of fitted shapes, N, and the yield (shapes fitted as a percentage of candidate structures identified) are shown in the top right corner of each histogram.

We predict that the folding pathway can be changed by altering the relative strengths of short- and long-range interactions. Breaking in half one of each pair of seam staples (Fig. 4b), so that the pairs no longer bind cooperatively, weakens these long-range bonds, causing them to form later in the folding pathway (Fig. 4b, central panel and Extended Data Fig. 6) and to break and reform in alternative configurations more frequently (Extended Data Fig. 7). With weakened long-range interactions, we expect folding to be governed primarily by local interactions. The model predicts that the distribution of shapes is shifted strongly towards the 5:0i fold, in which all body staples span the smallest possible distances along the template (the same distances as in the monomer tile), and this is confirmed by experiment (Fig. 4b). The thermodynamic cost of breaking every other seam staple is approximately equal in each well-folded state and therefore this change should not affect their equilibrium populations. We have changed the distribution between folds not by changing the relative stability of the final states but by deliberately controlling the stabilities of crucial intermediate states, thus shaping the folding pathway.

The importance of stable, long-range interactions in determining the folding pathway is revealed by the evolving correlations between seam staples in the model. Characteristic patterns of correlation can be used to predict the final fold even before seam staple occupancy has reached 50% (Extended Data Figs 8 and 9).

The influence of seam staples on folding is similar to that of disulphide bonds in Anfinsen’s experiment on protein folding9. If long-range bonds are allowed to form first and, effectively, irreversibly, then folding is kinetically trapped. If they are weakened and permitted to rearrange then folding can be controlled by weaker short-range interactions.

Figure 4c shows an alternative staple set incorporating extended staples that form particularly strong short-range connections and therefore bind to the template early in the folding process (Fig. 4c, central panel). Without interference from other staples, these contacts are most likely to form between the pairs of template domains with the smallest separation along the template. These preferred contacts occur in the 3:2 and 5:0 folds but not the 4:1 fold (in the 4:1 fold, one extended staple forms a long-range contact between the two halves of the template). Experimental results confirm the model prediction that the 4:1 fold is strongly suppressed (Fig. 4c). As with the broken seam staples (Fig. 4b), this modification guides the folding pathway without imposing an energetic penalty on alternative folds.

We can control the fold of the dimer very effectively by engineering both the folding pathway and the stability of the chosen target structure. The 3:2 configuration can be favoured by weakening the original seams (as in Fig. 4b) and adding new seam staples that bridge between the monomer tiles without distortion only in the 3:2 configuration (Fig. 4d). This modification guides folding by increasing the stability of 3:2 relative to other folds. Similarly, a long staple in the bottom right corner of the monomer tile (Fig. 4e) biases folding towards the 5:0iv shape by decreasing the stability of other folds, which would require introduction of a sharp bend within the long staple. (The model does not include any penalty for bending and so fails to predict the engineered bias in this case.)

By showing that an origami tile with a duplicated template can be annealed to produce a high yield of well-folded structures from among ∼10^23 disordered alternative staple configurations, our results confirm that, as in the case of proteins, efficient folding pathways exist and that folding is highly cooperative. We infer that the folding of all DNA origami is shaped by similar pathways. Manipulation of the folding pathway validates our simple folding model, which successfully predicts the dominant folding pathways observed in experiments. We anticipate that this tool will prove more generally useful, to establish how to change the relative strengths of local and long-range staple interactions to rationally steer the folding pathway towards desired target structures.

Methods

Experimental methods

Plasmid pUC19 cut with HindIII and EcoRI was amplified by PCR with the primers TGACCTAATCCTCAGCAATTCACTGGCCGTCGTTTTACAA and ACGGACGCGCTGAGGAGCTTGGCGTAATCATGGTCATAG in order to trim the template to the desired length and introduce a unique BbvCI site. The PCR product was cut with BbvCI and ligated to generate pKD1 (2,646 bp). A typical monomer plasmid preparation contains a small amount (∼1%) of plasmid dimer. The dimer plasmid was obtained by nicking a monomer plasmid preparation with Nt.BbvCI (in order to resolve monomer and dimer more easily), purifying the nicked dimer band from a 0.7% TAE agarose gel, then transforming the purified nicked dimer into the recA host DH5α. The template sequence is given in Supplementary Information.

Single-stranded template was prepared by sequential reaction of either monomer or dimer pKD1 with Nt.BspQI at 50 °C and ExoIII at 37 °C to digest the non-template strand and leave a covalently-closed single-stranded template26. Enzymes were removed by phenol:chloroform extraction and the template was recovered by ethanol precipitation; its concentration was then determined by measuring ultraviolet absorbance at 260 nm.

DNA origami was designed using caDNAno27 and was assembled by cooling template at 4–10 nM with a ∼10-fold excess of staples from 95 °C to 25 °C at 1 °C per minute in a buffer containing 40 mM Tris-acetate (pH 8.3) and 12.5 mM magnesium acetate. Excess staples were removed using an S-300 size exclusion spin column28. Staple sequences for the standard design and variations are given in Supplementary Information.

Atomic force microscopy images were acquired using either an Agilent 5500 AFM with Olympus TR400-PSA probes (Figs 1, 2, 3, 4a) or a Veeco Dimension 3100 with Bruker SNL-10 probes (all other figures). A few microlitres of sample were added to freshly cleaved mica and the sample was imaged in tapping mode in an imaging buffer containing 12.5 mM magnesium acetate, 4 mM NiCl2, 1 mM EDTA and 40 mM Tris-acetate pH 8.0–8.3 (the imaging buffer for Fig. 1c lacked NiCl2, the imaging buffer for Fig. 2c lacked EDTA).

Folding model

Our domain-level description of origami assembly is intended to reproduce some aspects of cooperativity. In particular, it accounts for the increase in incorporation rate for a staple when its target domains on the template are held more closely together as a result of the earlier binding of other staples. This effect is most noticeable in the seam where the binding of the first of a pair of seam staples greatly accelerates, and is stabilized by, the binding of the second. The model incorporates a physically reasonable approximation of the entropic cost of closing loops by staple binding, but is far from a complete description of the physics of assembly. It is useful in guiding, and providing insights into, the effects of significant changes to the origami design.

We model the folding of an isolated template in the presence of an excess of staples as an inhomogeneous continuous-time Markov chain. Each transition between states corresponds to the binding or unbinding of a single staple domain. Transition rates between two states are chosen according to an estimate of the free energy difference between the two, in a manner that would reproduce the correct Boltzmann distribution if this free energy difference were calculated exactly. The temperature is updated once per second of simulated time which allows us to use an event-based Gillespie simulation algorithm29 with transition rates fixed over one second intervals. Data on folding processes are collected by simulating multiple folding trajectories (typically 1,600 per experiment).
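The simulation scheme described above can be illustrated with a minimal sketch for a single staple domain that binds to and unbinds from one template domain while the temperature falls by 1 °C per minute. The rates are held fixed within each one-second interval, and the waiting time is simply redrawn at each interval boundary, which is valid because exponential waiting times are memoryless. The duplex enthalpy and entropy (`DH`, `DS`) and the staple concentration (`STAPLE_CONC`) are illustrative assumptions, not parameters of the published model, and the function names are hypothetical.

```python
import math
import random

R = 1.987e-3            # molar gas constant, kcal mol^-1 K^-1
K_PLUS = 1e6            # binding rate constant used in the text, M^-1 s^-1
STAPLE_CONC = 50e-9     # ASSUMED staple concentration, M (illustrative only)
DH, DS = -120.0, -0.34  # ASSUMED duplex enthalpy (kcal/mol) and entropy (kcal/mol/K)


def rates(temp_c):
    """Return (binding, unbinding) rates in s^-1 at a temperature in Celsius."""
    temp_k = temp_c + 273.15
    dg = DH - temp_k * DS                           # standard free energy of duplex formation
    k_minus = K_PLUS * math.exp(dg / (R * temp_k))  # detailed balance, 1 M standard state
    return K_PLUS * STAPLE_CONC, k_minus


def anneal(seed, t_start=95.0, t_end=25.0, cool_per_s=1.0 / 60.0):
    """Simulate one cooling trajectory; return True if the domain ends up bound."""
    rng = random.Random(seed)
    bound = False
    n_seconds = int(round((t_start - t_end) / cool_per_s))
    for second in range(n_seconds):
        # Temperature (and hence the rates) is updated once per simulated second.
        k_on, k_off = rates(t_start - second * cool_per_s)
        t = 0.0
        while True:
            # Gillespie step: exponential waiting time for the only possible event.
            t += rng.expovariate(k_off if bound else k_on)
            if t >= 1.0:
                break          # interval over; redraw with updated rates next second
            bound = not bound
    return bound


if __name__ == "__main__":
    runs = 20
    print(sum(anneal(seed) for seed in range(runs)) / runs)
```

The full model applies the same loop to a state vector covering every staple, with rates that depend on the current pattern of binding; this sketch only shows how the one-second temperature update fits into an event-driven simulation.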

Subsequent sections contain more detailed descriptions of the folding model.

State space

We consider the possible configurations of staples hybridized to the template with domain-level resolution: a domain is either fully hybridized or unhybridized. A staple is called half-bound if only one of its two domains is hybridized to the template and fully bound if both domains are bound. In the model, a staple domain can only hybridize to the complementary template domain; we ignore weaker interactions that result from inevitable partial sequence complementarity between other pairs of domains.

For each type of two-domain staple (and the corresponding two pairs of complementary template domains) there are 34 distinct patterns of domain binding (states) with between zero and four copies of the staple bound to the dimer template. One is an empty state. When one staple is bound to the template there are four states in which the staple is half-bound and four states in which the staple is fully bound. When two staples are bound to the template there are six states in which both staples are half-bound, eight states with one half-bound and one fully bound staple, and two states with two fully bound staples. There are four states with three half-bound staples and another four states with one fully bound and two half-bound staples. Finally there is the possibility that four half-bound staples are attached to the template. For a single-domain staple and the associated pair of template domains there are just four states. There are therefore 34^x × 4^y states of the dimer template with staples, including part-folded states, where x is the number of two-domain staples and y is the number of single-domain staples. Of these, 2^x states consist exclusively of fully-bound staples. Formally, the state space S is given by p_0 × p_1 × … × p_{k−1}, where p_i denotes the set of possible states for staple i as described above and k is the total number of staples.
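The count of 34 states per two-domain staple can be checked by direct enumeration. The sketch below is an illustration, not the published implementation: it labels the four template sites `a1`, `b1`, `a2` and `b2` (hypothetical names for the two complementary domains in each copy of the monomer sequence), treats each fully bound staple as a link between one `a` site and one `b` site, and lets every remaining site be either empty or occupied by a half-bound staple.

```python
from itertools import product


def two_domain_staple_states():
    """Enumerate binding patterns for one two-domain staple type on the dimer template.

    Each state is a pair (links, half_bound): `links` lists the (a, b) site pairs
    joined by fully bound staples, `half_bound` lists sites holding a half-bound staple.
    """
    a_sites, b_sites = ("a1", "a2"), ("b1", "b2")
    # Sets of fully bound staples: zero, one or two links between a-sites and b-sites.
    matchings = [()]
    matchings += [((a, b),) for a in a_sites for b in b_sites]
    matchings += [(("a1", "b1"), ("a2", "b2")), (("a1", "b2"), ("a2", "b1"))]

    states = []
    for links in matchings:
        used = {site for link in links for site in link}
        free = [s for s in a_sites + b_sites if s not in used]
        # Every site not taken by a fully bound staple is empty or half-bound.
        for occupancy in product((False, True), repeat=len(free)):
            half_bound = tuple(s for s, occ in zip(free, occupancy) if occ)
            states.append((links, half_bound))
    return states


def total_states(x, y):
    """Size of the state space for x two-domain and y single-domain staple types."""
    return len(two_domain_staple_states()) ** x * 4 ** y


if __name__ == "__main__":
    print(len(two_domain_staple_states()))
```

Grouping the enumerated states by the numbers of fully bound and half-bound staples gives the same breakdown as the text (1, 4 + 4, 6 + 8 + 2, 4 + 4 and 1).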

Exclusion algorithm

Two template domains hybridized to a single two-domain staple are held within a few tenths of a nanometre of each other at the staple crossover: many of the folds in S cannot meet this constraint. We use an algorithm that approximates the steric constraints and so prevents the model from accessing unrealistic states. Because the method is approximate, it does not guarantee that each legal state satisfies the real steric constraints, or that all states that satisfy the steric constraints are legal.

We define a connected segment of an origami as a set of hybridized domains such that each domain can be reached from each other domain without leaving the set. Two template domains hybridized to the same staple are defined to be connected, as are two adjacent template domains hybridized to different staples. A partially folded segment of origami is considered stress-free (is legal) when it occurs in one of the set of well-ordered, two-dimensional folds shown in Extended Data Figs 1 or 2. These pre-defined folds satisfy the constraints imposed by finite staple length and steric exclusion.

More formally, we can represent the physical origami in partially folded state s ∈ S as an abstract graph G(s) = (V, E) such that each boundary between adjacent domains is a vertex v ∈ V and each template domain and staple crossover is an edge e ∈ E between the appropriate vertices. A labelling function f: E → {single-stranded, double-stranded, crossover} assigns an appropriate status to each edge. We can draw subgraphs consisting of connected hybridized segments of the graph: for the origami to be in a legal (stress-free) state, each of these subgraphs must be present in a single well-ordered fold from the set shown in Extended Data Figs 1 and 2.
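The grouping of hybridized domains into connected segments can be sketched with a union-find structure. This is a simplified illustration under stated assumptions: template domains are numbered 0 to n − 1 around the circular template, `staple_links` lists pairs of domains held by the same fully bound staple, and the function name is hypothetical. The sketch only finds the segments; comparing each segment against the pre-defined well-ordered folds, which decides whether a state is legal, is not attempted here.

```python
def connected_segments(n_domains, hybridized, staple_links):
    """Group hybridized template domains into connected segments.

    n_domains    -- number of domains on the circular template
    hybridized   -- set of indices of hybridized template domains
    staple_links -- pairs (i, j) of hybridized domains bound to the same staple
    """
    parent = {d: d for d in hybridized}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]   # path halving keeps trees shallow
            d = parent[d]
        return d

    def union(a, b):
        parent[find(a)] = find(b)

    for d in hybridized:
        nxt = (d + 1) % n_domains           # neighbour along the circular template
        if nxt in hybridized:
            union(d, nxt)                   # adjacent hybridized domains are connected
    for i, j in staple_links:
        union(i, j)                         # domains sharing one staple are connected

    groups = {}
    for d in hybridized:
        groups.setdefault(find(d), set()).add(d)
    return sorted(groups.values(), key=min)


if __name__ == "__main__":
    # Domains 0-1 and 5-6 are adjacent pairs; a staple joins 1 and 5; 9 is isolated.
    print(connected_segments(12, {0, 1, 5, 6, 9}, [(1, 5)]))
```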

Misfolds occur in the model when at least two connected segments would be incapable of satisfying the constraints were they to become connected to each other. At that point, folding cannot advance unless one of the segments unfolds, allowing another to expand. Extended Data Fig. 3c shows a misfolded dimer that has three connected parts that cannot be joined to form a stress-free state. When simulating assembly using the staple set corresponding to Fig. 4a, about half of the simulations end in a misfolded state; for the weakened-seam variant (Fig. 4b) there are only ∼1% misfolds.

Rates model

We develop a kinetic model of folding based on standard reaction models for hybridization and a method to estimate the effective local concentration of the unhybridized domain of a half-bound staple at its complementary template domain.

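Consider complementary strands A and B that can bind reversibly to form duplex AB. Under the assumptions of mass action kinetics, the concentration [AB] is described by

$$\frac{\mathrm{d}[AB]}{\mathrm{d}t} = k_{+}[A][B] - k_{-}[AB] \qquad (1)$$

for rate constants k+ and k−. The rate constants are constrained by the requirement that the equilibrium concentrations {A}, {B} and {AB} are consistent with ΔG0,duplex, the standard change in Gibbs free energy on duplex formation:

$$\frac{\{AB\}}{\{A\}\{B\}} = \frac{k_{+}}{k_{-}} = \exp\!\left(-\frac{\Delta G^{0,\mathrm{duplex}}}{RT}\right) \times (1\ \mathrm{M})^{-1} \qquad (2)$$

where R denotes the molar gas constant and T the temperature.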

For staples within a partially folded origami, binding and unbinding rates are similarly constrained by the difference in free energy between states. We approximate the difference in free energy between partially folded states s, s′ that differ by the hybridization of a single template domain as

$$\Delta G(s, s') = \Delta G^{0,\mathrm{duplex}} + \Delta G^{\mathrm{shape}}(s, s') \qquad (3)$$

where ΔG0,duplex is the standard free energy change corresponding to the formation or dissociation of an equivalent isolated duplex and ΔGshape represents the free-energy cost of the change in entropy corresponding to the geometric constraints on the template that arise when two-domain staples connect non-contiguous template domains (‘looping constraints’)6,30,31. ΔGshape quantifies cooperative effects: when a single staple domain binds or unbinds, ΔGshape depends on the pattern of binding of other staples.

Consider a single, isolated origami in partially folded state s00 and let staple p bind to the template by a single domain, resulting in state s01. The rate for this reaction is taken to be equal to that for duplex formation between isolated strands:

$$\sigma(s_{00}, s_{01}) = k_{+}\,c \qquad (4)$$

where σ(s,s′) is the rate of transition from state s to s′ and c is the concentration of unbound staple. The unbinding rate is then determined by a thermodynamic constraint analogous to equation (2):

$$\frac{\sigma(s_{00}, s_{01})}{\sigma(s_{01}, s_{00})} = \exp\!\left(-\frac{\Delta G^{0,\mathrm{duplex}}}{RT}\right) \times \frac{c}{1\ \mathrm{M}} \qquad (5)$$

We have set ΔGshape = 0 because transitions s01 ⇌ s00 do not create or destroy loops in the template. (We do not take into account other ways in which hybridization of a single staple domain affects the free energy of the partly-folded origami, for example, by changing the mechanical properties and thus the free-energy cost of any pre-existing loop of which it forms part.) For the second domain of the staple, once the first domain is bound, we again fix the unbinding rate to be that of the corresponding isolated duplex. This rate does not depend on the change in entropy that results from the removal of a looping constraint30,31 because, immediately after unbinding, the conformation of the template is unchanged:

$$\sigma(s_{11}, s_{01}) = k_{+} \exp\!\left(\frac{\Delta G^{0,\mathrm{duplex}}}{RT}\right) \times 1\ \mathrm{M} \qquad (6)$$

where s11 denotes the state in which the staple is bound to the template with both domains. The binding rates of the second domain of the staple, once the first domain is bound, can then be found from the thermodynamic constraint

$$\frac{\sigma(s_{01}, s_{11})}{\sigma(s_{11}, s_{01})} = \exp\!\left(-\frac{\Delta G^{0,\mathrm{duplex}} + \Delta G^{\mathrm{shape}}(s_{01}, s_{11})}{RT}\right) \qquad (7)$$

The free energy penalty ΔGshape(s01, s11), which corresponds to the additional geometric constraints associated with the binding of the second staple domain, thus determines the binding rate for the second domain.
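The four single-domain transition rates for one two-domain staple follow directly from the rules in this section. The sketch below collects them in one function and checks numerically that the second-domain binding rate reduces to k+ exp(−ΔGshape/RT) × 1 M, so that it does not depend on duplex stability. The function name and the example free energies are illustrative assumptions; only k+ = 10^6 M−1 s−1 is taken from the text.

```python
import math

R = 1.987e-3   # molar gas constant, kcal mol^-1 K^-1
K_PLUS = 1e6   # binding rate constant used in the text, M^-1 s^-1


def staple_rates(conc, dg_duplex, dg_shape, temp_k):
    """Rates (s^-1) of the four single-domain transitions of one two-domain staple.

    conc      -- concentration of unbound staple, M
    dg_duplex -- standard free energy of duplex formation, kcal/mol (negative)
    dg_shape  -- free-energy penalty for closing the loop, kcal/mol (positive)
    """
    rt = R * temp_k
    k_off = K_PLUS * math.exp(dg_duplex / rt)   # isolated-duplex unbinding, x 1 M
    return {
        ("s00", "s01"): K_PLUS * conc,          # first domain binds from solution
        ("s01", "s00"): k_off,                  # first domain unbinds
        ("s11", "s01"): k_off,                  # second domain unbinds like an isolated duplex
        # Second domain binds: fixed by the thermodynamic constraint on the rate ratio.
        ("s01", "s11"): k_off * math.exp(-(dg_duplex + dg_shape) / rt),
    }


if __name__ == "__main__":
    # ASSUMED values: 50 nM staple, dG_duplex = -15 kcal/mol, dG_shape = 6.5 kcal/mol, 60 C.
    for transition, rate in staple_rates(50e-9, -15.0, 6.5, 333.15).items():
        print(transition, rate)
```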

Looping constraints

We approximate ΔGshape(s01, s11) = ΔGloop, where ΔGloop corresponds to the entropic penalty of closing the new loop that forms in the template when the second domain of a staple binds. For other transitions, no loop forms and we take ΔGshape = 0. ΔGloop quantifies the difference between the entropic penalties for pinning the template into a loop so that the second staple and template domains can bind and for bringing together two domains unconnected by a loop in a hypothetical ideal system at standard conditions (1 M concentration)32. ΔGloop is thus related to the ratio between the probabilities of bringing two domains into contact in the looped system and in the ideal unconnected system:

$$\Delta G^{\mathrm{loop}} = -RT \ln\!\left(\frac{P_{\mathrm{loop}}(r_0)}{P_{\mathrm{ideal}}(r_0)}\right) \qquad (8)$$

Here, Ploop(r0) is the probability that the origami adopts a conformation in which the unbound staple arm and the template domain are spontaneously within an interaction radius r0 of each other, where r0 is an unspecified small distance necessary for closure of the loop. Pideal(r0) is the probability that two unconnected molecules would be within r0 in a hypothetical ideal system of volume v0 = 1/NA litres, NA being Avogadro’s number. The rate of hybridization of a second staple domain is therefore given by

$$\sigma(s_{01}, s_{11}) = k_{+} \exp\!\left(-\frac{\Delta G^{\mathrm{loop}}}{RT}\right) \times 1\ \mathrm{M} = k_{+}\,\frac{P_{\mathrm{loop}}(r_0)}{P_{\mathrm{ideal}}(r_0)} \times 1\ \mathrm{M} = k_{+}\,c_{\mathrm{eff}} \qquad (9)$$

so ceff = [Ploop(r0)/Pideal(r0)] × 1 M denotes the effective concentration of the opposing domain.

As a first approximation we treat the loop of DNA as a freely-jointed chain comprising two types of link, double-stranded DNA and single-stranded DNA (dsDNA and ssDNA respectively). Let P(r) be the probability density for the end-to-end extension of the chain r. Then $P_{\mathrm{loop}}(r_0) = \int_{|\mathbf{r}| \le r_0} P(\mathbf{r})\,\mathrm{d}^3 r$ is the probability that the two domains are separated by at most r0.

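The end-to-end distance distribution P(r) of a freely-jointed chain, in the limit of a large number of segments, is

$$P(\mathbf{r}) = \left(\frac{3}{2\pi E[r^{2}]}\right)^{3/2} \exp\!\left(-\frac{3\,|\mathbf{r}|^{2}}{2E[r^{2}]}\right) \qquad (10)$$

where E[r2] is the mean squared distance between the two ends. The result for a single segment type is a classic result of statistical physics33,34. The following argument shows that the result also holds for a chain with heterogeneous segments. From the central limit theorem, for a large number of segments we expect a Gaussian distribution over the x, y and z components of r. Equation (10) is the only Gaussian distribution that also satisfies the symmetry conditions E[x] = E[y] = E[z] = 0, and E[xy] = E[xz] = E[yz] = 0.

The internal association rate is therefore given by:

$$\sigma(s_{01}, s_{11}) = k_{+}\,(1\ \mathrm{M})\,v_{0}\,\frac{\int_{|\mathbf{r}| \le r_{0}} P(\mathbf{r})\,\mathrm{d}^{3}r}{\tfrac{4}{3}\pi r_{0}^{3}} \approx k_{+}\,(1\ \mathrm{M})\,v_{0}\left(\frac{3}{2\pi E[r^{2}]}\right)^{3/2} \qquad (11)$$

where we have assumed $r_0^2 \ll E[r^2]$ in the second step.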
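The small-r0 approximation can be checked numerically. The sketch below integrates an isotropic Gaussian end-to-end density over a sphere of radius r0 and compares the result with the leading-order estimate, which is the sphere volume multiplied by the density at zero extension. The value r0 = 1 nm is an assumption made purely for illustration (the text leaves r0 unspecified), and the function names are hypothetical.

```python
import math


def density(r, mean_sq):
    """Isotropic Gaussian density (nm^-3) at end-to-end distance r, given E[r^2] in nm^2."""
    a = 3.0 / (2.0 * mean_sq)
    return (a / math.pi) ** 1.5 * math.exp(-a * r * r)


def p_within(r0, mean_sq, steps=10_000):
    """Probability that the chain ends lie within r0: midpoint rule over spherical shells."""
    h = r0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * density(r, mean_sq)
    return total * h


def p_within_small_r0(r0, mean_sq):
    """Leading-order estimate for r0^2 << E[r^2]: sphere volume times density at r = 0."""
    return (4.0 / 3.0) * math.pi * r0 ** 3 * density(0.0, mean_sq)


if __name__ == "__main__":
    mean_sq = 480.0                      # nm^2, the value quoted for a 448-nt ssDNA loop
    for r0 in (1.0, 10.0):               # ASSUMED interaction radii, nm
        print(r0, p_within(r0, mean_sq) / p_within_small_r0(r0, mean_sq))
```

For a loop with E[r2] of a few hundred nm2 the two estimates agree closely when r0 is about 1 nm, and the approximation degrades as r0 approaches the root-mean-square extension of the chain.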

The loop that is closed by the insertion of a staple into a part-folded origami has, in general, a complex structure comprising multiply connected domains of single- and double-stranded DNA. We approximate this loop by a single path through the origami, the loop with the smallest expected square end-to-end distance E[r2]. This path represents the most important constraint that leads to the enhancement of the effective local concentration of one end of the loop at the other, and thus provides the most significant enhancement of σ (s01, s11). In order to identify the dominant loop, each edge e ∈ E in the implied graph G(s) = (V, E) of the partially folded origami is assigned a weight equal to the contribution to E[r2] in the freely jointed chain approximation. Dijkstra’s shortest path algorithm35 is used to determine a loop that minimizes E[r2] and hence determines σ (s01, s11).
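The search for the dominant loop can be sketched as a standard Dijkstra shortest-path computation in which each edge weight is that link's contribution to E[r2]. The code below is a generic illustration, not the published implementation: the function name is hypothetical, vertices are assumed to be mutually comparable labels such as integers, and the toy graph is a ring of eight 16-nt single-stranded domains (16 × 0.6 nm × 1.8 nm = 17.28 nm2 each) with one staple crossover (1.8 nm squared = 3.24 nm2) added as a shortcut.

```python
import heapq


def min_mean_square_path(edges, source, target):
    """Dijkstra's algorithm on an undirected weighted graph.

    edges -- iterable of (u, v, weight); weight is the link's contribution to E[r^2]
    Returns (total_weight, path), or (inf, []) if target cannot be reached.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    best = {source: 0.0}      # smallest known E[r^2] from source to each vertex
    prev = {}                 # predecessor on the best known path
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            path = [node]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return dist, path[::-1]
        if dist > best.get(node, float("inf")):
            continue          # stale queue entry, a shorter route was already found
        for nxt, w in adj.get(node, []):
            cand = dist + w
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))
    return float("inf"), []


if __name__ == "__main__":
    ring = [(i, (i + 1) % 8, 17.28) for i in range(8)]
    print(min_mean_square_path(ring, 0, 4))                   # no staples bound
    print(min_mean_square_path(ring + [(1, 3, 3.24)], 0, 4))  # crossover joins 1 and 3
```

In the toy example the crossover shortens the loop between vertices 0 and 4, which is the mechanism by which previously bound staples raise the effective concentration for later ones.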

For the seam staples, which are paired, the loop closed by hybridization of the second staple is particularly small: it consists only of the crossover link. The predictions of the model remain physically sensible: a second staple binding to a seam has an overall ΔG which is ∼4.4 kcal mol−1 less favourable (at T = 60°C) than a continuous duplex. This destabilization is equal to that expected from a 5-nt bulge within a duplex30. We note that for the broken seam variant, the model predicts incorporation temperatures for the unbroken staple that are lower than the regular case by 2.0 °C, compared to 2.2 °C measured in experiment (Extended Data Fig. 6). It is therefore clear that we do not overestimate the cooperative stabilization of seam staples.

The approximations made in estimating the change in free energy when a staple domain binds or unbinds are not thermodynamically self-consistent: the value assigned to the difference in free energy between states depends, in general, on the path taken between them. Models of this kind will be presented in a companion paper, in which they are compared to thermodynamically self-consistent approaches for simpler systems (F.D. et al., submitted).

Parameterization of the model

Compared to unbinding rates, the rate of binding of an isolated duplex is known to be weakly dependent on duplex stability36. We assume k+ to be independent of temperature, domain sequence, and folding state, and we set k+ = 10^6 M−1 s−1 (refs 25, 36, 37).

The free energy change when each domain binds to its complement, ΔG0,duplex, is taken to be that of a 16-bp DNA duplex averaged over all possible sequences38. Buffer conditions of 40 mM [Tris] and 12.5 mM [Mg2+] are assumed, giving an additional entropic penalty (in units of cal mol−1 K−1) for duplex formation of:38,39,40

where N is the number of phosphates in the duplex. For ssDNA we use a contour length of Lc,ss = 0.6 nm per base and a Kuhn length of λss = 1.8 nm: a single-stranded domain of 16 bases thus has a contour length of 16 × 0.6 nm41,42,43,44,45. For dsDNA we use a contour length of Lc,ds = 0.34 nm per base46 and make the approximation that the persistence length is much longer47 than any relevant duplex: a double-stranded domain of 16 bases thus corresponds to a single rigid link of length λds = 16 × 0.34 nm. A crossover link between the two template domains hybridized to a single staple is treated as a single segment of length λss.
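The chain parameters above translate into a simple sum for E[r2]: a single-stranded stretch contributes its contour length multiplied by the Kuhn length, each duplex domain contributes the square of its rigid length, and each crossover contributes the square of λss. The sketch below assumes exactly these contributions; the function and argument names are hypothetical.

```python
SS_CONTOUR_PER_NT = 0.6   # nm per base of ssDNA
SS_KUHN = 1.8             # nm, Kuhn length of ssDNA
DS_RISE_PER_BP = 0.34     # nm per base pair of dsDNA


def mean_square_extension(ss_nt=0, ds_domains=(), crossovers=0):
    """E[r^2] in nm^2 for a freely jointed chain of ssDNA, rigid duplexes and crossovers.

    ss_nt      -- total number of single-stranded nucleotides in the loop
    ds_domains -- lengths (bp) of the double-stranded domains, each one rigid link
    crossovers -- number of staple crossovers, each one link of length SS_KUHN
    """
    # N Kuhn segments of length b give N * b^2 = (contour length) * b.
    ss = ss_nt * SS_CONTOUR_PER_NT * SS_KUHN
    ds = sum((bp * DS_RISE_PER_BP) ** 2 for bp in ds_domains)
    xo = crossovers * SS_KUHN ** 2
    return ss + ds + xo


if __name__ == "__main__":
    print(mean_square_extension(ss_nt=448))                     # ssDNA loop of 448 nt
    print(mean_square_extension(ss_nt=2208, ds_domains=[16]))   # 2,208 nt plus one 16-bp duplex
```

For these two loops the sketch gives values that round to the 480 nm2 and 2,400 nm2 quoted in the example rate calculations.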

Example rate calculations

Consider the half-bound staple shown in Extended Data Fig. 10a that is hybridized to an otherwise empty template. A seam staple, labelled A, is used as an example here. Its second domain can hybridize to either of two sites: the closer is connected by a 448-nt ssDNA chain (E[r2] = 480 nm2) and the further by a composite chain comprising a 2,208-nt single-stranded chain and one rigid 16-bp double stranded segment (E[r2] = 2,400 nm2). Following the calculation outlined above, we find that for the closer site the effective local concentration of the opposing domain ceff = 51 µM, the loop cost ΔGloop = 6.5 kcal mol−1 (at T = 60 °C) and the hybridization rate σ = 50 s−1. For the further site: ceff = 4.6 µM, ΔGloop = 8.1 kcal mol−1, and σ = 4.5 s−1. The staple is 11 times more likely to bind to the closer domain.
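The numbers in this example can be reproduced, to within rounding, from the quoted values of E[r2]. The sketch below assumes the small-r0 Gaussian-chain expression for the effective concentration, converts from chain ends per nm3 to mol per litre, and then evaluates the loop cost relative to 1 M and the second-domain binding rate. Function names are hypothetical, and the inputs are the rounded E[r2] values from the text, so agreement in the last digit is not expected.

```python
import math

N_A = 6.02214076e23     # Avogadro's number, mol^-1
R = 1.987e-3            # molar gas constant, kcal mol^-1 K^-1
K_PLUS = 1e6            # binding rate constant used in the text, M^-1 s^-1
NM3_PER_LITRE = 1e24    # 1 litre = 1e24 nm^3


def effective_concentration(mean_sq_nm2):
    """c_eff in mol/L from E[r^2] in nm^2 (Gaussian chain, small interaction radius)."""
    ends_per_nm3 = (3.0 / (2.0 * math.pi * mean_sq_nm2)) ** 1.5
    return ends_per_nm3 * NM3_PER_LITRE / N_A


def loop_cost(c_eff, temp_k):
    """Loop free-energy penalty in kcal/mol, relative to a 1 M standard state."""
    return -R * temp_k * math.log(c_eff)


def second_domain_rate(c_eff):
    """Hybridization rate (s^-1) of the second staple domain."""
    return K_PLUS * c_eff


if __name__ == "__main__":
    temp_k = 60.0 + 273.15
    for label, mean_sq in (("closer site", 480.0), ("further site", 2400.0)):
        c = effective_concentration(mean_sq)
        print(label, c * 1e6, loop_cost(c, temp_k), second_domain_rate(c))
```

Because ceff scales as E[r2] to the power −3/2, the ratio of the two rates depends only on the ratio of the mean squared extensions, (2,400/480)^1.5 ≈ 11.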

Binding of one staple affects the binding of others by changing the characteristics of the template (or partly-formed origami) that links their two binding domains. We now compute the hybridization rate, loop cost and local concentration for a second seam staple, staple B, in the presence or absence of staple A. In the absence of staple A, the shorter of the two loops that connect two binding domains of the second staple consists of a 864-nt ssDNA chain: E[r2] = 980 nm2, ceff = 18 µM, ΔGloop = 7.2 kcal mol−1, σ = 18 s−1. In the presence of staple A, the loop passes through the link formed by staple A and comprises 384 nt ssDNA, 3 rigid 16-bp dsDNA segments and a staple crossover modelled as a single segment of length λss (Extended Data Fig. 10b): for this shortened loop, E[r2] = 520 nm2, ceff = 46 μM, ΔGloop = 6.6 kcal mol−1 and σ = 46 s−1. Insertion of staple A increases the rate of hybridization of the second domain of staple B by a factor of 2.6 by shortening the distance between its binding sites.

Code availability

The code used to implement the folding model is freely available via https://github.com/fdannenberg/dna.