Main

Self-assembly of materials currently requires a toolbox of building blocks with a given shape and a multitude of interaction flavours and strengths to ensure a unique product9,10,11. By contrast, achieving self-assembly of an arbitrary structure with high yield using a limited palette of flavours remains a key challenge. We therefore turn to the biological concept of self-assembly by the folding of linear chains, analogous to protein and RNA folding, and adapt it to materials science12,13,14.

Our system consists of two flavours of colloidal droplets, labelled blue (A) and yellow (B), functionalized with complementary DNA strands (Methods). These droplets irreversibly bind with valence two to form the backbone of an alternating colloidomer7,15, as depicted in Fig. 1a,b. The droplets are dispersed in an aqueous ferrofluid and we apply an intermittent magnetic field to accelerate the chaining process, giving rise to an exponential distribution of chain lengths, as shown in Extended Data Fig. 1. These chains are thermal and freely jointed because DNA diffuses on the surface even after the droplets are bound.

Fig. 1: Colloidomer design and folding.

a, Two flavours of droplets, A (blue) and B (yellow), are functionalized with complementary backbone strands of DNA to make alternating chains. They also carry weaker DNA interactions that mediate folding. The blue flavour carries two additional types of DNA, whereas the yellow flavour carries only one DNA strand that provides two distinct interactions. This 9 base pair (bp) DNA strand carries a consecutive complement to the 6 bp strand on the blue particle, activated at a melting temperature (Tm) around 32 °C, and 6 intermittent bp that are palindromes activated at Tm around 27 °C to mediate the yellow–yellow interaction (Methods). b, An emulsion first assembles into colloidomers using a magnetic field (B), after which a temperature (T) protocol triggers folding into diverse geometries. c, Fluorescent images show colloidomers of different lengths that undergo folding over time. Scale bar, 20 μm. d, A temperature protocol gives rise to stepwise folding, each step with a duration τ, of a decamer chain into the crown foldamer. Scale bar, 5 μm.

To mediate folding, each droplet flavour is in addition functionalized with DNA strands that act as weaker secondary interactions. Droplets have the advantage that they freely rearrange after binding, facilitating folding16. If all interactions are switched on simultaneously, one obtains a mixture of folded geometries as the final product17,18. Chains shorter than hexamers have a single possible geometry, but the number of geometries then grows exponentially with chain length. For example, an octamer can fold into nine distinct geometries, four of which are shown in Fig. 1b. By choosing DNA strands with distinct binding energies and therefore different melting temperatures10 (Methods), we establish a hierarchy of bonds that are switched on as the temperature is lowered, as shown in Fig. 1c,d. Because the melting transition is sharp, working a few degrees below it ensures irreversible bond formation and downhill folding. For example, the decamer chain in Fig. 1d folds into the crown in a stepwise manner. First, the blue–blue palindrome interaction forms a pentamer core at high temperature, followed by the sequential locking in of yellow–blue and yellow–yellow bonds at progressively lower temperatures. Other protocols with a different sequence of secondary interactions are mediated by the same DNA strands, but grafted on droplets in different combinations (Methods).

Design of the folding landscape

Along the folding process, each new bond that forms causes the chain to adopt a different configuration. Those configurations that have the same contact matrix, ignoring chirality, are here defined to belong to a given state. All possible states between the linear chain and the final geometries map out an energy landscape that can be represented in a tree form19. In the folding tree in Fig. 2a, each row shows states with the same number of secondary bonds, that is, the same potential energy. Two states are connected in the tree if one can topologically transform into the other by making or breaking a single bond. Designing folding protocols, or the order of secondary droplet interactions, enables us to funnel the landscape to one final folded state.
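The tree construction described above can be illustrated with a minimal sketch. It assumes that a state is represented simply as a set of secondary bonds between droplet indices; the function names and the toy states are hypothetical, and the sketch does not check geometric feasibility or merge states related by chirality.

```python
from itertools import combinations

def bond_count(state):
    """Number of secondary bonds in a state, that is, its row in the tree."""
    return len(state)

def are_connected(state_a, state_b):
    """True if one state turns into the other by making or breaking one bond."""
    return len(state_a ^ state_b) == 1

def build_tree(states):
    """Group states into rows by bond count and list all single-bond edges."""
    rows = {}
    for state in states:
        rows.setdefault(bond_count(state), []).append(state)
    edges = [(a, b) for a, b in combinations(states, 2) if are_connected(a, b)]
    return rows, edges

# Hypothetical toy states: each bond is a pair (i, j) of droplet indices, i < j.
linear = frozenset()
s1 = frozenset({(0, 2)})
s2 = frozenset({(0, 2), (2, 4)})
rows, edges = build_tree([linear, s1, s2])
```

Here `rows` plays the role of the tree levels (equal potential energy) and `edges` the single-bond transitions between them.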

Fig. 2: Folding pathways of a heptamer.

a, All folding pathways of a four-blue, three-yellow droplet heptamer result in a rocket foldamer when only the blue–blue interaction is turned on. Experimental images of states are superimposed with the theoretical tree, in order of frequency, to show the diversity of observed pathways. Each image contains an example of a backbone arrangement overlaid in white. The number of secondary bonds acquired is shown at each level of the tree. The plots on the right show the time evolution after the temperature quench tquench of the yield of each colour-coordinated state. b, When the yellow–yellow interaction is switched on first, the same polymer folds into a single floppy state. Further interactions fold it into a rocket with a different fold, but reversing their order leads to a mixture of the rocket and the ladder. Note that switching on the yellow–blue bond last (dashed line) would require a different DNA strand design to that shown in Fig. 1a.

The example of an alternating heptamer chain in Fig. 2a shows that switching on only the blue–blue interaction yields a rocket foldamer as the final state. This tree was constructed theoretically and then populated by images of states that were observed along experimental folding pathways (Methods). The notable overlap between experiment and theory indicates that the experiments are sampling all the available states. Tracking n = 255 folding heptamers enables us to plot the evolution of the yield Y of the most popular states in each level of the tree in the side panels. Long-lived states correspond to local minima (states S1 and S2 in the tree) that are theoretical dead-ends, but are overcome in experiments because our system is quasi-two-dimensional and rare out-of-plane rearrangements are possible. As a result, all pathways lead to the rocket foldamer out of the four possible heptamer geometries on a timescale of approximately 20 min.

Because the heptamer comprises four blue and three yellow droplets, switching on the yellow–yellow interaction funnels the landscape into a much simpler tree, as shown in Fig. 2b. Here the final state is a unique floppy state that needs additional interactions to become rigid. Subsequently turning on the blue–blue interaction yields two new floppy states, one of which closes into a rigid ladder, whereas the other requires the remaining blue–yellow interaction to fold into the rocket shape. This particular protocol yields a mixture of the ladder and the rocket and does not qualify as a successful protocol. On the other hand, reversing the order of the last two steps leads only to the rocket foldamer, but with a different colour arrangement, or fold, to the one obtained from a single blue–blue interaction in Fig. 2a. This feature demonstrates the robustness of geometry to the protocol.

Foldamer search algorithm

In search of foldamers, we sweep all protocols for folding alternating sequences. The construction of folding trees becomes computationally expensive as the chain length grows, so we devise an alternative strategy for a systematic search (Extended Data Fig. 2), which enables us to reach chains with N = 15 droplets. We start by enumerating only the rigid states17,20 and we map out all the possible backbone arrangements therein (Methods). Superimposing the alternating sequence on the backbones, we add secondary bonds between neighbouring droplets according to a specific interaction matrix. The resulting states are then classified as local or global minima. Keeping track of the minima each time an interaction is added, we determine whether a colloidomer eventually folds into a unique geometry for a given sequence of interaction steps. The algorithm relies on the assumption that interactions are irreversible and that all bonds form, which requires a long enough waiting time at each temperature step in the experiment. This strategy is general for any linear polymer that can freely rearrange during folding by hierarchical interactions.

Alternating sequence foldamers

Our theory systematically identifies successful protocols that yield a total of 11 foldamer geometries for chains up to 13 droplets long, as shown in Fig. 3a. Following those protocols, experiments capture most of the predicted foldamers, as shown in Fig. 3b and Supplementary Videos 1–7. High relative yields, defined as the proportion of rigid structures that reach the correct geometry (Methods), are achieved in all but the flower and crown foldamers, owing to floppy dead-ends they encounter on timescales beyond the experimental window. This may explain why single-step quenches have perfect yields of rigid structures, whereas multiple quenches on average have lower yields. The incorrect structures arising from local minima can be suppressed by optimizing the bond strength, as shown in the simulations in Extended Data Figs. 3 and 4a.

Fig. 3: Predicted and experimentally realized foldamers.

a, Alternating polymers of length N = 6−14 (subscripts indicate the number of blue and yellow droplets) can be successfully folded by distinct protocols (columns) with a maximum of three interactions (rows). Foldamers shaded in yellow require only one step, which can switch on one or more interactions. At the end of each step, foldamers are shown on the left and the number of floppy geometries on the right, in order of increasing chain length. b, Experimental results show fluorescent images of predicted foldamers up to decamers, as well as their relative folding yields Y. Scale bar, 5 μm. Experimental number of observations: for N = 6, [triangle, chevron, ladder] = (19, 86, 67); for N = 7, [rocket no. 1, rocket no. 2, flower] = (175, 25, 7); for N = 8, [hourglass] = 8; for N = 9, [poodle] = 24; for N = 10, [crown] = 8. c, Modes of folding: core collapse (left) and geometric frustration (right).

Our foldamers demonstrate that the simplest alternating sequence encodes all the possible geometries of the hexamer: the ladder, the chevron and the triangle, as shown in Fig. 3b. Among longer foldamers, only the heptamer flower and the decamer bed correspond to the ground state of a folded homocolloidomer, whereas the rest are unlikely geometries in equilibrium21. For example, the octamer hourglass geometry has the highest free energy, that is, the smallest yield among the nine possible geometries because of its high symmetry number17,18. Therefore, our foldamers correspond to kinetic states that are accessible on the basis of geometric considerations alone. Another example is the nonamer poodle, which is the longest chain that can be folded with a single interaction. By contrast, the decamer folds into the crown through a many-to-one transition, as an example of a funnel-like landscape22.

Colloidomer folding mechanisms

More generally, alternating colloidomers follow two mechanisms to reach the foldamer state: core collapse and geometric frustration, as illustrated in Fig. 3c. The most common mechanism is the core collapse, which first forms a rigid core and then locks in the remaining droplets on the outside. Up to decamers, the cores consist of a maximum of five identical droplets in unique geometries. Beyond this length, foldamers comprise multiflavoured cores formed upon turning on two interactions simultaneously, as seen in the star foldamer.

The second mechanism of geometric frustration initially engages an interaction that traps the droplets by certain locking bonds into positions in which they are surrounded by neighbours with which they cannot form secondary bonds. Turning on other interactions adds the remaining bonds without changing the geometry. The Russian doll architecture of these foldamers as a function of N allows us to successfully predict the N = 14 foldamer following the same protocol, as shown in Fig. 3c.

From sequence to supracolloidal design

Next, we theoretically investigate how increasing complexity23 improves the number and variety of possible foldamers. We run our search algorithm across all possible droplet flavour sequences, while preserving the number of each flavour in the chain. This process uncovers winning protocols, increasing the total number of foldamers by roughly an order of magnitude, particularly in longer chains, as shown in Fig. 4a (dark blue). Note that chains with at least N = 13 droplets are able to encode foldamers with stable holes (Extended Data Fig. 5), which can serve as precise sieves and offer porous design. In addition, the introduction of a third flavour while designing in both sequence and protocol spaces identifies more than a half of all possible geometries up to tridecamers, giving in total 310 foldamers (red). Whereas two flavours code for all three geometries in hexamers, three letters encode all geometries up to decamers, putting a bound on what can be achieved with a small number of flavours as a function of N (refs. 24,25). To achieve these sequence-specific foldamers experimentally would require sequential droplet polymerization, as previously demonstrated in ref. 26.
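To give a sense of the size of the sequence space being scanned, the following sketch enumerates all flavour sequences with a fixed composition. It assumes, for illustration only, that a chain and its end-to-end reversal count as the same sequence; the function names are hypothetical and this is not the authors' code.

```python
def sequences(counts):
    """Yield every string that uses each flavour exactly counts[flavour] times."""
    if all(n == 0 for n in counts.values()):
        yield ""
        return
    for flavour in sorted(counts):
        if counts[flavour] > 0:
            counts[flavour] -= 1
            for rest in sequences(counts):
                yield flavour + rest
            counts[flavour] += 1  # restore the count before trying the next flavour

def distinct_chains(composition):
    """Sequences with fixed composition, counting a chain and its reverse once."""
    return sorted({min(s, s[::-1]) for s in sequences(dict(composition))})

# Heptamer with four blue (A) and three yellow (B) droplets.
heptamers = distinct_chains({"A": 4, "B": 3})
```

For the four-blue, three-yellow heptamer this gives 19 distinct chains, of which the alternating sequence ABABABA is one.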

Fig. 4: Colloidomer sequence design and supracolloidal assembly.

a, Exponential growth of the number of possible rigid geometries as a function of chain length N (black line). Numbers of foldamers encoded by an alternating AB sequence (light blue), any AB sequence (dark blue) and any ABC sequence (red) via all available protocols are shown as bars (the N = 13 bar is a lower bound). b, Simulated examples of supracolloidal self-assemblies using specific interaction protocols, shown with the droplet sequence that gives that foldamer. Numbers indicate the order in which interactions are activated (Extended Data Fig. 4b).

With this lexicon of foldamers as building blocks, simulations show that they self-assemble by additional supracolloidal interactions into higher order architectures27, as shown in the simulated examples in Fig. 4b. For instance, an interaction between blue droplets assembles star foldamers into a complex mosaic. Foldamers with polarized flavours self-assemble into ribbons or islands, whereas three flavours facilitate the assembly of unique dimers. All these examples could be experimentally realized if the chains were segregated by length and diluted, and the DNA strands were subsequently activated for further assembly (Methods).

Our minimal model system exhibits many of the phenomena nominally associated with protein folding. Foldamers consisting of droplets with two or three flavours have the properties of uniqueness, robustness and kinetic accessibility in a funnel landscape22,28. The core collapse folding mechanism resembles the hydrophobic collapse in proteins29, whereas that of geometric frustration has been proposed as a design principle in the assembly of peptides30. On the supracolloidal scale, foldamer assembly mimics the polymerization of fibrils31, the formation of protein-based micelles32 or protein dimerization33. These similarities occur even though our system is strictly out-of-equilibrium, highlighting the importance of geometry in guiding assembly.

Colloidal self-assembly has the advantage that the monomers are easily visualized under a microscope, giving access to the underlying rules that govern successful folding by dissecting the respective roles of sequence design, minimal number of flavours, hierarchy of interactions and topological constraints. This type of structural design influences function. Once folded, emulsions are readily polymerized to make solid two-dimensional patterns on the scale of the wavelength of near infrared light, enabling one to tune their optical properties.

Moreover, sequential secondary interactions can be programmed to fold into three-dimensional foldamers (Supplementary Videos 9 and 10). Using smaller droplets allows them to explore the available phase space in three dimensions more efficiently. An alternating hexamer uniquely gives a polytetrahedron following a three-step protocol, which we experimentally demonstrate with 100% yield (n = 5). Self-assembly of geometric clusters paves an alternative path towards materials with photonic band gaps, such as the colloidal diamond6. Instead of using droplets, one can imagine folding molecular polymers designed with hydrophobic and polar moieties8, or building macroscopic beads-on-a-string models with specific interactions, facilitated by an external drive12. This new paradigm of hierarchical folding as a precursor for large-scale self-assembly offers design rules for biomimetic materials with tunable functionalities34.

Methods

Droplet synthesis

Monodisperse polydimethylsiloxane droplets were synthesized according to a protocol modified from that outlined in refs. 7,15,26. An equal volume of dimethoxydimethylsilane (Sigma Aldrich) and (3,3,3-trifluoropropyl)methyldimethoxysilane (Gelest) was mixed together with DI water at approximately 2% v/v. The monomers were prehydrolysed by vortexing for 60 min. Ammonia was added at 1% v/v, and the droplets were left to grow for 24 h. The droplets were then dialysed against 5 mM sodium dodecyl sulfate (SDS, Sigma Aldrich) to remove the remaining ammonia and reaction byproducts. We then incubated the droplets in 1% volume of (3-glycidoxypropyl)methyldiethoxysilane (Gelest) with 10 mM sodium azide and 5 mM SDS. This embedded reactive azide groups inside the droplets, such that they can be fluorescently labelled at a later stage. This synthesis produced monodisperse oil droplets that were denser than water with a low gravitational height, forming a quasi-two-dimensional system.

DNA sequences and their interactions

The following is a complete list of DNA sequences used in this work, listed with their modifications from 5′ to 3′. The strands which formed the interactions were as follows:

A: azide Cy3 A GCA TTA CTT TCC GTC CCG AGA GAC CTA ACT GAC ACG CTT CCC ATC GCT A GA GTT CAC AAG AGT TCA CAA

B: azide Cy5 A GCA TTA CTT TCC GTC CCG AGA GAC CTA ACT GAC ACG CTT CCC ATC GCT A TT GTG AAC TCT TGT GAA CTC

C: azide AG CAT TAC TTT CCG TCC CGA GAG ACC TAA CTG ACA CGC TTC CCA TCG CTA TTT TTA GTC

D: azide AG CAT TAC TTT CCG TCC CGA GAG ACC TAA CTG ACA CGC TTC CCA TCG CTA TTT GAC TAA

P: azide AG CAT TAC TTT CCG TCC CGA GAG ACC TAA CTG ACA CGC TTC CCA TCG CTA TTT ATC GAT

CS: TAG CGA TGG GAA GCG TGT CAG TTA GGT CTC TCG GGA CGG AAA GTA ATG CT azide

The strongest DNA interaction is the 20 bp hybridization of A and B strands providing the colloidomer (blue–yellow) backbone. In typical experimental conditions, the backbone melts at around 75 °C.

The remaining strands provide a hierarchy of secondary interaction strengths to mediate sequential folding:

(1) The strongest secondary interaction is realized by the P strand through palindromic self-interaction. In typical experimental conditions, it melts between 40 °C and 45 °C. This strand facilitates homophilic blue–blue or yellow–yellow interactions.

(2) A weaker secondary interaction is mediated by the complementary interaction of C and D strands, which melt between 30 °C and 35 °C. This interaction facilitates secondary yellow–blue bonds.

(3) The weakest interaction is provided by the D strand, by a weak palindromic self-interaction. In typical experimental conditions, it melts around 27 °C. This strand facilitates homophilic blue–blue or yellow–yellow interactions.

In Fig. 3b, we show foldamers obtained via three protocols, each of which uses a different combination of the DNA interactions coating the droplets. Protocol I uses interactions 1 and 3 (giving the ladder foldamer). Protocol II uses interactions 1, 2 and 3 (giving the triangle, rocket, hourglass, poodle and crown foldamers). Protocol III uses interactions 2 and 3 (giving the chevron and flower foldamers).
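The hierarchy of interactions and the three protocols can be summarized in a small lookup, sketched here. The single representative melting temperatures are an assumption (midpoints of the quoted ranges), melting is idealized as a sharp step, and the names `TM`, `PROTOCOLS`, `active_interactions` and `switch_on_order` are hypothetical.

```python
# Representative melting temperatures in degrees Celsius, keyed by the
# interaction numbers (1)-(3) above. Midpoints of the quoted ranges are an
# assumption made for illustration only.
TM = {
    1: 42.5,  # P palindrome, quoted as 40-45
    2: 32.5,  # C-D complement, quoted as 30-35
    3: 27.0,  # D weak palindrome, quoted as around 27
}

# Interactions used by each protocol, as listed in the text.
PROTOCOLS = {"I": (1, 3), "II": (1, 2, 3), "III": (2, 3)}

def active_interactions(protocol, temperature):
    """Interactions of a protocol that are bound at a given temperature,
    idealizing each melting transition as a sharp step at TM."""
    return [i for i in PROTOCOLS[protocol] if temperature < TM[i]]

def switch_on_order(protocol):
    """Order in which interactions switch on as the sample is cooled."""
    return sorted(PROTOCOLS[protocol], key=lambda i: TM[i], reverse=True)
```

For example, under these assumptions `active_interactions("II", 30.0)` returns `[1, 2]`: the strong palindrome and the complementary C–D bond are on, whereas the weak palindrome is still melted.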

DNA-labelling of emulsion droplets

Before labelling with DNA, emulsion droplets were diluted into 1 mM SDS at a volume fraction of approximately 6%. DNA strands with sticky ends were reacted with a DBCO-terminated pegylated lipid (DSPE-PEG-DBCO, Avanti Polar Lipids), and then annealed with a complementary spacer strand as described in refs. 7,15. Droplets were incubated with backbone DNA at 200 nM concentrations with a volume fraction of 0.6% with 50 mM NaCl, 10 mM Tris pH 8 and 1 mM EDTA. After 30 min, secondary interaction DNA was added, bringing the total concentration to 5–25 μM. The droplets were then incubated for 2 h before being diluted by a factor of two with a buffer containing 50 mM NaCl, 10 mM Tris pH 8, 0.1% w/v Triton 165 and Cyanine 3 DBCO (or Cyanine 5 DBCO, both from Lumiprobe). The droplets were incubated for a further 30 min before being washed several times in 50 mM NaCl to remove all unreacted dye.

Colloidomer formation

Droplet polymerization was accelerated by dispersing the droplets in an aqueous ferrofluid (EMG 707, FerroTec) and aligning them with a magnetic field. The ferrofluid was washed several times into 0.3% F68 pluronic surfactant by centrifugation to remove the proprietary surfactant in the ferrofluid. Two sets of droplets were prepared with complementary backbone DNAs and secondary DNA strands of choice. The two droplet types were mixed at a 1:1 ratio along with a 1:3 dilution of the F68 ferrofluid buffer, 200 mM NaCl and 20 mM EDTA pH 8. The sample was added to a custom flow chamber made from a hexamethyldisilazane (Sigma Aldrich) treated glass slide, a coverslip and parafilm. The flow cell was sealed with ultraviolet glue.

The sample was then heated up to 75 °C to break all bonds in the system, and then cooled down to just above the melting temperature of the strongest secondary interaction, typically 50 °C. The sample was then put through a repeated cycle of alignment with rare earth magnets and relaxation to grow the chains. Typically, this produced a mixed sample of monomers, linear chains and branched chains. The density of droplets was optimized such that they would grow sizable polymer chains, but that the chains would not aggregate on the timescale of the folding experiments. The colloidomers were allowed to relax in the absence of a magnetic field before the folding data were taken. Data were taken using a Nikon TI Eclipse with a ×20 objective using either single- or double-channel fluorescence imaging.

Temperature protocols and waiting times

The temperature was adjusted using a custom-made heating cell composed of an indium tin oxide coated glass slide (SPI) connected to a Thorlabs TC200 resistive heater with a thermocouple for feedback. The temperature protocol was programmed through custom software. For a given temperature protocol, first a sample of droplet polymers with the desired set of interactions was made. A manual sweep of the temperature was performed to determine where each interaction takes place, as the melting temperatures can change from sample to sample. The first temperature step lasting 10 min was programmed to be above the melting temperature of all interactions to identify the unfolded colloidomers.

Subsequently, there can be one, two or three additional steps depending on how many interactions are to be turned on. If more than one interaction is turned on, the waiting step for the first interaction is the longest. For the data in Fig. 3b, the waiting time at the first step was 20 min (except for the N = 6 triangle, which had a waiting time of 30 min), whereas that for the second and third steps was typically 5–10 min. In principle, longer waiting times enable the resolution of local minima and lead to better yields. In practice, however, longer waiting times increase the chance that colloidomers aggregate during folding, which can be avoided in dilute samples.

Video analysis

Folding videos were analysed using a custom MATLAB data analysis software. All particles were identified and located using thresholding. These particles were then tracked through the whole video using custom software modelled after that in ref. 35. Polymers were identified using the same metrics as in ref. 7 from the first 10 min of every recording, which was always above the melting temperature of the strongest secondary interaction. An N × N × t (where N is the number of monomers in the polymer and t is the time) connectivity matrix was then calculated for each polymer using the particle locations and diameters. The contact matrix was median filtered over t to remove transient interactions. Each contact matrix was then matched to a polymer configuration theoretically computed, allowing us to track the polymer configuration over time. Selections of data were vetted by hand afterwards to ensure the integrity of the data. Polymers that aggregated or that folded into three-dimensional structures were discarded.
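The contact-matrix step of this analysis can be sketched as follows. This is not the custom MATLAB software used in the work; it is a minimal Python illustration in which the contact tolerance, the filter window and the function names are all assumptions.

```python
from math import hypot
from statistics import median

def contact_matrix(positions, diameters, tolerance=1.1):
    """N x N matrix of 0/1 entries: 1 where the centre-to-centre distance is
    at most `tolerance` times the mean diameter of the pair (assumed rule)."""
    n = len(positions)
    contacts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = hypot(positions[i][0] - positions[j][0],
                      positions[i][1] - positions[j][1])
            if d <= tolerance * 0.5 * (diameters[i] + diameters[j]):
                contacts[i][j] = contacts[j][i] = 1
    return contacts

def median_filter_contacts(frames, window=5):
    """Median-filter every matrix entry over time to drop transient contacts.
    The window is truncated at the ends of the video; a tie in an even-sized
    truncated window is rounded down to 0 (no contact)."""
    half = window // 2
    n = len(frames[0])
    filtered = []
    for t in range(len(frames)):
        chunk = frames[max(0, t - half):t + half + 1]
        filtered.append([[int(median(f[i][j] for f in chunk)) for j in range(n)]
                         for i in range(n)])
    return filtered
```

A contact that appears in a single frame of an otherwise unbound pair is removed by the filter, whereas a bond that persists over most of the window is kept.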

In Fig. 2a, the plotted yields as a function of time of a given configuration are normalized by the total number of identified configurations having the same number of bonds, that is, ones within the same row of the folding tree. If a colloidomer is lost at a given time, that is, leaves the observational window, aggregates with another one or enters an unidentifiable configuration, it is removed from the analysis pool. For Fig. 3b, the yield is defined as the fraction of polymers of length N that fold to completion into the target rigid structure divided by the fraction of polymers of length N that fold to completion into any rigid structure of the same size. A chain-by-chain analysis reveals that the typical fraction of chains that successfully complete folding is on average 65%, ranging from 50 to 70% for chain lengths N = 6−11 droplets. In this work, the quoted folding yields consider only these chains. To increase the fraction of viable chains, our method could be improved with a larger density mismatch between the droplets and the aqueous phase to ensure two-dimensional folding, while using a sample cell with individual wells for each chain.
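As a worked illustration of the relative yield defined above, the sketch below divides the number of chains reaching the target geometry by the number reaching any rigid geometry. The function name and the counts in the example are hypothetical and are not data from this work.

```python
def relative_yield(rigid_counts, target):
    """Fraction of fully folded (rigid) chains that reached the target geometry.

    rigid_counts maps each rigid geometry name to the number of chains of a
    given length N that folded to completion into it.
    """
    total = sum(rigid_counts.values())
    if total == 0:
        raise ValueError("no chains folded to completion")
    return rigid_counts.get(target, 0) / total

# Hypothetical counts for illustration only.
example = relative_yield({"rocket": 9, "ladder": 1}, "rocket")
```

Chains that never reach a rigid state do not enter the denominator, which is why this quantity is a relative rather than an absolute yield.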

Possible extensions to colloidomer folding

To experimentally realize supracolloidal self-assemblies, such as the ones shown in Fig. 4b, several further steps need to be taken. The emulsion polymerization protocol yields an exponential distribution of chain lengths shown in Extended Data Fig. 1. Therefore, our samples first need to be segregated by chain length. This could be achieved using the glycerol-based density gradient centrifugation method36,37. This method has been used to separate clusters of solid colloids of different sizes. It can now be extended to colloidomers consisting of emulsion droplets, as they are robust against centrifugation (see the washing steps of the current synthesis), and are not destabilized by glycerol, as shown by the refractive-index matching experiments in ref. 38.

To avoid chain aggregation, secondary interactions would be implemented using linker-mediated assembly39,40,41. The desired single-chain-length sample would then be diluted to low volume fraction to avoid aggregation during folding. The linker strands would then be added to implement the appropriate folding protocol. Temperature quenches can then be followed to create a uniform sample of foldamers for supracolloidal assembly. Once folded, the unused interactions can lead to supracolloidal architectures, such as in the case shown in Fig. 4b for which activation of the unused blue–blue bond in the star foldamer leads to the mosaic assembly. In other cases, specific binding between foldamers can be activated using strand displacement reactions42 or triggered with linker-mediated interactions41.

Enumerating two-dimensional geometries

We define as a geometry any colloidomer cluster in which deformations cost energy, that is, a deformation requires the breaking of a secondary bond. Geometries are therefore rigid clusters. To enumerate two-dimensional geometries for a system of size N, we start by selecting all possible sets of N neighbouring points on an N × N triangular lattice. We form bonds between points located at a unit distance and test the rigidity of the resulting geometries by analysing the normal modes of the dynamical matrix. We describe the ensemble of NR geometries for a chain of length N by a set of planar graphs {Gi,N(V, E)}, with index i ∈ (1, NR), and where the vertices (V) are the droplets in the chain and the edges (E) are the DNA-mediated bonds. Edges may be of two types: backbone bonds and secondary bonds. Each graph is characterized by a contact matrix, which describes the bonds between droplets, and a distance matrix, which contains the distances between each droplet pair in a geometry. The first size with more than one geometry is N = 6 (ref. 18). At N ≥ 13 the first geometries with stable holes in the bulk appear.
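The rigidity test can be sketched in a few lines. Instead of diagonalizing the dynamical matrix, this illustration checks the rank of the rigidity matrix, which has the same zero modes for unit-stiffness bonds: a planar cluster of N points is taken as rigid when the rank equals 2N − 3, leaving only the three rigid-body motions. The function names are hypothetical and this is not the authors' code.

```python
from math import sqrt

def unit_bonds(points, tol=1e-6):
    """Pairs of points separated by one lattice spacing (unit distance)."""
    bonds = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if abs(sqrt(dx * dx + dy * dy) - 1.0) < tol:
                bonds.append((i, j))
    return bonds

def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix by Gaussian elimination with pivoting."""
    rows = [list(r) for r in rows]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        if rank == len(rows):
            break
        pivot = max(range(rank, len(rows)), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank

def is_rigid(points):
    """True if the unit-distance bond network has no non-trivial zero modes."""
    n = len(points)
    rigidity = []
    for i, j in unit_bonds(points):
        row = [0.0] * (2 * n)
        dx = points[i][0] - points[j][0]
        dy = points[i][1] - points[j][1]
        row[2 * i], row[2 * i + 1] = dx, dy
        row[2 * j], row[2 * j + 1] = -dx, -dy
        rigidity.append(row)
    return n >= 2 and matrix_rank(rigidity) == 2 * n - 3

H = sqrt(3.0) / 2.0  # row spacing of the triangular lattice
rhombus = [(0.0, 0.0), (1.0, 0.0), (0.5, H), (1.5, H)]
trimer_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

With these definitions the two-triangle rhombus is rigid, whereas the straight trimer is floppy because its end droplets can still pivot.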

Foldamer search algorithm

We develop a computationally efficient search algorithm to systematically scan protocol and sequence spaces and find foldamers of a given length N. The algorithm requires as input the ensemble of all backbone configurations within the NR geometries for a chain of length N, that is, the set of Hamiltonian paths \(\{H_{1,1},\ldots,H_{p_{1},1},\ldots,H_{1,q},\ldots,H_{p_{q},q}\}\), for all q ∈ (1, NR), where pq is the number of paths in the qth geometry. The total number of Hamiltonian paths grows exponentially with N but does not depend on the sequence or the interaction matrix. The paths are therefore computed only once per N, significantly reducing the computation time. The structure of the algorithm is shown in Extended Data Fig. 2. For a given protocol and sequence, the algorithm can be summarized as follows:

Input. Map the sequence onto Hamiltonian paths.

(1) Form bonds. Apply the first interaction of the protocol. A bond will be formed between two vertices if they are in neighbouring lattice points and the interaction is allowed.

(2) Are there geometries?

  (i) Yes. If the classification flags geometries, the algorithm stops. If there is a single geometry, a foldamer is reported. We choose to report a solution even if there are competing floppy states with the same or more bonds as the foldamer geometry (this becomes possible when N ≥ 7).

  (ii) No. A foldamer is not selected.

(3) Select global minima. This is analogous to selecting floppy states with the largest number of bonds. Note that this also implies that local minima in the first interaction tree are not considered (here we assume strict downhill folding).

(4) Continue the protocol of adding interactions. Update the interaction matrix according to the protocol.

(5) Form new bonds. Repeat the bond-making process iterating over the states from step 3.

(6) Classify states. We classify states into global and local minima, and transient states. Global minima are states of a tree that cannot acquire additional bonds either because they reached a rigid state or because spatially accessible neighbours do not have flavours with attractive interactions. Local minima are floppy states for which the topology prevents further formation of bonds. All other states are classified as transient states.

(7) Is the protocol over?

  (i) Yes. Analyse the resulting geometries. If a single geometry is found, a foldamer is reported.

  (ii) No. Repeat steps 4–7 until the protocol ends.
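The bond-forming and interaction-update steps of this procedure (steps 1, 4 and 5) can be sketched for a single backbone arrangement. The sketch assumes axial coordinates on the triangular lattice, a protocol given as a list of sets of allowed flavour pairs, and hypothetical function names; the classification and selection of minima (steps 2, 3, 6 and 7) are not implemented.

```python
# The six nearest-neighbour offsets of a triangular lattice in axial coordinates.
NEIGHBOUR_OFFSETS = {(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)}

def are_neighbours(p, q):
    """True if two lattice points are nearest neighbours."""
    return (p[0] - q[0], p[1] - q[1]) in NEIGHBOUR_OFFSETS

def form_bonds(path, sequence, allowed_pairs):
    """Secondary bonds between non-consecutive droplets that sit on
    neighbouring lattice points and whose flavours are allowed to interact."""
    bonds = set()
    for i in range(len(path)):
        for j in range(i + 2, len(path)):  # i + 1 is a backbone neighbour
            pair = frozenset((sequence[i], sequence[j]))
            if are_neighbours(path[i], path[j]) and pair in allowed_pairs:
                bonds.add((i, j))
    return bonds

def run_protocol(path, sequence, protocol):
    """Switch interactions on step by step and record the bonds after each."""
    allowed, bonds, history = set(), set(), []
    for new_pairs in protocol:
        allowed |= new_pairs          # update the interaction matrix
        bonds |= form_bonds(path, sequence, allowed)
        history.append(sorted(bonds))
    return history

# Toy example: an ABAB tetramer laid out as a rhombus (a Hamiltonian path).
path = [(0, 0), (1, 0), (0, 1), (1, 1)]
protocol = [{frozenset(("A", "A"))}, {frozenset(("B", "B"))}]
history = run_protocol(path, "ABAB", protocol)
```

In this toy case the first step adds the A–A bond between droplets 0 and 2, and the second step adds the B–B bond between droplets 1 and 3, so `history` is `[[(0, 2)], [(0, 2), (1, 3)]]`.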

Simulation details

We perform Dissipative Particle Dynamics (DPD)43 simulations using an in-house code. Our unit of length is the particle diameter σ = 1 and we assume all particles have the same mass m = 1. Energy is measured in units of kBT, where kB is the Boltzmann constant, and we fix the temperature of the system at kBT = 1. When folding a colloidomer of length N, we set the simulation box size L to L/σ = (N + 2). For the self-assembly of supracolloidal architectures, we choose L/σ = 30. In both cases we use periodic boundary conditions. We use a multiple-timestep simulation scheme to integrate the equations of motion with a timestep \(dt_{s} = 10^{-2}\) to resolve the dynamics of the solvent and a timestep \(dt_{c} = 10^{-4}\) for the dynamics of the colloids. DNA-mediated interactions are modelled by a short-range, isotropic interaction potential44

$$U(r)=\varepsilon\,\alpha(r_{i},\sigma)\left[\left(\frac{\sigma}{r}\right)^{2}-1\right]\left[\left(\frac{r_{i}}{r}\right)^{2}-1\right]^{2}, \tag{1}$$

where r is the distance between two interacting particles, ri = 1.05σ is the interaction range, ε is the strength of the interaction and α is a parameter that sets the minimum of the potential to U(rmin) = −ε (see ref. 44 for further details). Primary bonds are made irreversible by setting εP = 40kBT. To simulate secondary interactions, we gradually increase ε until it reaches εS, once the corresponding interaction is turned on. The increase is done over the course of 200 simulation steps to ensure downhill folding while preventing poor potential sampling.
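The potential of equation (1) can be evaluated with the short sketch below. The closed forms for α and rmin are obtained here by requiring dU/dr = 0 and a well depth of ε for this functional form. Setting U = 0 beyond ri and ramping ε linearly are assumptions, and the function names are hypothetical rather than taken from the in-house code.

```python
from math import sqrt

def alpha(r_i, sigma=1.0):
    """Prefactor that normalizes the well depth of equation (1) to epsilon."""
    c = (r_i / sigma) ** 2
    return 27.0 * c / (4.0 * (c - 1.0) ** 3)

def r_min(r_i, sigma=1.0):
    """Position of the potential minimum, from dU/dr = 0."""
    c = (r_i / sigma) ** 2
    return r_i * sqrt(3.0 / (1.0 + 2.0 * c))

def potential(r, epsilon, r_i=1.05, sigma=1.0):
    """U(r) of equation (1); assumed to vanish beyond the range r_i."""
    if r >= r_i:
        return 0.0
    return (epsilon * alpha(r_i, sigma)
            * ((sigma / r) ** 2 - 1.0)
            * ((r_i / r) ** 2 - 1.0) ** 2)

def ramped_epsilon(step, epsilon_s, ramp_steps=200):
    """Interaction strength during switch-on; a linear ramp is assumed."""
    return epsilon_s * min(step, ramp_steps) / ramp_steps
```

With ri = 1.05σ the minimum lies between σ and ri, and `potential(r_min(1.05), epsilon)` evaluates to −ε up to rounding, which is a quick consistency check on the normalization.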