The ability to encode and convert heritable information into molecular function is a defining feature of life as we know it (Schrödinger 1944; Joyce 1994). This conversion is performed by the translation process, in which triplets of nucleotides in a nucleic acid polymer (mRNA) encode specific amino acids in a protein polymer that folds into a three-dimensional structure. The folded protein then performs one or more molecular activities, often as one part of a complex and coordinated physiological network (Goldman et al. 2012; Cuevas-Zuviría et al. 2023). Prebiotic systems, lacking the ability to explicitly translate information between genotype and phenotype, would have depended either upon chemosynthetic pathways to generate their components, which would have constrained their complexity and evolvability (Vasas et al. 2010; Tessera 2018), or upon the dual role of RNA as both carrier of information and catalyst, a possibility that is still supported by only a very limited set of catalytic RNAs (Bernhardt 2012; Goldman et al. 2021). Thus, the emergence of translation during early evolutionary history may have allowed life to unmoor from the setting of its origin (Wolf and Koonin 2007; Goldman et al. 2016).

The emergence of translation machinery during early evolution also represents an entirely novel and distinct threshold of behavior for which there is no abiotic counterpart: it could be the only known example of computing that emerged naturally at the chemical level. The translation machinery’s decoding system is the basis of cellular translation’s information-processing capabilities, and it exhibits at least four operation types that find parallels in computer systems engineering.

First, this system performs an explicit mapping operation between one set of molecules, nucleic acids, and another (very different) set of molecules, proteins. This mapping operation, or ‘the genetic code,’ is arguably the most fundamental logical relationship of all living systems, establishing a semantic connection between two different kinds of polymers, each with orthogonal functions in the cell. Second, this operation is followed by the folding of the resulting string of amino acids, a process that can be likened to compiling a set of instructions. Recent research has shown that the folding process requires very delicate kinetic control of the pace of translation (Jiang et al. 2022). Third, translation has flow-control operations, for example those mediated by “riboswitches,” that debug and regulate its elemental operations. Translation can be stopped, re-started, stalled, and rescued by different processes that ensure that it continues to operate under diverse contingencies (Starosta et al. 2014; Breaker 2018). Fourth and finally, translation’s accuracy is afforded by non-linear process controls between different translation components and assemblies, with different kinetic partitioning checkpoints that favor forward-process steps for correct substrates and disfavor or delay these steps for the entry of poor substrates (Milón and Rodnina 2012). No other chemical system, either natural or synthetic, shows so many information-processing features bundled together as part of a single, multi-layered apparatus.
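The first of these operations, the mapping from codons to amino acids, can be illustrated as a simple lookup table. The Python sketch below is illustrative only. It assumes the standard genetic code, writes codons in the DNA alphabet (U is converted to T), and treats initiation as a scan for the first AUG, which is a deliberate simplification. The names `CODON_TABLE` and `translate` are hypothetical, and the sketch models none of the folding, regulatory, or kinetic proofreading layers described above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# T, C, A, G order at each position; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Map an mRNA string to a one-letter peptide string.

    Simplifying assumptions: reading starts at the first AUG and ends at
    the first in-frame stop codon or when fewer than three bases remain.
    Characters other than A, C, G, U/T raise a KeyError.
    """
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return ""  # no start codon, nothing is decoded
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon: a minimal form of flow control
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGCUUUUAAAUGAGG"))  # MAFK
```

In this toy input the reading frame opens at AUG, decodes GCU, UUU and AAA, and closes at the in-frame UGA stop codon.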

The principle of the correspondence between discrete nucleic acid and protein sequences is so inherently informatic that it can be analyzed in such terms without explicit reference to the underlying chemistry (Shannon 1948; Itzkovitz and Alon 2007; Adami 2012; Wills et al. 2015; Sonneborn 1965). The information changes from a storage format into a functional format by decoding a set of three nucleotide monomers (a triplet codon) into one amino acid monomer along a growing peptide chain. The use of three positions, each drawn from four RNA letters (4 × 4 × 4 = 64 possible codons), to specify a single amino acid (20 possibilities) may, at first glance, seem wasteful. But the excess capacity of the mapping assignment scheme, ‘code degeneracy,’ remarkably connects two very disparate needs of life: the consistency of information processing with the variability of novelty (Haig and Hurst 1991; Koonin and Novozhilov 2016; Błażej et al. 2018).
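The apparent excess capacity can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, that all codons and all amino acids are equally likely, so the figures are upper bounds and not measured information contents. The function name `code_capacity` and its parameters are hypothetical.

```python
import math


def code_capacity(alphabet_size: int = 4,
                  codon_length: int = 3,
                  n_amino_acids: int = 20) -> dict:
    """Compare the capacity of a codon with what one amino acid requires.

    Assumes uniform usage of codons and amino acids (an idealization).
    """
    n_codons = alphabet_size ** codon_length
    bits_per_codon = math.log2(n_codons)
    bits_per_amino_acid = math.log2(n_amino_acids)
    return {
        "n_codons": n_codons,
        "bits_per_codon": bits_per_codon,
        "bits_per_amino_acid": bits_per_amino_acid,
        # Capacity left over for degeneracy, under the uniform assumption.
        "excess_bits": bits_per_codon - bits_per_amino_acid,
    }


for name, value in code_capacity().items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a triplet codon can carry 6 bits, while choosing among 20 amino acids requires about 4.3 bits, leaving roughly 1.7 bits per codon available to degeneracy. A doublet code (`codon_length=2`) would offer only 16 codons, too few to address 20 amino acids.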

Degeneracy affords, for example, multiple and adjacent code assignments for amino acids with similar chemical properties (impacting functional divergence), variable translation speed (affecting folding), multiple stop codons limiting frameshifting or read-throughs (avoiding the waste of resources) (Nirenberg et al. 1966; Taylor and Coates 1989; Lehmann and Libchaber 2008; Liu et al. 2020; Křížek and Křížek 2012), and codon content and preference variability across different organisms (Shiba et al. 1997; Yacoubi et al. 2012; Fujishima and Kanai 2014; Pust et al. 2022; Novoa et al. 2012). Synonymous codon usage patterns (Labella et al. 2019), a ribosome’s interaction partners, and ribosomal assembly processes may not strictly map across related organisms (Timsit et al. 2021), but code degeneracy in translation as a fundamental informatic attribute is universal and conserved. For all these reasons, code degeneracy is better viewed as a hedge against (inevitable) error and a springboard of potential novelty than a wasteful use of computing resources. It is a feature, not a bug.
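The error-buffering role of degeneracy can be illustrated by counting how many codons specify each amino acid and how many single-nucleotide substitutions leave the encoded amino acid unchanged. The Python sketch below assumes the standard genetic code and weights every substitution equally, which ignores real mutational biases and the chemical similarity of the amino acids involved. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy() -> Counter:
    """Number of codons assigned to each amino acid (and to stop, '*')."""
    return Counter(CODON_TABLE.values())


def synonymous_single_substitutions() -> tuple:
    """Count single-base changes to sense codons that keep the amino acid.

    Returns (synonymous, total), where total covers all nine possible
    one-base changes of each of the 61 sense codons.
    """
    synonymous = total = 0
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # only mutate sense codons
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                total += 1
                if CODON_TABLE[mutant] == amino_acid:
                    synonymous += 1
    return synonymous, total


print(sorted(degeneracy().items(), key=lambda item: -item[1]))
synonymous, total = synonymous_single_substitutions()
print(f"{synonymous} of {total} substitutions are synonymous "
      f"({synonymous / total:.1%})")
```

In this idealized count, leucine, serine and arginine each have six codons, methionine and tryptophan have one, and roughly a quarter of all single-base changes to sense codons are silent at the protein level.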

The information-processing capabilities of translation are distributed among the many components and subprocesses of the translation machinery, but the process of reading and decoding of sequence information at the molecular scale is nevertheless incredibly efficient. For each round of translation, the ribosome expends around four GTP molecules through translation factors and one ATP during the aminoacylation of tRNAs. Despite its multiple layers of control, the nominal cellular replication process is still orders of magnitude more efficient on a per-bit basis when compared to even the most idealized conceivable electronic computers; this is also within an order of magnitude of theorized universal minimal bounds for information processing of any architecture (Zhirnov and Cavin 2013; Kempes et al. 2017).

Translation represents a uniquely evolved form of chemical computation with no known naturally occurring equivalent. This chemical intelligence is reliably repetitive and at the same time robust to perturbation, but it is not monolithic. The many components that take part in the translation system have intertwined yet potentially distinct histories (Fournier et al. 2011; Fournier and Alm 2015; Petrov et al. 2015; Pouplana 2020; Fer et al. 2022). As its components co-evolved, the translation system would likely have had changing informatic capabilities, which in turn would have restricted or expanded the possible functions that it could have conducted. Understanding the origin of translation, from the perspective of its emerging information-processing attributes, will lead to profound new insights into the conditions in which biochemistry emerged from geochemistry.