Abstract
A theory of an early stage of genome evolution by combinatorial fusion of circular DNA units is suggested, based on protein sequence “fossil” evidence. The evidence includes a preference of protein sequence lengths for certain sizes: multiples of 123 aa for eukaryotes and multiples of 152 aa for prokaryotes. At the DNA level these sizes correspond to 350–450 base pairs, the known optimal range for DNA ring closure. Methionine residues repeatedly appear along the sequences with the same period of about 120 aa (in eukaryotes), presumably marking the sites of insertion of the early genes, which were rings of protein-coding DNA. The absence of torsional constraint in this DNA results in a very sharp estimate of the helical periodicity of the early DNA, indistinguishable from the experimental mean value for extant DNA. According to the combinatorial fusion theory, based on the above evidence, in the pregenomic, prerecombinational stage the genes and the noncoding sequences existed in the form of autonomously replicating DNA rings of close to standard size, randomly segregating between dividing cells, as modern plasmids do. In the recombinational early genomic stage the rings started to fuse, forming larger DNA molecules consisting of several unit genes connected in various combinations and forming long protein-coding sequences (combinatorial fusion). This process, which perhaps involved noncoding sequences as well, eventually resulted in the formation of large genomes. The dispersed circular DNA (or, rather, evolutionarily advanced derivatives thereof) may still exist in the form of various mobile DNA elements.
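The step from protein unit sizes to DNA ring sizes is plain codon arithmetic. The following is a minimal sketch of that conversion, not the author's own procedure: it assumes three base pairs per residue and ignores stop codons and any flanking sequence. The names `UNIT_SIZES_AA`, `BP_PER_RESIDUE` and `coding_length_bp` are illustrative only.

```python
# Unit sizes quoted in the abstract, in amino acid residues (aa).
UNIT_SIZES_AA = {"eukaryotes": 123, "prokaryotes": 152}

# Assumption: three base pairs per residue; stop codons and any
# flanking noncoding sequence are ignored.
BP_PER_RESIDUE = 3


def coding_length_bp(residues: int) -> int:
    """Return the length in base pairs of DNA encoding `residues` amino acids."""
    return residues * BP_PER_RESIDUE


if __name__ == "__main__":
    for group, size_aa in UNIT_SIZES_AA.items():
        print(f"{group}: {size_aa} aa -> {coding_length_bp(size_aa)} bp")
```

Under these assumptions the eukaryotic unit comes to 369 bp and the prokaryotic unit to 456 bp, so the quoted 350–450 bp range should be read as approximate at its upper end.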
Cite this article
Trifonov, E.N. Segmented structure of protein sequences and early evolution of genome by combinatorial fusion of DNA elements. J Mol Evol 40, 337–342 (1995). https://doi.org/10.1007/BF00163239