Summary
The frequency distributions of size (molecular weight) and of number of subunits were determined from lists of over 500 mammalian and bacterial proteins. The size distribution of polypeptides is well fitted by a lognormal distribution with a median value of about 40,000 daltons and a deviation of 1.8. About 60% of all proteins exist in multimeric aggregates. Of the multimers, 75% have either two or four subunits, while less than 1% have an odd number of subunits greater than three. Over 90% of the time, a given multimer is composed of subunits of nearly equal size, so that the size of an N-mer is lognormally distributed with a median value of N × 40,000 daltons and a deviation of 1.8. The distributions of polypeptide size and subunit number are similar for mammalian and bacterial proteins, as well as for intracellular and extracellular proteins.
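The lognormal description above can be made concrete with a short sketch. It assumes that the "deviation of 1.8" is a geometric (multiplicative) standard deviation, so that the natural log of size is normally distributed with mean ln(40,000) and standard deviation ln(1.8); the summary does not state this explicitly. The function names are illustrative and not taken from the paper.

```python
import math
from statistics import NormalDist

MEDIAN_DA = 40_000.0  # median polypeptide size quoted in the summary
GSD = 1.8             # ASSUMPTION: "deviation" read as a geometric standard deviation


def lognormal_quantile(p, median=MEDIAN_DA, gsd=GSD):
    """Size (daltons) below which a fraction p of polypeptides falls."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return median * gsd ** z


def lognormal_cdf(size_da, median=MEDIAN_DA, gsd=GSD):
    """Fraction of polypeptides no larger than size_da."""
    z = (math.log(size_da) - math.log(median)) / math.log(gsd)
    return NormalDist().cdf(z)


def nmer_median(n_subunits, median=MEDIAN_DA):
    """Median size of an N-mer built from subunits of nearly equal size."""
    return n_subunits * median


if __name__ == "__main__":
    low = lognormal_quantile(0.05)
    high = lognormal_quantile(0.95)
    print(f"Central 90% of polypeptides: {low:,.0f} to {high:,.0f} daltons")
    print(f"Fraction under 100,000 daltons: {lognormal_cdf(100_000):.2f}")
    print(f"Median tetramer size: {nmer_median(4):,.0f} daltons")
```

Under this reading, about 68% of polypeptides fall between 40,000/1.8 and 40,000 × 1.8 daltons, the multiplicative analogue of one standard deviation either side of the mean.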
The sedimentation profiles of mRNA from HeLa and CHO cells indicate that the lengths of mammalian mRNA are lognormally distributed with a median value of 1.4 kb and a deviation of 2.0. This implies that, on average, an mRNA species is only about 25% larger than the coding sequence of the mature polypeptide it codes for. Therefore, at most a small fraction of mammalian mRNA could code for large precursor polypeptides which are then cleaved into a number of mature polypeptides (like polio mRNA), or for 3′ coterminal mRNAs where the larger species contain the information for up to four proteins (like adenovirus mRNA).
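The comparison between mRNA length and polypeptide size can be checked with rough arithmetic. The sketch below assumes an average amino acid residue mass of about 110 daltons (a common rule of thumb, not a figure from the summary) and three nucleotides per codon, and it compares the two medians directly. The authors' own conversion is not given here, so the result only illustrates that the numbers are of the stated order.

```python
AVG_RESIDUE_DA = 110.0  # ASSUMPTION: rule-of-thumb mean residue mass, not from the paper
NT_PER_CODON = 3


def coding_length_kb(polypeptide_da, residue_da=AVG_RESIDUE_DA):
    """Length of coding sequence (kb) needed for a polypeptide of the given mass."""
    residues = polypeptide_da / residue_da
    return residues * NT_PER_CODON / 1000.0


def excess_fraction(mrna_kb, polypeptide_da):
    """Fractional amount by which the mRNA exceeds the coding length."""
    return mrna_kb / coding_length_kb(polypeptide_da) - 1.0


if __name__ == "__main__":
    coding = coding_length_kb(40_000)      # median polypeptide
    excess = excess_fraction(1.4, 40_000)  # median mRNA length of 1.4 kb
    print(f"Coding length for a 40,000-dalton polypeptide: {coding:.2f} kb")
    print(f"Median mRNA exceeds it by about {excess:.0%}")
```

With these assumed constants the median mRNA comes out roughly a quarter to a third longer than the coding sequence, which is in line with the "about 25%" quoted above.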
The sedimentation profile of nascent nuclear RNA from HeLa cells suggests that the length distribution of transcription units has two components: an exponential component that decays with a half-length of 10–15 kb, and a high frequency of very short molecules. However, other distributions of transcription unit lengths (for example, the lognormal distribution) could also be consistent with the data if one or more of the following occurred: physiological cleavage of nascent chains, perturbation of non-rRNA transcription by actinomycin D, or degradation during isolation.
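The half-length of the exponential component can be illustrated numerically. The sketch below models only the exponential component, with frequency proportional to 2^(-L / L1/2), and ignores the excess of very short molecules. The function names are illustrative, and the mean follows from the standard property of an exponential distribution rather than from a value reported in the summary.

```python
import math


def relative_frequency(length_kb, half_length_kb):
    """Frequency at length_kb relative to length 0 for an exponential decay."""
    return 2.0 ** (-length_kb / half_length_kb)


def mean_length_kb(half_length_kb):
    """Mean of an exponential distribution with the given half-length."""
    return half_length_kb / math.log(2)


if __name__ == "__main__":
    for half in (10.0, 15.0):
        rel = relative_frequency(30.0, half)
        mean = mean_length_kb(half)
        print(f"L1/2 = {half:.0f} kb: a 30 kb unit is {rel:.3f} times as frequent "
              f"as a very short one; mean length {mean:.1f} kb")
```

Each additional half-length halves the frequency, so with a half-length of 15 kb a 30 kb transcription unit is one quarter as frequent as one of near-zero length within this component.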
The length distribution of HeLa nuclear RNA labeled for 60 min is similar to that of nascent nuclear RNA, indicating that a completed hnRNA chain is quickly transported or degraded after being cleaved.
Abbreviations
- hRNA: heterogeneous RNA
- L1/2: in an exponential distribution, the increase in length required to reduce the frequency by a factor of 2
- kb: kilobases
- kd: kilodaltons
- CHO cells: Chinese hamster ovary cells
Additional information
This paper is dedicated to Harold Sommer
Cite this article
Sommer, S.S., Cohen, J.E. The size distributions of proteins, mRNA, and nuclear RNA. J Mol Evol 15, 37–57 (1980). https://doi.org/10.1007/BF01732582