The Base-e Representation of Numbers and the Power Law

Kak, Subhash

doi:10.1007/s00034-020-01480-0

The Base-e Representation of Numbers and the Power Law

Short Paper
Published: 25 June 2020

Volume 40, pages 490–500, (2021)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Circuits, Systems, and Signal Processing Aims and scope Submit manuscript

The Base-e Representation of Numbers and the Power Law

Download PDF

Subhash Kak ORCID: orcid.org/0000-0001-5426-9759¹

318 Accesses
11 Citations
127 Altmetric
7 Mentions
Explore all metrics

Abstract

Some properties of the optimal representation of numbers are investigated. This representation, which is to the base-e, is examined for coding of integers. An approximate representation without fractions that we call WF is introduced and compared with base-2 and base-3 representations that are next to base-e in efficiency. Since trees are analogous to number representation, we explore the relevance of the statistical optimality of the base-e system for the understanding of complex system behavior and of social networks. We show that this provides a new theoretical explanation for the nature of the power law exhibited by many open complex systems. In specific, we show that the power law distribution most often proposed for such systems has a form that is similar to that derived from the optimal base-e representation.

Number-Theoretical Methods of Factoring Composite Numbers and Calculating the Discrete Logarithm

Article 01 March 2022

Univoque graphs for non-integer base expansions

Article 09 January 2021

On the Arithmetic of Knuth’s Powers and Some Computational Results About Their Density

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Novel representation schemes for numerals are important in communication engineering and computer science applications [14], especially when one wishes to represent data as efficiently as possible [19]. They are also important in understanding complex systems as manifested in behaviors such as the first digit phenomenon [11, 12], power laws and series models in social networks [15,16,17], protein–protein interaction networks [6], physical systems for space itself is a representation scheme [18], transportation networks [9], city and firm sizes [1, 5, 8], and financial networks [23].

We begin with number representation. Let us consider coding of numbers to an arbitrary base r. In general, we should be able to write the expansion for a number x in the form:

$$ x = \mathop \sum \limits_{n = - \infty }^{\infty } a_{n} r^{n} $$

(1)

where $ a_{n} $ are integers so that $ a_{n} $ is zero for a sufficiently large n, and $ 0 \le a_{n} < r. $ In general, this expression applies to any kind of r, even non-integer [7] and irrational [3, 22].

The polynomial representation of Eq. (1) may be viewed as a tree. Thus, for example, the tree for the representation for the number 43 to base 3 is derived by successive division as follows: 43 = 14 × 3 + 1; 14 = 4 × 3 + 2; 4 = 1 × 3 + 1; 1 = 0 × 3 + 1.

Therefore, 43 (base 3) = $ 1 \times 3^{3} + 1 \times 3^{2} + 2 \times 3^{1} + 1 = $ 1 1 2 1, which is obtained by taking the remainders in the reverse order. This tree is shown in Fig. 1, where the nodes are read from left to right.

Note that the number of branches at each node is 3, but only the named branch is shown. If one were to gather all these branches and fold them on top of each other correctly, we will have a single-layered structure with the overlapped number of branches for 0, 1, and 2, respectively. One expects that aggregate numbers under each of these branches are equal and so we can assign them the same probability.

Now, consider the efficiency of representation based on information of the symbols [19]. The information I(x) associated with the symbol x (which is a number less than the base r) is given by:

$$ I\left( x \right) = - \log P\left( x \right) $$

(2)

where $ P\left( x \right) $ is the probability of occurrence of the symbol x. Equation (2) gives the information in bits if it uses a logarithm to base 2, and in nats if base e is used.

Clearly, the capacity of a base representation to carry information goes up as the size of the base increases. But the increase in information must be squared off against the extra burden entailed by the use of the larger set (which corresponds to a greater number of branches at each node). For binary, the capacity is $ \ln 2 = 0.693 $ nats (= 1 bit), whereas for e, it is 1 nat (= 1. 443 bits); for base 3, the figure is 1.099 nats (= 1.585 bits); and for base 10, it is 2.303 nats (= 3.322 bits).

The probability of the use of each of the r symbols may be taken to be the same and equal to $ 1/r $, and the information associated with each symbol is $ \log r $. The efficiency of the coding scheme per symbol is

$$ E\left( r \right) = \frac{\ln r}{r} $$

(3)

Theorem 1

The optimal base for number representation is e.

Proof

To find the value of r for which it is a maximum, we take the derivative of E(r) with respect to r and equate that to zero. This yields the condition that $ \ln r = 1 $, from which we conclude that the optimal base is e, with E(e) = 0.368 nats or 0.531 bits.

The fact that we define efficiency based on information content means that the optimality is true only in a probabilistic sense. This indicates that comparison of base-e number representation to standard base-2, base-3, or decimal representations for integers alone (which is a small subset of the real numbers), or any specific example, may not always show superior results.

Table 1 below provides E(r) for some of the values of r that range from 2 to 10.

Table 1 Efficiency of coding for certain bases

Full size table

The efficiency is quite close to the maximum for r = 3 (a value of 0.528 bits as compared to the optimal value for e which is 0.531 bits), with the next best value coming at the bases 2 and 4 (where it is 0.500 bits). The efficiency at r = 3 is superior to that at r = 2 by 5.6% (for details, see [11, 12, 17]. After this, the values decline monotonically as shown in Fig. 2.

In this article, we show how the base-e system works and compare it to the bases of 2 and 3, which are closest to it in efficiency. Given that e is irrational and therefore the representations for integers in base-e are not going to be efficient, we further present an approximate representation that makes it easier to do a clear comparison with base-2 and -3 representations. If it is accepted that Nature chooses optimal schemes, then the base-e representation should show up in physical space and in complex systems, such as social networks or city size, and we provide evidence that supports this hypothesis. We further show that the power law distribution often proposed for complex systems is related closely to the optimal base-e representation.

2 The Base-e system

The base-e system represents the number using powers of e and coefficients, $ d_{i} , i = \ldots 2,1,0, - 1, - 2, \ldots , $ that are integers less than e, that is 0, 1, and 2 in the most economical manner. Clearly, owing to the nature of e, the representation of integers will involve decimal expansions.

$$ x = d_{n} d_{n - 1} \ldots d_{2} d_{1} d_{0} \cdot d_{ - 1} d_{ - 2} \ldots d_{ - m} $$

(4)

$$ x = e^{n} d_{n} + e^{n - 1} d_{n - 1} + \ldots + ed_{1} + d_{0} + e^{ - 1} d_{ - 1} + e^{ - 2} d_{ - 2} + \ldots $$

For easy reference, the various powers of e are:

n	− 5	− 4	− 3	− 2	− 1	0	1	2	3	4	5
eⁿ	.0007	.018	.050	.135	.368	1	2.718	7.389	20.086	54.598	148.413

Just as 10, 100, 1000, and so on are turning points (exponentials) for base 10, the turning points for base-e (powers of e) after the numbers are rounded off are 3, 7, 20, 55, 148, 403, 1897,.. and so on.

An interesting aspect of the base-e system is that it maps all numbers to irrational points on the real line. In this, it represents the dual to schemes with integer bases that map corresponding points to rational numbers.

In the consideration of codes for integers, the question of the degree of accuracy comes in. The superiority of the base-e representation is true when we consider all real numbers, but here we only wish to compare representations for integers.

In Table 2, we map integers from 1 to 20 to the base-e representation accurate to four “decimal” places. With the added constraint on least error for number of chosen decimal points, one can assert that a unique representation is defined.

Table 2 The base-e coding for numbers 1 through 20, accurate to four “decimal” places

Full size table

Here is the explanation for how number 8 has two forms 22.1110 and 100.1120 in Table 3. In principle, one of these will be superior to the other in how close it is to the integer based on the number of decimal points that are chosen. Given the choice of four decimal points, 22.1110 equals 7.989, whereas 100.1120 equals 7.992; therefore, the latter representation is superior.

Table 3 The WF representation and true value

Full size table

It is clear that base-e representations of numbers for integers from 1 to 20 are much less efficient than base-2 or 3 systems. Consider the $ \varepsilon $-approximation of the base-e representation as

$$ x = \mathop \sum \limits_{i = - m}^{n} a_{i} e^{i} $$

(5)

with the condition that

$$ \left| {x - \mathop \sum \limits_{i = - m}^{n} a_{i} e^{i} } \right| < \varepsilon $$

when $ \varepsilon $ = 0.5, the number is correct to the nearest integer value, and the mapping will be called the WF (for “without fraction”) representation.

3 Base-e (WF) Representation

We have already mentioned that the optimality is to be understood statistically when all the points on the real line are chosen, but one would like to see it operationally for small integers.

We do so by leaving out the fractions and choosing the number closest to the integer being represented as is done in quantization in signal theory [19]. This WF (“without fraction”) representation is mathematically:

$$ x \approx \mathop \sum \limits_{i = 0}^{n} a_{i} e^{i} $$

(6)

with the condition that

$$ \left| {x - \mathop \sum \limits_{i = 0}^{n} a_{i} e^{i} } \right| < 0.5 $$

This actual error between x and its WF representation can be either negative or positive just as is the case in quantization.

Now, in Table 4, we compare the WF representation with representations for bases 2 and 3.

Table 4 Comparison of base-e (WF) coding against bases 2 and 3

Full size table

For numbers 1 through 20, we see the lengths of the representational sequences for bases e and 3 are identical excepting for the number 8.

The total number of characters needed is 51 for base-e, 74 for base-2, and 50 for base-3.

Total cost for base-e: 51 × ln e = 51 nats

Total cost for base-2: 74 × ln 2 = 51.282 nats

Total cost for base-3: 50 × ln 3 = 54.93 nats

Thus, for the WF case for integers 1 through 20, base-e is superior to both the bases of 2 and 3. But as shown in an earlier paper [16], the efficiency of the bases will depend on the nearness to powers of the base and there will be numbers less than the turning point of the base where that base may be superior to the others.

4 Random Trees and Optimal Branching

We now consider trees in natural systems where in contrast to the representation of a single number as in the previous sections we look for all possible outputs associated with the system. The study of natural systems has shown a good match with heavy-tailed distributions such as Pareto distribution or the power law [24]. Such systems are approximately scale-free or self-similar. Examples of these include social networks and collaboration networks [13], many kinds of computer networks, and the internet and the web graph of the World Wide Web [20]. Preferential attachment models have been proposed as mechanisms to explain such distributions but what follows is a new approach to the problem.

In a complex system, the branching lines from each node will be a random function [20]. When the branches are folded over, let the probabilities be represented by P(x_i) for the number of aggregated branches that can take values of i that vary from 1 to n. The entropy, H(X), of the system will be:

$$ H\left( X \right) = - \mathop \sum \limits_{i} P(x_{i} ) \log P\left( {x_{i} } \right). $$

(7)

In the best case, each of these branches will have the same probability and the mapping is most efficient. We have already seen from Eq. (3) that the optimal branching number is given by e.

For unconstrained random trees, the probability of events in ranked order is proportional to $ 1/n $, where n is the rank. If a counting process is uniformly distributed over the range {1, …, S}, with random values of S, then the sum satisfies the Newcomb–Benford Law [10], $ P\left( n \right) = log_{r} \left( {1 + \frac{1}{n}} \right) $, where n is the leading digit (n ε {1,2,…, r-1}). When r = e, we have:

$$ P\left( n \right) = { \ln }\left( {1 + \frac{1}{n}} \right) $$

(8)

$$ = \frac{1}{n + 1} + \frac{1}{{2\left( {n + 1} \right)^{2} }} + \frac{1}{{3\left( {n + 1} \right)^{3} }} + \frac{1}{{4\left( {n + 1} \right)^{4} }} + \ldots $$

(9)

If the higher order terms are ignored, we have

$$ P\left( n \right) \approx \frac{k}{n} $$

(10)

where the k in the numerator is introduced to satisfy the constraint that the probability should add to 1. This is Zipf’s distribution [2, 21, 25].

Definition

. Let a counting event be a function of some aggregated property in a random tree with r branches at each level.

Theorem 2

. The number of counts N(n + 1) at level n + 1 is rN(n).

Proof

. The proof is elementary because at each level there is a r-fold branching.

Now, consider optimal branching, which by Theorem 1 is e-fold. The total number of branches will be limited by constraints associated with the physical system, and let us call the maximum value of the count associated with the event of interest to be Max.

One can read the growing values directly or read them in reverse. In the direct expansion of the tree, we have starting with an initial count of A:

$$ A \to Ae \to Ae^{2} \to Ae^{3} \to \ldots \to {\text{Max}} $$

where Max is the maximum count associated with the variable. Written in the reverse order, the sequence is:

$$ {\text{Max}} \to {\text{Max}} e^{ - 1} \to {\text{Max}} e^{ - 2} \to {\text{Max}} e^{ - 3} \ldots $$

(11)

Let k be the variable associated with location of the values of expression (11), starting with 1 and then 2, 3,…

Since the count expands by a factor of e in each branching, it will be the highest at the nodes at the bottom of the tree. When traversing the tree backwards, the count at each step will be decreased by a factor of e. Therefore, we can assert:

Theorem 3

For optimal branching in natural systems, when the aggregated counts are written in ranked order, the probability at rank k + 1 is proportional to ke⁻¹.

This means that the scaling law is:

$$ p\left( k \right) \sim e^{{ - \left( {k - 1} \right)}} $$

(12)

which is a heavy-tailed distribution.

Experiments have shown (e.g., [4]) that many phenomena follow the following power law approximately for large values of k:

$$ p\left( k \right) \sim k^{ - \gamma } $$

(13)

where γ is a parameter whose value is typically in the range 2 < γ < 3. The main characteristic of this distribution is its heavy-tailed nature, and there is considerable variety in the nature of the law.

As shown in Table 5, the values of (12) and (13) evolve in a similar fashion. For comparison, we have chosen γ = 2.5.

Table 5 Comparison of two heavy-tailed distributions of (12) and (13)

Full size table

The values of the distribution in the new expression of (12) fall even faster than in (13), as it is an exponential function.

5 Saturated Power Law

Physical constraints will saturate the growth of the nodes down the random tree. If α is the saturation parameter (α < 1), then in the growth of the tree e will be replaced by αe. Finally, instead of formula (12) we obtain:

$$ p\left( k \right)\sim\left( {\alpha e} \right)^{{ - \left( {k - 1} \right)}} , \alpha < 1 $$

(14)

Equation (12) may be seen as a special case of (14) when $ \alpha = 1. $ The $ p\left( x \right) \sim e^{{ - \left( {k - 1} \right)}} $ as shown in Fig. 3.

The parameter in the saturated power law would vary based on the constraints associated with the natural system.

6 Conclusions

Some properties of the optimal base-e representation were described, and for integers an approximate representation without fractions that we call WF was introduced. Comparisons of the WF representation with those of bases-2 and 3 were made. Since trees are analogous to number representation, we explored the applicability of the base-e system to understanding complex system behavior. We showed that this provides a new explanation for the power law exhibited by a natural system.

The power law derived in this paper is heavy-tailed like the ones that have been widely discussed in the literature on complex systems.

References

R.L. Axtell, Zipf distribution of U.S. firm sizes. Science 293, 1818–1820 (2001)
V. Belevitch, On the statistical laws of linguistic distributions. Annales de la Société Scientifique de Bruxelles. I(73), 310–326 (1959)
MathSciNet MATH Google Scholar
G. Bergman, A base number system with an irrational base. Math. Mag. 31(2), 98–110 (1957)
Article MathSciNet Google Scholar
S. Clauset, C.R. Shalizi, M.E.J. Newman, Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009)
Article MathSciNet Google Scholar
M. Cristelli, M. Batty, L. Pietronero, There is more than a power law in Zipf. Sci. Rep. 2, 812 (2012)
Article Google Scholar
J. De Las Rivas, C. Fontanillo, Protein-protein interactions essentials: key concepts to building and analyzing interactome networks. PLoS Comput. Biol. 6(6), e1000807 (2010)
Article Google Scholar
L.C. Eggan, C. Vanden Eynden, Decimal expansions to nonintegral bases. Am. Math. Monthly 73, 576–582 (1966)
MathSciNet MATH Google Scholar
X. Gabaix, Zipf’s law for cities: an explanation. Quart. J. Econ. 114, 739–767 (1999)
Article Google Scholar
A. Gibbons, Algorithmic Graph Theory (Cambridge University Press, Cambridge, 1985)
MATH Google Scholar
T.P. Hill, A statistical derivation of the significant-digit law. Stat. Sci. 10, 354–363 (1995)
Article MathSciNet Google Scholar
T.P. Hill, The first-digit phenomenon. Am. Sci. 86, 358–363 (1998)
Article Google Scholar
B. Hayes, Third base. Am. Sci. 89, 490–494 (2001)
Article Google Scholar
C. Kadushin, Understanding Social Networks: Theories, Concepts, and Findings (Oxford University Press, Oxford, 2012)
Google Scholar
S. Kak, Generalized unary coding. Circuits Syst. Signal Process. 36, 1419–1426 (2016)
Article MathSciNet Google Scholar
S. Kak, Power series models of self-similarity in social networks. Inf. Sci. 376, 31–38 (2017)
Article Google Scholar
S. Kak, Variations on the Newcomb-Benford law. arXiv:1806.06695 (2018)
S. Kak, On ternary coding and three-valued logic. arXiv:1807.06419 (2018)
S. Kak, The Fractal Dimension of Space (Oklahoma State University, Stillwater, 2020)
Google Scholar
R. McEliece, The Theory of Information and Coding. Cambridge (2002)
M.E.J. Newman, Power laws, Pareto distributions and Zipf’s law. Contemp. Phys. 46(5), 323–351 (2005)
Article Google Scholar
D.M.W. Powers, Applications and explanations of Zipf’s law. Assoc. Comput. Linguist. 151–160 (1998)
C. Rousseau, The phi number system revisited. Math. Mag. 48(4), 283–284 (1995)
Article MathSciNet Google Scholar
J. Stiglitz, The price of complexity in financial networks. PNAS 113, 10031–10036 (2016)
Article MathSciNet Google Scholar
M.P.H. Stumpf, Critical truths about power laws. Science 335, 665–666 (2012)
Article MathSciNet Google Scholar
G.K. Zipf, Human Behavior and the Principle of Least Effort (Addison-Wesley, Cambridge, 1949)
Google Scholar

Download references

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK, 74078, USA
Subhash Kak

Authors

Subhash Kak
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Subhash Kak.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kak, S. The Base-e Representation of Numbers and the Power Law. Circuits Syst Signal Process 40, 490–500 (2021). https://doi.org/10.1007/s00034-020-01480-0

Download citation

Received: 04 March 2020
Revised: 02 June 2020
Accepted: 04 June 2020
Published: 25 June 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s00034-020-01480-0

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

The Base-e Representation of Numbers and the Power Law

Abstract

Similar content being viewed by others

Number-Theoretical Methods of Factoring Composite Numbers and Calculating the Discrete Logarithm

Univoque graphs for non-integer base expansions

On the Arithmetic of Knuth’s Powers and Some Computational Results About Their Density