In Chap. 9, we showed that the value of \(D_q\) for a given q depends in general on which minimal s-covering is selected, and that this ambiguity can be eliminated by using the unique lexico minimal summary vectors x(s). However, a significant ambiguity remains in computing \(D_q\), since Definition 9.3 refers to a range of s values over which approximate equality holds. Let this range be denoted by [L, U], where L < U. It is well known that, in general, the numerical value of any fractal dimension depends on the range of box sizes over which the dimension is computed. What had not been previously recognized is that for a complex network the choice of L and U can dramatically change the shape of the \(D_q\) vs. q curve: depending on L and U, the curve can be monotone increasing, or monotone decreasing, or even have both a local maximum and a local minimum [49]. Example 9.1 and Fig. 9.1 provided an example where the \(D_q\) vs. q plot is not monotone non-increasing, even for the simple case [L, U] = [2, 3]. This behavior stands in sharp contrast to the behavior of a geometric multifractal, for which it is known [20] that \(D_q\) is non-increasing in q.

Recalling that \(\log Z\big (x(s), q\big )\) for a complex network \(\mathbb {G}\) is defined by (9.4), one way to compute \(D_q\) for a given q is to determine a range \([L_q, U_q]\) of s over which \(\log Z\big (x(s), q\big )\) is approximately linear in \(\log s\), and then use (9.5) to estimate \(D_q\), e.g., using linear regression. With this approach, to report computational results to other researchers, it would be necessary to specify, for each q, the range of box sizes used to estimate \(D_q\). This is certainly not the protocol currently followed in research on generalized dimensions. Rather, the approach taken in [49] and [67] is to pick a single L and U and estimate \(D_q\) for all q with this L and U. Moreover, rather than estimating \(D_q\) using a technique such as regression over the range [L, U] of box sizes, [49] instead estimates \(D_q\) using only the two box sizes L and U. (As discussed in Chap. 5, such a two-point estimate was also used in [46], where it was shown that even for as simple a network as a one-dimensional chain, estimates obtained from regression do not behave well, and a two-point estimate has very desirable properties.)

With this two-point approach, the estimate of \(D_q\) is 1∕(q − 1) times the slope of the secant line connecting the points

$$\displaystyle \begin{aligned} \left( \log L, \log Z\big(x(L), q\big) \right) \;\;\; \text{and} \;\;\; \left( \log U, \log Z\big(x(U), q\big) \right) \, , \end{aligned}$$

where x(L) and x(U) are the lexico minimal summary vectors for box sizes L and U, respectively. Using (9.4) and (9.5), this secant estimate of \(D_q\), which we denote by \(D_q(L, U)\), is defined by

$$\displaystyle \begin{aligned} \begin{array}{rcl} D_q(L, U) &\displaystyle \equiv&\displaystyle \frac{ \log Z\big(x(U), q\big) - \log Z\big(x(L), q\big) }{ (q-1)\big( \log(U/\varDelta) - \log(L/\varDelta) \big) }\\ &\displaystyle =&\displaystyle \frac{1}{(q-1) \log(U/L)} \log \left( \frac{\sum_{B_{ j} \in {\mathcal{B}}(U)} [x_{ {j}}(U)]^q }{\sum_{B_{ j} \in {\mathcal{B}}(L)} [x_{ {j}}(L)]^q } \right) \, . {} \end{array} \end{aligned} $$
(10.1)
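
As a minimal sketch of how (10.1) might be evaluated, the Python function below computes the secant estimate from two summary vectors. The function name `secant_dq` and its argument names are illustrative choices, not notation from [49]; the test data are the chair network vectors x(2) = (2, 2, 1) and x(3) = (3, 2) quoted later in this chapter. The case q = 1 is simply rejected here, since (10.1) is undefined there and a limiting form would be needed.

```python
import math

def secant_dq(x_L, x_U, L, U, q):
    """Two-point (secant) estimate D_q(L, U), following Eq. (10.1).

    x_L and x_U are the box masses summarized by x(L) and x(U).
    The common factor N^q cancels in the ratio, so raw masses are used.
    """
    if q == 1:
        raise ValueError("Eq. (10.1) is undefined at q = 1")
    sum_L = sum(mass ** q for mass in x_L)
    sum_U = sum(mass ** q for mass in x_U)
    return math.log(sum_U / sum_L) / ((q - 1) * math.log(U / L))

# Chair network: x(2) = (2, 2, 1) and x(3) = (3, 2).
x2, x3 = (2, 2, 1), (3, 2)
for q in (0, 2, 5):
    print(q, secant_dq(x2, x3, 2, 3, q))
```

At q = 0 each sum reduces to the number of boxes, so the estimate reduces to \(\log \big ( B(L)/B(U) \big )/\log (U/L)\), the two-point box counting dimension.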
Example 10.1

Figure 10.1 plots box counting results for the dolphins network, which has 62 nodes, 159 arcs, and Δ = 8. This is a social network describing frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand [35]. For this network, and for all other networks described in this chapter, each lexico minimal summary vector x(s) was computed using Procedure 9.1 and the graph coloring heuristic described in [48]. Figure 10.1 shows that the \(\big ( -\log (s/\varDelta ), \log B(s) \big )\) curve is approximately linear for 2 ≤ s ≤ 6.

Fig. 10.1 Box counting for the dolphins network

Figure 10.2 plots \(\log Z\big (x(s), q\big )\) vs. \(\log (s/\varDelta )\) for 2 ≤ s ≤ 6 and for q = 2, 4, 6, 8, 10 (q = 2 is the top curve, and q = 10 is the bottom curve). Figure 10.2 shows that, although the \(\log Z\big (x(s), q\big )\) vs. \(\log (s/\varDelta )\) curves are less linear as q increases, a linear approximation is quite reasonable. Moreover, we are particularly interested in the behavior of the \(\log Z\big (x(s), q\big )\) vs. \(\log (s/\varDelta )\) curve for small positive q, the region where the linear approximation is best. Using (10.1), Fig. 10.3 plots the secant estimate \(D_q(L, U)\) vs. q for various choices of L and U. Since the \(D_q\) vs. q curve for a geometric multifractal is monotone non-increasing, it is remarkable that different choices of L and U lead to such different shapes for the \(D_q(L, U)\) vs. q curve for the dolphins network. □

Fig. 10.2 \(\log Z \big (x(s), q \big )\) vs. \(\log (s/\varDelta )\) for the dolphins network

Fig. 10.3 Secant estimate of \(D_q\) for the dolphins network for different (L, U)

Let \(D^{\hspace {0.1em} \prime }_0(L,U)\) denote the first derivative with respect to q of the secant estimate \(D_q(L, U)\), evaluated at q = 0. A simple closed-form expression for \(D^{\hspace {0.1em} \prime }_0(L,U)\) is derived in [49]. For box size s, let \(x(s) = \sum {\mathcal {B}}(s)\) be lexico minimal. Define

$$\displaystyle \begin{aligned} \begin{array}{rcl} {G}(s) &\displaystyle \equiv&\displaystyle \left( \prod_{j=1}^{B(s)} x_{ {j}}(s) \right)^{\negthinspace \negthinspace 1/B(s)} \\ {A}(s) &\displaystyle \equiv&\displaystyle \frac{1}{B(s)}\sum_{j=1}^{B(s)} x_{ {j}}(s) \\ {R}(s) &\displaystyle \equiv&\displaystyle \frac{G(s)}{A(s)} {} \end{array} \end{aligned} $$
(10.2)

so G(s) is the geometric mean of the box masses summarized by x(s), A(s) is the arithmetic mean of the box masses summarized by x(s), and R(s) is the ratio of the geometric mean to the arithmetic mean. By the classic arithmetic-geometric inequality, for each s we have R(s) ≤ 1. Since \(\sum _{j=1}^{B(s)} x_{ {j}}(s) = N\), we have B(s) A(s) = N. Theorems 10.1 and 10.2 below are proved in [49].

Theorem 10.1
$$\displaystyle \begin{aligned} D^{\hspace{0.1em} \prime}_0(L,U) = \frac{1}{\log (U/L)} \log \frac{ {R}(L)}{ {R}(U)} \, . \end{aligned}$$

Theorem 10.1 says that the slope of the secant estimate of \(D_q\) at q = 0 depends on x(L) and x(U) only through the ratio of the geometric mean to the arithmetic mean of the components of x(L), and similarly for x(U). Since L < U, Theorem 10.1 immediately implies the following corollary.

Corollary 10.1

\(D^{\hspace {0.1em} \prime }_0(L,U) > 0\) if and only if R(L) > R(U), and \(D^{\hspace {0.1em} \prime }_0(L,U) < 0\) if and only if R(L) < R(U). □
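
Theorem 10.1 can be checked numerically on a small case. The sketch below, whose function names are illustrative and not taken from [49], computes R(s) from (10.2), evaluates the closed form of Theorem 10.1, and compares it with a central finite difference of the secant estimate (10.1) around q = 0. The data are the chair network vectors x(2) = (2, 2, 1) and x(3) = (3, 2); the step size `h` is an arbitrary choice.

```python
import math

def mean_ratio(x):
    """R(s) = G(s) / A(s) for the box masses in x, as in Eq. (10.2)."""
    num_boxes = len(x)
    log_geometric_mean = sum(math.log(mass) for mass in x) / num_boxes
    arithmetic_mean = sum(x) / num_boxes
    return math.exp(log_geometric_mean) / arithmetic_mean

def slope_at_zero(x_L, x_U, L, U):
    """Closed form of Theorem 10.1 for the derivative of D_q(L, U) at q = 0."""
    return math.log(mean_ratio(x_L) / mean_ratio(x_U)) / math.log(U / L)

def secant_dq(x_L, x_U, L, U, q):
    """Secant estimate D_q(L, U) from Eq. (10.1), for q != 1."""
    sum_L = sum(mass ** q for mass in x_L)
    sum_U = sum(mass ** q for mass in x_U)
    return math.log(sum_U / sum_L) / ((q - 1) * math.log(U / L))

def finite_difference_slope(x_L, x_U, L, U, h=1e-6):
    """Central difference approximation of the derivative at q = 0."""
    upper = secant_dq(x_L, x_U, L, U, h)
    lower = secant_dq(x_L, x_U, L, U, -h)
    return (upper - lower) / (2 * h)

closed_form = slope_at_zero((2, 2, 1), (3, 2), 2, 3)
numeric = finite_difference_slope((2, 2, 1), (3, 2), 2, 3)
print(closed_form, numeric)
```

For these vectors R(2) < R(3), so by Corollary 10.1 the slope at q = 0 is negative, and both computed values agree with that sign.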

For a given L and U, Theorem 10.2 below provides a sufficient condition for \(D_q(L, U)\) to have a local maximum or minimum.

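Theorem 10.2

(i) If R(L) > R(U) and

$$\displaystyle \begin{aligned} D_0(L,U) > D_{\infty}(L,U) \, , \end{aligned}$$

then \(D_q(L, U)\) has a local maximum at some q > 0. (ii) If R(L) < R(U) and

$$\displaystyle \begin{aligned} D_0(L,U) < D_{\infty}(L,U) \, , \end{aligned}$$

then \(D_q(L, U)\) has a local minimum at some q > 0. □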

Example 10.2

To illustrate Theorem 10.2, consider the dolphins network of Example 10.1 with L = 3 and U = 5. We have B(3) = 13 and B(5) = 4, so \(D_0 = \log (13/4)/\log (5/3) \approx 2.307\). Also, the largest box mass is 10 for s = 3 and 28 for s = 5, so by (9.8) we have \(D_{\infty } \approx \log (28/10)/\log (5/3) \approx 2.016\). We have R(3) ≈ 0.773, R(5) ≈ 0.660, and \(D^{\hspace {0.1em} \prime }_0(L,U) \approx 0.311\). Hence \(D_q(3, 5)\) has a local maximum, as seen in Fig. 10.3. Moreover, for the dolphins network, choosing L = 2 and U = 5 we have \(D_0 = \log (29/4)/\log (5/2) \approx 2.16\), and \(D_{\infty } \approx \log (28/3)/\log (5/2) \approx 2.44\), so \(D_0 < D_{\infty }\), as is evident from Fig. 10.3. Thus the inequality \(D_0 \geq D_{\infty }\), which is valid for geometric multifractals, does not hold for the dolphins network with L = 2 and U = 5. □
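
The arithmetic in Example 10.2 is easy to reproduce. The sketch below recomputes the two-point values of \(D_0\) and \(D_{\infty }\) from the box counts and from the ratios 28/10 and 28/3 that appear in the example. The helper names are illustrative, and the assumption that those ratios are the ratios of largest box masses is inferred from the way (9.8) is applied in the example.

```python
import math

def d_zero(B_L, B_U, L, U):
    """Two-point box counting dimension from the box counts B(L) and B(U)."""
    return math.log(B_L / B_U) / math.log(U / L)

def d_infinity(largest_mass_ratio, L, U):
    """Two-point D_infinity from the ratio used in the example's log argument."""
    return math.log(largest_mass_ratio) / math.log(U / L)

# Dolphins network with (L, U) = (3, 5): B(3) = 13, B(5) = 4, ratio 28/10.
d0_35 = d_zero(13, 4, 3, 5)
dinf_35 = d_infinity(28 / 10, 3, 5)

# Dolphins network with (L, U) = (2, 5): B(2) = 29, B(5) = 4, ratio 28/3.
d0_25 = d_zero(29, 4, 2, 5)
dinf_25 = d_infinity(28 / 3, 2, 5)

print("(3, 5):", d0_35, dinf_35, "D_0 > D_inf:", d0_35 > dinf_35)
print("(2, 5):", d0_25, dinf_25, "D_0 > D_inf:", d0_25 > dinf_25)
```

The two ranges give opposite orderings of \(D_0\) and \(D_{\infty }\), which is the contrast the example draws.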

If for s = L and s = U we can compute a minimal s-covering with equal box masses, then \(\mathbb {G}\) is a monofractal but not a multifractal. To see this, suppose all boxes in \({\mathcal {B}}(L)\) have the same mass, and that all boxes in \({\mathcal {B}}(U)\) have the same mass. Then for s = L and s = U we have \(x_j(s) = N/B(s)\) for 1 ≤ j ≤ B(s), and (9.1) yields

$$\displaystyle \begin{aligned} Z\big(x(s), q \big) = \sum_{B_{ j} \in {\mathcal{B}}(s)} \left(\frac{x_{ {j}}(s)}{N} \right)^q = \sum_{B_{ j} \in {\mathcal{B}}(s)} \left( \frac{1}{B(s)} \right) ^{q} = [B(s)]^{1-q} \, . \end{aligned}$$

From (9.5 ), for q ≠ 1 we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} D_q &\displaystyle =&\displaystyle \frac{\log Z\big( x(U), q \big)- \log Z\big(x(L), q \big)} {(q-1)\big( \log U - \log L \big)} \; = \; \frac{ \log \big( [B(U)]^{1-q} \big) - \log \big( [B(L)]^{1-q} \big)} {(q-1) \big( \log U - \log L \big)} \\ &\displaystyle =&\displaystyle \frac{ \log B(L) - \log B(U)} {\log U - \log L} \; = \; D_0 \; = \; d_{ {B}} \, , {} \end{array} \end{aligned} $$
(10.3)

so equal box masses imply that \(\mathbb {G}\) is a monofractal, the simplest of all fractal structures.
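
The collapse of \(D_q\) to a single value under equal box masses can be seen on a toy case. The sketch below uses hypothetical data, not a network from this chapter: N = 12 nodes, four boxes of mass 3 at L = 2, and two boxes of mass 6 at U = 4. The function `secant_dq` is an illustrative implementation of the two-point formula (10.1).

```python
import math

def secant_dq(x_L, x_U, L, U, q):
    """Secant estimate D_q(L, U) from Eq. (10.1), for q != 1."""
    sum_L = sum(mass ** q for mass in x_L)
    sum_U = sum(mass ** q for mass in x_U)
    return math.log(sum_U / sum_L) / ((q - 1) * math.log(U / L))

# Hypothetical equal-mass coverings of a 12-node network.
N, L, U = 12, 2, 4
x_L = [N // 4] * 4   # B(L) = 4 boxes, each of mass 3
x_U = [N // 2] * 2   # B(U) = 2 boxes, each of mass 6

# Two-point box counting dimension, the last line of Eq. (10.3).
d_box = math.log(len(x_L) / len(x_U)) / math.log(U / L)

values = [secant_dq(x_L, x_U, L, U, q) for q in (-3, 0, 0.5, 2, 7)]
print(d_box, values)
```

Every entry of `values` matches `d_box` up to floating-point rounding, as (10.3) predicts.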

There are several ways to try to obtain equal box masses in a minimal s-covering of \(\mathbb {G}\). As discussed in Chap. 8, ambiguity in the choice of minimal coverings used to compute \(d_I\) is eliminated by maximizing entropy. Since the entropy of a probability distribution is maximized when all the probabilities are equal, a maximal entropy minimal covering equalizes (to the extent possible) the box masses. Similarly, as discussed in Chap. 9, ambiguity in the choice of minimal s-coverings used to compute \(D_q\) is eliminated by minimizing the partition function \(Z_q \big ( {\mathcal {B}}(s) \big )\). Since for all sufficiently large q the lexico minimal vector x(s) summarizes the s-covering that minimizes \(Z_q \big ( {\mathcal {B}}(s) \big )\), and since for q > 1 a partition function is minimized when all the probabilities are equal, x(s) also equalizes (to the extent possible) the box masses. Theorem 10.1 suggests a third way to try to equalize the masses of all boxes in a minimal s-covering: since G(s) ≤ A(s) and G(s) = A(s) when all boxes have the same mass, a minimal s-covering that maximizes G(s) will also equalize (to the extent possible) the box masses. The advantage of computing the lexico minimal summary vectors x(s), rather than maximizing the entropy or maximizing G(s), is that, by Theorem 9.1, the summary vector x(s) is unique.
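
To make the three criteria concrete, the sketch below evaluates each of them on two hypothetical summary vectors with the same number of boxes and the same total mass, (3, 3, 2) and (4, 3, 1). These vectors are invented for illustration and are not coverings of any network in this chapter; the function names are likewise illustrative. The partition function follows the form \(\sum_j (x_j/N)^q\) used in this chapter.

```python
import math

def entropy(x):
    """Shannon entropy of the box probabilities x_j / N."""
    total = sum(x)
    return -sum((mass / total) * math.log(mass / total) for mass in x)

def partition(x, q):
    """Partition function: sum over boxes of (x_j / N)^q."""
    total = sum(x)
    return sum((mass / total) ** q for mass in x)

def geometric_mean(x):
    """G(s) from Eq. (10.2)."""
    return math.exp(sum(math.log(mass) for mass in x) / len(x))

balanced = (3, 3, 2)   # hypothetical covering with nearly equal masses
skewed = (4, 3, 1)     # hypothetical covering with the same N and B

for name, x in (("balanced", balanced), ("skewed", skewed)):
    print(name, entropy(x), partition(x, 2), partition(x, 20), geometric_mean(x))
```

On this pair all three criteria prefer the more balanced vector: it has the larger entropy, the smaller partition function for the tested q > 1, and the larger geometric mean.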

We now apply Theorem 10.1 to the chair network, to the dolphins network, and to a jazz network.

Example 10.3

For the chair network of Fig. 8.2 we have L = 2, x(L) = (2, 2, 1), U = 3, and x(U) = (3, 2). We have \(D^{\prime }_0(2,3) \approx -0.070\), as shown in Fig. 9.1 by the slightly negative slope of the lower curve at q = 0. As mentioned above, this curve is not monotone non-increasing; it has a local minimum. □

Example 10.4

For the dolphins network studied in Example 10.1, Table 10.1 provides \(D^{\prime }_0(L,U)\) for various choices of L and U. The values in Table 10.1 are better understood using Fig. 10.4, which plots \(\log R(s)\) vs. \(\log s\). For example, for (L, U) = (2, 6) we have \(D^{\prime }_0(2,6) = \log \big ( R(2)/R(6) \big )/\log (6/2) \approx -0.056\), as illustrated by the slightly positive slope of the dashed red line in Fig. 10.4, since the slope of the dashed red line is \(-D^{\prime }_0(2,6)\). For the other choices of (L, U) in Table 10.1, the values of \(D^{\prime }_0(L,U)\) are positive and roughly equal. Figure 10.2 visually suggests that \(\log Z\big (x(s), q\big )\) is better approximated by a linear fit over s ∈ [2, 5] than over s ∈ [2, 6], and Fig. 10.4 clearly shows that s = 6 is an outlier in that using U = 6 dramatically changes \(D^{\prime }_0(L,U)\). □

Fig. 10.4 \(\log R(s)\) vs. \(\log s\) for the dolphins network

Table 10.1 \(D^{\prime }_0(L,U)\) for the dolphins network
Example 10.5

The jazz network, with 198 nodes, 2742 arcs, and diameter 6, is a collaboration network of jazz musicians [19]. Figure 10.5 shows the results of box counting; the curve appears reasonably linear for s ∈ [2, 6]. Figure 10.5 also plots \(D_q(L, U)\) vs. q for four choices of L and U. Table 10.2 provides \(D^{\prime }_0(L,U)\), \(D_0\), and \(D_{\infty }\) for nine choices of L and U; the rows are sorted by decreasing \(D^{\prime }_0(L,U)\). It is even possible for the \(D_q(L, U)\) vs. q curve to exhibit both a local maximum and a local minimum: for the jazz network with L = 4 and U = 5, there is a local minimum at q ≈ 0.7 and a local maximum at q ≈ 12.8. Figure 10.6 plots \(\log R(s)\) vs. \(\log s\) for the jazz network. □

Fig. 10.5 Jazz box counting (left) and \(D_q\) vs. q for various L and U (right)

Fig. 10.6 \(\log R(s)\) vs. \(\log s\) for the jazz network

Table 10.2 Results for the jazz network for various L and U

These results, together with the results in [47, 48], show that two requirements should be met when reporting fractal dimensions of a complex network. First, since there are in general multiple minimal s-coverings, and these different coverings can yield different values of \(D_q\), computational results should specify the rule (e.g., a maximal entropy covering, or a covering yielding a lexico minimal summary vector) used to unambiguously select a minimal s-covering. Second, the lower bound L and upper bound U on the box sizes used to compute \(D_q\) should be reported. Published values of \(D_q\) not meeting these two requirements cannot in general be considered benchmarks. As to the values of L and U yielding the most meaningful results, it is desirable to identify the largest range [L, U] over which \(\log Z\) is approximately linear in \(\log s\); this is a well-known principle in the estimation of fractal dimensions. Future research may uncover, based on the \(\log R(s)\) vs. \(\log s\) curve, other criteria for selecting L and U.
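
One possible way to operationalize "the largest range over which \(\log Z\) is approximately linear in \(\log s\)" is sketched below. It scans all contiguous ranges of box sizes and keeps the widest one whose least-squares fit has a coefficient of determination above a threshold. The threshold `min_r2`, the minimum of three points, and the synthetic data are all assumptions of this sketch; the text does not prescribe a specific linearity test.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 1.0  # degenerate case: treat constant data as perfectly fit
    return sxy * sxy / (sxx * syy)

def widest_linear_range(sizes, log_z, min_r2=0.99, min_points=3):
    """Return (L, U) for the widest contiguous range with R^2 >= min_r2, or None."""
    log_s = [math.log(s) for s in sizes]
    best = None
    for i in range(len(sizes)):
        for j in range(i + min_points - 1, len(sizes)):
            if r_squared(log_s[i:j + 1], log_z[i:j + 1]) >= min_r2:
                if best is None or (j - i) > (best[1] - best[0]):
                    best = (i, j)
    return None if best is None else (sizes[best[0]], sizes[best[1]])

# Synthetic data: exactly linear in log s for s = 2..5, with an outlier at s = 6.
sizes = [2, 3, 4, 5, 6]
log_z = [-1.5 * math.log(s) for s in sizes]
log_z[-1] -= 1.0
selected = widest_linear_range(sizes, log_z)
print(selected)
```

Ties between ranges of equal width are broken in favor of the first one found. Whatever rule is used, the selected L and U would still need to be reported alongside the computed dimensions.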