Abstract
We find a simple, closed formula for the proportion of vertices which are k-protected in all unlabeled rooted plane trees on n vertices. We also find that, as n goes to infinity, the average rank of a random vertex in a tree of size n approaches 0.727649, and the average rank of the root of a tree of size n approaches 1.62297.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Unlabeled rooted plane trees are one of the simplest tree structures. They are used to model any network with a point of origin. We will ask the question, if you start at a leaf and take only upward steps, how far might we expect to go to reach a given vertex?
The modern world is full of different kinds of networks. In each network it could be advantageous or disadvantageous to have highly protected vertices. For example, in many online communities, people must be invited in order to participate. If we were to create a tree of users where each person is connected to the person who invited them, then each time a user adds a member they once again become 1-protected. Thus a high level of protection most likely indicates that someone is not regularly bringing in new members.
In a network that must remain secure, it is desirable to make it difficult to reach the root, so it is preferable for the root to be highly protected (another question we will address directly), but we also would like nodes in general to have a high level of protection. For example, in a computer network with different levels of access, even if the network is very tall (in other words, the lowest entry point is many steps from the root), if it has a leaf at a high level, the root will still be far more accessible than the height of the network would imply.
A vertex is k-protected if it is at least k upward steps from any leaf. In Fig. 1, each vertex is labeled by the highest k for which it is k-protected, this k is also known as the rank. For example, a vertex that is of rank four is 4-protected, 3-protected, etc. The number of k-protected vertices was explored for unlabeled rooted plane trees with \(k=2\) by Cheon and Shapiro [4]. The topic of protected vertices has been examined for random recursive trees by Mahmoud and Ward [8], k-ary trees by Mansour [9], binary search trees by Bóna [1] and Bóna and Pittel [3], random phylogenetic trees by Bóna and Flajolet [2], several types of random trees by Devroye and Janson [5], and digital search trees by Du and Prodinger [6].
Throughout this paper any reference to a tree will specifically mean an unlabeled rooted plane tree.
We will also freely use a few facts: the number of such trees on n vertices, denoted as \(t(n)=\frac{1}{n} {2n-2 \atopwithdelims ()n-1}\) is the \((n-1)\)st Catalan number, and the ordinary generating function of the number of such trees on n vertices is \(\sum _{n=1}^\infty t(n) x^n=T(x)= \frac{1- \sqrt{1-4x}}{2}\). The number of all vertices of trees of size n, denoted as \(v(n)= {2n-2 \atopwithdelims ()n-1}\), is the \((n-1)\)st central binomial coefficient, and they have the ordinary generating function \(\sum _{n=1}^\infty v(n)x^n=V(x) = \frac{x}{\sqrt{1-4x}}\).
2 Grafting
Grafting is a process in horticulture where a branch from one tree is removed and a branch from a different tree is inserted in its place. The following lemma does something very similar.
Lemma 1
Let \(R_k(x)=\sum _{n =1}^\infty r_k(n)\, x^n\) be the ordinary generating function for \(r_k(n)\), the number of rooted plane trees on n vertices whose root is k-protected. Let \(L(x)=\sum _{n=0}^\infty l(n)\, x^n\) be the ordinary generating function for l(n), the number of leaves in all rooted plane trees of size \(n+1\) (or, equivalently, all rooted plane trees with n edges). Let \(T_k(x)= \sum _{n=1}^\infty t_k(n)\, x^n\) be the ordinary generating function for \(t_k(n)\), the number of k-protected vertices in all trees of size n. Then \(T_k(x)=L(x) \cdot R_k(x)\).
Proof
Let \(0<m \le n\). The number of k-protected vertices on a tree with n vertices can be counted by choosing a tree on \(n-m+1\) vertices, removing a specific leaf, then replacing that leaf with a tree whose root is k-protected. A leaf can be chosen in one of \(l(n-m)\) ways and the tree in \(r_k(m)\) ways, so the result follows by the product formula. \(\square \)
To apply this, of course, we need L(x). Let \(T(x, y)= \sum _{n=0}^\infty t_{n, m}x^n y^m\) where \(t_{n, m}\) is the number of trees on n vertices with m leaves. The generating function T(x, y) satisfies the functional equation \(T(x, y)= xy+x\left( \frac{T(x, y)}{1-T(x, y)} \right) \), and solving for T(x, y) yields \(T(x, y)=\frac{1}{2} (1 - x + x y - \sqrt{-4 x y + (-1 + x - x y)^2})\). The function we want, L(x), will be equal to \(\left. \frac{ \partial }{\partial y} T(x, y) \right| _{y=1}\). Thus
To find expressions for \(R_k(x)\), we first observe that since a tree with a 1-protected root is simply a non-empty sequence of trees, then \(R_1(x)=\frac{xT(x)}{1-T(x)}\) where \(T(x)=\frac{1- \sqrt{1-4x}}{2}\) giving \(R_1(x)=\frac{1-2x-\sqrt{1-4x}}{2}.\) Further, this can be iterated since a root is k-protected if and only if it is a non-empty sequence of trees whose roots are \((k-1)\)-protected. Thus the recursion \(R_k(x) =\) \(x \cdot \frac{R_{k-1}(x)}{1-R_{k-1}(x)}\) holds.
Theorem 1
For all \(k \ge 2\),
where \(n_k(x)\) and \(d_k(x)\) are polynomials defined as follows: for all \(k \ge 2\), \(n_k(x)= 1 - 2x -2x^2 - \cdots - 2x^k,\) \(d_2=2+x\), and for all \(k \ge 3\)
For some numerical justification, this gives the series expansions \(R_2(x) = x^3 + 2x^4+6x^5+18x^6+\cdots \) and \(R_3(x) = x^4+2x^5+6x^6+\cdots \), each of which are accurate up to trees of size 6 by examination. To prove this theorem we will need some purely computational lemmas.
Lemma 2
For all \(k \ge 2\),
Proof
We will proceed by induction on k.
For the base case, if \(k=2\), then
For the induction step, we assume that the statement holds for \(d_k(x)\), so from (3),
\(\square \)
Lemma 3
If \(k \ge 2\), then
Proof
The expansion of \(n^2_k(x)\) splits nicely into 3 parts as follows:
The first two terms will be \(1-4x\) for all \(k \ge 1\).
For all \(1 < i \le k\) we will have a two copies of \(-2x^i\) and \(i-1\) copies of \(4x^i\) giving a net total of \((i-2)x^i\). This means there will be no \(x^2\) term.
For all \( k< i \le 2k\) we will have no negative terms, and for each term we will have \(2k-1-i\) copies of \(4x^i\) giving us
\(\square \)
Proof (of Theorem 1)
We once again proceed by induction on k.
For the base case, if \(k=2\), then, after clearing denominators and multiplying by the conjugate of the denominator,
Now, assuming that the statement holds for \(R_k(x)\), and after clearing denominators and multiplying by the conjugate of the denominator,
\(\square \)
3 Asymptotics
We will say that \(a_n \sim b_n\) if \({\lim _{n \rightarrow \infty } \frac{a_n}{b_n} =1}\).
Theorem 2
(Bender’s Lemma [7]) Suppose that \(A(z)=\sum a_n z^n\) and \(B(z)=\sum b_n z^n\) are power series with radii of convergence \(\alpha > \beta \ge 0\), respectively. Suppose \(b_{n-1}/b_n\) approaches a limit b as n approaches infinity. If \(A(b) \ne 0\), then \(c_n \sim A(b)b_n\), where \(\sum c_nz^n=A(z)B(z)\).
Lemma 4
For all \(k \ge 1\), \(d_k(x)\) is never zero on the closed disk of radius \(\frac{4}{15}\).
Proof
Let \(|x| \le \frac{4}{15}\). Then \(|d_1(x)|=1 > 0\), \(|d_2(x)| \ge 2 - |x| >0\), and
Observe that, for \(k \ge 2\), \(|d_k(x)-1| <\left| x^{k-2} + \sum _{k=1}^\infty (k+1)x^k \right| \). Thus
If \(k \ge 4\), then \(\left( \frac{4}{15} \right) ^{k-2} < \frac{31}{196}\), so that \(|d_k(x)-1| <1\). Hence \(|d_k(x)| \ge 1 - |d_k(x)-1| >0\) for all \(k \ge 4.\) \(\square \)
Theorem 3
Let \(t_k(n)\) be the number of vertices which are k-protected in all rooted plane trees of size of n. Then \(t_k(n) \sim \frac{3v(n)}{(4^k+2)}=\frac{3{2n -2\atopwithdelims ()n-1}}{(4^k+2)}\).
Proof
Note that \(\frac{x}{\sqrt{1-4x}}\) is the generating function for the number of all vertices of all trees of size n. We have that
Since \(d_k(x)\) is non-zero on the closed disk of radius 4/15, it follows that the term on the right is asymptotically irrelevant and that \(\frac{x^{k-3}(n_k(x)-1+4x)}{4d_k(x)}\) has a radius of convergence larger than 1/4 for all k. Thus we can apply Bender’s Lemma to the term on the left for any k, giving
Since \(n_{k+1}(1/4)=n_k(1/4)-2(1/4)^{k+1}\) and \(n_1(1/4)=3/4\), we have \(n_k(1/4)=\frac{2+4^k}{3 \cdot 4^k}\). Applying Lemma 3,
so that \(d_k(1/4)=16 n_k^2(1/4)\). Thus
\(\square \)
Corollary 1
Let \(p_k(n)\) be the probability that a random vertex in a random rooted plane tree of size n is k-protected. Then \(p_k(n) \sim \frac{3}{4^k+2}\).
Proof
To find the probability, we divide by \({2n -2\atopwithdelims ()n-1}\). \(\square \)
This gives the sequence of values 1, 1/2, 1/6, 1/22, 1/86,..., and also shows that as we progress to a higher level of protection, we lose about 1/4 of the vertices each time. Letting \(k=2,\) we re-obtain the result by Cheon and Shapiro [4].
For numerical justification, the proportion of 3-protected vertices in trees of size 50 is
and \(\frac{1}{22} =0.0{\overline{45}}\).
There are some other interesting questions which can be answered using \(R_k(x)\). The height of a tree is defined as the longest path from the root to a leaf. The function \(R_k(x)\) enumerates instead by the shortest path from the root to a leaf.
Theorem 4
Let \(r_k(n)\) be the number of trees on n vertices whose root is k-protected. Then
where t(n) denotes the number of trees on n vertices, which is the \((n-1)\)st Catalan number.
Proof
We have that
The term on the right is once again asymptotically irrelevant, so we have
Thus we have
\(\square \)
This gives the sequenced of values 1, 1, 4/9, 16/121, 64/1849,..., and once again shows that we lose about 1/4 of the trees each time we progress to a higher level of protection.
Recall that the rank of a vertex is the greatest k for which it is k-protected, or alternatively, the length of the shortest downward path from a vertex to a leaf.
Corollary 2
The number of vertices of rank k in trees of size n approaches the value \(\frac{ 9 {2n -2\atopwithdelims ()n-1}}{10+4^{1-k}+4^{1+k}}\).
Proof
It follows immediately from the definition that a vertex has rank k if and only if it is k-protected but not \((k+1)\)-protected, so by Theorem 3, the number of vertices of rank k approaches
\(\square \)
4 Expectations
Theorem 5
Let \(X_{r,n}\) be the random variable whose value is the rank of the root of a tree and whose sample space is the set of trees of size n. Then
Theorem 6
Let \(X_{v,n}\) be the random variable whose value is the rank of a vertex and whose sample space is the set of vertices in all trees of size n. Then
Proof (of Theorem 5)
Let r(n) be the sum of the ranks of the roots of all trees of size n. Then \({\mathbb {E}}[X_{v, n}] = \lim _{n \rightarrow \infty } \frac{r(n)}{t(n)}\). To find r(n), we need only consider ranks of up to \(n-1\), since there are no vertices of rank n or more in trees of size n. Thus \(r(n)= \sum _{k=1}^{n-1} [x^n] R_k(x)\), since the trees with roots of rank one are counted once by \(R_1(x)\), the trees with roots of rank two are counted once by \(R_1(x)\) and once by \(R_2(x)\), and similarly the trees with roots of rank n are counted once by each \(R_k(x)\) for \(1 \le k \le n-1\). Let \(E_R(x)=\sum _{n=1}^\infty r(n) x^n\). Since the \(R_k(x)\) have no terms of degree lower than k, their sum converges as a formal power series. It follows that
The sums \(\sum _{k=1}^\infty \frac{x^{k-2}}{d_k(x)}\) and \(\sum _{k=1}^\infty \frac{\sum _{i=1}^{k-2}x^{i+k-2}}{d_k(x)}\) both converge absolutely on the closed disk of radius 1/4, so we can split the sum as follows:
Since the coefficient of T(x) and the double sum on the right both converge absolutely to bounded, analytic functions on the closed disk of radius 5/14, it follows that the term on the right remains asymptotically irrelevant, and that we may apply Bender’s Lemma to the term on the left, and the result follows. \(\square \)
Proof (of Theorem 6)
The proof is similar to that of Theorem 5 with the roles of \(R_k(x)\) and T(x) replaced by \(T_k(x)\) and V(x), respectively. \(\square \)
For numerical justification, we may compute the nth coefficient using only the \((n-1)\)st partial sum, since there are no vertices or roots which are n-protected in a tree of size n. We have that
and
References
Bóna, M.: k-Protected vertices in binary search trees. Adv. Appl. Math. 53, 1–11 (2014)
Bóna, M., Flajolet, P.: Isomorphism and symmetries in random phylogenetic trees. J. Appl. Probab. 46, 1005–1019 (2009)
Bona, M., Pittel, B.: On a random search tree: asymptotic enumeration of vertices by distance from leaves. J. Appl. Probab. (to appear)
Cheon, G.-S., Shapiro, L.W.: Protected points in ordered trees. Appl. Math. Lett. 21, 516–520 (2008)
Devroye, L., Janson, S.: Protected nodes and fringe subtrees in some random trees. Electron. Commun. Probab. 19, 1–10 (2014)
Du, R.R.X., Prodinger, H.: On protected nodes in digital search trees. Appl. Math. Lett. 25, 1025–1028 (2012)
Flajolet, P., Sedgewick, R.: Analytic combinatorics. Cambridge University Press, Cambridge (2009)
Mahmoud, H.M., Ward, M.D.: Asymptotic properties of protected nodes in random recursive trees. J. Appl. Probab. 52, 290–297 (2015)
Mansour, T.: Protected points in k-ary trees. Appl. Math. Lett. 24, 478–480 (2011)
Acknowledgements
I am grateful to Miklos Bóna for guidance on the ins and outs of the submission process, as well as careful reading of the manuscript; my wife, Jaclyn van Wingerden for her many readings and constant support; Arnold Knopfmacher and Aubrey Blecher for their very helpful comments; and the referees for helpful suggestions.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Copenhaver, K. k-Protected Vertices in Unlabeled Rooted Plane Trees. Graphs and Combinatorics 33, 347–355 (2017). https://doi.org/10.1007/s00373-017-1772-9
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00373-017-1772-9