Abstract
While spherical data arises in many contexts, including in directional statistics, the current tools for density estimation and population comparison on spheres are quite limited. Popular approaches for comparing populations (on Euclidean domains) mostly involve a two-step procedure: (1) estimate probability density functions (pdfs) from their respective samples, most commonly using the kernel density estimator, and (2) compare pdfs using a metric such as the \(\mathbb {L}^{2}\) norm. However, both the estimated pdfs and their differences depend heavily on the chosen kernels, bandwidths, and sample sizes. Here we develop a framework for comparing spherical populations that is robust to these choices. Essentially, we characterize pdfs on spherical domains by quantifying their smoothness. Our framework uses a spectral representation, with densities represented by their coefficients with respect to the eigenfunctions of the Laplacian operator on the sphere. The change in smoothness, akin to using different kernel bandwidths, is controlled by exponential decay of the coefficient values. We then derive a proper distance for comparing pdf coefficients while equalizing smoothness levels, negating the influence of sample size and bandwidth. This enables fair and meaningful comparisons of populations despite vastly different sample sizes, and leads to robust, improved performance. We demonstrate this framework using examples of variables on \(\mathbb {S}^{1}\) and \(\mathbb {S}^{2}\), and evaluate its performance using a number of simulations and real data experiments.
Appendix
1.1 Proof that \(S_{\kappa }\) is an Orthogonal Section
A subset \(S_{\kappa }\) of \(\mathbb {R}^{N + 1}\) (the coefficient representation of densities) is an orthogonal section under the action of the group \(\mathbb {R}\) (defined in Eq. 2.3 in the main paper) if: (i) one and only one element of every orbit \([\boldsymbol {c}]\) in \(\mathbb {R}^{N + 1}\) is present in \(S_{\kappa }\), and (ii) the set \(S_{\kappa }\) is perpendicular to every orbit at the point of intersection. The second property means that if \(S_{\kappa }\) intersects an orbit \([\boldsymbol {c}]\) at \(\tilde {\boldsymbol {c}}\), then \(T_{\tilde {\boldsymbol {c}}}(S_{\kappa }) \perp T_{\tilde {\boldsymbol {c}}}([\boldsymbol {c}])\). We verify the two properties in turn. (1) The function \(t \mapsto {\sum }_{n} e^{-2\lambda _{n} t} \lambda _{n} {c_{n}^{2}}\) is strictly monotonically decreasing, with range \((0, +\infty )\). Thus, for any \(\boldsymbol {c} \in \mathbb {R}^{N + 1}\) and \(\kappa > 0\), there exists a unique \(t^{*}\) such that \({\sum }_{n} e^{-2\lambda _{n} t^{*}} \lambda _{n} {c_{n}^{2}} = \kappa \), i.e., exactly one element of the orbit \([\boldsymbol {c}]\) lies in \(S_{\kappa }\). (2) At any point \(\boldsymbol {c} \in S_{\kappa }\), the space normal to \(S_{\kappa }\) (inside \(\mathbb {R}^{N}\); note that \(\lambda _{0} = 0\)) is the one-dimensional space spanned by the vector \(\textbf {n}_{\boldsymbol {c}} = (\lambda _{1} c_{1}, \lambda _{2} c_{2}, \dots , \lambda _{N} c_{N})\). Let \(\textbf {u}_{\boldsymbol {c}} = \textbf {n}_{\boldsymbol {c}}/\|\textbf {n}_{\boldsymbol {c}}\|\) denote the unit vector in this normal direction. Since \(S_{\kappa }\) is a level set of \(G\), it is automatically perpendicular to \(\textbf {u}_{\boldsymbol {c}}\), and hence to \(T_{\boldsymbol {c}}([\boldsymbol {c}])\). In other words, the orbits are precisely the flow lines of the gradient vector field of \(G\), and since the level sets of a function are perpendicular to the flow lines of its gradient, it follows that \(S_{\kappa }\) is perpendicular to these orbits.
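As a concrete illustration of property (1), the unique \(t^{*}\) can be found numerically by bisection, since the left-hand side is strictly decreasing in \(t\). The sketch below is illustrative only (the function name, the default bracket, and the eigenvalue inputs are our choices, not from the paper):

```python
import numpy as np

def smoothing_time(c, lam, kappa, t_lo=-10.0, t_hi=10.0, tol=1e-12):
    # Solve sum_n exp(-2*lam[n]*t) * lam[n] * c[n]**2 = kappa for the
    # unique t*.  The left-hand side is strictly decreasing in t, so a
    # simple bisection applies once the root is bracketed.
    g = lambda t: np.sum(np.exp(-2.0 * lam * t) * lam * c**2)
    assert g(t_lo) > kappa > g(t_hi), "bracket does not contain the root"
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if g(t_mid) > kappa:
            t_lo = t_mid   # g still too large: the root lies to the right
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)
```

For instance, with eigenvalues \(\lambda _{n} = n\) (so \(\lambda _{0} = 0\)), unit coefficients, and \(\kappa = 1\), the returned \(t^{*}\) satisfies the defining equation to high precision.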
1.2 Path Straightening Algorithm on \(S_{\kappa }\)
Here we present the path straightening algorithm for computing distances on \(S_{\kappa }\). We first list the basic tools that the algorithm requires.
1.
Projection onto the Manifold \(S_{\kappa }\): For an arbitrary point \(\boldsymbol {c} \in \mathbb {R}^{N}\), we need a tool to project \(\boldsymbol {c}\) to the nearest point in \(S_{\kappa }\). One can find this nearest point by iteratively updating \(\boldsymbol {c}\) according to \(\boldsymbol {c} \mapsto \boldsymbol {c} + (\kappa - G(\boldsymbol {c}))\textbf {u}_{\boldsymbol {c}}\) until \(G(\boldsymbol {c}) = \kappa \).
2.
Projection onto the Tangent Space \(T_{\boldsymbol {c}}(S_{\kappa })\): Given a vector \(w \in \mathbb {R}^{N}\), we need to project \(w\) onto \(T_{\boldsymbol {c}}(S_{\kappa })\). Since the unit normal to \(S_{\kappa }\) at \(\boldsymbol {c}\) is \(\textbf {u}_{\boldsymbol {c}}\), the projection is given by \(w \mapsto w - \left \langle w , \textbf {u}_{\boldsymbol {c}} \right \rangle \textbf {u}_{\boldsymbol {c}}\).
3.
Covariant Derivative and Integral: Let \(\alpha \) be a given path on \(S_{\kappa }\), i.e., \(\alpha :[0,1] \to S_{\kappa }\), and let \(w\) be a vector field along \(\alpha \), i.e., for each \(\tau \in [0,1]\), \(w(\tau ) \in T_{\alpha (\tau )}(S_{\kappa })\). The covariant derivative of \(w\) along \(\alpha \), denoted \(\frac {Dw}{d\tau }\), is the vector field obtained by projecting \(\frac {dw}{d\tau }(\tau ) \in \mathbb {R}^{N}\) onto the tangent space \(T_{\alpha (\tau )}(S_{\kappa })\). The covariant integral is the inverse of the covariant derivative: a vector field \(u\) is called a covariant integral of \(w\) along \(\alpha \) if the covariant derivative of \(u\) equals \(w\), i.e., \(\frac {Du}{d\tau } = w\). Using the projection from the previous item, one can derive numerical tools for computing covariant derivatives and integrals of any given vector field.
4.
Parallel Translation: We also need tools for forward and backward parallel translation of tangent vectors along a given path \(\alpha \) on \(S_{\kappa }\). The forward parallel translation of a tangent vector \(w \in T_{\alpha (0)}(S_{\kappa })\) is a vector field along \(\alpha \), denoted \(\tilde {w}\), whose covariant derivative vanishes for all \(\tau \in [0,1]\), i.e., \(\frac {D \tilde {w}(\tau )}{d\tau } = 0\), with \(\tilde {w}(0) = w\). Similarly, the backward parallel translation of a tangent vector \(w \in T_{\alpha (1)}(S_{\kappa })\) satisfies \(\tilde {w}(1) = w\) and \(\frac {D \tilde {w}(\tau )}{d\tau } = 0\) for all \(\tau \in [0,1]\).
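The two projection tools above admit a short numerical sketch, shown below under our own conventions (function names are illustrative, and \(G(\boldsymbol {c}) = {\sum }_{n} \lambda _{n} {c_{n}^{2}}\) as in the main paper). For numerical stability the manifold projection scales each step by the directional derivative of \(G\), a Newton-type correction; dropping that factor recovers the simpler unscaled update stated above.

```python
import numpy as np

def G(c, lam):
    # G(c) = sum_n lam[n] * c[n]**2; S_kappa is the level set G(c) = kappa.
    return np.sum(lam * c**2)

def unit_normal(c, lam):
    # n_c = (lam_0 c_0, ..., lam_N c_N); lam[0] = 0, so c_0 drops out.
    n = lam * c
    return n / np.linalg.norm(n)

def project_to_manifold(c, lam, kappa, tol=1e-12, max_iter=100):
    # Move c along its normal direction until G(c) = kappa.  The step
    # (kappa - G(c)) is divided by the directional derivative of G
    # (a Newton-type scaling added here for stable convergence).
    c = np.array(c, dtype=float)
    for _ in range(max_iter):
        gap = kappa - G(c, lam)
        if abs(gap) < tol:
            break
        n = lam * c
        c = c + gap * n / (2.0 * np.dot(n, n))
    return c

def project_to_tangent(w, c, lam):
    # w -> w - <w, u_c> u_c removes the normal component of w.
    u = unit_normal(c, lam)
    return w - np.dot(w, u) * u
```

The covariant derivative of a vector field along a discretized path is then just the ordinary finite difference followed by `project_to_tangent` at each point.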
Algorithm (Path Straightening in \(S_{\kappa }\)): Given two points \(p_{1}, p_{2} \in S_{\kappa } \subset \mathbb {R}^{N}\), and a discretization of the path parameter into steps \(\tau = 0,1,2,\dots ,k\), proceed as follows.
1.
Initialize a path \(\alpha \): for all \(\tau = 0,1,2,\dots ,k\), take points along the straight line \((1-(\tau /k))p_{1}+(\tau /k)p_{2}\) in \(\mathbb {R}^{N}\), and project each of these points to its nearest point in \(S_{\kappa }\) to obtain \(\alpha (\tau /k)\).
2.
Compute \(\frac {d \alpha }{d \tau }\) along \(\alpha \): set \(v(0) = \textbf {0}\) and, for \(\tau = 1,2,\dots ,k\), compute \(v(\tau /k) = k(\alpha (\tau /k)-\alpha ((\tau -1)/k))\) in \(\mathbb {R}^{N}\). Project \(v(\tau /k)\) onto \(T_{\alpha (\tau /k)}(S_{\kappa })\) to obtain \(\frac {d \alpha }{d\tau }(\tau /k)\).
3.
Compute the covariant integral of \(\frac {d \alpha }{d\tau }\), with zero initial condition, along \(\alpha \) to obtain a vector field \(u\) along \(\alpha \).
4.
Backward parallel translate \(u(1)\) along \(\alpha \) to obtain \(\tilde {u}\).
5.
Compute the gradient vector field of the path energy \(E\) according to \(w(\tau /k)=u(\tau /k)-(\tau /k)\tilde {u}(\tau /k)\) for all \(\tau \).
6.
Update the path: \(\tilde {\alpha }(\tau /k) = \alpha (\tau /k)-\epsilon w(\tau /k)\) for a small step size \(\epsilon >0\). Then project each \(\tilde {\alpha }(\tau /k)\) onto \(S_{\kappa }\) to obtain the updated path \(\alpha (\tau /k)\).
7.
Return to step 2 unless \(\|w\|\) is sufficiently small or the maximum number of iterations has been reached.
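Putting the tools together, the algorithm admits a compact numerical sketch. This is an illustrative implementation under simplifying assumptions, not the authors' code: the backward parallel translation is approximated by tangent projection plus norm restoration, the manifold projection uses a Newton-scaled step, and all names are our own.

```python
import numpy as np

def G(c, lam):
    # G(c) = sum_n lam[n] * c[n]**2; S_kappa is the level set G(c) = kappa.
    return np.sum(lam * c**2)

def to_manifold(c, lam, kappa, tol=1e-12, max_iter=100):
    # Project c onto S_kappa by Newton steps along the normal n_c = lam * c.
    c = np.array(c, dtype=float)
    for _ in range(max_iter):
        gap = kappa - G(c, lam)
        if abs(gap) < tol:
            break
        n = lam * c
        c = c + gap * n / (2.0 * np.dot(n, n))
    return c

def to_tangent(w, c, lam):
    # Project w onto T_c(S_kappa) by removing its normal component.
    n = lam * c
    u = n / np.linalg.norm(n)
    return w - np.dot(w, u) * u

def path_straighten(p1, p2, lam, kappa, k=20, eps=0.1, n_iter=200, tol=1e-6):
    taus = np.linspace(0.0, 1.0, k + 1)
    # Step 1: straight line in R^N, projected pointwise onto S_kappa.
    alpha = np.array([to_manifold((1 - t) * p1 + t * p2, lam, kappa)
                      for t in taus])
    for _ in range(n_iter):
        # Step 2: discrete velocity, projected to the tangent spaces.
        v = np.zeros_like(alpha)
        for j in range(1, k + 1):
            v[j] = to_tangent(k * (alpha[j] - alpha[j - 1]), alpha[j], lam)
        # Step 3: covariant integral u of the velocity, with u(0) = 0.
        u = np.zeros_like(alpha)
        for j in range(1, k + 1):
            u[j] = to_tangent(u[j - 1] + v[j] / k, alpha[j], lam)
        # Step 4: backward parallel translation of u(1), approximated by
        # projecting to each tangent space and restoring the norm.
        ut = np.zeros_like(alpha)
        ut[k] = u[k]
        for j in range(k - 1, -1, -1):
            p = to_tangent(ut[j + 1], alpha[j], lam)
            nrm = np.linalg.norm(p)
            ut[j] = p * (np.linalg.norm(ut[j + 1]) / nrm) if nrm > 1e-14 else p
        # Step 5: gradient of the path energy; it vanishes at both endpoints.
        w = u - taus[:, None] * ut
        # Step 6: gradient step, then re-project onto S_kappa.
        alpha = np.array([to_manifold(alpha[j] - eps * w[j], lam, kappa)
                          for j in range(k + 1)])
        # Step 7: stop when the gradient field is small.
        if np.sqrt(np.sum(w**2) / k) < tol:
            break
    # Approximate the geodesic length by the sum of chord lengths.
    length = sum(np.linalg.norm(alpha[j] - alpha[j - 1]) for j in range(1, k + 1))
    return alpha, length
```

As a sanity check, for \(\lambda = (0, 1, 1)\) and \(\kappa = 1\) the section \(S_{\kappa }\) is a cylinder \(c_{1}^{2} + c_{2}^{2} = 1\) (with \(c_{0}\) free), and the computed length between \((0.5, 1, 0)\) and \((0.5, 0, 1)\) approximates the quarter-circle arc length \(\pi /2\).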
Cite this article
Zhang, Z., Klassen, E. & Srivastava, A. Robust Comparison of Kernel Densities on Spherical Domains. Sankhya A 81, 144–171 (2019). https://doi.org/10.1007/s13171-018-0131-0