Abstract
Curvature properties for statistical structures are studied. The study deals with the curvature tensor of statistical connections and their duals as well as the Ricci tensor of the connections, Laplacians and the curvature operator. Two concepts of sectional curvature are introduced. The meaning of the notions is illustrated by presenting a few exemplary theorems.
Research supported by the NCN grant 2013/11/B/ST1/02889.
1 Introduction
The curvature tensor is one of the most important tensors in differential geometry. Many other objects can be defined on the basis of this tensor, in particular the Ricci tensor, the scalar curvature, the Weyl curvature tensor, the Weitzenböck curvature tensor and the sectional curvature. Some of these notions, however, are attributed only to Riemannian structures with their Levi-Civita connections; the sectional curvature is one such notion. We claim that some of these notions, especially those strongly attributed to Riemannian geometry, can be extended to statistical structures and, as in the Riemannian case, provide a lot of information about the structures.
By a statistical structure on a manifold M we mean a pair \((g,\nabla )\), where g is a metric tensor field and \(\nabla \) is a torsion-free connection for which \(\nabla g\) as a cubic form is symmetric in all arguments, see [1]. Such a structure is also called a Codazzi structure. One can define (equivalently) a statistical structure by equipping a Riemannian manifold (M, g) with a symmetric (1, 2)-tensor field K, for which the cubic form \(C(X,Y,Z)=g(K(X,Y),Z)\) is symmetric in all arguments. Having K one defines a torsion-free connection \(\nabla \) by the formula \(\nabla _XY=\hat{\nabla }_XY+K(X,Y)\), where \(\hat{\nabla }\) is the Levi-Civita connection of g. The pair \((g,\nabla )\) turns out to be a statistical structure. Of course, instead of K one can prescribe a symmetric cubic form C. A manifold endowed with a statistical structure is called a statistical manifold.
In information theory classical examples of statistical manifolds are manifolds of probability distributions equipped with the Fisher information metric and an appropriate cubic form. Namely, let \((\mathcal X, \mathcal B)\) be a measurable space with \(\sigma \)-algebra \(\mathcal B\) over \(\mathcal X\). Let \(\varLambda \) be a domain in \(\mathbf {R}^n\) and
\[p:\mathcal X\times \varLambda \ni (x,\lambda )\rightarrow p(x,\lambda )\in \mathbf {R}\]
be a function smoothly depending on \(\lambda \). Moreover, we assume that \(p_{\lambda }(x):=p(x,\lambda )\) is a probability measure on \(\mathcal X\) for each \(\lambda \in \varLambda \). Set \(\ell (x,\lambda )=\log p(x,\lambda )\). The Fisher information metric g on \(\varLambda \) is given by
\[g_{ij}(\lambda )=\mathbb E_\lambda [\partial _i\ell \,\partial _j\ell ],\]
where \(\mathbb E_\lambda \) denotes the expectation relative to \(p_\lambda \), \(\partial _i\ell \) stands for \(\frac{\partial \ell }{\partial \lambda _i}\) and \(\lambda =(\lambda _1,...,\lambda _n)\). One defines a symmetric cubic form C on \(\varLambda \) by the formula
\[C_{ijk}(\lambda )=\mathbb E_\lambda [\partial _i\ell \,\partial _j\ell \,\partial _k\ell ].\]
The pair \((g,\alpha C)\) constitutes a statistical structure on \(\varLambda \) for every \(\alpha \in \mathbf {R}\).
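As a small illustration of the objects just introduced, the following sketch evaluates the Fisher information metric and the cubic form for the one-parameter Bernoulli family, using exact rational arithmetic. It assumes the standard expressions \(g=\mathbb E_\lambda [(\partial \ell )^2]\) and \(C=\mathbb E_\lambda [(\partial \ell )^3]\); the family, the function name `bernoulli_fisher` and the sample parameter values are our own choices and do not come from the text.

```python
from fractions import Fraction


def bernoulli_fisher(lam):
    """Fisher metric g and cubic form C of the Bernoulli family at lam.

    Assumption (not from the paper): p(1, lam) = lam and p(0, lam) = 1 - lam,
    so the score d/dlam log p equals 1/lam at x = 1 and -1/(1 - lam) at x = 0.
    """
    probs = {1: lam, 0: 1 - lam}
    score = {1: 1 / lam, 0: -1 / (1 - lam)}
    # Expectations over the two-point sample space {0, 1}.
    g = sum(probs[x] * score[x] ** 2 for x in (0, 1))
    c = sum(probs[x] * score[x] ** 3 for x in (0, 1))
    return g, c


g, c = bernoulli_fisher(Fraction(1, 4))
print(g, c)  # 16/3 128/9
```

In this one-dimensional example one finds \(g=1/(\lambda (1-\lambda ))\) and \(C=(1-2\lambda )/(\lambda ^2(1-\lambda )^2)\), so the cubic form vanishes exactly at \(\lambda =1/2\).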
However, the oldest source of statistical structures is the theory of affine hypersurfaces in \(\mathbf {R}^{n}\) or the geometry of the second fundamental form of hypersurfaces in real space forms. Lagrangian submanifolds of complex space forms are also naturally endowed with statistical structures. Nevertheless, most statistical structures are outside these categories. In general, a statistical structure is not realizable on a hypersurface nor on a Lagrangian submanifold, even locally, see [4].
In this paper we present some ideas of extending Riemannian geometry to the case of statistical structures. We concentrate on the ideas depending on naturally defined curvature tensors for a statistical structure. Exemplary theorems concerning these ideas are provided.
2 Statistical Structures
A statistical structure on a manifold M can be defined in a few equivalent ways. First of all, M must have a Riemannian structure defined by a metric tensor field g. We assume that g is positive definite, although g can also be indefinite. A statistical structure can be defined as a pair (g, K), where g is a Riemannian metric tensor field and K is a symmetric (1, 2)-tensor field on M which is also symmetric relative to g, that is, the cubic form
\[C(X,Y,Z)=g(X,K(Y,Z)) \tag{2}\]
is symmetric relative to X, Y. It is clear that any symmetric cubic form C on a Riemannian manifold (M, g) defines by (2) a (1, 2)-tensor field K having the symmetry properties as above. Another definition says that a statistical structure is a pair \((g,\nabla )\), where \(\nabla \) is a torsion-free affine connection on M and \(\nabla g\) as a (0, 3)-tensor field on M is symmetric in all arguments. \(\nabla \) is called a statistical connection. The equivalence of the above definitions is established by taking K as the difference tensor between the connection \(\nabla \) and the Levi-Civita connection \(\hat{\nabla }\) for g, that is,
\[\nabla _XY=\hat{\nabla }_XY+K(X,Y) \tag{3}\]
for all vector fields X, Y on M. The cubic forms C and \(\nabla g\) are related by the equality \(\nabla g=-2C.\)
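The relation \(\nabla g=-2C\) can be checked pointwise in coordinates. The sketch below works at a single point where \(g_{ij}=\delta _{ij}\), takes made-up values for the first derivatives of g and for a totally symmetric cubic form C, builds the connection coefficients of \(\nabla =\hat{\nabla }+K\) with all indices lowered, and compares \(\nabla g\) with \(-2C\). All numerical data and helper names are hypothetical; this is only a consistency check, not a construction taken from the text.

```python
from fractions import Fraction
from itertools import product

n = 2
idx = range(n)

# Hypothetical data at one point, in coordinates with g_ij = delta_ij there.
# dg[k][i][j] stands for the partial derivative d_k g_ij (symmetric in i, j).
dg = [[[1, 2], [2, 3]],
      [[0, -1], [-1, 5]]]

# Hypothetical totally symmetric cubic form, stored by sorted index triples.
c_vals = {(0, 0, 0): 2, (0, 0, 1): -1, (0, 1, 1): 3, (1, 1, 1): 4}


def C(i, j, k):
    return c_vals[tuple(sorted((i, j, k)))]


def gamma_hat(k, i, l):
    """g(hat-nabla_{d_k} d_i, d_l): Levi-Civita coefficients, index lowered."""
    return Fraction(dg[k][i][l] + dg[i][k][l] - dg[l][k][i], 2)


def gamma(k, i, l):
    """g(nabla_{d_k} d_i, d_l) for nabla = hat-nabla + K, g(K(d_k, d_i), d_l) = C_kil."""
    return gamma_hat(k, i, l) + C(k, i, l)


def nabla_g(k, i, j):
    """(nabla_{d_k} g)(d_i, d_j) = d_k g_ij - g(nabla_k d_i, d_j) - g(d_i, nabla_k d_j)."""
    return dg[k][i][j] - gamma(k, i, j) - gamma(k, j, i)


torsion_free = all(gamma(k, i, l) == gamma(i, k, l)
                   for k, i, l in product(idx, repeat=3))
codazzi = all(nabla_g(k, i, j) == -2 * C(k, i, j)
              for k, i, j in product(idx, repeat=3))
print(torsion_free, codazzi)  # True True
```

The check succeeds for any choice of the sample data, because the Levi-Civita part cancels the derivative of g and the remaining two terms each contribute \(-C\).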
For any connection \(\nabla \) on a Riemannian manifold (M, g) one defines its conjugate connection \(\overline{\nabla }\) (relative to g) by the formula
\[g(\nabla _XY,Z)+g(Y,\overline{\nabla }_XZ)=Xg(Y,Z) \tag{4}\]
for any vector fields X, Y, Z on M. The connections \(\nabla \) and \(\overline{\nabla }\) are simultaneously torsion-free. If \((g,\nabla )\) is a statistical structure then so is \((g,\overline{\nabla })\). Moreover, if \((g,\nabla )\) is trace-free then so is \((g,\overline{\nabla })\). Recall that a statistical structure \((g,\nabla )\) is trace-free if \({\mathrm {tr}}\,_g(\nabla g)(X,\cdot , \cdot )=0\) for every X or equivalently \({\mathrm {tr}}\,_gK=0\), or equivalently \({\mathrm {tr}}\,K_X=0\) for every X, where \(K_XY\) stands for K(X, Y). If R is the curvature tensor for \(\nabla \) and \(\overline{R}\) is the curvature tensor for \(\overline{\nabla }\) then we have, see [3],
\[g(R(X,Y)Z,W)=-g(\overline{R}(X,Y)W,Z) \tag{5}\]
for every X, Y, Z, W. In particular, \(R=0\) if and only if \(\overline{R}=0\). If K is the difference tensor between \(\nabla \) and \(\hat{\nabla }\) then
\[\overline{\nabla }_XY=\hat{\nabla }_XY-K(X,Y). \tag{6}\]
We also have, [3],
\[R(X,Y)=\hat{R}(X,Y)+(\hat{\nabla }_XK)_Y-(\hat{\nabla }_YK)_X+[K_X,K_Y], \tag{7}\]
where \(\hat{R}\) is the curvature tensor for \(\hat{\nabla }\). Writing the same equality for \(\overline{\nabla }\) and adding both equalities we get
\[R(X,Y)+\overline{R}(X,Y)=2\hat{R}(X,Y)+2[K_X,K_Y]. \tag{8}\]
The above formulas yield, see [4],
Lemma 1
Let (g, K) be a statistical structure. The following conditions are equivalent:
(1) \(R=\overline{R}\),
(2) \(\hat{\nabla }K(X,Z,Y)\) is symmetric in all arguments,
(3) g(R(X, Y)Z, W) is skew-symmetric relative to Z, W.
A statistical structure is called a Hessian structure if the connection \(\nabla \) is flat, that is, \(R=0\). In this case, by (8), we have
\[\hat{R}=-[K,K].\]
For a statistical structure one defines the vector field E by
\[E={\mathrm {tr}}\,_gK.\]
The dual (relative to g) form will be denoted by \(\tau \). We have \({\mathrm {tr}}\,_g \nabla g(\cdot ,\cdot , Z)=-2\tau (Z)\). If \(\nu _g\) is the volume form determined by g then \(\nabla _Z\nu _g=-\tau (Z)\nu _g\). Therefore, a statistical structure \((g,\nabla )\) is trace-free if and only if \(\nabla \nu _g=0\). Trace-free statistical structures are of the greatest importance in the classical theory of affine hypersurfaces of \(\mathbf {R}^{n+1}\). In this theory they are called Blaschke structures. In the theory of Lagrangian submanifolds trace-free statistical structures appear on minimal submanifolds.
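The equivalent descriptions of the trace-free condition rest on the identity \(\tau (X)=g(E,X)={\mathrm {tr}}\,K_X\), which follows from the total symmetry of the cubic form. The sketch below verifies this identity at one point for a non-diagonal metric. It takes E to be the g-trace of K, in line with the equivalent conditions stated earlier; the metric, the cubic form and all helper names are hypothetical sample data.

```python
from fractions import Fraction
from itertools import product

n = 2

# Hypothetical metric at one point (symmetric, positive definite) and its inverse.
g = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det, g[0][0] / det]]

# Hypothetical totally symmetric cubic form, stored by sorted index triples.
c_vals = {(0, 0, 0): 3, (0, 0, 1): 1, (0, 1, 1): -2, (1, 1, 1): 5}


def C(i, j, k):
    return c_vals[tuple(sorted((i, j, k)))]


def K(i, j, k):
    """Component k of K(d_i, d_j), obtained by raising one index of C."""
    return sum(g_inv[k][l] * C(i, j, l) for l in range(n))


# E = tr_g K, that is, E^k = g^{ij} K(d_i, d_j)^k (assumed definition of E).
E = [sum(g_inv[i][j] * K(i, j, k) for i, j in product(range(n), repeat=2))
     for k in range(n)]
# tau = g(E, .), the 1-form dual to E.
tau = [sum(g[l][k] * E[k] for k in range(n)) for l in range(n)]
# tr K_X for X = d_i, the trace of the endomorphism Y -> K(X, Y).
trace_K = [sum(K(i, j, j) for j in range(n)) for i in range(n)]

print(tau == trace_K)  # True
```

For this sample data both sides equal \((-3, 15)\), so the structure is not trace-free; replacing C by a cubic form whose g-trace vanishes makes both lists zero simultaneously.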
Denote by Ric, \(\overline{Ric}\) and \(\widehat{Ric} \) the Ricci tensors for \(\nabla \), \(\overline{\nabla }\) and \(\hat{\nabla }\) respectively. Recall that for any linear connection \(\nabla \) with curvature tensor R its Ricci tensor is defined by \(Ric(Y,Z)={\mathrm {tr}}\,\{X\rightarrow R(X,Y)Z\}\). Note that the Ricci tensor does not have to be symmetric. We have
It follows that
In particular, if \((g, \nabla )\) is trace-free then
The above formulas also yield
Hence \(\nabla \) is Ricci-symmetric if and only if \(d\tau =0\). Recall that the Ricci tensor of \(\nabla \) is symmetric if and only if there is a (locally defined) volume form \(\nu \) parallel relative to \(\nabla \).
Denote by \(\rho \) the scalar curvature for \(\nabla \), that is, \(\rho ={\mathrm {tr}}\,_g Ric (\cdot ,\cdot )\). By (5) it is clear that the scalar curvature for \(\overline{\nabla }\) is equal to \(\rho \). Taking now the trace relative to g on both sides of (11) we get
The last formula implies, in particular, that the Riemannian scalar curvature for \(\hat{\nabla }\) is maximal among scalar curvatures of connections which are statistical for g. More precisely, we have, see [4],
Proposition 1
The functional
attains its maximum for the Levi-Civita connection at each point of M. Conversely, if \(\nabla \) is a statistical connection for g and \(\mathfrak {scal}\) attains its maximum for \(\nabla \) at each point on M, then \(\nabla \) is the Levi-Civita connection for g.
As we have already observed the curvature tensor R for \(\nabla \) does not have the same symmetries as the curvature tensor of the Levi-Civita connection. By Lemma 1 we see that the symmetry conditions are fulfilled if \(R=\overline{R}\). It turns out that this condition is important in many considerations, for instance in theorems saying that under some curvature conditions a statistical structure is trivial, that is, \(\nabla =\hat{\nabla }\). Proofs of the following theorems can be found in [4].
Theorem 1
Let M be a connected compact surface and \((g, \nabla )\) be a trace-free statistical structure on M. If M is of genus 0 and \(R=\overline{R}\) then \(\nabla =\hat{\nabla }\) on M. If M is of genus 1 and \(K=0\) at one point of M then \(\nabla =\hat{\nabla }\) on M.
Theorem 2
Let M be a compact manifold equipped with a trace-free statistical structure \((g,\nabla )\) such that \(R=\overline{R}\). If the sectional curvature \(\hat{k}\) for g is positive then \(\nabla =\hat{\nabla }\).
Although the Ricci tensors for \(\nabla \) and \(\overline{\nabla }\) differ very much from each other, their integrals over a unit sphere bundle UM are the same. Namely we have
Theorem 3
Let M be a compact oriented manifold and \((g,\nabla )\) be a statistical structure on it. Then
If \((g,\nabla )\) is trace-free then
and the equality holds if and only if \(\nabla =\hat{\nabla }\) on M.
3 On Examples
As mentioned in the Introduction, a natural source of statistical structures is the theory of affine hypersurfaces. Let \(\mathbf {f}: M\rightarrow \mathbf {R}^{n+1}\) be a locally strongly convex hypersurface. For simplicity assume that M is oriented. Let \(\xi \) be a transversal vector field on M. We define the induced volume form \(\nu _\xi \) on M (compatible with the given orientation) as follows
\[\nu _\xi (X_1,...,X_n)=\det (\mathbf {f}_*X_1,...,\mathbf {f}_*X_n,\xi ).\]
We also have the induced connection \(\nabla \) and the second fundamental form g defined on M by the Gauss formula:
\[D_X\mathbf {f}_*Y=\mathbf {f}_*\nabla _XY+g(X,Y)\xi ,\]
where D is the standard flat connection on \(\mathbf {R}^{n+1}\). Since the hypersurface is locally strongly convex, g is definite. By multiplying \(\xi \) by \(-1\), if necessary, we can assume that g is positive definite. A transversal vector field is called equiaffine if \(\nabla \nu _\xi =0\). This condition is equivalent to the fact that \(\nabla g\) is symmetric, i.e. \((g,\nabla )\) is a statistical structure. It means, in particular, that for a statistical structure obtained on a hypersurface by a choice of a transversal vector field, the Ricci tensor of \(\nabla \) is automatically symmetric. In general, the Ricci tensor of a statistical structure on an abstract manifold does not have to be symmetric. Therefore, all structures with non-symmetric Ricci tensor are non-realizable as the induced structures on hypersurfaces. For statistical structures induced on hypersurfaces the condition \(R=\overline{R}\) describes the so-called affine spheres. The class of affine spheres is very large, very attractive for geometers and still very mysterious. Again, it is easy to find examples of statistical structures on abstract manifolds for which \(R=\overline{R}\) and which cannot be realized on affine spheres (although in this case the Ricci tensor of \(\nabla \) is symmetric). For a statistical structure a necessary condition for being (locally) realizable on a hypersurface is that the connection \(\overline{\nabla }\) is projectively flat. It is a strong condition which is rarely satisfied.
Another source of statistical structures is the theory of Lagrangian submanifolds in almost Hermitian manifolds. In this case K can be regarded as the second fundamental tensor of a submanifold. The theory is best developed for Lagrangian submanifolds of complex space forms. In this case \(\hat{\nabla }K\) as a (1, 3)-tensor field is symmetric. Hence, by Lemma 1, we also have \(\overline{R}=R\). In this case the obstruction (because of which a statistical structure satisfying \(R=\overline{R}\) might be non-realizable on a Lagrangian submanifold) is the Gauss equation.
In analogy with the case of hypersurfaces, by an equiaffine statistical structure we mean a triple \((g,\nabla , \nu )\), where \((g,\nabla )\) is a statistical structure and \(\nu \) is a volume form on M (in most cases different from \(\nu _g\)) parallel relative to \(\nabla \).
For more information on dual connections, affine differential geometry and the geometry of statistical structures we refer to [1–4, 6].
4 Sectional Curvatures
Of course we have the ordinary sectional curvature for g. In general, the tensor field R for a statistical connection \(\nabla \) is not good enough to produce a sectional curvature. The reason is that, in general, g(R(X, Y)Z, W) is not skew-symmetric relative to Z, W. But the tensor field \(\mathcal R=\frac{1}{2}(R+\overline{R})\) has the property \(g(\mathcal R(X,Y)Z, W)=-g(\mathcal R(X,Y)W, Z)\). Moreover, it satisfies the first Bianchi identity. This allows us to define a sectional curvature, which we call the sectional \(\nabla \)-curvature. Namely, if X, Y is an orthonormal basis of a vector plane \(\pi \subset T_xM\) then the sectional \(\nabla \)-curvature of this plane is defined as \(g(\mathcal {R}(X,Y)Y,X)\). In general, Schur’s lemma does not hold for this sectional curvature, because there is no appropriate universal second Bianchi identity. We have, however, the following analogue of the second Bianchi identity, see [4],
where \(\varXi _{U,X,Y}\) denotes the cyclic permutation sum. It follows that for statistical structures satisfying the condition \(R=\overline{R}\) Schur’s lemma holds. Another result in which the assumption \(R=\overline{R}\) is important is the following analogue of Tachibana’s theorem, [4],
Theorem 4
Let M be a connected compact oriented manifold and \((g,\nabla )\) be a statistical structure on M such that \(R=\overline{R}\). If the curvature operator \(\hat{\mathfrak {R}}\) for \(\hat{R}\) is non-negative and \(\mathrm{div}\,^{\hat{\nabla }}R=0\) then \(\hat{\nabla }R=0\). If additionally \(\hat{\mathfrak {R}} >0\) at some point of M then the sectional \(\nabla \)-curvature is constant.
Since the tensor \(\mathcal R\) has good symmetry properties, one can also define the curvature operator, say \(\mathfrak R\), for \(\mathcal R\) sending 2-vectors into 2-vectors. Namely, we set
where g denotes here the natural extension of g to tensors. The formula defines a linear, symmetric relative to g operator \(\mathfrak R: \varLambda ^2TM\rightarrow \varLambda ^2TM.\) In particular, it is diagonalizable and hence it can be positive, negative (definite) etc.
We have the following analogue of a theorem of Meyer-Gallot for trace-free statistical structures, see [4],
Theorem 5
Let M be a connected compact oriented manifold and \((g,\nabla )\) be a trace-free statistical structure on M. If the curvature operator \(\mathfrak R\) for \(\mathcal R\) is non-negative on M then each harmonic form is parallel relative to \(\nabla \), \(\overline{\nabla }\) and \(\hat{\nabla }\). If moreover the curvature operator is positive at some point of M then the Betti numbers \(b_1(M)=...=b_{n-1}(M)=0\).
Another sectional curvature for a statistical structure (g, K) can be defined by using the tensor field K. We define a (1, 3)-tensor field [K, K] by
\[[K,K](X,Y)Z=[K_X,K_Y]Z=K_XK_YZ-K_YK_XZ\]
for \(X,Y,Z\in T_xM,\ x\in M\). Recall that for a Hessian structure we have \([K,K]=-\hat{R}\). The tensor field [K, K] is skew symmetric in X, Y, skew-symmetric relative to g and satisfies the first Bianchi identity. Therefore one can define the sectional K-curvature by a vector plane \(\pi \) tangent to M as \(k(\pi )=g([K,K](X,Y)Y,X)\), where X, Y is an orthonormal basis of \(\pi \). As in the previous case, for this sectional curvature Schur’s lemma holds if \(\hat{\nabla }K\) is symmetric as a (1, 3)-tensor field. Note that the notion of the sectional K-curvature is purely algebraic. The fact that this curvature is constant implies that K has a special expression. Namely, we have, see [5],
Theorem 6
Let (g, K) be a statistical structure on an n-dimensional manifold M. If the sectional K-curvature is constant and equal to A for all vector planes in TM then for each \(x\in M\) there is an orthonormal basis \(e_1,...,e_n\) of \(T_xM\) such that
for \(i=2,...,n\) and
for some numbers \(\lambda _i\), \(\mu _i\) for \(i=1,..., n-1\) and \(j>i\). Moreover
for \(i=1,..., n-1\) where \(A_0=A\). If additionally the statistical structure (g, K) is trace-free then \(A\le 0\), \(\lambda _i\) and \(\mu _i\) are expressed as follows
Note that, in general, it is not possible to find a local frame \(e_1,..., e_n\) around a point of M in which K has expression as in the above theorem.
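Since the sectional K-curvature is purely algebraic, it can be computed directly at a point. The sketch below does this for a trace-free structure on a 2-dimensional tangent space, assuming \([K,K](X,Y)Z=K_XK_YZ-K_YK_XZ\) and an orthonormal basis \(e_1,e_2\). In that situation K is determined by the two numbers \(a=C(e_1,e_1,e_1)\) and \(b=C(e_1,e_1,e_2)\); this parametrization and the function names are our own and serve only as an illustration.

```python
def k_matrices(a, b):
    """Matrices of K_{e1} and K_{e2} in an orthonormal basis e1, e2.

    Assumption: K is symmetric relative to g and trace-free, so that
    C_111 = a, C_112 = b, C_122 = -a, C_222 = -b.
    Entry [i][j] of each matrix is the e_i-component of K(e_m, e_j).
    """
    k1 = [[a, b], [b, -a]]
    k2 = [[b, -a], [-a, -b]]
    return k1, k2


def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def sectional_k_curvature(a, b):
    """g([K,K](e1, e2)e2, e1) for the trace-free K determined by a and b."""
    k1, k2 = k_matrices(a, b)
    p, q = matmul(k1, k2), matmul(k2, k1)
    bracket = [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]
    # [K,K](e1, e2)e2 is the second column; its e1-component is bracket[0][1].
    return bracket[0][1]


print(sectional_k_curvature(1, 2))  # -10
```

In this parametrization the value is \(-2(a^2+b^2)\), which is non-positive and vanishes only for \(K=0\), in agreement with the inequality \(A\le 0\) for trace-free structures in Theorem 6.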
Below are a few theorems serving as examples of results dealing with the sectional K-curvature. In these theorems the notation \([K,K]\cdot K\), \(\hat{R}\cdot K\) means that [K, K] and \(\hat{R}\) act on K as differentiations. Details concerning the theorems are provided in [5].
Theorem 7
Let (g, K) be a statistical structure on a manifold M. If the sectional K-curvature is non-positive on M and \([K,K]\cdot K=0\) then the sectional K-curvature vanishes on M.
Corollary 1
If (g, K) is a Hessian structure on M with non-positive sectional curvature of g and such that \(\hat{R}\cdot K=0\) then \(\hat{R}=0\).
Theorem 8
If (g, K) is a statistical structure on a manifold M, the sectional K-curvature is negative on M and \(\hat{R}\cdot K=0\) then \(\hat{R}=0\).
Theorem 9
Assume that \([K,K]=0\) on a statistical manifold (M, g, K), \(\hat{\nabla }K\) is symmetric and \(\hat{\nabla }E=0\). If K is non-degenerate, that is, the mapping \(T_xM\ni X\rightarrow K_X\in HOM(T_xM)\) is a monomorphism at each point of M then \(\hat{R}=0\) and \(\hat{\nabla }K =0\) on M.
Theorem 10
Let (g, K) be a trace-free statistical structure on a manifold M with symmetric \(\hat{\nabla }K \). If the sectional K-curvature is constant then either \(K=0\) or \(\hat{R}=0\) and \(\hat{\nabla }K=0\) on M.
5 Bochner-Type Theorems
Bochner’s theorems for Riemannian manifolds say, roughly speaking, that under some curvature assumptions harmonic forms must be parallel. This is a converse to the trivial statement that a parallel form is harmonic.
For statistical structures one can prove some analogues of these theorems. First we define a new Laplacian \(\varDelta ^\nabla \) depending on the statistical connection. If \((g,\nabla )\) is a statistical structure on M then we define the codifferential \(\delta ^{\nabla }\) acting on differential forms copying the classical Weitzenböck formula
for any differential form \(\omega \). We now set
If the statistical structure is trace-free then \(\varDelta ^\nabla \) is the ordinary Laplacian for g. A form \(\omega \) is called \(\nabla \)-harmonic if \(\varDelta ^\nabla \omega =0\). Hodge’s theory can be adapted to this definition of a Laplacian and harmonicity. In particular, for an equiaffine statistical structure on a compact manifold M we have \(\dim \mathcal H^{k,\nabla }(M)=b_k(M)\), where \( \mathcal H^{k,\nabla }(M)\) is the space of all \(\nabla \)-harmonic forms and \(b_k(M)\) is the k-th Betti number of M.
Below are exemplary analogues of Bochner-type theorems for statistical structures.
Theorem 11
Let M be a connected compact oriented manifold with an equiaffine statistical structure \((g,\nabla , \nu )\). If the Ricci tensor Ric for \(\nabla \) is non-negative on M then every \(\nabla \)-harmonic 1-form on M is \(\overline{\nabla }\)-parallel. In particular, the first Betti number \(b_1(M)\) is not greater than \(\dim M\). If additionally \(Ric>0\) at some point of M then \(b_1(M)=0\).
Theorem 12
Let M be a connected compact oriented manifold. Let \((g,\nabla )\) be a trace-free statistical structure on M. If \(Ric +\overline{Ric} \ge 0\) on M then each harmonic 1-form on M is parallel relative to the connections \(\nabla \), \(\overline{\nabla }\) and \(\hat{\nabla }\). In particular, \(b_1(M)\le \dim M\). If moreover \(Ric+\overline{Ric}>0\) at some point then \(b_1(M)=0\).
For any statistical structure \((g,\nabla )\) one can define the Weitzenböck curvature operator denoted here by \(\mathcal W ^R\). It depends only on g and the curvature tensor R. More precisely, it can be introduced as follows. Let s be a tensor field of type (l, k), where \(k>0\), on M. One defines a tensor field \(\mathcal W^Rs\) of type (l, k) by the formula
where \(e_1,...,e_n\) is an arbitrary orthonormal frame, \(R(e_j,X_i)\cdot s\) means that \(R(e_j,X_i)\) acts as a differentiation on s, and \(e_j\) in the last parenthesis is at the i-th place. It is possible to prove appropriate generalizations of Bochner-Weitzenböck’s and Lichnerowicz’s formulas for the Laplacian acting on differential forms on statistical manifolds. In particular, for a trace-free structure we have the following simple formula
where \(\nabla ^*\) is a suitably defined formal adjoint for \(\nabla \).
Details concerning Hodge’s theory and Bochner’s technique for statistical structures can be found in [4].
References
1. Lauritzen, S.T.: Statistical Manifolds. IMS Lecture Notes-Monograph Series, vol. 10, pp. 163–216 (1987)
2. Li, A.-M., Simon, U., Zhao, G.: Global Affine Differential Geometry of Hypersurfaces. Walter de Gruyter, Berlin (1993)
3. Nomizu, K., Sasaki, T.: Affine Differential Geometry. Cambridge University Press, Cambridge (1994)
4. Opozda, B.: Bochner’s technique for statistical manifolds. Ann. Glob. Anal. Geom. doi:10.1007/s10455-015-9475-z
5. Opozda, B.: A sectional curvature for statistical structures. arXiv:1504.01279 [math.DG]
6. Shima, H.: The Geometry of Hessian Structures. World Scientific, Singapore (2007)
© 2015 Springer International Publishing Switzerland
Opozda, B. (2015). Curvatures of Statistical Structures. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2015. Lecture Notes in Computer Science(), vol 9389. Springer, Cham. https://doi.org/10.1007/978-3-319-25040-3_26
DOI: https://doi.org/10.1007/978-3-319-25040-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25039-7
Online ISBN: 978-3-319-25040-3
eBook Packages: Computer Science, Computer Science (R0)