Abstract
A multiple decision statistical problem for the elements of the inverse covariance matrix is considered. The associated optimal unbiased multiple decision statistical procedure is given. This procedure is constructed using the Lehmann theory of multiple decision statistical procedures and conditional tests of the Neyman structure. The equations for calculating the thresholds of the tests of the Neyman structure are analyzed.
Keywords
- Inverse covariance matrix
- Tests of the Neyman structure
- Multiple decision statistical procedure
- Generating hypothesis
1 Introduction
A market network is constructed by means of some similarity measure between every pair of stocks. The most popular measure of similarity between stocks of a market is the correlation between them [1–4]. The analysis of methods of market graph construction [2] from the statistical point of view was started in [5], where a multiple decision statistical procedure for market graph construction based on the Pearson test was suggested. The authors of [5] note that a procedure of this type can be made optimal in the class of unbiased multiple decision statistical procedures if one uses the tests of the Neyman structure for the generating hypotheses. In the present paper we use partial correlations as a measure of similarity between stocks. In this case the elements of the inverse covariance matrix are the weights of links between stocks in the market network [6].
The main goal of the paper is to investigate the problem of identification of the inverse covariance matrix as a multiple decision statistical problem. As a result, an optimal unbiased multiple decision statistical procedure for identification of the inverse covariance matrix is given. This procedure is constructed using the Lehmann theory of multiple decision statistical procedures and conditional tests of the Neyman structure. In addition, the equations for calculating the thresholds of the tests of the Neyman structure are analyzed.
The paper is organized as follows. In Sect. 2 we briefly recall the Lehmann theory of multiple decision statistical procedures. In Sect. 3 we describe the tests of the Neyman structure. In Sect. 4 we formulate the multiple decision problem for identification of the inverse covariance matrix. In Sect. 5 we construct and study the optimal tests for the generating hypotheses for the elements of the inverse covariance matrix. In Sect. 6 we construct the multiple statistical procedure and consider some particular cases. In Sect. 7 we summarize the main results of the paper.
2 Lehmann Multiple Decision Theory
In this section we recall for the sake of completeness the basic idea of the Lehmann theory following the paper [5].
Suppose that the distribution of a random vector R belongs to the family \(\{f(r,\theta):\theta \in \Omega \}\), where θ is a parameter, Ω is the parametric space, and r is an observation of R. We need to construct a statistical procedure for selecting one from a set of L hypotheses, which in the general case can be stated as:
The most general theory of multiple decision procedures is the Lehmann theory [7]. The Lehmann theory is based on three concepts: generating hypotheses; generating hypothesis testing and compatibility conditions; and the additivity condition for the loss function.
The multiple decision problem (1) of selecting one from the set of L hypotheses \(H_{i}:\theta \in \Omega _{i}\), i = 1, …, L, is equivalent to a family of M two-decision problems:
with
This equivalence is given by the relations:
where
Hypotheses \(H_{j}'\), j = 1, …, M, are called generating hypotheses for problem (1).
The equivalence between problem (1) and the family of problems (2) reduces the multiple decision problem to the testing of generating hypotheses. Any statistical procedure δ j for testing hypothesis \(H_{j}'\) can be written in the following form: \(\delta _{j}(r) =\partial _{j}\) if \(r \in X_{j}\), and \(\delta _{j}(r) =\partial _{j}^{-1}\) if \(r \in X_{j}^{-1}\),
where \(\partial _{j}\) is the decision of acceptance of \(H_{j}'\) and \(\partial _{j}^{-1}\) is the decision of acceptance of \(K_{j}'\); \(X_{j}\) is the acceptance region of \(H_{j}'\) and \(X_{j}^{-1}\) is the acceptance region of \(K_{j}'\) (the rejection region of \(H_{j}'\)) in the sample space. One has \(X_{j} \cap X_{j}^{-1} = \emptyset \), \(X_{j} \cup X_{j}^{-1} = X\), X being the sample space.
Define the acceptance region for H i by
where χ i, j are defined by (3), and put \(X_{j}^{1} = X_{j}\). Note that \(\bigcup _{i=1}^{L}D_{i} \subset X\), but it is possible that \(\bigcup _{i=1}^{L}D_{i}\neq X\).
Therefore, if \(D_{1},D_{2},\ldots,D_{L}\) is a partition of the sample space X, then one can define the statistical procedure δ(r) by
According to [8, 9] we define the conditional risk for multiple decision statistical procedure by
where \(E_{\theta }\) is the expectation with respect to the density function f(x, θ) and \(w(\theta ,d_{k})\) is the loss from decision \(d_{k}\) under the condition that θ is the true parameter, θ ∈ Ω. Under the additivity condition on the loss function (see [5] for more details) the conditional risk can be written as
We call a statistical procedure optimal in a class of statistical procedures if it has minimal conditional risk for all θ ∈ Ω within this class. The main result of the Lehmann theory (see [7]) states that if Eq. (8) is satisfied and the statistical procedures (4) are optimal in the class of unbiased statistical tests, then the associated multiple decision statistical procedure (6) is optimal in the class of unbiased multiple decision procedures.
3 Unbiasedness and Tests of the Neyman Structure
The class of unbiased multiple decision statistical procedures according to Lehmann [9, 10] is defined by:
Let f(r; θ) be a density from the exponential family: \(f(r;\theta ) = c(\theta )\,h(r)\exp \bigl (\sum _{j=1}^{M}\theta _{j}T_{j}(r)\bigr ),\)
where c(θ) is a function defined on the parameter space, h(r) and \(T_{j}(r)\) are functions defined on the sample space, and \(T_{j}(R)\) are the sufficient statistics for \(\theta _{j}\), j = 1, …, M.
Suppose that the generating hypotheses (2) have the form:
where the \(\theta _{j}^{0}\) are fixed. For a fixed j, the parameter \(\theta _{j}\) is called the informative or structural parameter, and the \(\theta _{k}\), k ≠ j, are called nuisance parameters. According to [9] the optimal unbiased tests for the generating hypotheses (10) are:
where t i = T i (r), i = 1, …, M and constants \(c_{j}^{1}(t_{1},\ldots,t_{j-1},t_{j+1},\ldots,t_{M})\), \(c_{j}^{2}(t_{1},\ldots,t_{j-1},t_{j+1},\ldots,t_{M})\) are defined from the equations
and
where \(f(t_{j},\theta _{j}^{0}\vert T_{ i} = t_{i},i = 1,\ldots ,M;i\neq j)\) is the density of the conditional distribution of the statistic \(T_{j}\) and \(\alpha _{j}\) is the significance level of the test.
A test satisfying (12) is said to have Neyman structure. Such a test is characterized by the fact that the conditional probability of rejecting \(H_{j}'\) (under the assumption that \(H_{j}'\) is true) is equal to \(\alpha _{j}\) on each of the surfaces \(\bigcap _{k\neq j}\{T_{k}(x) = t_{k}\}\). Therefore the multiple decision statistical procedure associated with the tests of the Neyman structure (11), (12), and (13) is optimal in the class of unbiased multiple decision procedures.
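In symbols, the defining property of a test of the Neyman structure described above can be stated as follows (our notation, assembled from the definitions in this section):

```latex
P_{\theta_j=\theta_j^{0}}\left(\delta_j(R)=\partial_j^{-1}\,\middle|\,T_k(R)=t_k,\ k\neq j\right)=\alpha_j
\quad\text{for all } (t_k)_{k\neq j}.
```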
4 Problem of Identification of Inverse Covariance Matrix
In this section we formulate the multiple decision problem for the elements of the inverse covariance matrix.
Let N be the number of stocks on a financial market and n the number of observations. Denote by \(r_{i}(t)\) the daily return of stock i on day t (i = 1, …, N; t = 1, …, n). We regard \(r_{i}(t)\) as an observation of the random variable \(R_{i}(t)\). We use the standard assumptions: the random variables \(R_{i}(t)\), t = 1, …, n, are independent and identically distributed as a random variable \(R_{i}\) (i = 1, …, N). The random vector \((R_{1},R_{2},\ldots ,R_{N})\) describes the joint behavior of the stocks. We assume that the vector \((R_{1},R_{2},\ldots ,R_{N})\) has a multivariate normal distribution with covariance matrix \(\|\sigma _{ij}\|\), where \(\sigma _{ij} = \mathrm{cov}(R_{i},R_{j}) = E(R_{i} - E(R_{i}))(R_{j} - E(R_{j}))\) and \(\rho _{ij} = \sigma _{ij}/\sqrt{\sigma _{ii}\sigma _{jj}}\), i, j = 1, …, N; \(E(R_{i})\) is the expectation of the random variable \(R_{i}\). We define the sample space as \(R^{N\times n}\) with elements \((r_{i}(t))\). The statistical estimate of \(\sigma _{ij}\) is \(s_{ij} =\sum _{ t=1}^{n}(r_{ i}(t) -\overline{r_{i}})(r_{j}(t) -\overline{r_{j}})\), where \(\overline{r_{i}} = (1/n)\sum _{t=1}^{n}r_{i}(t)\). The sample correlation between stocks i and j is defined by \(r_{ij} = s_{ij}/\sqrt{s_{ii}s_{jj}}\). It is known [11] that for a multivariate normal vector the statistics \((\overline{r_{1}},\overline{r_{2}},\ldots ,\overline{r_{N}})\) and \(\|s_{ij}\|\) (the matrix of sample covariances) are sufficient.
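As a concrete illustration, the sufficient statistics above can be computed directly from a matrix of returns; this is a minimal sketch with purely made-up return values:

```python
import numpy as np

# Illustrative data: N = 3 stocks, n = 5 daily observations r_i(t).
# Rows are stocks, columns are days; the numbers are not real market data.
returns = np.array([
    [ 0.010, -0.020, 0.000,  0.030, -0.010],   # stock 1
    [ 0.020, -0.010, 0.010,  0.020, -0.020],   # stock 2
    [-0.010,  0.000, 0.020, -0.020,  0.010],   # stock 3
])

means = returns.mean(axis=1, keepdims=True)   # \bar{r}_i
centered = returns - means
# s_ij = sum_t (r_i(t) - \bar{r}_i)(r_j(t) - \bar{r}_j)
S = centered @ centered.T
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)                        # r_ij = s_ij / sqrt(s_ii s_jj)
```

The matrix `S` is the sufficient statistic \(\|s_{ij}\|\) and `R` is the matrix of sample correlations used later in the paper.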
Let \(\sigma ^{ij}\) be the elements of the inverse covariance matrix \(\|{\sigma }^{ij}\|\). Then the problem of identification of the inverse covariance matrix can be formulated as a multiple decision problem of selecting one from the set of hypotheses:
where \(L = 2^{M}\) with M = N(N − 1)∕2.
The multiple decision problem (14) is a particular case of problem (1). The parameter space Ω is the space of positive semi-definite matrices \(\|\sigma ^{ij}\|\), and \(\Omega _{k}\) is the domain in the parameter space associated with the hypothesis \(H_{k}\) from the set (14), k = 1, …, L. For the multiple decision problem (14) we introduce the following set of generating hypotheses:
We use the following notation: \(\partial _{i,j}\) is the decision to accept the hypothesis \(h_{i,j}\) and \(\partial _{i,j}^{-1}\) is the decision to reject \(h_{i,j}\).
5 Tests of the Neyman Structure for Testing of Generating Hypothesis
Now we construct the optimal tests in the class of unbiased tests for the generating hypotheses (15). To construct these tests we use the sufficient statistics \(s_{ij}\) with the following Wishart density function [11]:
if the matrix (s kl ) is positive definite, and f({s kl }) = 0 otherwise. One has for a fixed i < j:
where
Therefore, this distribution belongs to the class of exponential distributions with parameters \(\sigma ^{kl}\).
The optimal tests of the Neyman structure (11), (12), and (13) for the generating hypotheses (15) take the form:
where the critical values are defined from the equations
where I is the interval of values of \(s_{ij}\) for which the matrix \((s_{kl})\) is positive definite, and \(\alpha _{ij}\) is the significance level of the tests.
Consider Eqs. (17) and (18). Note that \(\det (s_{kl})\) is a quadratic polynomial in \(s_{i,j}\). Let \(\det (s_{k,l}) = -C_{1}s_{i,j}^{2} + C_{2}s_{i,j} + C_{3} = C_{1}(-s_{i,j}^{2} + As_{i,j} + B)\), \(C_{1} > 0\). Then the positive definiteness of the matrix \((s_{k,l})\) for fixed \(s_{k,l}\), (k, l) ≠ (i, j), gives the following interval for the value of \(s_{i,j}\): \(s_{i,j} \in \left (\frac{A-\sqrt{A^{2}+4B}}{2},\ \frac{A+\sqrt{A^{2}+4B}}{2}\right )\).
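Numerically, this interval can be found without expanding the determinant by hand: since \(\det(s_{kl})\) is quadratic in \(s_{i,j}\), three evaluations of the determinant recover its coefficients, and the roots bound the interval. A minimal sketch (the matrix entries are illustrative and `pd_interval` is a hypothetical helper, not from the paper):

```python
import numpy as np

def pd_interval(S, i, j):
    """Interval of s_ij values keeping (s_kl) positive definite,
    all other entries held fixed. Valid when the fixed part already
    makes every principal minor not involving s_ij positive, so that
    positive definiteness reduces to det(s_kl) > 0."""
    def det_at(x):
        M = S.copy()
        M[i, j] = M[j, i] = x
        return np.linalg.det(M)
    # det is the downward parabola -C1*x^2 + C2*x + C3; fit it from
    # three trial values and take the real roots.
    coeffs = np.polyfit([-1.0, 0.0, 1.0],
                        [det_at(-1.0), det_at(0.0), det_at(1.0)], 2)
    roots = np.sort(np.roots(coeffs).real)
    return roots[0], roots[1]   # det > 0 strictly between the roots

# Illustrative positive definite matrix of sample covariances.
S = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
lo, hi = pd_interval(S, 0, 1)
```

Any value of `S[0, 1]` strictly inside `(lo, hi)` leaves the matrix positive definite.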
Now we define the functions
where K = (n − N − 2)∕2. One can calculate the critical values of \(c_{i,j}^{1},c_{i,j}^{2}\) from the equations:
The test (16) can be written in terms of sample correlations. First note that \(\det (s_{kl}) =\det (r_{kl})s_{11}s_{22}\ldots s_{NN}\) where r kl are the sample correlations. One has
where \(I = J\sqrt{s_{ii } s_{jj}}\), \(c_{i,j}^{k} = e_{i,j}^{k}\sqrt{s_{ii } s_{jj}};k = 1,2\). Therefore, the tests (16) take the form
This means that the tests of the Neyman structure for the generating hypotheses (15) do not depend on \(s_{kk}\), k ≠ i, k ≠ j. In particular, for N = 3, (i, j) = (1, 2) one has
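The identity \(\det (s_{kl}) = \det (r_{kl})\,s_{11}s_{22}\cdots s_{NN}\) used in this reduction is easy to verify numerically; a quick sketch with illustrative values:

```python
import numpy as np

# Numerical check of det(s_kl) = det(r_kl) * s_11 * ... * s_NN.
# The matrix entries below are illustrative, not data from the paper.
S = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)          # r_kl = s_kl / sqrt(s_kk * s_ll)

lhs = np.linalg.det(S)
rhs = np.linalg.det(R) * np.prod(np.diag(S))
```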
To emphasize the peculiarity of the constructed test we consider some interesting particular cases.
\(\underline{n - N - 2 = 0,\sigma _{0}^{ij}\neq 0}\). In this case expressions (20) and (21) can be simplified. Indeed one has in this case
Finally one has the system of two equations for defining constants \(c_{ij}^{1},c_{ij}^{2}\):
\(\underline{\sigma _{0} = 0}\). In this case the critical values are defined by the system of algebraic equations (22) and (23) where the functions Ψ 1, Ψ 2 are defined by
In this case the tests of the Neyman structure have the form
\(\underline{n - N - 2 = 0,\sigma _{0} = 0}\). In this case one has
6 Multiple Statistical Procedure Based on the Tests of the Neyman Structure
Now it is possible to construct the multiple decision statistical procedure for problem (14) based on the tests of the Neyman structure. The multiple decision statistical procedure (6) then takes the form:
where \(c_{ij}(\{s_{kl}\})\) are defined from the Eqs. (17) and (18).
One has \(D_{k} =\{ r \in {R}^{N\times n}:\ \delta (r) = d_{k}\}\), k = 1, 2, …, L. It is clear that
Then \(D_{1},D_{2},\ldots ,D_{L}\) is a partition of the sample space \(R^{N\times n}\). The tests of the Neyman structure for the generating hypotheses (15) are optimal in the class of unbiased tests. Therefore, if the additivity condition (8) on the loss function is satisfied, then the associated multiple decision statistical procedure is optimal. For a discussion of the additivity of the loss function see [5].
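The resulting procedure can be sketched end to end for the case \(\sigma _{0}^{ij} = 0\) (testing which elements of the inverse covariance matrix vanish). The critical value `c` below is a fixed placeholder rather than a solution of the Neyman-structure equations (17) and (18), and the function name is ours, so this is only a structural illustration of how the pairwise tests combine into one of the L decisions:

```python
import numpy as np

def identify_inverse_covariance(S, c=0.5):
    """Run one two-decision test per pair (i, j) and combine the
    outcomes into a single decision matrix: True means 'sigma^{ij} != 0'.
    The placeholder threshold c stands in for the critical values
    solving the Neyman-structure equations."""
    P = np.linalg.inv(S)                       # estimate of (sigma^{ij})
    N = S.shape[0]
    decision = np.zeros((N, N), dtype=bool)
    for i in range(N):
        for j in range(i + 1, N):
            # Scale-free statistic: the sample partial correlation.
            stat = -P[i, j] / np.sqrt(P[i, i] * P[j, j])
            decision[i, j] = decision[j, i] = abs(stat) > c
    return decision

# Illustrative covariance: stocks 1 and 2 linked, stock 3 independent.
S = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
D = identify_inverse_covariance(S)
```

The boolean matrix `D` encodes which of the \(L = 2^{M}\) hypotheses of problem (14) is selected.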
We illustrate statistical procedure (32) with an example.
Let N = 3. In this case problem (14) is the problem of the selection of one from eight hypotheses:
Generating hypotheses are:
\(h_{1,2} {:\sigma }^{12} =\sigma _{ 0}^{12}\ \mbox{ vs}\ k_{1,2} {:\sigma }^{12}\neq \sigma _{0}^{12}\), σ 13, σ 23 are the nuisance parameters.
\(h_{1,3} {:\sigma }^{13} =\sigma _{ 0}^{13}\ \mbox{ vs}\ k_{1,3} {:\sigma }^{13}\neq \sigma _{0}^{13}\), σ 12, σ 23 are the nuisance parameters.
\(h_{2,3} {:\sigma }^{23} =\sigma _{ 0}^{23}\ \mbox{ vs}\ k_{2,3} {:\sigma }^{23}\neq \sigma _{0}^{23}\), σ 12, σ 13 are the nuisance parameters.
In this case multiple statistical procedure for problem (33) (if σ 0 ≠ 0) is:
The critical values \(c_{12}^{k} = c_{12}^{k}(r_{13},r_{23},s_{11}s_{22})\), \(c_{13}^{k} = c_{13}^{k}(r_{12},r_{23},s_{11}s_{33})\), \(c_{23}^{k} = c_{23}^{k}(r_{12},r_{13},s_{22}s_{33})\); k = 1, 2 are defined from Eqs. (24) and (25). If n = 5; σ 0 ij ≠ 0; i, j = 1, 2, 3, then the critical values c ij k; k = 1, 2 are defined from (28) and (29).
If \(\sigma _{0}^{ij} = 0\) for all i, j and n = 5, then the tests (30) for the generating hypotheses depend only on the sample correlations. Therefore the corresponding multiple statistical procedure with L decisions also depends only on the sample correlations. This procedure is (34) with constants \(c_{12}^{k} = c_{12}^{k}(r_{13},r_{23}),c_{13}^{k} = c_{13}^{k}(r_{12},r_{23}),c_{23}^{k} = c_{23}^{k}(r_{12},r_{13});k = 1,2\).
In this case
where
and the critical values are
Note that in this case test (34) has a very simple form.
7 Concluding Remarks
The statistical problem of identification of the elements of the inverse covariance matrix is investigated as a multiple decision problem. A solution to this problem is developed on the basis of the Lehmann theory of multiple decision procedures and the theory of tests of the Neyman structure. It is shown that this solution is optimal in the class of unbiased multiple decision statistical procedures. The obtained results can be applied to market network analysis with partial correlations as a measure of similarity between stock returns.
References
Mantegna, R.N.: Hierarchical structure in financial markets. Eur. Phys. J. B 11, 193–197 (1999)
Boginski, V., Butenko, S., Pardalos, P.M.: On structural properties of the market graph. In: Nagurney, A. (ed.) Innovations in Financial and Economic Networks, pp. 29–45. Edward Elgar Publishing, Northampton (2003)
Boginski, V., Butenko, S., Pardalos, P.M.: Statistical analysis of financial networks. Comput. Stat. Data Anal. 48, 431–443 (2005)
Tumminello, M., Aste, T., Di Matteo, T., Mantegna, R.N.: A tool for filtering information in complex systems. Proc. Natl. Acad. Sci. USA 102(30), 10421–10426 (2005)
Koldanov, A.P., Koldanov, P.A., Kalyagin, V.A., Pardalos, P.M.: Statistical procedures for the market graph construction. Comput. Stat. Data Anal. 68, 17–29 (2013). DOI: 10.1016/j.csda.2013.06.005
Hero, A., Rajaratnam, B.: Hub discovery in partial correlation graphical models. IEEE Trans. Inf. Theory 58(9), 6064–6078 (2012)
Lehmann, E.L.: A theory of some multiple decision procedures 1. Ann. Math. Stat. 28, 1–25 (1957)
Wald, A.: Statistical Decision Functions. Wiley, New York (1950)
Lehmann, E.L., Romano, J.P.: Testing Statistical Hypotheses. Springer, New York (2005)
Lehmann, E.L.: A general concept of unbiasedness. Ann. Math. Stat. 22, 587–597 (1951)
Anderson, T.W.: An Introduction to Multivariate Statistical Analysis, 3rd edn. Wiley-Interscience, New York (2003)
Acknowledgements
The authors are partly supported by National Research University, Higher School of Economics, Russian Federation Government grant, N. 11.G34.31.0057
© 2014 Springer Science+Business Media New York
Koldanov, A.P., Koldanov, P.A. (2014). Optimal Multiple Decision Statistical Procedure for Inverse Covariance Matrix. In: Demyanov, V., Pardalos, P., Batsyn, M. (eds) Constructive Nonsmooth Analysis and Related Topics. Springer Optimization and Its Applications, vol 87. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8615-2_13