Abstract
A multiple decision statistical problem for the elements of the inverse covariance matrix is considered. The associated optimal unbiased multiple decision statistical procedure is given. This procedure is constructed using the Lehmann theory of multiple decision statistical procedures and conditional tests of the Neyman structure. The equations for calculating the thresholds of the tests of the Neyman structure are analyzed.
Keywords
- Inverse covariance matrix
- Tests of the Neyman structure
- Multiple decision statistical procedure
- Generating hypothesis
1 Introduction
A market network is constructed by means of some similarity measure between every pair of stocks. The most popular measure of similarity between stocks of a market is the correlation between them [1–4]. The analysis of methods of market graph construction [2] from the statistical point of view was started in [5], where a multiple decision statistical procedure for market graph construction based on the Pearson test was suggested. The authors of [5] note that a procedure of this type can be made optimal in the class of unbiased multiple decision statistical procedures if one uses the tests of the Neyman structure for the generating hypotheses. In the present paper we use partial correlations as a measure of similarity between stocks. In this case the elements of the inverse covariance matrix are the weights of links between stocks in the market network [6].
The main goal of the paper is to investigate the problem of identification of the inverse covariance matrix as a multiple decision statistical problem. As a result, an optimal unbiased multiple decision statistical procedure for identification of the inverse covariance matrix is given. This procedure is constructed using the Lehmann theory of multiple decision statistical procedures and conditional tests of the Neyman structure. In addition, the equations for calculating the thresholds of the tests of the Neyman structure are analyzed.
The paper is organized as follows. In Sect. 2 we briefly recall the Lehmann theory of multiple decision statistical procedures. In Sect. 3 we describe the tests of the Neyman structure. In Sect. 4 we formulate the multiple decision problem for identification of the inverse covariance matrix. In Sect. 5 we construct and study the optimal tests for the generating hypotheses for the elements of the inverse covariance matrix. In Sect. 6 we construct the multiple statistical procedure and consider some particular cases. In Sect. 7 we summarize the main results of the paper.
2 Lehmann Multiple Decision Theory
In this section we recall for the sake of completeness the basic idea of the Lehmann theory following the paper [5].
Suppose that the distribution of a random vector R belongs to the family \(\{f(r,\theta):\theta \in \Omega \}\), where θ is a parameter, Ω is the parametric space, and r is an observation of R. We need to construct a statistical procedure for selecting one from a set of L hypotheses, which in the general case can be stated as:
The most general theory of multiple decision procedures is the Lehmann theory [7]. The Lehmann theory is based on three concepts: generating hypotheses; generating hypothesis testing and compatibility conditions; and the additivity condition for the loss function.
The multiple decision problem (1) of selecting one from the set of L hypotheses \(H_{i}:\theta \in \Omega _{i}\), i = 1, …, L, is equivalent to a family of M two-decision problems:
with
This equivalence is given by the relations:
where
Hypotheses \(H_{j}'\), j = 1, …, M, are called generating hypotheses for problem (1).
The equivalence between problem (1) and the family of problems (2) reduces the multiple decision problem to the testing of generating hypotheses. Any statistical procedure δ j for testing hypothesis \(H_{j}'\) can be written in the following form: \(\delta _{j}(r) =\partial _{j}\) if \(r \in X_{j}\), and \(\delta _{j}(r) =\partial _{j}^{-1}\) if \(r \in X_{j}^{-1}\),
where \(\partial _{j}\) is the decision of acceptance of \(H_{j}'\) and \(\partial _{j}^{-1}\) is the decision of acceptance of \(K_{j}'\); \(X_{j}\) is the acceptance region of \(H_{j}'\) and \(X_{j}^{-1}\) is the acceptance region of \(K_{j}'\) (the rejection region of \(H_{j}'\)) in the sample space. One has \(X_{j} \cap X_{j}^{-1} = \emptyset \), \(X_{j} \cup X_{j}^{-1} = X\), X being the sample space.
Define the acceptance region for H i by
where χ i, j are defined by (3), and put \(X_{j}^{1} = X_{j}\). Note that \(\bigcup _{i=1}^{L}D_{i} \subset X\), but it is possible that \(\bigcup _{i=1}^{L}D_{i}\neq X\).
Therefore, if \(D_{1},D_{2},\ldots,D_{L}\) is a partition of the sample space X, then one can define the statistical procedure δ(r) by
According to [8, 9] we define the conditional risk for multiple decision statistical procedure by
where \(E_{\theta }\) is the expectation with respect to the density function f(x, θ) and \(w(\theta ,d_{k})\) is the loss from decision \(d_{k}\) under the condition that θ is the true parameter, θ ∈ Ω. Under the additivity condition on the loss function (see [5] for more details) the conditional risk can be written as
We call a statistical procedure optimal in a class of statistical procedures if it has minimal conditional risk for all θ ∈ Ω within this class. The main result of the Lehmann theory (see [7]) states that if Eq. (8) is satisfied and the statistical procedures (4) are optimal in the class of unbiased statistical tests, then the associated multiple decision statistical procedure (6) is optimal in the class of unbiased multiple decision procedures.
3 Unbiasedness and Tests of the Neyman Structure
The class of unbiased multiple decision statistical procedures according to Lehmann [9, 10] is defined by:
Let f(r; θ) be a density from the exponential family: \(f(r;\theta ) = c(\theta )\,h(r)\exp \bigl (\sum _{j=1}^{M}\theta _{j}T_{j}(r)\bigr ),\)
where c(θ) is a function defined on the parameter space, h(r) and \(T_{j}(r)\) are functions defined on the sample space, and \(T_{j}(R)\) are the sufficient statistics for \(\theta _{j}\), j = 1, …, M.
Suppose that the generating hypotheses (2) have the form:
where the \(\theta _{j}^{0}\) are fixed. For a fixed j, the parameter \(\theta _{j}\) is called the informative or structural parameter, and the \(\theta _{k}\), k ≠ j, are called nuisance parameters. According to [9] the optimal unbiased tests for the generating hypotheses (10) are:
where t i = T i (r), i = 1, …, M and constants \(c_{j}^{1}(t_{1},\ldots,t_{j-1},t_{j+1},\ldots,t_{M})\), \(c_{j}^{2}(t_{1},\ldots,t_{j-1},t_{j+1},\ldots,t_{M})\) are defined from the equations
and
where \(f(t_{j},\theta _{j}^{0}\vert T_{ i} = t_{i},i = 1,\ldots ,M;i\neq j)\) is the density of the conditional distribution of the statistic \(T_{j}\) and \(\alpha _{j}\) is the significance level of the test.
A test satisfying (12) is said to have Neyman structure. Such a test is characterized by the fact that the conditional probability of rejecting \(H_{j}'\) (under the assumption that \(H_{j}'\) is true) is equal to \(\alpha _{j}\) on each of the surfaces \(\bigcap _{k\neq j}\{T_{k}(x) = t_{k}\}\). Therefore the multiple decision statistical procedure associated with the tests of the Neyman structure (11), (12), and (13) is optimal in the class of unbiased multiple decision procedures.
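In symbols, the defining property of a test of the Neyman structure described above can be stated as follows (our notation, assembled from the definitions in this section):

```latex
P_{\theta_j=\theta_j^{0}}\left(\delta_j(R)=\partial_j^{-1}\,\middle|\,T_k(R)=t_k,\ k\neq j\right)=\alpha_j
\quad\text{for all } (t_k)_{k\neq j}.
```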
4 Problem of Identification of Inverse Covariance Matrix
In this section we formulate the multiple decision problem for the elements of the inverse covariance matrix.
Let N be the number of stocks on a financial market and n the number of observations. Denote by \(r_{i}(t)\) the daily return of stock i on day t (i = 1, …, N; t = 1, …, n). We regard \(r_{i}(t)\) as an observation of the random variable \(R_{i}(t)\). We use the standard assumptions: the random variables \(R_{i}(t)\), t = 1, …, n, are independent and identically distributed as a random variable \(R_{i}\) (i = 1, …, N). The random vector \((R_{1},R_{2},\ldots ,R_{N})\) describes the joint behavior of the stocks. We assume that the vector \((R_{1},R_{2},\ldots ,R_{N})\) has a multivariate normal distribution with covariance matrix \(\|\sigma _{ij}\|\), where \(\sigma _{ij} = \mathrm{cov}(R_{i},R_{j}) = E(R_{i} - E(R_{i}))(R_{j} - E(R_{j}))\) and \(\rho _{ij} = \sigma _{ij}/\sqrt{\sigma _{ii}\sigma _{jj}}\), i, j = 1, …, N; \(E(R_{i})\) is the expectation of the random variable \(R_{i}\). We define the sample space as \(R^{N\times n}\) with elements \((r_{i}(t))\). The statistical estimate of \(\sigma _{ij}\) is \(s_{ij} =\sum _{ t=1}^{n}(r_{ i}(t) -\overline{r_{i}})(r_{j}(t) -\overline{r_{j}})\), where \(\overline{r_{i}} = (1/n)\sum _{t=1}^{n}r_{i}(t)\). The sample correlation between stocks i and j is defined by \(r_{ij} = s_{ij}/\sqrt{s_{ii}s_{jj}}\). It is known [11] that for a multivariate normal vector the statistics \((\overline{r_{1}},\overline{r_{2}},\ldots ,\overline{r_{N}})\) and \(\|s_{ij}\|\) (the matrix of sample covariances) are sufficient.
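As a concrete illustration, the sufficient statistics above can be computed directly from a matrix of returns; this is a minimal sketch with purely made-up return values:

```python
import numpy as np

# Illustrative data: N = 3 stocks, n = 5 daily observations r_i(t).
# Rows are stocks, columns are days; the numbers are not real market data.
returns = np.array([
    [ 0.010, -0.020, 0.000,  0.030, -0.010],   # stock 1
    [ 0.020, -0.010, 0.010,  0.020, -0.020],   # stock 2
    [-0.010,  0.000, 0.020, -0.020,  0.010],   # stock 3
])

means = returns.mean(axis=1, keepdims=True)   # \bar{r}_i
centered = returns - means
# s_ij = sum_t (r_i(t) - \bar{r}_i)(r_j(t) - \bar{r}_j)
S = centered @ centered.T
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)                        # r_ij = s_ij / sqrt(s_ii s_jj)
```

The matrix `S` is the sufficient statistic \(\|s_{ij}\|\) and `R` is the matrix of sample correlations used later in the paper.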
Let \(\sigma ^{ij}\) be the elements of the inverse covariance matrix \(\|{\sigma }^{ij}\|\). Then the problem of identification of the inverse covariance matrix can be formulated as a multiple decision problem of selecting one from the set of hypotheses:
where \(L = 2^{M}\) with M = N(N − 1)∕2.
The multiple decision problem (14) is a particular case of problem (1). The parameter space Ω is the space of positive semi-definite matrices \(\|\sigma ^{ij}\|\), and \(\Omega _{k}\) is the domain in the parameter space associated with the hypothesis \(H_{k}\) from the set (14), k = 1, …, L. For the multiple decision problem (14) we introduce the following set of generating hypotheses:
We use the following notation: \(\partial _{i,j}\) is the decision to accept the hypothesis \(h_{i,j}\) and \(\partial _{i,j}^{-1}\) is the decision to reject \(h_{i,j}\).
5 Tests of the Neyman Structure for Testing of Generating Hypothesis
Now we construct the optimal tests in the class of unbiased tests for the generating hypotheses (15). To construct these tests we use the sufficient statistics \(s_{ij}\) with the following Wishart density function [11]:
if the matrix (s kl ) is positive definite, and f({s kl }) = 0 otherwise. One has for a fixed i < j:
where
Therefore, this distribution belongs to the class of exponential distributions with parameters \(\sigma ^{kl}\).
The optimal tests of the Neyman structure (11), (12), and (13) for the generating hypotheses (15) take the form:
where the critical values are defined from the equations
where I is the interval of values of \(s_{ij}\) for which the matrix \((s_{kl})\) is positive definite, and \(\alpha _{ij}\) is the significance level of the tests.
Consider Eqs. (17) and (18). Note that \(\det (s_{kl})\) is a quadratic polynomial in \(s_{i,j}\). Let \(\det (s_{k,l}) = -C_{1}s_{i,j}^{2} + C_{2}s_{i,j} + C_{3} = C_{1}(-s_{i,j}^{2} + As_{i,j} + B)\), \(C_{1} > 0\). Then the positive definiteness of the matrix \((s_{k,l})\) for fixed \(s_{k,l}\), (k, l) ≠ (i, j), gives the following interval for the value of \(s_{i,j}\): \(s_{i,j} \in \left (\frac{A-\sqrt{A^{2}+4B}}{2},\ \frac{A+\sqrt{A^{2}+4B}}{2}\right )\).
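Numerically, this interval can be found without expanding the determinant by hand: since \(\det(s_{kl})\) is quadratic in \(s_{i,j}\), three evaluations of the determinant recover its coefficients, and the roots bound the interval. A minimal sketch (the matrix entries are illustrative and `pd_interval` is a hypothetical helper, not from the paper):

```python
import numpy as np

def pd_interval(S, i, j):
    """Interval of s_ij values keeping (s_kl) positive definite,
    all other entries held fixed. Valid when the fixed part already
    makes every principal minor not involving s_ij positive, so that
    positive definiteness reduces to det(s_kl) > 0."""
    def det_at(x):
        M = S.copy()
        M[i, j] = M[j, i] = x
        return np.linalg.det(M)
    # det is the downward parabola -C1*x^2 + C2*x + C3; fit it from
    # three trial values and take the real roots.
    coeffs = np.polyfit([-1.0, 0.0, 1.0],
                        [det_at(-1.0), det_at(0.0), det_at(1.0)], 2)
    roots = np.sort(np.roots(coeffs).real)
    return roots[0], roots[1]   # det > 0 strictly between the roots

# Illustrative positive definite matrix of sample covariances.
S = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
lo, hi = pd_interval(S, 0, 1)
```

Any value of `S[0, 1]` strictly inside `(lo, hi)` leaves the matrix positive definite.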
Now we define the functions
where K = (n − N − 2)∕2. One can calculate the critical values of \(c_{i,j}^{1},c_{i,j}^{2}\) from the equations:
The test (16) can be written in terms of sample correlations. First note that \(\det (s_{kl}) =\det (r_{kl})s_{11}s_{22}\ldots s_{NN}\) where r kl are the sample correlations. One has
where \(I = J\sqrt{s_{ii } s_{jj}}\), \(c_{i,j}^{k} = e_{i,j}^{k}\sqrt{s_{ii } s_{jj}};k = 1,2\). Therefore, the tests (16) take the form
This means that the tests of the Neyman structure for the generating hypotheses (15) do not depend on \(s_{kk}\), k ≠ i, k ≠ j. In particular, for N = 3, (i, j) = (1, 2) one has
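The identity \(\det (s_{kl}) = \det (r_{kl})\,s_{11}s_{22}\cdots s_{NN}\) used in this reduction is easy to verify numerically; a quick sketch with illustrative values:

```python
import numpy as np

# Numerical check of det(s_kl) = det(r_kl) * s_11 * ... * s_NN.
# The matrix entries below are illustrative, not data from the paper.
S = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)          # r_kl = s_kl / sqrt(s_kk * s_ll)

lhs = np.linalg.det(S)
rhs = np.linalg.det(R) * np.prod(np.diag(S))
```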
To emphasize the peculiarity of the constructed test we consider some interesting particular cases.
\(\underline{n - N - 2 = 0,\sigma _{0}^{ij}\neq 0}\). In this case expressions (20) and (21) can be simplified. Indeed one has in this case
Finally one has the system of two equations for defining constants \(c_{ij}^{1},c_{ij}^{2}\):
\(\underline{\sigma _{0} = 0}\). In this case the critical values are defined by the system of algebraic equations (22) and (23) where the functions Ψ 1, Ψ 2 are defined by
In this case the tests of the Neyman structure have the form
\(\underline{n - N - 2 = 0,\sigma _{0} = 0}\). In this case one has
6 Multiple Statistical Procedure Based on the Tests of the Neyman Structure
Now it is possible to construct the multiple decision statistical procedure for problem (14) based on the tests of the Neyman structure. The multiple decision statistical procedure (6) then takes the form:
where \(c_{ij}(\{s_{kl}\})\) are defined from the Eqs. (17) and (18).
One has \(D_{k} =\{ r \in {R}^{N\times n}:\ \delta (r) = d_{k}\}\), k = 1, 2, …, L. It is clear that
Then \(D_{1},D_{2},\ldots ,D_{L}\) is a partition of the sample space \(R^{N\times n}\). The tests of the Neyman structure for the generating hypotheses (15) are optimal in the class of unbiased tests. Therefore, if the additivity condition (8) on the loss function is satisfied, then the associated multiple decision statistical procedure is optimal. For a discussion of the additivity of the loss function see [5].
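The resulting procedure can be sketched end to end for the case \(\sigma _{0}^{ij} = 0\) (testing which elements of the inverse covariance matrix vanish). The critical value `c` below is a fixed placeholder rather than a solution of the Neyman-structure equations (17) and (18), and the function name is ours, so this is only a structural illustration of how the pairwise tests combine into one of the L decisions:

```python
import numpy as np

def identify_inverse_covariance(S, c=0.5):
    """Run one two-decision test per pair (i, j) and combine the
    outcomes into a single decision matrix: True means 'sigma^{ij} != 0'.
    The placeholder threshold c stands in for the critical values
    solving the Neyman-structure equations."""
    P = np.linalg.inv(S)                       # estimate of (sigma^{ij})
    N = S.shape[0]
    decision = np.zeros((N, N), dtype=bool)
    for i in range(N):
        for j in range(i + 1, N):
            # Scale-free statistic: the sample partial correlation.
            stat = -P[i, j] / np.sqrt(P[i, i] * P[j, j])
            decision[i, j] = decision[j, i] = abs(stat) > c
    return decision

# Illustrative covariance: stocks 1 and 2 linked, stock 3 independent.
S = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
D = identify_inverse_covariance(S)
```

The boolean matrix `D` encodes which of the \(L = 2^{M}\) hypotheses of problem (14) is selected.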
We illustrate statistical procedure (32) with an example.
Let N = 3. In this case problem (14) is the problem of the selection of one from eight hypotheses:
Generating hypotheses are:
\(h_{1,2} {:\sigma }^{12} =\sigma _{ 0}^{12}\ \mbox{ vs}\ k_{1,2} {:\sigma }^{12}\neq \sigma _{0}^{12}\), σ 13, σ 23 are the nuisance parameters.
\(h_{1,3} {:\sigma }^{13} =\sigma _{ 0}^{13}\ \mbox{ vs}\ k_{1,3} {:\sigma }^{13}\neq \sigma _{0}^{13}\), σ 12, σ 23 are the nuisance parameters.
\(h_{2,3} {:\sigma }^{23} =\sigma _{ 0}^{23}\ \mbox{ vs}\ k_{2,3} {:\sigma }^{23}\neq \sigma _{0}^{23}\), σ 12, σ 13 are the nuisance parameters.
In this case multiple statistical procedure for problem (33) (if σ 0 ≠ 0) is:
The critical values \(c_{12}^{k} = c_{12}^{k}(r_{13},r_{23},s_{11}s_{22})\), \(c_{13}^{k} = c_{13}^{k}(r_{12},r_{23},s_{11}s_{33})\), \(c_{23}^{k} = c_{23}^{k}(r_{12},r_{13},s_{22}s_{33})\); k = 1, 2 are defined from Eqs. (24) and (25). If n = 5; σ 0 ij ≠ 0; i, j = 1, 2, 3, then the critical values c ij k; k = 1, 2 are defined from (28) and (29).
If \(\sigma _{0}^{ij} = 0\) for all i, j and n = 5, then the tests (30) for the generating hypotheses depend only on the sample correlations. Therefore the corresponding multiple statistical procedure with L decisions also depends only on the sample correlations. This procedure is (34) with constants \(c_{12}^{k} = c_{12}^{k}(r_{13},r_{23}),c_{13}^{k} = c_{13}^{k}(r_{12},r_{23}),c_{23}^{k} = c_{23}^{k}(r_{12},r_{13});k = 1,2\).
In this case
where
and the critical values are
Note that in this case test (34) has a very simple form.
7 Concluding Remarks
The statistical problem of identification of the elements of the inverse covariance matrix is investigated as a multiple decision problem. A solution to this problem is developed on the basis of the Lehmann theory of multiple decision procedures and the theory of tests of the Neyman structure. It is shown that this solution is optimal in the class of unbiased multiple decision statistical procedures. The obtained results can be applied to market network analysis with partial correlations as a measure of similarity between stock returns.
References
Mantegna, R.N.: Hierarchical structure in financial markets. Eur. Phys. J. B 11, 193–197 (1999)
Boginski, V., Butenko, S., Pardalos, P.M.: On structural properties of the market graph. In: Nagurney, A. (ed.) Innovations in Financial and Economic Networks, pp. 29–45. Edward Elgar Publishing, Northampton (2003)
Boginski, V., Butenko, S., Pardalos, P.M.: Statistical analysis of financial networks. Comput. Stat. Data Anal. 48, 431–443 (2005)
Tumminello, M., Aste, T., Di Matteo, T., Mantegna, R.N.: A tool for filtering information in complex systems. Proc. Natl. Acad. Sci. USA 102(30), 10421–10426 (2005)
Koldanov, A.P., Koldanov, P.A., Kalyagin, V.A., Pardalos, P.M.: Statistical procedures for the market graph construction. Comput. Stat. Data Anal. 68, 17–29 (2013). DOI: 10.1016/j.csda.2013.06.005
Hero, A., Rajaratnam, B.: Hub discovery in partial correlation graphical models. IEEE Trans. Inf. Theory 58(9), 6064–6078 (2012)
Lehmann, E.L.: A theory of some multiple decision procedures 1. Ann. Math. Stat. 28, 1–25 (1957)
Wald, A.: Statistical Decision Functions. Wiley, New York (1950)
Lehmann, E.L., Romano, J.P.: Testing Statistical Hypotheses. Springer, New York (2005)
Lehmann, E.L.: A general concept of unbiasedness. Ann. Math. Stat. 22, 587–597 (1951)
Anderson, T.W.: An Introduction to Multivariate Statistical Analysis, 3rd edn. Wiley-Interscience, New York (2003)
Acknowledgements
The authors are partly supported by National Research University, Higher School of Economics, Russian Federation Government grant, N. 11.G34.31.0057
© 2014 Springer Science+Business Media New York
Koldanov, A.P., Koldanov, P.A. (2014). Optimal Multiple Decision Statistical Procedure for Inverse Covariance Matrix. In: Demyanov, V., Pardalos, P., Batsyn, M. (eds) Constructive Nonsmooth Analysis and Related Topics. Springer Optimization and Its Applications, vol 87. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8615-2_13