1 Statistics in China Before 1980’s

Professor Pao-Lu Hsu (1910–1970) is generally considered the founder of probability and statistics in China. Prof. Hsu was the first teacher to offer courses in probability and statistics at the old “Southwest United University” in Kunming, China, in the 1940s during the Sino-Japanese War in World War II [9]. Prof. Hsu finished his Ph.D. study at University College London in the United Kingdom in 1938 and pursued his research in the United States during the 1940s. He returned to Peking University in 1947 and taught there for more than 20 years. In the late 1950s, Kai-Tai Fang was one of Prof. Hsu’s students in a series of seminar classes in probability and statistics [32]. From then on, Kai-Tai Fang developed a growing interest in statistics and devoted his lifetime career to it. During the early development of probability and statistics in China, from the early 1960s until the 1980s, probability and statistics were treated as a small unit within the mathematics departments of Chinese universities. Because of the serious shortage of teachers in probability and statistics, professors in this small unit mainly focused on teaching before the 1980s. Prof. Kai-Tai Fang was one of the very few lecturers who insisted on doing research in the old “closed society” before 1977, although his research topics were mainly focused on applications of statistics in industry, see, for example [18, 34, 35, 58, 59, 99]. The year 1977 was the most memorable year in the history of higher education in China since 1949: the whole of academia reopened after the 10-year “Cultural Revolution”. Since 1977, both theoretical and applied research in all areas of science and technology has been highly recognized in the academic institutions of China. Owing to his unceasing efforts in pursuing probability and statistics research during the lost ten years, Prof. Fang became one of the leading researchers in mathematical statistics and its applications in various areas in China. While his research in both theoretical and applied statistics continued in the 1970s [19, 20, 36, 37, 100, 118], Prof. Fang also collaborated with his colleagues in writing statistics textbooks to meet the urgent need for statistical education in China in the late 1970s [37], (Fang et al. 1979).

The last three years of the 1970s (1977, 1978 and 1979) are generally considered a period of academic revival of Chinese higher education after the ten-year “Cultural Revolution” (1966–1976). Many scientists and researchers showed unprecedented energy in pursuing new knowledge and accomplishments after being forced out of academic life for ten years. Prof. Fang belonged to the small group of researchers who could devote most of their time to research and who never stopped pursuing their research directions. The strong basis laid by Prof. Pao-Lu Hsu’s seminar classes supported Fang’s research throughout the early years of his statistical career. The foundational statistical knowledge gained in Prof. Hsu’s classes, together with his never-give-up ambition to pursue a high-quality statistical career, equipped Fang with inexhaustible resources for his later years as a highly productive statistician and a well-known statistical educator. Sections 2, 3 and 4 introduce Prof. Fang’s creative contributions to multivariate statistics in the last two decades of the 20th century.

2 Multivariate Analysis and Generalized Multivariate Statistics

After Prof. Pao-Lu Hsu opened the door of statistics to young Chinese statisticians and led them into the realm of classical statistics in the late 1950s, a number of Prof. Hsu’s students matured as researchers in the late 1970s. Among these students, Prof. Yao-Ting Zhang (1933–2007) [31] and Prof. Kai-Tai Fang made significant contributions to carrying forward Prof. Hsu’s major ideas in multivariate statistical analysis and its applications in various areas. Profs. Zhang and Fang not only produced prolific research of their own but also trained a large number of graduate students and statistical practitioners from various institutions of Chinese higher education. Facing an almost complete lack of statistical textbooks and readings in Chinese higher education in the late 1970s and the early 1980s, Profs. Zhang and Fang, in cooperation with their colleagues, published several urgently needed statistical textbooks for college and postgraduate students beginning their study of statistics, for example [29, 38, 62, 63, 68, 70–72, 101, 123]. These early statistical textbooks and readings went a long way toward meeting the urgent needs of students in Chinese higher education throughout the 1980s. By training graduate students and organizing statistical seminars and workshops in various directions, Prof. Fang took the leading role in developing new research directions in multivariate statistics and statistical education during the last twenty years of the 20th century. Through their productive research and comprehensive statistical education, Profs. Fang and Zhang opened the eyes of numerous young Chinese statisticians to the realm of modern multivariate statistics and its applications. Profs. Zhang and Fang are generally considered the pioneers and initiators of multivariate statistics and statistical education in China in the 20th century after Prof. Pao-Lu Hsu.

2.1 Development of the Theory of Elliptically Contoured Distributions

With the rapid development of statistical science and computer science in the last two or three decades of the 20th century, classical statistics under the normal assumption could no longer meet the needs of high-dimensional data analysis. Statisticians around the world have long recognized that real data are often fat-tailed. Normal-theory-based statistical methods become questionable when applied to this kind of data. Modern computer technology and algorithms make it possible to analyze large amounts of high-dimensional data beyond the classical normal assumption. Some early research on extending normal-theory-based statistical methods to a wider class of probability distributions, called the elliptically contoured distributions (ECD for simplicity), includes [2–8, 11–15, 28, 33, 40, 55, 69, 71, 73–75, 112, 113, 122, 124].

The stochastic representation method plays an important role in the development of the theory on ECD. For example, the p-dimensional normal distribution \(N_p(\varvec{\mu },\varvec{\varSigma })\) has a stochastic representation

$$\begin{aligned} \varvec{x}{\mathop {=}\limits ^\mathrm{d}}\varvec{\mu }+\varvec{A}\varvec{y}, \end{aligned}$$
(1.1)

where \(\varvec{A}\varvec{A}'=\varvec{\varSigma }\), \(\varvec{y}\) has the standard normal distribution \(N_p(\mathbf {0},\varvec{I})\) (\(\varvec{I}\) stands for the identity matrix), and “\({\mathop {=}\limits ^\mathrm{d}}\)” denotes that the two sides of the equality have the same probability distribution. Equation (1.1) is called the stochastic representation of the multivariate normal distribution. Note that for any constant \(p\times p\) orthogonal matrix \(\varvec{\varGamma }\), it is always true that \(\varvec{\varGamma }\varvec{y}{\mathop {=}\limits ^\mathrm{d}}\varvec{y}\) for \(\varvec{y}\sim N_p(\mathbf {0},\varvec{I})\). The probability distribution of \(\varvec{y}\) is said to have rotational invariance or spherical symmetry. The idea of spherical symmetry can be extended to the general case by defining a family of random vectors satisfying spherical symmetry:

$$\begin{aligned} {\mathscr {S}}_p(\phi )=\{\varvec{x}:\ \varvec{\varGamma }\varvec{x}{\mathop {=}\limits ^\mathrm{d}}\varvec{x}\ \text { for any constant }p\times p \text { orthogonal matrix }\varvec{\varGamma }\}, \end{aligned}$$
(1.2)

where \(\phi (\cdot )\) stands for the characteristic function of a distribution. \({\mathscr {S}}_p(\phi )\) is called the family of spherically symmetric distributions, or simply spherical distributions. It is obvious that \({\mathscr {S}}_p(\phi )\) includes the standard normal distribution \(N_p(\mathbf {0},\varvec{I})\) and some commonly known multivariate distributions such as the multivariate Student t-distribution with zero mean and identity covariance matrix. It is known that \(\varvec{x}\in {\mathscr {S}}_p(\phi )\) if and only if

$$\begin{aligned} \varvec{x}{\mathop {=}\limits ^\mathrm{d}}R\mathbf {u}^{(p)}, \end{aligned}$$
(1.3)

where \(\mathbf {u}^{(p)}\) stands for a random vector uniformly distributed on the surface of the unit sphere in \(R^p\) (the p-dimensional real space), that is, \(\Vert \mathbf {u}^{(p)}\Vert =1\) (\(\Vert \cdot \Vert \) stands for the usual Euclidean norm), and \(R>0\) is a random variable that is independent of \(\mathbf {u}^{(p)}\). Equation (1.3) is called the stochastic representation for a spherical distribution. For any nontrivial \(\varvec{x}\in {\mathscr {S}}_p(\phi )\) with \(P(\varvec{x}=\mathbf {0})=0\), it is always true that

$$\begin{aligned} \varvec{x}{\mathop {=}\limits ^\mathrm{d}}\Vert \varvec{x}\Vert \cdot \frac{\varvec{x}}{\Vert \varvec{x}\Vert }, \end{aligned}$$
(1.4)

where \(\Vert \varvec{x}\Vert \) and \(\varvec{x}/\Vert \varvec{x}\Vert \) are independent, and \(\varvec{x}/\Vert \varvec{x}\Vert {\mathop {=}\limits ^\mathrm{d}}\mathbf {u}^{(p)}\).
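
The stochastic representation (1.3) also gives a direct recipe for simulation. The following Python sketch is only an illustration: the function names are hypothetical, and it relies on the standard fact that a normalized vector of i.i.d. \(N(0,1)\) variables is uniformly distributed on the unit sphere. Taking \(R\) to be the square root of a chi-square variable with p degrees of freedom recovers the standard normal distribution; other choices of the law of \(R\) would give other members of \({\mathscr {S}}_p(\phi )\).

```python
import math
import random


def uniform_on_sphere(p, rng):
    """Draw u^(p): normalize a vector of i.i.d. N(0, 1) variables."""
    while True:
        y = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(v * v for v in y))
        if norm > 0.0:  # guard against the (probability-zero) zero vector
            return [v / norm for v in y]


def chi_radius(p, rng):
    """Radius law for the standard normal: sqrt of a chi-square(p) variable."""
    return math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p)))


def spherical_sample(p, draw_radius, rng):
    """Return x = R * u as in (1.3), with R drawn independently of u."""
    r = draw_radius(p, rng)
    u = uniform_on_sphere(p, rng)
    return [r * v for v in u]
```

A call such as `spherical_sample(3, chi_radius, random.Random(0))` returns one draw; passing a different `draw_radius` function (an assumed interface taking `p` and `rng`) changes the member of the spherical family being sampled.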

Equation (1.1) is a linear transformation of the standard normal \(N_p(\mathbf {0},\varvec{I})\) and gives a family of general multivariate normal distributions by choosing different linear transformations. The same idea can be applied to the distributions in \({\mathscr {S}}_p(\phi )\) and gives a bigger family of distributions:

$$\begin{aligned} ECD_p(\varvec{\mu },\varvec{\varSigma },\phi )=\{\varvec{x};\ \varvec{x}{\mathop {=}\limits ^\mathrm{d}}\varvec{\mu }+\varvec{A}\varvec{y},\ \varvec{y}\in {\mathscr {S}}_p(\phi ),\ \varvec{\mu }\in R^p,\ \varvec{A}\varvec{A}'=\varvec{\varSigma }\}. \end{aligned}$$
(1.5)

\(ECD_p(\varvec{\mu },\varvec{\varSigma },\phi )\) is called the family of elliptically contoured distributions, or simply elliptical distributions. Equation (1.1) with \(\varvec{y}\in {\mathscr {S}}_p(\phi )\) is called the stochastic representation for an elliptical distribution. One would expect an elliptical distribution to have properties similar to those of the normal distribution \(N_p(\varvec{\mu },\varvec{\varSigma })\). For example, if \(\varvec{x}\in ECD_p(\varvec{\mu },\varvec{\varSigma },\phi )\) possesses a probability density function \(f(\varvec{x})\), it must have the form

$$\begin{aligned} f(\varvec{x})=c|\varvec{\varSigma }|^{-\frac{1}{2}}g[(\varvec{x}-\varvec{\mu })'\varvec{\varSigma }^{-1}(\varvec{x}-\varvec{\mu })], \end{aligned}$$
(1.6)

where \(g(\cdot )>0\) is a scalar function and \(c>0\) is a normalizing constant. For example, if \(\varvec{x}\sim N_p(\varvec{\mu },\varvec{\varSigma })\), then \(g(x)=\exp (-x/2)\) and \(c=(2\pi )^{-p/2}\).
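
As a small numerical illustration of the density form (1.6), the sketch below evaluates a density of this type for \(p=2\), using the determinant of \(\varvec{\varSigma }\) in the scaling factor. The function names are hypothetical. The normal case uses \(g(x)=\exp (-x/2)\) with \(c=1/(2\pi )\). The Student-type generator \(g(x)=(1+x/\nu )^{-(\nu +2)/2}\), which for \(p=2\) has the same constant \(c=1/(2\pi )\), is included as an assumed second example of a fat-tailed member of the family.

```python
import math


def elliptical_density_2d(x, mu, sigma, g, c):
    """Evaluate c * |Sigma|^(-1/2) * g((x - mu)' Sigma^(-1) (x - mu)), p = 2."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0.0:
        raise ValueError("sigma must be positive definite")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form via the closed-form inverse of a 2 x 2 matrix.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return c * det ** -0.5 * g(quad)


def normal_generator(q):
    """g for the normal family: exp(-q / 2)."""
    return math.exp(-q / 2.0)


def student_generator(nu):
    """g for a bivariate Student-type family with nu degrees of freedom."""
    return lambda q: (1.0 + q / nu) ** (-(nu + 2.0) / 2.0)


C_2D = 1.0 / (2.0 * math.pi)  # normalizing constant of both examples for p = 2
```

With a diagonal \(\varvec{\varSigma }\) and `normal_generator`, the value returned agrees with the product of two univariate normal densities, which is a convenient sanity check on the formula.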

The method of stochastic representation used in Eqs. (1.1)–(1.5) plays an important role in developing the theory of ECD. Statistical inference on the mean parameter \(\varvec{\mu }\) and covariance matrix \(\varvec{\varSigma }\) in \(ECD_p(\varvec{\mu },\varvec{\varSigma },\phi )\) was developed by Fang and his collaborators; their comprehensive results are summarized in [4–6, 77]. Some goodness-of-fit methods for spherical symmetry (a subfamily of ECD) were also developed by Fang and his collaborators, for example [79, 88, 91, 92, 125, 127]. Some major approaches to testing spherical and elliptical symmetry were summarized in [56] and updated by [30].

2.2 Application of the Theory of Spherical Matrix Distributions

Prof. Fang’s contributions to multivariate analysis and generalized multivariate statistics, including papers, monographs, and textbooks, have been cited by many international researchers in developing new statistical methodologies for data analysis. For example, Läuter and his associates [80, 85–87] employed the theory of spherical matrix distributions in [77] to develop a class of exact multivariate tests for normal statistical inference. These tests provide exact solutions to multivariate normal mean comparisons and remain applicable under high dimension with a small sample size, which may be smaller than the dimension of the sample data. They extend the traditional Hotelling’s \(T^2\)-test to multiple mean comparisons as in multivariate analysis of variance (MANOVA) and to general linear tests for regression coefficients in multivariate regression models. The tests retain fair power performance even when the sample size is smaller than the dimension of the sample data, see [84].

An \(n\times p\) random matrix \({\varvec{X}}\) is said to have a left-spherical matrix distribution, denoted by \({\varvec{X}}\sim LS_{n\times p}(\phi )\), if for any constant orthogonal matrix \(\varvec{\varGamma }\) (\(n\times n\))

$$\begin{aligned} \varvec{\varGamma }{\varvec{X}}{\mathop {=}\limits ^\mathrm{d}}{\varvec{X}}. \end{aligned}$$
(1.7)

It is known that \({\varvec{X}}\sim LS_{n\times p}(\phi )\) if and only if \({\varvec{X}}\) has the stochastic representation

$$\begin{aligned} {\varvec{X}}{\mathop {=}\limits ^\mathrm{d}}{\varvec{U}}\varvec{V}, \end{aligned}$$
(1.8)

where \({\varvec{U}}\) (\(n\times p\)) is independent of \(\varvec{V}\) (\(p\times p\)) and \({\varvec{U}}\sim {\mathscr {U}}^{(n\times p)}\), which is uniformly distributed on the Stiefel manifold

$$\begin{aligned} {\mathscr {Q}}(n,p)=\{\varvec{H}_{n\times p}:\ \varvec{H}'\varvec{H}= I_p\}. \end{aligned}$$
(1.9)

If \({\varvec{X}}=(\varvec{x}_1,\dots ,\varvec{x}_n)'\) (\(n\times p\)) consists of i.i.d. observations from \(N_p(\mathbf 0 ,\varvec{\varSigma })\), then \({\varvec{X}}\sim LS_{n\times p}(\phi )\) and \({\varvec{X}}\) has a stochastic representation (1.8). For any random matrix \({\varvec{D}}_{p\times q}\) (\(q\le p\)) that is a function of \({\varvec{X}}\) through the quadratic form \({\varvec{X}}'{\varvec{X}}\), that is, \({\varvec{D}}=f({\varvec{X}}'{\varvec{X}})\), it can be proved that \({\varvec{X}}{\varvec{D}}\sim LS_{n\times q}(\phi )\). So \({\varvec{X}}{\varvec{D}}\) also has a stochastic representation similar to (1.8), say, \({\varvec{X}}{\varvec{D}}{\mathop {=}\limits ^\mathrm{d}}{\varvec{U}}\varvec{A}\), where \({\varvec{U}}\sim {\mathscr {U}}^{(n\times q)}\) is independent of \(\varvec{A}\) (\(q\times q\)). As a result of this stochastic representation, any affine-invariant statistic \(T(\cdot )\) satisfies \(T({\varvec{X}}{\varvec{D}}){\mathop {=}\limits ^\mathrm{d}}T({\varvec{U}})\), whose distribution is uniquely determined no matter how the function \({\varvec{D}}=f({\varvec{X}}'{\varvec{X}})\) is chosen. One can always choose \(q\le p\) as a dimension reduction for \({\varvec{U}}\sim {\mathscr {U}}^{(n\times q)}\). For example, letting \(q=\min (n,\,p)-1\) makes a statistic \(T({\varvec{X}}{\varvec{D}})\) applicable to the case of high dimension with a small sample size, even for \(n\le p\). This is the main idea behind the parametric tests of Läuter and his associates.
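
The uniform component \({\varvec{U}}\) in (1.8) can be simulated directly. The sketch below is a hedged illustration with hypothetical function names: it orthonormalizes the columns of an \(n\times p\) matrix of i.i.d. \(N(0,1)\) entries by the Gram-Schmidt procedure, a standard construction that yields a matrix uniformly distributed on the manifold \({\mathscr {Q}}(n,p)\) in (1.9). It is not claimed to be the procedure used in the cited papers.

```python
import math
import random


def stiefel_uniform(n, p, rng):
    """Return U (n x p, stored as a list of p columns) with U'U = I_p."""
    if p > n:
        raise ValueError("need p <= n")
    columns = []
    for _ in range(p):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the columns already built.
        for q in columns:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        columns.append([a / norm for a in v])
    return columns


def gram_matrix(columns):
    """Return U'U as a p x p nested list, for checking orthonormality."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]
```

Calling `gram_matrix(stiefel_uniform(6, 3, random.Random(0)))` should return a matrix that equals the \(3\times 3\) identity up to rounding error, which confirms the constraint in (1.9).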

By using the idea of spherical matrix distributions in [77, 85] and the approach of Läuter and his associates (1998) to constructing multivariate parametric tests, Prof. Fang led his graduate students and colleagues in developing a class of nonparametric goodness-of-fit tests for multivariate normality for the case of high dimension with a small sample size, including some graphical methods for detecting non-normality with confidence regions, and a class of tests for spherical symmetry. The representative papers are [54, 57, 91–93]. Fang’s approach to constructing multivariate tests and graphical methods for goodness-of-fit purposes was further developed by his graduate students and associates, see, for example [1, 89, 90, 94–98]. These papers are all based on the comprehensive study in [52, 77].

3 General Multivariate Symmetric and Related Distributions

Beyond the ECD are some classes of general multivariate symmetric distributions. A systematic summary of general multivariate continuous distributions dates back to [81]. Prof. Fang’s research on constructing new classes of continuous multivariate symmetric distributions and on their statistical inference started in the 1980s, see, for example [33, 69].

3.1 From Spherical Distributions to the \(l_1\)-norm Symmetric Distributions

A general continuous multivariate symmetric distribution is usually constructed as the product of a nonnegative random variable and an independent random vector uniformly distributed on the surface of a unit generalized sphere. By changing the distance measure that defines the unit generalized sphere, we can construct different families of continuous multivariate symmetric distributions. By applying a linear transformation to the stochastic representation of a general continuous multivariate symmetric distribution, one can obtain an even more general continuous multivariate symmetric distribution. For example, an elliptically contoured distribution (ECD) is obtained by applying a linear transformation to the stochastic representation of a spherically symmetric distribution. Fang and Fang [42] proposed a new family of multivariate exponential distributions. Based on this result, Fang and Fang [43] constructed different families of multivariate distributions related to the exponential distribution. We follow the notation of [43] to define the family of distributions given by

$$\begin{aligned} F_n=\{L(\varvec{z}):\ \varvec{z}{\mathop {=}\limits ^\mathrm{d}}R\mathbf {u},\ R\ge 0\ \text {is independent of}\ \mathbf {u}\}, \end{aligned}$$
(1.10)

where \(\mathbf {u}=(U_1,\ldots ,U_n)'\) is uniformly distributed on the \(l_1\)-norm unit sphere constrained to the positive quadrant

$$\begin{aligned} {\mathscr {S}}^1_+=\{\varvec{z}=(z_1,\ldots ,z_n)':\ z_i\ge 0\ (i=1,\ldots ,n),\ \Vert \varvec{z}\Vert _1=\sum _{i=1}^nz_i=1\}, \end{aligned}$$
(1.11)

where \(\Vert \varvec{z}\Vert _1=\sum _{i=1}^nz_i\) is called the \(l_1\)-norm of \(\varvec{z}\) with nonnegative components. Fang and Fang [43] proved that for any \(\varvec{z}=(Z_1,\ldots ,Z_n)'\in F_n\), its survival function

$$\begin{aligned} P(Z_1>z_1,\ldots ,Z_n>z_n) \end{aligned}$$
(1.12)

only depends on the \(l_1\)-norm \(\Vert \varvec{z}\Vert _1=\sum _{i=1}^nz_i\). As a result, a new family of distributions can be constructed:

$$\begin{aligned} T_n=\{L(\varvec{z}):\ \varvec{z}=(Z_1,\ldots ,Z_n)'\in R^n_+,\ P(Z_1>z_1,\ldots ,Z_n>z_n)=h(\Vert \varvec{z}\Vert _1)\}, \end{aligned}$$
(1.13)

where \(R^n_+=\{\varvec{z}=(z_1,\ldots ,z_n)':\ z_i\ge 0\ (i=1,\ldots ,n)\}\). Fang and Fang [43] proved that \(T_n\) contains a subfamily of symmetric multivariate distributions:

$$\begin{aligned} \begin{array}{rl} D_{n,\infty }=&{}\{L(\varvec{z}):\ \varvec{z}{\mathop {=}\limits ^\mathrm{d}}R\varvec{x},\ R\ge 0\ \text {is independent of}\ \varvec{x}=(X_1,\ldots ,X_n)\\ &{}\text {consisting of i.i.d.}\ X_i\sim \text {exp}(\lambda )\}, \end{array} \end{aligned}$$
(1.14)

where exp(\(\lambda \)) stands for the exponential distribution with parameter \(\lambda >0\). \(D_{n,\infty }\) is actually the family of mixtures of exponential distributions. Fang and Fang [43] proved the interesting relationship between the three families of distributions:

$$\begin{aligned} D_{n,\infty }\subset T_n\subset F_n, \end{aligned}$$
(1.15)

which means that \(F_n\) is the largest family of distributions that contains \(T_n\) as its subset and \(T_n\) contains \(D_{n,\infty }\) as its subset. Fang and Fang [43] obtained the general formulation of the survival function of \(\varvec{z}=(Z_1,\ldots ,Z_n)'\in F_n\):

$$\begin{aligned} P(Z_1>a_1,\ldots ,Z_n>a_n)=\int _{\Vert \varvec{a}\Vert _1}^{+\infty }(1-\Vert \varvec{a}\Vert _1/r)^{n-1}dG(r), \end{aligned}$$
(1.16)

where G(r) is the distribution function of R in the stochastic representation (1.10) and \(\varvec{a}=(a_1,\ldots ,a_n)'\in R^n_+\). If \(\varvec{z}=(Z_1,\ldots ,Z_n)'\in F_n\) has a density function, it must have the form \(f(\Vert \varvec{z}\Vert _1)\) (\(\varvec{z}\in R^n_+\)), which depends only on the \(l_1\)-norm. Fang and Fang [44] obtained the distributions of the order statistics from the family of multivariate \(l_1\)-norm symmetric distributions. Fang and Fang [45] proposed the exponential matrix distribution. Fang and Fang [46] studied statistical inference on the location and scale parameters of the multivariate \(l_1\)-norm symmetric distributions. Fang and Fan [39] studied large sample properties for distributions with rotational symmetries. Fang and Fang [16] obtained a characterization property of multivariate \(l_1\)-norm symmetric distributions. Fang and Xu [73] constructed a class of multivariate distributions including the multivariate logistic. Fang et al. [52] summarized most of the then-current findings on symmetric multivariate and related distributions. The idea of defining the general distribution family \(F_n\) in (1.10) was generalized to the \(l_p\)-norm symmetric distributions by [121], and was further generalized to the \(L_p\)-norm symmetric distributions by [117].
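
The survival function (1.16) can be checked numerically in a simple special case. The sketch below assumes that \(R\) is degenerate at a fixed value \(r\), so that (1.16) reduces to \((1-\Vert \varvec{a}\Vert _1/r)^{n-1}\) for \(\Vert \varvec{a}\Vert _1<r\) and to zero otherwise. It draws \(\mathbf {u}\) uniformly on \({\mathscr {S}}^1_+\) by normalizing i.i.d. exponential variables, which is a standard construction. All function names are hypothetical.

```python
import random


def uniform_on_simplex(n, rng):
    """Draw u uniformly on the l1-norm unit sphere restricted to z_i >= 0."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]


def survival_fixed_radius(a, r):
    """Formula (1.16) when R equals the constant r with probability one."""
    s = sum(a)
    return (1.0 - s / r) ** (len(a) - 1) if s < r else 0.0


def survival_monte_carlo(a, r, n_draws, rng):
    """Estimate P(Z_1 > a_1, ..., Z_n > a_n) for z = r * u by simulation."""
    hits = 0
    for _ in range(n_draws):
        u = uniform_on_simplex(len(a), rng)
        if all(r * ui > ai for ui, ai in zip(u, a)):
            hits += 1
    return hits / n_draws
```

For instance, with `a = [0.2, 0.3, 0.1]` and `r = 2.0` the two functions should agree up to Monte Carlo error. A non-degenerate \(R\) could be handled by averaging `survival_fixed_radius` over draws of \(R\).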

3.2 Other Related Multivariate Distributions

Fang and his collaborators’ research on multivariate symmetric and related distributions continued throughout the 1990s and after. For example, Fang and Fang [47] constructed a class of generalized Dirichlet distributions; Fang et al. [49] constructed a family of bivariate distributions with non-elliptical contours; Fang et al. [53] introduced the \(L_1\)-norm symmetric distributions to the topic of \(L_1\)-norm statistical analysis. Kotz et al. [82] applied the method of vertical density representation to a class of multivariate symmetric distributions and proposed a new method for generating random numbers from these distributions. Rosen et al. [115] proposed an approach to extending the complex normal distribution. Zhu et al. [126] proposed a new approach to testing symmetry of high-dimensional distributions. Fang et al. [76] proposed a new approach to generating multivariate distributions by using vertical density representation. Fang et al. [50] developed a copula method for constructing meta-elliptical distributions with given marginals. Their copula method has been cited by many international scholars in different areas: according to scholar.google.com, [50], “The meta-elliptical distributions with given marginals”, has been cited 296 times. Among various methods for constructing multivariate distributions, the copula method is one of the most cited methods for constructing a multivariate distribution with given marginals, see, for example [83].

The idea of Fang et al. [50] for constructing the meta-type ECD is based on a well-known property of ECD. If \(\varvec{z}=(Z_1,\ldots ,Z_n)'\sim ECD_n(\mathbf 0 ,\varvec{R},g)\) with a density-generating function \(g(\cdot )\) as in (1.6), with the normalizing constant absorbed into \(g\), and correlation matrix \(\varvec{R}\), the marginal p.d.f. (probability density function) of each component \(Z_i\) (\(i=1,\ldots ,n\)) is given by

$$\begin{aligned} q_g(z)=\frac{\pi ^{(n-1)/2}}{\varGamma ((n-1)/2)}\int _{z^2}^{+\infty }(y-z^2)^{(n-1)/2-1}g(y)dy \end{aligned}$$
(1.17)

and a cumulative distribution function (c.d.f.) given by

$$\begin{aligned} Q_g(z)=\frac{1}{2}+\frac{\pi ^{(n-1)/2}}{\varGamma ((n-1)/2)}\int _0^z\int _{u^2}^{+\infty }(y-u^2)^{(n-1)/2-1}g(y)dydu. \end{aligned}$$
(1.18)

Let \(\varvec{x}=(X_1,\ldots ,X_n)'\) be a random vector with each component \(X_i\) having a continuous p.d.f. \(f_i(x_i)\) and a c.d.f. \(F_i(x_i)\). Let the random vector \(\varvec{z}=(Z_1,\ldots ,Z_n)'\sim ECD_n(\mathbf 0 ,\varvec{R},g)\). Suppose that

$$\begin{aligned} Z_i=Q_g^{-1}(F_i(X_i)),\quad i=1,\ldots , n, \end{aligned}$$
(1.19)

where \(Q_g^{-1}(\cdot )\) is the inverse of \(Q_g(\cdot )\) given by (1.18). Fang et al. [50] obtained the p.d.f. of \(\varvec{x}=(X_1,\ldots ,X_n)'\) given by

$$\begin{aligned} h(x_1,\ldots ,x_n)=\phi \Big (Q_g^{-1}(F_1(x_1)),\ldots ,Q_g^{-1}(F_n(x_n))\Big )\prod _{i=1}^nf_i(x_i), \end{aligned}$$
(1.20)

where \(\phi \) is the n-variate density weighting function:

$$\begin{aligned} \phi (z_1,\ldots ,z_n)=\frac{|\varvec{R}|^{-\frac{1}{2}}g(\varvec{z}'\varvec{R}^{-1}\varvec{z})}{\prod _{i=1}^nq_g(z_i)}. \end{aligned}$$
(1.21)

If \(\varvec{x}=(X_1,\ldots ,X_n)'\) has a p.d.f. given by (1.20), \(\varvec{x}\) is said to have a meta-elliptical distribution, denoted by \(\varvec{x}\sim ME_n(\mathbf 0 ,\varvec{R},g;\,F_1,\ldots ,F_n)\). The family \(ME_n(\mathbf 0 ,\varvec{R},g;\,F_1,\ldots ,F_n)\) includes various multivariate distributions, such as \(ECD_n(\mathbf 0 ,\varvec{R},g)\), the meta-Gaussian distributions and various asymmetric distributions, obtained by choosing suitable marginal c.d.f.s \(F_i(x_i)\). Fang et al. [50] obtained some interesting meta-elliptical distributions in the two-dimensional case. In general, \(ME_n(\mathbf 0 ,\varvec{R},g;\,F_1,\ldots ,F_n)\) is such a big family of distributions that the exact p.d.f. of any given member is difficult to obtain. Today the copula method has been comprehensively studied and applied in various fields, see, for example [10, 17, 103].
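
The transformation (1.19) can be read in reverse to simulate from a meta-elliptical distribution: draw \(\varvec{z}\) from the elliptical law and set \(X_i=F_i^{-1}(Q_g(Z_i))\). The sketch below does this for the bivariate meta-Gaussian case, in which \(Q_g\) is the standard normal c.d.f. The exponential marginals, the parameter names `rho`, `rate1`, `rate2`, and the function name are assumptions chosen for illustration and are not taken from [50].

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # plays the role of Q_g in the Gaussian case


def meta_gaussian_exponential_pair(rho, rate1, rate2, rng):
    """Draw (X1, X2) with exponential marginals and a Gaussian dependence."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Bivariate normal with unit variances and correlation rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Exponential quantile: F^{-1}(u) = -log(1 - u) / rate, and
    # 1 - Phi(z) = Phi(-z), which avoids computing 1 - u explicitly.
    x1 = -math.log(_STD_NORMAL.cdf(-z1)) / rate1
    x2 = -math.log(_STD_NORMAL.cdf(-z2)) / rate2
    return x1, x2
```

Each marginal is exponential with the chosen rate regardless of `rho`, while `rho` controls the dependence between the two components, which is the separation of marginals and dependence structure that the copula construction is designed to provide.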

Based on the theory of spherical distributions developed in [52], the authors of [95] proposed a class of uniform tests for goodness of fit of the \(L_p\)-norm symmetric multivariate distributions. The research accomplishments of Fang and his collaborators have greatly enriched the theory of general symmetric multivariate and related distributions.

4 Directional Data Analysis, Occupancy Problem, Growth Curve Model, and Miscellaneous Directions

As China entered the era of economic reform and opening up in the late 1970s and 1980s, Prof. Fang’s research topics branched out in various directions. For example, to meet the needs of applied statistics in Chinese industry, Prof. Fang carried out a series of research projects in clustering analysis, the occupancy problem, mathematical statistics and standardization, quality control, and graph analysis of multivariate observations. The research outcomes from these projects were summarized in the papers [19–25, 36, 60, 64].

4.1 Directional Data Analysis and Occupancy Problem

Directional data analysis was one of Prof. Fang’s interests in the late 1980s. Directional data occur in many areas, such as the earth sciences, meteorology and medicine. It was a hot international research area in the 1970s. A summary overview of directional data analysis was given by [102]. Let \(\varvec{x}=(x_1,\ldots ,x_p)'\) be a direction on the surface of the unit sphere \(S_p=\{\varvec{x}\in R^p:\ \Vert \varvec{x}\Vert =1\}\) (\(R^p\) stands for the usual p-dimensional Euclidean space, \(\Vert \cdot \Vert \) stands for the usual Euclidean norm). Some important topics in directional data analysis include the correlation analysis of data on any two different directions \(\varvec{x}\) and \(\varvec{y}\) on \(S_p\) and regression problems such as that of \(\varvec{y}\) given \(\varvec{x}\). Fang led his graduate students into this research area, which was brand new to Chinese statisticians in the late 1980s. The major research outcomes were published in their series of papers [41, 52].

In addition to focusing his research on statistical theory and its applications, Prof. Fang also carried out research on probability theory and its applications. For example, the occupancy problem in probability theory concerns the assignment of a set of balls to a group of cells. Although the occupancy problem originated in elementary probability theory, some practical problems in resource allocation can be reduced to occupancy problems. For example, the number of units in use among hotel rooms, apartment flats, or offices, or the number of persons using an undivided space, can be described as an occupancy problem, and the optimal allocation of limited resources reduces to solving such a problem. Prof. Fang’s research on the occupancy problem dates back to the early 1980s, see, for example [25–27, 61].

4.2 Growth Curve Model and Miscellaneous Directions

The growth curve model (GCM for simplicity) is another research field in which Prof. Fang guided his graduate students in the mid-1980s. A general review of GCM methodologies for data analysis was given by [114]. Among others, Prof. Fang’s former Ph.D. student Jianxin Pan played the leading role in developing new GCM methodologies for data analysis. Outlier detection, discovery of influential observations, and covariance structure are important topics in the GCM theory. A general formulation of GCM [104] is defined by

$$\begin{aligned} \varvec{Y}_{p\times n}= {\varvec{X}}_{p\times m}\varvec{B}_{m\times r}\varvec{Z}_{r\times n}+{\varvec{E}}_{p\times n}, \end{aligned}$$
(1.22)

where \({\varvec{X}}\) and \(\varvec{Z}\) are known design matrices of rank \(m < p\) and \(r < n\), respectively, and the regression coefficient matrix \(\varvec{B}\) is unknown. Furthermore, the columns of the error matrix \({\varvec{E}}\) are independent p-variate normal with mean vector \(\mathbf 0 \) and a common unknown covariance matrix \(\varvec{\varSigma }>\mathbf 0 \). The GCM formulation defined by (1.22) can be written as a matrix-variate normal distribution \(\varvec{Y}\sim N_{p\times n}({\varvec{X}}\varvec{B}\varvec{Z},\varvec{\varSigma }\otimes I_n)\) (“\(\otimes \)” stands for the Kronecker product). The maximum likelihood estimates (MLE) of the unknown coefficient matrix \(\varvec{B}\) and the unknown covariance matrix \(\varvec{\varSigma }\) can be easily obtained from the expression of the matrix normal distribution of GCM. Pan and Fang [104] employed the mean-shift regression model to develop an approach for multiple outlier detection. Pan and Fang [105] studied the influence of a subset of observations on the growth regression fittings by comparing empirical influence functions. Pan et al. [108] proposed the Bayesian local influence approach to develop a method for GCM model diagnostics with Rao’s simple covariance structure. Pan et al. [109] studied the local influence assessment in GCM with unstructured covariance under an abstract perturbation scheme. Pan et al. [110] discussed the posterior distribution of the covariance matrix of GCM. Pan and Fang [106] extended the results in [108] from Rao’s simple covariance structure to unstructured covariance. Pan et al. [111] applied projection pursuit techniques to multiple outlier detection in multivariate data analysis. A comprehensive study of the current development of GCM was summarized in [107].

Prof. Fang’s research interests and accomplishments extended to a number of areas and applications during the 1990s. Besides his contributions to generalized multivariate analysis, the theory of symmetric multivariate and related distributions, occupancy problems, directional data analysis, and growth curve modeling, Prof. Fang’s other significant contributions to statistics can be found in his series of papers. Among these miscellaneous research directions, the construction of effective algorithms for complex numerical computation in statistics became one of Prof. Fang’s important research directions in the 1990s. For example, [65, 66, 119] proposed the sequential algorithm for optimization problems and for solving nonlinear equations; [67] proposed general applications of number-theoretic methods in statistics; [116] proposed neural computation for nonlinear regression analysis problems; [51] proposed some global optimization algorithms in statistics; [120] discussed quasi-Monte Carlo approaches and their applications in statistics and econometrics; and [78] proposed a two-stage algorithm associated with number-theoretic methods for the numerical evaluation of integrals. In addition to the major research areas, these miscellaneous research directions, as well as their related applications, have significantly enriched Prof. Fang’s field of research.

Entering the new millennium, Prof. Fang led his graduate students and worked with his collaborators on the theory and applications of uniform design and general experimental designs, the biggest research area that Prof. Fang and his collaborators have been developing and the one with the richest outcomes. One can refer to the series of papers by Prof. Fang and his collaborators in the 2000s. There is no doubt that the new millennium marks Prof. Kai-Tai Fang’s biggest step up the statistical pyramid. We hope that Prof. Fang will never stop marching toward the peak of the statistical pyramid in his lifetime as a statistician.