Abstract
The multivariate regression model is a mathematical tool for estimating the relationships between a set of explanatory variables and a set of response variables. In some cases, the observed data are imprecise. To model such imprecise data, we can employ uncertainty theory and design an uncertain regression model that regards the data as uncertain variables. Parameter estimation is an important topic in the uncertain regression model. In this paper, we explore a method of parameter estimation based on the principle of least squares for the multivariate uncertain regression model, which contains more than one response variable and treats both explanatory variables and response variables as uncertain variables. In addition, when new explanatory variables are given, we propose an approach to obtain the forecast value and the confidence interval of the response variables. Finally, a numerical example of the multivariate uncertain regression model is presented.
1 Introduction
In order to understand the relationships among many factors, people need to impose structure on those factors. Usually, we build a regression model to describe how changes in some variables (explanatory variables) affect other variables (response variables). If the regression model contains only one response variable, we call it a multiple regression model. Furthermore, if we want to study the relationships between explanatory variables and more than one response variable, there are two approaches. One is to establish a multiple regression model for each response variable and all explanatory variables, and to consider those models independently. The other is to design a multivariate regression model including all response variables and explanatory variables, which takes the relationships among the response variables into consideration. The latter is often preferable, because we frequently meet cases where the correlation among the response variables is high. For example, suppose we want to study the relationships between the systolic and diastolic blood pressure of a patient (response variables) and his gender, body temperature, and heart rate (explanatory variables). Since systolic blood pressure is highly correlated with diastolic blood pressure, it is inadvisable to model them separately. Thus, a multivariate regression model is more reasonable. In fact, the very reason for employing a multivariate model is to incorporate the relationships among the response variables.
In the statistical domain, the relationship between each response variable and the explanatory variables is expressed by a function, and is therefore called a functional relationship. For example, in the multivariate linear regression model, all the functions are linear. Generally speaking, in the multivariate regression model, the functional relationships should be determined in advance through experience, although the functions contain unknown parameters. A simple, well-tried model is not only easier to remember but can also inspire new ideas; the process of modeling is the process of understanding the world. In statistics, the most widely used model is the linear model. Galton (1886) first used the term “regression” for a simple linear regression model studying the relationship between children's heights and their parents' heights. Twelve years later, Yule (1897) introduced regression into the statistical domain. Fama et al. (1969) used event study methodology in the multivariate regression model to study the effect of new information on asset prices. In recent years, Ganesh et al. (2018) presented an individual regression method to predict the \(\mathrm{PM}_{2.5}\) concentration, and Krishnamurthy et al. (2019) used support vector regression to calculate the Lyapunov exponents of short time series.
In the multivariate regression model, it is vital to estimate the unknown parameters based on the given observations. Multivariate least squares estimation is the most widely used method, generalized by Aitken (1935) and developed by Watson (1967). Besides, multivariate least absolute estimation (Gentle 1977; Bilodeau and Brenner 1999), maximum likelihood estimation (Anderson 1951), and least distance estimation (Bai et al. 1990) are other common methods of point estimation. However, those methods do not consider the relationships among the response variables. In order to take these relationships into consideration, Breiman and Friedman (1997) proposed a restrained multivariate least squares estimation via canonical analysis, and Jhun and Choi (2009) presented a bootstrapping least distance estimation for the multivariate regression model.
Note that the explanatory variables and the response variables in the traditional regression model are assumed to be observed precisely. However, in some cases the observations cannot be precise. For example, data on factories' carbon emissions or social benefit over some period are collected in an imprecise way. How do we model such imprecise data? Liu (2012) suggested employing uncertainty theory to model imprecisely observed data given by domain experts. Uncertainty theory was founded by Liu (2007) and refined by Liu (2009) based on the normality, duality, subadditivity, and product axioms, in order to deal with belief degrees arising from human uncertainty. A regression model based on uncertainty theory is called an uncertain regression model, in which the imprecisely observed data are regarded as uncertain variables. Estimating the unknown parameters in the uncertain regression model is an important topic. On the one hand, for the uncertain multiple regression model with only one response variable, many scholars have proposed methods such as least squares estimation (Yao and Liu 2018), least absolute deviations estimation (Liu and Yang 2019), and maximum likelihood estimation (Lio and Liu 2019). In addition, Lio and Liu (2018) explored interval estimation for predicting response variables. On the other hand, if the uncertain regression model includes more than one response variable, we call it a multivariate uncertain regression model. Song and Fu (2018) applied least squares estimation to a multivariate uncertain regression model in which only the observed data of the response variables are imprecise. In this paper, we study the multivariate uncertain regression model in which the observed data of both the explanatory variables and the response variables are imprecise. Our work mainly includes parameter estimation, residual analysis, forecast values, and confidence intervals.
The rest of the paper is organized as follows: In Sect. 2, we propose the multivariate uncertain regression model and estimate its parameters. In Sect. 3, we analyze the residuals based on those estimations. In Sect. 4, a confidence interval is suggested to forecast the response variables when new explanatory variables are given. In Sect. 5, we provide an example to show the application of the multivariate uncertain regression model. Finally, some conclusions are drawn in Sect. 6.
2 Multivariate uncertain regression model
Assume \((x_1,x_2,\ldots ,x_p)\) is a vector of explanatory variables and \((y_1,y_2,\ldots ,y_q)\) is a vector of response variables. The functional relationships between \(y_j\) and \(x_1,\) \(x_2,\ldots ,x_p\) are assumed to be expressed by the multivariate regression model
where \({\varvec{\beta }}_{{\varvec{j}}}=(\beta _{0j},\beta _{1j},\ldots ,\beta _{pj})^T\) are vectors of unknown parameters and \(\varepsilon _j\) are disturbance terms for \(j=1,2,\) \(\ldots ,q\).
In the traditional model, we assume that both \((x_1,x_2,\ldots ,x_p)\) and \((y_1,y_2,\ldots ,y_q)\) are precisely observable. However, the observations we can obtain are imprecise in some cases, and thus those observations should be characterized as uncertain variables. Assume that we have the observed data of \(x_1,x_2,\ldots ,x_p\) and \(y_1,y_2,\ldots ,y_q\) as follows,
where \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq}\) are uncertain variables with uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\) \(\varPsi _{i2},\ldots ,\varPsi _{iq}\) for \(i=1,2,\ldots ,n\), respectively. For simplicity, denote
The solution of the minimization problem
is the least squares estimate of \({\varvec{\beta }}\) in the multivariate regression model (1). Denote the optimal solution by
Then the fitted regression model is
Theorem 1
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Then the least squares estimate of
in the multivariate linear regression model
is the solution of the following problem:
where
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\) and \(k=1,2,\ldots ,p\).
Proof
The least squares estimate of \({\varvec{\beta }}\) in the linear regression model is the optimal solution of the following problem,
Since \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq},i=1,2,\ldots ,n\) are independent, we can obtain that \({\widetilde{y}}_{ij}- \beta _{0j}-\sum _{k=1}^p \beta _{kj}{\widetilde{x}}_{ik}\) have the inverse uncertainty distributions
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively. Thus,
\(i=1,2,\ldots ,n\), \(j=1,2,\ldots ,q\). Therefore, the least squares estimate of \({\varvec{\beta }}\) in the multivariate linear regression model is the solution of the minimization problem,
The theorem is proved. \(\square \)
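The reduction in Theorem 1 turns the uncertain least squares problem into an ordinary minimization over deterministic integrals, which can be solved numerically. The sketch below is one plausible implementation under stated assumptions, not the paper's own code: the data (which are illustrative, not from the paper) are linear uncertain variables given as intervals, the integral over \((0,1)\) is discretized on a uniform grid, and the residual's inverse uncertainty distribution evaluates \(\varPsi _{ij}^{-1}\) at \(\alpha \) and \(\varPhi _{ik}^{-1}\) at \(1-\alpha \) or \(\alpha \) according to the sign of \(\beta _{kj}\), following the operational law for independent regular uncertain variables.

```python
import numpy as np
from scipy.optimize import minimize

# Linear uncertainty distribution L(a, b): inverse is (1 - alpha) * a + alpha * b.
def lin_inv(a, b, alpha):
    return (1.0 - alpha) * a + alpha * b

# Illustrative imprecise data (NOT the paper's Table 1): each observation is an
# interval [a, b] read as a linear uncertain variable L(a, b).
X = np.array([[[0.9, 1.1], [1.8, 2.2]],
              [[1.9, 2.1], [2.7, 3.3]],
              [[2.8, 3.2], [3.9, 4.1]],
              [[3.9, 4.1], [4.8, 5.2]]])      # shape (n, p, 2)
Y = np.array([[[2.8, 3.2]],
              [[4.7, 5.3]],
              [[6.8, 7.2]],
              [[8.7, 9.3]]])                   # shape (n, q, 2)
n, p, _ = X.shape
q = Y.shape[1]
alphas = np.linspace(0.01, 0.99, 99)           # grid discretizing the integral

def objective(theta):
    beta = theta.reshape(p + 1, q)             # rows: beta_0j, beta_1j, ..., beta_pj
    total = 0.0
    for i in range(n):
        for j in range(q):
            # Inverse distribution of the residual (operational law):
            # Psi_ij^{-1}(alpha) - beta_0j - beta_kj * Phi_ik^{-1}(1 - alpha)
            # when beta_kj >= 0, and Phi_ik^{-1}(alpha) when beta_kj < 0.
            h = lin_inv(*Y[i, j], alphas) - beta[0, j]
            for k in range(p):
                b_kj = beta[k + 1, j]
                lvl = 1.0 - alphas if b_kj >= 0 else alphas
                h -= b_kj * lin_inv(*X[i, k], lvl)
            total += np.mean(h ** 2)           # approximates the integral on (0, 1)
    return total

beta_star = minimize(objective, np.zeros((p + 1) * q), method="Nelder-Mead",
                     options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-10}).x
print(beta_star.reshape(p + 1, q))
```

Because the sign conditions make the objective only piecewise smooth in \(\beta \), a derivative-free method such as Nelder–Mead is a safe default here.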
Theorem 2
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Then the least squares estimate of
in the multivariate asymptotic regression model
is the optimal solution of the following problem:
Proof
The least squares estimate of \({\varvec{\beta }}\) in the asymptotic regression model is actually the optimal solution of the minimization problem,
Since \({\widetilde{x}}_i,{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq},i=1,2,\ldots ,n\) are independent, we can obtain that \(\displaystyle {\widetilde{y}}_{ij}-\beta _{0j}+\beta _{1j}\exp {(-\beta _{2j}{\widetilde{x}}_i)}\) have the inverse uncertainty distributions
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively. Thus,
\(i=1,2,\ldots ,n,j=1,2,\ldots ,q\). Therefore, the least squares estimate of \({\varvec{\beta }}\) in the multivariate asymptotic regression model is the solution of the minimization problem,
The theorem is proved. \(\square \)
Theorem 3
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Then the least squares estimate of
in the multivariate Michaelis-Menten regression model
is the optimal solution of the following problem:
Proof
The least squares estimate of \({\varvec{\beta }}\) in the Michaelis-Menten regression model is the optimal solution of the minimization problem,
Since \({\widetilde{x}}_i,{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq}\) are independent, we can obtain that \(\displaystyle {\widetilde{y}}_{ij}-\frac{\beta _{1j}{\widetilde{x}}_i}{\beta _{2j}+{\widetilde{x}}_i}\) have the inverse uncertainty distributions
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively. Thus,
\(i=1,2,\ldots ,n,j=1,2,\ldots ,q\). Therefore, the least squares estimate of \({\varvec{\beta }}\) in the multivariate Michaelis-Menten regression model is the solution of the minimization problem,
The theorem is proved. \(\square \)
3 Multivariate residual analysis
In the regression model (1), there is a disturbance term \({\varvec{\varepsilon }}=(\varepsilon _1, \varepsilon _2, \ldots , \varepsilon _q)^{\mathrm{T}}\). It is difficult to discover the disturbance term \({\varvec{\varepsilon }}\) exactly since the term changes for each observation. Thus, we are concerned about how to estimate \({\varvec{\varepsilon }}\) based on imprecisely observed data, \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,\) \({\widetilde{x}}_{ip},{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq},i=1,2,\ldots ,n\).
Definition 1
Assume the fitted regression model is
and \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},\ldots ,{\widetilde{y}} _{iq}, i=1,2,\ldots ,n\) are imprecisely observed data. Then we call
the i-th residual for each i \((i=1,2,\ldots ,n)\).
Suppose that the disturbance term \({\varvec{\varepsilon }}=(\varepsilon _1, \varepsilon _2, \ldots , \) \(\varepsilon _q)^{\mathrm{T}}\) is an uncertain vector. Then, for each j \((j=1,2,\ldots ,q)\), we use the average of the expected values of residuals, i.e.,
to estimate the expected values of the disturbance term \(\varepsilon _j\), and
to estimate the variances. Then, we call
the vectors of the estimated expected values and estimated variances of disturbance term \({\varvec{\varepsilon }}\), respectively.
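These two estimates can be computed directly once the residuals' inverse uncertainty distributions are known. Below is a minimal numerical sketch (the helper name `residual_moments` and the toy linear residual distributions are illustrative, not from the paper), using the identity \(E[\xi ]=\int _0^1 \varUpsilon ^{-1}(\alpha )\,\mathrm{d}\alpha \) for the expected value and reading the variance estimate as the average of \(\int _0^1 (\varUpsilon ^{-1}(\alpha )-{\hat{e}})^2\,\mathrm{d}\alpha \) over the residuals, which is the usual stipulation when only the distributions are known.

```python
import numpy as np

alphas = np.linspace(0.005, 0.995, 199)        # grid discretizing (0, 1)

def residual_moments(h_inv_list):
    """Estimate the disturbance term's expected value and variance from the
    residuals' inverse uncertainty distributions (one callable per observation).

    e_hat    : average over i of E[residual_i] = integral of H_i^{-1}(alpha)
    sig2_hat : average over i of integral of (H_i^{-1}(alpha) - e_hat)^2
    """
    vals = np.array([h(alphas) for h in h_inv_list])   # shape (n, len(alphas))
    e_hat = vals.mean()                                # mean over alpha and i
    sig2_hat = ((vals - e_hat) ** 2).mean(axis=1).mean()
    return e_hat, sig2_hat

# Toy check: three residuals distributed L(-1, 1), L(-0.5, 1.5), L(-1.5, 0.5);
# the inverse of L(a, b) is (1 - alpha) * a + alpha * b.
h_list = [lambda a: -1 + 2 * a, lambda a: -0.5 + 2 * a, lambda a: -1.5 + 2 * a]
e_hat, sig2_hat = residual_moments(h_list)
print(e_hat, sig2_hat)
```

For this toy data the three expected values \(0, 0.5, -0.5\) average to \({\hat{e}}=0\), and the variance estimate lands near \(0.5\).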
Theorem 4
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Let the fitted multivariate linear regression model be
Then the vector of estimated expected values of the disturbance term \({\varvec{\varepsilon }}\) is
and the vector of estimated variances is
where
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\) and \(k=1,2,\ldots ,p\).
Proof
Since the inverse uncertainty distributions of \( {\widetilde{y}}_{ij}- \beta _{0j}^{*}-\sum _{k=1}^p \beta _{kj}^{*}{\widetilde{x}}_{ik}\) are
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively, Theorem 4 holds immediately. \(\square \)
Theorem 5
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Let the fitted multivariate asymptotic regression model be
Then the vector of estimated expected values of the disturbance term \({\varvec{\varepsilon }}\) is
and the vector of estimated variances is
Proof
Since the inverse uncertainty distributions of \(\displaystyle {\widetilde{y}}_{ij}-\beta _{0j}^{*}+\beta _{1j}^{*}\exp {(-\beta _{2j}^{*}{\widetilde{x}}_i)}\) are
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively, Theorem 5 holds immediately. \(\square \)
Theorem 6
Suppose the imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq}\), \(i=1,2,\ldots ,n\), are independent uncertain variables with regular uncertainty distributions \(\varPhi _{i1},\varPhi _{i2},\ldots ,\varPhi _{ip},\varPsi _{i1},\varPsi _{i2},\ldots ,\varPsi _{iq}\), \(i=1,2,\ldots ,n\), respectively. Let the fitted multivariate Michaelis-Menten regression model be
Then the vector of estimated expected values of the disturbance term \({\varvec{\varepsilon }}\) is
and the vector of estimated variances is
Proof
Since the inverse uncertainty distributions of \(\displaystyle {\widetilde{y}}_{ij}-\frac{\beta _{1j}{\widetilde{x}}_i}{\beta _{2j}+{\widetilde{x}}_i}\) are
for \(i=1,2,\ldots ,n,j=1,2,\ldots ,q\), respectively, Theorem 6 holds immediately. \(\square \)
4 Forecast value and confidence interval
In Sects. 2 and 3, we obtain the least squares estimate \({\varvec{\beta }}^{*}\) and the estimates \({\hat{{\varvec{e}}}}\) and \(\hat{{\varvec{\sigma }}}^{{\varvec{2}}}\) of the expected value and variance of the disturbance term \({\varvec{\varepsilon }}\) based on the imprecisely observed data \(({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},\ldots ,{\widetilde{x}}_{ip},{\widetilde{y}}_{i1},{\widetilde{y}}_{i2},\ldots ,{\widetilde{y}}_{iq})\), \(i=1,2,\ldots ,n\). Based on this work, we are interested in forecasting the response vector for a new explanatory vector. Assume that \({\varvec{{\widetilde{x}}}}=({\widetilde{x}}_{1}, {\widetilde{x}}_{2}, \ldots , {\widetilde{x}}_{p})^{\mathrm{T}}\) is a vector of new explanatory variables, where \({\widetilde{x}}_{1},{\widetilde{x}}_{2},\ldots ,{\widetilde{x}}_{p}\) are independent uncertain variables with regular uncertainty distributions \(\varPhi _1,\varPhi _2,\ldots ,\varPhi _p\), respectively. Although the relationship between the uncertain explanatory vector and the uncertain response vector may be complicated, it is still valuable to apply a linear regression model to the data. Suppose that the fitted linear regression model is
and the disturbance term \({\varvec{\varepsilon }}\) has the estimated expected value \({\hat{{\varvec{e}}}}\) and variance \(\hat{{\varvec{\sigma }}}^{{\varvec{2}}}\), and is independent of \({\widetilde{x}}_{1},{\widetilde{x}}_{2},\) \(\ldots ,{\widetilde{x}}_{p}\). Then the forecast uncertain vector of \({\varvec{y}}=(y_1,\) \( y_2, \ldots , y_q)^{\mathrm{T}}\) with respect to \(({\widetilde{x}}_{1},{\widetilde{x}}_{2},\ldots ,{\widetilde{x}}_{p})\) is determined by
For each j \((j=1,2,\ldots ,q)\), a single value of \(y_j\) should be estimated from the forecast uncertain vector, and it is natural to define the forecast value of \(y_j\) as
which is the expected value of the forecast uncertain variable \({\hat{y}}_j\). Then we write
as the forecast value of \({\varvec{y}}\).
Furthermore, in Eq. (4), assume that the disturbances \(\varepsilon _1,\varepsilon _2,\ldots ,\varepsilon _q\) follow the same type of distribution, although their expected values and variances may differ across equations. In particular, if \(\varepsilon _1,\varepsilon _2,\ldots ,\varepsilon _q\) are normal uncertain variables \({\mathcal {N}}({\hat{e}}_1,{\hat{\sigma }}_1),{\mathcal {N}}({\hat{e}}_2,{\hat{\sigma }}_2),\ldots ,{\mathcal {N}}({\hat{e}}_q,{\hat{\sigma }}_q)\), respectively, then the inverse uncertainty distributions of \({\hat{y}}_j\) are determined by
where
and \(\phi _j^{-1}(\alpha )\) are the inverse uncertainty distributions of \(\varepsilon _j\), i.e.,
for \(j=1,2,\ldots ,q,k=1,2,\ldots ,p\), respectively. Then the uncertainty distributions \({\hat{\varPsi }}_{j}\) of \({\hat{y}}_j\) can be obtained by \({\hat{\varPsi }}_{j}^{-1}\), \(j=1,2,\ldots ,q\), respectively.
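Once \({\hat{\varPsi }}_{j}^{-1}\) is available, the forecast value \(\mu _j\) is simply its integral over \((0,1)\). A small numerical sketch follows; the toy model \({\hat{y}}=2{\widetilde{x}}+\varepsilon \) and all of its numbers are illustrative assumptions, and the inverse distribution used for \(\varepsilon \) is that of Liu's normal uncertain variable \({\mathcal {N}}(e,\sigma )\).

```python
import numpy as np

def normal_inv(e, sigma, alpha):
    # Inverse distribution of a normal uncertain variable N(e, sigma) in
    # uncertainty theory: e + (sigma * sqrt(3) / pi) * ln(alpha / (1 - alpha)).
    return e + sigma * np.sqrt(3) / np.pi * np.log(alpha / (1.0 - alpha))

def forecast_value(y_hat_inv, n_grid=9999):
    # mu = E[y_hat] = integral over (0, 1) of the inverse distribution,
    # approximated by a midpoint rule on a uniform grid.
    alphas = (np.arange(n_grid) + 0.5) / n_grid
    return np.mean(y_hat_inv(alphas))

# Toy forecast: y_hat = 2 * x_tilde + eps with x_tilde ~ L(1, 3) (inverse
# 1 + 2 * alpha) and eps ~ N(0.1, 0.5); the coefficient 2 is positive, so the
# operational law evaluates both inverses at the same level alpha.
y_inv = lambda a: 2 * (1 + 2 * a) + normal_inv(0.1, 0.5, a)
mu = forecast_value(y_inv)
print(mu)   # E[2 * x_tilde] + E[eps] = 4 + 0.1
```

Here the expected value decomposes linearly, so the numerical integral should recover \(4.1\) up to rounding.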
For each j \((j=1,2,\ldots ,q)\), the forecast value \(\mu _j\) is a point estimate of \(y_j\). However, a single point estimate can hardly be claimed to be accurate, since each component of the uncertain vector \({\varvec{y}}\) is not a precise value. An interval estimate, such as \(3\sim 4\), is more convincing: although the range is wider, its reliability is clearly higher. Thus, we propose the confidence interval to estimate \({\varvec{y}}\).
Taking \(\alpha \) (e.g., 95\(\%\)) as a confidence level, we are interested in finding the minimum values \(b_j\) such that
\(j=1,2,\ldots ,q\), respectively. Since
the \(\alpha \) confidence intervals of \(y_j\) are suggested as \([\mu _j-b_j,\mu _j+b_j]\), which can be abbreviated as \(\mu _j\pm b_j\), \(j=1,2,\ldots ,q\), respectively. Denote
Then, the \(\alpha \) confidence interval of \({\varvec{y}}\) is written as
which represents the set
and the confidence interval covers \({\varvec{y}}\) with chance \(\alpha \).
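Finding each \(b_j\) amounts to a one-dimensional search: since the distribution \({\hat{\varPsi }}_{j}\) is increasing, the smallest \(b\) with \({\hat{\varPsi }}_{j}(\mu _j+b)-{\hat{\varPsi }}_{j}(\mu _j-b)\ge \alpha \) can be located by bisection, following the interval construction of Lio and Liu (2018). A hedged sketch with an assumed normal disturbance (function names and all numbers are illustrative):

```python
import numpy as np

def normal_cdf(e, sigma, x):
    # Uncertainty distribution of a normal uncertain variable N(e, sigma).
    return 1.0 / (1.0 + np.exp(np.pi * (e - x) / (np.sqrt(3) * sigma)))

def min_half_width(cdf, mu, level=0.95, hi=1e6, tol=1e-10):
    # Smallest b with cdf(mu + b) - cdf(mu - b) >= level, found by bisection
    # (the covered measure is increasing in b, so bisection applies).
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mu + mid) - cdf(mu - mid) >= level:
            hi = mid
        else:
            lo = mid
    return hi

e, sigma = 4.1, 0.5
b = min_half_width(lambda x: normal_cdf(e, sigma, x), mu=e, level=0.95)
print(e - b, e + b)   # the 95% confidence interval mu +/- b
```

Because this toy distribution is symmetric about \(\mu \), the bisection answer can be checked against the closed form \(b=(\sigma \sqrt{3}/\pi )\ln 39\) for the \(95\%\) level.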
5 Numerical example
In this section, we design a numerical example to show the estimation of unknown parameters, residual analysis, forecast value and confidence interval in multivariate uncertain regression model.
Consider the linear regression model
Denote
Suppose imprecisely observed data \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},{\widetilde{x}}_{i3},{\widetilde{x}}_{i4},{\widetilde{y}} _{i1},\) \({\widetilde{y}} _{i2},{\widetilde{y}} _{i3}\), \(i=1,2,\ldots ,21\) are independent uncertain variables, where \({\widetilde{x}}_{i1},{\widetilde{x}}_{i2},{\widetilde{x}}_{i3},{\widetilde{x}}_{i4},{\widetilde{y}} _{i1},{\widetilde{y}} _{i2},{\widetilde{y}} _{i3}\) have linear uncertainty distributions, \(\varPhi _{i1},\varPhi _{i2},\varPhi _{i3},\varPhi _{i4},\varPsi _{i1},\varPsi _{i2},\varPsi _{i3}\), respectively. See the data in Table 1.
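Since the example's observations are linear uncertain variables, the only ingredients the estimation needs are the distribution \({\mathcal {L}}(a,b)\) and its inverse. A quick sketch of both (Table 1's actual values are not reproduced here):

```python
import numpy as np

def linear_cdf(a, b, x):
    # Uncertainty distribution of L(a, b): 0 below a, 1 above b, linear between.
    return np.clip((np.asarray(x, dtype=float) - a) / (b - a), 0.0, 1.0)

def linear_inv(a, b, alpha):
    # Inverse uncertainty distribution: (1 - alpha) * a + alpha * b.
    return (1.0 - alpha) * a + alpha * b

# The expected value of L(a, b) is (a + b) / 2, recovered here by numerically
# integrating the inverse distribution over (0, 1).
alphas = (np.arange(1000) + 0.5) / 1000
print(linear_inv(2.0, 4.0, alphas).mean())   # approximately 3.0
```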
First, we estimate the unknown parameters. That is, we should solve the following problem,
It follows from Theorem 1 that problem (7) can be transformed into the equivalent form
where
for \(i=1,2,\ldots ,21\), \(j=1,2,3\) and \(k=1,2,3,4\). Then we can obtain the least squares estimate
Thus, the fitted multivariate linear regression model is
It follows from Theorem 4 that we obtain the vectors of the estimated expected values and estimated variances of disturbance term, i.e.,
respectively. Now assume
is a new uncertain explanatory vector and \({\widetilde{x}}_{1},{\widetilde{x}}_{2},{\widetilde{x}}_{3},{\widetilde{x}}_{4},\) \(\varepsilon _1,\varepsilon _2,\varepsilon _3\) are independent. Then the forecast uncertain vector of \({\varvec{y}}=(y_1, y_2, y_3 )^{\mathrm{T}}\) is
Hence, it follows from equation (5) that the forecast value of \({\varvec{y}}\) is
In order to obtain confidence interval of \({\varvec{y}}\), we take a confidence level \(\alpha =95\%\) and suppose \(\varepsilon _1,\varepsilon _2,\varepsilon _3\) are normal uncertain variables \({\mathcal {N}}({\hat{e}}_1,{\hat{\sigma }}_1),{\mathcal {N}}({\hat{e}}_2,{\hat{\sigma }}_2),{\mathcal {N}}({\hat{e}}_3,{\hat{\sigma }}_3)\), respectively. It follows from equation (6) that we can calculate
such that for each j \((j=1,2,3)\), \(b_j\) is the minimum value satisfying
where \({\hat{\varPsi }}_{j}\) is the uncertainty distribution of \({\hat{y}}_j\). Thus, the \(95\%\) confidence interval of the vector of the response variables \({\varvec{y}}\) is
6 Conclusions
This paper studied the multivariate uncertain regression model, which contains more than one response variable and treats both explanatory variables and response variables as uncertain variables, since the observed data are imprecise in some cases. Based on those data, we estimated the unknown parameters by the principle of least squares in different multivariate regression models, such as the multivariate linear, multivariate asymptotic, and multivariate Michaelis-Menten regression models. In order to analyze the disturbance terms in the model, we proposed the concept of residuals and designed the vectors of estimated expected values and variances of the disturbance terms. Furthermore, we showed how to forecast the response variables when a set of new explanatory variables is given. In the future, we will try to take the relationships among the response variables into consideration.
References
Aitken A (1935) On least squares and linear combinations of observations. Proc R Soc Edinb 55:42–48
Anderson T (1951) Estimating linear restrictions on regression coefficients for multivariate normal distributions. Ann Math Stat 22(3):327–351
Bai Z, Chen X, Miao B, Radhakrishna R (1990) Asymptotic theory of least distances estimate in multivariate linear models. Statistics 21(4):503–519
Bilodeau M, Brenner D (1999) Theory of multivariate statistics. Springer, New York
Breiman L, Friedman J (1997) Predicting multivariate responses in multiple linear regression. J R Stat Soc 59(1):3–54
Fama E, Fisher L, Jensen M, Roll R (1969) The adjustment of stock prices to new information. Int Econ Rev 10(1):1–21
Galton F (1886) Regression towards mediocrity in hereditary stature. J Anthropol Inst GB Ireland 15:246–263
Ganesh S, Arulmozhivarman P, Tatavarti V (2018) Prediction of \(\mathrm{PM}_{2.5}\) using an ensemble of artificial neural networks and regression models. J Ambient Intell Humaniz Comput 8:1–11
Gentle J (1977) Least absolute values estimation: an introduction. Commun Stat Simul Comput 6(4):313–328
Jhun M, Choi I (2009) Bootstrapping least distance estimator in the multivariate regression model. Comput Stat Data Anal 53(12):4221–4227
Krishnamurthy K, Manoharan S, Swaminathan R (2019) A Jacobian approach for calculating the Lyapunov exponents of short time series using support vector regression. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-019-01525-6
Lio W, Liu B (2018) Residual and confidence interval for uncertain regression model with imprecise observations. J Intell Fuzzy Syst 35(2):2573–2583
Lio W, Liu B (2019) Maximum likelihood estimation for uncertain regression analysis. Technical Report
Liu B (2007) Uncertainty theory, 2nd edn. Springer, Berlin
Liu B (2009) Some research problems in uncertainty theory. J Uncert Syst 3(1):3–10
Liu B (2012) Why is there a need for uncertainty theory. J Uncert Syst 6(1):3–10
Liu Z, Yang Y (2019) Least absolute deviations estimation for uncertain regression with imprecise observations. Fuzzy Optim Decis Mak. https://doi.org/10.1007/s10700-019-09312-w
Song Y, Fu Z (2018) Uncertain multivariable regression model. Soft Comput 22(17):5861–5866
Watson G (1967) Linear least squares regression. Ann Math Stat 38(6):1679–1699
Yao K, Liu B (2018) Uncertain regression analysis: an approach for imprecise observations. Soft Comput 22(17):5579–5582
Yule G (1897) On the theory of correlation. J R Stat Soc 60(4):812–854
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61873329).
Ye, T., Liu, Y. Multivariate uncertain regression model with imprecise observations. J Ambient Intell Human Comput 11, 4941–4950 (2020). https://doi.org/10.1007/s12652-020-01763-z