Inference Progress in Missing Data Analysis from Independent to Longitudinal Setup

Sutradhar, Brajendra C.

doi:10.1007/978-1-4614-6871-4_5


Part of the book series: Lecture Notes in Statistics (LNSP, volume 211)


Abstract

In the independent setup with multivariate responses, the data become incomplete when only partial responses, that is, responses on some variables as opposed to all variables, are available from some individuals. The main challenge here is obtaining valid inferences, such as unbiased and consistent estimates of the mean parameters of all response variables, by using the available responses. Typically, unbalanced correlation matrices are formed, and moment or likelihood analyses based on the available responses are employed for such inferences. Various imputation techniques have also been used. In the longitudinal setup, when a univariate response is repeatedly collected from an individual, these repeated responses become correlated and the responses form a multivariate distribution. In this setup, it may happen that a portion of the responses are not available from some individuals under study. These non-responses may be monotonic or intermittent. Also, the response may be missing following a mechanism such as missing completely at random (MCAR), missing at random (MAR), or missing non-ignorably. In a longitudinal regression setup, the covariates may also be missing, but typically they are known for all time periods. Obtaining unbiased and consistent regression estimates, especially when longitudinal responses are missing following the MAR or ignorable mechanism, becomes a challenge. This happens because one needs to accommodate both the longitudinal correlations and the missing mechanism to develop a proper inference tool. Over the last three decades some progress has been made toward this, mainly by taking partial care of the missing mechanism in developing estimation techniques. But overall, these techniques fall short and may still produce biased and hence inconsistent estimates. The purpose of this paper is to outline these perspectives in a comprehensive manner so that the real progress and challenges are understood in order to develop proper inference techniques.




1 Introduction

Missing data analysis in the independent setup with multivariate responses has a long history. An early example is Lord (1955), who considered a set of incomplete trivariate normal responses collected from K independent individuals, where not all three variables were observed for every individual. To estimate the mean parameters consistently, instead of dropping the individuals with incomplete information, Lord (1955) utilized the available information and constructed unbalanced (bivariate and trivariate) probability functions for the individuals in order to write a likelihood function for the desired inference. This technique for consistent estimation of the parameters, and for other similar inferences using incomplete data, has been used by many researchers over the last six decades. See, for example, Mehta and Gurland (1973), Morrison (1973), Naik (1975), Little (1988), and Krishnamoorthy and Pannala (1999), among others.

In the independent setup, techniques of imputation and multiple imputation (Rubin 1976; Rubin and Schenker 1986; Meng 1994) have also been widely used. Some authors, such as Paik (1997), used this imputation technique in the repeated measure (longitudinal) setup. The imputation at a given time point is done mainly by averaging over the responses of other individuals at that time who have the same covariate history as the individual concerned. Once the missing values are estimated, they are used as data, with necessary adjustments, to construct complete data based estimating equations for the desired parameters.

In a univariate longitudinal response setup, when T repeated measures are taken they become correlated and hence they jointly follow a T-dimensional multivariate distribution. However, unlike in the Gaussian setup for linear data, the multivariate distributions for repeated binary and count data become complex or impractical. Moreover, if a portion of individuals do not provide responses for all T time points, then adopting a likelihood approach that blends the missing mechanism and the correlation structure of the repeated data naturally becomes extremely complicated or impossible. As a remedy, imputation and estimating equation approaches became popular; these, however, work well only if the missing data occur following the simplest MCAR mechanism. When the missing data occur following the MAR mechanism, writing a proper estimating equation that accommodates both the longitudinal correlations and the missing mechanism becomes difficult. Robins et al. (1995) proposed an inverse probability weights based generalized estimating equations (WGEE) approach as an extension of the GEE approach of Liang and Zeger (1986) to the incomplete setup. We remark that, as demonstrated by Sutradhar and Das (1999) and Sutradhar (2010), for example, the GEE approach can produce less efficient regression estimates than the well-known simpler moment or quasi-likelihood (QL) estimates in the complete data setup. Thus, realistically, there is no reason to expect the WGEE approach to be more efficient than the simpler moment and QL estimates in the incomplete longitudinal setup. In fact, in the incomplete longitudinal setup, the WGEE approach constructed from working correlations, as opposed to the MAR based correlation matrix, may yield biased and hence inconsistent regression estimates (Sutradhar and Mallick 2010). We further remark that this inconsistency issue has not been adequately addressed in the literature, including the studies by Robins et al. (1995), Paik (1997), Rotnitzky et al. (1998), and Birmingham et al. (2003). One of the main reasons is that none of these studies used a stochastic correlation structure in conjunction with the missing mechanism to model the longitudinal count and binary data in the incomplete longitudinal setup. Details on this inconsistency problem are given in Sect. 3, whereas in Sect. 2 we provide a detailed discussion of missing data analysis in the independent setup.

Without realizing the aforementioned inconsistency problems that can be caused because of the use of working correlations in the estimating equations under the MAR based longitudinal setup, some authors such as Wang (1999) and Rotnitzky et al. (1998) used similar estimating equations approach in non-ignorable missing mechanism-based incomplete longitudinal setup. Some authors such as Troxel et al. (1998) (see also Troxel et al. 1997) and Ibrahim et al. (2001) (see also Ibrahim et al. 1999) have used random effects based generalized linear mixed model to accommodate the longitudinal correlations and certain binary logistic models to generate the non-ignorable mechanism based response indicator variables. In general expectation-maximization (EM) techniques are used to estimate the likelihood based parameters. These approaches appear to encounter similar difficulties as the existing MAR based approaches in generating first the response indicator and then the responses so that underlying longitudinal correlation structure is satisfied. Thus the inference validity of these approaches is not yet established. This problem becomes more complicated when longitudinal correlations are not generated through random effects and writing a likelihood such as for repeated count data becomes impossible. For clarity, in this paper we discuss in detail the successes and challenges with the inferences for MAR based incomplete longitudinal models only. The non-ignorable missing data based longitudinal analysis will therefore be beyond the scope of the paper.

2 Missing Data Analysis in Independent Setup

Missing data analysis in the independent setup with multivariate responses has a long history. An early example is Lord (1955), who considered a set of incomplete trivariate normal responses collected from K independent individuals. To be specific, suppose that $y = (y_{1},y_{2},y_{3})^{\prime}$ represents a trivariate response, but not all components of $y$ were available from the K individuals. Suppose that $y_{3}$ was recorded for all K individuals, and that either $y_{1}$ or $y_{2}$, but not both, was recorded for each individual. For $j = 1,2,3$, let $K_{j}$ denote the number of individuals having the response $y_{j}$. It then follows that

$$\displaystyle{K_{1} + K_{2} = K,\;K_{3} = K.}$$

Further suppose that the $K_{1}$ individuals for whom $y_{1}$ is recorded are denoted collectively as group 1 ($G_{1}$), and the $K_{2}$ individuals with $y_{2}$ as group 2 ($G_{2}$). Now because $y_{1}$ and $y_{2}$ are correlated, the data for $G_{2}$ contain some information relevant for estimating the parameters of $y_{1}$, and the data for $G_{1}$ contain some information relevant for estimating the parameters of $y_{2}$. The problem is to use the available data as efficiently as possible for estimating the parameters concerned. Denote the distribution of $y = [y_{1},\;y_{2},\;y_{3}]^{\prime}$ as

$$\displaystyle{y \sim N(\mu,\Sigma ),}$$

with $\mu = [\mu _{1},\mu _{2},\mu _{3}]^{\prime} $ and

$$\displaystyle{\Sigma = \left (\begin{array}{lll} \sigma _{11} & \rho _{12}{[\sigma _{11}\sigma _{22}]}^{\frac{1} {2} } & \rho _{13}{[\sigma _{11}\sigma _{33}]}^{\frac{1} {2} } \\ & \sigma _{22} & \rho _{23}{[\sigma _{22}\sigma _{33}]}^{\frac{1} {2} } \\ & & \sigma _{33}\\ \end{array} \right ).}$$

Note that in this setup, there are no data available to estimate $\rho_{12}$, because $y_{1}$ and $y_{2}$ are never recorded for the same individual. For the likelihood estimation of all the other parameters, define

$$\displaystyle\begin{array}{rcl} \bar{y}_{1}^{{\ast}}& =& \frac{1} {K_{1}}\sum _{i=1}^{K_{1} }y_{1i},\;\bar{y}_{2}^{{\ast}} = \frac{1} {K_{2}}\sum _{i=1}^{K_{2} }y_{2i},\;\bar{y}_{3} = \frac{1} {K}\sum _{i=1}^{K}y_{ 3i},\;\bar{y}_{3}^{{\ast}} = \frac{1} {K_{1}}\sum _{i=1}^{K_{1} }y_{3i},\;\bar{y}_{3}^{{\ast}{\ast}} = \frac{1} {K_{2}}\sum _{i=1}^{K_{2} }y_{3i} \\ s_{11}^{{\ast}}& =& \frac{1} {K_{1}}\sum _{i=1}^{K_{1} }{[y_{1i} -\bar{ y}_{1}^{{\ast}}]}^{2},\;s_{ 22}^{{\ast}} = \frac{1} {K_{2}}\sum _{i=1}^{K_{2} }{[y_{2i} -\bar{ y}_{2}^{{\ast}}]}^{2},\;s_{ 33} = \frac{1} {K}\sum _{i=1}^{K}{[y_{ 3i} -\bar{ y}_{3}]}^{2}, \\ s_{33}^{{\ast}}& =& \frac{1} {K_{1}}\sum _{i=1}^{K_{1} }{[y_{3i} -\bar{ y}_{3}^{{\ast}}]}^{2},\;s_{ 33}^{{\ast}{\ast}} = \frac{1} {K_{2}}\sum _{i=1}^{K_{2} }{[y_{3i} -\bar{ y}_{3}^{{\ast}{\ast}}]}^{2} \\ r_{13}& =& \frac{1} {K_{1}}\sum _{i=1}^{K_{1} }[(y_{1i} -\bar{ y}_{1}^{{\ast}})(y_{ 3i} -\bar{ y}_{3}^{{\ast}})]/{[s_{ 11}^{{\ast}}s_{ 33}^{{\ast}}]}^{\frac{1} {2} }, \\ r_{23}& =& \frac{1} {K_{2}}\sum _{i=1}^{K_{2} }[(y_{2i} -\bar{ y}_{2}^{{\ast}})(y_{ 3i} -\bar{ y}_{3}^{{\ast}{\ast}})]/{[s_{ 22}^{{\ast}}s_{ 33}^{{\ast}{\ast}}]}^{\frac{1} {2} }. {}\end{array}$$

(1)

The maximum likelihood estimators for the means are then given by

$$\displaystyle{ \hat{\mu }_{1} =\bar{ y}_{1}^{{\ast}}- b_{ 13}[\bar{y}_{3}^{{\ast}}-\bar{ y}_{ 3}],\;\hat{\mu }_{2} =\bar{ y}_{2}^{{\ast}}- b_{ 23}[\bar{y}_{3}^{{\ast}{\ast}}-\bar{ y}_{ 3}],\;\mbox{ and}\;\hat{\mu }_{3} =\bar{ y}_{3}, }$$

(2)

where

$$\displaystyle{b_{13} = r_{13}{\left [\frac{s_{11}^{{\ast}}} {s_{33}^{{\ast}}}\right ]}^{\frac{1} {2} },\;\mbox{ and}\;b_{23} = r_{23}{\left [\frac{s_{22}^{{\ast}}} {s_{33}^{{\ast}{\ast}}}\right ]}^{\frac{1} {2} }.}$$

The estimators in (2) are unbiased and consistent for $\mu _{1},\;\mu _{2},\;\mbox{ and}\;\mu _{3}$, respectively. The remaining parameters may also be estimated similarly (Lord 1955).
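As a rough numerical illustration of (1) and (2), the following Python sketch computes the three mean estimates from two groups of paired observations. The function name `lord_mean_estimates` and the toy data are hypothetical and are not taken from Lord's paper. The slopes are computed as least-squares slopes (the sample covariance of the pair divided by the sample variance of $y_{3}$ within the group), which is assumed here to be what $b_{13}$ and $b_{23}$ represent.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(u, v):
    """Sample covariance with divisor n, matching the 1/K_j divisors in (1)."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / len(u)


def lord_mean_estimates(group1, group2):
    """group1 holds (y1, y3) pairs, group2 holds (y2, y3) pairs."""
    y1, y3_g1 = zip(*group1)
    y2, y3_g2 = zip(*group2)
    ybar3 = mean(y3_g1 + y3_g2)  # y3 is observed for all K = K1 + K2 individuals

    # Least-squares slopes of y1 on y3 (group 1) and of y2 on y3 (group 2).
    b13 = covariance(y1, y3_g1) / covariance(y3_g1, y3_g1)
    b23 = covariance(y2, y3_g2) / covariance(y3_g2, y3_g2)

    # Equation (2): shift each group mean by the slope times the gap between
    # the group's mean of y3 and the overall mean of y3.
    mu1 = mean(y1) - b13 * (mean(y3_g1) - ybar3)
    mu2 = mean(y2) - b23 * (mean(y3_g2) - ybar3)
    return mu1, mu2, ybar3


group1 = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]      # toy (y1, y3) pairs
group2 = [(10.0, 3.0), (12.0, 4.0), (14.0, 5.0)]   # toy (y2, y3) pairs
mu1_hat, mu2_hat, mu3_hat = lord_mean_estimates(group1, group2)
print(mu1_hat, mu2_hat, mu3_hat)
```

In this toy data set the group means of $y_{3}$ (2 and 4) differ from the overall mean (3), so the adjusted estimates move away from the raw group means 2 and 12, to about 3 and 10.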

Note that the aforementioned technique for consistent estimation of the parameters, and for other similar inferences using incomplete data, has subsequently been used by many researchers over the last six decades. See, for example, Mehta and Gurland (1973), Morrison (1973), Naik (1975), Little (1988), and Krishnamoorthy and Pannala (1999), among others. This idea of making inferences about the underlying model parameters in such a way that the missing data (assuming a small proportion of missing values) do not seriously distort the inferences has also been extended to the analysis of incomplete repeated measure data. For example, one may refer to Little (1995), Robins et al. (1995), and Paik (1997) as some of the early studies. This inference procedure for incomplete longitudinal data is discussed in detail in the next section.

In the independent setup, techniques of imputation and multiple imputation (Rubin 1976; Rubin and Schenker 1986; Meng 1994) have also been widely used. Later on, some authors also used this imputation technique in the repeated measure (longitudinal) setup. For example, here we illustrate an imputation formula from Paik (1997) in the repeated measure setup. The imputation at a given time point is done mainly by averaging over the responses of other individuals at that time who have the same covariate history as the individual concerned. Once the missing values are estimated, they are used as data, with necessary adjustments, to construct complete data based estimating equations for the desired parameters.

In a univariate longitudinal response setup, when T repeated measures are taken they become correlated and hence they jointly follow a T-dimensional multivariate distribution. Now suppose that $T_{i}$ responses are observed for the ith ($i = 1,\ldots,K$) individual. One then needs to impute $T - T_{i}$ missing values, which may be done following Paik (1997), for example. Interestingly, a unified recursive relation can be developed as follows to obtain the imputed value $\tilde{y}_{i,T_{i}+k_{i}}$ at time point $T_{i} + k_{i}$ for all $k_{i} = 1,\ldots,T - T_{i}$. For this, first define

$$\displaystyle{ \tilde{y}_{j,T_{i}+k_{i}}^{(0)} = y_{ j,T_{i}+k_{i}} }$$

(3)

for the jth individual where $j \neq i$, $j = 1,\ldots,K$. Also, let $D_{iT_{i}}$ denote the covariate history up to time point $T_{i}$ for the ith individual, and

$$\displaystyle{D_{i,T_{i}+k_{i}}^{{\ast}} = (x_{ i,T_{i}+1},\ldots,x_{i,T_{i}+k_{i}})}$$

is the covariate information for the ith individual from time $T_{i} + 1$ up to $T_{i} + k_{i}$ for $k_{i} = 1,\ldots,T - T_{i}$. Further, let $r_{jw}$, taking the value 1 or 0, indicate the response status of the jth individual at the wth time. One may then obtain $\tilde{y}_{i,T_{i}+k_{i}}$ by computing $\tilde{y}_{i,T_{i}+k_{i}}^{(k_{i})}$, that is, $\tilde{y}_{i,T_{i}+k_{i}} \equiv \tilde{ y}_{i,T_{i}+k_{i}}^{(k_{i})}$, where

$$\displaystyle\begin{array}{rcl} & & \tilde{y}_{i,T_{i}+k_{i}}^{(k_{i})} = \left [\sum _{ j=1}^{K}\tilde{y}_{ j,T_{i}+k_{i}}^{(0)}\Pi _{ u=1}^{k_{i} }r_{j,T_{i}+u}I(D_{jT_{i}} = D_{iT_{i}},D_{j,T_{i}+k_{i}}^{{\ast}} = D_{ i,T_{i}+k_{i}}^{{\ast}})\right. \\ & +& \sum _{m_{i}=1}^{k_{i}-1}\sum _{ j=1}^{K}\tilde{y}_{ j,T_{i}+k_{i}}^{(m_{i})}\Pi _{ u=k_{i}-(m_{i}-1)}^{k_{i} }(1 - r_{j,T_{i}+u}) \\ & & \times \left.\Pi _{u=1}^{k_{i}-m_{i} }r_{j,T_{i}+u}I(D_{jT_{i}} = D_{iT_{i}},D_{j,T_{i}+k_{i}}^{{\ast}} = D_{ i,T_{i}+k_{i}}^{{\ast}})\right ] \\ & \times &{ \left [\sum _{j=1}^{K}r_{ j,T_{i}+1}I(D_{jT_{i}} = D_{iT_{i}},D_{j,T_{i}+k_{i}}^{{\ast}} = D_{ i,T_{i}+k_{i}}^{{\ast}})\right ]}^{-1}. {}\end{array}$$

(4)

Note that $\tilde{y}_{i,T_{i}+k_{i}} \equiv \tilde{ y}_{i,T_{i}+k_{i}}^{(k_{i})}$ is an unbiased estimate of $\mu _{i,T_{i}+k_{i}}$ because the individuals used to impute the missing value of the ith subject have the same covariate history up to time point $T_{i} + k_{i}$, rather than only up to time point $T_{i}$ (Paik 1997).
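For the first missing occasion ($k_{i} = 1$), the second sum in (4) is empty and the formula reduces to an average of the observed responses at time $T_{i} + 1$ over the individuals whose covariate history up to that time matches that of individual i. A minimal Python sketch of this special case is given below. The function name, the list-based data layout, and the toy values are assumptions made for illustration only, and the general recursion for $k_{i} > 1$ is not implemented.

```python
def impute_first_missing(i, responses, observed, covariates):
    """Impute the response of individual i at the first missing time point.

    responses[j][t]  : response of individual j at time index t (None if missing)
    observed[j][t]   : 1 if that response was observed, 0 otherwise
    covariates[j][t] : covariate value of individual j at time index t
    """
    t_next = sum(observed[i])  # monotone pattern: the first T_i entries are observed
    if t_next >= len(observed[i]):
        raise ValueError("individual has no missing response")

    # Donors must have responded at t_next and share the covariate history
    # of individual i up to and including t_next.
    history = covariates[i][: t_next + 1]
    donors = [
        responses[j][t_next]
        for j in range(len(responses))
        if observed[j][t_next] == 1 and covariates[j][: t_next + 1] == history
    ]
    if not donors:
        raise ValueError("no responding individual shares this covariate history")
    return sum(donors) / len(donors)


# Toy data: K = 4 individuals, T = 3 occasions, one binary covariate per occasion.
responses = [[1.0, 2.0, None], [1.5, 2.5, 3.0], [0.5, 1.0, 5.0], [2.0, 2.0, 4.0]]
observed = [[1, 1, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1]]
covariates = [[0, 1, 1], [0, 1, 1], [0, 1, 1], [0, 0, 1]]
imputed = impute_first_missing(0, responses, observed, covariates)
print(imputed)
```

Here the fourth individual is excluded as a donor because its covariate history differs at the second occasion, so the imputed value is the average of the other two responses at the third occasion.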

3 Missing Data Models in Longitudinal Setup

Let $Y_{it}$ be the potential response from the ith ($i = 1,\ldots,K$) individual at time point t, which may or may not be observed, and let $x_{it} = (x_{it1},\ldots,x_{itp})^{\prime}$ be the corresponding p-dimensional covariate vector, which is assumed to be available for all times $t = 1,\ldots,T$. In this setup, K is large ($K \rightarrow \infty$) and T is small, such as 3 or 4. Suppose that $\beta = (\beta _{1},\ldots,\beta _{p})^{\prime}$ denotes the effect of $x_{it}$ on $y_{it}$. Irrespective of whether $Y_{it}$ is observed or not, it is appropriate in the longitudinal setup to assume that the repeated responses follow a correlation model with known functional forms for the mean and the variance, but the correlation structure may be unknown. Recall that in the independent setup, Lord (1955) considered multivariate responses having a correlation structure, and incompleteness arose because of missing information on some response variables. In the present longitudinal setup, by contrast, repeated responses from an individual form a multivariate response with suitable mean, variance, and correlation structures, but an individual may not provide responses for the whole duration of the study. As indicated in the last section, suppose that for the ith ($i = 1,\ldots,K$) individual $T_{i}$ responses ($1 < T_{i} \leq T$) are collected. Also suppose that the remaining $T - T_{i}$ potential responses are missing and that the non-missing responses occur in a monotonic pattern.

As far as the mean, variance, and correlation structure of the potential responses are concerned, it is convenient to define them for the complete data. Let ${y_{i}}^{c} = {(y_{i1},\cdots \,,y_{it},\cdots \,,y_{iT})}^{{\prime}}$ and ${X_{i}}^{c} = {(x_{i1},\cdots \,,x_{it},\cdots \,,x_{iT})}^{^{\prime} }$ denote the T ×1 complete outcome vector and T ×p covariate matrix, respectively, for the i-th (i = 1, ⋯ , K) individual over T successive points in time. Also, let

$$\displaystyle{ E({Y _{i}}^{c}\vert {x_{ i}}^{c}) ={ \mu _{ i}}^{c}(\beta ) = {(\mu _{ i1}(\beta ),\cdots \,,\mu _{it}(\beta ),\cdots \,,\mu _{iT}(\beta ))}^{^{\prime} } }$$

(5)

where $\mu _{it}(\beta ) = {h}^{-1}(\eta _{it})\;\;\mbox{ with}\;\;\eta _{it} = x_{it}^{^{\prime} }\beta$, h being a suitable link function. For example, for linear models, a linear link function is used so that $\mu _{it}(\beta ) = x^{\prime} _{it}\beta;$ whereas for the binary data a logistic link function is commonly used so that $\mu _{it}(\beta ) =\exp (\eta _{it})/[1 +\exp (\eta _{it})]$, and for count data a log linear link function is used so that $\mu _{it}(\beta ) =\exp (\eta _{it})$. Further let

$$\displaystyle{ \Sigma _{i}^{c}(\beta,\rho ) ={ A_{ i}^{c}}^{\frac{1} {2} }(\beta )\tilde{C}_{i}(\rho,x_{i}^{c}){A_{i}^{c}}^{\frac{1} {2} }(\beta ) }$$

(6)

be the true covariance matrix of $y_{i}^{c}$, where $A_{i}^{c}(\beta ) = \mbox{ diag}[\sigma _{i11}(\beta ),\cdots \,,\sigma _{itt}(\beta ),\cdots \,,\sigma _{iTT}(\beta )]$ with $\sigma _{itt}(\beta ) = var(Y _{it})$, and $\tilde{C}_{i}(\rho,x_{i}^{c})$ is the correlation matrix for the ith individual with ρ as a suitable vector of correlation parameters, for example, $\rho \equiv (\rho _{1},\ldots,\rho _{\ell},\ldots,\rho _{T-1})^{\prime} $, where ρ _ℓ is known to be the lag ℓ auto-correlation. Note that when covariates are time dependent, the true correlation matrix is free from time-dependent covariates in linear longitudinal setup, but it depends on the time-dependent covariates through $X_{i}^{c}$ in the discrete longitudinal setup (Sutradhar 2010). In the stationary case, that is, when covariates are time independent, we will denote the correlation matrix by $\tilde{C}(\rho )$ in the complete longitudinal setup, and similar to Sutradhar (2010, 2011), this matrix satisfies the auto-correlation structure given by

$$\displaystyle{ \tilde{C}(\rho ) = \left [\begin{array}{ccccc} 1 & \rho _{1} & \rho _{2} & \cdots &\rho _{T-1} \\ \rho _{1} & 1 & \rho _{1} & \cdots &\rho _{T-2}\\ \vdots & \vdots & \vdots & & \vdots \\ \rho _{T-1} & \rho _{T-2} & \rho _{T-3} & \cdots & 1\\ \end{array} \right ], }$$

(7)

where for $\ell = 1,\ldots,T - 1$, $\rho_{\ell}$ is known to be the ℓth lag auto-correlation. Note that when this correlation structure (7) is used in the incomplete longitudinal setup, it is denoted by $\tilde{C_{i}}(\rho )$, as it is constructed for the $T_{i}$ available responses.
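The stationary structure (7) and the covariance form (6) can be sketched in a few lines of plain Python. The helper names below are hypothetical, and the lag correlations and variances are arbitrary illustrative values. The leading $T_{i} \times T_{i}$ block is taken under the assumption of a monotone missing pattern, so that the first $T_{i}$ responses are the observed ones.

```python
import math


def stationary_correlation(rho):
    """Build the T x T matrix in (7) from rho = [rho_1, ..., rho_{T-1}]."""
    lags = [1.0] + list(rho)          # lags[l] is the lag-l correlation, lags[0] = 1
    size = len(lags)
    return [[lags[abs(u - t)] for t in range(size)] for u in range(size)]


def leading_block(matrix, t_i):
    """Keep the first t_i rows and columns (responses observed up to T_i)."""
    return [row[:t_i] for row in matrix[:t_i]]


def covariance_from(variances, correlation):
    """A^{1/2} C A^{1/2} as in (6), with A = diag(variances)."""
    sd = [math.sqrt(v) for v in variances]
    n = len(sd)
    return [[sd[u] * correlation[u][t] * sd[t] for t in range(n)] for u in range(n)]


c_full = stationary_correlation([0.5, 0.25])   # T = 3, illustrative lag correlations
c_i = leading_block(c_full, 2)                 # individual with T_i = 2
sigma_i = covariance_from([4.0, 9.0], c_i)     # illustrative variances
print(c_full)
print(sigma_i)
```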

As far as the missing mechanism is concerned, it is customary to assume that a longitudinal response may be missing completely at random (MCAR), or missing at random (MAR), or that the missingness may be non-ignorable. Under the MCAR mechanism, the missingness does not depend on any present, past, or future responses. Under the MAR mechanism, the missingness depends only on the past responses but not on the present or future responses, whereas under the non-ignorable mechanism the missingness depends on the past, present, and future possible responses. In notation, let $R_{it}$ be a response indicator variable at time t ($t = 1,\ldots,T$) for the ith ($i = 1,\ldots,K$) individual, so that

$$\displaystyle{ R_{it} = \left \{\begin{array}{ll} 1,&\mbox{ if}\ Y _{it}\mbox{ is observed} \\ 0,&\mbox{ otherwise.} \end{array} \right. }$$

(8)

Note that all individuals provide the responses at the first time point $t = 1$. Thus, we set $R_{i1} = 1$ with $P(R_{i1} = 1) = 1.0$ for all $i = 1,\ldots,K$. Further we assume that the response indicators satisfy the monotonic relationship

$$\displaystyle{ R_{i1} \geq R_{i2} \geq \cdots \geq R_{it} \geq \cdots \geq R_{iT}. }$$

(9)

Next suppose that $r_{it}$ denotes the observed value of $R_{it}$. For $t = 2,\ldots,T,$ one may then describe the aforementioned three missing mechanisms as

$$\displaystyle\begin{array}{rcl} \mbox{ MCAR Model}&: & Pr(R_{it} = 1\mid {y_{i}}^{c},x_{ i},r_{i,t-1} = 1) = Pr(R_{it} = 1\mid r_{i,t-1} = 1) {}\\ \mbox{ MAR Model}&: & Pr(R_{it} = 1\mid {y_{i}}^{c},x_{ i},r_{i,t-1} = 1) {}\\ & =& Pr(R_{it} = 1\mid y_{i1},\cdots \,,y_{i,t-1},x_{i},r_{i,t-1} = 1) {}\\ \mbox{ Non-ignorable Model}&: & Pr(R_{it} = 1\mid {y_{i}}^{c},x_{ i},r_{i,t-1} = 1) {}\\ & =& Pr(R_{it} = 1\mid y_{i1},\cdots \,,y_{i,t-1},y_{it},\ldots,y_{iT},x_{i},r_{i,t-1} = 1) {}\\ \end{array}$$

(Little and Rubin 1987; Laird 1988; Fitzmaurice et al. 1996). Furthermore, it follows under the monotonic missing pattern (9) that $Pr(R_{it} = 1\vert y_{i}^{c},x_{i},r_{i,t-1} = 0) = 0$ irrespective of the missing mechanism. Note that the inferences based on the non-ignorable missing mechanism may be quite complicated, and we do not include this complicated mechanism in the current paper.

3.1 Inferences When Longitudinal Responses Are Subject to MCAR

When the longitudinal responses are MCAR, $R_{it}$ does not depend on the past, present, or future responses. In such a situation, $R_{it}$ and $Y_{it}$ are independent, implying that

$$\displaystyle{ E[R_{it}(Y _{it} -\mu _{it}(\beta ))] = E[R_{it}]E[Y _{it} -\mu _{it}(\beta )] = 0, }$$

(10)

because $E[Y _{it} -\mu _{it}(\beta )] = 0$. It is then clear that the inference for β involved in $\mu_{it}(\beta)$ is not affected by the MCAR mechanism. Thus, one may estimate the regression effects β consistently and efficiently by solving the generalized quasi-likelihood (GQL) estimating equation

$$\displaystyle{ \sum _{i=1}^{K}\frac{\partial \mu _{i}^{^{\prime} }(\beta )} {\partial \beta } \Sigma _{i}^{-1}(\beta,\hat{\rho })(y_{ i} -\mu _{i}(\beta )) = 0, }$$

(11)

where, for the $T_{i}$-dimensional observed response vector $y_{i} = (y_{i1},\ldots,y_{iT_{i}})^{\prime}$,

$$\displaystyle\begin{array}{rcl} \mu _{i}(\beta ) = E[Y _{i}]& =& {(\mu _{i1}(\beta ),\cdots \,,\mu _{it}(\beta ),\cdots \,,\mu _{iT_{i}}(\beta ))}^{^{\prime} } {}\\ \Sigma _{i}(\beta,\hat{\rho })& =& A_{i}^{1/2}(\beta )\tilde{C}_{ i}(\rho,x_{i})A_{i}^{1/2}(\beta ), {}\\ \end{array}$$

with $A_{i}(\beta ) = \mbox{ diag}(\sigma _{i,11}(\beta ),\cdots \,,\sigma _{i,tt}(\beta ),\cdots \,,\sigma _{i,T_{i}T_{i}}(\beta ))$, where $\sigma _{i,tt}(\beta ) = \mbox{ var}[Y _{it}]$. Note that the incomplete data based estimating equation (11) can be written in terms of pretended complete data. To be specific, by using the available responses $y_{i} = (y_{i1},\ldots,y_{iT_{i}})^{\prime} $ corresponding to the known response indicators

$$\displaystyle{R_{i}^{c} = r_{ i}^{c} = \left [\begin{array}{cc} I_{T_{i}} & 0 \\ 0 &0 \end{array} \right ],}$$

one may write the GQL estimating equation (11) under the MCAR mechanism as

$$\displaystyle{ \sum _{i=1}^{K}\frac{\partial {\mu _{i}^{c}}^{^{\prime} }(\beta )} {\partial \beta }{ \left [\{I - r_{i}^{c}\} + r_{ i}^{c}\Sigma _{ i}^{c}(\beta,\hat{\rho }){r_{ i}^{c}}^{^{\prime} }\right ]}^{-1}r_{ i}^{c}\left (y_{ i}^{c} -\mu _{ i}^{c}(\beta )\right ) = 0, }$$

(12)

where $y_{i}^{c} = {(y_{i}^{^{\prime} },y_{im}^{^{\prime} })}^{^{\prime} }$, with $y_{im}$ representing the $(T - T_{i})$-dimensional vector of missing responses. These responses are unobserved, but for computational purposes in the present approach one can, without any loss of generality, set $y_{im}$ to a zero vector. Let $\hat{\beta }_{GQL,MCAR}$ denote the solution of (11) or (12). This estimator is asymptotically unbiased and hence consistent for β.
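That (12) reproduces (11) can be checked numerically for a single individual. The sketch below assumes a linear model, so that $\partial \mu_{i}^{\prime}(\beta)/\partial \beta = X_{i}^{\prime}$, uses made-up values for the covariates, β, and $\Sigma_{i}^{c}$, and fills the unobserved response with zero. The small matrix helpers are hypothetical conveniences written in plain Python.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


T, T_i = 3, 2
x_c = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]            # T x p covariate matrix
beta = [0.5, 1.0]
mu_c = [sum(x * b for x, b in zip(row, beta)) for row in x_c]
sigma_c = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]]
y_c = [1.0, 1.0, 0.0]                                  # missing response set to 0

# Equation (11): contribution based on the observed part only.
x_i = x_c[:T_i]
sigma_i = [row[:T_i] for row in sigma_c[:T_i]]
resid_i = [[y_c[t] - mu_c[t]] for t in range(T_i)]
score_11 = matmul(matmul(transpose(x_i), inverse(sigma_i)), resid_i)

# Equation (12): pretended complete data with r_i^c = diag(1, ..., 1, 0, ..., 0).
r_c = [[float(u == t and t < T_i) for t in range(T)] for u in range(T)]
identity = [[float(u == t) for t in range(T)] for u in range(T)]
middle = matmul(matmul(r_c, sigma_c), transpose(r_c))
weight = [[identity[u][t] - r_c[u][t] + middle[u][t] for t in range(T)]
          for u in range(T)]
resid_c = [[y_c[t] - mu_c[t]] for t in range(T)]
score_12 = matmul(matmul(transpose(x_c), inverse(weight)), matmul(r_c, resid_c))

print(score_11, score_12)
```

The two contributions agree because the bracketed matrix in (12) is block diagonal, with $\Sigma_{i}$ in the observed block and an identity in the missing block, while $r_{i}^{c}$ zeroes out the residuals of the missing responses.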

Note that the computation of the $\tilde{C}_{i}(\hat{\rho },x_{i})$ matrix in (11) in general, i.e., when covariates are time dependent, depends on the specific correlation structure (Sutradhar 2010). In stationary cases, as well as in the linear longitudinal model setup, one may, however, compute the stationary correlation matrix $\tilde{C}_{i}(\hat{\rho })$ by first computing a larger $\tilde{C}(\hat{\rho })$ matrix for $\ell= 1,\ldots,T - 1$, and then using the desired part of this large matrix for $t = 1,\ldots,T_{i}$. Turning back to the computation for the larger matrix with dimension $T = \mbox{ max}_{1\leq i\leq K}T_{i}$ for $T_{i} \geq 2$, we exploit the observed response indicator $r_{it}$ given by

$$\displaystyle\begin{array}{rcl} r_{it}& =& \left \{\begin{array}{ll} 1&\mbox{ if}\;t \leq T_{i} \\ 0&\mbox{ if}\;T_{i} < t \leq T. \end{array} \right.{}\\ \end{array}$$

for all $t = 1,\ldots,T$. For known β and $\sigma_{itt}$, the ℓth lag correlation estimate $\hat{\rho }_{\ell}$ for the larger $\tilde{C}(\hat{\rho })$ matrix may be computed as

$$\displaystyle{ \hat{\rho }_{\ell} = \frac{\sum _{i=1}^{K}\sum _{t=1}^{T-\ell}r_{it}r_{i,t+\ell}\left [\left (\frac{y_{it}-x^{\prime} _{it}\beta } {\sigma _{itt}} \right )\left (\frac{y_{i,t+\ell}-x^{\prime} _{i,t+\ell}\beta } {\sigma _{i,t+\ell,t+\ell}} \right )\right ]/\sum _{i=1}^{K}\sum _{t=1}^{T-\ell}r_{it}r_{i,t+\ell}} {\sum _{i=1}^{K}\sum _{t=1}^{T}r_{it}{\left [\frac{y_{it}-x^{\prime} _{it}\beta } {\sigma _{itt}} \right ]}^{2}/\sum _{i=1}^{K}\sum _{t=1}^{T}r_{it}}, }$$

(13)

(cf. Sneddon and Sutradhar 2004, eqn. (16)) for $\ell= 1,\ldots,T - 1$. Note that as this estimator contains $\hat{\beta }_{GQL,MCAR}$, both (11) and (13) have to be computed iteratively until convergence.
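A moment estimator of the form (13) can be sketched as follows in Python. The function takes already standardized residuals as input, so the choice of mean and variance functions is left to the caller. The function name, the input layout, and the toy numbers are illustrative assumptions. Pairs in which either response is unobserved are skipped, which is the role of the product $r_{it}r_{i,t+\ell}$.

```python
def lag_correlations(std_resid, observed):
    """Return [rho_1, ..., rho_{T-1}] from standardized residuals.

    std_resid[i][t] : standardized residual of individual i at time index t
                      (any placeholder value where the response is missing)
    observed[i][t]  : response indicator r_it (1 observed, 0 missing)
    """
    n_ind, n_time = len(std_resid), len(std_resid[0])

    # Denominator of (13): average squared residual over all observed responses.
    n_obs = sum(observed[i][t] for i in range(n_ind) for t in range(n_time))
    scale = sum(observed[i][t] * std_resid[i][t] ** 2
                for i in range(n_ind) for t in range(n_time)) / n_obs

    rho_hat = []
    for lag in range(1, n_time):
        total, n_pairs = 0.0, 0
        for i in range(n_ind):
            for t in range(n_time - lag):
                if observed[i][t] and observed[i][t + lag]:
                    total += std_resid[i][t] * std_resid[i][t + lag]
                    n_pairs += 1
        # Numerator of (13): average lag product over the available pairs.
        rho_hat.append(total / n_pairs / scale if n_pairs else float("nan"))
    return rho_hat


# Toy input: two individuals, the second one drops out before the third occasion.
std_resid = [[1.0, -1.0, 1.0], [1.0, -1.0, 0.0]]
observed = [[1, 1, 1], [1, 1, 0]]
rho_hat = lag_correlations(std_resid, observed)
print(rho_hat)
```

In practice this function would be called inside the iteration between (11) and (13), with the residuals recomputed from the current value of the regression estimate at each step.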

Further note that in the existing GEE approach, instead of (11), one solves the estimating equation

$$\displaystyle{ \sum _{i=1}^{K}\frac{\partial \mu _{i}^{^{\prime} }(\beta )} {\partial \beta } V _{i}^{-1}(\beta,\hat{\alpha })(y_{ i} -\mu _{i}(\beta )) = 0, }$$

(14)

[Liang and Zeger 1986] where $V _{i}(\beta,\hat{\alpha }) = A_{i}^{1/2}(\beta )Q_{i}(\alpha )A_{i}^{1/2}(\beta )$, with $Q_{i}(\alpha)$ as the $T_{i} \times T_{i}$ “working” correlation matrix of $y_{i}$. It is, however, known that this GEE approach may sometimes encounter consistency breakdown (Crowder 1995) because of the difficulty in estimating the “working” correlation or covariance structure, leading to the failure of estimation of β or the non-convergence of the β estimator to β. Furthermore, even if the GEE estimate of β is consistent, it may be less efficient than the simpler independence assumption based moment or quasi-likelihood (QL) estimate (Sutradhar and Das 1999; Sutradhar 2011). It should therefore be clear that the GEE approach, even if corrected for the missing mechanism, may encounter similar inconsistency and inefficiency problems in estimating the regression parameters.

We also remark that even though the non-response probability is not affected by the past history under the MCAR mechanism, the efficiency of both the GQL and GEE estimators will decrease if $T_{i}$ is very small compared to the attempted complete duration T, that is, if $T - T_{i}$ is large. The value of $T_{i}$ depends on the probability $P[R_{it} = 1]$, which in general decreases with t due to the monotonic condition (9). This is because under this monotonic property (9) and following the MCAR mechanism, one writes

$$\displaystyle\begin{array}{rcl} Pr[R_{it} = 1]& \equiv & Pr[R_{i1} = 1,R_{i2} = 1,\ldots,R_{it} = 1] \\ & =& \Pi _{j=2}^{t}P[R_{ ij} = 1], {}\end{array}$$

(15)

which gets smaller as t gets larger, implying that $T_{i}$ can be small compared to T if $P[R_{ij} = 1]$ is noticeably below 1, such as $P[R_{ij} = 1] = 0.90$, say.
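As a small worked example of (15), the Python sketch below multiplies the per-occasion response probabilities, using the illustrative value 0.90 at every occasion. The function name is hypothetical. The expected number of responses, computed as the sum of the $Pr[R_{it} = 1]$ over t, is an addition made here for illustration and relies on $T_{i} = \sum_{t}R_{it}$ under the monotone pattern (9).

```python
import math


def prob_responding(stay_probs):
    """Pr[R_it = 1] for t = 1, ..., T given P[R_ij = 1] for j = 2, ..., T.

    R_i1 = 1 with probability one, so the first entry is always 1.0.
    """
    return [math.prod(stay_probs[:t]) for t in range(len(stay_probs) + 1)]


probs = prob_responding([0.90, 0.90, 0.90])   # T = 4 occasions
expected_t_i = sum(probs)                     # E[T_i] = sum over t of Pr[R_it = 1]
print(probs, expected_t_i)
```

With a 0.90 chance of remaining at each occasion, only about 73% of individuals are still responding at the fourth time point.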

3.2 Inferences When Longitudinal Responses Are Subject to MAR

Unlike in the MCAR case, $R_{it}$ and $y_{it}$ are not independent under the MAR mechanism. That is,

$$\displaystyle{ E[R_{it}(Y _{it} -\mu _{it}(\beta ))]\neq 0\;\;\mbox{ under MAR}. }$$

(16)

This is because

$$\displaystyle\begin{array}{rcl} & & E\left [R_{it}(Y _{it} -\mu _{it}(\beta ))\mid H_{i,t-1}(y)\right ] \\ & =& E_{Y _{it}}E\left [R_{it}(Y _{it} -\mu _{it}(\beta ))\mid Y _{it},H_{i,t-1}(y)\right ] \\ & =& E_{Y _{it}}\left [\{(Y _{it} -\mu _{it}(\beta ))\vert H_{i,t-1}(y)\}E\{R_{it}\vert Y _{it},H_{i,t-1}(y)\}\right ] \\ & =& E_{Y _{it}}\left [\{(Y _{it} -\mu _{it}(\beta ))\vert H_{i,t-1}(y)\}E\{R_{it}\vert H_{i,t-1}(y)\}\right ] {}\end{array}$$

(17)

as $R_{it}$ does not depend on $Y_{it}$ by the definition of MAR.

Next due to the monotonic property (9) of the response indicators

$$\displaystyle\begin{array}{rcl} & & E\left [R_{it}\mid H_{i,t-1}(y)\right ] \\ & =& P\left [R_{i1} = 1,R_{i2} = 1,\cdots \,,R_{i,t-1} = 1,R_{it} = 1\vert H_{i,t-1}(y)\right ] \\ & =& P(R_{i1} = 1)P\left [R_{i2} = 1\mid R_{i1} = 1;H_{i1}(y)\right ]\cdots \\ & \times & P\left [R_{it} = 1\mid R_{i1} = 1,\cdots \,,R_{i,t-1} = 1;H_{i,t-1}(y)\right ] \\ & =& \prod _{j=1}^{t}g_{ ij}(y_{i,j-1},\cdots \,,y_{i,j-q};\gamma ) \\ & =& w_{it}\{H_{i,t-1}(y);\gamma \}, {}\end{array}$$

(18)

and

$$\displaystyle{ E_{Y _{it}}\left [(Y _{it} -\mu _{it}(\beta ))\vert H_{i,t-1}(y)\right ] = (\lambda _{it}(H_{i,t-1}(y),\beta,\rho ) -\mu _{it}(\beta )), }$$

(19)

where $\lambda _{it}(H_{i,t-1}(y),\beta,\rho )$ is the conditional mean of $Y_{it}$. In (18), one may, for example, use $g_{ij}(\gamma)$ as

$$\displaystyle\begin{array}{rcl} g_{ij}(\gamma )& =& Pr[(R_{ij} = 1)\vert R_{i1} = 1,\ldots,R_{i,j-1} = 1,H_{i,j-1}(y)] \\ & =& \frac{\exp (1 +\gamma y_{i,j-1})} {1 +\exp (1 +\gamma y_{i,j-1})}. {}\end{array}$$

(20)

Now because both $w_{it}\{H_{i,t-1}(y);\gamma \}$ and $\lambda _{it}(H_{i,t-1}(y),\beta,\rho )$ are functions of the past history of responses $H_{i,t-1}(y)$, and because

$$\displaystyle{ E_{H_{i,t-1}(y)}[\lambda _{it}(H_{i,t-1}(y),\beta,\rho ) -\mu _{it}(\beta )] = 0, }$$

(21)

it then follows from (17), by (18) and (19), that

$$\displaystyle{ E[R_{it}(Y _{it} -\mu _{it}(\beta ))] = E_{H_{i,t-1}(y)}E\left [R_{it}(Y _{it} -\mu _{it}(\beta ))\mid H_{i,t-1}(y)\right ]\neq 0, }$$

(22)

unless $w_{it}\{H_{i,t-1}(y);\gamma \}$ is a constant free of $H_{i,t-1}(y)$, which is, however, impossible under the MAR mechanism, as opposed to the MCAR mechanism. Thus, $E[R_{it}\{Y _{it} -\mu _{it}(\beta )\}]\neq 0$.
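The contrast between (10) and (22) can be illustrated by a small Monte Carlo sketch in Python. It assumes T = 2, standard normal responses with $\mu_{it} = 0$ and lag 1 correlation 0.7, and a dropout model of the logistic form (20). All of these numerical choices, the function names, and the sample size are assumptions made for illustration. Setting $\gamma = 0$ makes the response probability free of $y_{i1}$, which corresponds to MCAR.

```python
import math
import random


def expit(u):
    return 1.0 / (1.0 + math.exp(-u))


def mean_indicator_residual(gamma, rho=0.7, n=50000, seed=1):
    """Monte Carlo average of R_i2 * (Y_i2 - mu_i2) with mu_i2 = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        # Y_i2 has unit variance and correlation rho with Y_i1.
        y2 = rho * y1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Response indicator generated from the past response only, as in (20).
        r2 = 1 if rng.random() < expit(1.0 + gamma * y1) else 0
        total += r2 * (y2 - 0.0)
    return total / n


mcar_value = mean_indicator_residual(gamma=0.0)   # response probability free of y1
mar_value = mean_indicator_residual(gamma=1.0)    # response probability depends on y1
print(mcar_value, mar_value)
```

Under these assumptions the average of $R_{i2}(Y_{i2} - \mu_{i2})$ should be close to zero for $\gamma = 0$ and clearly positive for $\gamma = 1$, because individuals with a larger $y_{i1}$, and hence on average a larger $y_{i2}$, are more likely to stay in the study.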

3.2.1 Existing Partially Standardized GEE Estimation for Longitudinal Data Subject to MAR

Note, however, that

$$\displaystyle\begin{array}{rcl} & & E\left \{ \frac{R_{it}} {w_{it}\{H_{i,t-1}(y);\gamma \}}(Y _{it} -\mu _{it}(\beta ))\right \} \\ & =& E_{H_{i,t-1}(y)}\left [\left (\lambda _{it}(H_{i,t-1}(y),\beta,\rho ) -\mu _{it}(\beta )\right )\right ] = 0.{}\end{array}$$

(23)
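The contrast between the biased unweighted moment and the unbiased inverse-weighted moment in (23) can be checked exactly in a small case. The sketch below assumes T = 2, a binary first response with mean μ_i1, the linear conditional mean $\lambda _{i2} =\mu _{i2} +\rho (y_{i1} -\mu _{i1})$ (the binary AR(1) form that appears in (46)), and the response model (20). The function names are illustrative only.

```python
import math

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def naive_and_weighted_means(mu1, mu2, rho, gamma):
    """Exact E[R_i2 (Y_i2 - mu_i2)] and E[(R_i2 / w_i2)(Y_i2 - mu_i2)] for T = 2.

    Both expectations are taken by enumerating y_i1 in {0, 1}; given y_i1,
    R_i2 and Y_i2 are treated as conditionally independent (MAR).
    """
    naive = 0.0
    weighted = 0.0
    for y1 in (0, 1):
        p_y1 = mu1 if y1 == 1 else 1.0 - mu1
        w2 = expit(1.0 + gamma * y1)          # w_i2 = g_i1 * g_i2 with g_i1 = 1
        lam2 = mu2 + rho * (y1 - mu1)         # conditional mean of Y_i2 given y_i1
        naive += p_y1 * w2 * (lam2 - mu2)             # E[R (Y - mu) | y1] = w2 (lam2 - mu2)
        weighted += p_y1 * (w2 / w2) * (lam2 - mu2)   # inverse weighting cancels w2
    return naive, weighted
```

In this setting the unweighted moment equals $\rho \mu _{i1}(1 -\mu _{i1})\{w_{i2}(1) - w_{i2}(0)\}$, which vanishes only when ρ = 0 or γ = 0, while the weighted moment is zero for every γ.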

Now suppose that

$$\displaystyle{\Delta _{i} = \mbox{ diag}[\delta _{i1},\delta _{i2},\cdots \,,\delta _{iT_{i}}]\;\mbox{ with}\;\delta _{it} = R_{it}/w_{it}\{H_{i,t-1}(y);\gamma \}}$$

implying that $E[\Delta _{i}\vert H_{i}(y)] = I_{T_{i}}$, and where H _i(y) is used to denote appropriate past history showing that the response indicators are generated based on observed responses only.

By observing the unconditional expectation property from (23), in the spirit of GEE [Liang and Zeger 1986], Robins et al. (1995, eqn. (10), p. 109) proposed a conditional inverse weights based PSGEE for the estimation of β which has the form

$$\displaystyle\begin{array}{rcl} & & \sum _{i=1}^{K}\frac{\partial E_{H_{i}(y)}E[\{\Delta _{i}\mu _{i}(\beta )\}^{\prime} \vert H_{i}(y)]} {\partial \beta } V _{i}^{-1}(\hat{\alpha })\{\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\vert H_{i}(y)\} \\ & & =\sum _{ i=1}^{K}\frac{\partial \{\mu _{i}(\beta )\}^{\prime} } {\partial \beta } V _{i}^{-1}(\hat{\alpha })\{\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\} = 0, {}\end{array}$$

(24)

(see also Paik 1997, eqn. (1), p. 1321). Note that we refer to the GEE in (24) as a partly or partially standardized GEE (PSGEE) because $V _{i}(\alpha ) =\hat{ \mbox{ cov}}(Y _{i})$ used in this GEE is a partial weight matrix which ignores the missing mechanism, whereas $\mbox{ cov}[\Delta _{i}(y_{i} -\mu _{i}(\beta ))]$ would be a full weight matrix.

Note that over the last decade many researchers have used this PSGEE approach for studying various aspects of longitudinal data subject to non-response. See, for example, the studies by Rotnitzky et al. (1998), Preisser et al. (2002), and Birmingham et al. (2003), among others. However, even if the MAR mechanism is accommodated to develop an unbiased estimating function $\Delta _{i}(y_{i} -\mu _{i}(\beta ))$ (for 0) to construct the fully standardized GEE (FSGEE), the consistency of the estimator of β may break down (see Crowder 1995 for complete longitudinal models) because of the use of the “working” covariance matrix V _i(α), whereas the true covariance matrix for y _i is given by $\mbox{ cov}[Y _{i}] = \Sigma _{i}(\rho )$. This can happen in those cases where α is not estimable. To be clearer, V _i(α) is simply a “working” covariance matrix of y _i, whereas a proper estimating equation must use the correct covariance matrix (or its consistent estimate) of $\{\Delta _{i}(y_{i} -\mu _{i}(\beta ))\}$.

To understand the roles of both the missing mechanism and the longitudinal correlation structure in constructing a proper estimating equation, we now provide the following three estimating equations for β. The difficulties and/or advantages encountered by these equations are also indicated.

3.2.2 Partially Standardized GQL (PSGQL) Estimation for Longitudinal Data Subject to MAR

When V _i(α) matrix in (24) is replaced with the true $T_{i} \times T_{i}$ covariance matrix of the available responses, that is, $\Sigma _{i}(\rho ) = \mbox{ cov}[Y _{i}]$, one obtains the PSGQL estimating equation given by

$$\displaystyle{ \sum _{i=1}^{K}\frac{\partial \{\mu _{i}(\beta )\}^{\prime} } {\partial \beta } \Sigma _{i}^{-1}(\hat{\rho })\{\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\} = 0, }$$

(25)

which may also produce biased and hence inconsistent estimates. This is because Σ _i(ρ) may still be very different from the covariance matrix of the actual variable $\{\Delta _{i}(y_{i} -\mu _{i}(\beta ))\}$. Thus, if the proportion of missing values is large, one may not obtain a convergent solution to the estimating equation (25), and the consistency for β would break down (Crowder 1995). The convergence problems encountered by (24) would naturally be more severe, as even in the complete data case V _i(α) may not be estimable.

3.2.3 Partially Standardized Conditional GQL (PSCGQL) Estimation for Longitudinal Data Subject to MAR

Suppose that one uses conditional (on history) variance

$$\displaystyle{ \mbox{ cov}\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y) = \Sigma _{ich}^{{\ast}}(H_{ i}(y),\beta,\rho,\gamma ), }$$

(26)

to construct the estimating equation. Then following (25), one may write the PSCGQL estimating equation given by

$$\displaystyle{ \sum _{i=1}^{K}\frac{\partial \{\mu _{i}(\beta )\}^{\prime} } {\partial \beta }{ \Sigma _{ich}^{{\ast}}}^{-1}(H_{ i}(y),\beta,\rho,\gamma )\{\Delta _{i}(y_{i} -\mu _{i}(\beta ))\} = 0 }$$

(27)

It is, however, seen that

$$\displaystyle\begin{array}{rcl} & &{ \Sigma _{ich}^{{\ast}}}^{-1}(H_{ i}(y),\beta,\rho,\gamma )[\Delta _{i}(Y _{i} -\mu _{i}(\beta ))] \\ & \rightarrow &{ \Sigma _{ich}^{{\ast}}}^{-1}(H_{ i}(y),\beta,\rho,\gamma )E[\{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\}\vert H_{i}(y)] \\ & =&{ \Sigma _{ich}^{{\ast}}}^{-1}(H_{ i}(y),\beta,\rho,\gamma )[\lambda _{i}(H_{i}(y)) -\mu _{i}(\beta )] {}\end{array}$$

(28)

But,

$$\displaystyle\begin{array}{rcl} & & E_{H_{i}(y)}\left [\frac{\partial \{\mu _{i}(\beta )\}^{\prime} } {\partial \beta }{ \Sigma _{ich}^{{\ast}}}^{-1}(H_{ i}(y),\beta,\rho,\gamma )\right. \\ & \times & \left.[\lambda _{i}(H_{i}(y)) -\mu _{i}(\beta )]\right ]\neq 0, {}\end{array}$$

(29)

even though

$$\displaystyle{E_{H_{i}(y)}[\lambda _{i}(H_{i}(y)) -\mu _{i}(\beta )] = 0.}$$

Thus, the PSCGQL estimating equation (27) is not an unbiased equation for 0, and may produce biased estimates.

3.2.3 Computational formula for $\Sigma _{ich}^{{\ast}}(\beta,\rho,\gamma )$

For convenience, we first write

$$\displaystyle\begin{array}{rcl} \Delta _{i}& =& W_{i}^{-1}R_{ i},\;\mbox{ with} {}\\ W_{i}& =& \mbox{ diag}[w_{i1},w_{i2},\ldots,w_{iT_{i}}],\;\mbox{ and}\;R_{i} = \mbox{ diag}[R_{i1},\ldots,R_{iT_{i}}]. {}\\ \end{array}$$

It then follows that

$$\displaystyle\begin{array}{rcl} \Sigma _{ich}^{{\ast}}(\beta,\rho )& =& \mbox{ cov}[\{\Delta _{ i}\{(y_{i} -\mu _{i}(\beta ))\}\}\vert H_{i}(y)] \\ & =& W_{i}^{-1}\mbox{ cov}[\{R_{ i}(y_{i} -\mu _{i}(\beta ))\}\vert H_{i}(y)]W_{i}^{-1}.{}\end{array}$$

(30)

Now to compute the covariance matrix in the middle term in the right-hand side of (30), we first re-express $R_{i}(y_{i} -\mu _{i}(\beta ))$ as

$$\displaystyle{R_{i}(y_{i} -\mu _{i}(\beta )) = [R_{i1}(y_{i1} -\mu _{i1}),\ldots,R_{it}(y_{it} -\mu _{it}),\ldots,R_{iT_{i}}(y_{iT_{i}} -\mu _{iT_{i}})]^{\prime},}$$

and compute the variances for its components as

$$\displaystyle{ \mbox{ var}[\{R_{i1}(y_{i1} -\mu _{i1})\}\vert y_{i1}] = 0, }$$

(31)

because R _i1 = 1 always and y _i1 is held fixed by the conditioning. Unconditionally, y _i1 is random with variance σ _i,11: in the Poisson case $\sigma _{i,11} =\mu _{i1}$ and in the binary case $\sigma _{i,11} =\mu _{i1}(1 -\mu _{i1})$, with appropriate formula for μ _i1 in a given case. Next for t = 2, …, T _i,

$$\displaystyle\begin{array}{rcl} & & \mbox{ var}[R_{it}(y_{it} -\mu _{it})\vert H_{i,t-1}(y)] = \mbox{ var}[R_{it}\vert H_{i,t-1}(y)]\mbox{ var}[y_{it}\vert H_{i,t-1}(y)] \\ & +& {E}^{2}[R_{ it}\vert H_{i,t-1}(y)]\mbox{ var}[(y_{it})\vert H_{i,t-1}(y)] + \mbox{ var}[R_{it}\vert H_{i,t-1}(y)]{E}^{2}[(y_{ it} -\mu _{it})\vert H_{i,t-1}(y)] \\ & =& w_{it}(1 - w_{it})\sigma _{ic,tt} + w_{it}^{2}\sigma _{ ic,tt} + w_{it}(1 - w_{it})\{\lambda _{it} -\mu {_{it}\}}^{2} \\ & =& w_{it}[\sigma _{ic,tt} + {(\lambda _{it} -\mu _{it})}^{2}] - w_{ it}^{2}{(\lambda _{ it} -\mu _{it})}^{2}, {}\end{array}$$

(32)

where, given the history, λ _it and σ _ic, tt are the conditional mean and variance of y _it, respectively.
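The closed form in the last line of (32) can be checked against a direct enumeration. The sketch below assumes a binary response, so that $\sigma _{ic,tt} =\lambda _{it}(1 -\lambda _{it})$, and assumes that R _it and y _it are conditionally independent given the history; the function names are illustrative.

```python
def cond_var_formula(w, sigma_c, lam, mu):
    """Last line of (32): w[sigma_c + (lam - mu)^2] - w^2 (lam - mu)^2."""
    d2 = (lam - mu) ** 2
    return w * (sigma_c + d2) - w ** 2 * d2

def cond_var_enumerated(w, lam, mu):
    """Brute-force var[R (Y - mu) | history] for binary R and binary Y.

    R ~ Bernoulli(w) and Y ~ Bernoulli(lam), independent given the history.
    """
    first = 0.0
    second = 0.0
    for r, p_r in ((0, 1.0 - w), (1, w)):
        for y, p_y in ((0, 1.0 - lam), (1, lam)):
            z = r * (y - mu)
            first += p_r * p_y * z
            second += p_r * p_y * z * z
    return second - first ** 2
```

For instance, with w = 0.7, λ = 0.6 and μ = 0.4 both functions should return 0.1764, and with w = 1 the formula reduces to the conditional variance σ _ic,tt.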

Furthermore, all pairwise covariances conditional on the history H _{i, t − 1}(y) may be computed as follows. For u < t,

$$\displaystyle\begin{array}{rcl} & & \mbox{ cov}[\{R_{iu}(y_{iu} -\mu _{iu}),R_{it}(y_{it} -\mu _{it})\}\vert H_{i,t-1}(y)] \\ & =& E[\{R_{iu}R_{it}(y_{iu} -\mu _{iu})(y_{it} -\mu _{it})\}\vert H_{i,t-1}(y)] - E[\{R_{iu}(y_{iu} -\mu _{iu})\}\vert H_{i,t-1}(y)] \\ & \times & E[\{R_{it}(y_{it} -\mu _{it})\}\vert H_{i,t-1}(y)] \\ & =& (y_{iu} -\mu _{iu})E[\{R_{it}(y_{it} -\mu _{it})\}\vert H_{i,t-1}(y)] - [w_{iu}(y_{iu} -\mu _{iu})][w_{it}(\lambda _{it} -\mu _{it})] \\ & =& [(y_{iu} -\mu _{iu})(1 - w_{iu})][w_{it}(\lambda _{it} -\mu _{it})] {}\end{array}$$

(33)

3.2.4 A Fully Standardized GQL (FSGQL) Approach

All three estimating equations, namely PSGEE (24), PSGQL (25), and PSCGQL (27), may produce biased estimates, PSGEE being the worst. The reasons for the poor performance of PSGEE are twofold: its weight matrix completely ignores the missing mechanism, and it uses a working correlation matrix to accommodate the longitudinal nature of the available data. As opposed to the PSGEE approach, the PSGQL approach uses the true correlation structure under a class of auto-correlations, but similar to the PSGEE approach it also ignores the missing mechanism. As for the PSCGQL approach, it uses a correct conditional covariance matrix which accommodates both the missing mechanism and the correlation structure. However, the resulting estimating equation may not be unbiased for zero, because the history of the responses involved in the covariance matrix yields a weighted distance function which is not unbiased.

To remedy the aforementioned problems, it is therefore important to use the correct covariance matrix or its consistent estimate to construct the weight matrix by accommodating both missing mechanism and longitudinal correlations of the repeated data. For this to happen, because the distance function is unconditionally unbiased for zero, i.e.,

$$\displaystyle{E_{H_{i}(y)}E[\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y)] = 0,}$$

one must use the unconditional covariance matrix of $\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}$ to compute the incomplete longitudinal weight matrix, for the construction of a desired unbiased estimating equation. Let $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ denote this unconditional covariance matrix which is computed by using the formula

$$\displaystyle\begin{array}{rcl} \Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )& =& \mbox{ cov}\left \{\Delta _{ i}(Y _{i} -\mu _{i}(\beta ))\right \} = E_{H_{i}(y)}[\mbox{ cov}\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y)] \\ & +& \mbox{ cov}_{H_{i}(y)}[E\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y)]. {}\end{array}$$

(34)

In the spirit of Sutradhar (2003), we propose the FSGQL estimating equation for β given by

$$\displaystyle\begin{array}{rcl} & & \sum _{i=1}^{K}\frac{\partial E_{H_{i}(y)}E[\{\Delta _{i}\mu _{i}(\beta )\}^{\prime} \vert H_{i}(y)]} {\partial \beta } {[\mbox{ cov}\{\Delta _{i}(y_{i} -\mu _{i})\}]}^{-1}\{\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\} \\ & =& \sum _{i=1}^{K}\frac{\partial \mu ^{\prime} _{i}} {\partial \beta } {[\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )]}^{-1}\{\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\} = 0, {}\end{array}$$

(35)

where $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ is yet to be computed. This estimating equation is solved iteratively by using

$$\displaystyle\begin{array}{rcl} \hat{\beta }_{FSGQL}(m + 1)& =& \hat{\beta }_{FSGQL}(m) + \left [\sum _{i=1}^{K}\frac{\partial \mu ^{\prime} _{i}(\beta )} {\partial \beta } {[\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )]}^{-1}\frac{\partial \mu _{i}(\beta )} {\partial \beta ^{\prime} } \right ]_{m}^{-1} \\ & & \times \left [\sum _{i=1}^{K}\frac{\partial \mu _{i}^{^{\prime} }} {\partial \beta } {[\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )]}^{-1}\Delta _{ i}(y_{i} -\mu _{i}(\beta ))\right ]_{m} {}\end{array}$$

(36)
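One iteration of (36) may be sketched as follows. The sketch assumes a linear mean $\mu _{i}(\beta ) = X_{i}\beta$, so that $\partial \mu ^{\prime} _{i}/\partial \beta = X^{\prime} _{i}$, and assumes that the matrices $\Sigma _{i}^{{\ast}}$ have already been evaluated at the current iterate; the function name and argument layout are hypothetical.

```python
import numpy as np

def fsgql_step(beta, X_list, y_list, delta_list, sigma_star_list):
    """One update of (36) under the illustrative linear mean mu_i = X_i beta.

    X_list          : per-subject design matrices X_i (T_i x p)
    y_list          : per-subject responses; unobserved entries may hold any
                      finite placeholder (not NaN), since delta_it = 0 there
    delta_list      : per-subject vectors of delta_it = R_it / w_it
    sigma_star_list : per-subject T_i x T_i matrices Sigma_i^*, held fixed here
    """
    p = len(beta)
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for X, y, delta, S in zip(X_list, y_list, delta_list, sigma_star_list):
        resid = delta * (y - X @ beta)      # Delta_i (y_i - mu_i(beta))
        S_inv_X = np.linalg.solve(S, X)     # Sigma_i^{*-1} X_i
        lhs += X.T @ S_inv_X                # X_i' Sigma_i^{*-1} X_i
        rhs += S_inv_X.T @ resid            # X_i' Sigma_i^{*-1} resid (S symmetric)
    return beta + np.linalg.solve(lhs, rhs)
```

In practice the step would be repeated, recomputing $\Sigma _{i}^{{\ast}}$ at each new β (and at current estimates of ρ and γ), until successive iterates agree to a chosen tolerance.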

3.2.4 Computation of $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma ) = \mbox{ cov}[\Delta _{i}(y_{i} -\mu _{i})]$

Rewrite (34) as

$$\displaystyle\begin{array}{rcl} \Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )& =& E_{ H_{i}(y)}[\mbox{ cov}\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y)] \\ & +& \mbox{ cov}_{H_{i}(y)}[E\left \{\Delta _{i}(Y _{i} -\mu _{i}(\beta ))\right \}\vert H_{i}(y)] \\ & =& E_{H_{i}(y)}[\Sigma _{ich}^{{\ast}}(\beta,\rho )] + \mbox{ cov}_{ H_{i}(y)}[E_{ich}(\beta,\rho )],{}\end{array}$$

(37)

where $\Sigma _{ich}^{{\ast}}(\beta,\rho )$ is constructed by (30) by using the formulas from (31) to (33), and E _{i c h}(β, ρ) has the form $E_{ich}(\beta,\rho ) = [(y_{i1} -\mu _{i1}),(\lambda _{i2} -\mu _{i2}),\ldots,(\lambda _{iT_{i}} -\mu _{iT_{i}})]^{\prime} $.

It then follows that the components of the $T_{i} \times T_{i}$ unconditional covariance matrix $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ are given by

$$\displaystyle\begin{array}{rcl} & & \mbox{ cov}[\delta _{iu}(y_{iu} -\mu _{iu}),\delta _{it}(y_{it} -\mu _{it})] {} \\ & =& \left \{\begin{array}{l} \mbox{ var}_{y_{i1}}(y_{i1} -\mu _{i1}) =\sigma _{i11}\qquad \mbox{ for u=t=1} \\ E_{H_{i}(y)}[w_{it}^{-1}\{\sigma _{ic,tt} + {(\lambda _{it} -\mu _{it})}^{2}\} - {(\lambda _{it} -\mu _{it})}^{2}] + E_{H_{i}(y)}{(\lambda _{it} -\mu _{it})}^{2},\qquad \mbox{ for u=t=2,}\ldots \\ E_{H_{i}(y)}[(y_{i1} -\mu _{i1})(\lambda _{it} -\mu _{it})]\qquad \mbox{ for u=1,t=2,}\ldots \\ E_{H_{i}(y)}[(w_{iu}^{-1} - 1)\{(y_{iu} -\mu _{iu})(\lambda _{it} -\mu _{it})\}] \\ + E_{H_{i}(y)}[(\lambda _{iu} -\mu _{iu})(\lambda _{it} -\mu _{it})],\qquad \mbox{ for u=2,}\ldots;\;u < t \end{array} \right.\\ \end{array}$$

(38)

3.2.4.1 (a). Example of $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ under linear longitudinal models with T = 2

Note that $R_{i1} = r_{i1} = 1$ always. But R _i2 can be 1 or 0 and under MAR, its probability depends on y _i1. Consider

$$\displaystyle{Pr[R_{i1} = 1] = g_{i1} = w_{i1} = 1.0}$$

$$\displaystyle{ P[R_{i2} = 1\vert r_{i1} = 1,y_{i1}] = g_{i2}(\gamma ) = \frac{\exp \{1 +\gamma y_{i1}\}} {1 +\exp \{ 1 +\gamma y_{i1}\}} }$$

(39)

by (20), yielding

$$\displaystyle\begin{array}{rcl} w_{i2}& =& E[R_{i2}\vert H_{i1}(y)] = P[R_{i1} = 1,R_{i2} = 1\vert H_{i1}(y)] {}\\ & =& P[R_{i1} = 1]P[R_{i2} = 1\vert H_{i1}(y)] = g_{i1}g_{i2}(y_{i1}), {}\\ \end{array}$$

(see also (18)). With regard to the longitudinal model for potential responses $y_{i1},y_{i2}$, along with their non-stationary (time-dependent) covariates, consider the model as:

$$\displaystyle{ y_{it} \sim (x^{\prime} _{it}\beta, \frac{{\sigma }^{2}} {1 {-\rho }^{2}}),\;\mbox{ corr}(Y _{it},Y _{i,t+\ell}) {=\rho }^{\ell}. }$$

(40)

Assuming normal distribution, one may write

$$\displaystyle\begin{array}{rcl} E[Y _{it}\vert H_{i,t-1}]& =& x^{\prime} _{it}\beta + [\mbox{ cov}(y_{it},y_{i,t-\ell})]{[\mbox{ var}(y_{i,t-\ell})]}^{-1}(y_{ i,t-\ell}- x^{\prime} _{i,t-\ell}\beta ) \\ & =& x^{\prime} _{it}\beta + \left [\frac{\sigma ^{2}\rho ^{\ell}} {1 -\rho ^{2}}\right ]\left [\frac{\sigma ^{2}} {1 -\rho ^{2}}\right ]^{-1}(y_{i,t-\ell}- x^{\prime} _{i,t-\ell}\beta ) \\ & =& x^{\prime} _{it}\beta +\rho ^{\ell}[y_{i,t-\ell}- x^{\prime} _{i,t-\ell}\beta ]. {}\end{array}$$

(41)

When the response y _it depends on its immediate history, the conditional mean has the formula

$$\displaystyle{E[Y _{it}\vert y_{i,t-1}] =\lambda _{it} = x^{\prime} _{it}\beta +\rho (y_{i,t-1} - x^{\prime} _{i,t-1}\beta ),}$$

implying that the unconditional mean is given by $\mu _{it} = E[Y _{it}] = x^{\prime} _{it}\beta$, which is the same as the mean in (40), as expected.

Now following (38), we provide the elements of the 2 ×2 matrix $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ as

$$\displaystyle\begin{array}{rcl} \sigma _{i11}^{{\ast}}& =& \frac{{\sigma }^{2}} {1 {-\rho }^{2}} \\ \sigma _{i12}^{{\ast}}& =& \sigma _{ i21}^{{\ast}} =\rho \mbox{ var}[Y _{ i1} - x_{i11}\beta ] =\rho \frac{{\sigma }^{2}} {1 {-\rho }^{2}} \\ \sigma _{i22}^{{\ast}}& =& E_{ y_{i1}}[w_{i2}^{-1}\{\mbox{ var}(Y _{ i2}\vert y_{i1}) + {(\lambda _{i2} -\mu _{i2})}^{2}\}] \\ & =& E_{y_{i1}}[\{1 +\frac{1} {\exp (1 +\gamma y_{i1})}\}\{{\sigma }^{2} {+\rho }^{2}{(y_{ i1} - x_{i11}\beta )}^{2}\}] \\ & =& \frac{{\sigma }^{2}} {1 -{\rho }^{2}} +{\sigma }^{2}E_{ 1} {+\rho }^{2}E_{ 2}, {}\end{array}$$

(42)

where

$$\displaystyle\begin{array}{rcl} E_{1} =\int [ \frac{1} {\exp (1 +\gamma y_{i1})}]g_{N}(y_{i1})dy_{i1},\;\mbox{ and}& & {}\\ E_{2} =\int [\frac{\{y_{i1} - x{_{i11}\beta \}}^{2}} {\exp (1 +\gamma y_{i1})} ]g_{N}(y_{i1})dy_{i1},& & {}\\ \end{array}$$

$g_{N}(y_{i1})$ being the normal (say) density of y _i1. Thus, $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ has the formula

$$\displaystyle{ \Sigma _{i}^{{\ast}}(\beta,\rho,\gamma ) = \frac{\sigma ^{2}} {1 -\rho ^{2}}\left [\begin{array}{cc} 1& \rho \\ \rho &1 + (1 -\rho ^{2})E_{1} + \frac{\rho ^{2}(1 -\rho ^{2})} {\sigma ^{2}} E_{2} \end{array} \right ]. }$$

(43)

Note that in the complete longitudinal case w _i2 would be 1 and $\sigma _{i22}^{{\ast}}$ would reduce to $\frac{{\sigma }^{2}} {1{-\rho }^{2}}$, leading to

$$\displaystyle{ \Sigma _{i}^{{\ast}}(\beta,\rho,\gamma ) = \Sigma _{ i}(\beta,\rho ) = \frac{{\sigma }^{2}} {1 {-\rho }^{2}}\left [\begin{array}{cc} 1& \rho \\ \rho &1 \end{array} \right ], }$$

(44)

which is free from β in this linear model case, and the PSGEE (24) uses a “working” version of (44), namely

$$\displaystyle{ V _{i}(\alpha ) = \frac{{\sigma }^{2}} {1 {-\rho }^{2}}\left [\begin{array}{cc} 1& \alpha \\ \alpha &1 \end{array} \right ], }$$

(45)

whereas the FSGQL estimating equation (35) would use $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ from (43). This shows the effect of missing mechanism in the construction of the weight matrix for the estimating equation.
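The integrals E _1 and E _2 in (42) have no role for β beyond the mean of y _i1, so $\Sigma _{i}^{{\ast}}$ in (43) is straightforward to evaluate numerically. The sketch below assumes that y _i1 is normal with mean `mean1` (standing for the first-period linear predictor) and variance σ²/(1 − ρ²), and uses Gauss-Hermite quadrature; the function name, the argument `mean1`, and the number of nodes are illustrative choices, not part of the chapter.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def sigma_star_linear(mean1, sigma2, rho, gamma, n_nodes=60):
    """Evaluate the 2 x 2 matrix (43) with E_1 and E_2 obtained by quadrature.

    mean1  : mean of y_i1 (assumed normal), i.e. the linear predictor at t = 1
    sigma2 : sigma^2 in model (40); var(Y_it) = sigma2 / (1 - rho^2)
    """
    v = sigma2 / (1.0 - rho ** 2)                 # var(Y_i1) from (40)
    z, wts = hermegauss(n_nodes)                  # nodes/weights for exp(-z^2 / 2)
    wts = wts / np.sqrt(2.0 * np.pi)              # rescale to N(0, 1) expectation weights
    y1 = mean1 + np.sqrt(v) * z
    inv_exp = np.exp(-(1.0 + gamma * y1))         # 1 / exp(1 + gamma * y_i1)
    E1 = np.sum(wts * inv_exp)
    E2 = np.sum(wts * (y1 - mean1) ** 2 * inv_exp)
    s22 = v + sigma2 * E1 + rho ** 2 * E2         # last line of (42)
    sigma_star = np.array([[v, rho * v], [rho * v, s22]])
    return sigma_star, E1, E2
```

Under the normality assumption both integrals also have closed forms, $E_{1} =\exp (-1 -\gamma m +\gamma ^{2}v/2)$ and $E_{2} = E_{1}(v +\gamma ^{2}v^{2})$ with m the mean and v the variance of y _i1, which gives a convenient check on the quadrature.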

3.2.4.2 (b). Example of $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ under binary longitudinal AR(1) model with T = 2

Consider a binary AR(1) model with

$$\displaystyle{ \lambda _{it} = E[Y _{it}\vert y_{i,t-1}] =\mu _{it} +\rho (y_{i,t-1} -\mu _{i,t-1}),\;t = 2,\ldots,T, }$$

(46)

where $\mu _{it} = \frac{\exp (x^{\prime} _{it}\beta )} {1+\exp (x^{\prime} _{it}\beta )}$, for all $t = 1,\ldots,T$.

Now considering y _i1 as fixed, by using (31)–(33) we first compute the history-dependent conditional covariance matrix $\Sigma _{ich}(\beta,\rho ) = \mbox{ cov}[\{\Delta _{i}(y_{i} -\mu _{i})\}\vert H_{i}(y)]$ as:

$$\displaystyle\begin{array}{rcl} & & \mbox{ var}[\delta _{i1}(y_{i1} -\mu _{i1})] = 0 \\ & & \mbox{ var}[\{\delta _{i2}(y_{i2} -\mu _{i2})\}\vert y_{i1}] = \frac{1} {w_{i2}}[\lambda _{i2}(1 -\lambda _{i2}) {+\rho }^{2}{(y_{ i1} -\mu _{i1})}^{2}] {-\rho }^{2}{(y_{ i1} -\mu _{i1})}^{2} \\ & & \mbox{ cov}[\{\delta _{i1}(y_{i1} -\mu _{i1}),\delta _{i2}(y_{i2} -\mu _{i2})\}\vert y_{i1}] = 0, {}\end{array}$$

(47)

yielding

$$\displaystyle\begin{array}{rcl} & & E_{H_{i}(y)}[\Sigma _{ich}(\beta,\rho )] \\ & =& \left \{\begin{array}{l} E_{y_{i1}}[\sigma _{ich,11}] = E_{y_{i1}}[0] = 0 \\ E_{y_{i1}}[\sigma _{ich,22}] = E_{y_{i1}}[ \frac{1} {w_{i2}} [\lambda _{i2}(1 -\lambda _{i2}) {+\rho }^{2}{(y_{i1} -\mu _{i1})}^{2}] {-\rho }^{2}{(y_{i1} -\mu _{i1})}^{2}] \\ E_{y_{i1}}[\sigma _{ich,12}] = E_{y_{i1}}[0] = 0 \end{array} \right.{}\end{array}$$

(48)

Next because

$$\displaystyle{E_{ich}(\beta,\rho ) = E[\{\Delta _{i}(y_{i} -\mu _{i})\}\vert H_{i}(y)] = [(y_{i1} -\mu _{i1}),(\lambda _{i2} -\mu _{i2})]^{\prime},}$$

one obtains

$$\displaystyle\begin{array}{rcl} & & \mbox{ cov}_{H_{i}(y)}[E_{ich}(\beta,\rho )] \\ & =& \left \{\begin{array}{l} \mbox{ var}_{y_{i1}}[y_{i1} -\mu _{i1}] =\mu _{i1}[1 -\mu _{i1}] \\ \mbox{ cov}_{y_{i1}}[(y_{i1} -\mu _{i1}),(\lambda _{i2} -\mu _{i2})] =\rho \mbox{ var}_{y_{i1}}[y_{i1} -\mu _{i1}] =\rho \mu _{i1}[1 -\mu _{i1}] \\ \mbox{ var}_{y_{i1}}[\lambda _{i2} -\mu _{i2}] = \mbox{ var}_{y_{i1}}[\rho (y_{i1} -\mu _{i1})] {=\rho }^{2}[\mu _{i1}(1 -\mu _{i1})]. \end{array} \right.{}\end{array}$$

(49)

By combining (48) and (49), it follows from (38) that the 2 ×2 unconditional covariance matrix $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ has the form

$$\displaystyle\begin{array}{rcl} \mbox{ var}[\delta _{i1}(y_{i1} -\mu _{i1})]& =& \mu _{i1}[1 -\mu _{i1}] \\ \mbox{ var}[\delta _{i2}(y_{i2} -\mu _{i2})]& =& E_{y_{i1}}[ \frac{1} {w_{i2}}\{\lambda _{i2}(1 -\lambda _{i2}) {+\rho }^{2}{(y_{ i1} -\mu _{i1})}^{2}\}] \\ & =& [\mu _{i2}(1 -\mu _{i2})]E[w_{i2}^{-1}] \\ & & +\rho (1 - 2\mu _{i2})E[w_{i2}^{-1}(y_{ i1} -\mu _{i1})] \\ & =& [\mu _{i2}(1 -\mu _{i2})]E_{1y} +\rho (1 - 2\mu _{i2})[E_{2y} -\mu _{i1}E_{1y}]{}\end{array}$$

(50)

$$\displaystyle\begin{array}{rcl} \mbox{ cov}[\delta _{i1}(y_{i1} -\mu _{i1}),\delta _{i2}(y_{i2} -\mu _{i2})]& =& \rho [\mu _{i1}\{1 -\mu _{i1}\}],{}\end{array}$$

(51)

where

$$\displaystyle\begin{array}{rcl} E_{1y} = E[w_{i2}^{-1}]& =& \{1 +\exp (-1) +\mu _{ i1}\exp (-1)(\exp (-\gamma ) - 1)\} {}\\ E_{2y} = E[ \frac{y_{i1}} {w_{i2}}]& =& \mu _{i1}\{1 +\exp (-\gamma - 1)\}. {}\\ \end{array}$$
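Because every variable in the T = 2 binary example takes only the values 0 and 1, the closed forms (50) and (51), together with E _1y and E _2y, can be verified by enumerating all outcomes of (y _i1, y _i2, R _i2). The sketch below assumes the response model (39) and the conditional mean (46); the function names are illustrative.

```python
import math
from itertools import product

def expit(a):
    return 1.0 / (1.0 + math.exp(-a))

def sigma_star_binary_formula(mu1, mu2, rho, gamma):
    """Elements (s11, s12, s22) of Sigma_i^* from (50), (51), E_1y and E_2y."""
    e1y = 1.0 + math.exp(-1.0) + mu1 * math.exp(-1.0) * (math.exp(-gamma) - 1.0)
    e2y = mu1 * (1.0 + math.exp(-gamma - 1.0))
    s11 = mu1 * (1.0 - mu1)
    s22 = mu2 * (1.0 - mu2) * e1y + rho * (1.0 - 2.0 * mu2) * (e2y - mu1 * e1y)
    s12 = rho * mu1 * (1.0 - mu1)
    return s11, s12, s22

def sigma_star_binary_enumerated(mu1, mu2, rho, gamma):
    """Same elements from the joint distribution of (y_i1, y_i2, R_i2)."""
    m1 = m2 = q11 = q12 = q22 = 0.0
    for y1, y2, r2 in product((0, 1), repeat=3):
        w2 = expit(1.0 + gamma * y1)              # P(R_i2 = 1 | y_i1), from (39)
        lam2 = mu2 + rho * (y1 - mu1)             # E[Y_i2 | y_i1], from (46)
        prob = ((mu1 if y1 else 1.0 - mu1)
                * (lam2 if y2 else 1.0 - lam2)
                * (w2 if r2 else 1.0 - w2))
        d1 = y1 - mu1                             # delta_i1 = 1
        d2 = (r2 / w2) * (y2 - mu2)               # delta_i2 = R_i2 / w_i2
        m1 += prob * d1
        m2 += prob * d2
        q11 += prob * d1 * d1
        q12 += prob * d1 * d2
        q22 += prob * d2 * d2
    return q11 - m1 * m1, q12 - m1 * m2, q22 - m2 * m2
```

The parameter values must keep λ _i2 inside (0, 1) for the enumeration to define a valid distribution; for example μ _i1 = 0.4, μ _i2 = 0.6, ρ = 0.3 gives λ _i2 equal to 0.48 or 0.78.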

3.2.4.2.1 General formula for $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ under the binary AR(1) model

In general, it follows from (38) that the elements of the $T_{i} \times T_{i}$ unconditional covariance matrix $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$ under AR(1) binary model are given by

$$\displaystyle\begin{array}{rcl} & & \mbox{ cov}[\delta _{iu}(y_{iu} -\mu _{iu}),\delta _{it}(y_{it} -\mu _{it})] \\ & \equiv & \left \{\begin{array}{l} \sigma _{i,11}^{{\ast}} =\mu _{i1}[1 -\mu _{i1}] \\ \sigma _{i,tt}^{{\ast}} = E_{H_{i}(y)}[w_{it}^{-1}\{\mu _{it}(1 -\mu _{it}) +\rho (1 - 2\mu _{it})(y_{i,t-1} -\mu _{i,t-1})\}],\;(\mbox{ for}\;t = 2,\ldots,T_{i}) \\ \sigma _{i,ut}^{{\ast}} =\rho \rho ^{t-1-u}\mu _{iu}(1 -\mu _{iu}),\;(\mbox{ for}\;u = 1 < t) \\ \sigma _{i,ut}^{{\ast}} =\rho ^{2}\rho ^{t-u}\mu _{i(u-1)}(1 -\mu _{i(u-1)}),\;(\mbox{ for}\;1 < u < t). \end{array} \right.{}\end{array}$$

(52)

3.3 An Empirical Illustration

First, to illustrate the performance of the existing PSGEE (24) approach, we refer to some of the simulation results reported by Sutradhar and Mallick (2010). It was shown that this approach may produce highly biased and hence inconsistent regression estimates. In fact these authors also demonstrated that PSGEE(I) (independence assumption based) approach produces less biased estimates than any “working” correlation structures based PSGEE approaches. For example, we consider here their simulation design chosen as

3.3 Simulation Design

K = 100, T = 4, p = 2, q = 1, γ = 4, ρ = 0.4, 0.8, $\beta _{1} =\beta _{2} = 0$ along with two time-dependent covariates:

$$\displaystyle{x_{it1} = \left \{\begin{array}{ll} \frac{1} {2} & \mbox{ for $i = 1,\cdots \,, \frac{K} {4} $; $t = 1,2$} \\ 0 &\mbox{ for $i = 1,\cdots \,, \frac{K} {4} $; $t = 3,4$} \\ -\frac{1} {2} & \mbox{ for $i = \frac{K} {4} + 1,\cdots \,, \frac{3K} {4} $; $t = 1$} \\ 0 &\mbox{ for $i = \frac{K} {4} + 1,\cdots \,, \frac{3K} {4} $; $t = 2,3$} \\ \frac{1} {2} & \mbox{ for $i = \frac{K} {4} + 1,\cdots \,, \frac{3K} {4} $; $t = 4$} \\ \frac{t} {2T} &\mbox{ for $i = \frac{3K} {4} + 1,\cdots \,,K$; $t = 1,\cdots \,,4$} \end{array} \right.}$$

and

$$\displaystyle{x_{it2} = \left \{\begin{array}{ll} \frac{t-2.5} {2T} &\mbox{ for $i = 1,\cdots \,, \frac{K} {2} $; $t = 1,\cdots \,,4$} \\ 0 &\mbox{ for $i = \frac{K} {2} + 1,\cdots \,,K$; $t = 1,2$} \\ \frac{1} {2} & \mbox{ for $i = \frac{K} {2} + 1,\cdots \,,K$; $t = 3,4$} \end{array} \right.}$$
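The two covariate definitions above translate directly into arrays. The following sketch assumes T = 4 and K divisible by 4, as in the stated design, and stores subject i in row i − 1 and time t in column t − 1; the function name is illustrative.

```python
import numpy as np

def design_covariates(K=100, T=4):
    """Build the K x T arrays x_it1 and x_it2 of the simulation design."""
    if T != 4 or K % 4 != 0:
        raise ValueError("this sketch assumes T = 4 and K divisible by 4")
    t = np.arange(1, T + 1)
    q1, q2, q3 = K // 4, K // 2, 3 * K // 4

    x1 = np.zeros((K, T))
    x1[:q1, 0:2] = 0.5                  # i = 1..K/4, t = 1, 2 (t = 3, 4 stay 0)
    x1[q1:q3, 0] = -0.5                 # i = K/4+1..3K/4, t = 1 (t = 2, 3 stay 0)
    x1[q1:q3, 3] = 0.5                  # i = K/4+1..3K/4, t = 4
    x1[q3:, :] = t / (2.0 * T)          # i = 3K/4+1..K, all t

    x2 = np.zeros((K, T))
    x2[:q2, :] = (t - 2.5) / (2.0 * T)  # i = 1..K/2, all t
    x2[q2:, 2:4] = 0.5                  # i = K/2+1..K, t = 3, 4 (t = 1, 2 stay 0)
    return x1, x2
```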

For details on the MAR based incomplete binary data generation, one may refer to Sutradhar and Mallick (2010, Sect. 2.1). Based on 1,000 simulations, the PSGEE estimates obtained from (24) and the PSGEE(I) estimates obtained from (24) by using zero correlation are displayed in Table 1.

Table 1 Simulated means (SMs), simulated standard errors (SSEs), and simulated mean squared errors (SMSEs) for “working” correlations based PSGEE (24) estimates, when the incomplete longitudinal responses were generated based on MAR mechanism (20) with γ = 4.0 and a longitudinal AR(1) correlation structure with correlation index parameter ρ; $\beta _{1} =\beta _{2} = 0;$ based on 1,000 simulations


These results show that the PSGEE estimates for $\beta _{1} = 0$ and $\beta _{2} = 0$ are highly biased. For example, when ρ = 0.8, the estimates of β₁ and β₂ are −0.213 and −0.553, respectively. These estimates are inconsistent and unacceptable. Note that these biases are caused by the wrong correlation matrix used to construct the PSGEE (24), whereas this PSGEE provides almost unbiased estimates when the data are treated as independent even if truly they are not so. However, the standard errors of the PSGEE(I) estimates appear to be large and hence it may provide inefficient estimates. In fact, when the proportion of missing values is large, the PSGEE(I) will also encounter estimation breakdown or it will produce biased estimates. This is verified by a simulation study reported by Mallick et al. (2013). The reason for this inconsistency encountered by PSGEE and PSGEE(I) is the failure to accommodate the MAR mechanism in the covariance matrix used as the longitudinal weights.

As a remedy to this inconsistency, we have developed an FSGQL (35) estimating equation by accommodating both the MAR mechanism and the longitudinal correlation structure in constructing the weight matrix $\Sigma _{i}^{{\ast}}(\beta,\rho,\gamma )$. This FSGQL equation would provide consistent and efficient regression estimates. For simplicity, Mallick et al. (2013) have demonstrated through a simulation study that the FSGQL(I) approach, using ρ = 0 in $\Sigma _{i}^{{\ast}}(\beta,\rho = 0,\gamma )$, produces almost unbiased estimates with small variances. This suggests that ignoring the missing mechanism in constructing the weight matrix would provide detrimental results, whereas ignoring longitudinal correlations does not appear to cause any significant loss.

References

Birmingham, J., Rotnitzky, A., Fitzmaurice, G.M.: Pattern-mixture and selection models for analysing longitudinal data with monotone missing patterns. J. R. Stat. Soc. Ser. B 65, 275–297 (2003)
Crowder, M.: On the use of a working correlation matrix in using generalized linear models for repeated measures. Biometrika 82, 407–410 (1995)
Fitzmaurice, G.M., Laird, N.M., Zahner, G.E.P.: Multivariate logistic models for incomplete binary responses. J. Am. Stat. Assoc. 91, 99–108 (1996)
Ibrahim, J.G., Lipsitz, S.R., Chen, M.H.: Missing covariates in generalized linear models when the missing data mechanism is non-ignorable. J. R. Stat. Soc. Ser. B 61, 173–190 (1999)
Ibrahim, J.G., Chen, M.H., Lipsitz, S.R.: Missing responses in generalized linear mixed models when the missing data mechanism is non-ignorable. Biometrika 88, 551–564 (2001)
Krishnamoorthy, K., Pannala, M.K.: Confidence estimation of a normal mean vector with incomplete data. Can. J. Stat. 27, 395–407 (1999)
Laird, N.M.: Missing data in longitudinal studies. Stat. Med. 7, 305–315 (1988)
Liang, K.-Y., Zeger, S.L.: Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22 (1986)
Little, R.J.A.: A test of missing completely at random for multivariate data with missing values. J. Am. Stat. Assoc. 83, 1198–1202 (1988)
Little, R.J.A.: Modeling the drop-out mechanism in repeated-measures studies. J. Am. Stat. Assoc. 90, 1112–1121 (1995)
Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
Lord, F.M.: Estimation of parameters from incomplete data. J. Am. Stat. Assoc. 50, 870–876 (1955)
Mallick, T., Farrell, P.J., Sutradhar, B.C.: Consistent estimation in incomplete longitudinal binary models. In: Sutradhar, B.C. (ed.) ISS-2012 Proceedings Volume On Longitudinal Data Analysis Subject to Measurement Errors, Missing Values, and/or Outliers. Springer Lecture Notes Series, pp. 125–146. Springer, New York (2013)
Mehta, J.S., Gurland, J.: A test of equality of means in the presence of correlation and missing values. Biometrika 60, 211–213 (1973)
Meng, X.L.: Multiple-imputation inferences with uncongenial sources of input. Stat. Sci. 9, 538–573 (1994)
Morrison, D.F.: A test for equality of means of correlated variates with missing data on one response. Biometrika 60, 101–105 (1973)
Naik, U.D.: On testing equality of means of correlated variables with incomplete data. Biometrika 62, 615–622 (1975)
Paik, M.C.: The generalized estimating equation approach when data are not missing completely at random. J. Am. Stat. Assoc. 92, 1320–1329 (1997)
Preisser, J.S., Lohman, K.K., Rathouz, P.J.: Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random. Stat. Med. 21, 3035–3054 (2002)
Robins, J.M., Rotnitzky, A., Zhao, L.P.: Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J. Am. Stat. Assoc. 90, 106–121 (1995)
Rotnitzky, A., Robins, J.M., Scharfstein, D.O.: Semi-parametric regression for repeated outcomes with nonignorable nonresponse. J. Am. Stat. Assoc. 93, 1321–1339 (1998)
Rubin, D.B.: Inference and missing data (with discussion). Biometrika 63, 581–592 (1976)
Rubin, D.B., Schenker, N.: Multiple imputation for interval estimation from simple random sample with ignorable nonresponses. J. Am. Stat. Assoc. 81, 366–374 (1986)
Sneddon, G., Sutradhar, B.C.: On semi-parametric familial longitudinal models. Statist. Prob. Lett. 69, 369–379 (2004)
Sutradhar, B.C.: An overview on regression models for discrete longitudinal responses. Stat. Sci. 18, 377–393 (2003)
Sutradhar, B.C.: Inferences in generalized linear longitudinal mixed models. Can. J. Stat. 38, 174–196 (2010), Special issue
Sutradhar, B.C.: Dynamic Mixed Models for Familial Longitudinal Data. Springer, New York (2011)
Sutradhar, B.C., Das, K.: On the efficiency of regression estimators in generalized linear models for longitudinal data. Biometrika 86, 459–65 (1999)
Sutradhar, B.C., Mallick, T.S.: Modified weights based generalized quasilikelihood inferences in incomplete longitudinal binary models. Can. J. Stat. 38, 217–231 (2010), Special issue
Troxel, A.B., Lipsitz, S.R., Harrington, D.P.: Marginal models for the analysis of longitudinal measurements subject to non-ignorable and non-monotonic missing data. Biometrika 85, 661–672 (1998)
Troxel, A.B., Lipsitz, S.R., Brennan, T.A.: Weighted estimating equations with nonignorably missing response data. Biometrics 53, 857–869 (1997)
Wang, Y.-G.: Estimating equations with nonignorably missing response data. Biometrics 55, 984–989 (1999)

Acknowledgment

The author fondly acknowledges the stimulating discussion by the audience of the symposium and wishes to thank them for their comments and suggestions.

Author information

Authors and Affiliations

Memorial University, St. John’s, Canada, A1C5S7
Brajendra C. Sutradhar


Corresponding author

Correspondence to Brajendra C. Sutradhar .

Editor information

Editors and Affiliations

Dept. Mathematics & Statistics, Memorial University of Newfoundland Dept. Mathematics & Statistics, St. John's, Newfoundland and Labrador, Canada
Brajendra C. Sutradhar


About this paper

Cite this paper

Sutradhar, B.C. (2013). Inference Progress in Missing Data Analysis from Independent to Longitudinal Setup. In: Sutradhar, B. (eds) ISS-2012 Proceedings Volume On Longitudinal Data Analysis Subject to Measurement Errors, Missing Values, and/or Outliers. Lecture Notes in Statistics(), vol 211. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6871-4_5

DOI: https://doi.org/10.1007/978-1-4614-6871-4_5
Published: 06 June 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6870-7
Online ISBN: 978-1-4614-6871-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
